Add force_add to OmegaConf.update(), effectively using open_dict for all nodes along the path #665

omry · 2021-04-07T01:00:31Z

Closes #664.

…all nodes along the path

omry · 2021-04-07T01:21:47Z

benchmark/benchmark.py

+def test_update_force_add(
+    large_dict_config: Any, force_add: bool, benchmark: Any
+) -> None:


Added a benchmark to verify the excessive flag overrides are not causing performance regression.
The concern is that the recursive flags cache invalidation when a flag is set will slow things down.
It does not seem to cause a significant slowdown for a large dict_config here though.

I don't think there's too much to worry here because in the worst case it would only invalidate the whole config cache once: after it's cleared, subsequent cache invalidation wouldn't trickle down anymore.
That being said, I'm not sure if large_dict_config actually comes from a cache set on all config nodes, which would be needed to evaluate this worst case scenario. If that's not the case then the benchmark isn't very relevant (at least to evaluate the impact of cache invalidation on performance).

I also suggested an optimization that should reduce the amount of cache invalidation.

Yeah, I didn't notice the gating in _invalidate_flags_cache.
Good point about the benchmark not doing much.

I fixed the benchmark to populate the flags cache on all nodes. We don't have a baseline though.
Overall I am not too concerned about relative performance here though, at 400us per call on the big config this is plenty fast enough.

docs/source/usage.rst

odelalleau

I only had time for a quick look, I'll need to re-review it more carefully, but already sharing a few comments

docs/source/usage.rst

news/664.feature

omegaconf/omegaconf.py

tests/test_update.py

omry

Updated, thanks.

omegaconf/omegaconf.py

tests/test_update.py

odelalleau

Overall looks good, just a few comments & suggestions

odelalleau · 2021-04-07T12:16:58Z

omegaconf/omegaconf.py

+            struct_override = False if force_add else root._get_node_flag("struct")
+            with flag_override(root, "struct", struct_override):
+                k = split[i]
+                # if next_root is a primitive (string, int etc) replace it with an empty map
+                next_root, key_ = _select_one(root, k, throw_on_missing=False)
+                if not isinstance(next_root, Container):
+                    root[key_] = {}
+                root = root[key_]


Optimization by only calling flag_override when needed:

Suggested change

struct_override = False if force_add else root._get_node_flag("struct")

with flag_override(root, "struct", struct_override):

k = split[i]

# if next_root is a primitive (string, int etc) replace it with an empty map

next_root, key_ = _select_one(root, k, throw_on_missing=False)

if not isinstance(next_root, Container):

root[key_] = {}

root = root[key_]

k = split[i]

# if next_root is a primitive (string, int etc) replace it with an empty map

next_root, key_ = _select_one(root, k, throw_on_missing=False)

if not isinstance(next_root, Container):

if force_add:

with flag_override(root, "struct", False):

root[key_] = {}

else:

root[key_] = {}

root = root[key_]

The code here will not apply cleanly, (see syntax error in 775).
In fact we do have a does_not_raise context manager similar to nullcontext in the tests. we can promote it to utils.
Feel free to followup with a PR optimizing both places.

Ah, oops, looks like my mouse select skills need improving. I just added the missing bracket.

odelalleau · 2021-04-07T12:29:44Z

benchmark/benchmark.py

+def test_update_force_add(
+    large_dict_config: Any, force_add: bool, benchmark: Any
+) -> None:


I don't think there's too much to worry here because in the worst case it would only invalidate the whole config cache once: after it's cleared, subsequent cache invalidation wouldn't trickle down anymore.
That being said, I'm not sure if large_dict_config actually comes from a cache set on all config nodes, which would be needed to evaluate this worst case scenario. If that's not the case then the benchmark isn't very relevant (at least to evaluate the impact of cache invalidation on performance).

I also suggested an optimization that should reduce the amount of cache invalidation.

odelalleau · 2021-04-07T12:35:13Z

omegaconf/omegaconf.py

+        struct_override = False if force_add else root._get_node_flag("struct")
+        with flag_override(root, "struct", struct_override):


A similar optimization could be done to avoid overriding for no reason. Something like:

Suggested change

struct_override = False if force_add else root._get_node_flag("struct")

with flag_override(root, "struct", struct_override):

ctx = flag_override(root, "struct", False) if force_add else nullcontext()

with ctx:

However nullcontext was only aldded in 3.7 so it would require writing our own implementation for 3.6 in _utils.py.

tests/test_update.py

odelalleau · 2021-04-07T13:02:14Z

omegaconf/omegaconf.py

@@ -739,7 +744,8 @@ def update(
        :param merge: If value is a dict or a list, True for merge, False for set.
            True to merge
            False to set
-            None (default) : deprecation warning and default to False
+            None (default): deprecation warning and default to False
+        :param force_add: insert the entire path regardless of Struct flag or Structured Config nodes.


It may be surprising for force_add not to allow adding to readonly configs, is that intended?

(if it's changed, it may be easier to rename it into force so that it can force both adding new values and overriding existing ones, as otherwise you'd need to make the difference between both situations for readonly configs, which seems tricky)

The reported problem is specifically about adding a field to a Structured Config.
The original implementation in Hydra was using open_dict on the root of the config and this was not sufficient because of Structured Configs are not inheriting the struct flag from their parent.

read-only is used as a mechanism to protect against changes to nodes here.
In contrast to the reported problem, using with read_write(cfg): will work fine because it's behavior us "normal" and in fact frozen support for dataclasses is implemented by setting the recursive read-only flag on the DictConfig node.

I see. I feel like it may be possible to have a version of open_dict() (or, more generally, flag_override()) that behaves as if all children had their flag set to the desired value (which would avoid the issue of intermediate nodes causing trouble because of their local flags). It'd be a bit tricky to get it to work efficiently though (you don't want to actually propagate the desired flag to all children -- only those whose flags are being accessed within the context)...

omry added 2 commits April 6, 2021 17:43

Add force_add to OmegaConf.update(), effectively using open_dict for …

6567068

…all nodes along the path

added a benchmark and updated docs

3d26236

omry requested review from odelalleau and Jasha10 April 7, 2021 01:06

omry commented Apr 7, 2021

View reviewed changes

updated news and usage

431c095

odelalleau requested changes Apr 7, 2021

View reviewed changes

docs/source/usage.rst Outdated Show resolved Hide resolved

news/664.feature Outdated Show resolved Hide resolved

omegaconf/omegaconf.py Show resolved Hide resolved

tests/test_update.py Show resolved Hide resolved

feedback

d7cf7fb

omry commented Apr 7, 2021

View reviewed changes

omegaconf/omegaconf.py Show resolved Hide resolved

tests/test_update.py Show resolved Hide resolved

odelalleau reviewed Apr 7, 2021

View reviewed changes

updated benchmark to populate flags cache

02a0c6b

omry merged commit 9e7df56 into master Apr 7, 2021

omry deleted the update_force_add branch April 7, 2021 23:04

This was referenced Apr 7, 2021

Optimization in update() #669

Merged

Additional optimization to update() without force_add #673

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add force_add to OmegaConf.update(), effectively using open_dict for all nodes along the path #665

Add force_add to OmegaConf.update(), effectively using open_dict for all nodes along the path #665

omry commented Apr 7, 2021 •

edited

Loading

omry Apr 7, 2021

odelalleau Apr 7, 2021

omry Apr 7, 2021 •

edited

Loading

odelalleau left a comment

omry left a comment

odelalleau left a comment

odelalleau Apr 7, 2021 •

edited

Loading

omry Apr 7, 2021

odelalleau Apr 7, 2021

odelalleau Apr 7, 2021

odelalleau Apr 7, 2021

odelalleau Apr 7, 2021

omry Apr 7, 2021 •

edited

Loading

odelalleau Apr 7, 2021

		struct_override = False if force_add else root._get_node_flag("struct")
		with flag_override(root, "struct", struct_override):

Add force_add to OmegaConf.update(), effectively using open_dict for all nodes along the path #665

Add force_add to OmegaConf.update(), effectively using open_dict for all nodes along the path #665

Conversation

omry commented Apr 7, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

omry Apr 7, 2021 • edited Loading

Choose a reason for hiding this comment

odelalleau left a comment

Choose a reason for hiding this comment

omry left a comment

Choose a reason for hiding this comment

odelalleau left a comment

Choose a reason for hiding this comment

odelalleau Apr 7, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

omry Apr 7, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

omry commented Apr 7, 2021 •

edited

Loading

omry Apr 7, 2021 •

edited

Loading

odelalleau Apr 7, 2021 •

edited

Loading

omry Apr 7, 2021 •

edited

Loading