-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new resolvers oc.dict.keys
and oc.dict.values
#644
Conversation
Something else: I just remembered that I have a WIP change that reverts the automatic conversion of list/dict resolver outputs, relying on explicitly using Do you prefer that |
Side notes:
|
I added an item in #535 for point 1. |
541, I just milestoned it for 2.1. I think it should be straight forward at this point. |
omegaconf/built_in_resolvers.py
Outdated
dict_method = getattr(in_dict, keys_or_values) | ||
assert isinstance(_parent_, BaseContainer) | ||
ret = OmegaConf.create(list(dict_method()), parent=_parent_) | ||
assert isinstance(ret, ListConfig) | ||
return ret |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO would be cleaner to return the dict/DictConfig object and allow the caller to just call the appropriate function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in cdc60a8
omegaconf/built_in_resolvers.py
Outdated
return _dict_impl( | ||
keys_or_values="values", in_dict=in_dict, _root_=_root_, _parent_=_parent_ | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the desired behavior if there is a missing value in the doct selected from?
I think right now this will raise an exception on the call to access the values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think it's fine, I added tests in 57c3613
We may consider raising an InterpolationToMissingValueError
instead of a generic InterpolationResolutionError
wrapping a MissingMandatoryValue
exception. I can see arguments for either option, so let me know how you feel about this.
docs/source/usage.rst
Outdated
... # Config node name `machines` as input. | ||
... "ips": "${oc.dict.values:machines}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are changing two things at once, makes it a bit confusing (using os.dict.values instead of oc.dict.keys and using the alternative selection method). Update the comment to reflect those two changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clearer in 6898d48?
def test_dict_keys(restore_resolvers: Any, cfg: Any, key: Any, expected: Any) -> None: | ||
OmegaConf.register_new_resolver("sum", lambda x: sum(x)) | ||
|
||
cfg = OmegaConf.create(cfg) | ||
val = cfg[key] | ||
assert val == expected | ||
assert type(val) is type(expected) | ||
|
||
if isinstance(val, ListConfig): | ||
assert val._parent is cfg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be a bit cleaner to split into a basic test_dict_keys function and one that tests it with nesting into sum.
Beyond making the test simpler, the input data can also become pure lists (as the expected output type is always ListConfig).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 018d521
def test_dict_missing(cfg: Any) -> None: | ||
cfg = OmegaConf.create(cfg) | ||
with raises(InterpolationResolutionError, match="MissingMandatoryValue"): | ||
cfg.x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One none obvious behavior here is that those two functions are resolving interpolations eagerly.
OmegaConf interpolations are generally resolved on access.
This can lead to surprising behavior.
device: cuda
models_zoo:
resnet50:
device: ${device}
# ...
alexnet:
device: ${device}
# ...
test_models: ${oc.dict.values:models_zoo}
cfg = OmegaConf.create(cfg)
tester = ModelsTester(cfg.test_models)
if not torch.cuda.is_available():
cfg.device = 'cpu'
# At this point, the device in ModelsTester will no longer track cfg.device so the change will not be reflected.
# This is in contrast to the case where `test_models` was a "real" list in the yaml file.
I think this is not a huge deal, but it's solvable by iterating on the underlying _content and copying the contained nodes directly).
This would also make the MISSING behavior consistent (and lazy).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that's an interesting idea. I have a PoC that works with your example (using [v for _, v in in_dict.items_ex(resolve=False)]
instead of list(in_dict.values())
for DictConfigs.
There's a problem though: relative interpolations get broken if we do this. One potential solution (besides trying to "fix" relative references) would be instead to create nodes in the output ListConfig that are interpolated to the target nodes... but this would start getting a bit hairy => maybe best left for later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think changing the parent after copying the nodes to _parent_
should make relative interpolations be relative to the the node with ${oc.dict.values:...}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ret = []
for _, v in in_dict.items_ex(resolve=False):
vc = copy.deepcopy(v)
vc._set_parent(_parent_)
ret.append(vc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, although I agree that it would be nice to support this, note that the current behavior is actually (at least partially) consistent with "interpolations are resolved on access": the reason it doesn't work in your example is because you compute cfg.test_models
only once, but if you re-access it after changing cfg.device
, you will get the updated version.
(just making sure that part is clear, doesn't change that I agree "deeper" dynamism would be better)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you tell me how such config nodes interact with OmegaConf.merge()?
Going to depend on the exact scenario (what are you merging into what?). Not sure it's worth spending too much time on it now -- people merging dynamically generated configs deserve whatever happens to them.
Anyway, I forced-pushed 2e604f9 to override my initial implementation with a new one that doesn't introduce any new low-level mechanics. An error referring to #650 is raised in case someone tries to extract values from a transient DictConfig that's not accessible through a path from the root. I'll fill in the content of #650 once we agree this is a good way to go.
(I force-pushed it as I assume it'd be easier for you to review, but let me know if you need the diff w.r.t. my initial implementation too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to depend on the exact scenario (what are you merging into what?). Not sure it's worth spending too much time on it now -- people merging dynamically generated configs deserve whatever happens to them.
Dynamically generating config nodes is an alternative to config composition.
Saying people using both deserves whatever happens is not a great approach.
If this feature is not compatible with merging we should be explicit about it and not leave it undefined.
In particular, if you are merging into an interpolating node, the node is expanded and inlined now:
In [4]: cfg = OmegaConf.create({"a": {"b": 10, "c": 20}, "z": "${a}"})
In [5]: cfg
Out[5]: {'a': {'b': 10, 'c': 20}, 'z': '${a}'}
In [6]: OmegaConf.merge(cfg, {"z": {"d": 30}})
Out[6]: {'a': {'b': 10, 'c': 20}, 'z': {'b': 10, 'c': 20, 'd': 30}}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In particular, if you are merging into an interpolating node, the node is expanded and inlined now:
I guess people deserve it to work the same way:
OmegaConf.register_new_resolver(
"make", lambda _parent_: OmegaConf.create({"b": 10, "c": 20}, parent=_parent_), replace=True)
cfg = OmegaConf.create({"z": "${make:}"})
OmegaConf.merge(cfg, {"z": {"d": 30}})
# {'z': {'b': 10, 'c': 20, 'd': 30}}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, can you add a test covering this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in e87784c (I also rebased on top of master)
0cc68f2
to
2e604f9
Compare
388f52a
to
e87784c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Went through it except the tests.
Some requests/questions inline.
Also, while playing with it, I found a bug:
from omegaconf import OmegaConf
cfg = OmegaConf.create(
{
"a": {
"b": {
"x1": 10,
},
"abs_key": "${oc.dict.values:a.b}",
"abs_dc": "${oc.dict.values:${a.b}}",
# "relative_key": "${oc.dict.values:.b}", # this one is breaking
"relative_dc": "${oc.dict.values:${.b}}",
},
"abs_key": "${oc.dict.values:a.b}",
"abs_dc": "${oc.dict.values:${a.b}}",
"relative_key": "${oc.dict.values:.a.b}",
"relative_dc": "${oc.dict.values:${.a.b}}",
}
)
print("=== unresolved\n\n", OmegaConf.to_yaml(cfg))
print("=== resolved\n\n", OmegaConf.to_yaml(cfg, resolve=True))
omegaconf/built_in_resolvers.py
Outdated
dict_key: Optional[str] = None | ||
if in_dict._get_root() is _root_: | ||
# Try to obtain the full key through which we can access `in_dict`. | ||
if in_dict is _root_: | ||
dict_key = "" | ||
else: | ||
dict_key = in_dict._get_full_key(None) | ||
if dict_key: | ||
dict_key += "." # append dot for future concatenation | ||
else: | ||
# This can happen e.g. if `in_dict` is a transient node. | ||
dict_key = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the root check is actually not enough.
We want two things:
- The root of the input DictConfig is the same as our root.
- Each node along the path is a child of it's parent.
cfg = OmegaConf.create({})
fake_child = OmegaConf.create({}, parent=cfg)
Even though fake_child is in the same root, an interpolation into fake_child is not possible.
I think we want to ensure that OmegaConf.select(root, dict_key)
is indeed able to select the in_dict object.
If this succeeds, we don't even need to validate that it's the same root - it goes without saying.
Maybe this is all you need:
dict_key = in_dict._get_full_key(None)
if OmegaConf.select(_root_, dict_key) is not in_dict:
# error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cfg = OmegaConf.create({}) fake_child = OmegaConf.create({}, parent=cfg)Even though fake_child is in the same root, an interpolation into fake_child is not possible.
But here fake_child._get_full_key(None)
is equal to ""
so it's properly detected as invalid.
I guess someone could create a fake child with a fake key as well, that couldn't be obtained from the parent, but at this point I think it's fair for some stuff to break.
OmegaConf.select(_root_, dict_key) is not in_dict
wouldn't actually work for dynamically generated DictConfigs (in addition to add extra overhead). See the test test_dict_values_dictconfig_resolver_output
for what I mean by that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But here fake_child._get_full_key(None) is equal to "" so it's properly detected as invalid.
I guess someone could create a fake child with a fake key as well, that couldn't be obtained from the parent, but at this point I think it's fair for some stuff to break.
It could also be a real node incorrectly reused by changing it's parent without copying it (leaving the whole config broken).
OmegaConf.select(root, dict_key) is not in_dict wouldn't actually work for dynamically generated DictConfigs (in addition to add extra overhead). See the test test_dict_values_dictconfig_resolver_output for what I mean by that.
Yes. It won't. and I think we should not support them now.
Select is just calling _get_node(key) iteratively with fragments of the key, it should be very efficient. (we only need to validate the input node, not all the generated interpolation nodes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Select is just calling _get_node(key) iteratively with fragments of the key, it should be very efficient. (we only need to validate the input node, not all the generated interpolation nodes).
Just flagging that select()
still has to resolve interpolations along the way (if any).
Maybe if we only support the "select" syntax, i.e. ${oc.dict.values:abc.def}
, then it will be easier since we directly know the correct path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
omegaconf/base.py
Outdated
# The interpolation is returning a transient (key-less) node attached | ||
# to the current parent. By setting its key to this node's key, we make | ||
# it possible to refer to it through an interpolation path. | ||
resolved._set_key(value._key()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When resolving an interpolation key and walking up the hierarchy, it is not enough that a node knows is key in the parent container.
the node should actually be in the _content_
of the parent (ListConfig or DictConfig).
I don't think this is actually achieving the goal (and I don't think this is actually a goal of this PR, I thought we agreed that we only support real DictConfig objects in oc.dict.{list,values}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's working for the use case I wanted to support, which is the one in test_dict_values_dictconfig_resolver_output()
.
We agreed not to support DictConfig objects that couldn't be accessed through interpolations. The specific change you quoted makes it possible to access dynamically generated DictConfig objects through interpolations.
Edit: re-reading what I wrote, that last sentence can be confusing / misleading. A better version (I think) would be: "The specific change you quoted makes it possible for resolvers like oc.dict.values
to know which full key to use to access dynamically generated input nodes through interpolations".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. I will take a closer look at this test to form an opinion (could be a few days).
Hold with any changes for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also be resolved by 27e59f8
omegaconf/built_in_resolvers.py
Outdated
if in_dict is _DEFAULT_SELECT_MARKER_: | ||
raise ConfigKeyError(f"Key not found: '{key}'") | ||
|
||
if not isinstance(in_dict, Mapping): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we limit to DictConfig nodes, don't forget to change this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was done in 27e59f8
I didn't think of relative interpolations with That being said, I couldn't get relative interpolations to work with from omegaconf import OmegaConf
cfg = OmegaConf.create({"a": {"b": 0}, "b": 1})
assert OmegaConf.select(cfg, ".b", default=0) is None
assert OmegaConf.select(cfg.a, ".b", default=0) is None
assert OmegaConf.select(cfg._get_node("a"), ".b", default=0) is None |
So, for now I just explicitly removed support for relative interpolations with the |
I looked at test_dict_values_dictconfig_resolver_output. I think oc.dict.values() should support only permanent config nodes in the same config tree. from omegaconf import OmegaConf, Node, DictConfig, Container
cfg = OmegaConf.create(({"a": {"b": 10}}))
def is_permanent_member_of(node: Node, root: Container) -> bool:
if node is root:
return True
parent = node._get_parent()
if parent is None:
return False
node_in_parent = parent._get_node(
node._key(),
throw_on_missing_value=False,
throw_on_missing_key=False,
validate_access=False,
)
if node_in_parent is not node:
return False
# could also be implemented iteratively.
return is_permanent_member_of(parent, root)
# positive cases
assert is_permanent_member_of(cfg, root=cfg)
assert is_permanent_member_of(cfg.a, root=cfg)
assert is_permanent_member_of(cfg.a._get_node("b"), root=cfg)
# negative cases
assert not is_permanent_member_of(DictConfig(content={}), root=cfg)
assert not is_permanent_member_of(DictConfig(content={}, parent=cfg), root=cfg)
assert not is_permanent_member_of(DictConfig(content={}, parent=cfg.a), root=cfg) As for the select bug, see #656. |
Ok, I'll change it in the next few days. |
988c238
to
27e59f8
Compare
This is still a WIP but @omry can you please have a look at 27e59f8 to tell me if you agree with the direction? I'll just wait for your green light before updating the tests & documentation. |
I think that's a good idea. |
Alright done in 5a1f49c |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. a few nits about docs and news fragment.
docs/source/usage.rst
Outdated
Some config options that are stored as a ``DictConfig`` may sometimes be easier to manipulate as lists, | ||
when we care only about the keys or the associated values. | ||
|
||
The resolvers ``oc.dict.keys`` and ``oc.dict.values`` simplify such operations by extracting respectively |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extracting sounds like it's creating a new list.
Maybe something along the lines of "by offering an alternative view of a dictionary's keys or values as a list"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extracting sounds like it's creating a new list.
It is creating a new list.
Maybe something along the lines of "by offering an alternative view of a dictionary's keys or values as a list"
Done in 70e61fc
news/643.feature
Outdated
New resolvers `oc.dict.keys` and `oc.dict.values` allow extracting the lists of keys and values of config nodes | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New resolvers `oc.dict.keys` and `oc.dict.values` allow extracting the lists of keys and values of config nodes | |
New resolvers `oc.dict.keys` and `oc.dict.values` provides a list view of the keys or values or a DictConfig node. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 8a65897 (up to minor typo fixes)
import os | ||
import warnings | ||
from typing import Any, List, Optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In retrospect, it would have been better to make the refactoring a standalone pull request.
With so much moved code, it makes it hard to review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it was initially meant to be a very simple PR easily reviewed by individual commits... I should have cleaned it up when it started getting more complex.
To make it easier I rebased to have the two refactoring commits as the first two commits, and I created a dummy PR on my fork where you can see the diff without these refactoring commits: https://github.com/odelalleau/omegaconf/pull/8/files
omegaconf/built_in_resolvers.py
Outdated
def dict_keys( | ||
key: str, | ||
_parent_: Container, | ||
) -> ListConfig: | ||
assert isinstance(_parent_, BaseContainer) | ||
|
||
in_dict = _get_and_validate_dict_input( | ||
key, parent=_parent_, resolver_name="oc.dict.keys" | ||
) | ||
|
||
ret = OmegaConf.create(list(in_dict.keys()), parent=_parent_) | ||
assert isinstance(ret, ListConfig) | ||
return ret | ||
|
||
|
||
def dict_values(key: str, _root_: BaseContainer, _parent_: Container) -> ListConfig: | ||
assert isinstance(_parent_, BaseContainer) | ||
in_dict = _get_and_validate_dict_input( | ||
key, parent=_parent_, resolver_name="oc.dict.values" | ||
) | ||
|
||
content = in_dict._content | ||
assert isinstance(content, dict) | ||
|
||
ret = ListConfig([]) | ||
for k, node in content.items(): | ||
ref_node = AnyNode(f"${{{key}.{k}}}") | ||
ret.append(ref_node) | ||
|
||
# Finalize result by setting proper type and parent. | ||
element_type: Any = in_dict._metadata.element_type | ||
ret._metadata.element_type = element_type | ||
ret._metadata.ref_type = List[element_type] | ||
ret._set_parent(_parent_) | ||
|
||
return ret |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, so much simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at it again, there was one more simplification to do: ccba9b8
5f51751
to
fe3a692
Compare
This pull request introduces 1 alert when merging ccba9b8 into e1a599f - view on LGTM.com new alerts:
|
No functional change
No functional change
* Can now take a string as input for convenience * The output is always a ListConfig, whose parent is the parent of the node being processed
Of particular note: * When the result of an interpolation is a node whose parent is the current node's parent, but has no key, then we set its key to the current node's key. This makes it possible to use its full key as an identifier. * _get_and_validate_dict_input() now properly raises an exception if the desired key does not exist
Co-authored-by: Omry Yadan <[email protected]>
Plus some additional code clean-up.
ccba9b8
to
d1a4ef7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shipit!
Fixes #643
Better check commits separately: the meaningful one is the second one, the other two are refactor commits with no functional changes.