Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new resolver oc.create #677

Merged
merged 15 commits into from
Apr 16, 2021
Merged

Add new resolver oc.create #677

merged 15 commits into from
Apr 16, 2021

Conversation

odelalleau
Copy link
Collaborator

@odelalleau odelalleau commented Apr 12, 2021

Probably best reviewed commit by commit:

  • abc60ae makes it so that dictionaries and lists output by resolvers aren't converted into DictConfig and ListConfig automatically anymore (something we discussed a while back). It breaks a few things which is why it's not an independent PR by itself (tests marked as broken in this commit are fixed in the next one). It also removes the readonly flag from transient nodes generated when resolving interpolations (its main use was for containers and having it on other types of node was more a side effect than an important feature)
  • f5fb0cc adds the oc.create resolver which makes it possible again to easily create DictConfigs and ListConfigs from dicts/lists
  • 71b615e removes the readonly flag from containers created with oc.create, because I realized it would otherwise prevent people from merging configs easily. I made it a separate commit because there may be other alternatives (ex: making merge work with readonly configs?)

Fixes #645

Copy link
Owner

@omry omry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review for first commit in stack.

With ``oc.decode``, strings can be converted into their corresponding data types using the OmegaConf grammar.
This grammar recognizes typical data types like ``bool``, ``int``, ``float``, ``dict`` and ``list``,
e.g. ``"true"``, ``"1"``, ``"1e-3"``, ``"{a: b}"``, ``"[a, b, c]"``.
It will also resolve interpolations like ``"${foo}"``, returning the corresponding value of the node.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

${v} will get resolved for any resolver. Are you thinking of something else?

v: "hello"
a: ${oc.decode: ${v}}
b: ${oc.deprecated(v, ${v}}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm thinking of the string "${foo}" being given as input to oc.decode. It's best seen in the code example below when we set

os.environ["DB_TIMEOUT"] = "${.port}"

However, it's tricky to explain succinctly here. Maybe I just shouldn't mention it?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I think we can omit it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in 01ddb95

docs/source/usage.rst Outdated Show resolved Hide resolved
Comment on lines 442 to 445
- When providing as input to ``oc.decode`` a string that is meant to be decoded into another string, in general
the input string should be quoted (since only a subset of characters are allowed by the grammar in unquoted
strings). For instance, a proper string interpolation could be: ``"'Hi! My name is: ${name}'"`` (with extra quotes).
- ``None`` (written as ``null`` in the grammar) is the only valid non-string input to ``oc.decode`` (returning ``None`` in that case)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolving the string interpolation "'Hi! My name is: ${name}'" is not a function of oc.decode. it would get resolved for any resolver.
e.g:

cfg = OmegaConf.create(
    {
        "msg": "E.T. call home",
        "a": 10,
        "b": {
            "c": "${oc.deprecated:a, 'message: ${msg}'}",
        },
    }
)
print(cfg.b.c)

Output:

$  python 1.py 
/home/omry/dev/omegaconf/omegaconf/resolvers/oc/__init__.py:83: UserWarning: message: E.T. call home
  warnings.warn(category=UserWarning, message=msg)
10

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I can see how this can be confusing. I'm talking about the string being provided as input to oc.decode. When you write:

"c": "${oc.deprecated:a, 'message: ${msg}'}",

The string provided to oc.deprecated isn't "message: ${msg}", it is "message: E.T. call home".

In general strings provided as input to oc.decode are much more likely to come from resolvers than to be written explicitly in the config (because if you can write it explicitly then there is no need for oc.decode). That's why my example also uses oc.env.

I'll need to think a bit more about it to try and figure out how to make it understandable in the doc. Suggestions welcome ;)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we don't mention it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also removed in 01ddb95

Comment on lines 561 to 562
# Primitive types get wrapped using the same logic as when setting the
# value of a node (i.e., through `_node_wrap()`).
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

        # Primitive types get wrapped using `_node_wrap()`, ensuring value is validated and potentially converted.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in b3cc6ab

Comment on lines 572 to 576
except ValidationError: # pragma: no cover
# This is not supposed to happen because primitive types that must
# be wrapped should have already been validated inside
# `_validate_and_convert_interpolation_result()`.
assert False
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's not supposed to happen maybe just let it fly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 348e2ae

assert isinstance(c.x, DictConfig)
assert c.x == expected
assert dereference_node(c, "x")._get_flag("readonly")
c._set_flag("readonly", readonly)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OmegaConf.set_readonly(c, readonly)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in ac26495

restore_resolvers: Any, cfg: Dict[str, Any], expected: Dict[str, Any]
) -> None:
OmegaConf.register_new_resolver("dict", lambda: cfg)
@mark.parametrize("readonly", [True, False, None])
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can go with True and False only. I know that in other places we are also testing None but it's probably redundant in most of them.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 55f8041

tests/interpolation/test_custom_resolvers.py Show resolved Hide resolved
tests/interpolation/test_custom_resolvers.py Outdated Show resolved Hide resolved
Comment on lines 411 to 400
id="dict_nested",
MissingList(list=SI("${identity:[0, 1, 2]}")),
"list",
[0, 1, 2],
ListConfig,
id="list_int_to_str",
),
param(
MissingDict(dict=SI("${identity:{a: 0, b: 1}}")),
"dict",
{"a": 0, "b": 1},
DictConfig,
id="dict_int_to_str",
),
param(
SubscriptedList(list=SI("${identity:[a, b]}")),
"list",
["a", "b"],
ListConfig,
id="list_type_mismatch",
),
param(
MissingDict(dict=SI("${identity:{0: b, 1: d}}")),
"dict",
{0: "b", 1: "d"},
DictConfig,
id="dict_key_type_mismatch",
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

confused by those. why are they not expecting list and dict objects?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a shortcut here as I was planning to keep the DictConfig / ListConfig but only after introducing oc.create, so in this commit the test is partially broken (I commented out the type check) and it's fixed in the next commit. Sorry it's a bit confusing.

)
def test_dict_values_transient_interpolation(
@mark.parametrize("dict_func", ["oc.dict.values", "oc.dict.keys"])
def test_extract_from_dict_resolver_output(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what this function is testing from its name.
Is this new test related to the changes in this PR?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name means "test what happens when you try to extract [keys / values] from a dictionary which is the output of a resolver".

The relation with this PR is that before this commit, in this test y would be a DictConfig and thus you would be able to do ${oc.dict.keys:y} or ${oc.dict.values:y}.
But now y is a plain Python dictionary which isn't supported by oc.dict.{keys,values}.

This is essentially the replacement for the previous test test_dict_values_transient_interpolation[plain_dict] (which was poorly named because the interpolation isn't transient anymore, it was legacy from an earlier version)

The other previous test test_dict_values_transient_interpolation[dictconfig_with_parent] is gone because it seemed too redundant with test_dict_values_dictconfig_resolver_output[basic] to be worth keeping it.

They were being converted into DictConfig/ListConfig within
`_node_wrap()`. Now we only call `_node_wrap()` on primitive types,
while other types are stored within an `AnyNode` with the
`allow_objects` flag set to True.
The motivation is that otherwise it prevents merging the config with
another one.
@odelalleau
Copy link
Collaborator Author

I rebased on top of master -- the first commit that you had already reviewed hasn't changed much (I just had to apply the doc changes to the new location)

@odelalleau odelalleau requested a review from omry April 16, 2021 14:09
tests/test_nodes.py Show resolved Hide resolved
Comment on lines 209 to 225
``oc.create`` may be used for dynamic generation of config nodes
(typically from Python ``dict`` / ``list`` objects or YAML strings, similar to :ref:`OmegaConf.create<creating>`).
The following example combines ``oc.create`` with ``oc.decode`` and ``oc.env`` to generate
a sub-config from an environment variable:

.. doctest::

>>> cfg = OmegaConf.create(
... {
... "model": "${oc.create:${oc.decode:${oc.env:MODEL}}}",
... }
... )
>>> os.environ["MODEL"] = "{name: my_model, layer_size: [100, 200]}"
>>> show(cfg.model.layer_size)
type: ListConfig, value: [100, 200]


Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit much (using 3 resolvers to explain 1).
Maybe something like this?

OmegaConf.register_new_resolver("identity", lambda x: x)
cfg = OmegaConf.create(
    {
        "plain_dict": "${identity:{a:10}}",
        "dict_config": "${oc.create:{a:10}}",
        "dict_config_from_env": "${oc.create:${oc.env:YAML_ENV}}",
    }
)

os.environ["YAML_ENV"] = "A: 10\nb: 20\nC: ${.A}" 
show(cfg.plain_dict)
show(cfg.dict_config)
show(cfg.dict_config_from_env)
# interpolations works because this is a DictConfig
print(cfg.dict_config_from_env.C)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 8107676 (with a few small changes)

docs/source/custom_resolvers.rst Outdated Show resolved Hide resolved
docs/source/usage.rst Show resolved Hide resolved
)
def test_dict_values_transient_interpolation(
@mark.parametrize("dict_func", ["oc.dict.values", "oc.dict.keys"])
def test_extract_from_dict_resolver_output(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def test_extract_from_dict_resolver_output(
def test_extract_from_plain_dict_unsupported(

Also, I think we have decided that the only supported input the string keys.
dunno if it makes sense to test something so intricate now that the behavior is so simple. I vote to just remove this test as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not particularly attached to this test => removed in 9050018
(note: it was still using a string key as input, it's just that the corresponding key was a plain dict -- I think using a resolver is the only way to get a plain dict in a config)

Comment on lines 349 to 372
@mark.parametrize("readonly", [True, False])
def test_resolver_output_dict(restore_resolvers: Any, readonly: bool) -> None:
some_dict = {"a": 0, "b": "${y}"}
OmegaConf.register_new_resolver("dict", lambda: some_dict)
c = OmegaConf.create({"x": "${dict:}", "y": -1})
assert isinstance(c.x, DictConfig)
assert c.x == expected
assert dereference_node(c, "x")._get_flag("readonly")


@mark.parametrize(
("cfg", "expected"),
[
([0, 1], [0, 1]),
(["${y}"], [-1]),
([0, "${x.0}"], [0, 0]),
([0, "${.0}"], [0, 0]),
(["${..y}"], [-1]),
],
)
def test_resolver_output_list_to_listconfig(
restore_resolvers: Any, cfg: List[Any], expected: List[Any]
) -> None:
OmegaConf.register_new_resolver("list", lambda: cfg)
OmegaConf.set_readonly(c, readonly)
assert isinstance(c.x, dict)
assert c.x == some_dict
x_node = dereference_node(c, "x")
assert isinstance(x_node, AnyNode)
assert x_node._get_flag("allow_objects")


@mark.parametrize("readonly", [True, False])
def test_resolver_output_list(restore_resolvers: Any, readonly: bool) -> None:
some_list = ["a", 0, "${y}"]
OmegaConf.register_new_resolver("list", lambda: some_list)
c = OmegaConf.create({"x": "${list:}", "y": -1})
assert isinstance(c.x, ListConfig)
assert c.x == expected
assert dereference_node(c, "x")._get_flag("readonly")
OmegaConf.set_readonly(c, readonly)
assert isinstance(c.x, list)
assert c.x == some_list
x_node = dereference_node(c, "x")
assert isinstance(x_node, AnyNode)
assert x_node._get_flag("allow_objects")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think those two tests can be combined into one test using identity to pass through dict and list from the parameterization.

Copy link
Collaborator Author

@odelalleau odelalleau Apr 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point, done in e4503d3 (I didn't use identity though because it would be a bit more cumbersome and harder to read IMO, it's easier to register a resolver that outputs the dict/list we want)

@odelalleau odelalleau requested a review from omry April 16, 2021 20:41
Copy link
Owner

@omry omry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@odelalleau odelalleau merged commit 3f235ab into omry:master Apr 16, 2021
@odelalleau odelalleau deleted the oc_create branch April 16, 2021 21:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add new resolver oc.create
2 participants