OmegaConf.to_object: Instantiate structured configs #502

Merged
merged 90 commits into from
Apr 7, 2021
Changes from 78 commits
Commits
15d9abb
to_container: instantiate_structured_configs flag
Jasha10 Dec 15, 2020
c4b042a
add a failing test
Jasha10 Dec 16, 2020
fe47c89
fix typo
Jasha10 Dec 17, 2020
c91be99
fix bug
Jasha10 Dec 22, 2020
59ee909
another bugfix
Jasha10 Dec 23, 2020
14b75be
solved an issue
Jasha10 Dec 23, 2020
3354bad
add type assert
Jasha10 Dec 23, 2020
5cb06ed
updates
Jasha10 Dec 23, 2020
a3676e3
test instantiation of subclass of Dict[str, User]
Jasha10 Dec 23, 2020
c5f5e73
test instantiate_structured_configs-Str2UserWithField
Jasha10 Dec 23, 2020
abf8af0
fix bug with allow_objects flag
Jasha10 Dec 29, 2020
a54cb98
add comment
Jasha10 Jan 21, 2021
5fdd019
remove dependence on ref_type; use object_type
Jasha10 Jan 21, 2021
fc49cbd
use keyword args in call to _instantiate_structured_config_impl
Jasha10 Jan 21, 2021
64b3ed7
refactor for clearer control flow
Jasha10 Jan 21, 2021
41588c1
move method _instantiate_structured_config_impl
Jasha10 Jan 24, 2021
653a54e
fix lint/mypy errors
Jasha10 Jan 30, 2021
65497ee
remove unnecessary allow_objects flag
Jasha10 Jan 31, 2021
1640f8b
rename parameter 'instantiate_structured_configs' -> 'instantiate'
Jasha10 Feb 3, 2021
7e52b24
create OmegaConf.to_object alias for OmegaConf.to_container
Jasha10 Feb 3, 2021
8dfd397
One use case per test
Jasha10 Feb 3, 2021
68b1f74
coverage: use to_object(cfg) instead of to_container(object, instanti…
Jasha10 Feb 3, 2021
5b47049
rename tests: to_object instead of to_container
Jasha10 Feb 3, 2021
15324e9
tests: user str key instead of int key
Jasha10 Feb 3, 2021
375babf
tests: change 'assert ... is MISSING' -> 'assert ... == MISSING'
Jasha10 Feb 3, 2021
6422f39
add tests for object nested inside object
Jasha10 Feb 3, 2021
ecc05b1
one use case per tests: dict subclass
Jasha10 Feb 3, 2021
9d7addc
test_structured_config.py: consolidate instantiate=True tests
Jasha10 Feb 3, 2021
7136baa
finish rebase against master
Jasha10 Feb 14, 2021
841ac01
Move TestInstantiateStructuredConfigs to test_to_container.py
Jasha10 Feb 17, 2021
fa81a4d
Create get_structured_config_field_names function
Jasha10 Feb 17, 2021
872850b
OmegaConf.to_object: resolve=True by default
Jasha10 Feb 17, 2021
072e8d8
change _instantiate_structured_config_impl fn signature
Jasha10 Feb 17, 2021
1953a47
separate positive and negative test cases
Jasha10 Feb 23, 2021
91de025
merge updates from master
Jasha10 Feb 26, 2021
a4af7f5
switch order of cases in _instantiate_structured_config_impl
Jasha10 Feb 26, 2021
efcc93b
switch order of cases in _instantiate_structured_config_impl
Jasha10 Feb 26, 2021
f94ecbd
merge
Jasha10 Feb 26, 2021
5b51861
regroup tests for extracting structured config info
Jasha10 Feb 26, 2021
7e299cf
Merge branch 'master' into instantiate-structured-configs
Jasha10 Mar 4, 2021
19de3b4
Undo a stylistic change to tests/structured_conf/test_structured_conf…
Jasha10 Mar 4, 2021
84c00ac
add failing tests for throw if MISSING
Jasha10 Mar 4, 2021
488a4b3
Update omegaconf/_utils.py
Jasha10 Mar 11, 2021
bbbb245
fix mypy and flake8 issues
Jasha10 Mar 12, 2021
37e055f
implement MissingMandatoryValue in case of MISSING param to dataclass…
Jasha10 Mar 12, 2021
961b9fd
update a test to reflect new behavior r.e. MISSING
Jasha10 Mar 12, 2021
7f8addb
use correct-typed value in test of KeyValidationError
Jasha10 Mar 12, 2021
a10fa5b
modify to_object docstring
Jasha10 Mar 12, 2021
6b014ae
use a set for _instantiate_structured_config_impl field names
Jasha10 Mar 12, 2021
4cefc6d
remove redundant call to set()
Jasha10 Mar 12, 2021
b2a5ab2
refactor TestInstantiateStructuredConfigs
Jasha10 Mar 12, 2021
d28ae5d
TestInstantiateStructuredConfigs: remove redundant isinstance assertions
Jasha10 Mar 12, 2021
62f34cf
Use setattr(instance, k, v) when structured config has extra fields
Jasha10 Mar 12, 2021
249ac36
add news fragment
Jasha10 Mar 12, 2021
0869121
refactoring: rename variables
Jasha10 Mar 12, 2021
3a47132
Test error message for MissingMandatoryValue
Jasha10 Mar 13, 2021
46aadbe
Formatting: delete whitespace
Jasha10 Mar 13, 2021
c581548
include $OBJECT_TYPE in MissingMandatoryValue err msg
Jasha10 Mar 13, 2021
2cf460f
change _instantiate_structured_config_impl to an instance method
Jasha10 Mar 15, 2021
d6e9749
simplify `retdict` & `retstruct` to `ret`
Jasha10 Mar 15, 2021
f16ad22
rename `conf` -> `self` in _instantiate_structured_config_impl
Jasha10 Mar 15, 2021
2bf73b0
remove `resolve` arg from `to_object`
Jasha10 Mar 16, 2021
eb41a37
Docs example for SCMode.INSTANTIATE
Jasha10 Mar 17, 2021
30550bf
docs: OmegaConf.to_object example
Jasha10 Mar 17, 2021
0611c93
Docs minor edit
Jasha10 Mar 17, 2021
32f6c68
updates to to_object docs
Jasha10 Mar 17, 2021
1019df6
Revert test_structured_config.py (remove redundant test)
Jasha10 Mar 18, 2021
3fef7f0
dict subclass: DictConfig items become instance attributes
Jasha10 Mar 18, 2021
0019bca
Merge branch 'master' into instantiate-structured-configs
Jasha10 Mar 19, 2021
15a03ea
docs: use `show` instead of `print`/`assert`
Jasha10 Mar 19, 2021
f3171f2
minor doc fix
Jasha10 Mar 19, 2021
80284f4
docs: Improve introduction to `to_object` method
Jasha10 Mar 29, 2021
8f17b9a
docs: Remove explanation r.e. equivalent OmegaConf.to_container calls
Jasha10 Mar 29, 2021
e1e034a
docs: clarification on ducktyping
Jasha10 Mar 29, 2021
5045503
Merge branch 'master' into instantiate-structured-configs
Jasha10 Mar 29, 2021
29323a4
to_container docs: explicitly document the new SCMode.INSTANTIATE member
Jasha10 Mar 29, 2021
fe5df1d
update `to_object` docstring
Jasha10 Mar 29, 2021
a9a05ee
docs: fix typos
Jasha10 Mar 29, 2021
db09880
empty commit (to trigger CI workflow)
Jasha10 Mar 31, 2021
a17a11f
refactor test_SCMode
Jasha10 Apr 1, 2021
29a526b
lowercase test fn name (test_SCMode -> test_scmode)
Jasha10 Apr 1, 2021
c672c10
StructuredConfigs have resolve=True and enum_to_str=False
Jasha10 Apr 2, 2021
672b180
minor: revert whitespace addition
Jasha10 Apr 2, 2021
8beb52a
Edit to news/472.feature
Jasha10 Apr 6, 2021
4787e8d
don't mention enum_to_str
Jasha10 Apr 7, 2021
c1d13f8
formatting and title for structured_config_mode docs
Jasha10 Apr 7, 2021
bc2f610
remove TODO comment
Jasha10 Apr 7, 2021
f1a4270
fix comment formatting
Jasha10 Apr 7, 2021
b51d33f
move `import get_structured_config_field_names` to top of file
Jasha10 Apr 7, 2021
d12701e
one last formatting adjustment
Jasha10 Apr 7, 2021
29 changes: 29 additions & 0 deletions docs/source/usage.rst
@@ -776,6 +776,11 @@ Structured Config nodes using the ``structured_config_mode`` option.
 By default, Structured Config nodes are converted to plain dict.
 Using ``structured_config_mode=SCMode.DICT_CONFIG`` causes such nodes to remain
 as DictConfig, allowing attribute style access on the resulting node.
+Using ``structured_config_mode=SCMode.INSTANTIATE``, Structured Config nodes
+are converted to instances of the backing dataclass or attrs class. Note that
+typically ``structured_config_mode=SCMode.INSTANTIATE`` makes the most sense
+when combined with ``resolve=True``, so that interpolations are resolved before
+being used to instantiate dataclass/attr class instances.

 .. doctest::

@@ -788,6 +793,30 @@ as DictConfig, allowing attribute style access on the resulting node.
     >>> show(container["structured_config"])
     type: DictConfig, value: {'port': 80, 'host': 'localhost'}

+OmegaConf.to_object
+^^^^^^^^^^^^^^^^^^^^^^
+The ``OmegaConf.to_object`` method recursively converts DictConfig and ListConfig objects
+into dicts and lists, with the exception that Structured Config objects are
+converted into instances of the backing dataclass or attr class. All OmegaConf
+interpolations are resolved before conversion to Python containers.
+
+.. doctest::
+
+    >>> container = OmegaConf.to_object(conf)
+    >>> show(container)
+    type: dict, value: {'structured_config': MyConfig(port=80, host='localhost')}
+    >>> show(container["structured_config"])
+    type: MyConfig, value: MyConfig(port=80, host='localhost')
+
+Note that here, ``container["structured_config"]`` is actually an instance of
+``MyConfig``, whereas in the previous examples we had a ``dict`` or a
+``DictConfig`` object that was duck-typed to look like an instance of
+``MyConfig``.
+
+The call ``OmegaConf.to_object(conf)`` is equivalent to
+``OmegaConf.to_container(conf, resolve=True,
+structured_config_mode=SCMode.INSTANTIATE)``.
+
 OmegaConf.resolve
 ^^^^^^^^^^^^^^^^^
 .. code-block:: python
1 change: 1 addition & 0 deletions news/472.feature
@@ -0,0 +1 @@
+Add the OmegaConf.to_object method, which converts Structured Config objects back to native dataclasses or attrs classes.
19 changes: 19 additions & 0 deletions omegaconf/_utils.py
@@ -201,6 +201,12 @@ def _resolve_forward(type_: Type[Any], module: str) -> Type[Any]:
     return type_


+def get_attr_class_field_names(obj: Any) -> List[str]:
+    is_type = isinstance(obj, type)
+    obj_type = obj if is_type else type(obj)
+    return list(attr.fields_dict(obj_type))
+
+
 def get_attr_data(obj: Any, allow_objects: Optional[bool] = None) -> Dict[str, Any]:
     from omegaconf.omegaconf import OmegaConf, _maybe_wrap

@@ -240,6 +246,10 @@ def get_attr_data(obj: Any, allow_objects: Optional[bool] = None) -> Dict[str, A
     return d


+def get_dataclass_field_names(obj: Any) -> List[str]:
+    return [field.name for field in dataclasses.fields(obj)]
+
+
 def get_dataclass_data(
     obj: Any, allow_objects: Optional[bool] = None
 ) -> Dict[str, Any]:
@@ -332,6 +342,15 @@ def is_structured_config_frozen(obj: Any) -> bool:
     return False


+def get_structured_config_field_names(obj: Any) -> List[str]:
+    if is_dataclass(obj):
+        return get_dataclass_field_names(obj)
+    elif is_attr_class(obj):
+        return get_attr_class_field_names(obj)
+    else:
+        raise ValueError(f"Unsupported type: {type(obj).__name__}")
+
+
 def get_structured_config_data(
     obj: Any, allow_objects: Optional[bool] = None
 ) -> Dict[str, Any]:
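
The dataclass branch of the field-name helpers above can be tried out with the standard library alone. The following is a minimal sketch, not the PR's code: `field_names` and `User` are hypothetical names, and the attrs branch is left out because it needs the third-party `attr` package.

```python
import dataclasses
from typing import Any, List


@dataclasses.dataclass
class User:
    name: str = "Bond"
    age: int = 7


def field_names(obj: Any) -> List[str]:
    """Return the declared field names of a dataclass type or instance."""
    if not dataclasses.is_dataclass(obj):
        raise ValueError(f"Unsupported type: {type(obj).__name__}")
    # dataclasses.fields accepts both the class itself and an instance of it.
    return [f.name for f in dataclasses.fields(obj)]


names_from_type = field_names(User)
names_from_instance = field_names(User())
```

Both calls return the names in declaration order, which is why the same helper works whether the config's `object_type` or an instance of it is passed in.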
3 changes: 2 additions & 1 deletion omegaconf/base.py
@@ -728,5 +728,6 @@ def _has_ref_type(self) -> bool:


 class SCMode(Enum):
-    DICT = 1  # convert to plain dict
+    DICT = 1  # Convert to plain dict
     DICT_CONFIG = 2  # Keep as OmegaConf DictConfig
+    INSTANTIATE = 3  # Create a dataclass or attrs class instance
41 changes: 37 additions & 4 deletions omegaconf/basecontainer.py
@@ -214,7 +214,7 @@ def convert(val: Node) -> Any:
             ):
                 return conf

-            retdict: Dict[str, Any] = {}
+            ret: Any = {}
             for key in conf.keys():
                 node = conf._get_node(key)
                 assert isinstance(node, Node)
@@ -225,15 +225,20 @@ def convert(val: Node) -> Any:
                 if enum_to_str and isinstance(key, Enum):
                     key = f"{key.name}"
                 if isinstance(node, Container):
-                    retdict[key] = BaseContainer._to_content(
+                    ret[key] = BaseContainer._to_content(
                         node,
                         resolve=resolve,
                         enum_to_str=enum_to_str,
                         structured_config_mode=structured_config_mode,
                     )
                 else:
-                    retdict[key] = convert(node)
-            return retdict
+                    ret[key] = convert(node)
+
+            if structured_config_mode == SCMode.INSTANTIATE and is_structured_config(
+                conf._metadata.object_type
+            ):
+                ret = conf._instantiate_structured_config_impl(instance_data=ret)
+            return ret
         elif isinstance(conf, ListConfig):
             retlist: List[Any] = []
             for index in range(len(conf)):
@@ -257,6 +262,34 @@ def convert(val: Node) -> Any:

         assert False

+    def _instantiate_structured_config_impl(self, instance_data: Dict[str, Any]) -> Any:
+        """Instantiate an instance of `self._metadata.object_type`, populated by `instance_data`."""
+        from ._utils import get_structured_config_field_names
+
+        object_type = self._metadata.object_type
+        object_type_field_names = set(get_structured_config_field_names(object_type))
+
+        field_items: Dict[str, Any] = {}
+        nonfield_items: Dict[str, Any] = {}
+        for k, v in instance_data.items():
+            if _is_missing_literal(v):
Owner

Wouldn't it make more sense to iterate on the content of self and not to pass instance_data at all?

Collaborator Author

I don't think that would work. The values of self are subconfig objects that are not yet converted to Python primitives (possibly DictConfig or ListConfig, and possibly containing unresolved interpolations). The values of instance_data are already resolved and converted to dict or list or to dataclass/attrclass instances.

Owner

The data passed in is derived from the content of self, which is why I am asking it.
It feels redundant.
You have custom code here that is handling things, it can resolve interpolations and deal with containers.

Am I missing anything?

Collaborator Author

> The data passed in is derived from the content of self, which is why I am asking it.

This is true.

> It feels redundant.

Take a look at how instance_data is calculated in the calling function (DictConfig._to_content)... The calculation is nontrivial, and it depends on several things that are defined in the scope of the calling function.

Collaborator Author

Since the calling function has to perform this calculation anyway (for the case where structured_config_mode == SCMode.DICT), I think it makes sense to reuse the calculation by passing in instance_data for the structured_config_mode == SCMode.INSTANTIATE case.

Owner
@omry, Apr 2, 2021

It will not apply resolve and enum_to_str to the instantiated objects (but will apply them to values in unstructured config containers).

Collaborator Author

Sure, makes sense to me.
So:

  • the structured config gets processed with enum_to_str=False and resolve=False,
  • nested configs inside the structured config get processed with enum_to_str=False and resolve=False, and
  • non-structured parent configs have enum_to_str and resolve according to keyword args chosen by the client.

Owner

> Sure, makes sense to me.
> So:
>
>   • the structured config gets processed with enum_to_str=False and resolve=False,

you mean resolve=True, right?

>   • nested configs inside the structured config get processed with enum_to_str=False and resolve=False, and
>   • non-structured parent configs have enum_to_str and resolve according to keyword args chosen by the client.

Since a Structured Config may contain an unstructured DictConfig, we could consider inheriting the resolve flag for them from the surrounding call, but I am not sure it's worth the added complexity.

In that case, the signature of to_object() will look something like:

def to_object(self, resolve_nested_configs: bool, enum_to_str_nested_configs: bool) -> Any:
  ...

An alternative is to just say that everything under a Structured Config is always resolved (and enums are not converted to strings).

I think I prefer the second option.

Collaborator Author

> you mean resolve=True, right?

Yes.

> I think I prefer the second option.

Me too. I'll get started on the diff.

Collaborator Author

Done in c672c10.

+                self._format_and_raise(
+                    key=k,
+                    value=None,
+                    cause=MissingMandatoryValue(
+                        "Structured config of type `$OBJECT_TYPE` has missing mandatory value: $KEY"
+                    ),
+                )
+            if k in object_type_field_names:
+                field_items[k] = v
+            else:
+                nonfield_items[k] = v
+
+        result = object_type(**field_items)
+        for k, v in nonfield_items.items():
+            setattr(result, k, v)
+        return result
+
     def pretty(self, resolve: bool = False, sort_keys: bool = False) -> str:
         from omegaconf import OmegaConf

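
The instantiation step in `_instantiate_structured_config_impl` can be illustrated with plain dataclasses. This is a simplified sketch under stated assumptions, not OmegaConf's implementation: `instantiate`, `MISSING_LITERAL`, and `User` are hypothetical stand-ins, and a plain `ValueError` takes the place of `MissingMandatoryValue`.

```python
import dataclasses
from typing import Any, Dict, Type

# Assumption: a stand-in for OmegaConf's missing-value marker.
MISSING_LITERAL = "???"


@dataclasses.dataclass
class User:
    name: str
    age: int


def instantiate(object_type: Type[Any], instance_data: Dict[str, Any]) -> Any:
    """Build an object_type instance from already-converted values."""
    declared = {f.name for f in dataclasses.fields(object_type)}
    field_items: Dict[str, Any] = {}
    nonfield_items: Dict[str, Any] = {}
    for key, value in instance_data.items():
        if isinstance(value, str) and value == MISSING_LITERAL:
            raise ValueError(
                f"{object_type.__name__} has missing mandatory value: {key}"
            )
        if key in declared:
            field_items[key] = value
        else:
            nonfield_items[key] = value
    # Declared fields go through the constructor; extra keys become attributes.
    result = object_type(**field_items)
    for key, value in nonfield_items.items():
        setattr(result, key, value)
    return result


user = instantiate(User, {"name": "Bond", "age": 7, "codename": "007"})
```

Splitting the keys this way keeps the constructor call valid even when the config carries keys that are not declared fields, which is the situation the dict-subclass test classes in this PR exercise.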
27 changes: 27 additions & 0 deletions omegaconf/omegaconf.py
@@ -580,6 +580,9 @@ def to_container(
         :param structured_config_mode: Specify how Structured Configs (DictConfigs backed by a dataclass) are handled.
             By default (`structured_config_mode=SCMode.DICT`) structured configs are converted to plain dicts.
             If `structured_config_mode=SCMode.DICT_CONFIG`, structured config nodes will remain as DictConfig.
+            If `structured_config_mode=SCMode.INSTANTIATE`, this function will instantiate structured configs
+            (DictConfigs backed by a dataclass), by creating an instance of the underlying dataclass.
+            See also OmegaConf.to_object.
         :return: A dict or a list representing this config as a primitive container.
         """
         if not OmegaConf.is_config(cfg):
@@ -594,6 +597,30 @@ def to_container(
             structured_config_mode=structured_config_mode,
         )

+    @staticmethod
+    def to_object(
+        cfg: Any,
+        *,
+        enum_to_str: bool = False,
+    ) -> Union[Dict[DictKeyType, Any], List[Any], None, str, Any]:
+        """
+        Recursively converts an OmegaConf config to a primitive container (dict or list).
+        Any DictConfig objects backed by dataclasses or attrs classes are instantiated
+        as instances of those backing classes.
+
+        This is an alias for OmegaConf.to_container(..., resolve=True, structured_config_mode=SCMode.INSTANTIATE)
+
+        :param cfg: the config to convert
+        :param enum_to_str: True to convert Enum values to strings
+        :return: A dict or a list or dataclass representing this config.
+        """
+        return OmegaConf.to_container(
+            cfg=cfg,
+            resolve=True,
+            enum_to_str=enum_to_str,
+            structured_config_mode=SCMode.INSTANTIATE,
+        )
+
     @staticmethod
     def is_missing(cfg: Any, key: DictKeyType) -> bool:
         assert isinstance(cfg, Container)
4 changes: 4 additions & 0 deletions tests/structured_conf/data/attr_classes.py
@@ -440,6 +440,10 @@ class Str2StrWithField(Dict[str, str]):
     class Str2IntWithStrField(Dict[str, int]):
         foo: int = 1

+    @attr.s(auto_attribs=True)
+    class Str2UserWithField(Dict[str, User]):
+        foo: User = User("Bond", 7)
+
     class Error:
         @attr.s(auto_attribs=True)
         class User2Str(Dict[User, str]):
4 changes: 4 additions & 0 deletions tests/structured_conf/data/dataclasses.py
@@ -461,6 +461,10 @@ class Str2StrWithField(Dict[str, str]):
     class Str2IntWithStrField(Dict[str, int]):
         foo: int = 1

+    @dataclass
+    class Str2UserWithField(Dict[str, User]):
+        foo: User = User("Bond", 7)
+
     class Error:
         @dataclass
         class User2Str(Dict[User, str]):
12 changes: 12 additions & 0 deletions tests/test_errors.py
@@ -1234,6 +1234,18 @@ def finalize(self, cfg: Any) -> None:
         ),
         id="list,readonly:del",
     ),
+    # to_object
+    param(
+        Expected(
+            create=lambda: OmegaConf.structured(User),
+            op=lambda cfg: OmegaConf.to_object(cfg),
+            exception_type=MissingMandatoryValue,
+            msg="Structured config of type `User` has missing mandatory value: name",
+            key="name",
+            child_node=lambda cfg: cfg._get_node("name"),
+        ),
+        id="to_object:structured-missing-field",
+    ),
 ]