Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request] Recursive class instantiation #419

Closed
OlivierDehaene opened this issue Feb 14, 2020 · 5 comments
Closed

[Feature Request] Recursive class instantiation #419

OlivierDehaene opened this issue Feb 14, 2020 · 5 comments
Labels
awaiting_response Awaiting response enhancement Enhanvement request

Comments

@OlivierDehaene
Copy link

OlivierDehaene commented Feb 14, 2020

🚀 Feature Request

Motivation

Related to Nested object instantiation [Feature Request]

Is your feature request related to a problem? Please describe.

One of the current limitation of the hydra.utils.instantiate(...) method it that you can't use it for nested instantiation.
Let's say you have:

class Foo:
    def __init__(self, my_int: int, my_bool: bool):
        ...

class Bar:
    def __init__(self, foos: List[Foo])
        ...

I would like to be able to instantiate Bar with the following config:

bar: 
  class: test.Bar
  params:
    foos:
      - class: test.Foo
        params:
          my_int: 5
          my_bool: false 
      - class: test.Foo
        params:
          my_int: 0
          my_bool: true 

Pitch

Describe the solution you'd like

One very simple solution would be to check the argument we give to clazz one by one. Something like this:

def check_argument(a: Any):
    if isinstance(a, dict) and 'class' in a.keys():
        return instantiate(OmegaConf.create(a))

    if isinstance(a, list):
        for i, value in enumerate(a):
            a[i] = check_argument(value)
    return a


def instantiate(config: DictConfig, *args: Any, **kwargs: Any) -> Any:
    import copy

    assert config is not None, "Input config is None"
    # copy config to avoid mutating it when merging with kwargs
    config_copy = copy.deepcopy(config)
    config_copy._set_parent(config._get_parent())
    config = config_copy

    try:
        clazz = get_class(config["class"])
        params = config.params if "params" in config else OmegaConf.create()
        assert isinstance(
            params, DictConfig
        ), "Input config params are expected to be a mapping, found {}".format(
            type(config.params)
        )
        params.merge_with(OmegaConf.create(kwargs))

        # Needed to avoid omegaconf.errors.UnsupportedValueType triggered by
        # params[name] = check_argument(arg)
        params = OmegaConf.to_container(params) 

        args = list(args)
        for i, arg in enumerate(args):
            args[i] = check_argument(arg)

        for name, arg in params.items():
            params[name] = check_argument(arg)

        return clazz(*args, **params)

    except Exception as e:
        log.error("Error instantiating {} : {}".format(config["class"], e))
        raise e

This solution is not ideal but it works.

Are you willing to open a pull request? (See CONTRIBUTING)

Yes. I'm willing to work on it but I don't know the codebase at all ATM.

Additional context

Another cool thing that could be added to this method is to add type checking to args and kwargs. It could be done by using inspect.signature(clazz.__init__) and then checking the arguments vs the returned annotations. This could also allow us to track if the class has default values for some of its arguments. Should I open another feature request for this?

@OlivierDehaene OlivierDehaene added the enhancement Enhanvement request label Feb 14, 2020
@OlivierDehaene
Copy link
Author

OlivierDehaene commented Feb 14, 2020

I only saw #229 after writing my issue. Maybe it is better to close this one and add the code to #229 ?

@omry
Copy link
Collaborator

omry commented Feb 15, 2020

Hi, this is an interesting solution. I am a bit torn about extending instantiate to support it because it feels like there is quiet a bit more surface area introduced by recursive instantiation that you solution is not touching.
for example, what if you want to recursively instantiate something in a Dict and not a List?

class: foo
params:
  a: 10
  b:
    class: bar
    params:
      z : 20

This opens up the question of supporting code that is already taking such structure and instantiates the inner variables manually.

Another thing that is not solvable with this approach is passing additional parameters to the nested objects.
For example, when instantiating a PyTorch optimizer, you need to pass in the parameters of the model.
This can be done by:
instantiate(cfg.optimizer, model.parameters()).
This is inherently incompatible with what you are suggesting.

I think this can be a second instantiate function, say:
recursive_instantiate() that will do this and only passthrough additional arguments to the first object.
This function should also support objects nested in dicts, not just lists.

If you think you can work on this I can tell you where to write the tests and we can figure out together what are enough tests to get this into Hydra.

About your other suggested idea:
this is interesting and very much related to what I am working on now on master.
This is almost ready, you can already read the tutorial in this PR.

We can talk about your other suggestion in the context of structured configs. I suspect something like would not be supported right now but it's worth taking a look at it.

@OlivierDehaene
Copy link
Author

for example, what if you want to recursively instantiate something in a Dict and not a List?

class: foo
params:
  a: 10
  b:
    c:
      class: bar
      params:
        z : 20
    d:
      class: bar
      params:
        z: 20

Yes this should be covered.

For example, when instantiating a PyTorch optimizer, you need to pass in the parameters of the model.
This can be done by:
instantiate(cfg.optimizer, model.parameters()).
This is inherently incompatible with what you are suggesting.

You could use the kwargs merge logic to pass the weights but it would not be ideal. Maybe if you could pass a flat dict to the instantiate function it would be easier. Something like this:

class Pipeline:
    def __init__(self, optimizer: torch.optim.Adam):
        ...
class: Pipeline
params:
  optimizer:
    class: torch.optim.Adam
    params:
      lr: 1e-3
      params: ???
merge_with = {'pipeline.optimizer.params': model.weights()}
hydra.utils.instantiate(cfg, **merge_with)

If you think you can work on this I can tell you where to write the tests and we can figure out together what are enough tests to get this into Hydra.

I would be happy to do that :)

About your other suggested idea:
this is interesting and very much related to what I am working on now on master.
This is almost ready, you can already read the tutorial in this PR.

We can talk about your other suggestion in the context of structured configs. I suspect something like would not be supported right now but it's worth taking a look at it.

I really like this new logic but I think it interacts with the instantiation it a weird manner. On one hand, you register your structured configs, but on the other hand, you don't register your class and are supposed to pass an import path in the config. I think you should be able to register classes directly. The is the system implemented in AllenNLP if you want to have a look. The logic can be found in:

@omry
Copy link
Collaborator

omry commented Feb 15, 2020

Structured configs are new territory and I didn't even consider how they interact with instantiate yet.
You always have the choice to use lax typing if you don't know enough about the types when defining the schema for the config.

Here is what I suggest:
There is no need for any of this to be a part of Hydra core. It can all be in user land or in a library.
what do you think about creating an experimental hydra plugin (generic plugin). to house the new logic?
It can live either in your own repo or inside the hydra repo (which will give it more visibility).
I will be happy to review it and make suggestions.
This will give you the freedom to try to build something generic without worrying too much about supporting the rest of the Hydra users from day 1. and it will also not need to be compatible with the current instantiate logic.

If that thing becomes very good and popular I can consider folding it into the core of Hydra.

@omry omry added the awaiting_response Awaiting response label Feb 18, 2020
@omry
Copy link
Collaborator

omry commented May 4, 2020

Created #566 to track this. (no concrete plans to implement yet).

@omry omry closed this as completed May 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting_response Awaiting response enhancement Enhanvement request
Projects
None yet
Development

No branches or pull requests

2 participants