OmegaConf.select is relative to the node it's called on #660

Merged
merged 6 commits into from
Apr 5, 2021

Conversation

omry
Owner

@omry omry commented Apr 5, 2021

The support for relative keys in select added in #656 was a breaking change.
In 2.0, when select is called on a nested node, the key is already interpreted relative to that node.
#656 made that select absolute, which changes the result for existing callers.

A non breaking fix is to keep nested selects relative, but add support for relative syntax (specifically for going up the hierarchy).
The resulting behavior is shown below and can be a bit surprising:

from omegaconf import OmegaConf

cfg = OmegaConf.create(
    {
        "a": {
            "b": {
                "c": 10,
            }
        },
        "z": 10,
    },
)
# works on 2.0
assert OmegaConf.select(cfg.a, "") == {"b": {"c": 10}}
assert OmegaConf.select(cfg.a, "b") == {"c": 10}
# new behavior, going up one level (as an example):
assert OmegaConf.select(cfg.a, ".") == {"a": {"b": {"c": 10}}, "z": 10}
assert OmegaConf.select(cfg.a, ".a") == {"b": {"c": 10}}
assert OmegaConf.select(cfg.a, ".z") == 10

In most scenarios, going up a level requires two dots, but since OmegaConf.select on a nested node is already relative to that node, only one dot is needed.

This PR implements the non-breaking behavior.
An alternative is to make this breaking (the behavior from #656).

The non-breaking behavior will look odd inside custom resolvers that use select as an alternative way to access nodes. The example below uses a planned oc.select resolver that would just call OmegaConf.select:

a: 10
foo:
  a: 20
  b: ${oc.select: .a} # relative to foo, one level up: 10
  b: ${oc.select: a} # relative to foo: 20
  # as opposed to:
  c: ${a}  # absolute: 10
  c: ${.a}  # relative: 20

I am not particularly happy with this.
On the one hand, select being relative to the node it's called on is intuitive and is the current behavior.
On the other hand, it's inconsistent with interpolations.
Thoughts?

I am planning to merge this because it fixes an unintentional breaking change, but let's decide whether we actually want to make this an intentional breaking change (at which point users will need to use relative select syntax to get relative behavior, like in interpolations).

EDIT:
The final solution is different from anything above.
Read the code and look at the tests to understand it.

@omry omry requested review from odelalleau and Jasha10 April 5, 2021 00:19
@Jasha10
Collaborator

Jasha10 commented Apr 5, 2021

On the other hand, it's inconsistent with interpolations.
Thoughts?

I think behavior for select should be consistent with interpolations, that is, I like the behavior from #656.

@Jasha10
Collaborator

Jasha10 commented Apr 5, 2021

One alternative (for both select and for interpolations) could be to emulate the convention used by filesystems, where all paths are relative to $PWD by default, and some leading prefix like "/" is used to signal that a path is absolute.

Either way, my opinion is that select should behave in the same way as interpolations :)

@omry
Owner Author

omry commented Apr 5, 2021

One alternative (for both select and for interpolations) could be to emulate the convention used by filesystems, where all paths are relative to $PWD by default, and some leading prefix like "/" is used to signal that a path is absolute.

Either way, my opinion is that select should behave in the same way as interpolations :)

Changing from absolute by default to relative by default is a huge breaking change that will break all interpolations out there.
I don't think it's worth it.

Either way, my opinion is that select should behave in the same way as interpolations :)
So you are proposing this:

from omegaconf import OmegaConf

cfg = OmegaConf.create(
    {
        "a": {
            "b": {
                "c": 10,
            }
        },
        "z": 10,
    },
)
assert OmegaConf.select(cfg.a, "a") == {"b": {"c": 10}}
assert OmegaConf.select(cfg.a, ".b") == {"c": 10}

Another option is to allow select to operate in both modes via another flag.
Keep the default behavior as it is, and use nested_relative_key=True when we want the interpolation behavior (which makes sense (only?) when used in custom resolvers).

@Jasha10
Collaborator

Jasha10 commented Apr 5, 2021

Changing from absolute by default to relative by default is a huge breaking change that will break all interpolations out there.
I don't think it's worth it.

Ok.

Either way, my opinion is that select should behave in the same way as interpolations :)
So you are proposing this:
...

Yes, exactly.

My motivation for the proposal was the example you gave:

a: 10
foo:
  a: 20
  b: ${oc.select: .a} # relative to foo, one level up: 10
  b: ${oc.select: a} # relative to foo: 20
  # as opposed to:
  c: ${a}  # absolute: 10
  c: ${.a}  # relative: 20

I thought this was confusing, because b: ${oc.select: .a} means something different from b: "${.a}".
So that the API is simple / consistent, my feeling is those two things should point to the same place.

Another option is to allow select to operate in both modes by another flag.

Could work.

@Jasha10
Collaborator

Jasha10 commented Apr 5, 2021

Ok, here is one more idea:

Have select be relative by default, but such that b: ${oc.select: .a} points to the same place as b: "${.a}".
I'll try to come up with an example of what this might look like to see if it makes any sense...

Edit:
I think part of my confusion is because OmegaConf.select takes two arguments, but in your example with b: ${oc.select: .a} there is only one explicit argument.

@omry omry force-pushed the select_nested_is_relative branch from a33298d to 470c7e3 Compare April 5, 2021 19:54
@omry
Owner Author

omry commented Apr 5, 2021

Okay, the ambiguity cannot be resolved based on the key alone:

cfg = OmegaConf.create({"a": {"b" : 10}, "b": 20})
OmegaConf.select(cfg.a, "b") # which b is it?

I added a parameter to select, absolute_key, that changes the interpretation of absolute keys like b above.
By default, absolute keys are relative to the config node passed into select (cfg.a above).
With absolute_key=True, absolute keys become relative to the config root (this is how interpolations behave).
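
A minimal sketch of that rule, using plain nested dicts rather than OmegaConf itself. The helper `toy_select` and the idea of identifying a node by its path from the root are assumptions made for illustration, not the library's API. The handling of leading dots (one dot means the node itself, each extra dot goes up one parent) is also an assumption, inferred from the review snippet and the examples in this thread.

```python
from typing import Any, Tuple


def toy_select(
    root: dict,
    node_path: Tuple[str, ...],
    key: str,
    absolute_key: bool = False,
) -> Any:
    """Toy model of the key handling discussed here (not OmegaConf code).

    root       nested dict standing in for the whole config
    node_path  path from the root to the node select is called on
    key        dotted key, optionally prefixed with dots
    """
    dots = len(key) - len(key.lstrip("."))
    rest = [part for part in key[dots:].split(".") if part]

    if dots == 0:
        # Plain key: relative to the node by default, to the root if absolute_key.
        base = () if absolute_key else node_path
    else:
        # Assumed: one leading dot is the node itself, each extra dot is one parent up.
        levels_up = dots - 1
        if levels_up > len(node_path):
            raise KeyError(f"{key!r} goes above the config root")
        base = node_path[: len(node_path) - levels_up]

    value: Any = root
    for part in list(base) + rest:
        value = value[part]
    return value


cfg = {"a": {"b": 10}, "b": 20}
print(toy_select(cfg, ("a",), "b"))                     # 10
print(toy_select(cfg, ("a",), "b", absolute_key=True))  # 20
print(toy_select(cfg, ("a",), ".b"))                    # 10
print(toy_select(cfg, ("a",), "..b"))                   # 20
```

In this model the flag only matters for keys without a leading dot, which is exactly the ambiguous case from the `which b is it?` example.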

@omry omry merged commit 30460aa into master Apr 5, 2021
@omry omry deleted the select_nested_is_relative branch April 5, 2021 20:38
Collaborator

@odelalleau odelalleau left a comment


I haven't read all discussions, but if I understand correctly, the idea would be to set absolute_key to True for a future oc.select resolver? Although I'm ok with it, by looking at these examples I still find the expected behavior potentially a bit confusing.

This is actually related to this excerpt from the example in the original PR description:

  b: ${oc.select: .a} # relative to foo, one level up: 10

If I'm not mistaken, this specific line is actually incorrect, because OmegaConf.select(cfg.foo, ".a") would actually return cfg.foo.a (= 20).

But personally, I would find it more intuitive for OmegaConf.select(cfg.foo, ".a") to indeed go one level up above foo then select a, i.e. return cfg.a.

In other words, it would make sense to me for OmegaConf.select() to work this way for relative paths:

  • Go up the parents hierarchy as many times as there are dots (note: this means that the input node doesn't need to be a container)
  • Then go down the rest of the path
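
A rough sketch of this proposed rule on plain nested dicts. This models the suggestion only, not the merged behavior; `select_counting_dots` and the path-tuple representation of a node are hypothetical, and treating a key with no dots as a descent from the node is an assumption.

```python
from typing import Any, Tuple


def select_counting_dots(root: dict, node_path: Tuple[str, ...], key: str) -> Any:
    """Toy model: go up one parent per leading dot, then descend the rest of the key."""
    dots = len(key) - len(key.lstrip("."))
    if dots > len(node_path):
        raise KeyError(f"{key!r} goes above the config root")
    # Drop one path element per dot; the node itself need not be a container.
    base = node_path[: len(node_path) - dots]
    rest = [part for part in key[dots:].split(".") if part]

    value: Any = root
    for part in list(base) + rest:
        value = value[part]
    return value


cfg = {"a": 10, "foo": {"a": 20, "b": "placeholder"}}
print(select_counting_dots(cfg, ("foo",), ".a"))      # 10: up from foo, then a
print(select_counting_dots(cfg, ("foo", "b"), ".a"))  # 20: up from foo.b, then a
print(select_counting_dots(cfg, ("foo",), "a"))       # 20: no dots, descend from foo
```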

I also believe that, if a resolver oc.select is added with such a signature, then b: ${oc.select: .a} should resolve to 20 in the example I quoted above. In that case, select() would actually be called on the node b (not its parent).

That being said, I'm not entirely sure what would be the purpose of oc.select (what does it allow that you can't do with interpolations?), and if we add one, wouldn't it make more sense for it to mimic the OmegaConf.select() signature, i.e., take a node as first input?

# 2. relative to the config root
# This is controlled by the absolute_key flag. By default, such keys are relative to cfg.
if not absolute_key and not key.startswith("."):
    key = f".{key}"
Collaborator


The following achieves the same result but is a bit more efficient:

                if absolute_key or key.startswith("."):
                    cfg, key = cfg._resolve_key_and_root(key)

Owner Author

@omry omry Apr 7, 2021


why is it more efficient? (I am expecting absolute_key to be False in most cases).

Collaborator

@odelalleau odelalleau Apr 7, 2021


In the current code we will often go through this if condition (with absolute_key set to False and key absolute), which triggers the following steps:

  • updating the key string by pre-pending a dot
  • calling _resolve_key_and_root(), which will remove the dot we just added

My suggestion gets rid of both of these. But it probably doesn't matter much given everything else that happens later in most situations, so it's really not a big deal.
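
To make the comparison concrete, here is a small sketch of the two branches side by side. `toy_resolve_key_and_root` is a hypothetical stand-in for `_resolve_key_and_root` that works on path tuples instead of config nodes, and its dot handling is an assumption; the point is only that both variants end up with the same starting node and remaining key.

```python
from typing import Tuple

Path = Tuple[str, ...]


def toy_resolve_key_and_root(node_path: Path, key: str) -> Tuple[Path, str]:
    """Toy stand-in: return (path of the node to start from, key without leading dots)."""
    if not key.startswith("."):
        return (), key  # no leading dot: start from the config root
    dots = len(key) - len(key.lstrip("."))
    levels_up = dots - 1  # the first dot means "this node"
    if levels_up > len(node_path):
        raise KeyError(f"{key!r} goes above the config root")
    return node_path[: len(node_path) - levels_up], key[dots:]


def current_branch(node_path: Path, key: str, absolute_key: bool) -> Tuple[Path, str]:
    if not absolute_key and not key.startswith("."):
        key = f".{key}"  # prepend a dot ...
    return toy_resolve_key_and_root(node_path, key)  # ... which is stripped again here


def suggested_branch(node_path: Path, key: str, absolute_key: bool) -> Tuple[Path, str]:
    if absolute_key or key.startswith("."):
        return toy_resolve_key_and_root(node_path, key)
    return node_path, key  # plain key relative to the node: nothing to resolve


for key in ("b", ".b", "..b"):
    for flag in (False, True):
        args = (("a", "x"), key, flag)
        print(key, flag, current_branch(*args) == suggested_branch(*args))
```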

@omry
Owner Author

omry commented Apr 7, 2021

I haven't read all discussions, but if I understand correctly, the idea would be to set absolute_key to True for a future oc.select resolver? Although I'm ok with it, by looking at these examples I still find the expected behavior potentially a bit confusing.

Yes, absolute_key would be used when select needs to be consistent with interpolations.
The flag will probably also be used for other custom resolvers that take both nodes and a string representing the node path:

Does that make sense?

foo:
  a: 1
  b: 2

thing:
  bar:
    a: 3
    b: 4

  base1: ${foo}  # {a:1, b:2}
  # all are [1,2]
  k1: ${oc.dict:values: ${foo}}  # absolute interpolation
  k2: ${oc.dict:values: foo}      # absolute selection
  k3: ${oc.dict:values: ${..foo}}  # relative interpolation (one level up)
  k4: ${oc.dict:values: ..foo}       # relative selection (one level up)

  base2: ${bar} # {a:3, b:4}
  # [3,4]
  k5: ${oc.dict:values: ${.bar}}  # relative interpolation (same level)
  k6: ${oc.dict:values: .bar}       # relative selection (same level)

This is actually related to this excerpt from the example in the original PR description:

  b: ${oc.select: .a} # relative to foo, one level up: 10

If I'm not mistaken, this specific line is actually incorrect, because OmegaConf.select(cfg.foo, ".a") would actually return cfg.foo.a (= 20).

If I am not mistaken, that would have been the behavior before this fix. It's not a desired behavior.

But personally, I would find it more intuitive for OmegaConf.select(cfg.foo, ".a") to indeed go one level up above foo then select a, i.e. return cfg.a.

Would you also find it intuitive for OmegaConf.select(cfg.foo, "a") to return the top level a? (cfg.a).
That would be a breaking change.
If the answer is yes:
How would we then select cfg.foo.a when calling select on the cfg.foo node?
If the answer is no (and I assume that OmegaConf.select(cfg.foo, "a") returns cfg.foo.a), how would you select something with absolute addressing (like ${a})?

I also believe that, if a resolver oc.select is added with such a signature, then b: ${oc.select: .a} should resolve to 20 in the example I quoted above. In that case, select() would actually be called on the node b (not its parent).

Yes, this is the desired behavior.
oc.select on a string should be equivalent to using interpolation with the same string.
As I said, the example above would have been the behavior before this fix.

That being said, I'm not entirely sure what would be the purpose of oc.select (what does it allow that you can't do with interpolations?), and if we add one, wouldn't it make more sense for it to mimic the OmegaConf.select() signature, i.e., take a node as first input?

It allows for a selection with a default value.
There is a feature request for that.

a: ${oc.select: does_not_exist, 20} # 20
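
The idea of a selection with a default can be sketched on a plain dict as follows. `select_or_default` is a hypothetical helper written for this illustration; it is not the planned resolver and not OmegaConf.select.

```python
from typing import Any


def select_or_default(cfg: dict, key: str, default: Any = None) -> Any:
    """Walk a dotted key through nested dicts, returning default if any part is missing."""
    node: Any = cfg
    for part in key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


cfg = {"exists": {"value": 1}}
print(select_or_default(cfg, "exists.value", 20))    # 1
print(select_or_default(cfg, "does_not_exist", 20))  # 20
```

A plain interpolation has no place to put such a fallback, which is the gap the resolver is meant to fill.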

Another use case I have in mind is an oc.deprecated resolver that would work like oc.select, calling OmegaConf.select but also issuing a deprecation warning.

shiny_new_field: 10
rusty_old_field: ${oc.deprecated: shiny_new_field}

should be equivalent to

shiny_new_field: 10
rusty_old_field: ${shiny_new_field}

but will also issue a warning like:

`rusty_old_field` is deprecated, please change your config/code to use `shiny_new_field`.
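
A minimal sketch of that idea using only the standard library. `deprecated_lookup` is a hypothetical function operating on a plain dict; how an actual resolver would learn the name of the old field is left open here, so it is passed in explicitly.

```python
import warnings
from typing import Any


def deprecated_lookup(cfg: dict, old_key: str, new_key: str) -> Any:
    """Return cfg[new_key], warning that old_key is deprecated in favor of new_key."""
    warnings.warn(
        f"`{old_key}` is deprecated, please change your config/code to use `{new_key}`.",
        category=UserWarning,
        stacklevel=2,
    )
    return cfg[new_key]


cfg = {"shiny_new_field": 10}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = deprecated_lookup(cfg, "rusty_old_field", "shiny_new_field")

print(value)                   # 10
print(str(caught[0].message))  # the deprecation message built above
```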

@odelalleau
Collaborator

Yes, absolute_key would be used when select needs to be consistent with interpolations.
The flag will probably also be used for other custom resolvers that take both nodes and a string representing the node path:

Does that make sense?
(...)

Yes, I agree that for custom resolvers taking both nodes/strings, we want the same syntax to yield the same results.

If I'm not mistaken, this specific line is actually incorrect, because OmegaConf.select(cfg.foo, ".a") would actually return cfg.foo.a (= 20).

If I am not mistaken, that would have been the behavior before this fix. It's not a desired behavior.

In the current master, OmegaConf.select(cfg.foo, ".a") still returns cfg.foo.a

But personally, I would find it more intuitive for OmegaConf.select(cfg.foo, ".a") to indeed go one level up above foo then select a, i.e. return cfg.a.

Would you also find it intuitive for OmegaConf.select(cfg.foo, "a") to return the top level a? (cfg.a).

No.

If the answer is no (and I assume that OmegaConf.select(cfg.foo, "a") returns cfg.foo.a), how would you select something with absolute addressing (like ${a})?

In most situations users should be able to simply use: OmegaConf.select(cfg, "a").
Internally we can use OmegaConf.select(some_node._get_root(), "a").
We may want to add a public OmegaConf.get_root() function as well.

That being said, I'm not entirely sure what would be the purpose of oc.select (what does it allow that you can't do with interpolations?), and if we add one, wouldn't it make more sense for it to mimic the OmegaConf.select() signature, i.e., take a node as first input?

It allows for a selection with a default value.
There is a feature request for that.

a: ${oc.select: does_not_exist, 20} # 20

Aaah right, now I remember.

Another use case I have in mind is an oc.deprecated resolver that would work like oc.select, calling OmegaConf.select but will also issue a deprecation warning.

Sounds good.

Summary of my thoughts so far:

  • We can keep things as they are now, it's not a big deal to me
  • We agree on the user-facing behavior of ${oc.select: ...}
  • I think the user-facing behavior of OmegaConf.select(node, relative_path) is a bit unintuitive (it behaves as if there was one less dot in the prefix compared to what I would expect intuitively), and I see little value in adding absolute_key to the public interface
  • Maybe oc.select should have a different name since it doesn't have the same signature as OmegaConf.select and uses a non-default setting absolute_key=True (what about oc.get instead?)

@omry
Owner Author

omry commented Apr 7, 2021

To me the current behavior is like that of accessing files in the file system.

$ touch a
$ ls a
a
$ ls ./a
./a

For a file system, ./a and a are equivalent and are both relative to cwd.

The reason I added support for relative interpolations to OmegaConf.select was to enable custom resolvers to use it.
The realization that it's relative by default came later (which means it's fundamentally incompatible with interpolations, which are absolute by default).

We could potentially introduce a second API to select using interpolation style keys and remove the relative key support from OmegaConf.select() (with a proper error telling the user to use the other method if they try to use relative keys).

As for the name of oc.select, we can pick another name. oc.get sounds a bit too much like dict.get which is not exactly it.
oc.dereference makes a bit more sense to me.

OmegaConf.dereference(cfg, interpolation_key)

and:

a: 10
b:
  a: 20
  # this is what oc.select would do when called here, if we supported it too:
  c: ${oc.select: a} # 20
  c: ${oc.select: .a} # 10 or error if we don't want to support relative keys in select.
  d: ${oc.dereference: a} # 10
  d: ${oc.dereference: .a} # 20

Another option is oc.readlink, which is similar to the system call that resolves symbolic links (but I guess that would return the final path of the interpolation and not the actual value).
Another is oc.follow.

@Jasha10
Collaborator

Jasha10 commented Apr 8, 2021

As for the name of oc.select...

Another option is oc.resolve.

@omry
Owner Author

omry commented Apr 8, 2021

nope, we already have OmegaConf.resolve and it will be confusing :).

@odelalleau
Collaborator

To me the current behavior is like that of accessing files in the file system.
(...)
For a file system, ./a and a are equivalent and are both relative to cwd.

I had actually thought about the file system analogy but didn't find it convincing, since in general going up in the file system is done in increments of .., not a single dot, e.g. ../../../foo
Although, to my surprise, my shell extensions actually support more than two dots on my laptop (but not on another dev machine) so I guess it's something that could be argued for.

However, more importantly (to me), when I see an interpolation like x: ${.y}, I read it as "go up to the parent of x then find the node y". This interpretation clashes with OmegaConf.select(cfg.x, ".y") which instead means "find the node y inside x". To be more consistent IMO either the first one should be written x: ${..y} or the second one should search for y in cfg.
I realize that both can be reconciled by seeing x: ${<something>} as calling select() on the parent of x, but I don't find it intuitive. Might just be me though :)

a: 10
b:
  a: 20
  # this is what oc.select would do when called here, if we supported it too:
  c: ${oc.select: a} # 20
  c: ${oc.select: .a} # 10 or error if we don't want to support relative keys in select.
  d: ${oc.dereference: a} # 10
  d: ${oc.dereference: .a} # 20

Just to be sure I understand, this behavior of oc.dereference is what you intended for oc.select originally, correct?

Regarding the name, I'm still unsure. I like oc.getXYZ because the behavior is what typical get() functions do: try to access something by a key and fall back to a default value if not found (ex: dict.get(), os.getenv()). Maybe oc.getvar? Anyway, no biggie, I can live with any name -- including oc.select :)

@omry
Owner Author

omry commented Apr 8, 2021

Going up more than one level in the file system requiring ../ is not really the point.
The difference is that in the file system there is no way to access an absolute path without a leading /, whereas interpolations are absolute by default.
After this change, OmegaConf.select on relative interpolations behaves much more like relative symlinks in the file system.
The interpolation is a symlink, and the directory containing it is the container.
The following session creates a few files and directories in a scratch dir representing the top level config:

# create physical layout
~/tmp$ mkdir b
~/tmp$ echo "top level a" > a  
~/tmp$ echo "nested a" > b/a

# create symlinks in top level
~/tmp$ ln -s a z1
~/tmp$ ln -s ./a z2
~/tmp$ ln -s ../a z3

# create symlinks in b
~/tmp$ ln -s a b/z1
~/tmp$ ln -s ./a b/z2
~/tmp$ ln -s ../a b/z3
# layout
~/tmp$ tree
.
├── a
├── b
│   ├── a
│   ├── z1 -> a
│   ├── z2 -> ./a
│   └── z3 -> ../a
├── z1 -> a
├── z2 -> ./a
└── z3 -> ../a

1 directory, 8 files
# inspect content through symlinks
~/tmp$ cat z1
top level a
~/tmp$ cat z2
top level a
~/tmp$ cat z3
cat: z3: No such file or directory
~/tmp$ cat b/a 
nested a
~/tmp$ cat b/z1
nested a
~/tmp$ cat b/z2
nested a
~/tmp$ cat b/z3
top level a

(The behavior for links in b is the same if you are inside b).
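
For anyone who wants to reproduce the session without a shell, here is a sketch using only the Python standard library. It assumes a POSIX-like system where creating symlinks is permitted, and it builds the layout inside a nested scratch directory (the name `top` is arbitrary) so that the dangling `../a` link from the top level cannot accidentally hit an existing file.

```python
import os
import tempfile
from typing import Optional


def read_via(path: str) -> Optional[str]:
    """Read a file through a (possibly symlinked) path; None if the link is dangling."""
    if not os.path.exists(path):  # exists() follows symlinks
        return None
    with open(path) as handle:
        return handle.read().strip()


def build_layout(top: str) -> None:
    """Create a, b/a and the z1/z2/z3 symlinks in both top and top/b."""
    nested = os.path.join(top, "b")
    os.mkdir(nested)
    with open(os.path.join(top, "a"), "w") as handle:
        handle.write("top level a\n")
    with open(os.path.join(nested, "a"), "w") as handle:
        handle.write("nested a\n")
    for directory in (top, nested):
        os.symlink("a", os.path.join(directory, "z1"))
        os.symlink("./a", os.path.join(directory, "z2"))
        os.symlink("../a", os.path.join(directory, "z3"))


with tempfile.TemporaryDirectory() as scratch:
    top = os.path.join(scratch, "top")
    os.mkdir(top)
    build_layout(top)
    names = ("z1", "z2", "z3", "b/z1", "b/z2", "b/z3")
    results = {name: read_via(os.path.join(top, name)) for name in names}

for name, content in results.items():
    print(f"{name}: {content}")
```

The link target is always interpreted relative to the directory that contains the link, which is the property the analogy relies on.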

However, more importantly (to me), when I see an interpolation like x: ${.y}, I read it as "go up to the parent of x then find the node y"

The interpretation that works (here and in symlinks) is:
"Go up to the parent of the value ${.y} then find the node y".

a: 10
b:
  a: 20
  # this is what oc.select would do when called here, if we supported it too:
  c: ${oc.select: a} # 20
  c: ${oc.select: .a} # 10 or error if we don't want to support relative keys in select.
  d: ${oc.dereference: a} # 10
  d: ${oc.dereference: .a} # 20

Just to be sure I understand, this behavior of oc.dereference is what you intended for oc.select originally, correct?

Yes, my idea for oc.select was that it would select the same value a node interpolation would (similarly, any custom resolver that operates on string keys should probably operate in that way too).
I can see why this can be a bit confusing because it's different than OmegaConf.select's default behavior.
We can explain the subtle difference in the doc for oc.select.

@odelalleau
Collaborator

Alright, let's roll with it, I don't want us to spend too much time discussing what is or is not intuitive for me personally :)

@omry
Owner Author

omry commented Apr 8, 2021

The point I was making is that this is consistent with how relative file system symlinks are behaving, which are pretty common.
Do you find the symlink behavior above intuitive?

About the name of oc.select: the jury is still out.
I agree with you that we may want to remove absolute_key from OmegaConf.select and use a different API from custom resolvers to resolve interpolations.

@odelalleau
Collaborator

Do you find the symlink behavior above intuitive?

Yes for interpolations -- it's with OmegaConf.select() that my brain gets confused, but I'll survive.

@omry
Owner Author

omry commented Apr 8, 2021

Gotcha.
I think of OmegaConf.select(cfg.b, "z1") as equivalent to accessing "z1" from the b directory:

~/tmp/b$ tree
.
├── a
├── z1 -> a
├── z2 -> ./a
└── z3 -> ../a
~/tmp/b$ cat z1
nested a
~/tmp/b$ cat z2
nested a
~/tmp/b$ cat z3
top level a

@Jasha10
Collaborator

Jasha10 commented Apr 9, 2021

~/tmp/b$ tree

Here are the analogous omegaconf objects (using "oc.select" for now, even if the name may change):

cfg1 = OmegaConf.create(
    {
        "a": "top level a",
        "b": {
            "a": "nested a",
            "z1": "${oc.select: a}",  # nested a
            "z2": "${oc.select: .a}",  # nested a
            "z3": "${oc.select: ..a}",  # top level a
        },
    }
)


cfg2 = OmegaConf.create(
    {
        "a": "top level a",
        "b": {
            "a": "nested a",
            # "z1": ...
            "z2": "${.a}",  # nested a
            "z3": "${..a}",  # top level a
        },
    }
)

I find it very compelling that ${.a} behaves exactly the same as ${oc.select: .a}, and ${..a} the same as ${oc.select: ..a}.
