Add support for source and key aliases #367
I've commented on a few specific parts, but overall I think this is in good shape.
I wonder if we should have support for loading aliases automatically? The reason is that device names and properties can change, so people will need to load different aliases for different proposals; for example, MID has gone through a couple of iterations of Karabo devices that aggregate properties from multiple devices (usually motors) in a single device, and each time the naming has changed slightly. The first way I thought of handling this was through instrument libraries (like SCS's toolbox), which could wrap open_run:

from extra_data import open_run as ed_open_run

def open_run(proposal, run_no, **kwargs):
    run = ed_open_run(proposal, run_no, **kwargs)
    # Check if an alias file exists at a certain location (usr/extra-data-aliases.yaml?),
    # or load a default list of aliases (e.g. for the XGM etc), or use an alias based on the date of the run.
    return run.with_aliases(foo)  # foo: placeholder for whichever alias definitions were chosen above

So the user could run e.g.:

import mid
run = mid.open_run(1234, 56)

And then they would have their instrument specific aliases without having to call |
When I discussed this with Philipp, we agreed to get explicit alias support in and then discuss this question of how to make it more convenient. I'm against automatically loading aliases based on something like the working directory of the notebook, or the current user running it, or anything like that. I don't want to make it easy for people to write notebooks which break (or worse, appear to work but do something different) when they're moved, or when someone else runs them, or whatever. But I could see using the proposal number we pass to |
Ah, you just barely beat me to replying to James. Indeed, the file support was initially motivated by the idea of loading such files automatically, e.g. based on the location of the running code. I ended up leaving it out of this MR for the sake of easier review, but also because I didn't like the idea so much anymore after trying to implement it. There were some ways to do it that are cheap in terms of runtime cost, but you quickly run into corner cases where it's outright confusing. Besides the point Thomas raised, some people write modules rather than notebooks; should it apply there as well? I like the idea of attaching it to the proposal rather than the file as well, though it implies there's a canonical way to do aliases across a single user group. But one could define a fixed filename in some proposal location (say |
Sorry, I guess I wasn't clear, I never had anything in mind other than alias files belonging to proposals :)
Yes, that's exactly what I put in my code snippet (though I used the non-hidden |
I'd suggest we do that in a follow-up PR, if that's OK with everyone. Then we needn't hold this one up with discussion of what exactly we call the file, whether we prefer one of the available formats over another, and so on. 🙂
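For reference, the proposal-based lookup discussed above could be sketched roughly as follows. This is only an illustration using the standard library: the function name find_proposal_aliases is hypothetical, and the relative path usr/extra-data-aliases.yaml is merely the tentative suggestion from the snippet earlier in this thread, not an agreed location.

```python
from pathlib import Path
from typing import Optional

# Hypothetical location inside a proposal directory, taken from the
# tentative suggestion in the thread; the actual name was left undecided.
ALIAS_FILE = Path("usr") / "extra-data-aliases.yaml"


def find_proposal_aliases(proposal_dir: Path) -> Optional[Path]:
    """Return the alias file of a proposal directory, or None if it is absent."""
    candidate = Path(proposal_dir) / ALIAS_FILE
    return candidate if candidate.is_file() else None
```

A wrapper like the open_run above could then pass the returned path (as a str) to with_aliases() when it is not None, and return the run unchanged otherwise, so that nothing depends on the working directory or the current user.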
A few nitpicks, but otherwise this LGTM
Thanks, I've cleaned up all the fixup commits then, ready to go from my side.
Thanks, LGTM
Would you like to have documentation as part of this MR or add it afterwards? I plan to get it done next week.
I'm happy enough for it to come a bit later - even after a release if needed - so long as it doesn't get forgotten.
I rebased on top of |
It's been frequently requested to allow easier access to sources and keys without the need to deal with Karabo device names and their properties. This PR adds support for this via an alias mechanism. As discussed towards the end of last year, it is kept mostly separate from the existing code that uses literal sources and keys, and is accessed via an index property, run.alias[...], inspired by the likes of pandas and xarray.

Aliases may refer to a source or a source-key, or equivalently to a SourceData or a KeyData object. There are no pure key aliases that may be used with different sources. They can be attached to a DataCollection object via two methods:

- DataCollection.with_aliases(*alias_defs)
- DataCollection.only_aliases(*alias_defs, strict=False, require_all=False)

They may be passed any number of alias definitions, which may be

- a dict mapping the alias to a str (for sources) or 2-tuple of str (for source-keys).
- a str pointing to a JSON, YAML or TOML file with alias definitions. These should define a dict in the same way, with the additional option of using nested dictionaries for multiple keys per source. (See docstring)

With the returned DataCollection, aliases may only be used through the DataCollection.alias property.

Once attached, aliases are handed down to any DataCollection created from this, e.g. by selection. The aliases are merged as well when using DataCollection.union() as long as there is no conflict - in that case a ValueError is raised.

Aliases are shown in DataCollection.info() in <> tags for both sources and keys, even if a source is not shown in detail.

Documentation follows after the initial discussion 😃
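
To make the definition formats and the merge rule above more concrete, here is a small stand-alone sketch in plain Python. It is not the code from this PR: flatten_aliases and merge_aliases are hypothetical helpers, the device names are for illustration only, and the nested layout (a source mapping to a dict of alias to key) is an assumption; the docstring remains the authoritative description. The sketch also assumes that identical definitions of the same alias do not count as a conflict.

```python
from typing import Dict, Tuple, Union

# A source alias targets a str, a source-key alias a (source, key) pair.
AliasTarget = Union[str, Tuple[str, str]]


def flatten_aliases(definitions: dict) -> Dict[str, AliasTarget]:
    """Flatten one alias definition dict into a mapping alias -> target."""
    flat: Dict[str, AliasTarget] = {}
    for name, value in definitions.items():
        if isinstance(value, dict):
            # Assumed nested layout: `name` is a source, `value` maps alias -> key.
            for alias, key in value.items():
                flat[alias] = (name, key)
        elif isinstance(value, str):
            flat[name] = value  # source alias
        else:
            # Source-key alias; files yield lists, so normalise to a tuple.
            source, key = value
            flat[name] = (source, key)
    return flat


def merge_aliases(*alias_maps: Dict[str, AliasTarget]) -> Dict[str, AliasTarget]:
    """Merge alias maps, raising ValueError if an alias has two different targets."""
    merged: Dict[str, AliasTarget] = {}
    for alias_map in alias_maps:
        for alias, target in alias_map.items():
            if merged.setdefault(alias, target) != target:
                raise ValueError(f"conflicting definitions for alias {alias!r}")
    return merged
```

With the actual API, the equivalent definitions would simply be passed to with_aliases() and then used as run.alias[...].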