Update API to allow users to specify case root, archive dir, etc #31

Closed
mnlevy1981 opened this issue Sep 29, 2020 · 4 comments

Comments

@mnlevy1981
Contributor

Right now CaseClass is hard-coded to look in my scratch space for run directories and short-term archive directories. It would be nice to compare some plots to the 1-degree run, which means accessing data in Kristen's directories instead, so some sort of configuration step or additional command-line argument to allow that would be useful.

@dcherian

Would an intake catalog be the answer here? CaseClass could take a simulation_name and then look for it in the catalog.

(I've wanted to do this with the simulations I look at but haven't actually done it)

@mnlevy1981
Contributor Author

We're talking about creating an intake catalog for every individual CESM case we want to analyze (ideally by using https://github.com/NCAR/CESM_catalog and possibly even by building it into the CIME / CESM workflow), but so far for this project we are leaning towards hiding the intake interface unless [advanced] users want to query it directly. So we would still want some logic to figure out where to look for the catalog given a case name.

@mnlevy1981
Contributor Author

Some notes for future @mnlevy1981 to think about once casper and the jupyter hub are released from maintenance:

What I'm picturing now are optional arguments to CaseClass.__init__() named something like rundir_root, archive_root, and ts_root. Defaults would be '/glade/scratch/$USER', '/glade/scratch/$USER/archive', and None, and if ts_root isn't specified then the class won't even attempt to read in time series files.
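A minimal sketch of what that signature might look like. This is not the project's actual class: the `casename` parameter and the `reads_time_series()` helper are hypothetical, and expanding `$USER` with `os.path.expandvars` is just one assumed way to handle the defaults.

```python
import os


class CaseClass:
    """Hypothetical sketch of the proposed constructor, not the real class."""

    def __init__(
        self,
        casename,  # hypothetical: the issue does not spell out other arguments
        rundir_root="/glade/scratch/$USER",
        archive_root="/glade/scratch/$USER/archive",
        ts_root=None,
    ):
        self.casename = casename
        # expandvars substitutes $USER from the environment
        # (the text is left unchanged if the variable is not set)
        self.rundir_root = os.path.expandvars(rundir_root)
        self.archive_root = os.path.expandvars(archive_root)
        # None means: do not even attempt to read time series files
        self.ts_root = None if ts_root is None else os.path.expandvars(ts_root)

    def reads_time_series(self):
        """Hypothetical helper: True only when ts_root was provided."""
        return self.ts_root is not None
```

With this shape, `CaseClass("some_case")` would use the scratch-space defaults and skip time series, while passing `ts_root=...` opts in to reading them.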

A downside is that every notebook in this package would need to specify these three arguments for every CaseClass object. Another option would be to have the default values come from a .yaml or .cfg file, or to allow some such file to alter the defaults -- each root would then be resolved in this order:

  1. use the value passed as an optional argument in the __init__() call, if any
  2. for arguments not specified when creating the object, look to see if they are defined in a config file
  3. for arguments not specified when creating the object or defined in the config file, fall back to the defaults mentioned above
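The three-step lookup above could be sketched roughly as follows. Everything here is an assumption for illustration: the function name `resolve_roots`, the `[roots]` section name, and the choice of a .cfg file read with the standard-library `configparser` (a .yaml file would work the same way with a YAML parser).

```python
import configparser
import os

# Defaults taken from the comment above; ts_root=None disables time series.
DEFAULTS = {
    "rundir_root": "/glade/scratch/$USER",
    "archive_root": "/glade/scratch/$USER/archive",
    "ts_root": None,
}


def resolve_roots(config_path=None, **explicit):
    """Resolve each root: explicit argument, then config file, then default."""
    unknown = set(explicit) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"unexpected arguments: {sorted(unknown)}")

    # Step 2 source: a hypothetical [roots] section in a .cfg file
    from_file = {}
    if config_path is not None and os.path.exists(config_path):
        parser = configparser.ConfigParser(interpolation=None)
        parser.read(config_path)
        if parser.has_section("roots"):
            from_file = dict(parser["roots"])

    resolved = {}
    for name, default in DEFAULTS.items():
        if explicit.get(name) is not None:
            value = explicit[name]      # 1. passed to the call
        elif name in from_file:
            value = from_file[name]     # 2. defined in the config file
        else:
            value = default             # 3. built-in default
        resolved[name] = None if value is None else os.path.expandvars(value)
    return resolved
```

A hypothetical `CaseClass.__init__()` could call this once and store the result, so notebooks that are happy with the config file would not need to pass any of the three arguments.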

If the goal is to have intake-esm handle everything, then I'm hesitant to introduce too much complexity in what is a temporary solution... but even with the move to intake it may be useful to have these features for analyzing old runs so perhaps there's still value.

@mnlevy1981
Contributor Author

In #34, we ended up adding a single (required) argument: output_roots, a list that contains all potential directories such as RUNDIR, DOUT_S, or the root directory for time series output. A string is acceptable as well; in that case output_roots = [output_roots], so it's treated as a list containing a single location.
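For illustration, the string-or-list handling could look something like the sketch below. The helper names `normalize_output_roots` and `find_case_dirs` are hypothetical, and the assumption that each case lives in a subdirectory named after the case under each root is mine, not something confirmed by #34.

```python
import os


def normalize_output_roots(output_roots):
    """Accept a single string or an iterable of strings; always return a list."""
    if isinstance(output_roots, str):
        # Same idea as output_roots = [output_roots] in the comment above
        output_roots = [output_roots]
    return list(output_roots)


def find_case_dirs(casename, output_roots):
    """Hypothetical helper: list <root>/<casename> directories that exist."""
    roots = normalize_output_roots(output_roots)
    candidates = [os.path.join(root, casename) for root in roots]
    return [path for path in candidates if os.path.isdir(path)]
```

Checking for `str` first matters because a string is itself iterable; without that check, `list("abc")` would split the path into single characters.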
