5 bash scripts from Gary to create time series #29
Merged
mnlevy1981
merged 15 commits into
marbl-ecosys:master
from
mnlevy1981:reshaping_scripts
Oct 30, 2020
Also, a single python script that submits each of the five scripts to the slurm queue on casper. Note that I've modified Gary's original scripts to take a case identifier (e.g. 003 or 004) and a single year as command line arguments. The python script sets the default case to 004, but requires users to specify at least one year. The user can also specify which scripts to run; the default is to run all five of them. There is also a "dry-run" option that doesn't actually call sbatch.
Since the bash scripts will be submitted to slurm by the python script, they do not need to be executable.
Now pass --mail-type and --mail-user through the python script (default sends email, but --no-mail turns off the messages)
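As a rough illustration of the submission logic described above, the sketch below builds one `sbatch` command per script and year. The helper names `build_sbatch_command` and `submit`, the script file names, and the email address are hypothetical; only `sbatch`, `--mail-type`, and `--mail-user` are real Slurm options, and the actual python script in this PR may be organized differently.

```python
import subprocess


def build_sbatch_command(script, case, year, mail_user=None, mail_type="ALL"):
    """Return the sbatch argv list for one reshaping script (hypothetical helper)."""
    cmd = ["sbatch"]
    if mail_user is not None:
        # --mail-type and --mail-user are standard sbatch options;
        # passing mail_user=None plays the role of a --no-mail flag
        cmd += [f"--mail-type={mail_type}", f"--mail-user={mail_user}"]
    # The bash script receives the case identifier and year as positional arguments
    cmd += [script, case, str(year)]
    return cmd


def submit(scripts, case, years, mail_user=None, dry_run=False):
    """Submit every (script, year) pair; with dry_run=True only collect the commands."""
    commands = []
    for year in years:
        for script in scripts:
            cmd = build_sbatch_command(script, case, year, mail_user=mail_user)
            commands.append(cmd)
            if not dry_run:
                # Only reached on a machine where sbatch is available
                subprocess.run(cmd, check=True)
    return commands
```

Calling `submit(["pop.h_t13.sh"], "004", [7], dry_run=True)` returns the command list without touching the queue, which is the point of a dry-run option: the commands can be inspected before anything is submitted.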
I've added 0007 and 0008 to glade/campaign, so compare_ts_and_hist_004 checks those years. Also, I cleaned up some of the output (no longer printing start / finish time)
There is now a way to query whether a specific year of a variable in a dataset came from time series or history files. This is probably only useful for the compare_ts_and_hist notebooks, which have been re-run. Note that for this commit I re-ran the notebooks on cheyenne, which does not have access to the time series data on campaign; when casper is back up, I will re-run the notebooks to actually do the comparison.
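One way such a query could work is to record the file type at load time and look it up later. The sketch below is only an assumption about the shape of the feature: the class `DatasetSources` and its methods are hypothetical and are not taken from the code in this PR.

```python
class DatasetSources:
    """Track which kind of file each (variable, year) pair was read from (hypothetical)."""

    def __init__(self):
        self._sources = {}

    def record(self, varname, year, source):
        """Remember that varname for this year was read from the given source."""
        if source not in ("time series", "history"):
            raise ValueError(f"unknown source: {source}")
        self._sources[(varname, year)] = source

    def get_source(self, varname, year):
        """Return 'time series', 'history', or None if the pair was never loaded."""
        return self._sources.get((varname, year))
```

Returning `None` for an unknown pair lets a notebook distinguish "not loaded" from either file type without raising an exception.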
Casper is online so we can compare to time series again
Pass short term archive root as an argument (default is /glade/scratch/$USER/archive) to shell scripts rather than assuming archive is in my scratch directory and pass full name of case rather than suffix. These two changes combined should make the tool general enough to apply to any CESM case (e.g. Kristen's 1-degree cocco runs). Also cleaned up the way data_reshaping/logs is ignored; may need an additional commit to create the directory from run_all.py as a result.
Created utils/compare_ts_and_hist.py which will eventually be a command line tool but also provides compare_ts_and_hist() via import utils.
6341c3b reduces the code duplication between the two notebooks comparing time series and history output, but there are a few things I don't like:
One option for the second issue is to add a |
1. CaseClass has two new public methods: get_timeseries_files() and get_history_files(); both return lists of files for a given year and stream. For time series, users can also specify a list of varnames to further pare down the resulting list of files.
2. gen_dataset() now relies on the two functions mentioned in (1) to determine what files to open
3. Massive overhaul to compare_ts_and_hist:
   * Use open_mfdataset and case.get_history_files() to open ds_hist for a given stream and year; then loop through variables and check that get_timeseries_files() does not return an empty list
   * No longer run da.identical(); for now, we are only concerned with verifying that all variables from history files made it into time series
   * This puts "reinstate da.identical()" on a to-do item; even with dask I was running into memory issues comparing monthly 3D fields
   * Refactored so there is utils/compare_ts_and_hist.py that will eventually be a command-line tool for comparing a given stream and year but is currently imported via utils. Also wrote utils.utils.timeseries_and_history_comparison() which is just a wrapper that accounts for things like missing cice.h1 time series from year 1. I think compare_ts_and_hist.py should live with CaseClass when we refactor this package, while timeseries_and_history_comparison() is specific to the high-res analysis
4. Add ability to get cice.h and cice.h1 streams for both history and time series so (3) compares all five streams rather than just looking at a few specific variables in pop.h
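The check described in (3) can be sketched as a loop over the history variables. This is a minimal sketch, not the code from utils/compare_ts_and_hist.py: the function name `find_missing_timeseries` is hypothetical, and the call signature used for `get_timeseries_files` (year, stream, and a `varnames` keyword) is an assumption based on the description in (1).

```python
def find_missing_timeseries(history_vars, year, stream, get_timeseries_files):
    """Return the history variables that have no time series file for this year and stream.

    get_timeseries_files is any callable with the (assumed) signature
    get_timeseries_files(year, stream, varnames=[...]) -> list of file paths.
    """
    missing = []
    for varname in history_vars:
        # An empty list means this variable never made it into time series
        if not get_timeseries_files(year, stream, varnames=[varname]):
            missing.append(varname)
    return missing
```

In the workflow described above, `history_vars` would come from the variables of ds_hist after opening the history files with open_mfdataset. Passing the lookup as a callable keeps the sketch independent of CaseClass, so it can be exercised with a stand-in function.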
And a few other bad / unnecessary imports
GitHub Actions didn't like the "import utils" call even though it was fine in the notebooks, I think because utils.utils was trying to import compare_ts_and_hist.py. That import now lives inside the timeseries_and_history_comparison() function, so hopefully everything will work again.
mnlevy1981
commented
Oct 22, 2020
Moved the import statement out of timeseries_and_history_comparison() and fixed sys.path in test_utils.py to ensure the import statement still works.
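A common way to make a module-level import resolve in a test file is to put the package's parent directory on `sys.path` before importing. The helper below is a generic sketch of that pattern; the name `add_to_sys_path` is hypothetical, and the actual change to test_utils.py may differ.

```python
import os
import sys


def add_to_sys_path(directory):
    """Prepend directory to sys.path if it is not already listed, then return its absolute path."""
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory
```

A test file would call this with the directory that contains the package (for example, a path built from `os.path.dirname(__file__)`) before its import statements run.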
compare_ts_and_hist.py needs CaseClass.CaseClass, not just CaseClass