Develop one steps #236
Conversation
This adds several features to the one-step pipeline:
* big zarr: everything is stored as one zarr file
* saves physics outputs
* some refactoring of the job submission
* save lat/lon grid variables from sfc_dt_atmos

Sample output: https://gist.github.com/nbren12/84536018dafef01ba5eac0354869fb67
Use the big zarr from the one-step workflow as input to the create training data pipeline.
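The sketch below illustrates the intended flow (the path is a placeholder, not the real bucket layout): the training-data pipeline opens the single consolidated zarr store produced by the one-step workflow instead of many per-timestep outputs.

```python
# Minimal sketch; the path below is hypothetical.
import xarray as xr

# One consolidated store written by the one-step workflow.
big_zarr = xr.open_zarr("one_step_output/big.zarr")

# Physics outputs and wrapper-extracted state variables live side by side,
# so the training-data pipeline can select whatever it needs from one dataset.
print(list(big_zarr.data_vars))
```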
This makes the diagnostic variables appended to the big zarr have the appropriate step and forecast_time dimensions, just as the variables extracted by the wrapper do.
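As a rough illustration of what "appropriate dimensions" means here, the xarray sketch below (variable name, sizes, and coordinate values are made up) adds `step` and `forecast_time` dims to a diagnostic before it is appended:

```python
# Hedged sketch: names and sizes are assumptions for illustration only.
import numpy as np
import xarray as xr

# A physics diagnostic as it might come out of the model: (tile, y, x).
diag = xr.DataArray(
    np.zeros((6, 48, 48)), dims=["tile", "y", "x"], name="net_heating"
)

# Give it the same leading dims the wrapper-extracted variables carry,
# so it can be appended to the big zarr without a dimension mismatch.
diag = diag.expand_dims(
    step=["after_physics"], forecast_time=[np.timedelta64(60, "s")]
)
print(diag.dims)  # now includes 'step' and 'forecast_time'
```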
This accomplishes two things: 1) it prevents true model zeros from being cast to NaNs in the one-step big zarr output, and 2) it initializes the big zarr arrays with NaNs via `full`, so that if they are not filled in (because of a failed timestep or some other reason) this is more apparent than with `empty`, which produces arbitrary values.
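A minimal sketch of point 2, assuming a zarr-backed array (shapes and chunk sizes are placeholders):

```python
# Pre-allocate with NaN fill values (zarr.full) instead of zarr.empty so that
# any region never written, e.g. because a timestep crashed, reads back as NaN
# rather than as arbitrary, plausible-looking numbers.
import numpy as np
import zarr

arr = zarr.full(
    shape=(10, 6, 79, 48, 48),   # e.g. (time, tile, z, y, x); sizes are made up
    chunks=(1, 6, 79, 48, 48),
    fill_value=np.nan,
    dtype="f4",
)

# A failed timestep is simply never written; reading its slab yields NaNs,
# which downstream checks can detect explicitly.
print(np.isnan(arr[0]).all())  # True until this slab is filled in
```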
Remove references to hard-coded dims and data variables, or imports from vcm.cubedsphere.constants, and replace them with arguments. Coords and dims can now be provided as args for the mappable var.
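The hypothetical sketch below (function name, defaults, and coordinate names are assumptions, not the actual vcm API) shows the shape of the change: dims and coords arrive as arguments with defaults instead of as constants imported from vcm.cubedsphere.constants inside the function body.

```python
# Hypothetical signature for illustration; not the real vcm implementation.
import xarray as xr

def mappable_var(
    ds: xr.Dataset,
    var_name: str,
    x_dim: str = "grid_xt",
    y_dim: str = "grid_yt",
    grid_coords=("lat", "lon", "latb", "lonb"),
) -> xr.Dataset:
    """Select var_name plus whichever grid coords are present, for map plotting."""
    if not {x_dim, y_dim} <= set(ds[var_name].dims):
        raise ValueError(f"{var_name} lacks the horizontal dims {x_dim}, {y_dim}")
    names = [var_name] + [c for c in grid_coords if c in ds]
    return ds[names]
```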
Allows starting the one-step jobs at a specified index in the timestep list, which is useful for testing and for skipping spinup timesteps.
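A small sketch of how such an option might look (the `--start-index` flag name and timestep strings are assumptions):

```python
# Slice the sorted timestep list at a start index before submitting jobs,
# e.g. to skip spinup timesteps or to launch a short test run.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--start-index", type=int, default=0,
    help="index in the timestep list at which to start submitting one-step jobs",
)
args = parser.parse_args(["--start-index", "2"])

timesteps = sorted(["20160801.001500", "20160801.003000",
                    "20160801.004500", "20160801.010000"])
to_run = timesteps[args.start_index:]
print(f"submitting {len(to_run)} of {len(timesteps)} timesteps")
```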
* change order of required args so output is last
* fix arg for onestep input to be dir containing big zarr
* update end to end integration test ymls
* prognostic run adjustments
This PR introduces several improvements to the logging capability of our prognostic run image:
- include upstream changes to disable output capturing in `fv3config.fv3run`
- Add a `capture_fv3gfs_func` function. When called, this captures the raw fv3gfs output and re-emits it as DEBUG-level logging statements that can more easily be filtered (see the sketch after this list).
- Refactor `runtime` to `external/runtime/runtime`. This was easy since it did not depend on any other module in fv3net (except implicitly the code in `fv3net.regression`, which is imported when loading the sklearn model with pickle).
- update fv3config to master
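The sketch below conveys the idea behind `capture_fv3gfs_func` rather than its actual implementation: anything the model prints to stdout is buffered and re-emitted line by line as DEBUG records, so it can be filtered with the normal logging-level machinery.

```python
# Illustrative only; the real function may capture output at the file-descriptor
# level so that Fortran/C prints are caught as well.
import logging
import sys

logger = logging.getLogger("fv3gfs")

class _DebugWriter:
    """File-like object that forwards complete lines to logger.debug."""

    def __init__(self):
        self._buffer = ""

    def write(self, text):
        self._buffer += text
        while "\n" in self._buffer:
            line, self._buffer = self._buffer.split("\n", 1)
            if line:
                logger.debug(line)

    def flush(self):
        pass

def capture_stdout_as_debug():
    """Redirect raw model output on stdout into DEBUG-level log records."""
    sys.stdout = _DebugWriter()

# Usage: call once before stepping the model; configure the root logger at INFO
# to hide the raw model chatter, or at DEBUG to see it.
```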
Review comment on the diff line `import logging` (context: `__all__ = ["open_model", "predict", "update"]`):
This is leftover from when I was trying to debug something. I'll remove it.
Exactly. It makes the build more deterministic, but ultimately we might move towards a tool like poetry. I don't know if the prognostic run really needs …
* update history
* fix positional args
* fix function args
* update history
* linting
Approving: any fixes required if the integration test fails will be done in a new PR to master.
I agree... it seems like it should just be fv3gfs-python + scikit-learn + zarr + fv3config at the same version as the corresponding fv3net image. Hopefully we can get it closer to that.
Great to get this merged into master. I imagine there will be some more fixes to be done, but I think it's fine to do a subsequent PR to master.