Add new use case of python embedding for point observations with NRL innovation files. #635
Comments
Email from Liz on 9/23/2020:
We now output h5 files instead of ascii. I put both formats and the python converter in the ftp directory. Putting the obs values into ascii2nc will get us to the verification task and is where we likely need to start. Having matched pairs would also be useful. There is additional information in the file that would be helpful to retain (you could treat it as an additional observation type), in particular:
If that information is available along with the ob value, this enables us to implement some DA diagnostics fairly easily.
On 10/1/2020, Tara noted that the funding to pay for this must be spent by the end of 2020.
The prototype for this work exists in branch feature_635_nrlinnov. I decided to add the filtering capability in the Python embedding script for now, via pandas. The pandas filters are configurable via [user_env_vars] in METplus, as is the mapping between NRL column names and the MET 11-column names. Processing is somewhat slow, but these are very large files. The ASCII format takes about 3 minutes per file, while the HDF5 files take about half of that or less (1-1.5 minutes per file), so I would recommend using HDF5 input files whenever possible.
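The filtering and column mapping described above could look roughly like the sketch below. It is not the code in the feature_635_nrlinnov branch. The environment variable names (NRL_FILTER_QUERY, NRL_COLUMN_MAP), the NRL column names, and the sample values are all hypothetical, and the short MET column labels are only placeholders for the 11-column order that ascii2nc expects. The one firm convention used here is that a Python embedding script for point observations leaves its result in a variable named point_data, a list of 11-item lists.

```python
import json
import os

import pandas as pd

# Placeholder labels for the MET 11-column point observation order.
MET_COLUMNS = [
    "typ", "sid", "vld", "lat", "lon", "elv", "var", "lvl", "hgt", "qc", "obs",
]

# Hypothetical mapping from NRL column names to MET column labels.
DEFAULT_MAP = {
    "ob_type": "typ",
    "platform": "sid",
    "datetime": "vld",
    "latitude": "lat",
    "longitude": "lon",
    "variable": "var",
    "ob_value": "obs",
}


def filter_and_map(df, query=None, column_map=None):
    """Filter rows with a pandas query string, then reorder into MET columns."""
    if query:
        # The query uses the original (NRL) column names.
        df = df.query(query)
    if column_map:
        df = df.rename(columns=column_map)
    # Fill any MET column that has no NRL counterpart with "NA".
    for name in MET_COLUMNS:
        if name not in df.columns:
            df = df.assign(**{name: "NA"})
    return df[MET_COLUMNS].values.tolist()


# Made-up sample rows standing in for an NRL innovation file.
sample = pd.DataFrame({
    "ob_type": ["ADPSFC", "ADPSFC"],
    "platform": ["STN1", "STN2"],
    "datetime": ["20200923_000000", "20200923_000000"],
    "latitude": [35.0, -10.0],
    "longitude": [-105.0, 20.0],
    "variable": ["TMP", "TMP"],
    "ob_value": [288.5, 300.1],
})

# Hypothetical variables that METplus [user_env_vars] could export.
query = os.environ.get("NRL_FILTER_QUERY", "latitude > 0")
column_map = json.loads(
    os.environ.get("NRL_COLUMN_MAP", json.dumps(DEFAULT_MAP))
)

# ascii2nc reads this variable when run with Python embedding.
point_data = filter_and_map(sample, query, column_map)
```

Keeping the filter as a pandas query string means a new filter only needs a configuration change, at the cost of exposing the NRL column names to whoever writes the METplus configuration.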
Describe the New Feature
This task is to create a Python script to read NRL innovation files and use Python embedding to pass them into the ascii2nc tool. Once that script works well, collaborate with NRL staff to develop a new METplus use case which calls ascii2nc to prepare the point observations and then either Point-Stat or Ensemble-Stat to verify model output against them. Get direction from Liz Satterfield on these details.
See details about the formatting of the NRL innovation files as a comment on this issue.
Consider breaking down the multiple steps for this task into sub-issues:
(1) Write and test a python embedding script to process the input file with ascii2nc.
(2) See comments for a description of how Liz would like (1) to be configurable.
Should filtering logic be added to the python-embedding script or to the ascii2nc tool?
(3) Gather sample model data which should be verified using these point observations and develop a METplus use case for these steps.
(4) This data may also be suitable for python embedding in Stat-Analysis to compute statistics directly using the obs and innovation values. Explore this option and potentially include it in the use case as well.
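As a rough illustration of item (4), the sketch below computes a few summary statistics directly from observation and innovation values. It assumes the common data assimilation convention that innovation equals observation minus background. The NRL files may define it differently, so that should be confirmed before relying on the signs. The function name and the sample numbers are hypothetical, and this does not show how Stat-Analysis itself would ingest the data.

```python
import numpy as np


def innovation_stats(obs, innov):
    """Summarize background error from obs and innovation arrays.

    Assumes innovation = observation - background.
    """
    obs = np.asarray(obs, dtype=float)
    innov = np.asarray(innov, dtype=float)
    # Recover the background (model) value at each observation.
    background = obs - innov
    # Forecast-minus-observation error, the usual verification sign.
    error = background - obs
    return {
        "n": int(obs.size),
        "me": float(error.mean()),
        "rmse": float(np.sqrt((error ** 2).mean())),
    }


# Made-up values for illustration only.
stats = innovation_stats([10.0, 12.0], [1.0, -1.0])
```

Because the background can be recovered from each observation and innovation pair, the same arrays could also serve as matched pairs if that proves more convenient for the use case.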
Acceptance Testing
Input NRL innovations files can be found in kiowa:/d1/projects/METplus/METplus_Development/feature_635/innovation_data
Contact Liz Satterfield for the model data to be used in use case development.
Time Estimate
Estimate the amount of work required here.
Issues should represent approximately 1 to 3 days of work.
Sub-Issues
Consider breaking the new feature down into sub-issues.
Relevant Deadlines
METplus-4.0 release
Funding Source
2700021
Define the Metadata
Assignee
Labels
Projects and Milestone
Define Related Issue(s)
Consider the impact to the other METplus components.
No known impacts.
New Feature Checklist
See the METplus Workflow for details.
Branch name:
feature_<Issue Number>_<Description>
Pull request:
feature <Issue Number> <Description>
Select: Reviewer(s), Project(s), Milestone, and Linked issues