We need a plan for connecting CIME to the public input data sources #15
Comments
We have an FTP site that was opened up for staging data, but its documentation is still weak. I suggest a meeting with Jun Wang and Kate Friedman to bolster it. We should also explore adding GIT-LFS to mr-weather-app or to the weather-model to see whether the data can be downloaded automatically. |
@mvertens, could we have a meeting to discuss this issue? Thanks. |
I will try to put something on the calendar to discuss. I am interested in @arunchawla-NOAA's suggestion of GIT-LFS and how that repo could be connected to CIME to download data as needed. Another question is whether there is a standard directory structure expected on supported platforms (e.g., for fixed files, ICs, etc.). Is there a standard structure already used on NOAA systems, e.g., Hera, that should be generalized and adopted across all platforms? CESM uses the idea of a shared root data directory; how well does that apply here? |
Could @KateFriedman-NOAA please provide a list of the current input data sources expected to be used by the release (e.g., the FTP site and any others, GIT-LFS, etc.)? |
On the current FTP site the input files (except the static ones) are in tar format, and it is hard to access individual files without extracting the archive. We need a hierarchical, standardized folder structure rather than a single tar file for input. It would also be nice to have input files for different resolutions (e.g., C384) and support for CCPP. |
Given the large size of many of these files, should we support a compression mechanism? |
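To make the tar-access concern concrete, here is a minimal sketch using Python's standard `tarfile` module to read one member of an archive without unpacking the rest to disk. The archive built by `make_demo_tar` and the member names in it are hypothetical placeholders, not files from the FTP site. Note that with a gzipped tarball the reader still has to decompress the stream up to the requested member, which is one reason an unpacked directory tree is easier for a tool like CIME to use.

```python
import io
import os
import tarfile
import tempfile


def read_member(tar_path, member_name):
    """Return the bytes of one archive member without unpacking the others."""
    # "r:*" lets tarfile detect the compression (none, gzip, bz2, xz) itself.
    with tarfile.open(tar_path, "r:*") as archive:
        handle = archive.extractfile(member_name)  # KeyError if name is absent
        if handle is None:
            raise ValueError(f"{member_name} is not a regular file")
        return handle.read()


def make_demo_tar(directory):
    """Build a small gzipped tarball with made-up members and return its path."""
    tar_path = os.path.join(directory, "demo.tar.gz")
    files = {"fix_am/a.txt": b"alpha", "fix_orog/b.txt": b"beta"}
    with tarfile.open(tar_path, "w:gz") as archive:
        for name, payload in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return tar_path


def demo():
    """Create the demo archive in a temporary directory and read one member."""
    with tempfile.TemporaryDirectory() as directory:
        return read_member(make_demo_tar(directory), "fix_orog/b.txt")
```

Calling `demo()` builds the throwaway archive and returns the contents of a single member, leaving nothing else extracted.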
I know of the following on the EMC ftp site: ...which are broken into sub-groups:
This is the full collection of FV3GFS fix files as of November 12th. Most likely overkill. Do we need a pared-down collection?
|
@KateFriedman-NOAA this is great and already very helpful! Can you provide (or is there already) a brief description of what's in each of those subdirectories? Are these subdirs expected to be reproduced as is on all supported platforms? |
@KateFriedman-NOAA The global/fix directory has the fixed files. What is the public source for users to retrieve initial conditions and boundary conditions? Is the plan to host those separately? |
I'm mostly a facilitator of copying these files to our supported platforms, so I can't give a detailed description. Is one needed? Here is a very brief description:
- fix_am - atmospheric
- fix_chem - chemistry
- fix_fv3 - fv3
- fix_fv3_gmted2010 - another fv3
- fix_gldas - GLDAS
- fix_orog - orography
- fix_sfc_climo - surface climo
- fix_verif - verification
When we get updated fix files, another developer or I copy them into one of the main collections (FIX_DIR) on the WCOSS-Dell, and then I copy them to the FIX_DIRs on WCOSS-Dell (other side), both WCOSS-Crays, Hera, and Jet. I also save the whole collection in a new tarball on HPSS. If needed, I can copy whatever final set of fix files folks land on for the release to the supported platforms. We hold them under a group account.
There is no public source currently. Our archival server (HPSS) is not accessible by the public, only from NOAA machines. The release team would need to post a sample set online somewhere (our ftp server maybe). NCEI is a public source for model output but I don't see the restart files there, just some post-processed grib output. Access to model output, especially initial conditions, has become a real issue for anyone who doesn't have access to a NOAA machine. :( |
The list of fix files needed for the runs supported in this release is in the draft UFS WM User's Guide. Work is in progress by @LPCarson to add one-line descriptions for each fix file. This is in a Google Doc:
https://docs.google.com/document/d/1D7aupwMAjIdv_8eHiRtdkyad_o_f_eGmpGFIvBeZXcg/edit#
It will later be converted to Sphinx. @KateFriedman-NOAA, if you can contribute to the description of these files, please do!
|
@arunchawla-NOAA @KateFriedman-NOAA @ligiabernardet - @rsdunlapiv @jedwards4b and @mvertens propose the following structure for managing and hosting input data for the release:
|
The proposal about fix files seems good to me. The proposal for initial files is fine as long as it is extensible. It is my understanding that the Data Prep release team is working on the capability for the MR Weather App to run from GFS GRIB2 files available through public archives (such as NOAA NOMADS and NCAR). This is very important for the community to use this app for research. It would be good if CIME could start the app from those public files. Users may have to download and stage the data on disk by hand; that is fine. Lateral boundary conditions are not needed for the MR Weather App configuration because it is a global domain. |
@ligiabernardet Can you provide an example of files from these public archives that would allow us to run the model? Thanks |
No, I cannot. The current model I have access to does not work from GFS GRIB2 files available in public archives. The UFS release data prep team was doing enhancements to allow this capability. @LarissaReames-NOAA Do you have any update on this? |
@ligiabernardet @jedwards4b The files that we have been testing against are located on NCEP's http server of the form gfs.tCCz.pgrb2.0p25.fFFF or gfs.tCCz.pgrb2.0p50.fFFF. We're also looking to support older archived files on NCDC's nomads server. This is just one example of those files. Functionally, these files should be very similar. There is also a NCEI server (Arun). |
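As an illustration of the naming pattern mentioned above, the sketch below builds file names of the form `gfs.tCCz.pgrb2.0p25.fFFF`. It assumes that `CC` is the two-digit cycle hour (00, 06, 12, or 18) and `FFF` the three-digit forecast hour; the function name `gfs_grib2_name` is made up for this example and is not part of CIME or chgres.

```python
def gfs_grib2_name(cycle_hour, forecast_hour, resolution="0p25"):
    """Build a GFS GRIB2 file name such as gfs.t00z.pgrb2.0p25.f006.

    Assumption: CC is the zero-padded cycle hour and FFF the zero-padded
    forecast hour, as suggested by the placeholders in the thread.
    """
    if cycle_hour not in (0, 6, 12, 18):
        raise ValueError(f"unexpected GFS cycle hour: {cycle_hour}")
    if resolution not in ("0p25", "0p50"):
        raise ValueError(f"resolution not mentioned in the thread: {resolution}")
    if forecast_hour < 0:
        raise ValueError("forecast hour must be non-negative")
    return f"gfs.t{cycle_hour:02d}z.pgrb2.{resolution}.f{forecast_hour:03d}"
```

A workflow could use a helper like this to work out which files a user needs to stage by hand for a given cycle and forecast length.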
@LarissaReames-NOAA Do you have a timeline of when I could expect to be able to run the ufs_mrweather_app using files from nomads? |
@jedwards4b We fixed one last bug in implementing the surface parameter processing code yesterday, so we'll be able to start testing that soon. @arunchawla-NOAA might have more information on how long he thinks that might take. |
@LarissaReames-NOAA is it possible to extract fv3_gfdlmprad.tar under the /EIB/UFS/RT directory on the FTP site? That way, CIME could access the individual files it requires. |
@uturuncoglu I put that tarball up on our ftp server and have access to it. @arunchawla-NOAA @junwang-noaa Any objections if I unpack the tarball on our ftp server? |
@KateFriedman-NOAA @yangfanglin @arunchawla-NOAA Has the FTP directory structure changed? I cannot see the RT directory anymore. There is a simple-test-case/ directory, but it is a tar file. We were using the RT directory to get some files, such as the tables. |
Why are you using RT? That is a snapshot of one of the regression test cases. What files do you need? I am adding @junwang-noaa and @DusanJovic-NOAA |
@arunchawla-NOAA @DusanJovic-NOAA @junwang-noaa The list of files that are retrieved from RT is:
- data_table
- diag_table
- field_table
- nems.configure -> I could create it with a script for mrweather
- gfs_ctrl.nc
- sfc_data.tile*.nc
- gfs_data.tile*.nc
The nc files are required for the example test case, which will be the default for the application. The table files can be retrieved from the source directory, but I am not sure. |
The simple test case has many of the files you need.
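The wildcard entries in the file list above (`sfc_data.tile*.nc`, `gfs_data.tile*.nc`) have to be expanded into concrete names before a tool can request them one by one. The sketch below does that under the assumption that the global cubed-sphere grid has six tiles numbered 1 to 6; the helper names are hypothetical.

```python
N_TILES = 6  # assumption: global cubed-sphere grid with tiles numbered 1..6


def expand_tiles(pattern, n_tiles=N_TILES):
    """Expand a 'tile*' wildcard into one file name per tile."""
    if "tile*" not in pattern:
        return [pattern]
    return [pattern.replace("tile*", f"tile{i}") for i in range(1, n_tiles + 1)]


def required_files(patterns, n_tiles=N_TILES):
    """Flatten a list of names and tile patterns into concrete file names."""
    names = []
    for pattern in patterns:
        names.extend(expand_tiles(pattern, n_tiles))
    return names


# The initial-condition entries from the list in the thread.
IC_PATTERNS = ["gfs_ctrl.nc", "sfc_data.tile*.nc", "gfs_data.tile*.nc"]
```

With six tiles, `required_files(IC_PATTERNS)` yields one control file plus six surface and six atmosphere files.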
|
I think it would be nice to have a sample set of ICs that could be used to run the model without chgres. I have also not implemented restart capability yet; we might need some files for it. BTW, is there any documentation about restarting the standalone FV3? |
@arunchawla-NOAA Then, we need to extract it just as we did before for fv3_gfdlmprad. |
@arunchawla-NOAA @GeorgeGayno-NOAA @KateFriedman-NOAA @yangfanglin How do we handle the fixed, resolution-dependent input files required by CHGRES? The following files are needed for each supported resolution and need to be placed on the FTP site:
|
Currently, I am getting them from the directory that I used for the prototype version of the workflow. |
CHGRES also required to
So, maybe it would be good to create POST and PRE directories and place the required fixed input files there. Any ideas? |
@GeorgeGayno-NOAA and @LarissaReames-NOAA Kate set up an ftp server with all the fix files needed for the model. Can you check to see if all the files needed for chgres are also there? If not, can you let Kate know where to add them, or alternatively place them in a directory there? Location of ftp server below |
The files needed for chgres are there. |
I think we need to open this again. The following issues are not solved yet:
1 - We need to extract the tar file under
2 - global_hyblev.l65.txt is in C96.facsf..nc
There are some files under this directory but those are in grib format. I think it would be better to put those netcdf files under https://ftp.emc.ncep.noaa.gov/EIB/UFS/global/fix/fix_fv3_gmted2010.v20191213/ based on their resolution. |
The above files are under the ./fix_sfc subdirectory.
|
Okay. I could see them now. Thanks @GeorgeGayno-NOAA. We still need to extract the tar file. |
@uturuncoglu I have unpacked the tarball: https://ftp.emc.ncep.noaa.gov/EIB/UFS/simple-test-case/ I also moved the gzipped tarball up one level so it wasn't within the unpacked folder. Let me know if I should put it back down within the simple-test-case folder. The path to that tarball is now: https://ftp.emc.ncep.noaa.gov/EIB/UFS/simple-test-case.tar.gz
|
Why do we need this tar file unpacked? Its only purpose is to be used as a canned case for testing the ufs-weather-model executable. Nothing else. It should not be used as a source of any configuration or input data. |
@KateFriedman-NOAA thanks. @DusanJovic-NOAA Actually, we are copying the *_table files, nems.configure, and also the default initial conditions, but I'll include the *_table files and nems.configure in the FV3 CIME interface. The initial conditions are also generated by chgres, and if there are no objections we could always use chgres to produce data from GFS at a desired resolution. Then we will remove the dependency on the simple-test-case/ directory. |
I am closing this issue as most things seem to be resolved here |
In order for CIME to be extended to support different forecast initial conditions flexibly, we need to understand the directory structure in the FTP site that will be made available. We need to understand the requirements for CIME to obtain this data and the list of files that will be needed.
A major concern is the potentially large size of some of the files. We need to determine whether they must be downloaded manually or whether CIME should download them automatically as part of its workflow.
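One way to picture the automatic option is a download-on-demand helper that mirrors the server layout under a shared local data root and fetches a file only when it is missing. The sketch below is an illustration under stated assumptions, not CIME's actual implementation: `fetch_if_missing`, `http_download`, and the example relative path are hypothetical, and the base URL is the one quoted elsewhere in this thread.

```python
import os
import shutil
import tempfile
import urllib.request

BASE_URL = "https://ftp.emc.ncep.noaa.gov/EIB/UFS"  # server quoted in the thread


def http_download(url, destination):
    """Stream a URL to a local file."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)


def fetch_if_missing(relative_path, data_root, base_url=BASE_URL,
                     download=http_download):
    """Return (local_path, downloaded), fetching the file only if it is absent."""
    local_path = os.path.join(data_root, *relative_path.split("/"))
    if os.path.exists(local_path):
        return local_path, False
    os.makedirs(os.path.dirname(local_path), exist_ok=True)
    partial = local_path + ".part"  # avoid leaving a truncated file on failure
    download(f"{base_url}/{relative_path}", partial)
    os.replace(partial, local_path)
    return local_path, True


def demo():
    """Offline demo: a fake downloader stands in for the real server."""
    def fake_download(url, destination):
        with open(destination, "wb") as out:
            out.write(url.encode())

    with tempfile.TemporaryDirectory() as root:
        # "global/fix/example.txt" is a made-up path used only for illustration.
        first = fetch_if_missing("global/fix/example.txt", root, download=fake_download)
        second = fetch_if_missing("global/fix/example.txt", root, download=fake_download)
        return first[1], second[1]
```

Because the second call finds the file already under the data root, large files are transferred at most once per platform, which also leaves room for users to stage them by hand instead.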