Add/Implement data source mirrors #26
Comments
thanks @observingClouds! I was actually thinking that maybe I should remove my zarr-based mirrors from the main repository and we just use AERIS directly instead. What do you think? I'm happy to keep my zarr-based catalog available, but maybe I'll put that on a separate repository that we can link to from this main one? Maybe in |
Well, as long as you can keep the files up to date (and I don't expect to reprocess them soon) and/or make sure users can see which version they are using (DOI), it might actually be great to still have that resource in case AERIS is down. It would be great if one could have several possible resources in the catalog and intake switched between them (semi-)automatically, but I guess this is not yet implemented? You guys probably know more. |
I think references to AERIS should go into the catalog. However, having an active backup is also a very good idea. There is already some progress in intake/intake#557 on providing multiple locations for one dataset, but it is not done yet. Having a mirror structure could be an addition, but I am not sure we really want that. As a result, users would have to specify some form of path manually again, and most likely we'd end up with a couple of scripts being passed around which only access the "mirror" tree. This can become particularly problematic if the mirror is not complete, such that some datasets effectively work only on the main tree while others probably work only on the mirror tree... |
So, in the meantime (before mirroring is available) we could just go ahead and replace the entry backed by my server with the data on AERIS? I think adding a |
Puh... I really find this one hard to decide.
I have to 🤷 and hope that others have better arguments. |
Ah yes, you're absolutely right. I hadn't thought of that. We could instead adopt a convention of adding |
I don't know if this makes the situation better or worse... If we implement this, then a user would need to access the data using something like:

```python
import eurec4a


def reliable_to_dask(cat, entry):
    """Load `entry` from `cat`, falling back to its `__mirror` entry on failure."""
    try:
        return cat[entry].to_dask()
    except Exception:
        # Primary source failed (e.g. server down): try the mirror entry instead.
        return cat[f"{entry}__mirror"].to_dask()


cat = eurec4a.get_intake_catalog()
### some more code
ds = reliable_to_dask(cat.ATR, "track")
```

This has the potential of not creating a ton of hard-coded |
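A slightly more general variant of the helper above, as a rough sketch only: it tries a list of entry-name suffixes in order and reports every name it tried if none can be loaded. The function name `to_dask_with_fallback` and the `suffixes` parameter are hypothetical; the only things taken from this thread are the `__mirror` suffix and the `cat[entry].to_dask()` call. It assumes nothing about intake beyond that call, so it does not replace the multi-location support discussed in intake/intake#557.

```python
def to_dask_with_fallback(cat, entry, suffixes=("", "__mirror")):
    """Try `entry` plus each suffix in order; return the first source that loads.

    `cat` is anything indexable by entry name whose items offer `.to_dask()`
    (an intake catalog is assumed here). `suffixes` is a hypothetical
    convention: "" is the primary entry, "__mirror" the backup.
    """
    tried = []
    last_exc = None
    for suffix in suffixes:
        name = f"{entry}{suffix}"
        tried.append(name)
        try:
            return cat[name].to_dask()
        except Exception as exc:
            # Missing entry or unreachable server: remember the error and move on.
            last_exc = exc
    raise RuntimeError(
        f"could not load {entry!r}; tried: {', '.join(tried)}"
    ) from last_exc
```

With the catalog from the snippet above, this would be called as `to_dask_with_fallback(cat.ATR, "track")`. The concern about incomplete mirrors still applies: an entry without a `__mirror` counterpart simply has no fallback, and the raised error lists the names that were tried.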
This includes a change from denby.io to AERIS. See eurec4a#26 for some discussion about this.
Hi guys,
I'm just in the process of uploading a new version of the radiosonde dataset. This time, it is not a tar archive, but the level1 and level2 data can be directly accessed through the AERIS THREDDS server.
@leifdenby do you want to update your zarr files, or change to the AERIS THREDDS server (https://observations.ipsl.fr/thredds/catalog/EUREC4A/PRODUCTS/MERGED-MEASUREMENTS/RADIOSOUNDINGS/v3.0.0/level2/catalog.html), or, even better, add both sources for better availability in case a server is down?
I'll make an announcement in the data channel when the upload is final.
Cheers!