Republication failure because of old version of the dataset. #180
Comments
@pabretonniere another option for you is to start using the pre-release v5 of the publisher. This doesn't use THREDDS catalogs at all. https://esg-publisher.readthedocs.io/en/gen-five-pkg/ |
Thanks @sashakames ! I'll give it a try with the new version. Is it fully compatible with the previous one? I mean, can I use only the publish command with mapfiles generated with my previous conda env (esgmapfile (from esgprep v2.9.4 2018-10-12))? |
Correct, same mapfiles as before. The esg.ini format is not compatible (you still use the old format with esgmapfile), but the new format is much simpler, so the file is shorter. |
Right. Thanks a lot! |
Update on this: we tried to update the publisher but it was not successful. Debugging this with @pchengi, it seems to come from the fact that we had an old version of the node. We are now upgrading the node with Ansible but still have an issue (ESGF/esgf-ansible#153). In case we can't update the node, do you see any possible solution with the current publisher/node? Thanks |
Hi @sashakames After our discussion in the last CDNOT meeting, here are the details of the run with the latest publisher (esgcetv5 beta):
I attach the log and strace if it helps. It looks like a permission problem (User: https://esg-dn1.nsc.liu.se/esgf-idp/openid/kserradell_liu is not authorized to publish/unpublish resource: CMIP6.ScenarioMIP.EC-Earth-Consortium.EC-Earth3.ssp585.r3i1p1f1.Omon.zostoga.gn.v20210608.zostoga_Omon_EC-Earth3_ssp585_r3i1p1f1_gn_209501-209512.nc) but we checked them with Prashanth and they looked OK. Thank you. |
@pchengi2 @pchengi @pabretonniere I hope you all can take another look at the permissions issue. Looking through the trace the only issue I see is that it detected that the dataset version that is being published appears to match the same version that was previously published. If this publication is intended to update a previous version, the recommended practice is to create a new version. But that shouldn't prevent the publication on the index. v20210608 The spot in the logs shows that there already exists a dataset with the same id: |
Thank you for the answer. This version might come from a previous intent of publication, but the data corresponding to v20210608 is the one we want to publish. But it doesn't appear on the index node anyway, so something else in the workflow must have failed... |
The dataset record appears for me (full record): https://esg-dn1.nsc.liu.se/esg-search/search/?distrib=false&format=application%2Fsolr%2Bjson&data_node=esgf.bsc.es&master_id=CMIP6.ScenarioMIP.EC-Earth-Consortium.EC-Earth3.ssp585.r3i1p1f1.Omon.zostoga.gn I'd recommend a different access control string, as I've found the rules application to be at times unpredictable and to restrict things I expect to pass. |
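For reference, the search link above can be rebuilt from its parts, which makes it easy to check other datasets the same way. This is a minimal sketch using only the Python standard library; the helper name `build_search_url` is hypothetical, while the endpoint and the query parameters are the ones that appear in the link above.

```python
from urllib.parse import urlencode

def build_search_url(index_node, data_node, master_id):
    """Build an esg-search query URL for one dataset on one data node."""
    params = {
        "distrib": "false",                 # query only this index node
        "format": "application/solr+json",  # ask for the Solr JSON response
        "data_node": data_node,
        "master_id": master_id,
    }
    return f"https://{index_node}/esg-search/search/?{urlencode(params)}"

url = build_search_url(
    "esg-dn1.nsc.liu.se",
    "esgf.bsc.es",
    "CMIP6.ScenarioMIP.EC-Earth-Consortium.EC-Earth3.ssp585.r3i1p1f1.Omon.zostoga.gn",
)
print(url)
```

The printed URL should be the same as the one quoted in the comment, so it can be opened in a browser or fetched with any HTTP client.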
What do you mean by "I'd recommend a different access control string"? Would you recommend trying to unpublish the current v20210608 that didn't make it properly to the index node and republish it again with a new version? |
You should not have to unpublish in order to republish in this case. The access control issue is on the index server side. Presumably if there is a problem with access control, you wouldn't be able to unpublish either because of the same rule. Also, (1) if you haven't changed the data for v20210608 then there should be no need to create a new version here. |
@pchengi2 @pchengi, can we discuss the first point that Sasha is mentioning? What would you recommend? @sashakames: meanwhile, I would be keen on trying your second option. Could you please give more details (or point me to some documentation) on how to do it? Thank you both! |
Sure: https://esgf.github.io/esg-search/REST_Publishing_Services.html#push-operations Apparently the "dataset" passes as I see this now in the log you provided: Published record: CMIP6.ScenarioMIP.EC-Earth-Consortium.EC-Earth3.ssp585.r3i1p1f1.Omon.zostoga.gn.v20210608|esgf.bsc.es Done. Cleaning up. And that explains why the record might appear when I searched but there are no files associated with it. But it raises an issue that the publisher should halt when it encounters the first error of this type. |
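The record id in that log line packs three things together: the dataset's master id, its version, and the data node. A small sketch of how it could be taken apart, for example to compare the version being published against an earlier one; the function name `split_record_id` is hypothetical and the layout `master_id.vYYYYMMDD|data_node` is assumed from the log line quoted above.

```python
def split_record_id(record_id):
    """Split 'master_id.vYYYYMMDD|data_node' into its three parts."""
    dataset_id, _, data_node = record_id.partition("|")
    master_id, _, version = dataset_id.rpartition(".")
    # The last dot-separated field is expected to be the version, e.g. v20210608.
    if not (version.startswith("v") and version[1:].isdigit()):
        raise ValueError(f"no version suffix in {dataset_id!r}")
    return master_id, version, data_node

master_id, version, data_node = split_record_id(
    "CMIP6.ScenarioMIP.EC-Earth-Consortium.EC-Earth3.ssp585.r3i1p1f1"
    ".Omon.zostoga.gn.v20210608|esgf.bsc.es"
)
print(master_id)
print(version, data_node)
```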
Thank you, I will check this and let you know! |
Sorry about the radio silence. My wife delivered a child three weeks ahead
of schedule, so I couldn't let you know about my limited availability. I'm
expected to be working one day a week after tomorrow. I'll go through this
thread and see what I can do.
Regards
Prashanth
On Wed, 29 Sep 2021 at 08:43, pabretonniere wrote:
Thank you, I will check this and let you know!
|
@pabretonniere I found a bug in the publisher that impacts the "File" "id" field and that might be responsible for why the publications are failing to return files. I'll try to get a release out, but that might mean tomorrow. I'll keep you posted. |
Thank you @sashakames ! Looking forward to testing the new release! |
Hi, After trying the new version of the publisher (5.1.0b6), I get a different error, Solr-related it seems. Log attached. Note that I had to comment out lines 25 and 26 in esgcet/cmip6.py because they threw an error:
    if self.replica:
        self.skip_prepare = argdict["skip-prepare"]
Thanks |
Thanks for the update and pointing out the error. Sorry you needed to comment out that line. I'm surprised, you didn't set replica = true. That will be fixed in the next release. I'm not sure why the LiU index is reporting a 500 error. If this happens repeatedly we may need to troubleshoot with Prashanth. For now though I see what's happening. The workaround is to clean up: delete the record that was partially published. You may need additional arguments for
Then check if it was properly deleted:
If so you can start the publication run again. |
Excellent, thanks! Now I could unpublish the data, and the new publish works fine until it gets a permission error from LIU. I will look into this with Prashanth.
|
Hi @pchengi When you can, could you have a look at the error I'm having please? It looks like a permission issue on your side. If you prefer, we can communicate through the ESGF Slack or by email. Thank you very much |
Hi, Following @pabretonniere's error here at BSC, I wanted to add the steps I followed to try to publish one variable of one of our experiments:
I first ran the |
@aearamos I took a look at the log, and you were getting "success" messages, so presumably the publication had worked. Unfortunately the index node at LiU is offline at the moment (Prashanth is addressing a security concern) so we can't verify the publication. |
Thanks. Actually we were surprised not to see errors in the logs. But we already tried this command a few weeks ago, and at that time I think the LIU node was up and running (we could see other data of ours published there). But we can come back in a few days when the index node is back to life. |
I checked again (in the event there's already a replica) and I found that the LLNL replica shard picked it up. See on our new UI (pre-beta test) (if the server is down try again in a few minutes as we may be updating it at some point...) |
Indeed. It seems to be findable. And the data that @aearamos mentioned and that you found is completely new (it doesn't come from an errata-unpublish-correct-republish) but we never saw it appear on the "classic search" through the portal. So I understand this indicates a LIU/index node issue? |
Correct, maybe check back next week with P on the status of their node. If publishing works now, that's great news, but clearly you need the index node online to go further. |
Hi @pchengi The LIU index node seems to be back but the issue we discussed above with Sasha is still present. Can you have a look please? Thanks! |
Hello @sashakames I've been able to publish the new data (all successful and no error in the logs, as you can see in an example log attached) and can indeed see it in your new UI: However, I can't download the files. I get the following message when accessing them through OpenDAP:
Error {
code = 500;
message = "java.net.MalformedURLException: /data/CMIP6/DCPP/EC-Earth-Consortium/EC-Earth3/dcppB-forecast/s2021-r1i4p1f1/Amon/pr/gr/v20211222/pr_Amon_EC-Earth3_dcppB-forecast_s2021-r1i4p1f1_gr_202111-202210.dods must start with dods: or http: or file:";
};
Here is the full command I ran:
esgpublish --verbose --ini /data/home/pbretonn/esgcet-5/a3w5-dcppB-forecast-r1i4p1f1.ini --map /data/mapfiles/a3w5-dcppB-forecast-r1i4p1f1/
The corresponding log, ini and an example of the mapfiles used are attached in esgf-files.zip. Do you have an idea of what could be wrong? The URL seems to be right and has the same syntax and directory tree as what we had successfully published before. Thanks a lot, |
Hi @pabretonniere I took a look, but first wanted to see if I could download the file via http. I'm getting a 404, see
You may want to check your data mount. |
Thank you @sashakames There might indeed be 2 issues. My data is located in /data/a3w5-dcppB-forecast-r1i4p1f1/CMIP6/DCPP/EC-Earth-Consortium/EC-Earth3/dcppB-forecast/s2021-r1i4p1f1/Amon/pr/gr/v20211222/pr_Amon_EC-Earth3_dcppB-forecast_s2021-r1i4p1f1_gr_202111-202210.nc In the esg.ini (attached in the previous comment) called in the esgpublish, I specified the following: data_roots = {"/data/a3w5-dcppB-forecast-r1i4p1f1": "esg_dataroot"} whereas in /esg/config/esgcet/esg.ini, I have
So I guess there is a mismatch between the two: one is looking for the data in /data/a3w5-dcppB-forecast-r1i4p1f1/CMIP6 while the other is looking in /data/CMIP6. |
this worked: |
I think I have a solution for you. data_roots = {"/data/a3w5-dcppB-forecast-r1i4p1f1": "esg_dataroot"} performs a replacement, so I understand now why the paths did not match. data_roots = {"/data/a3w5-dcppB-forecast-r1i4p1f1": "esg_dataroot/a3w5-dcppB-forecast-r1i4p1f1"} is what you need for now. I think there might be an issue with "submounts". The common case is that the "project root", e.g. "CMIP6", is to be found right after the data root, but that might not fit well for all arrangements. If the old publisher supported "/data" -> "esg_dataroot" then that should be the case here as well. I'll need to ensure the logic works, or correct it if not, as this was updated recently in a PR and I tested various cases, but this feature is admittedly lacking a comprehensive specification of valid inputs and outputs. A release update is in the works. |
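To make the replacement concrete, here is a rough Python sketch of what a prefix replacement on data_roots does to the file path from this thread. It is not the publisher's actual code: the function `apply_data_roots` is hypothetical and assumes a plain longest-prefix substitution, which is only what the comment above describes in words.

```python
def apply_data_roots(path, data_roots):
    """Replace the longest matching data root prefix of `path` (assumed behaviour)."""
    for root in sorted(data_roots, key=len, reverse=True):
        prefix = root.rstrip("/")
        if path == prefix or path.startswith(prefix + "/"):
            return data_roots[root] + path[len(prefix):]
    raise ValueError(f"{path} is not under any configured data root")

nc_file = (
    "/data/a3w5-dcppB-forecast-r1i4p1f1/CMIP6/DCPP/EC-Earth-Consortium/EC-Earth3/"
    "dcppB-forecast/s2021-r1i4p1f1/Amon/pr/gr/v20211222/"
    "pr_Amon_EC-Earth3_dcppB-forecast_s2021-r1i4p1f1_gr_202111-202210.nc"
)
exp = "a3w5-dcppB-forecast-r1i4p1f1"

# Mapping from the original ini: the experiment directory disappears from the result.
dropped = apply_data_roots(nc_file, {f"/data/{exp}": "esg_dataroot"})
# Mapping suggested above: the experiment directory is kept.
kept = apply_data_roots(nc_file, {f"/data/{exp}": f"esg_dataroot/{exp}"})
# The general "/data" -> "esg_dataroot" mapping gives the same result as the suggestion.
general = apply_data_roots(nc_file, {"/data": "esg_dataroot"})

print(dropped)
print(kept)
```

Under this assumption the first mapping yields a path starting with esg_dataroot/CMIP6/, which matches the /data/CMIP6/... path in the error message, while the other two keep the experiment directory.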
This makes sense. My data has always been in /data/$expid-$exp-$member/CMIP6/... (published with the old and the new publisher) Now with the new publisher, I was using an esg.ini template with
With your new release, I will use data_roots = {"/data/$expid-$exp-$member": "esg_dataroot/$expid-$exp-$member"}. Thanks again! |
Sorry, having a separate {"/data/$expid-$exp-$member": "esg_dataroot/$expid-$exp-$member"} for each tuple sounds inconvenient. So we will get the map {"/data": "esg_dataroot"} to work soon (or it might work now, but I'm not sure if you want to be the first tester, as I can't promise I'll look at that until next week given my schedule and commitments). |
Hi @pabretonniere I was able to test the publisher and it should work fine with the |
Hi, After talking with @pchengi, he suggested that I try (with the old publisher) splitting the command I was using for publishing in 2: from:
to
followed by:
I did it for an old experiment that we could publish some time ago, the ones that got me the failure I reported initially here and one completely new experiment. With this strategy, I get some errors that I didn't see when merging both keywords, as you can see in the logs attached, and Prashanth thought our initial issue might come from there. It's strange because the files are fine and they could be published (= visible) on https://aims2.llnl.gov/metagrid but we thought it might be worth reporting it. |
Hi @pabretonniere We had a hardware issue with the aims2.llnl.gov host. We are working on restoring metagrid. That said, I'm unable to support issues with the old publisher. I'd hope you could continue with the v5.1.x-beta versions (I recommend keeping up to date as we push out bug fixes). If you go that route (rather than reverting to the "old" [v3.x] versions) I'll make every effort to support. |
With the new publisher, it works: I can publish the data, both for the DCPP data that I mentioned a few weeks ago and for the one I published this afternoon. I gave it a try right now and I didn't get any error message. I can only double check that the data can be seen through your new search engine once aims2.llnl.gov is back, but as before, I can't see it in the official search https://esg-dn1.nsc.liu.se/search/cmip6-liu/. |
https://aims2.llnl.gov/metagrid/search is back on line. What are the search criteria? I'm wondering why it doesn't appear on the liu CMIP6 site... |
The search was for example https://aims2.llnl.gov/metagrid/search?project=CMIP6&data=%7B%22activeFacets%22%3A%7B%22activity_id%22%3A%5B%22CMIP%22%5D%2C%22data_node%22%3A%5B%22esgf.bsc.es%22%5D%2C%22source_id%22%3A%5B%22EC-Earth3-CC%22%5D%2C%22experiment_id%22%3A%5B%22historical%22%5D%2C%22variant_label%22%3A%5B%22r9i1p1f1%22%5D%7D%7D and it indeed worked, I can see and download the file. So we are left with the initial issue of why it is not appearing on the LIU and other CMIP6 websites. @pchengi2 suggested that it might come from the THREDDS error that appeared with the old publisher, but I understand that this is not the case, as the file seems to be correctly published and visible through your metagrid. Would it be an option to try to publish this file (or a new one if needed) on another index node to see if the error comes from there? |
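Since the search criteria are hard to read in percent-encoded form, here is a short standard-library sketch that decodes them from the Metagrid link above. The function name `active_facets` is hypothetical; the URL and the `data`/`activeFacets` layout are taken from the link itself.

```python
import json
from urllib.parse import urlsplit, parse_qs

def active_facets(search_url):
    """Return the activeFacets dict encoded in a Metagrid search URL."""
    query = parse_qs(urlsplit(search_url).query)
    data = json.loads(query["data"][0])  # parse_qs already percent-decodes
    return data["activeFacets"]

url = (
    "https://aims2.llnl.gov/metagrid/search?project=CMIP6&data="
    "%7B%22activeFacets%22%3A%7B%22activity_id%22%3A%5B%22CMIP%22%5D"
    "%2C%22data_node%22%3A%5B%22esgf.bsc.es%22%5D"
    "%2C%22source_id%22%3A%5B%22EC-Earth3-CC%22%5D"
    "%2C%22experiment_id%22%3A%5B%22historical%22%5D"
    "%2C%22variant_label%22%3A%5B%22r9i1p1f1%22%5D%7D%7D"
)
facets = active_facets(url)
for name, values in facets.items():
    print(name, values)
```

The decoded facets are activity_id, data_node, source_id, experiment_id and variant_label, each with a single value.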
@pabretonniere @pchengi I figured out why the published datasets aren't appearing. I noticed this in the metadata:
So the publisher configuration likely has |
It wooooooorks!!!! Thank you so much @sashakames !!! I could publish a new dataset deactivating the replica and I can see it on the LIU node! I'll have a look at how to modify the metadata records for the old data and if I can't make it work, I'll talk to you on Slack. I still don't get why it stopped working at some point as I don't think I ever touched this, but at this point, it doesn't really matter any more, as long as the issue is fixed! Thanks again! |
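For anyone checking their own records for the same problem, a minimal sketch of how the replica flag could be inspected in a search response. It assumes the response uses the usual Solr JSON layout (`response` then `docs`) and that each record carries a boolean `replica` field; the function name `replica_records` and the sample ids are illustrative only, not real search output.

```python
def replica_records(solr_response):
    """Return the ids of records flagged as replicas in a Solr JSON response."""
    docs = solr_response.get("response", {}).get("docs", [])
    return [doc["id"] for doc in docs if doc.get("replica") is True]

# Illustrative response, not real search output.
sample = {"response": {"docs": [
    {"id": "dataset-A|esgf.bsc.es", "replica": True},
    {"id": "dataset-B|esgf.bsc.es", "replica": False},
]}}
flagged = replica_records(sample)
print(flagged)
```

Any id returned here would be a record published with the replica setting active, which in this thread is what kept the data from showing up on the LIU search page.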
Just one minor additional thing, I see the OpenDAP URL given by the metagrid search is not working: gives It is missing the .nc. https://esgf.bsc.es/thredds/dodsC/esg_dataroot/a3o4-r9-noreplica/CMIP6/CMIP/EC-Earth-Consortium/EC-Earth3-CC/historical/r9i1p1f1/Amon/uas/gr/v20220302/uas_Amon_EC-Earth3-CC_historical_r9i1p1f1_gr_185001-185012.nc.dods works though. Same for the thredds search in the official CMIP website. But the http download works, so that's good. |
On the OpenDAP URLs: this should be corrected if using the most recent publisher version (v5.1.0b8). I recall there was an issue where we had the template wrong for the .dods or .html extension. As this error proliferated, we'll try to get it corrected in the Metagrid copy URL feature if feasible. But best practice is to upgrade the publisher. An updated release is close, so you may prefer to delay for a day or so. |
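Until the records are republished with a fixed publisher, a client-side workaround could look like the sketch below. It is only an illustration: the helper `ensure_nc_before_suffix` and the example.org URL are hypothetical, and the only fact taken from the thread is that the working URL ends in .nc.dods while the broken one lacks the .nc.

```python
def ensure_nc_before_suffix(url, suffixes=(".dods", ".html")):
    """Insert '.nc' before an OpenDAP suffix when it is missing."""
    for suffix in suffixes:
        if url.endswith(suffix):
            base = url[: -len(suffix)]
            if not base.endswith(".nc"):
                base += ".nc"
            return base + suffix
    return url  # no known suffix: leave the URL untouched

# Hypothetical URL for illustration; not copied from a real record.
broken = "https://example.org/thredds/dodsC/esg_dataroot/some/path/file_185001-185012.dods"
fixed = ensure_nc_before_suffix(broken)
print(fixed)
```

A URL that already ends in .nc.dods is returned unchanged, so the helper is safe to apply to a mix of old and new records.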
Hi, I have a similar failure.
How can I clean up the previous info on the local PostgreSQL database only for $expid, without affecting other cmip6 data publication? Can I use this |
@tiantdk Is this still an issue for you? If so, I recommend you upgrade your publisher. Once done, you should be able to publish new versions without relying on the PostgreSQL database. Please see https://esg-publisher.readthedocs.io/en/latest/ for more information. |
@sashakames I have solved the problem by using the above mentioned command and the current version is 3.7.3. Many thanks for your suggestion! I will consider an update. |
Hello,
I have a recurrent issue when republishing a dataset.
I don't know if it is related to the publication itself, or the previous unpublish.
We published a dataset in our BSC ESGF node last May (2021/05/11), and the process went well. Then, to add a missing year in a variable, I unpublished it, and I saw the dataset being unpublished from the ESGF as expected, both from the search and from our THREDDS, without any error message.
Then I corrected the dataset and redid the DRS, mapfiles and publish steps, but when getting to the publish step, I have this error:
It seems the publisher is still looking for the previous version, even if it doesn't exist anymore anywhere (the file thredds/catalog/esgcet/237/CMIP6.ScenarioMIP.EC-Earth-Consortium.EC-Earth3.ssp585.r3i1p1f1.Omon.zostoga.gn.v20210511.xml does not exist anymore).
I see that the new version is present in THREDDS (https://esgf.bsc.es/thredds/catalog/esgcet/249/CMIP6.ScenarioMIP.EC-Earth-Consortium.EC-Earth3.ssp585.r3i1p1f1.Omon.zostoga.gn.v20210608.html) but not in the index node, probably because of the previous error.
In the past we unpublished and republished without any problem, but this is happening for the last 3 datasets we worked on.
Would @lukaszlacinski or @soay (if related to the PIDS, which I doubt) have an idea about what could be happening?
We are publishing from our data node, with LIU as index node and esg-publisher (esgcet) version 3.7.3.
Thanks a lot.
Tagging @aearamos who faced the same issue working with me on these datasets.