As a data custodian, I want the Deep Archive to work around invalid URLs in the Registry #162
Comments
@nutjob4life I just realized I never triaged this and we probably need this cleaned up ASAP to unblock that operations ticket.
@jordanpadams on it!
Thanks @nutjob4life
@nutjob4life what URL should I use to run the deep-registry-archive? I got "ValueError: 🤷‍♀️ The bundle urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.1 cannot be found in the registry at https://pds.nasa.gov/api/search/1.0/"
Hi @gxtchen, I don't know the answer to this. I believe the URL is correct but perhaps the registry is missing some data? @tloubrieu-jpl @jordanpadams could you take a peek? When I run it, I get the same thing:
mirasol 209 % .v/bin/pds-deep-registry-archive --site PDS_RNG urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.1
INFO PDS Deep Registry-based Archive, version 1.3.0
ERROR 💥 We got an unexpected error; sorry it didn't work out
Traceback (most recent call last):
  File "/Users/kelly/Documents/Clients/JPL/PDS/Development/nasa-pds/deep-archive/src/pds2/aipgen/registry.py", line 375, in main
    generatedeeparchive(args.url, args.bundle, args.site, not args.include_latest_collection_only)
  File "/Users/kelly/Documents/Clients/JPL/PDS/Development/nasa-pds/deep-archive/src/pds2/aipgen/registry.py", line 350, in generatedeeparchive
    prefixlen, bac, title = _comprehendregistry(url, bundlelidvid, allcollections)
  File "/Users/kelly/Documents/Clients/JPL/PDS/Development/nasa-pds/deep-archive/src/pds2/aipgen/registry.py", line 224, in _comprehendregistry
    raise ValueError(f"🤷‍♀️ The bundle {bundlelidvid} cannot be found in the registry at {url}")
ValueError: 🤷‍♀️ The bundle urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.1 cannot be found in the registry at https://pds.nasa.gov/api/search/1.0/
INFO Thanks for using this program! Bye!
@gxtchen you cannot test this with the public registry until the multi-tenancy migration has completed: NASA-PDS/registry#185
You can try downloading that data, loading it into a local registry, and testing with that.
@gxtchen can wait to test that until the API is up again.
Checked for duplicates
No - I haven't checked
🧑‍🔬 User Persona(s)
Users like the ones in NASA-PDS/operations#476
💪 Motivation
The Registry API seems to be loaded with some bad data, namely file paths with a double slash (//) between cassini_uvis_solarocc_beckerjarmak2023 and data. This causes the Deep Archive to output Submission Information Packages that contain double slashes too, which in turn causes validation errors.
Additional Details
See NASA-PDS/operations#476 for a specific example.
Acceptance Criteria
Given a document in OpenSearch containing double-slashes in the URL path
When I perform pds-deep-registry-archive on the bundle containing that document
Then I expect the file paths and URLs output to be "cleaned" up to single slashes
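To illustrate the kind of cleanup the acceptance criteria describe, here is a minimal sketch in Python using only the standard library. It is not the Deep Archive's actual fix: the function name collapse_slashes and the example.org URL are hypothetical, chosen only for illustration. The idea is to collapse repeated slashes in the path component only, so the "//" that follows the URL scheme is left alone.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def collapse_slashes(url: str) -> str:
    """Collapse runs of slashes in the path of ``url`` to a single slash.

    Hypothetical helper for illustration; not part of the Deep Archive API.
    """
    parts = urlsplit(url)
    # Only the path is rewritten; the scheme, host, query, and fragment
    # are passed through unchanged.
    cleaned_path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit(parts._replace(path=cleaned_path))


# Hypothetical URL with a double slash before "data"
print(collapse_slashes("https://example.org/archive/some_bundle//data/file.xml"))
```

One caveat with this sketch: a bare file path that itself begins with "//" would be parsed by urlsplit as having a network location, so such inputs would need separate handling.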
Engineering Details
No response