[nssdca-delivery] urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.0 #476

Open
mace-space opened this issue Dec 21, 2023 · 29 comments

Comments

@mace-space

Discipline Node Information


Engineering Node Process

See the internal EN process at https://pds-engineering.jpl.nasa.gov/content/nssdca_interface_process

@c-suh
Contributor

c-suh commented Jan 27, 2024

@mace-space hello and thank you for your submission! Unfortunately, there are a number of errors which must be addressed before this can be posted for the NSSDCA. I am attaching the validation report for your review; please resubmit the updated delivery package after addressing the multiple instances of the 5 errors. Thank you!

Validation report: cassini_uvis_solarocc_beckerjarmak2023_v1.0_20231221-validate.txt

As an additional note, I've noticed that you're using an older version of Validate and highly recommend upgrading to the latest version as it has the latest features and bug-fixes. Thank you!

@mace-space
Author

mace-space commented Feb 1, 2024

Thanks, @c-suh

I updated Validate to the latest version, which I'm glad to have done as it spotted bugs that the older version of Validate missed (and I have re-processed the bundle to correct those table offset byte count errors).

However, after re-running pds-deep-registry-archive, the AIP and SIP remain invalid. Looks like issue #155

The AIP and SIP labels reference an incomplete bundle LIDVID: cassini_uvis_solarocc_beckerjarmak2023::1.1, resulting in errors:

   FAIL: file:/Volumes/pdsdata-admin/data_sandbox/deep_registry/test/cassini_uvis_solarocc_beckerjarmak2023_v1.1_20240201_aip_v1.0.xml
       ERROR  [error.label.schematron]   line 27, 25: The number of colons found in lidvid_reference: (2) is inconsistent with the number expected: (5:7).
       ...
   ...
   FAIL: file:/Volumes/pdsdata-admin/data_sandbox/deep_registry/test/cassini_uvis_solarocc_beckerjarmak2023_v1.1_20240201_sip_v1.0.xml
       ERROR  [error.label.schematron]   line 77, 25: The number of colons found in lidvid_reference: (2) is inconsistent with the number expected: (5:7).
       ...
   ...
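
For context, the schematron message is about how many colons appear in `lidvid_reference`. A minimal sketch of that count in Python, assuming a complete LIDVID carries the `urn:nasa:pds:` prefix; the helper name `colon_count_ok` is hypothetical, and this is an illustration rather than the schematron rule itself:

```python
def colon_count_ok(lidvid: str, low: int = 5, high: int = 7) -> bool:
    """Return True when the colon count is inside the range quoted in the error (5:7)."""
    return low <= lidvid.count(":") <= high


# The value the AIP/SIP labels contained, and the full form with the URN prefix.
incomplete = "cassini_uvis_solarocc_beckerjarmak2023::1.1"
complete = "urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.1"

print(incomplete.count(":"), colon_count_ok(incomplete))  # 2 False
print(complete.count(":"), colon_count_ok(complete))      # 5 True
```

The incomplete reference has only the two colons of the `::` separator, which matches the "(2)" in the report.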

@c-suh
Contributor

c-suh commented Feb 1, 2024

@mace-space that is a great find on the deep archive issue! I concur and hope that the issue will be resolved soon. I will try to notify you here once it is. Thank you!

@jordanpadams
Member

@c-suh see updated package here:
Archive.zip

@c-suh
Contributor

c-suh commented Feb 16, 2024

@jordanpadams and @mace-space this set has been posted for NSSDCA processing! From tomorrow, you can check the status at https://nssdc.gsfc.nasa.gov/psi/ReportPDS4.jsp using the SIP LID below:

SIP LID:

  • urn:nasa:pds:system_bundle:product_sip_deep_archive:cassini_uvis_solarocc_beckerjarmak2023_v1.0_20240215

@mace-space
Author

Thanks! @c-suh I checked the status and SIP LIDVID: urn:nasa:pds:system_bundle:product_sip_deep_archive:cassini_uvis_solarocc_beckerjarmak2023_v1.0_20240215::1.0
failed because

Bundle located at https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023//bundle.xml does not match checksum in manifest

I think this is because I updated the bundle (to fix the issue detected by the updated version of Validate) while the pds-deep-registry-archive tool was being patched, so the URL (https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023) now points to v1.1 rather than v1.0 of the bundle.
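
To illustrate why a changed file behind an unchanged URL produces this failure, here is a minimal sketch. It assumes the manifest checksums are MD5 digests (the thread does not state the algorithm), and the label contents and the helper `md5_hex` are hypothetical:

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of a byte string (assumed checksum type, see lead-in)."""
    return hashlib.md5(data).hexdigest()


# Hypothetical label fragments: v1.1 differs from v1.0 by a corrected byte offset.
bundle_v1_0 = b'<offset unit="byte">100</offset>'
bundle_v1_1 = b'<offset unit="byte">101</offset>'

manifest_checksum = md5_hex(bundle_v1_0)  # recorded when the v1.0 SIP was generated
served_checksum = md5_hex(bundle_v1_1)    # what the unversioned URL serves after the update

print(manifest_checksum == served_checksum)  # False
```

Any byte-level change to a file yields a different digest, so a manifest generated for v1.0 cannot match files now served as v1.1.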

Shall I try again using the updated URL for v1.0 (https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023_v1.0/)? Then separately run pds-deep-registry-archive and the steps outlined in the PDS Delivery Checklist on v1.1 of the bundle? Is the process the same when registering another version of the bundle to the deep archive, or is there a different process to register an updated bundle?

@c-suh
Contributor

c-suh commented Feb 21, 2024

Hi @mace-space! Please hold off on re-running the deep-registry-archive tool until a new, non-dev version is released (e.g., higher than v1.1.4).

I believe the process is the same when registering another version of the bundle. When creating this new bundle, however, be sure to increment the version in the VID wherever applicable.

To make sure I'm understanding correctly, would you confirm or correct the following bullet points? Thank you!

@mace-space
Author

mace-space commented Feb 22, 2024

Thanks, @c-suh! I will hold off re-running pds-deep-registry-archive until there's a new non-dev version, and will make sure I increment the version in the VID when it comes to registering v1.1

I'll respond to your points above in bold inline here:

  • there exist v1.0 and v1.1 packages for cassini_uvis_solarocc_beckerjarmak2023
    Yes, the cassini_uvis_solarocc_beckerjarmak2023 bundle has v1.0 and v1.1
  • there is no difference in the v1.0 and v1.1 packages other than the VID
    No, there are differences between the v1.0 and v1.1 bundles. The v1.1 bundle corrects a byte count error in table offsets for the labels in the data collection; consequently, collection_data.csv, collection_data.xml, and bundle.xml have also been updated.
  • v1.0 has been submitted to us and to the NSSDCA
    Yes, v1.0 has been submitted to EN (via harvest and registry-manager) and to the NSSDCA (via pds-deep-registry-archive). Is that what you meant, or were you referring to another submission process?
  • v1.1 has not been submitted to either us at EN or to the NSSDCA
    No, v1.1 has been submitted to EN (via harvest and registry-manager) but has not yet been submitted to the NSSDCA
  • the URL https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023/ points to v1.1
    Yes
  • the URL https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023_v1.0/ points to v1.0
    Yes

Thanks again for all your help

@jordanpadams
Member

@mace-space as long as the latest versions with latest paths of each bundle are loaded into the next-gen registry, you should be able to just run pds-deep-registry-archive with each of their applicable LIDVIDs, and get the 2 accurate SIP packages:

$ pds-deep-registry-archive --site PDS_RNG urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.1

$ pds-deep-registry-archive --site PDS_RNG urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.0
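
If it helps to script the two runs, here is a minimal sketch that only builds and prints the command lines shown above. The helper `deep_archive_command` is hypothetical; the only flag used is `--site`, as in the commands above:

```python
import shlex


def deep_archive_command(site: str, lidvid: str) -> list[str]:
    """Build the argument list for one pds-deep-registry-archive run."""
    return ["pds-deep-registry-archive", "--site", site, lidvid]


bundle = "urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023"
for vid in ("1.1", "1.0"):
    cmd = deep_archive_command("PDS_RNG", f"{bundle}::{vid}")
    # Printed for review; nothing is executed here.
    print(shlex.join(cmd))
```

To actually execute a run, each list could be passed to `subprocess.run(cmd, check=True)` on a machine where the tool is installed.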

@jordanpadams
Member

@mace-space also, you should be able to upgrade your Deep Archive software and continue delivering SIP packages. Let us know if you run into any additional issues.

@mace-space
Author

mace-space commented Feb 28, 2024

Thanks, here's the delivery for both v1.0 and v1.1 of urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023 :

NOTE: There were invalid urls in cassini_uvis_solarocc_beckerjarmak2023_v1.0_20240228_sip_v1.0.tab (https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023//), which I corrected to https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023_v1.0//
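
The correction described in this note amounts to a prefix substitution. A minimal sketch follows; the row layout is hypothetical (the real column order of the SIP .tab file is not shown in this thread) and `fix_row` is an illustrative helper, not part of any PDS tool:

```python
OLD_BASE = "https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023//"
NEW_BASE = "https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023_v1.0//"


def fix_row(row: str) -> str:
    """Point one manifest row at the versioned directory; other text is left untouched."""
    return row.replace(OLD_BASE, NEW_BASE)


# Hypothetical tab-delimited row with a placeholder checksum field.
row = (
    "<checksum>\t"
    + OLD_BASE
    + "bundle.xml\turn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.0"
)
print(fix_row(row))
```

One caveat, offered as a possibility rather than a diagnosis: editing a manifest by hand changes its bytes, so any checksum recorded for the manifest elsewhere would no longer match it.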

NOTE: As described previously, v1.0 fails validation because of table offset byte count errors that were only flagged by v3.4.1 of Validate (passed validation using older version of Validate). Would you still want v1.0 included, despite it failing validation with v3.4.1?

Let me know if you have any questions or concerns

@c-suh
Contributor

c-suh commented Mar 7, 2024

@jordanpadams and @smclaughlin7, passing Mia's question to you:

NOTE: As described previously, v1.0 fails validation because of table offset byte count errors that were only flagged by v3.4.1 of Validate (passed validation using older version of Validate). Would you still want v1.0 included, despite it failing validation with v3.4.1?

The validation report in case it might be helpful.


In the meantime, @mace-space, the v1.1 set has been posted for NSSDCA processing! From tomorrow, you can check the status at https://nssdc.gsfc.nasa.gov/psi/ReportPDS4.jsp using the SIP LID below:

SIP LID:

  • urn:nasa:pds:system_bundle:product_sip_deep_archive:cassini_uvis_solarocc_beckerjarmak2023_v1.1_20240228

@jordanpadams
Member

NOTE: As described previously, v1.0 fails validation because of table offset byte count errors that were only flagged by v3.4.1 of Validate (passed validation using older version of Validate). Would you still want v1.0 included, despite it failing validation with v3.4.1?

@mace-space I would say yes. If the data went online, it should go to the NSSDCA to ensure provenance of the data in the archive, even if it had some issues.

@c-suh
Contributor

c-suh commented Mar 8, 2024

Note: since posting of v1.0 is to ensure provenance of the data, I am ignoring both errors found in the node's validation report (error.table.missing_LF) and in the EN's validation report (error.label.filesize_mismatch).


@mace-space the v1.0 set has also been posted for NSSDCA processing! From tomorrow, you can check the status at https://nssdc.gsfc.nasa.gov/psi/ReportPDS4.jsp using the SIP LID below:

SIP LID:

  • urn:nasa:pds:system_bundle:product_sip_deep_archive:cassini_uvis_solarocc_beckerjarmak2023_v1.0_20240228

@mace-space
Author

mace-space commented Mar 13, 2024

Thanks! v1.1 is in Pre-Ingest stage (some remarks about Context_Area, context products but seems to be progressing OK).

However, v1.0 is still reporting an error:

SIP LIDVID: urn:nasa:pds:system_bundle:product_sip_deep_archive:cassini_uvis_solarocc_beckerjarmak2023_v1.0_20240228::1.0

Node: PDS_RNG

Received: 2024-03-09
Failed: 2024-03-09

Remarks: Manifest checksum calculated does not match manifest checksum in SIP.

I think I need to do a similar thing as for #490's Vgr2 NSSDCA submission and re-load the data into the registry with the correct URL (https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023_v1.0 for v1.0 of this bundle), and then re-run the deep archive software?

When I run:

curl -u username 'https://search-rms-prod-etcetcetc.us-west-2.es.amazonaws.com/registry/_search?q={_id:"urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.0"}' | json_pp

it lists ops:Label_File_Info/ops:file_ref and ops:Data_File_Info/ops:file_ref with the v1.1 URL ( https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023) instead of the v1.0 URL ( https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023_v1.0)
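
The same check can be sketched in Python against one parsed registry document. The sample document below is hypothetical and covers only the two fields named above; the helpers `file_refs` and `stale_refs` are illustrative, and the sketch accepts either a single string or a list of strings since the stored shape is not shown here:

```python
FIELDS = ("ops:Label_File_Info/ops:file_ref", "ops:Data_File_Info/ops:file_ref")


def file_refs(source: dict) -> list[str]:
    """Collect every file_ref value from one registry document."""
    refs: list[str] = []
    for key in FIELDS:
        value = source.get(key, [])
        refs.extend([value] if isinstance(value, str) else value)
    return refs


def stale_refs(source: dict, expected_base: str) -> list[str]:
    """Return the file_ref values that do not live under the expected base URL."""
    return [ref for ref in file_refs(source) if not ref.startswith(expected_base)]


V1_0_BASE = "https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023_v1.0/"

# Hypothetical excerpt of the document registered for ...::1.0
doc = {
    "ops:Label_File_Info/ops:file_ref": [
        "https://pds-rings.seti.org/pds4/bundles/cassini_uvis_solarocc_beckerjarmak2023/bundle.xml"
    ],
}
print(stale_refs(doc, V1_0_BASE))
```

A non-empty result means the registry still points the v1.0 product at the unversioned (now v1.1) location.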

@c-suh
Contributor

c-suh commented Mar 13, 2024

@mace-space correct, as you so neatly recapped above and did for Vgr2 in #490. Thank you!

@mace-space
Author

Please find v1.0 with corrected URL:
cassini_uvis_solarocc_beckerjarmak2023_v1.0_NSSDCA_20240313.tar.gz

@matthewtiscareno

@jordanpadams and @c-suh: I wonder if there might be a larger issue here.

Whenever we archive what is the current version at the time, the URL includes the bundle name with no version number appended (e.g., pds4/bundles/cooldata). However, when that version is superseded, the new current version takes that same URL, while the previous version now has the same URL with its version number appended (e.g., pds4/bundles/cooldata_v1.0). This reflects how we have always managed versioning under PDS3.

Will this always require that we re-ingest any bundle at the time that it is superseded? If so, should we change our practice, so that this isn't required? Or could EN tools change so that this is no longer required? Do other nodes do things differently?

One solution might be that pds4/bundles/cooldata_v1.0 already exists even when it is the current version, and either that or pds4/bundles/cooldata is an alias pointing to the other. Please let us know what you think.
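
To make the two path conventions concrete, here is a minimal sketch using the `cooldata` example. Both function names are hypothetical, and the second function reflects the "versioned path always exists" idea only in outline:

```python
BASE = "pds4/bundles"


def superseding_layout(name: str, vid: str, current_vid: str) -> str:
    """Current practice: the current version has no suffix, superseded versions get one."""
    if vid == current_vid:
        return f"{BASE}/{name}"
    return f"{BASE}/{name}_v{vid}"


def versioned_layout(name: str, vid: str) -> str:
    """Alternative: every version keeps a suffixed path for its whole lifetime."""
    return f"{BASE}/{name}_v{vid}"


# Under the current practice, the path of v1.0 changes once v1.1 is released.
print(superseding_layout("cooldata", "1.0", current_vid="1.0"))  # pds4/bundles/cooldata
print(superseding_layout("cooldata", "1.0", current_vid="1.1"))  # pds4/bundles/cooldata_v1.0
# Under the alternative, it never changes.
print(versioned_layout("cooldata", "1.0"))                       # pds4/bundles/cooldata_v1.0
```

The first function depends on which version is current, which is why a path recorded at ingest time can go stale; the second depends only on the version itself.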

@jordanpadams
Member

jordanpadams commented Mar 18, 2024

@matthewtiscareno a few other nodes encounter this issue as well, and there is a new requirement for the registry to provide some sort of utility to allow a node to update the data path to a file, versus requiring a reload of the products to get the correct paths. NASA-PDS/registry#266. No matter what, it will require some sort of operational intervention to know the file paths have changed, and update the paths in the registry.

From an efficiency perspective, it would be much easier to put the data online as pds4/bundles/cooldata_v1.0 and pds4/bundles/cooldata_v2.0 from the start, and then simply load the data as new versions come online. This would require no manual intervention or movement of files, and would decrease overhead over time. That being said, we understand that some nodes prefer "clean" archive directories that include only the latest versions of data products, so we will need to implement some sort of utility. We also hope to avoid the need for this down the road by providing a web app that uses the registry to drive "directory views" of pages, so we can hide those old versions from users unless they want to see them.

Happy to talk more about this or we can discuss at the SWG on Wednesday.

@c-suh
Contributor

c-suh commented Mar 22, 2024

@mace-space the corrected package from your comment has a validate error. Please review this report for details. Thank you!

@mace-space
Author

mace-space commented Mar 22, 2024

Thanks @c-suh. Sorry to have missed this. It appears that the validate error may be due to extra slashes in the filepaths (field 2) from record 3 onwards and this is causing validate to interpret it as a null field. Do you know what might be causing the additional slash?

 urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.0    /bundle.xml
 urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.0    /readme.txt
 urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023:data::2.0 //collection_data.csv
 urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023:data::2.0 //collection_data.xml
 urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023:data:uvis_euv_2006_257_solar_time_series_ingress::1.1 //uvis_euv_2006_257_solar_time_series_ingress.xml
  ...
  

(I had to delete a lot of whitespace between field 1 and 2 to get it to display here)

I ran the pds-deep-registry-archive tool in the same manner as for other bundles previously submitted to NSSDCA, but I'm wondering if I somehow introduced this error? I'm using v1.1.5 of pds-deep-registry-archive

It also appears that the VIDs are wrong – 2.0 and 1.1, instead of 1.0
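
A quick way to scan a manifest for both symptoms is sketched below. It assumes each record is two whitespace-separated fields (LIDVID, then path) with no spaces inside the path, as in the excerpt above; `parse_row` and `vid_of` are hypothetical helpers:

```python
def parse_row(row: str) -> tuple[str, str]:
    """Split one record into (lidvid, path); assumes exactly two whitespace-separated fields."""
    lidvid, path = row.split()
    return lidvid, path


def vid_of(lidvid: str) -> str:
    """Return the version part after the '::' separator."""
    return lidvid.rsplit("::", 1)[1]


rows = [
    "urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.0    /bundle.xml",
    "urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023:data::2.0 //collection_data.csv",
    "urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023:data:uvis_euv_2006_257_solar_time_series_ingress::1.1"
    " //uvis_euv_2006_257_solar_time_series_ingress.xml",
]

for row in rows:
    lidvid, path = parse_row(row)
    # Report the VID and whether the path begins with a doubled slash.
    print(vid_of(lidvid), path.startswith("//"), path)
```

On the three sample records this reports VIDs 1.0, 2.0, and 1.1, with the doubled slash on the second and third.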

@jordanpadams
Member

@mace-space apologies here. This is another bug in our software. We are investigating and will get back to you here.

@jordanpadams
Member

$ pds-deep-registry-archive -s PDS_RNG urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.1

@mace-space
Author

mace-space commented Mar 26, 2024

% pds-deep-registry-archive --site PDS_RNG urn:nasa:pds:cassini_uvis_solarocc_beckerjarmak2023::1.0
Thanks for looking into this

@jordanpadams
Member

Blocked by NASA-PDS/deep-archive#164, which is blocked by NASA-PDS/registry#185

@matthewtiscareno

Blocked by NASA-PDS/deep-archive#164, which is blocked by NASA-PDS/registry#185

@jordanpadams: Does this mean that we should simply stand by until EN resolves these issues?

@jordanpadams
Member

@matthewtiscareno unfortunately yes. The fix is in work, but until we have a working API up and running, end users can’t really run pds-deep-registry-archive

@jordanpadams
Member

@c-suh here is an updated SIP package: cassini_package.zip

@c-suh
Contributor

c-suh commented Sep 10, 2024

@mace-space the 2 sets provided by Jordan have been posted for NSSDCA processing! From tomorrow, you can check the status at https://nssdc.gsfc.nasa.gov/psi/ReportPDS4.jsp using the SIP LIDs below:

SIP LIDs:
urn:nasa:pds:system_bundle:product_sip_deep_archive:cassini_uvis_solarocc_beckerjarmak2023_v1.0_20240903
urn:nasa:pds:system_bundle:product_sip_deep_archive:cassini_uvis_solarocc_beckerjarmak2023_v1.1_20240903

Status: Release Backlog

5 participants