Feature: use the xmlvalidate activity and SIP XSD files to validate XML metadata #39

Closed
jraddaoui opened this issue Aug 7, 2024 · 1 comment · Fixed by #64

Comments

@jraddaoui
Contributor

Is your feature request related to a problem? Please describe.

We are still using a Python script and a fixed local version of the XSD files to validate the metadata.xml or UpdatedAreldaMetadata.xml files.

Describe the solution you'd like

@mcantelon recently created an activity in the temporal-activities repository to perform that validation:

https://github.com/artefactual-sdps/temporal-activities/blob/main/xml/xsd_validate_activity.go

We also need to change the workflow to use the XSD files contained in each SIP instead of the fixed ones copied into the environment.

@jraddaoui jraddaoui added this to Enduro Aug 7, 2024
@jraddaoui jraddaoui moved this to 👍 Ready in Enduro Aug 7, 2024
@mcantelon mcantelon self-assigned this Aug 21, 2024
@djjuhasz djjuhasz self-assigned this Oct 10, 2024
@djjuhasz djjuhasz moved this from 👍 Ready to ⏳ In Progress in Enduro Oct 10, 2024
djjuhasz added a commit that referenced this issue Oct 16, 2024
- Install libxml2-utils in the preprocessing-worker Docker image to
  provide xmllint, which is required by
  https://github.com/artefactual-sdps/temporal-activities/tree/main/xmlvalidate
- Update Python to version 3.13
- Download and build the latest development version of bagit-python to
  get the fixes made since the v1.8.1 release, and for compatibility
  with Python 3.13
- Update the Dockerfile syntax version to the latest version of 1.x
djjuhasz added a commit that referenced this issue Oct 16, 2024
- Install libxml2-utils in the preprocessing-worker Docker image to
  provide xmllint, which is required by
  https://github.com/artefactual-sdps/temporal-activities/tree/main/xmlvalidate
- Update Python to version 3.13
- Download and build the latest development version of bagit-python to
  get the fixes made since the v1.8.1 release, and for compatibility
  with Python 3.13
- Update the Dockerfile syntax version to the latest version of 1.x
- Add stdout & stderr output to error message when running the Python
  metadata validation script (`xsdval.py`) to aid debugging
djjuhasz added a commit that referenced this issue Oct 18, 2024
Fixes #39

Switch from the internal "validate metadata" activity, which calls a
Python script (xsdval.py), to the temporal-activities/xmlvalidate
module. xmlvalidate calls the xmllint C program to validate the SIP
manifest file against the XSD schema files included in the SIP.

- Import github.com/artefactual-sdps/temporal-activities/xmlvalidate
- Switch to xmlvalidate with the xmllint validator for validating the
  SIP metadata file
- Remove the internal "validate metadata" activity
- Remove the sampledata directory containing the `xsdval.py` script,
  Arelda XSD files, and sample SIP
@djjuhasz djjuhasz linked a pull request Oct 18, 2024 that will close this issue
djjuhasz added a commit that referenced this issue Oct 18, 2024
Fixes #39

Switch from the internal `ValidateMetadata` activity, which calls a
Python script (xsdval.py), to the `temporal-activities/xmlvalidate`
activity. xmlvalidate calls the xmllint C program to validate the SIP
manifest file against the XSD schema files included in the SIP.

- Import github.com/artefactual-sdps/temporal-activities/xmlvalidate
- Switch to xmlvalidate with the xmllint validator for validating the
  SIP metadata file
- Remove the internal "validate metadata" activity
- Remove the sampledata directory containing the `xsdval.py` script,
  Arelda XSD files, and sample SIP
@djjuhasz djjuhasz changed the title from "Feature: use xsd-validate-activity and SIP XSD files to validate XML metadata" to "Feature: use the xmlvalidate activity and SIP XSD files to validate XML metadata" Oct 18, 2024
djjuhasz added a commit that referenced this issue Oct 22, 2024
- Install libxml2-utils in the preprocessing-worker Docker image to
  provide xmllint, which is required by
  https://github.com/artefactual-sdps/temporal-activities/tree/main/xmlvalidate
- Update Python to version 3.13
- Download and build the latest development version of bagit-python to
  get the fixes made since the v1.8.1 release, and for compatibility
  with Python 3.13
- Update the Dockerfile syntax version to the latest version of 1.x
- Add stdout & stderr output to error message when running the Python
  metadata validation script (`xsdval.py`) to aid debugging
djjuhasz added a commit that referenced this issue Oct 22, 2024
Fixes #39

Switch from the internal `ValidateMetadata` activity, which calls a
Python script (xsdval.py), to the `temporal-activities/xmlvalidate`
activity. xmlvalidate calls the xmllint C program to validate the SIP
manifest file against the XSD schema files included in the SIP.

- Import github.com/artefactual-sdps/temporal-activities/xmlvalidate
- Switch to xmlvalidate with the xmllint validator for validating the
  SIP metadata file
- Remove the internal "validate metadata" activity
- Remove the sampledata directory containing the `xsdval.py` script,
  Arelda XSD files, and sample SIP
djjuhasz added a commit that referenced this issue Oct 22, 2024
Fixes #39

Switch from the internal `ValidateMetadata` activity, which calls a
Python script (xsdval.py), to the `temporal-activities/xmlvalidate`
activity. xmlvalidate calls the xmllint C program to validate the SIP
manifest file against the XSD schema files included in the SIP.

- Import github.com/artefactual-sdps/temporal-activities/xmlvalidate
- Switch to xmlvalidate with the xmllint validator for validating the
  SIP metadata file
- Remove the internal "validate metadata" activity
- Remove the sampledata directory containing the `xsdval.py` script,
  Arelda XSD files, and sample SIP
- Remove Python, python-bagit and lxml from the Docker image
@github-project-automation github-project-automation bot moved this from ⏳ In Progress to 🎉 Done in Enduro Oct 22, 2024
@sallain
Contributor

sallain commented Oct 30, 2024

This looks like it's working - packages are failing and succeeding on metadata validations appropriately.
