Use xmllint to validate SIP manifests
Fixes #39

Switch from the internal `ValidateMetadata` activity, which calls a
Python script (`xsdval.py`), to the `temporal-activities/xmlvalidate`
activity. xmlvalidate calls the xmllint C program to validate the SIP
manifest file against the XSD schema files included in the SIP.

- Import github.com/artefactual-sdps/temporal-activities/xmlvalidate
- Switch to xmlvalidate with the xmllint validator for validating the
  SIP metadata file
- Remove the internal "validate metadata" activity
- Remove the sampledata directory containing the `xsdval.py` script,
  Arelda XSD files, and sample SIP
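
For readers unfamiliar with xmllint, the following is a minimal sketch of what validating an XML file against an XSD by shelling out to xmllint can look like in Go. It is not the `xmlvalidate` package's implementation: the helper names `xmllintArgs` and `validate` are hypothetical and exist only for illustration. The only assumptions about xmllint itself are its `--noout` and `--schema` options and that the binary is on `PATH`.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/exec"
)

// xmllintArgs (hypothetical helper) builds the xmllint argument list for
// validating xmlPath against the XSD at xsdPath. --noout suppresses the
// echo of the parsed document; --schema selects XML Schema validation.
func xmllintArgs(xmlPath, xsdPath string) []string {
	return []string{"--noout", "--schema", xsdPath, xmlPath}
}

// validate (hypothetical helper) runs xmllint and returns everything it
// wrote to stdout and stderr. A non-nil error means xmllint could not be
// started or exited with a non-zero status, for example because the
// document did not validate.
func validate(ctx context.Context, xmlPath, xsdPath string) (string, error) {
	cmd := exec.CommandContext(ctx, "xmllint", xmllintArgs(xmlPath, xsdPath)...)
	out, err := cmd.CombinedOutput()
	return string(out), err
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: validate <file.xml> <schema.xsd>")
		return
	}
	out, err := validate(context.Background(), os.Args[1], os.Args[2])
	fmt.Print(out)
	if err != nil {
		fmt.Fprintln(os.Stderr, "validation failed:", err)
		os.Exit(1)
	}
}
```

Keeping the argument construction in its own function makes it possible to unit test the command line without xmllint installed.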
djjuhasz committed Oct 18, 2024
1 parent 3b740a9 commit b2c2e6d
Showing 23 changed files with 36 additions and 2,111 deletions.
1 change: 0 additions & 1 deletion Dockerfile
@@ -50,7 +50,6 @@ RUN addgroup -g ${GROUP_ID} -S preprocessing
 RUN adduser -u ${USER_ID} -S -D preprocessing preprocessing

 USER preprocessing
-COPY --from=build-preprocessing-worker --link /src/hack/sampledata/xsd/* /
 COPY --from=build-preprocessing-worker --link /out/preprocessing-worker /home/preprocessing/bin/preprocessing-worker
 RUN mkdir /home/preprocessing/shared

5 changes: 3 additions & 2 deletions cmd/worker/workercmd/cmd.go
@@ -5,6 +5,7 @@ import (
 	"crypto/rand"

 	"github.com/artefactual-sdps/temporal-activities/bagcreate"
+	"github.com/artefactual-sdps/temporal-activities/xmlvalidate"
 	"github.com/go-logr/logr"
 	"go.artefactual.dev/tools/temporal"
 	temporalsdk_activity "go.temporal.io/sdk/activity"
@@ -89,8 +90,8 @@ func (m *Main) Run(ctx context.Context) error {
 		temporalsdk_activity.RegisterOptions{Name: activities.AddPREMISAgentName},
 	)
 	w.RegisterActivityWithOptions(
-		activities.NewValidateMetadata().Execute,
-		temporalsdk_activity.RegisterOptions{Name: activities.ValidateMetadataName},
+		xmlvalidate.New(xmlvalidate.NewXMLLintValidator()).Execute,
+		temporalsdk_activity.RegisterOptions{Name: xmlvalidate.Name},
 	)
 	w.RegisterActivityWithOptions(
 		activities.NewTransformSIP().Execute,
2 changes: 1 addition & 1 deletion go.mod
@@ -3,7 +3,7 @@ module github.com/artefactual-sdps/preprocessing-sfa
 go 1.23.2

 require (
-	github.com/artefactual-sdps/temporal-activities v0.0.0-20240821162351-47302711bc7b
+	github.com/artefactual-sdps/temporal-activities v0.0.0-20241018212855-8ea34d29bdf4
 	github.com/beevik/etree v1.4.0
 	github.com/deckarep/golang-set/v2 v2.6.0
 	github.com/go-logr/logr v1.4.2
6 changes: 6 additions & 0 deletions go.sum
@@ -15,6 +15,12 @@ cloud.google.com/go/storage v1.43.0/go.mod h1:ajvxEa7WmZS1PxvKRq4bq0tFT3vMd502Jw
 github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU=
 github.com/artefactual-sdps/temporal-activities v0.0.0-20240821162351-47302711bc7b h1:kTOc2pbkdII6/Z84Bus1q52z5KAOaT8vLpfRoOs1l1I=
 github.com/artefactual-sdps/temporal-activities v0.0.0-20240821162351-47302711bc7b/go.mod h1:FVh79rCGNlUU1QnioAU+lrSjLqrA1PJFYKIhWPsmyug=
+github.com/artefactual-sdps/temporal-activities v0.0.0-20241017225716-a7e211dd7177 h1:8azOOy+6CjnqXeN7Jwz+xltCN3ES9XL2tG6jKES+074=
+github.com/artefactual-sdps/temporal-activities v0.0.0-20241017225716-a7e211dd7177/go.mod h1:JgeLmORdJuOa2G9wFXUfMIfkFL89DFjZWgpn3gfkXPA=
+github.com/artefactual-sdps/temporal-activities v0.0.0-20241018212855-8ea34d29bdf4 h1:WF95IOkZRVSCST/26SAqPYsUrtUuJpavBht6lvdeKl0=
+github.com/artefactual-sdps/temporal-activities v0.0.0-20241018212855-8ea34d29bdf4/go.mod h1:FVh79rCGNlUU1QnioAU+lrSjLqrA1PJFYKIhWPsmyug=
+github.com/artefactual-sdps/temporal-activities v0.0.0-20241018224155-1c20c3329100 h1:a0JiwD53z5ytnCpKK6E8MRaTOM6k1o1Evow+l5z9HVw=
+github.com/artefactual-sdps/temporal-activities v0.0.0-20241018224155-1c20c3329100/go.mod h1:FVh79rCGNlUU1QnioAU+lrSjLqrA1PJFYKIhWPsmyug=
 github.com/aws/aws-sdk-go v1.55.5 h1:KKUZBfBoyqy5d3swXyiC7Q76ic40rYcbqH7qjh59kzU=
 github.com/aws/aws-sdk-go v1.55.5/go.mod h1:eRwEWoyTWFMVYVQzKMNHWP5/RV4xIUGMQfXQHfHkpNU=
 github.com/aws/aws-sdk-go-v2 v1.30.3 h1:jUeBtG0Ih+ZIFH0F4UkmL9w3cSpaMv9tYYDbzILP8dY=
Binary file removed hack/sampledata/SIP_20111020_BFB_v60.zip
Binary file not shown.
78 changes: 0 additions & 78 deletions hack/sampledata/xsd/ablieferung.xsd

This file was deleted.

14 changes: 0 additions & 14 deletions hack/sampledata/xsd/archivischeNotiz.xsd

This file was deleted.

34 changes: 0 additions & 34 deletions hack/sampledata/xsd/archivischerVorgang.xsd

This file was deleted.
