Feature: combine METS and metadata files for delivery to AIS #77
Comments
@sallain I originally planned to merge the SFA Arelda metadata into the METS XML as a proper XML document with one root node and proper namespacing. I see now that SFA would like the Arelda metadata first in the document, and I've also realized that embedding the Arelda XML inside the METS XML would be quite a bit of work. So, for now, I've settled on simply concatenating the two XML files, with the Arelda first and the METS second. It's a work in progress (it still needs testing), but I think the concatenation code should work now: https://github.com/artefactual-sdps/preprocessing-sfa/tree/dev/issue-77-combine-ais-metadata
Fixes #77. Concatenate the Arelda metadata file from the original package and the METS file created by Archivematica into a single "AIS" metadata file.
Attached is a zipped AIS package created by Enduro with the combined AIS metadata file. Note that the current name of the AIS metadata file is "AIS_1974_47_3578513" with no file extension. From the description above I think that's what the filename should be, but let me know if I should add an extension (e.g. ".xml").
Refs #77.
- Remove extraneous `filepath.Join()` calls
- Improve commentary a bit
- Correct "search_md" zip name in workflow tests
Results as expected!
Is your feature request related to a problem? Please describe.
DPS must deliver both the METS file and the metadata.xml/UpdatedAreldaMetadata.xml file to the AIS during the post-preservation workflow. However, AIS only expects one file.
Describe the solution you'd like
Combine the METS and the metadata.xml/UpdatedAreldaMetadata.xml files together into one metadata file. For migration files (files identified as DigitizedAIP or BornDigitalAIP), UpdatedAreldaMetadata.xml should be used.
The newly created file should be named with the prefix `AIS_` followed by the accession number, which can be found in the metadata.xml (or UpdatedAreldaMetadata.xml, but should be the same value) under `<ablieferungsnummer>`. There should only be one ablieferungsnummer per metadata file. The number is formatted as `2002/05`; the `/` should be replaced with an `_`. The final file name will be `AIS_2002_05`.
Within the file, SFA would like the contents of metadata.xml/UpdatedAreldaMetadata.xml first, since it contains the higher hierarchies, and then the METS. The contents of the two files should probably be tagged in some way but I think it can be pretty simple - perhaps just indicating the source file.
Describe alternatives you've considered
None
Additional context
There's a very real chance that, when operating at scale, the resulting file will be too big for AIS to handle; it might make sense to then limit which fields from each file we're combining into this new file. But we'll tackle that if/when it happens.