Add an activity to merge AIS metadata files #80

Merged
djjuhasz merged 1 commit into main from dev/issue-77-combine-ais-metadata
Nov 14, 2024

Conversation

djjuhasz
Contributor

Fixes #77.

Concatenate the Arelda metadata file from the original package and the METS file created by Archivematica into a single "AIS" metadata file.

@djjuhasz
Contributor Author

@jraddaoui I haven't been able to test this locally because I don't know how to configure the AIS worker, but I'm opening the PR anyway to get an initial code review.

@djjuhasz djjuhasz force-pushed the dev/issue-77-combine-ais-metadata branch from 7d1721b to 17738ab Compare November 13, 2024 22:43

codecov bot commented Nov 13, 2024

Codecov Report

Attention: Patch coverage is 62.62626% with 37 lines in your changes missing coverage. Please review.

Project coverage is 61.56%. Comparing base (fa37e42) to head (d74f10a).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/ais/combinemd.go | 62.02% | 20 Missing and 10 partials ⚠️ |
| internal/ais/workflow.go | 65.00% | 6 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #80      +/-   ##
==========================================
+ Coverage   54.48%   61.56%   +7.08%     
==========================================
  Files          30       31       +1     
  Lines        1986     2084      +98     
==========================================
+ Hits         1082     1283     +201     
+ Misses        832      705     -127     
- Partials       72       96      +24     

@djjuhasz djjuhasz force-pushed the dev/issue-77-combine-ais-metadata branch from 17738ab to 12acc86 Compare November 13, 2024 22:47
Contributor

@jraddaoui jraddaoui left a comment

Looking good @djjuhasz!

internal/ais/combinemd.go
    CombineMDActivityParams struct {
        AreldaRelPath string
        METSRelPath   string
        WorkingDir    string
Contributor

Maybe call it localDir, like in the workflow; workingDir (from the config) would be the parent directory.

Contributor

Feel free to ignore.

Contributor Author

I found the naming conventions here pretty confusing - it wasn't clear to me from the names what the difference was between the workingDir and localDir. :(

Contributor

Working dir is the config value, used by the worker for all workflows. Local dir is the specific one created for this workflow. Definitely not clear.

Contributor Author

@djjuhasz djjuhasz Nov 14, 2024

Hmm, looking at this again, the name of localDir is deterministic: it will be the same across workflow runs whenever the workingDir is the same (which is always true for a single worker) and the AIP UUID is the same. If two workflows were started with the same UUID, they could run into file-locking problems or race conditions.

It's probably not a problem right now because we are limiting the number of concurrent running workflows to one, but if we ever do increase the number of workflows the worker can simultaneously process it could be an issue.

Contributor

That's a great point and something worth looking at in detail, even if you deploy two workers using the same working directory.

When we start these workflows from Enduro we also use the AIP UUID for the workflow ID, but we don't set a WorkflowIDReusePolicy and I'm not sure what would happen if unspecified. This could be configurable in Enduro, but we still need to consider all options in the workflow.

Contributor

@jraddaoui jraddaoui left a comment

Really nice, thanks for that workflow test @djjuhasz! 😍

I think it would be better to pass the absolute paths directly instead of splitting and then re-joining them in the activity. I'd create a variable in the workflow for the metadata file absolute path, and use that as the Destination in the fetch activity and as the AreldaPath in the combine activity. Other than that and removing the test data file, it looks great to me.

Fixes #77.

Concatenate the Arelda metadata file from the original package and the
METS file created by Archivematica into a single "AIS" metadata file.

[skip codecov]
@djjuhasz djjuhasz force-pushed the dev/issue-77-combine-ais-metadata branch from d74f10a to 03995be Compare November 14, 2024 23:46
Contributor

@jraddaoui jraddaoui left a comment

Thanks @djjuhasz!

@djjuhasz djjuhasz merged commit b8989cb into main Nov 14, 2024
7 checks passed
@djjuhasz djjuhasz deleted the dev/issue-77-combine-ais-metadata branch November 14, 2024 23:53

Successfully merging this pull request may close these issues.

Feature: combine METS and metadata files for delivery to AIS
2 participants