-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add an activity to merge AIS metadata files #80
Conversation
@jraddaoui I haven't been able to test this locally because I don't know how to configure the AIS worker, but I'm open the PR anyway to get an initial code review. |
7d1721b
to
17738ab
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #80 +/- ##
==========================================
+ Coverage 54.48% 61.56% +7.08%
==========================================
Files 30 31 +1
Lines 1986 2084 +98
==========================================
+ Hits 1082 1283 +201
+ Misses 832 705 -127
- Partials 72 96 +24 ☔ View full report in Codecov by Sentry. |
17738ab
to
12acc86
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good @djjuhasz!
internal/ais/combinemd.go
Outdated
CombineMDActivityParams struct { | ||
AreldaRelPath string | ||
METSRelPath string | ||
WorkingDir string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe call it localDir
like in the workflow, workingDir
(from the config) would be the parent directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to ignore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the naming conventions here pretty confusing - it wasn't clear to me from the names what the difference was between the workingDir and localDir. :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Working dir is the config value, used by the worker for all workflows. Local dir is the specific one created for this workflow. Definitely not clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, looking at this again the name of localDir
is deterministic and will be the same for multiple workflow runs if the workingDir is the same (which is always true for a single worker) and the AIP UUID is the same. If two workflows got started with the same UUID it could cause file lock or race conditions.
It's probably not a problem right now because we are limiting the number of concurrent running workflows to one, but if we ever do increase the number of workflows the worker can simultaneously process it could be an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great point and something worth looking at in detail, even if you deploy two workers using the same working directory.
When we start these workflows from Enduro we also use the AIP UUID for the workflow ID, but we don't set a WorkflowIDReusePolicy
and I'm not sure what would happen if unspecified. This could be configurable in Enduro, but we still need to consider all options in the workflow.
internal/testdata/little-Test-AIP-Digitization/AIS_1000_893_3251903
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice, thanks for that workflow test @djjuhasz! 😍
I think it would be better to pass the absolute paths directly instead of splitting and then re-joining them in the activity. I'd create a variable in the workflow for the metadata file absolute path, and use that as the Destination
in the fetch activity and as the AreldaPath
in the combine activity. Other than that and removing the test data file, it looks great to me.
Fixes #77. Concatenate the Arelda metadata file from the original package and the METS file created by Archivematica into a single "AIS" metadata file. [skip codecov]
d74f10a
to
03995be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @djjuhasz!
Fixes #77.
Concatenate the Arelda metadata file from the original package and the METS file created by Archivematica into a single "AIS" metadata file.