Finish the PluginPostCheckMigrateFedora3AuditLog plugin #33

Open
mjordan opened this issue Jan 30, 2019 · 1 comment


mjordan commented Jan 30, 2019

Migrating fixity check events for the datastreams in a Fedora 3 object could be handled by the following Riprap configuration. This approach separates persisting legacy events from persisting current events. It requires two separate configurations: a "primary" one that uses the PluginFetchResourceListFromDrupal plugin, and a "secondary" one that runs independently in a separate scheduled job. The presence of the secondary configuration does not affect the operation of the primary job.

PluginFetchResourceListFromDrupal in the primary configuration writes out a file containing resource IDs of AUDIT resources, but does not add these resources to the list of resources currently being checked. The secondary configuration registers the PluginFetchResourceListFromFile plugin to read from that file.
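The hand-off between the two configurations could look roughly like the sketch below. It is written in Python purely for illustration (Riprap itself is PHP), and the function names and file handling are hypothetical, not Riprap's API. The only detail taken from this issue is that AUDIT resource URIs end in `_AUDIT.xml`.

```python
def split_audit_resources(resource_uris):
    """Separate AUDIT resources from the resources checked by the primary job."""
    audit, regular = [], []
    for uri in resource_uris:
        # Per this issue, URIs of AUDIT binary resources end in _AUDIT.xml.
        (audit if uri.endswith("_AUDIT.xml") else regular).append(uri)
    return audit, regular


def write_resource_list(uris, path):
    """Primary job: write one AUDIT resource URI per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for uri in uris:
            handle.write(uri + "\n")


def read_resource_list(path):
    """Secondary job: read the URIs back, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]
```

Only the `regular` list would continue through the primary job's fixity checks; the `audit` list exists solely to feed the secondary job.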

The secondary configuration registers a persistplugin that does nothing (let's call it a "null plugin"); its getReferenceEvent() returns null and its persistEvent() returns true. (We persist the legacy events in a postcheckplugin.)

The secondary configuration also registers a fetchdigestplugin that does nothing (another "null plugin"); its execute() function returns a placeholder string. Returning false would make the CheckFixity command continue to the next fetchdigestplugin, which we don't want.

$this->checkFixity() will return true since the reference event is null.
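A rough sketch of the control flow described above, in Python for illustration only. Riprap is a PHP application, and every class, method, and field name here is a hypothetical stand-in, not Riprap's actual interface. The behaviour of `check_fixity` with a null reference event is taken from the description in this issue.

```python
from typing import Callable, Iterable, Optional


class NullPersistPlugin:
    """Hypothetical stand-in for the 'null' persist plugin."""

    def get_reference_event(self, resource_id: str) -> Optional[dict]:
        # No reference event is ever looked up.
        return None

    def persist_event(self, event: dict) -> bool:
        # Nothing is stored; returning True lets postcheck plugins run.
        return True


class NullFetchDigestPlugin:
    """Hypothetical stand-in for the 'null' fetchdigest plugin."""

    def execute(self, resource_id: str) -> str:
        # A placeholder string rather than False, so that no further
        # fetchdigest plugin is tried.
        return "placeholder-digest"


def check_fixity(reference_event: Optional[dict], current_digest: str) -> bool:
    # As described in this issue: a null reference event means the
    # check returns true.
    if reference_event is None:
        return True
    return reference_event["digest_value"] == current_digest


def run_secondary_job(resource_ids: Iterable[str],
                      postcheck: Callable[[str], None]) -> None:
    persist = NullPersistPlugin()
    fetch = NullFetchDigestPlugin()
    for resource_id in resource_ids:
        digest = fetch.execute(resource_id)
        reference = persist.get_reference_event(resource_id)
        event = {
            "resource_id": resource_id,
            "digest_value": digest,
            "outcome": check_fixity(reference, digest),
        }
        # The event itself is discarded; only the postcheck step matters.
        if persist.persist_event(event):
            postcheck(resource_id)
```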

At this point, the CheckFixity command has an $event, but we are not going to use it since it's not the event we want to persist. The null persist plugin's persistEvent() function returns true, so the registered PluginPostCheckMigrateFedora3AuditLog executes. This plugin:

  • fetches the AUDIT log listed in the resource list written by the PluginFetchResourceListFromDrupal plugin
  • parses out the legacy events (this code already exists)
  • constructs the URI for the Fedora 3 datastream identified in the legacy event
  • if using Fedora URIs, queries Gemini to get the equivalent Fedora URI (related issue: #21, "Get PluginFetchResourceListFromDrupal plugin to authenticate against Gemini")
  • persists the event using its own code (it does not execute a persistplugin), adding an Event Detail stating that the event was migrated from Fedora 3's AUDIT log. There will be no digest value.

Since we want to migrate legacy events only once, PluginPostCheckMigrateFedora3AuditLog needs to check whether the URIs are already present before continuing. Once all legacy events have been migrated, this secondary configuration can be removed.
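One way the migrate-once check might work, sketched in Python for illustration (not Riprap's PHP code). It assumes that "presence of the URIs" means "the event store already holds events for that resource URI". The dict-based store, the field names, and the Event Detail string are all placeholders; the standard Event Detail value has not been decided yet.

```python
# Placeholder only: the standard Event Detail value is still to be agreed on.
MIGRATED_DETAIL = "migrated from Fedora 3 AUDIT log"


def migrate_legacy_events(legacy_events, store):
    """Persist legacy events for resource URIs that have no events yet.

    `store` is a hypothetical event store mapping a resource URI to the
    list of events persisted for it. Returns the number of events added.
    """
    # Snapshot taken before this run, so that several legacy events for
    # the same URI can all be migrated in one pass.
    already_present = set(store)
    persisted = 0
    for event in legacy_events:
        uri = event["resource_uri"]
        if uri in already_present:
            continue  # this URI was handled by an earlier run
        migrated = dict(event, digest_value="", event_detail=MIGRATED_DETAIL)
        store.setdefault(uri, []).append(migrated)
        persisted += 1
    return persisted
```

Running the function a second time with the same input adds nothing, which is the property the secondary job needs if it is scheduled repeatedly.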

Some tasks if we take this approach:

  • modify the PluginFetchResourceListFromDrupal so it writes out a file of AUDIT binary resources (the URIs for those resources end in _AUDIT.xml)
  • write the null plugins described above
  • modify existing persist plugins so that reference events must contain a non-empty digest value
  • come up with a standard value for the Event Detail indicating the event was migrated

mjordan commented Jan 31, 2019

Note that datastreams with multiple versions have incremented datastream version IDs:

<foxml:datastream ID="OBJ" FEDORA_URI="info:fedora/islandora:9/OBJ" STATE="A" CONTROL_GROUP="M" VERSIONABLE="true">
  <foxml:datastreamVersion ID="OBJ.0" LABEL="krappen verboten.jpg" CREATED="2019-01-31T03:22:48.053Z" MIMETYPE="image/jpeg" SIZE="397922">
    <foxml:contentDigest TYPE="SHA-1" DIGEST="02934905cf07f55f173d0d602c2e860c97b0dfc8"/>
    <foxml:contentLocation TYPE="INTERNAL_ID" REF="http://localhost:8080/fedora/get/islandora:9/OBJ/2019-01-31T03:22:48.053Z"/>
  </foxml:datastreamVersion>
  <foxml:datastreamVersion ID="OBJ.1" LABEL="krappen verboten.jpg" CREATED="2019-01-31T03:36:42.980Z" MIMETYPE="image/jpeg" SIZE="184791">
    <foxml:contentDigest TYPE="SHA-1" DIGEST="a1634d4c9cf5a0630bbc91880b4288968660b3e0"/>
    <foxml:contentLocation TYPE="INTERNAL_ID" REF="http://localhost:8080/fedora/get/islandora:9/OBJ/2019-01-31T03:36:42.980Z"/>
  </foxml:datastreamVersion>
</foxml:datastream>

These datastream version IDs (for example, OBJ.1) are used in AUDIT records, in the PREMIS:file value of the justification:

<audit:record ID="AUDREC15">
  <audit:process type="Fedora API-M"/>
  <audit:action>modifyObject</audit:action>
  <audit:componentID></audit:componentID>
  <audit:responsibility>admin</audit:responsibility>
  <audit:date>2019-01-31T03:41:44.152Z</audit:date>
  <audit:justification>PREMIS:file=islandora:9+OBJ+OBJ.1; PREMIS:eventType=fixity check; PREMIS:eventOutcome=SHA-1 checksum validated.</audit:justification>
</audit:record>
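For illustration, a record like the one above could be parsed as in the following Python sketch. This is not the existing Riprap parsing code, which is PHP. The function name and returned field names are hypothetical. The `audit` namespace URI is added to the sample so it parses standalone and is assumed to match the one declared in the real FOXML.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI; check it against the xmlns:audit declaration in real FOXML.
AUDIT_NS = {"audit": "info:fedora/fedora-system:def/audit#"}

SAMPLE_RECORD = """<audit:record xmlns:audit="info:fedora/fedora-system:def/audit#" ID="AUDREC15">
  <audit:process type="Fedora API-M"/>
  <audit:action>modifyObject</audit:action>
  <audit:componentID></audit:componentID>
  <audit:responsibility>admin</audit:responsibility>
  <audit:date>2019-01-31T03:41:44.152Z</audit:date>
  <audit:justification>PREMIS:file=islandora:9+OBJ+OBJ.1; PREMIS:eventType=fixity check; PREMIS:eventOutcome=SHA-1 checksum validated.</audit:justification>
</audit:record>"""


def parse_fixity_record(record_xml):
    """Return a dict describing a fixity check record, or None otherwise."""
    record = ET.fromstring(record_xml)
    justification = record.findtext(
        "audit:justification", default="", namespaces=AUDIT_NS)

    # Split "PREMIS:key=value; PREMIS:key=value; ..." into a dict.
    fields = {}
    for part in justification.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep and key.startswith("PREMIS:"):
            fields[key[len("PREMIS:"):]] = value.strip()

    if fields.get("eventType") != "fixity check":
        return None
    # PREMIS:file has the form PID+DSID+datastream version ID.
    parts = fields.get("file", "").split("+")
    if len(parts) != 3:
        return None
    pid, dsid, version_id = parts
    return {
        "pid": pid,
        "dsid": dsid,
        "version_id": version_id,
        "date": record.findtext("audit:date", default="", namespaces=AUDIT_NS),
        "outcome": fields.get("eventOutcome", ""),
    }
```

Records whose justification is not a PREMIS fixity check (ordinary modifyObject entries, for instance) return `None` and can be skipped.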

Assuming that migrations from 7.x only migrate the latest version of datastreams, this plugin should only be concerned with AUDIT records that apply to the most recent datastream version.
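A sketch of how the most recent version of each datastream could be identified, in Python for illustration only. It assumes version IDs always follow the `DSID.N` pattern with an incrementing integer, as in the excerpt above, and that the FOXML namespace URI is the one shown. The sample document is trimmed and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI; check it against the xmlns:foxml declaration in real FOXML.
FOXML_URI = "info:fedora/fedora-system:def/foxml#"

# Trimmed sample: only the attributes this sketch reads are kept.
SAMPLE_FOXML = """<foxml:digitalObject xmlns:foxml="info:fedora/fedora-system:def/foxml#" PID="islandora:9">
  <foxml:datastream ID="OBJ" STATE="A" CONTROL_GROUP="M" VERSIONABLE="true">
    <foxml:datastreamVersion ID="OBJ.0" CREATED="2019-01-31T03:22:48.053Z"/>
    <foxml:datastreamVersion ID="OBJ.1" CREATED="2019-01-31T03:36:42.980Z"/>
  </foxml:datastream>
</foxml:digitalObject>"""


def version_number(version):
    # "OBJ.1" -> 1; assumes the DSID.N pattern shown in this issue.
    return int(version.get("ID").rsplit(".", 1)[1])


def latest_version_ids(foxml_xml):
    """Map each datastream ID to the ID of its highest-numbered version."""
    root = ET.fromstring(foxml_xml)
    latest = {}
    for datastream in root.iter("{%s}datastream" % FOXML_URI):
        versions = datastream.findall("{%s}datastreamVersion" % FOXML_URI)
        if versions:
            latest[datastream.get("ID")] = max(versions, key=version_number).get("ID")
    return latest


def applies_to_latest(dsid, version_id, latest):
    """True if an AUDIT record's version ID is the datastream's latest."""
    return latest.get(dsid) == version_id
```

With the sample above, an AUDIT record for `OBJ.1` would be migrated and one for `OBJ.0` would be ignored.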

mjordan added a commit that referenced this issue Apr 22, 2019
mjordan added a commit that referenced this issue Apr 22, 2019