Sufia 6.x to Sufia 7.2 migration PCDM
-
Collection Becomes Collection
-
Make sure Encoding in metadata works during the migration
-
GenericFile Becomes GenericWork & FileSet
-
GenericWork.id = GenericFile.id
-
Versions: Hector's code handles this, but the order may be invalid (see project_hydra/sufia/import_s6)
- Make sure to keep the updated and created dates
- import_current_version in the import_s6 branch should use the actor stack
-
Permissions: Hector's code handles this, but the order may be invalid (see project_hydra/sufia/import_s6)
-
Thumbnails - The current thinking is to throw away the existing thumbnails and let Sufia 7 regenerate them
-
Make sure Encoding in metadata works during the migration
-
Migrate related URLs if they point back into ScholarSphere - do we even need this?
- Yes, migrate the URLs forward
-
Batch Becomes hasRelatedWork predicate on GenericWork (Need project_hydra/sufia/#1711)
-
Should use DCE:relation, not hasRelatedWork
-
Activity - In Redis. Need to migrate from Work ID to new File Set ID
-
Deposit messages can change from GenericFile to GenericWork in the Redis key
-
Attached messages need to move from the id of the GenericFile to the id of the FileSet
- Should we create the correct FileSet message, or should we just put these at the work level?
- ActorStack might create additional Redis activity, but with bad dates
-
Work created at the work level
-
Upload at the fileset level
-
Update of new files would occur as activity on both the work and the new fileset
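The Redis key changes described above could be sketched as a pure key-rewriting function. This is only an illustration in Python: the `GenericFile:<id>:<suffix>` key layout is an assumption, not confirmed against the Sufia 6 Redis data, and `rename_key` and `file_set_id_for` are hypothetical names. It relies on GenericWork.id = GenericFile.id for work-level keys, and on a lookup from each GenericFile id to its new FileSet id for FileSet-level keys.

```python
def rename_key(key, file_set_id_for=None):
    """Rewrite one Redis key of the ASSUMED form 'GenericFile:<id>:<suffix>'.

    file_set_id_for: optional dict mapping a GenericFile id to its new
    FileSet id (hypothetical). When omitted, the key is moved to the
    work level, where the id is unchanged.
    """
    parts = key.split(":", 2)
    if len(parts) != 3 or parts[0] != "GenericFile":
        return key  # not a key this sketch knows how to migrate
    _, old_id, suffix = parts
    if file_set_id_for is None:
        # Work-level activity: same id, new model name.
        return "GenericWork:" + old_id + ":" + suffix
    # FileSet-level activity: new model name and new id.
    return "FileSet:" + file_set_id_for[old_id] + ":" + suffix
```

Keeping the rewrite separate from the Redis calls makes it easy to dry-run over a list of existing keys before anything is renamed.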
-
Audit logs - In MySQL. Need to migrate from Work ID to new File Set ID
-
Featured Works - In MySQL. Database Migration
-
Analytics - In MySQL. Account for views of a work & views and downloads of a file set
-
May need to start with views of the work/fileset being a copy of the GenericFile views/downloads
-
The work's download count should be the sum of the downloads of its files.
-
The file would have downloads and views
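The analytics rules above can be illustrated with a small sketch, under stated assumptions. The dictionary shapes and the names `seed_file_set_stats` and `work_stats` are hypothetical and do not reflect the actual MySQL schema. The sketch only shows the two rules: each file set starts with a copy of its GenericFile's views and downloads, and a work's downloads are the sum over its file sets.

```python
def seed_file_set_stats(generic_file_stats, file_set_id_for):
    """Copy each GenericFile's counts onto its new FileSet id.

    generic_file_stats: {generic_file_id: {"views": int, "downloads": int}}
    file_set_id_for:    {generic_file_id: new_file_set_id}
    Both shapes are hypothetical stand-ins for the MySQL tables.
    """
    return {
        file_set_id_for[gf_id]: dict(counts)
        for gf_id, counts in generic_file_stats.items()
    }


def work_stats(work_views, file_set_ids, file_set_stats):
    """Views are tracked on the work; downloads are summed over its file sets."""
    downloads = sum(file_set_stats[fs_id]["downloads"] for fs_id in file_set_ids)
    return {"views": work_views, "downloads": downloads}
```

Right after the migration each work has a single file set, so the sum equals the copied value. The sum only starts to matter once more files are added to a work.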
-
- Keep existing URLs valid - Alias /files to concern/generic_works
- Resource type is not a base term
- Shared File directory between web application machines is needed for thumbnails
- Existing code from Hector needs to be reworked for the latest versions of Sufia 6 (S6) and Sufia 7 (S7)
- https://github.com/projecthydra/sufia/tree/import_s6
- https://github.com/psu-stewardship/scholarsphere/tree/export_s6
- Need Fedora 4.5.0
- Need Solr 5.5.1 or ETDa version
- Version of Redis? 2.6?
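The /files alias from the list above amounts to a path rewrite, since GenericWork.id = GenericFile.id. The following Python sketch only illustrates the mapping. In the application itself this would presumably be a route or redirect, and the name `alias_path` is hypothetical. Only `/files/<id>` is rewritten; sub-paths are left alone because this page does not say how they map.

```python
import re

# Matches an old Sufia 6 file URL path of the form /files/<id>,
# with an optional trailing slash and nothing after it.
_OLD_FILE_PATH = re.compile(r"^/files/([^/?#]+)/?$")


def alias_path(path):
    """Map /files/<id> to /concern/generic_works/<id>; return other paths unchanged."""
    match = _OLD_FILE_PATH.match(path)
    if match is None:
        return path
    return "/concern/generic_works/" + match.group(1)
```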
Below are the steps to audit the Fedora 4 to Sufia 7 (PCDM) migration. The basic workflow is as follows:
1. Gather the list of all objects (Collections, GenericFiles, & map Batches) in the Fedora 4 repo and store them in a MySQL table.
2. Migrate the data (see the plan below).
3. Loop through all the Fedora 4 objects in the MySQL table and make sure that (a) they exist in the Fedora 4 repo and (b) their model matches their mapped model set (Collection = Collection; GenericFile = GenericWork (with related works) & FileSet).
You can perform steps 1 and 2 in either order. The only requirement for step 1 is that you have access (i.e. the URL) to a running Fedora 4 repo with the original data.
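The verification loop in the last step of the audit could look roughly like the sketch below. It is a minimal illustration in Python, not the project's actual tooling. `EXPECTED_MODELS` restates the model mapping from this page, while `audit`, the record tuples, and the `lookup` callable are hypothetical stand-ins for the MySQL table and the repository query. Batches are not covered here.

```python
# Expected target models for each source model, as described on this page.
EXPECTED_MODELS = {
    "Collection": {"Collection"},
    "GenericFile": {"GenericWork", "FileSet"},
}


def audit(records, lookup):
    """Return a list of (object_id, problem) pairs; empty means the audit passed.

    records: iterable of (object_id, source_model) rows, standing in for
             the MySQL table built in the first step (hypothetical shape).
    lookup:  callable that takes an object id and returns the set of models
             found for that object and its members in the migrated repo,
             or None if nothing exists (hypothetical stand-in for a query).
    """
    problems = []
    for object_id, source_model in records:
        found = lookup(object_id)
        if found is None:
            problems.append((object_id, "missing"))
            continue
        expected = EXPECTED_MODELS.get(source_model)
        if expected is None:
            problems.append((object_id, "unmapped source model " + source_model))
        elif not expected <= found:
            missing = sorted(expected - found)
            problems.append((object_id, "model mismatch, missing " + ", ".join(missing)))
    return problems
```

Passing the lookup in as a callable keeps the check independent of how the repository is queried, so the same loop can be exercised against a plain dictionary in a test.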
- Sync data to QA & Staging
- Run Migration on QA
- Regression Testing QA
- Bug Fixes
- Release Notification on Production
- Create user email list
- Staging migration and process documentation
- ITS alert
- Open firewall so QA is available to stage & prod (repos)
- Add migration banner into release branch of master
- Ensure all linked files are present and configured correctly
- see http://sites.psu.edu/dltdocs/?p=3521
- the deploy will fail if application.yml does not have all the required keys