Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PFDR-252 v2 deposit scenario #88

Open
wants to merge 26 commits into
base: master
Choose a base branch
from
Open

PFDR-252 v2 deposit scenario #88

wants to merge 26 commits into from

Conversation

malakai97
Copy link
Contributor

@malakai97 malakai97 commented Sep 4, 2019

Overview

At a high level, we have added the v2 deposit pipeline in the form of Deposit, Artifact, Revision, and FinishDeposit. We've revamped components of the v1 pipeline to allow component reuse. Notably:

  • PackageStorage accepts both v1 and v2 objects
  • Volume accepts both v1 and v2 objects

The validation system has been completely revamped. Chipmunk::Bag still has a #valid? method courtesy of tipr/bagit, but validation is now the responsibility of ValidationService. The logic of validation that was spread across the app has been pushed to subclassess of Chipmunk::Validator::Validator.

There's also been a general consolidation and focusing of the responsibilities of the various storage components.

Tests

Care has been taken to not lose coverage. Most tests that fell out of the revised responsibilities have been moved. Those few that were removed were considered to no longer provide value.

v1 cucumber specs are intact.

Review

There are a few areas that should receive particular review attention:

  • The steps in the scenario; specifically we could remove one of the REST calls by folding it into a previous call.
  • Any thoughts on validating Deposits
  • Checkpoint/Keycard integration. I'll need to pair on this, either before or after attempting it on my own.
  • Some of the new controllers directly specify the template they need to display; i'm not sure what's up with the automatic template resolution.

Rubocop was not applied consistently. There is a single rubocop commit at the end of the PR. I'm willing to go back and apply rubocop to each commit individually, but at time of writing that seemed like a poor tradeoff.

The thought was to add DepositStatus as a PORO, and use it via the Rails
attributes API and app/attributse/DepositStatusType, but I couldn't get
it to work.
* Introduces the concept of a Reader, used to load instances from disk.
  This immediately replaces Volume's call to Bag.new with delegation to
  the volume's reader, an instance of Bag::Reader

  This conceptoffers a point from which we can more effectively handle
  different types of stored objects.

* Similarly, we introduce the concept of a Writer, implement
  Bag::MoveWriter, and use it in our volumes.

The main driver here is to have the ability to read and write v1 bags,
versioned bags, and in the future OCFL objects in a consistent way. This
allows us to use PackageStorage for all of these object
types, avoiding having something like RevisionStorage and OCFLStorage.

This approach was chosen because the responsibility for simply
writing the object felt like it belonged in Volume, and so we inject it
as a dependency. Meanwhile, the high-level rules around writing,
especially where it fits in the lifecycle, belong to PackageStorage.
This check was concerned with the order in which finishing a deposit
takes place or with preventing duplicate deposits. IncomingStorage
is not the correct place for this check. Moved to BagMoveJob.
* Checking whether or not the bag moved was not sufficiently paranoid
* Removed some duplicate tests
Remove demeter violation for user.username in UserUploadPath
* Remove Chipmunk::Validatable (replaced)
* Remove Chipmunk::Bag::Validator (replaced)
* Add ValidatorService
* Add ValidationResult
* BagMoveJob relies on ValidatorService and does not interrogate Package
* Moved Package's validation logic out of the class
Rails overwrites "format" in the request's parameter hash based on the
body of the request, e.g. json. We didn't notice this before because
we set the format of the created Package to Chipmunk::Bag.format.
Now that we want to get the format from the user, we run into this
conflict.

This works around the issue by renaming the field to storage_format

It also removes the paranoid format check from Volume. That sort of
check is now part of validation.
* Provide hook in PackageStorage for on-success behavior
* Use said hook
* Define methods on QueueItem to replace calls to update() previously
  used in BagMoveJob.

These changes allow for greater reuse of PackageStorage.
Rubocoping all at once is far from ideal.
Previously, this relied on Rails's automatic behavior to migrate the
database, which seems to build the database from schema.rb. As schema.rb
is not created correctly when the PK of a table is a uuid, it would skip
creating the artifacts table.

By forcing the migrations to run, we better test the production setup,
and avoid this issue.
@malakai97 malakai97 changed the title PFDR-244 v2 deposit scenario PFDR-255 v2 deposit scenario Sep 4, 2019
@malakai97 malakai97 changed the title PFDR-255 v2 deposit scenario PFDR-252 v2 deposit scenario Sep 4, 2019
@malakai97
Copy link
Contributor Author

I should have time to address the issues travis reports today.

@malakai97 malakai97 force-pushed the PFDR-244-clean-good branch 3 times, most recently from 42f109b to 91712af Compare September 11, 2019 01:08

def create
collection_policy.new(current_user).authorize! :new?
# We don't explicitly check for :save? permissions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unclear whether this is descriptive or declarative. Is this a problem report or a design decision?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been a while, but the save check and the new check seemed be identical sets.

class DepositsController < ResourceController

def create
# TODO policy check
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocker

end

def ready
# TODO policy check
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocker

AnyArtifact.new
end

alias_method :identifier, :id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's talk about this identifier business... remind me of the intent behind an alias?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to be able to treat the artifact the same as our v1 objects, so there was some mismatch between id, uuid.


alias_method :identifier, :id

# Each artifact belongs to a single user
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to follow convention and have all of the relationships and validations at the very top of the class.

deposit.transaction do
Revision.create!(deposit: deposit, artifact: deposit.artifact)
deposit.complete!
deposit.artifact.save!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be nothing to save on the artifact, right? Is this just to make sure that the reference on the deposit is updated to include the revision association?

class ValidationService

# @return [ValidationResult]
def validate(validatable)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll want to think about this pattern more later. This is simple, so it works for now, but I want to be careful about the switch growing and the implication that validation is "the same" for multivariate inputs.

@@ -0,0 +1,3 @@
# frozen_string_literal: true

Dir[File.join(__dir__, "validator", "**", "*.rb")].each {|file| require_relative file }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect plugins? If not, I would rather list than loop -- I don't expect much churn here.

expect(described_class.new([]).valid?).to be true
end
it "is invalid when there are errors" do
expect(described_class.new([1]).valid?).to be false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This 1 appears to have some significance. Change to something that is more obviously filler, like below.

paths: Chipmunk::UploadPath.new("/"),
links: Chipmunk::UploadPath.new(Chipmunk.config.upload["rsync_point"])
)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty novel: to register test components over the standard ones. I'm not sure that the pattern holds for all component types, but I think this is a great approach to try. The alternative is a proper "normal" vs "test" context separation, probably initiated by the environments/*.rb files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: we do this in moku, as an example; that was the inspiration for the change here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants