-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add audit functionality #113
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! One specific comment inline that would drive a little cleanup elsewhere. We'll need to fix up whitespace, indentation, but functionally, I think this is very close.
app/jobs/audit_manager.rb
Outdated
start_audit(user: User.system_user, audit: audit, packages: packages) | ||
end | ||
|
||
def self.start_audit (user, audit, packages) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not pass the packages here. We're implying that it should be an audit of everything, so if we do the lookup of Package.stored
here and update the audit with the count before it is saved, we can drop the parameter and duplicate lookups.
packages.each do |package| | ||
AuditFixityCheckJob.perform_later(package: package, user: current_user, audit: audit) | ||
end | ||
AuditManager.start_audit (user: current_user, audit: audit, packages: packages) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually need to squish the parentheses and method name together... It's a weird Ruby parsing thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super close now! Final comments inline.
|
||
packages = Package.stored | ||
audit = Audit.new(user: current_user, packages: packages.count) | ||
audit = Audit.new(user: current_user, packages: Package.stored.count) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we're not saving in this method anymore, we don't have to set the package count. Just give it what we know here, but the audit method cannot know -- the user.
app/jobs/audit_manager.rb
Outdated
@@ -0,0 +1,14 @@ | |||
class AuditManager | |||
def self.system_audit | |||
audit = Audit.new(user: User.system_user, packages: Package.stored.count) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as in the controller, just set the user on the audit. Let start_audit
query and set the count.
app/jobs/audit_manager.rb
Outdated
|
||
def self.start_audit (user, audit) | ||
audit.save | ||
packages = Package.stored |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this before the save, and set the count on the audit object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Further changes requested. Seeing it fresh made me reconsider whether to have a standalone class or a job subclass -- see inline for detailed thoughts.
packages = Package.stored | ||
audit = Audit.new(user: current_user, packages: packages.count) | ||
audit = Audit.new(user: current_user) | ||
|
||
resource_policy.new(current_user, audit).authorize! :save? | ||
audit.save | ||
|
||
packages.each do |package| | ||
AuditFixityCheckJob.perform_later(package: package, user: current_user, audit: audit) | ||
end | ||
AuditManager.start_audit(user: current_user, audit: audit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the total logic is brief and each line is simple, I would remove the blank lines between them and put one before the response. Then the business part is grouped together, and the web part finishes the method alone.
app/jobs/do_audit.rb
Outdated
require "audit_manager" | ||
|
||
# To run it | ||
# bin/rails runner app/jobs/do_audit.rb | ||
AuditManager.system_audit() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is unnecessary because rails runner
works directly with ruby code. In this case:
bin/rails runner "AuditManager.system_audit"
This is why we chose to make it a class method rather than an instance method -- to keep it concise.
https://guides.rubyonrails.org/command_line.html#bin-rails-runner
app/jobs/audit_manager.rb
Outdated
@@ -0,0 +1,15 @@ | |||
class AuditManager |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I suggested making a standalone class, I may have been unnecessarily idealistic. These methods can go directly on AuditFixityCheckJob
if we choose. There is a trade-off, here...
The ActiveJob subclasses behave somewhere between classes and instances in a way that is not obvious. When you call the perform_later
method on the class, it enqueues a job (serializing all of the parameters, because the workers are in a separate Ruby process). When that job is processed, an instance of your job class is created with no parameters, and the perform
method is called with the deserialized versions of the parameters you passed to perform_later
. (phew!)
You can add other instance or class methods to the ActiveJob subclasses. I initially suggested a separate class to avoid any confusion about the perform behavior and other entrypoints to the class. However, having a class in the jobs
directory that doesn't behave like the others is its own source of confusion. We could put this class somewhere else, but here, we're taking advantage of the conventional Rails directory structure and auto-loading.
Reconsidering, I think we should make this something more like:
class RepositoryAuditJob < ApplicationJob
def perform(user: User.system_user, audit: Audit.new(user: user))
# ...
end
end
Note that the default value for audit
uses the user
parameter. This is totally fine in Ruby. The invocation from the controller would be similar to what it is now:
RepositoryAuditJob.perform_later(user: current_user, audit: audit)
And the "runner" invocation would take advantage of the default parameters:
bin/rails runner RepositoryAuditJob.perform_later
I will note, this is still a little awkward in the design because we expect a user, and an audit instance that knows the user... but it follows job conventions more closely, and is amenable to refactoring. The two-stage policy check in the controller is the issue there, and should be cleaned up such that the authorization does not depend on on Audit instance. Then, the Audit can be created wholly within in this job. That can happen now, or in another step.
config/settings.yml
Outdated
# Sys Admin username (used for system audit) | ||
admin_username: (sysadmin) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about it some more, this username should not be configurable, and it's not really the "administrator". This is meant to represent the system itself.
app/models/user.rb
Outdated
@@ -12,6 +12,10 @@ class User < ApplicationRecord | |||
# Assign an API key | |||
after_initialize :add_key, on: :create | |||
|
|||
def self.system_user | |||
new(username: Chipmunk.config["admin_username"], email: Chipmunk.config["admin_useremail"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a literal (system)
is better for the username here. It's already a bit of a concession that we have a fake user and it must have a username/email, rather than a proper notion of the system taking action. I cannot imagine a scenario where that username should be configured. It is "the system", not an unnamed administrator.
The email key here is mismatched with the config file. I think admin_email
like it is in the config file is fine as it stands. In the grand scheme, I'd prefer to formalize a notification scheme and recipients, but this has to do for now.
app/jobs/audit_manager.rb
Outdated
@@ -0,0 +1,15 @@ | |||
class AuditManager |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is now left over. It should go away.
resource_policy.new(current_user, audit).authorize! :save? | ||
audit.save | ||
RepositoryAuditJob.perform_later(user: current_user, audit: audit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was a little worried about this... https://travis-ci.org/github/mlibrary/chipmunk/jobs/763053000#L1255
Because of the assumptions built into ActiveJob, it tries to serialize any ActiveRecord model instances. It needs them to be saved to do so. I think there are two solutions, here:
- Remove the
audit
parameter from this call, so it is created in the job. - Switch to
perform_now
.
I think what we really want is the first option. The fact that we do authorization on the audit instance is strange and redundant, anyway. If the rules would differ at the instance, we could enforce that policy within the job. What we're seeing here is one of the fundamental conceptual issues with just how "active" the model classes are in typical Rails patterns, and how the REST orientation cutting through the entire application can cause some real awkwardness.
Basically, new
and create
have the same rule for the Audit type, so we should take out the instance and resource policy check entirely. This becomes a very simple, two-line method.
app/jobs/repository_audit_job.rb
Outdated
@@ -0,0 +1,10 @@ | |||
class RepositoryAuditJob < ApplicationJob | |||
def perform(user: User.system_user, audit: Audit.new(user: user)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we don't need to pass the audit, we can make it in the body of the method now.
I think the serialization would also choke on the system user with perform_later
, so we will have to try RepositoryAuditJob.perform_now
to be sure rails runner
can run the job with a transient instance.
17baed9
to
25a51ba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something weird happened in the CI run, but it did reveal a bug. We removed the audit instance entirely from the controller method, which we then try to use:
https://travis-ci.org/github/mlibrary/chipmunk/jobs/763122354#L1510
A suggestion of something to try inline.
@@ -17,16 +17,8 @@ def show | |||
|
|||
def create | |||
collection_policy.new(current_user).authorize! :new? | |||
RepositoryAuditJob.perform_later(user: current_user) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Last, chance, ActiveJob...
If we're trying to preserve the ability to start an audit through the API, and the RESTful nature, we need the job to create the audit, enqueue the package jobs, and return the audit instance, so we can redirect to it with a 201.
If we can't get it to cooperate by switching this to a perform_now
and giving us a real return value, it's back to a standalone class.
d8fa586
to
64a114c
Compare
app/jobs/repository_audit_job.rb
Outdated
packages = Package.stored | ||
audit = Audit.create(user: user, packages: packages.count) | ||
packages.each do |package| | ||
AuditFixityCheckJob.perform_now(package: package, user: user, audit: audit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops! We can't perform_now
on the individual items. We just want to initiate the audit, scheduling the individual packages checks (peform_later
), and then return the audit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the controller test passing, I believe that perform_now
is returning the value from perform
in process. I think this is ready to go. Nice work sticking with it.
No description provided.