feat(robot-server): limit the maximum number of analyses stored per protocol #14885
Conversation
analyses_to_delete = analyses_ids[
    : len(analyses_ids) - MAX_ANALYSES_TO_STORE + 1
]
Purely stylistic, no opinion from me either way, but you can also write this as `analyses_to_delete = analyses_ids[:-MAX_ANALYSES_TO_STORE]` if you think that's clearer.
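(As a quick standalone check of what that slice evaluates to for a few list lengths — not code from the PR, and assuming the limit of 5 that this PR adds:)

# Standalone sketch, not project code. Assumes MAX_ANALYSES_TO_STORE = 5,
# matching the per-protocol limit this PR introduces.
MAX_ANALYSES_TO_STORE = 5

for n in (3, 5, 7):
    analyses_ids = [f"analysis-{i}" for i in range(1, n + 1)]
    to_delete = analyses_ids[: len(analyses_ids) - MAX_ANALYSES_TO_STORE + 1]
    print(n, to_delete)

# 3 ['analysis-1', 'analysis-2']   (the small-list case later fixed in #14904)
# 5 ['analysis-1']
# 7 ['analysis-1', 'analysis-2', 'analysis-3']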
for analysis_id in analyses_to_delete:
    self._memcache.remove(analysis_id)
    delete_statement = sqlalchemy.delete(analysis_table).where(
        analysis_table.c.id == analysis_id
    )
    with self._sql_engine.begin() as transaction:
        transaction.execute(delete_statement)
Can we do all these deletions inside a single transaction?
And ideally, we'd actually go even further and have one transaction for deleting the old entries and inserting the new entry. That doesn't have to happen in this PR if it's a nontrivial refactor, but it is something that we should do, so can we think through what it would take? Like, does the AnalysisStore/CompletedAnalysisStore split get in the way of that?
I did try exactly this. Maybe I'm not yet familiar with the full range of SQLAlchemy's capabilities, but the code was getting a bit too clunky for my liking, especially when trying to delete all old entries in one transaction.
This was what I was going for:
delete_statement = analysis_table.delete()
insert_statement = analysis_table.insert().values(
    await completed_analysis_resource.to_sql_values()
)
for analysis_id in analyses_to_delete:
    self._memcache.remove(analysis_id)
    delete_statement = delete_statement.where(
        analysis_table.c.id == analysis_id
    )
# <add something to make sure we don't end up executing `analysis_table.delete()`>
with self._sql_engine.begin() as transaction:
    transaction.execute(delete_statement)
    transaction.execute(insert_statement)
self._memcache.insert(
    completed_analysis_resource.id, completed_analysis_resource
)
(I needed a few more checks in there, but that was basically the idea.)
Do you know of a better way of composing statements with multiple conditions?
Hmm.. it just occurred to me that I could have tried this-
insert_statement = analysis_table.insert().values(
    await completed_analysis_resource.to_sql_values()
)
delete_statements = []
for analysis_id in analyses_to_delete:
    self._memcache.remove(analysis_id)
    delete_statements.append(
        analysis_table.delete().where(analysis_table.c.id == analysis_id)
    )
with self._sql_engine.begin() as transaction:
    for delete_statement in delete_statements:
        transaction.execute(delete_statement)
    transaction.execute(insert_statement)
self._memcache.insert(
    completed_analysis_resource.id, completed_analysis_resource
)
But is this^ any better than what is implemented?
It looks like there is a special engine configuration for executing multiple statements in a single transaction, which I don't think we have enabled.
[...]
for analysis_id in analyses_to_delete:
    self._memcache.remove(analysis_id)
    delete_statement = delete_statement.where(
        analysis_table.c.id == analysis_id
    )
[...]

Do you know of a better way of composing statements with multiple conditions?
Yeah. There are a few ways to skin this cat, but I think the most direct translation of what you're trying to do is:
delete_statement = analysis_table.delete().where(analysis_table.c.id.in_(analyses_to_delete))
Which will generate a SQL statement like:
DELETE FROM analysis WHERE id IN ("abc", "def", "ghi")
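Putting that together with the insert, one possible shape for the single-transaction version (just a sketch using the names from the snippets above, not necessarily the exact code that should land):

# Sketch only, reusing names from the snippets above.
delete_statement = analysis_table.delete().where(
    analysis_table.c.id.in_(analyses_to_delete)
)
insert_statement = analysis_table.insert().values(
    await completed_analysis_resource.to_sql_values()
)
with self._sql_engine.begin() as transaction:
    # Both statements commit or roll back together.
    transaction.execute(delete_statement)
    transaction.execute(insert_statement)
# Keep the in-memory cache consistent with what's now in the DB.
for analysis_id in analyses_to_delete:
    self._memcache.remove(analysis_id)
self._memcache.insert(
    completed_analysis_resource.id, completed_analysis_resource
)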
Hmm.. it just occurred to me that I could have tried this-
[...]
with self._sql_engine.begin() as transaction:
    for delete_statement in delete_statements:
        transaction.execute(delete_statement)
[...]
Yep, that also works.
But is this^ any better than what is implemented?
Definitely. It's faster, safer in the face of concurrency that might be added in the future, and more semantically "correct."
It looks like there is a special engine configuration for executing multiple statements in a single transaction which I don't think we have enabled.
Hm, what do you mean? Executing multiple statements in a single transaction is a normal thing to do and we do it routinely. Where are you seeing that that needs special configuration?
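For reference, this is the stock pattern with a plain SQLAlchemy engine — a minimal, self-contained example (not project code), with no special configuration needed:

# Minimal standalone example: several statements in one transaction
# with a stock SQLAlchemy engine and no special configuration.
import sqlalchemy

engine = sqlalchemy.create_engine("sqlite://")  # in-memory DB for illustration
metadata = sqlalchemy.MetaData()
analysis = sqlalchemy.Table(
    "analysis",
    metadata,
    sqlalchemy.Column("id", sqlalchemy.String, primary_key=True),
)
metadata.create_all(engine)

with engine.begin() as transaction:
    # All statements inside this block run in one transaction and
    # commit together when the block exits (or roll back on error).
    transaction.execute(analysis.insert().values(id="abc"))
    transaction.execute(analysis.insert().values(id="def"))
    transaction.execute(analysis.delete().where(analysis.c.id == "abc"))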
TY!
Nice! Looks good!
…es length bug (#14904)

Closes AUTH-347

# Overview

#14885 added the feature to limit the number of analyses we store in the DB. In [this](#14885 (comment)) comment, @SyntaxColoring pointed out that we should consolidate the DB transactions for better performance, so that's what this PR does.

Also fixes a bug where, if the existing number of analyses in the DB was 3 and we added another analysis, the formula for getting the analysis IDs to delete would evaluate to `analysis_ids[:-1]` and delete all analyses except the last one.

# Test Plan

- Tested the cases mentioned in #14885
- Tested the bug case

# Risk assessment

Low. Refactor + small bug fix.
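(For illustration, the kind of guard that bug fix implies — a sketch only, using the names from the original snippet; the actual code in #14904 may differ:)

# Sketch only; the actual fix in #14904 may look different.
# Only trim when we're already at (or above) the limit, so a small list
# never hits the analyses_ids[:-1] case described above.
if len(analyses_ids) >= MAX_ANALYSES_TO_STORE:
    analyses_to_delete = analyses_ids[
        : len(analyses_ids) - MAX_ANALYSES_TO_STORE + 1
    ]
else:
    analyses_to_delete = []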
Closes AUTH-317
Overview
Adds a max-analyses-per-protocol limit of 5 analyses. Auto-deletes the oldest analysis(es) before adding a new one if the number of stored analyses exceeds the max.
Test Plan
Changelog
CompletedAnalysisStore
Review requests
Usual code review. I've added in-line comments for things that need extra attention.
Risk assessment
Low. Well tested and pretty isolated feature.