-
Notifications
You must be signed in to change notification settings - Fork 178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(robot-server): Parallelize the migration to schema 3 #14465
perf(robot-server): Parallelize the migration to schema 3 #14465
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## chore_release-7.2.0 #14465 +/- ##
====================================================
Coverage 67.78% 67.78%
====================================================
Files 2519 2519
Lines 72011 72011
Branches 9263 9263
====================================================
Hits 48812 48812
Misses 20991 20991
Partials 2208 2208
Flags with carried forward coverage won't be shown. Click here to find out more. |
Merging for persistence test improvements.
…xecutor. We already need a multiprocessing.Lock, and we're not doing anything that needs concurrent.Futures, so this seems more direct. There also appears to be very slightly less overhead (~0-5%).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes in this file are to separate connecting to the database (sqlalchemy.create_engine()
) from adding all the tables to the database (metadata.create_all()
).
The new migration code has to close and reopen the database several times, and we don't want to re-add all the tables every time we do that. I think it would technically be harmless, but it felt confusing, and it spammed the logs.
def sql_engine_ctx(path: Path) -> Generator[sqlalchemy.engine.Engine, None, None]: | ||
"""Like `create_sql_engine()`, but clean up when done.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason we haven't been defining this as a context manager this whole time is that it was difficult to integrate with FastAPI.
In recent versions of FastAPI, that's fixed, yay! So at some point, we can switch everything to this and delete the non-context-manager version.
sqlalchemy.PickleType(pickler=legacy_pickle, protocol=PICKLE_PROTOCOL_VERSION), | ||
sqlalchemy.LargeBinary, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sqlalchemy
-only change (not affecting the actual data, or the SQL) is to let us unpickle the data without holding the database lock the whole time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Separately: I'm increasingly unsure of how to document historical schema stuff like this. Just seeing LargeBinary
is definitely not helpful if you're a new contributor and you're trying to figure out how this old migration code works. But it's difficult to clarify this without just being like "see RunStore
," and I don't know if that's much better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding what to do about documenting schema info/transitions as we implement the migration, a summary of the old command storage method describing the migration to our new storage method exists elsewhere, yes? I would recommend directing contributors there from here, and if need be add details regarding the migration in that singular location.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a summary of the old command storage method describing the migration to our new storage method exists elsewhere, yes?
Yeah, although it's not super detailed.
I would recommend directing contributors there from here, and if need be add details regarding the migration in that singular location.
Hmm. Not the other way around?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks solid, the performance improvements will be very advantageous. The forkserver
adaptation is appropriate as well.
sqlalchemy.PickleType(pickler=legacy_pickle, protocol=PICKLE_PROTOCOL_VERSION), | ||
sqlalchemy.LargeBinary, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding what to do about documenting schema info/transitions as we implement the migration, a summary of the old command storage method describing the migration to our new storage method exists elsewhere, yes? I would recommend directing contributors there from here, and if need be add details regarding the migration in that singular location.
This is already in the worker process's function's docstring.
Overview
Closes RSS-451.
Changelog
Test plan
For testing for correctness, I think we're decently covered by automated tests.
For performance, I've tested this on an OT-2 along with #14480 and found that it's in line with the initial estimates in RSS-451. With a run history lifted from one of the ABR robots, with 88k commands, this cuts the 2->3 migration time from 10-11 minutes (625s) to 4-5 minutes (261s). Extrapolating to the worst case as defined in RSS-451, it would be a cut from 23-24 minutes to 9-10 minutes.
Review requests
See inline comments.
Risk assessment
Medium. This is the first time (?) we're introducing multiprocessing into robot-server, and so we have to work through some gotchas. And it's part of database migration code, so it's sort of a complexity turducken situation.