perf(robot-server): Store one command per row #14348

Merged 8 commits on Jan 30, 2024

@SyntaxColoring SyntaxColoring commented Jan 24, 2024

Overview

Closes RSS-132. See that ticket for background.

Test Plan

I think we're sufficiently covered by existing automated integration tests.

I've reviewed some of the new queries with SQLite's EXPLAIN QUERY PLAN. They look like they're all using fast index-based searches instead of slow full table scans.
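For anyone who wants to repeat that kind of check, here is a minimal sketch of it. It is not one of the actual queries from this PR: the column types, the index name `ix_run_id_index_in_run`, and the slice query are assumptions made for illustration. It compares the plan SQLite reports for the same query before and after a matching index exists.

```python
import sqlite3

# Hypothetical schema for illustration only; the real definition is in the PR diff.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE run_commands (
        run_id TEXT NOT NULL,
        index_in_run INTEGER NOT NULL,
        command_id TEXT NOT NULL,
        command BLOB NOT NULL
    )
    """
)


def query_plan(sql, params=()):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


# An assumed "give me a slice of one run's commands" query.
slice_query = (
    "SELECT command FROM run_commands "
    "WHERE run_id = ? AND index_in_run >= ? AND index_in_run < ? "
    "ORDER BY index_in_run"
)

# With no usable index, SQLite reports a SCAN (a full table scan).
plan_without_index = query_plan(slice_query, ("run1", 0, 10))

# With an index on (run_id, index_in_run), it reports a SEARCH using that index.
conn.execute(
    "CREATE UNIQUE INDEX ix_run_id_index_in_run "
    "ON run_commands (run_id, index_in_run)"
)
plan_with_index = query_plan(slice_query, ("run1", 0, 10))

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan text varies between SQLite versions, so it is safer to look for the `SCAN` and `SEARCH` keywords than to compare whole strings.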

Changelog

  • Remove the run.commands column, where each value was a large list of commands.

  • In its place, add a run_commands table, where each row holds just a single command.

    run_id  index_in_run  command_id  command
    run1    0             abcd        [blob]
    run1    1             efgh        [blob]
    run2    0             ijkl        [blob]
    run2    1             mnop        [blob]
    ...     ...           ...         ...
  • Update RunStore to use the new table.

  • Add a migration.
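The shape of the change listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the migration in this PR: it assumes the old `run.commands` value is a JSON-encoded list of command objects that each have an `"id"` key, and the column types, the index name, and the function name `migrate_commands_to_rows` are made up. Dropping the old column is left out of the sketch.

```python
import json
import sqlite3


def migrate_commands_to_rows(conn):
    """Copy each run's command list into one row per command (hypothetical sketch)."""
    with conn:  # Commit on success, roll back on error.
        conn.execute(
            """
            CREATE TABLE run_commands (
                run_id TEXT NOT NULL REFERENCES run(id),
                index_in_run INTEGER NOT NULL,
                command_id TEXT NOT NULL,
                command TEXT NOT NULL,
                PRIMARY KEY (run_id, index_in_run)
            )
            """
        )
        # Assumed index so lookups by command ID do not need a table scan.
        conn.execute(
            "CREATE UNIQUE INDEX ix_run_commands_run_id_command_id "
            "ON run_commands (run_id, command_id)"
        )
        old_rows = conn.execute("SELECT id, commands FROM run").fetchall()
        for run_id, commands_json in old_rows:
            commands = json.loads(commands_json) if commands_json else []
            conn.executemany(
                "INSERT INTO run_commands VALUES (?, ?, ?, ?)",
                [
                    (run_id, index, command["id"], json.dumps(command))
                    for index, command in enumerate(commands)
                ],
            )


# Usage with made-up data mirroring the example table above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE run (id TEXT PRIMARY KEY, commands TEXT)")
conn.execute(
    "INSERT INTO run VALUES (?, ?)",
    ("run1", json.dumps([{"id": "abcd"}, {"id": "efgh"}])),
)
conn.execute(
    "INSERT INTO run VALUES (?, ?)",
    ("run2", json.dumps([{"id": "ijkl"}, {"id": "mnop"}])),
)
migrate_commands_to_rows(conn)
```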

Review requests

This new table will be our biggest by far: tens of thousands of records, as opposed to ~20 in our existing tables. One of SQL's traps is that it's easy to accidentally write a very inefficient query. If the right indexes aren't set up, SQLite will fall back to a full O(n) table scan. So please scrutinize my queries to make sure we're not doing that.

Also see my inline review comments.

Risk assessment

Medium. See review requests above.

@SyntaxColoring SyntaxColoring requested a review from a team January 24, 2024 19:54
@SyntaxColoring SyntaxColoring changed the base branch from edge to db_draft_migration January 24, 2024 19:54
@SyntaxColoring SyntaxColoring force-pushed the db_run_commands_as_rows branch from e82a168 to d6d0713 Compare January 25, 2024 17:40
def _clear_caches(self) -> None:
    self.has.cache_clear()
    self.get.cache_clear()
    self.get_all.cache_clear()
    self.get_state_summary.cache_clear()
    self.get_command.cache_clear()
    self._get_all_unparsed_commands.cache_clear()

@SyntaxColoring SyntaxColoring Jan 25, 2024

This cache was mostly saving the computational work of repeatedly parsing the whole list of commands. It's removed here "accidentally," because we never parse the whole list of commands anymore: we only parse the slices that are requested, when they're requested.
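To illustrate "only parse the slices that are requested," here is a rough sketch. It is not the RunStore code from this PR: the function name `get_commands_slice`, the table layout, and the use of `json.loads` as a stand-in for the real command parsing are all assumptions.

```python
import json
import sqlite3


def get_commands_slice(conn, run_id, start, stop):
    """Fetch and parse only the commands with start <= index_in_run < stop."""
    rows = conn.execute(
        "SELECT command FROM run_commands "
        "WHERE run_id = ? AND index_in_run >= ? AND index_in_run < ? "
        "ORDER BY index_in_run",
        (run_id, start, stop),
    )
    # Parsing happens per row, so the cost scales with the slice size,
    # not with the total number of commands in the run.
    return [json.loads(row[0]) for row in rows]


# Made-up data: one run with 1000 small commands.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE run_commands (
        run_id TEXT NOT NULL,
        index_in_run INTEGER NOT NULL,
        command_id TEXT NOT NULL,
        command TEXT NOT NULL,
        PRIMARY KEY (run_id, index_in_run)
    )
    """
)
conn.executemany(
    "INSERT INTO run_commands VALUES (?, ?, ?, ?)",
    [
        ("run1", i, f"cmd-{i}", json.dumps({"id": f"cmd-{i}"}))
        for i in range(1000)
    ],
)

page = get_commands_slice(conn, "run1", 10, 13)
print([command["id"] for command in page])
```

Only three of the thousand stored commands get parsed here, which is the work the old whole-list cache used to save.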

If we wanted to retain the cache, I guess the spiritual successor would be a pair of caches whose keys are (run_id, command_index) and (run_id, command_id). It wouldn't be as simple as an @lru_cache anymore. We could do this, but I'm skeptical that it's worthwhile. Our app no longer requests big batches of run commands. And if it did, I'd urge us to find ways to speed up the actual underlying processing, like what we did in #13425.
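For the sake of discussion, a pair of caches like that might look roughly like the sketch below. Nothing like this is in the PR: the class names `_LRU` and `RunCommandCache`, the size limit, and the dict-shaped commands with an `"id"` key are all hypothetical.

```python
from collections import OrderedDict


class _LRU:
    """A tiny least-recently-used mapping with a fixed maximum size."""

    def __init__(self, max_size):
        self._max_size = max_size
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # Mark as most recently used.
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._max_size:
            self._items.popitem(last=False)  # Evict the least recently used.

    def clear(self):
        self._items.clear()


class RunCommandCache:
    """Two caches over the same commands, keyed two different ways."""

    def __init__(self, max_size=1024):
        self._by_index = _LRU(max_size)  # Key: (run_id, command_index)
        self._by_id = _LRU(max_size)  # Key: (run_id, command_id)

    def put(self, run_id, index, command):
        self._by_index.put((run_id, index), command)
        self._by_id.put((run_id, command["id"]), command)

    def get_by_index(self, run_id, index):
        return self._by_index.get((run_id, index))

    def get_by_id(self, run_id, command_id):
        return self._by_id.get((run_id, command_id))

    def clear(self):
        # Both maps have to be invalidated together whenever a run changes.
        self._by_index.clear()
        self._by_id.clear()


cache = RunCommandCache(max_size=2)
cache.put("run1", 0, {"id": "abcd"})
cache.put("run1", 1, {"id": "efgh"})
cache.put("run1", 2, {"id": "ijkl"})  # Pushes the oldest entry out of both maps.
```

Even this small version has to keep two maps in step and invalidate them together, and their eviction orders can drift apart depending on which key callers use. That extra bookkeeping is part of why it's no longer a one-line decorator.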

@SyntaxColoring SyntaxColoring Jan 25, 2024

I suppose for consistency, we should remove the cache on get_command(id) too.

@SyntaxColoring SyntaxColoring marked this pull request as ready for review January 25, 2024 18:46
@SyntaxColoring SyntaxColoring requested a review from a team as a code owner January 25, 2024 18:46

@sfoster1 sfoster1 left a comment

comments on db stuff

codecov bot commented Jan 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (ffcc548) 68.29% compared to head (98624e3) 68.25%.
Report is 1 commits behind head on edge.

❗ Current head 98624e3 differs from pull request most recent head a8156e5. Consider uploading reports for the commit a8156e5 to get more accurate results

@@            Coverage Diff             @@
##             edge   #14348      +/-   ##
==========================================
- Coverage   68.29%   68.25%   -0.05%     
==========================================
  Files        1623     2512     +889     
  Lines       54858    71910   +17052     
  Branches     4115     9174    +5059     
==========================================
+ Hits        37466    49080   +11614     
- Misses      16705    20664    +3959     
- Partials      687     2166    +1479     
Flag               Coverage Δ
app                64.83% <ø> (+29.99%) ⬆️
components         49.62% <ø> (ø)
labware-library    41.10% <ø> (ø)
protocol-designer  38.01% <ø> (ø)
react-api-client   66.16% <ø> (ø)
step-generation    86.90% <ø> (ø)

Files                                        Coverage Δ
robot-server/robot_server/runs/run_store.py  100.00% <ø> (ø)

... and 897 files with indirect coverage changes

@CaseyBatten CaseyBatten left a comment

Looks good overall, some comments below with questions and clarifications.

Base automatically changed from db_draft_migration to edge January 30, 2024 22:42