Adding assignment-unit cache to singleton DB #441

Merged: JackUrb merged 3 commits into master from singleton-find-unit-cache on Apr 28, 2021

Conversation

@JackUrb (Contributor) commented Apr 23, 2021

Overview

Assignment.get_units is currently the largest remaining call in the Mephisto profile when running with MephistoSingletonDB. This PR removes that chunk by adding a cache for this specific call.
[Image: Screen Shot 2021-04-23 at 5 00 10 PM]

Implementation

Creates a dict in MephistoSingletonDB that maps each queried assignment_id to the list of Units returned for it. Also adds a wrapper around new_unit so that adding a new unit to a given assignment_id clears that assignment's cached value. Because we're within a singleton, this is sufficient to ensure that the local units are always up to date.
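
The caching pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `BaseDB`, `CachedSingletonDB`, `find_units`, the `find_calls` counter, and the simplified `new_unit` signature are all hypothetical stand-ins, and units are represented as plain ID strings.

```python
from typing import Dict, List


class BaseDB:
    """Hypothetical stand-in for the underlying database (not Mephisto's class)."""

    def __init__(self) -> None:
        self._units: Dict[str, List[str]] = {}
        self._next_id = 0
        self.find_calls = 0  # counts how often the slow lookup actually runs

    def find_units(self, assignment_id: str) -> List[str]:
        self.find_calls += 1
        return list(self._units.get(assignment_id, []))

    def new_unit(self, assignment_id: str) -> str:
        self._next_id += 1
        unit_id = str(self._next_id)
        self._units.setdefault(assignment_id, []).append(unit_id)
        return unit_id


class CachedSingletonDB(BaseDB):
    """Adds an assignment_id -> units cache on top of the base lookups."""

    def __init__(self) -> None:
        super().__init__()
        self._assignment_to_units: Dict[str, List[str]] = {}

    def find_units(self, assignment_id: str) -> List[str]:
        # Only hit the underlying database on a cache miss.
        if assignment_id not in self._assignment_to_units:
            self._assignment_to_units[assignment_id] = super().find_units(assignment_id)
        # Return a copy so callers cannot mutate the cached list.
        return list(self._assignment_to_units[assignment_id])

    def new_unit(self, assignment_id: str) -> str:
        unit_id = super().new_unit(assignment_id)
        # The cached list for this assignment is now stale, so drop it.
        self._assignment_to_units.pop(assignment_id, None)
        return unit_id
```

Invalidating only the affected assignment_id keeps every other cached entry warm. This relies on all writes going through the same in-process object, which is the singleton assumption stated in the description.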

Testing

In a Python shell:

>>> from mephisto.abstractions.databases.local_singleton_database import MephistoSingletonDB
>>> db = MephistoSingletonDB()
>>> from mephisto.data_model.task_run import TaskRun
>>> len(TaskRun(db, 1188).get_assignments())
300
>>> TaskRun(db, 1188).get_assignment_statuses()
{'created': 0, 'launched': 2, 'assigned': 0, 'completed': 0, 'accepted': 0, 'mixed': 0, 'rejected': 0, 'soft_rejected': 0, 'expired': 298}
>>> TaskRun(db, 1188).get_assignment_statuses()
{'created': 0, 'launched': 2, 'assigned': 0, 'completed': 0, 'accepted': 0, 'mixed': 0, 'rejected': 0, 'soft_rejected': 0, 'expired': 298}

The second call to get_assignment_statuses() returned instantly, whereas the first took about a second.
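
To put numbers on that comparison, a small timing helper along these lines could be used. It is only a sketch: `timed` is a hypothetical helper, not part of Mephisto, and the commented usage assumes the `db` and task run ID from the session above.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Call fn once and return its result with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# Hypothetical usage against the session above:
# run = TaskRun(db, 1188)
# _, first_secs = timed(run.get_assignment_statuses)
# _, second_secs = timed(run.get_assignment_statuses)
# print(first_secs, second_secs)
```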

@JackUrb requested a review from mojtaba-komeili April 23, 2021 21:33
@facebook-github-bot added the CLA Signed label Apr 23, 2021. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
sandbox: bool = True,
) -> str:
"""
Create a new unit with the given index. Raises EntryAlreadyExistsException
Contributor

Does throwing the exception happen in the super class?

Contributor Author

Yes

@codecov-commenter commented Apr 23, 2021

Codecov Report

Merging #441 (be17f06) into master (e01d0ff) will decrease coverage by 0.09%.
The diff coverage is 83.33%.

@@            Coverage Diff             @@
##           master     #441      +/-   ##
==========================================
- Coverage   66.05%   65.95%   -0.10%     
==========================================
  Files          78       78              
  Lines        7105     7118      +13     
==========================================
+ Hits         4693     4695       +2     
- Misses       2412     2423      +11     
| Impacted Files | Coverage Δ | |
|---|---|---|
| ...ephisto/abstractions/providers/mturk/mturk_unit.py | 17.88% <0.00%> (+0.23%) | ⬆️ |
| mephisto/operations/operator.py | 56.70% <66.66%> (-0.25%) | ⬇️ |
| ...abstractions/databases/local_singleton_database.py | 98.41% <92.85%> (-1.59%) | ⬇️ |
| ...tractions/architects/channels/websocket_channel.py | 79.06% <0.00%> (-3.49%) | ⬇️ |
| mephisto/abstractions/architects/mock_architect.py | 88.88% <0.00%> (-2.62%) | ⬇️ |
| mephisto/operations/supervisor.py | 78.08% <0.00%> (-0.70%) | ⬇️ |
| mephisto/data_model/task_run.py | 83.33% <0.00%> (-0.53%) | ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@@ -306,7 +306,7 @@ def _track_and_kill_runs(self):
 tracked_run.architect.shutdown()
 tracked_run.task_launcher.shutdown()
 del self._task_runs_tracked[task_run.db_id]
-time.sleep(2)
+time.sleep(10)
Contributor

Could you add this as a hardcoded constant (probably all caps) at the top of the file, or in a constants file if you are using one? A descriptive name especially helps someone debugging later see what is readily available for tweaking.
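
A sketch of what that suggestion might look like. The constant name `RUN_STATUS_POLL_TIME` and the helper `wait_between_checks` are hypothetical; the PR does not name either, and the comment describing the sleep's purpose is an assumption.

```python
import time
from typing import Callable

# Hypothetical name. Assumed meaning: seconds to sleep between passes
# of the run-tracking loop.
RUN_STATUS_POLL_TIME = 10


def wait_between_checks(sleep_fn: Callable[[float], None] = time.sleep) -> None:
    """Pause for the configured interval; sleep_fn is injectable for tests."""
    sleep_fn(RUN_STATUS_POLL_TIME)
```

With the value named at module level, changing the interval is a one-line edit at the top of the file rather than a search for a bare number inside the loop.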

@mojtaba-komeili (Contributor) left a comment

Thanks a lot Jack. Can't wait to run with new changes 😃

@JackUrb merged commit 165c692 into master Apr 28, 2021
@JackUrb deleted the singleton-find-unit-cache branch April 28, 2021 16:32
4 participants