-
Notifications
You must be signed in to change notification settings - Fork 897
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ultimate batch saving speedup #15761
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Ladas
force-pushed
the
ultimate_batch_saving_speedup
branch
from
August 8, 2017 13:29
287b062
to
fb7580e
Compare
Check if IC has serializable_keys? and transform them, this saves some computing since we don not have to use :use_ar_object just because the model has: serialize <col_name>
Add TODOs for noticed possible bugs
Use to_sym for individual keys instead of symbolize_keys!
Performance tweaks for the base saver, getting repeated logic computed in initializer. Getting record_key abrtratcion for fetching attributes of non AR records.
Perf tweaks of the batch saver: 1. Having iterator that fetches raw SQL without creationg AR objects. 2. Select on the fetched data to reduce the mem size needed. 3. Bumping batch_size to 10k, since the objects are 10x smaller than AR objects. And few small tweaks in the core saving
Perf tweaks in SQL mixin: 1. Store connection for the whole batch 2. Precalculate pg_types for faster type_casting
Fix bad comments
Fix rubocop issues
Correct symbol vs string keys
Ladas
force-pushed
the
ultimate_batch_saving_speedup
branch
from
August 9, 2017 11:15
cc703cd
to
59b7895
Compare
Move batch size to attributes with setter, so we avoid computing it multiple times.
1 task
Remove redundant symbolize_keys
Use faster inventory_object.id call
Store primary_key_value to avoid multiple method calls
This was referenced Aug 9, 2017
Use reorder for find_in_batches working properly
Checked commits Ladas/manageiq@23c6e47~...60e14d1 with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0 app/models/manager_refresh/save_collection/saver/base.rb
app/models/manager_refresh/save_collection/saver/sql_helper.rb
|
agrare
approved these changes
Aug 10, 2017
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Various micro-optimizations that are together making the 2nd refresh batch saving about 2 times faster on a bigger data loads (at about 1.4M records stored it's becoming more than 2x faster)
Graph comparing the speed with the current batch saving: