Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix unnecessary queries when batch-removing statuses, 100x faster #15387

Merged
merged 1 commit into from
Dec 22, 2020

Conversation

Gargron
Copy link
Member

@Gargron Gargron commented Dec 21, 2020

Deleting an account speed comparison:

Old New
149.5 ms/status 3.1 ms/status

@Gargron Gargron added the performance Runtime performance label Dec 21, 2020
@Gargron Gargron force-pushed the fix-batched-remove-status-perf branch 4 times, most recently from 8dd0c74 to b3689bb Compare December 21, 2020 06:47
Copy link
Contributor

@ClearlyClaire ClearlyClaire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll re-read it but so far it looks good to me.

app/services/batched_remove_status_service.rb Show resolved Hide resolved
@Gargron Gargron force-pushed the fix-batched-remove-status-perf branch from b3689bb to 2334add Compare December 22, 2020 00:35
@Gargron Gargron changed the title Fix n+1 queries in batched remove status service Fix unnecessary queries when batch-removing statuses, 100x faster Dec 22, 2020
Copy link
Contributor

@ClearlyClaire ClearlyClaire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I'm comfortable with that going into 3.3.0 without another RC first, but it looks good to me so far.

I also have made a bunch of inline comments that I believe would further improve performances, but they can reasonably be postponed to another PR, this one seems fine as is.

Comment on lines +26 to +30
# We do not batch all deletes into one to avoid having a long-running
# transaction lock the database, but we use the delete method instead
# of destroy to avoid all callbacks. We rely on foreign keys to
# cascade the delete faster without loading the associations.
statuses_and_reblogs.each(&:delete)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could have some middle-ground here and batch some deletes together?

Comment on lines +6 to +7
# Delete multiple statuses and reblogs of them as efficiently as possible
# @param [Enumerable<Status>] statuses An array of statuses
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the comment should be updated to note that some side effects aren't handled by this service and should be taken care of separately, e.g. notifications not being cleaned up.

Comment on lines +40 to +41
@tags = statuses_and_reblogs.each_with_object({}) { |s, h| h[s.id] = s.tags.map { |tag| tag.name.mb_chars.downcase } }
@json_payloads = statuses_and_reblogs.each_with_object({}) { |s, h| h[s.id] = Oj.dump(event: :delete, payload: s.id.to_s) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see, those are only used once per status, so I am unsure why it's pre-computed.

Comment on lines +69 to +74

return unless account.local?

statuses.each do |status|
FeedManager.instance.unpush_from_home(account, status)
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's only ever used in a context where the user gets deleted, I wonder if we could get rid of that completely.

Comment on lines +38 to +55
ASSOCIATIONS_WITHOUT_SIDE_EFFECTS = %w(
account_pins
aliases
conversation_mutes
conversations
custom_filters
devices
domain_blocks
featured_tags
follow_requests
identity_proofs
migrations
mute_relationships
muted_by_relationships
notifications
scheduled_statuses
status_pins
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think bookmarks could be in there too (there's Chewy tracking, but the account is gonna be deleted anyway, right?)

Comment on lines 154 to 158
@account.statuses.reorder(nil).find_in_batches do |statuses|
statuses.reject! { |status| reported_status_ids.include?(status.id) } if @options[:reserve_username]
BatchedRemoveStatusService.new.call(statuses, skip_side_effects: @options[:skip_side_effects])
statuses.reject! { |status| reported_status_ids.include?(status.id) } if keep_account_record?

BatchedRemoveStatusService.new.call(statuses, skip_side_effects: skip_side_effects?)
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if the reported_status_ids shouldn't be a where.not(id: reported_status_ids) instead and the whole thing an in_batches rather than find_in_batches.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Runtime performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants