Use exists subquery to find deleted model instance pks #791
This is an idea that may improve the performance of `get_deleted` for some of the users in #748.

The changes shift the subquery on the model in `get_deleted` to use the built-in `Exists` (only Django versions 1.11 and up are listed as supported as of now, and `Exists`
was added in this version). The subquery should work across supported backends, the test suite passes, and I've used it locally with MySQL and Postgres successfully.

As far as performance goes, things get a bit tricky, and so far I've only been able to look into query times in Postgres. There, I've seen performance changes similar to the following when running `get_deleted` on the entire `Version` queryset:

But there is another use case in which I've found more significant performance improvements (see jedie/django-reversion-compare#95). In the linked issue, the performance problems come from the following line (https://github.com/jedie/django-reversion-compare/blob/0bfe214e40933e38a5e5a94f3c6d0f56de9051be/reversion_compare/compare.py#L206):
Here, the original query must still iterate over the pks of the related model, but using `Exists` results in a significant performance improvement (in a model with a few million instances and many deletions, the query went from about 50 seconds to 1.5 milliseconds).

That being said, there are a couple of issues: