deleteBy operation easily triggers OOM and has horrible performance #3177
How would that help? JPA entities are attached to the session, and generally, paging and batch sizes are subject to tuning, so all the simplicity would evaporate.
You might be caught by surprise, but specifically, JPA can cause cascading deletes, resulting in multiple operations for a single object being deleted. However, batching in the sense of Criteria Delete queries could leverage optimizations within the Persistence Provider.
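For illustration, a minimal (hypothetical) mapping where deleting one object fans out into several delete operations; the `Parent`/`Child` names are made up and not from this thread:

```java
import jakarta.persistence.*;
import java.util.ArrayList;
import java.util.List;

@Entity
class Parent {
    @Id @GeneratedValue
    Long id;

    // Cascaded removal: removing one Parent also removes each of its children,
    // so a single logical delete turns into several DELETE statements.
    @OneToMany(mappedBy = "parent", cascade = CascadeType.REMOVE)
    List<Child> children = new ArrayList<>();
}

@Entity
class Child {
    @Id @GeneratedValue
    Long id;

    @ManyToOne
    Parent parent;

    @PreRemove
    void beforeRemove() {
        // lifecycle callback that a single bulk DELETE statement would never fire
    }
}
```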
Any reason you do not use
The rationale for this design is to be able to get hold of the deleted elements. Also, JPA lifecycle events are honored. Part of the problem is that we use Criteria Queries to build the query object; Update and Delete Criteria do not share the same base types to create a query through. I would argue that using JPA with bulk data is intricate on its own, because the amount of data you can reliably process is constrained by the transactional capacity of your system.
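For context, a rough sketch (not the Spring Data internals) of what a bulk delete through the Criteria API looks like when issued directly against the `EntityManager`; the `Account` entity and its `status` attribute are made-up names:

```java
import jakarta.persistence.EntityManager;
import jakarta.persistence.criteria.CriteriaBuilder;
import jakarta.persistence.criteria.CriteriaDelete;
import jakarta.persistence.criteria.Root;

int bulkDeleteExpired(EntityManager em) {
    CriteriaBuilder cb = em.getCriteriaBuilder();

    // CriteriaDelete is a separate hierarchy from CriteriaQuery (select),
    // which is part of why the derived execution cannot simply be switched over.
    CriteriaDelete<Account> delete = cb.createCriteriaDelete(Account.class);
    Root<Account> root = delete.from(Account.class);
    delete.where(cb.equal(root.get("status"), "EXPIRED"));

    // A single DELETE statement: bypasses the persistence context,
    // cascades and entity lifecycle events.
    return em.createQuery(delete).executeUpdate();
}
```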
Good point, and I get the cascading/optimisation benefits JPA brings, though this is sort of assuming we want to care about cascades and events. In our case we had a very simple table: no `@OneToMany` etc. We worked around it with `@Modifying @Query("DELETE...")` as you mentioned, but what annoyed me most is that I have no way of stopping or warning other devs that they are about to introduce behaviour they're not expecting. It feels like, even though we have a way of working around it, our options are:
There's no option to do something like:
to avoid writing HQL. Here's a graph of what happened when we made the `@Query("DELETE...")` change, just as an indication of how badly this default hit us. Happy to hear different opinions on the matter, obviously. We have a workaround in place, but I'd be OK with opening an MR if you think this is something worth pursuing.
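A minimal sketch of the two variants being compared here, assuming a hypothetical `Event` entity with an `expiresAt` property (all names are placeholders, not from the original report):

```java
import java.time.Instant;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Modifying;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;
import org.springframework.transaction.annotation.Transactional;

public interface EventRepository extends JpaRepository<Event, Long> {

    // Derived delete: SELECTs every matching entity, then removes them one by one
    // (cascades and lifecycle events fire; memory usage grows with the result set).
    void deleteByExpiresAtBefore(Instant cutoff);

    // Workaround: a single bulk DELETE statement, no entities loaded,
    // no cascades and no lifecycle events.
    @Modifying
    @Transactional
    @Query("delete from Event e where e.expiresAt < :cutoff")
    int deleteExpired(@Param("cutoff") Instant cutoff);
}
```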
If the entity has no cascades and events, then a bulk delete should be used instead of fetching everything and deleting one by one.
Any updates on this issue?
Likely, the only thing we can do here is add a bit of documentation to make its effects more obvious. |
Hello,
We recently had a production incident caused by this behaviour in a repository method:
Digging into the source code, this leads to:
spring-data-jpa/spring-data-jpa/src/main/java/org/springframework/data/jpa/repository/query/JpaQueryExecution.java
Line 295 in 7cdf53f
In our case the contents of that table were large enough that it caused an OutOfMemoryError.
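For readers following along, a simplified sketch (not the actual Spring Data source) of what that execution path effectively does; `Event`, `expiresAt` and `cutoff` are placeholder names:

```java
import jakarta.persistence.EntityManager;
import java.time.Instant;
import java.util.List;

void derivedDelete(EntityManager em, Instant cutoff) {
    // Step 1: the entire matching result set is materialised as managed entities,
    // which is what blew the heap for a large table.
    List<Event> matches = em.createQuery(
            "select e from Event e where e.expiresAt < :cutoff", Event.class)
        .setParameter("cutoff", cutoff)
        .getResultList();

    // Step 2: each entity is removed individually so cascades and lifecycle
    // events fire, at the cost of at least one DELETE per row.
    for (Event match : matches) {
        em.remove(match);
    }
}
```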
I would like to argue that the default behaviour is poor:
I know why it was done that way (the docs explain it; it's to fire off events "on delete"), but IMHO it violates the principle of least surprise. Looking around at our microservices, I can see multiple developers making the same wrong assumption all over the place.
At the very least, this should be done in a paging style (sketched below), though I'd much rather have it so that you'd have to add some config/annotation to indicate that you care about such events.
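One possible shape of the paging idea, sketched under the assumption that the repository exposes a `Slice`-returning finder such as `findByExpiresAtBefore(Instant, Pageable)` (all names are hypothetical). In practice each chunk would also need its own transaction, or a flush/clear, so the persistence context doesn't grow without bound:

```java
import java.time.Instant;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Slice;

public void deleteExpiredInChunks(EventRepository repository, Instant cutoff) {
    Slice<Event> chunk;
    do {
        // Always fetch "page 0": deleting the previous chunk shifts the remaining rows forward.
        chunk = repository.findByExpiresAtBefore(cutoff, PageRequest.of(0, 500));

        // deleteAll(...) still removes entities one by one, so lifecycle events keep firing,
        // but memory usage is bounded by the chunk size.
        repository.deleteAll(chunk.getContent());
    } while (chunk.hasNext());
}
```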
Thoughts?