Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invocations of ContentStore methods with PropertyPath argument are not optimistically locked by spring-versions-jpa #1023

Open
kreymerman opened this issue Aug 8, 2022 · 9 comments · Fixed by #1028

Comments

@kreymerman
Copy link
Contributor

kreymerman commented Aug 8, 2022

Because OptimisticLockingInterceptor
checks only a certain list methods (without PropertyPath argument), invocations of ContentStore methods with PropertyPath argument are not optimistically locked.

In addition, I would like to take this opportunity to ask whether it is necessary to call EntityManager::merge in the OptimisticLockingInterceptor::lock method.
This can sometimes have unexpected consequences: I wanted to automatically remove the associated content in s3 storage, but only after the entity has been successfully removed from the database. But if I call Content::unsetContent with an Entity that has already been removed from the database, merge causes it to be recreated.
Sorry for asking this question here, I didn't make a separate issue for this because this is a very specific and probably unsupported case and there is a workaround for it (using ((DeletableResource)contentStore.getResource(...)).delete()), but maybe merge isn't really needed here and can be removed?

@paulcwarren
Copy link
Owner

Hi @kreymerman, thanks for raising.

Regards the optimisticlockinginterceptor. Sounds pretty easy to update that interceptor with the new methods. That was an oversight on my part.

Regards the merge. I am really not sure if that can be removed, or not. Have to give it a go and see. I suspect not. Although, I am not a JPA expert so I am not convinced I'd be able to explain exactly why. I'll put a PR out there to investigate.

@paulcwarren
Copy link
Owner

Hi @kreymerman, looks the PR to add the additional property path methods to the optimistic lock interceptor is green. So I am going to merge that. Any chance you test out the SNAPSHOT to make sure that it meets your needs?

@paulcwarren
Copy link
Owner

paulcwarren commented Aug 10, 2022

Looks like the PR to investigate removing the merge failed but right at the end in the cmis examples. That might be a false negative although those tests have been very reliable. Will investigate more tomorrow.

@kreymerman
Copy link
Contributor Author

kreymerman commented Aug 11, 2022

Hi, @paulcwarren , thanks a lot, I tested the SNAPSHOT and interception seems to work (e.g. when I use Spring Content REST and update just the content of a File Entity e.g. with PUT /files/{ID}/content (even if the content has the same size as before and the corresponding Entity fields such as contentLength are not updated) the @Version field is updated anyway).

As for the merge call:
I investigated this too and it is not surprising that to lock an entity it must be in a managed state (or this error "entity not in the persistence context" will occur). Calling merge() of course ensures that entity is always added to current persistence context and then locked. But when calling ContentStore methods e.g. with new (transient), detached or already deleted entities this will lead to unexpected consequences. For some reason I assumed, that all entities are already managed when using Spring Content REST and CMIS, and therefore always calling merge() and locking could be replaced with

        if (em.contains(entity)) { // check whether entity is in a managed state (within a persistence context)
            em.lock(entity, lockMode);
        }

safely, but it turns out, that this is not the case. I first tried to remove only the merge call and to do some tests with Spring Content REST (because I haven't used CMIS yet) and e.g. PUT example above (handled by internal.org.springframework.content.rest.contentservice.ContentStoreContentService#setContent) will fail too with the error "entity not in the persistence context". And if I only lock managed entities (as in code snippet above) locking doesn't happen here at all of course. This is due to domainObject, that was previously loaded from DB in the same thread is no longer in the persistence context associated with OptimisticLockingIntercepter (EntityManager::contains returns false), because in Spring Content REST 1) EntityManager is not thread-bound for the entire request, but recreated for every transaction and 2) every Repository or ContentStore operation starts new transaction.

Most likely error in CMIS tests has the same source. If there is no other (clean) way to ensure that Spring Content REST and CMIS only pass managed entities in ContentStore methods (e.g. by loading entities first and using the the same persistence context for content operations too with request-scoped EntityManager or with single transaction for the whole operation), then merge() probably cannot be removed.

@paulcwarren
Copy link
Owner

paulcwarren commented Aug 18, 2022

Hi @kreymerman , re SNAPSHOT testing. Great. Thanks for the update.

Thanks for the analysis on the merge issue. In general I have just followed SD/SDR lead here where repository methods are marked @Transactional and therefore participate in an existing transaction or start one if none exists.

Industry best practices kinda suggest (I say kinda b/c its all subjective) that only the service layer should be transactional. Side note: I have never quite been sure how to rationalize that with SD's implementation where repository methods are transactional which means the persistence layer is transactional (if none already exists) but there you are. Anyways, SDR doesn't really have a service layer. There is no business logic per se. Controllers call Repository methods directly.

With SCR one could argue there is a more defined service layer with its ContentService abstraction. There is definitely "logic" in that layer (is it business logic?). So, one possible option is to make those ContentService methods @Transactional (which would probably imply that the ContentService's need to become beans too - currently they are not). ContentStore methods should be fine. They'll just join this existing transaction. But events suddenly also become transactional. If that even works, it feels like a big, scary change to me. But it that might be investigating? I will try and take a look at that but it probably wont be for a bit. So, if you are keen. Any help would be greatly appreciated.

The other option is to try a refactor that ties the entity manager to the request thread somehow? I am not sure what that would look like, or it is also even possible.

@kreymerman
Copy link
Contributor Author

kreymerman commented Aug 19, 2022

The other option is to try a refactor that ties the entity manager to the request thread somehow? I am not sure what that would look like, or it is also even possible.

I have done more investigation around this topic and that's actually exactly what OpenSessionInView(OSIV)-Pattern (implemented in Spring e.g. with OpenSessionInViewInterceptor mentioned in the stack overflow response you cited is doing.

By the way Spring Data REST adds OpenSessionInViewInterceptor by default in org.springframework.data.rest.webmvc.support.JpaHelper and it is not disabled even by the option spring.jpa.open-in-view=false.
Spring Content REST on the other hand does not add OpenSessionInViewInterceptor at all, but this is good, because

THE PROBLEM IS when we have both database IO and file/network IO during one request in Spring Content REST (as with Spring Content Filesystem and especially with Spring Content S3 and other object storage backends because they use potentially slow or just long-running (like uploading a large file) remote calls) then using OSIV pattern or mixing database IOs with other types of IOs in one transaction is a bad smell e.g. because it can lead to running out of database connections if non-database IO takes a long time (https://www.baeldung.com/spring-open-session-in-view#1-avoiding-mixed-ios explains this situation well).

Therefore from this point of view

option is to make those ContentService methods @transactional (which would probably imply that the ContentService's need to become beans too - currently they are not). ContentStore methods should be fine. They'll just join this existing transaction. But events suddenly also become transactional. If that even works, it feels like a big, scary change to me.

could work well when used with Spring Content JPA (only database IO; less transactions during the whole request, because no separate transactions for every Repository/ContentStore-method and merges with additional queries are needed). The fact that events will also become transactional is of course a major and maybe a breaking change. But usually if a rollback occurs one would want to roll back the database changes made in event handlers as well. But yes, of course it's better to be able to control whether event handling is executed in a separate transaction or not, e.g. with annotating the handlers with @transactional(propagation = Propagation.REQUIRES_NEW).

The problem is that both OSIV, executing the whole operation inside a transaction (e.g. by making ContentService methods @Transactional) and even @Transactional ContentStore methods are bad for most other backends (I am especially interested in using Spring Content S3) because different types of IO are mixed.

For this reason I think that ContentStore-Methods should not be transactional at all, because it is not very necessary for Spring Content JPA, may be harmful for other backends and causes the entity to be automatically saved in the database when Content-Store methods are called. In combination with merge() call in OptimisticLockingInterceptor new, detached, or deleted entities are unexpectedly saved too.

And actually when used e.g. with Spring Content REST ContentStore-Methods are not transactional despite @Transactional Annotation unless Spring Versions JPA is in classpath too, which adds TransactionInterceptor in internal.org.springframework.versions.jpa.JpaLockingAndVersioningProxyFactoryImpl#addTransactionAdviceIfNeeded)

Therefore transactional ContentStore methods appear to be only needed for the current implementation of OptimisticLockingInterceptor (due to EntityManager::lock can only work in a transactional context). Correct me please if I am wrong.
And after some more analysis it seems to me that on the one hand the current optimistic locking approach still allows some racing conditions and on the other hand I think it is possible to implement optimistic locking without making all ContentStore-methods transactional.

With the current OptimisticLocking implementation the following issue can occur for example when two clients A and B call ContentStore:setContent concurrently on the same File Entity with associated content in S3 or in filesystem storage:

  • A fetches File entity from DB with version 0
  • A calls setContent
  • A starts new transaction for setContent
  • A calls merge
  • A locks the entity optimistically (maybe not really needed as version is incremented manually in touch()-method on this entity later anyway, but it probably may be replaced with LockModeType.OPTIMISTIC_FORCE_INCREMENT)
  • A starts writing content (but not finished with it) in a transactional context (which should be avoided with these backends)
  • B fetches the same File entity from DB with version 0
  • B calls setContent
  • B starts new transaction for setContent
  • B calls merge (database has still version 0, no OptimisticLockingException is thrown)
  • B locks the entity optimistically
  • B writes content of smaller size and finishes with it (In case of the filesystem backend, file is already corrupted due to concurrent writes, but with atomic S3 backend B's content is uploaded successfully)
  • B updates all relevant fields, e.g. contentLength, increments version
  • B commits setContent-Transaction and entity is saved to DB with version 1 (even before FileRepository::save is called)
  • A successfully finishes writing content (whereby with atomic S3 backend A's content replaces B's content)
  • A updates all relevant fields, e.g. contentLength, increments version
  • A tries to commit, but fails with OptimisticLockingException and transaction is rolled back
    But we already changed content in S3 and therefore e.g. contentLengh in DB does not match anymore.

So when using optimistic locking with most storage backends except Spring Content JPA (although I haven't worked with it, maybe there are similar pitfalls too) it may be undesirable to write content 1) in-place 2) inside a transaction and a rollback mechanism for content would be useful too.

In fact, I'm not an expert here at all either, but I can try to sketch an optimistic locking variant where this particular problem will be solved for a specific Entity class, and then you could evaluate if this solution is valid and whether you can add either a generalized variant or appropriate extension points for alternative optimistic locking implementations to Spring Content REST and Spring Content JPA.

@paulcwarren
Copy link
Owner

Hi @kreymerman , apologies for my tardy response times. Thanks for the detailed info. Super helpful.

Please do evaluate some approaches. That would be greatly appreciated. A few quick comments but I will try and craft a longer response at a later date.

General speaking this was all designed to work on top of Jpa storage. That is absolutely the case. I should probably make that clearer in the docs. More on that in a second.

In general Content Management Systems don't support optimistic locking. Only pessimistic. Hence the LockingAndVersioningRepository extension. I anticipated this would be the API that most folks would use when needing a locking solution.

So the original idea to try and support optimistic locking came from a chatting to someone at a Spring conference. He worked for Boston Globe and some very very large publishing house (whose name I cant remember now). He had some very, very aggressive performance targets that were impossible to achieve with traditional ECMs. His plan was to use Spring Data Redis/Spring Content JPA on top of Redis (somehow?) with optimistic locking. We were planning to collaborate on this but I don't think he ever actually got around to trying it all out. But that the genesis story for context.

Random other thoughts:

  • this is essentially a distributed transaction problem and they are really hard!
  • Spring Content JPA uses various database blob api's to store the content and some/all of those must run inside a transaction so removing @Transactional from ContentStore methods is probably untenable from that perspective.
  • I am not sure I understand your example since the optimistic lock is taken on the way into calling the setContent method

@kreymerman
Copy link
Contributor Author

kreymerman commented Sep 4, 2022

Hi @paulcwarren , thank you very much for the detailed explanation!

Sorry for the late reply, unfortunately I am distracted by a completely different project that locks up all the time, but I hope to get back to this question again soon.

Spring Content JPA uses various database blob api's to store the content and some/all of those must run inside a transaction so removing @Transactional from ContentStore methods is probably untenable from that perspective.

You're right, of course, about the necessity to save content to dаtabase within a transaction. When I said "it is not very necessary for Spring Content JPA" (to have transactional ContentStore-methods) I meant that for maximum efficiency and correctness with Spring Content JPA it makes sense to perform all database operations of one (API) request in one transaction anyway, so it is not mandatory for DefaultJpaStoreImpl to have transactional methods too. But they do not harm either and DefaultJpaStoreImpl methods can remain transactional (similar as they are on Spring Data JPA Repository methods), because if there is an "outer" transaction, they will participate in it and not create new ones.

But for most other implementations this could be an antipattern.
And it seems to me that other ContenStore-implementations have transactional methods just to make optimistic locking work (but I may be wrong). Since you say that current implementation of optimistic locking is designed to work only on top of Jpa storage, maybe it should be automatically enabled only for Spring Content JPA, and for other storage backends we could explore another implementation options, which will work without transactional ContentStore-methods and will avoid the racing condition, that I described in my previous message.

I am not sure I understand your example since the optimistic lock is taken on the way into calling the setContent method

As far as I understand, optimistic locking doesn't take a lock and doesn't check for it when for example EntityManager::lock(entity, LockModeType.OPTIMISTIC) is called, so execution continues and the version check happens at end of transaction or after it (either by direct query or by adding WHERE version = <current known version> to DML statements and the following check of the number of changed lines). If version doesn't match, then rollback happens. But in s3 and in the file system however the changes have already been done and are not rolled back.

Moreover, it turns out that in this scenario when using S3 (where writing objects is atomic and whoever finishes writing later wins), the current implementation of OptimisticLockingInterceptor actually leads to a non-consistent state (without it at least the content which client A was uploading would have been successfully saved and the metadata in the db would match it).

@paulcwarren
Copy link
Owner

paulcwarren commented Sep 20, 2022

Hi @kreymerman

A few comments:

DefaultJpaStoreImpl methods can remain transactional

They would have to. Otherwise a User using SC JPA with SCR (which doesnt create an outer txn) would see "not in txn" failures.

maybe it should be automatically enabled only for Spring Content JPA, and for other storage backends we could explore another implementation options

Yes. Perhaps we should consider turning on the OptimisticLockInteceptor for SC JPA stores only?

As far as I understand, optimistic locking doesn't take a lock and doesn't check for it when for example
EntityManager::lock(entity, LockModeType.OPTIMISTIC) is called

Now you explain it to me that sounds right. So, I retract my previous statement. You may well be right that there is an issue here. It would be nice to prove that more that theoretically but anyway that backs up the suggestion that we should enable optimistic locking for SC JPA for now. And investigate other solutions for the other stores.

Thanks for your continued help with this. It is really useful to have someone thinking more deeply about this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants