Deadlock issue when generating CC and retrieving cohort characterization analysis list #700
Comments
This is also happening when pulling up incidence rate analysis lists: it hung for 4 minutes, and would have hung longer if @anthonysena hadn't killed some locks on the WebAPI db. @anthonysena, can you provide details on which tables you saw locks on? |
Didn't grab the details this time around but will next time this happens. |
May be related to #797 |
@anthonysena / @chrisknoll , cannot reproduce this on a local instance of the latest WebAPI + MS SQL 2017. Are you able to test the latest code in your env ( |
Bumping this: we're seeing this now in our beta environment after a lot of generation activity was put into the job logs. The way this impacts a client is that the browser only has a fixed number of active connections (in Chrome I think it is 6). As we have various background activities, these HTTP connections start to get consumed, and soon we get to a point where all the active connections are busy and all the network activity shows as 'pending'. This causes stalls in the app. BTW: this connection limit is shared across all tabs, so having 6 browser sessions open will eat up all your browser connections. We seem to see this focused around the /notifications endpoint. I'm looking at the service call:
Looking at the object that's returned, it has a lot of useless stuff for the /notifications endpoint. And I wonder if there's a way to just query only the information we need out of the batch tables directly. |
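The connection starvation described above can be sketched with a small simulation. This is only an illustrative model, not WebAPI or Atlas code: the `simulate` function and its parameters are hypothetical, the limit of 6 connections is the figure guessed at in the comment above, the 5-second poll interval is an assumption, and the 240-second response time mirrors the 4-minute hang reported earlier in this thread.

```python
def simulate(poll_interval, response_time, max_connections=6, duration=120):
    """Model one browser polling a slow endpoint, one tick per second.

    Returns (first_saturated_at, max_pending):
      first_saturated_at: the second at which a poll first had to wait
                          because every connection was busy (None if never)
      max_pending:        the largest number of queued ("pending") requests
    All values are illustrative assumptions, not measurements.
    """
    active = []            # finish times of in-flight requests
    pending = 0            # requests waiting for a free connection
    first_saturated_at = None
    max_pending = 0

    for now in range(duration):
        # Free connections whose requests have completed.
        active = [finish for finish in active if finish > now]

        # Queued requests take over any freed connections.
        while pending and len(active) < max_connections:
            active.append(now + response_time)
            pending -= 1

        # The background poller fires a new request on its interval.
        if now % poll_interval == 0:
            if len(active) < max_connections:
                active.append(now + response_time)
            else:
                pending += 1
                if first_saturated_at is None:
                    first_saturated_at = now

        max_pending = max(max_pending, pending)

    return first_saturated_at, max_pending


if __name__ == "__main__":
    print("healthy endpoint:", simulate(poll_interval=5, response_time=1))
    print("blocked endpoint:", simulate(poll_interval=5, response_time=240))
```

Under these assumptions a healthy endpoint never queues anything, while a blocked one fills all six connections with the first six polls, and the seventh poll (30 seconds in) is the first to sit in 'pending'. Every other request from the app then waits behind the same limit, which matches the stalls described above.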
@chrisknoll @anthonysena Does the issue still happen in your environment? If so, is there any way to reproduce it, or any clue as to how it could be reproduced? |
We're now on PostgreSQL as our WebAPI host, so I think we're avoiding the issue. The issue is localized to MSSQL server and how spring batch uses a sequence table to create batchIDs (but requires a full table scan). It is also hard to reproduce, because apparently table geometry matters (i.e., whether or not there's a clustered index on the table). In addition, we'll eventually only be supporting PostgreSQL as the WebAPI database, so we could just leave this as a known issue. But the real solution to this (if we want to solve it) is to try to implement a custom sequencer for spring batch, described in this PR: #834, but direct link is here: spring-projects/spring-framework#21425 |
@chrisknoll thank you for such a detailed answer. You saved me a lot of time. I see that the issue has a long history. But what should we do for now? Just close the story? |
I think we can close it here as a known issue, and hope that eventually spring batch will update their support to SQL Server 2012 or 2016. As noted in the spring-framework thread, SQL Server 2008 is reaching end of life this year, so they could drop support for that platform in a future release. |
There appears to be some kind of deadlock issue when submitting/processing a cohort characterization that leads to a deadlock when retrieving the cohort characterization list. Below is a screenshot of the request for the cohort characterization; note the 9.5 minute execution time (it did finally return results):
It is unclear whether the actual execution of the CC analysis is leading to this behavior, but it has been reproduced at least 2 times in our internal systems.
This is on an internal deployment of v2.6.0 (not from latest master).