core-service panics when database connection is still good #691

Closed
BenjaminPelletier opened this issue Jan 31, 2022 · 0 comments · Fixed by #692
Labels
bug: Software behaves incorrectly because of this issue
P0: Highest priority; blocking usage or development

#679 included a change from pingDB to getDBStats. One purpose of both routines was to verify a good connection to the database, and otherwise kill the core-service process so that the Kubernetes manager could restart it and (hopefully) restore the database connection. pingDB used PingContext to verify connectivity, whereas getDBStats checks for TotalConns == 0.

We verified that TotalConns == 0 occurs when the connection to the database is broken, but it turns out that it also occurs when the connection to the database is merely idle. Because of this, core-service dies after tens of minutes of inactivity even though everything is still working.

The simplest fix is to only warn when TotalConns == 0 and not attempt to kill core-service upon a bad database connection. If the motivating issue is detected again, the database client should instead be improved to attempt to restore the connection upon initial failure without returning an error (this will be necessary at higher request volume anyway).

BenjaminPelletier added the P0 and bug labels on Jan 31, 2022