bungle added a commit to bungle/lua-cassandra that referenced this issue on Oct 6, 2020
### Summary
On thibaultcha#137, @chris-branch reported that `cluster:refresh` did not always release a lock
on error cases. This commit fixes that.
### Issues Resolved
Fix thibaultcha#137
I'm trying to track down the source of an error message that appears periodically in the logs for my Kong cluster (backed by Cassandra). The message looks like this:
connector.lua:269: [cassandra] failed to refresh cluster topology: timeout, context: ngx.timer
I believe this is coming from here: https://github.com/Kong/kong/blob/master/kong/db/strategies/cassandra/connector.lua#L273
which ends up calling `_Cluster:refresh()` in lua-cassandra. While reviewing the code, I noticed that `_Cluster:refresh()` obtains a shared lock, but the function has several exit points that do not release it. In particular, there are 6 places where an error return can occur without `lock:unlock()` being called. Example: https://github.com/thibaultcha/lua-cassandra/blob/master/lib/resty/cassandra/cluster.lua#L583
The lock will auto-release after the 60-second timeout, but any callers may block until that happens.
Also, it would be helpful if `_Cluster:refresh()` could log a warning/error message for each of the early returns to help diagnose the root cause of any failures.
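For illustration, here is a minimal sketch of one way to guarantee the lock is released on every exit path and to log each early return. It is not the actual lua-cassandra code: `new_mock_lock`, `refresh_with_lock`, `steps`, and `log` are hypothetical names, and the mock lock is only loosely modeled on a lua-resty-lock style object so that the example runs in plain Lua. In OpenResty, the lock would presumably come from lua-resty-lock and `log` would wrap `ngx.log`.

```lua
-- Hypothetical stand-in for a shared lock object, so the sketch runs
-- in plain Lua without OpenResty.
local function new_mock_lock()
  local self = { held = false }

  function self:lock(key)
    if self.held then
      return nil, "locked"
    end
    self.held = true
    return 0 -- time spent waiting for the lock
  end

  function self:unlock()
    if not self.held then
      return nil, "unlocked"
    end
    self.held = false
    return 1
  end

  return self
end

-- Hypothetical refresh routine. `steps` is a list of functions that each
-- return `ok, err`, standing in for the fallible parts of a refresh.
local function refresh_with_lock(lock, steps, log)
  local elapsed, err = lock:lock("refresh")
  if not elapsed then
    return nil, "failed to acquire refresh lock: " .. tostring(err)
  end

  -- All fallible work lives in one inner function, so every early
  -- return passes through the single unlock call below.
  local function do_refresh()
    for i, step in ipairs(steps) do
      local ok, step_err = step()
      if not ok then
        log("refresh step " .. i .. " failed: " .. tostring(step_err))
        return nil, step_err
      end
    end
    return true
  end

  -- pcall also covers a thrown error, not only `nil, err` returns.
  local pok, ok, refresh_err = pcall(do_refresh)

  local unlocked, unlock_err = lock:unlock()
  if not unlocked then
    log("failed to release refresh lock: " .. tostring(unlock_err))
  end

  if not pok then
    -- On a thrown error, pcall's second result is the error value.
    return nil, "refresh raised an error: " .. tostring(ok)
  end
  return ok, refresh_err
end

-- Usage: the second step fails, yet the lock is released.
local messages = {}
local function log(msg)
  messages[#messages + 1] = msg
end

local lock = new_mock_lock()
local ok, err = refresh_with_lock(lock, {
  function() return true end,
  function() return nil, "timeout" end,
}, log)
print(ok, err, lock.held)
```

With this shape, adding a new error return inside `do_refresh` cannot leak the lock, and each failure leaves a log line naming the step that failed.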