
Crash #20

Closed
ReneKroon opened this issue Jun 17, 2019 · 3 comments

Comments

@ReneKroon
Contributor

```
Jun 17 03:32:09 pro-boqs-app-003 boqs: goroutine 484637918 [running]:
Jun 17 03:32:09 pro-boqs-app-003 boqs: net/http.(*conn).serve.func1(0xc029a32a00)
Jun 17 03:32:09 pro-boqs-app-003 boqs: /usr/local/go/src/net/http/server.go:1769 +0x139
Jun 17 03:32:09 pro-boqs-app-003 boqs: panic(0xc8b580, 0x1517740)
Jun 17 03:32:09 pro-boqs-app-003 boqs: /usr/local/go/src/runtime/panic.go:522 +0x1b5
Jun 17 03:32:09 pro-boqs-app-003 boqs: github.com/bolcom/boqs/vendor/github.com/ReneKroon/ttlcache.priorityQueue.Less(0xc0009cd800, 0x47, 0x100, 0x47, 0x23, 0x5d06ed99)
Jun 17 03:32:09 pro-boqs-app-003 boqs: /go/src/github.com/bolcom/boqs/vendor/github.com/ReneKroon/ttlcache/priority_queue.go:43 +0x16e
Jun 17 03:32:09 pro-boqs-app-003 boqs: container/heap.up(0xef6c60, 0xc000968480, 0x47)
Jun 17 03:32:09 pro-boqs-app-003 boqs: /usr/local/go/src/container/heap/heap.go:93 +0x9f
Jun 17 03:32:09 pro-boqs-app-003 boqs: container/heap.Fix(0xef6c60, 0xc000968480, 0x47)
Jun 17 03:32:09 pro-boqs-app-003 boqs: /usr/local/go/src/container/heap/heap.go:86 +0x94
Jun 17 03:32:09 pro-boqs-app-003 boqs: github.com/bolcom/boqs/vendor/github.com/ReneKroon/ttlcache.(*priorityQueue).update(...)
Jun 17 03:32:09 pro-boqs-app-003 boqs: /go/src/github.com/bolcom/boqs/vendor/github.com/ReneKroon/ttlcache/priority_queue.go:18
Jun 17 03:32:09 pro-boqs-app-003 boqs: github.com/bolcom/boqs/vendor/github.com/ReneKroon/ttlcache.(*Cache).getItem(0xc0001852d0, 0xc0162dbd60, 0x1c, 0x0, 0xed0f58cf0)
Jun 17 03:32:09 pro-boqs-app-003 boqs: /go/src/github.com/bolcom/boqs/vendor/github.com/ReneKroon/ttlcache/cache.go:45 +0xeb
Jun 17 03:32:09 pro-boqs-app-003 boqs: github.com/bolcom/boqs/vendor/github.com/ReneKroon/ttlcache.(*Cache).Get(0xc0001852d0, 0xc0162dbd60, 0x1c, 0x0, 0x0, 0xeeea00)
Jun 17 03:32:09 pro-boqs-app-003 boqs: /go/src/github.com/bolcom/boqs/vendor/github.com/ReneKroon/ttlcache/cache.go:173 +0x50
Jun 17 03:32:09 pro-boqs-app-003 boqs: github.com/bolcom/boqs/internal/dao/cluster.(*PgTopologyCache).SubscribedQueues(0xc000972200, 0xc0162dbd60, 0x1c, 0x183ad9b0, 0xed176645f, 0x152b840, 0xc034089290, 0x3, 0x0, 0x120, ...)
```
@ReneKroon
Contributor Author

The crash happened in a configuration that was not explicitly tested. That configuration has been added to the tests.

As it turns out, the switch to RWLock was not a good one: there are data races all over the place when the TTL is extended on a cache hit. Another issue comes from

cache.expirationNotification <- true

This channel is meant to be asynchronous, but the send blocks, and in conjunction with the locks taken in the Get/Set calls and the cleanup routine it can therefore cause blocking conditions.
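
A common way around this, sketched below under assumed names (the struct and method here are illustrative, not the actual ttlcache internals), is a buffered channel combined with a non-blocking send, so signalling the cleanup routine can never stall a caller that still holds the lock:

```go
package main

import (
	"fmt"
	"sync"
)

// Illustrative sketch only: the field and method names are assumptions,
// not the real ttlcache code.
type cache struct {
	mu                     sync.Mutex
	expirationNotification chan bool // buffered: make(chan bool, 1)
}

// notifyExpiration signals the cleanup goroutine without ever blocking,
// even if the caller still holds c.mu.
func (c *cache) notifyExpiration() {
	select {
	case c.expirationNotification <- true:
	default: // a notification is already pending; dropping this one is safe
	}
}

func main() {
	c := &cache{expirationNotification: make(chan bool, 1)}
	c.notifyExpiration()
	c.notifyExpiration() // a plain unbuffered send here could block forever
	fmt.Println("pending notifications:", len(c.expirationNotification))
}
```

With a capacity of one plus the select/default, repeated notifications coalesce into a single pending signal instead of piling up or blocking.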

All races have been fixed.

@ReneKroon ReneKroon mentioned this issue Jun 17, 2019
@froodian
Contributor

Sorry to introduce instability; thanks for the fix.

@ReneKroon
Contributor Author

No problem, as I didn't see the issues in the change either, but it did surface some design guidelines:

  • Since there are no recursive locks, locking should always be done around whole 'top-level' operations in the Cache struct.
  • Some parts of the code released a lock and then acquired a second one during the same operation. This is also a bad pattern, as it effectively invalidates the data retrieved under the first lock (see the sketch after this list).
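
For illustration only, here is a minimal sketch of both guidelines; the names (Cache, items, Get) are hypothetical and not the actual ttlcache code. One mutex is held across the whole top-level operation rather than being unlocked and re-acquired part-way through:

```go
package main

import "sync"

// Hypothetical cache used only to illustrate the locking guidelines;
// this is not the real ttlcache implementation.
type Cache struct {
	mu    sync.Mutex
	items map[string]string
}

// Get holds the lock across the whole top-level operation, so the lookup and
// any TTL/bookkeeping update see a consistent view of the entry.
func (c *Cache) Get(key string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()

	v, ok := c.items[key]
	if ok {
		// Extend the TTL, fix the heap position, etc., still under the same lock.
		// Anti-pattern: unlocking after the lookup and taking a second lock to do
		// this update invalidates what was just read, because another goroutine
		// may have modified or expired the entry in between.
	}
	return v, ok
}

func main() {
	c := &Cache{items: map[string]string{"k": "v"}}
	if v, ok := c.Get("k"); ok {
		println(v)
	}
}
```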
