I was looking through the `WriteBatchRaw` path and I realized that we don't check for `db.Overloaded` there. So what happens when the commitlog queue is full (and all of our most important pools are empty, like `writeBatchedPoolReqBool` and `writeBatchPool`, which are both extremely expensive to allocate) is that we allocate both of them anyway, write all of the data into our in-memory data structures, and THEN try to write to the commitlog, get an error, and reject the write. If we just did a check for `db.IsOverloaded` at the beginning of the method, we could make the cost of rejecting a write while we're under heavy load go down substantially, which might allow the node to recover instead of getting stuck in a cycle with a perpetually full commitlog queue and all the goroutines / pool allocations eventually OOMing the node.

The only issue with this approach is that we would end up rejecting some writes that we previously would have accepted in scenarios where we're under a lot of load, but in practice I bet this would be a big net win for reliability even if we need to tune our queue size for production or something (especially with all the M3msg buffering and retries we have at our disposal). If we wanted to be aggressive, we could probably even inject a preflight check into the thrift server to prevent it from even allocating the thrift structs if we know we're going to reject a request.
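A minimal sketch of what hoisting that check could look like (the `database` interface, `write` struct, and `handleWriteBatchRaw` name here are illustrative stand-ins, not the actual M3DB types):

```go
package writepath

import "errors"

// errOverloaded is what a hypothetical handler would return to the client
// instead of paying for pool allocations and a doomed commitlog enqueue.
var errOverloaded = errors.New("database is overloaded, rejecting write batch")

// write is a stand-in for a single datapoint in the batch.
type write struct {
	id    string
	value float64
}

// database abstracts the pieces of the write path this sketch cares about.
type database interface {
	IsOverloaded() bool
	// WriteBatch applies the writes to memory and enqueues them to the commitlog.
	WriteBatch(writes []write) error
}

// handleWriteBatchRaw is a hypothetical version of the WriteBatchRaw handler
// with the overload check hoisted to the very top of the method, before any
// pooled request/batch structures would be allocated.
func handleWriteBatchRaw(db database, writes []write) error {
	if db.IsOverloaded() {
		// Fail fast: the commitlog queue is full, so this write would be
		// rejected anyway after we had already paid for the allocations.
		return errOverloaded
	}
	return db.WriteBatch(writes)
}
```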
Nodes are still getting OOM'd / annihilated by expensive index queries. I think we should push the context further down into the query code so that in between expensive operations, like querying a block or querying a segment, we can check whether we should even continue. Right now what happens is that an expensive query comes in and eventually times out, but M3DB keeps chugging along allocating and querying even though the user has already received the timeout error.
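A rough sketch of that idea, assuming hypothetical `block` / `segment` interfaces rather than the real index types: the request context gets threaded down and checked between each per-block and per-segment step, so a query whose caller has already timed out stops doing work instead of running to completion.

```go
package querypath

import "context"

// segment is a stand-in for an index segment that can be queried.
type segment interface {
	Query(query string) ([]string, error)
}

// block is a stand-in for an index block made up of segments.
type block interface {
	Segments() []segment
}

// queryBlocks checks ctx between every expensive step so a cancelled or
// timed-out query bails out early instead of continuing to allocate.
func queryBlocks(ctx context.Context, blocks []block, query string) ([]string, error) {
	var results []string
	for _, b := range blocks {
		// Bail out before starting the next block if the deadline passed
		// or the caller went away.
		if err := ctx.Err(); err != nil {
			return nil, err
		}
		for _, seg := range b.Segments() {
			if err := ctx.Err(); err != nil {
				return nil, err
			}
			res, err := seg.Query(query)
			if err != nil {
				return nil, err
			}
			results = append(results, res...)
		}
	}
	return results, nil
}
```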
@Haijuncao I'd love your thoughts on this since I know you did some similar work to improve Schemaless's ability to load shed
richardartoul changed the title from "Improve M3DBs ability to apply backpressure and prevent catastrophic failure when under too much load" to "Improve M3DBs ability to loadshed / apply backpressure and prevent catastrophic failure when under too much load" on Mar 17, 2019