server: Investigate ECONNRESET #271
Comments
So it seems like the nomidotwatcher LoadBalancer Service is abruptly cutting the connection: https://stackoverflow.com/questions/17245881/how-do-i-debug-error-econnreset-in-node-js#17637900 Indeed, there seem to be some spikes in resource consumption around the periods when the problem occurs.
OK, so as we discussed on Riot (kubernetes/kubernetes#79365 (comment)), it looks very likely that our issue is that the nodewatcher is a pod made up of multiple containers, since we use the side-car model for GCP. According to that GitHub comment, this means we need to explicitly set the resources we need. Otherwise, we get this type of error: https://matrix.parity.io/_matrix/media/r0/download/matrix.parity.io/BlNsHgxHVFckekECbDLzOeYL
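To make the "explicitly set the resources" point concrete, here is a minimal sketch of a check that lists which containers in a multi-container pod have no resource requests. The `PodSpec` and `Container` types are a simplified, hypothetical subset of the real Kubernetes pod spec, and the container names and values in the example are made up for illustration, not taken from our actual deployment.

```typescript
// Simplified, hypothetical subset of a Kubernetes pod spec; only the
// fields needed for this check are modelled.
interface ResourceList {
  cpu?: string;
  memory?: string;
}

interface Container {
  name: string;
  resources?: {
    requests?: ResourceList;
    limits?: ResourceList;
  };
}

interface PodSpec {
  containers: Container[];
}

// Returns the names of containers that do not cover both cpu and memory.
// A limit without a request counts as covered, because Kubernetes uses
// the limit as the request when only the limit is given.
function containersMissingRequests(pod: PodSpec): string[] {
  return pod.containers
    .filter((c) => {
      const requests = c.resources?.requests ?? {};
      const limits = c.resources?.limits ?? {};
      const hasCpu = Boolean(requests.cpu ?? limits.cpu);
      const hasMemory = Boolean(requests.memory ?? limits.memory);
      return !(hasCpu && hasMemory);
    })
    .map((c) => c.name);
}

// Hypothetical two-container pod: the main container sets requests,
// the side-car does not (names and values are assumptions).
const nodewatcherPod: PodSpec = {
  containers: [
    {
      name: "nodewatcher",
      resources: { requests: { cpu: "250m", memory: "512Mi" } },
    },
    { name: "sidecar" },
  ],
};

console.log(containersMissingRequests(nodewatcherPod)); // [ 'sidecar' ]
```

Any container this reports would need its own `resources.requests` (or limits) in the deployment manifest.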
Nodewatcher recently got a couple of pods evicted. The GCP console says:
Pod describe says the same:
That's on the last 50k. The running pod is healthy as usual and not showing any errors. I'm worried about the "exceeds its request of 0." though.
frontend ==ExternalIP==> LoadBalancer ====> Server Deployment ==ClusterIP==> Prisma (Node Watcher), port 4466 (problem area here)
The error comes from the server => prisma networking. So far we attempted:
Neither seems to have actually solved the underlying issue.
Trying this now:
3. scale the nodewatcher (double replicas)