IP Leases: the IP operator does not come online automatically after node restart #76

shimpa1 · 2023-02-21T19:12:11Z

On my test provider (after a worker node restart):

single bare-metal node build
built using helm-charts
using a Helm-based RPC node

The RPC node reports "catching up: true" (as expected), and the provider pod is waiting for the RPC node to reach "catching up: false".
Meanwhile, the IP operator pod is waiting for the provider pod.

When the RPC node catches up with the top of the chain, the provider pod starts; however, the IP operator pod does not recover.

I[2023-02-21|17:43:25.749] check result                                 cmp=provider operator=ip status=503
E[2023-02-21|17:43:25.749] not yet ready                                cmp=provider cmp=waiter waitable="<*operatorclients.ipOperatorClient 0xc0018bacc0>" error="ip operator is not yet alive"
I[2023-02-21|17:43:27.751] check result                                 cmp=provider operator=ip status=503
E[2023-02-21|17:43:27.751] not yet ready                                cmp=provider cmp=waiter waitable="<*operatorclients.ipOperatorClient 0xc0018bacc0>" error="ip operator is not yet alive"

Manually restarting the IP operator pod works.

Perhaps implement a probe of some sort to check the status of the provider pod before starting the IP operator pod.

cheers,

Shimpa


andy108369 · 2023-02-21T20:13:01Z

Ideally, this should be handled on the provider side so that it can detect when the IP operator recovers.

Until then, we can see whether a livenessProbe/readinessProbe could be leveraged, so that the pod restarts when it sees that the IP operator hasn't been ready/functioning (from the provider's point of view) for longer than 10 minutes or so.

andy108369 · 2023-07-19T11:55:08Z

Moved to #105

troian added the repo/provider (Akash provider-services repo issues) and sev2 labels on Mar 1, 2023

andy108369 closed this as completed Jul 19, 2023
