healthcheck probe fails to detect failure on loading keys #715

Closed
jmcruz1983 opened this issue Feb 15, 2023 · 0 comments

jmcruz1983 commented Feb 15, 2023

When Web3Signer is unable to load its keys,
the healthcheck still returns a healthy response.

I suggest adding an optional query parameter, e.g. /healthcheck?expectedLoadedKeys=100,
so the expected number of keys can be compared with the number of keys actually loaded from the key vault / secret manager.

Because the parameter is optional, the existing /healthcheck endpoint would stay backwards compatible.

That way a liveness probe could restart Web3Signer whenever it fails to load keys,
preventing the subsequent 404 responses for keys not found, which cause missed attestations down the line.
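
To make the suggestion concrete, here is a minimal sketch of the comparison in Python. It is not Web3Signer code: the function name `healthcheck_status`, the `keys-check` id, the exact-match comparison, and the 400/503 status codes are all assumptions for illustration. Only the `expectedLoadedKeys` query parameter comes from the suggestion above.

```python
from typing import Any, Dict, Tuple
from urllib.parse import parse_qs, urlparse


def healthcheck_status(url: str, loaded_keys: int) -> Tuple[int, Dict[str, Any]]:
    """Return (HTTP status, body) for a health check request.

    Hypothetical sketch: `loaded_keys` stands for the number of keys the
    signer actually loaded at startup.
    """
    query = parse_qs(urlparse(url).query)
    raw = query.get("expectedLoadedKeys", [None])[0]

    # No parameter supplied: behave exactly as today (backwards compatible).
    if raw is None:
        return 200, {"status": "UP"}

    try:
        expected = int(raw)
    except ValueError:
        # Assumption: a malformed parameter is reported as a client error.
        return 400, {"status": "DOWN", "error": "invalid expectedLoadedKeys"}

    # Assumption: an exact match is required; ">=" would be another option.
    healthy = loaded_keys == expected
    status = "UP" if healthy else "DOWN"
    body = {
        "status": status,
        "checks": [{
            "id": "keys-check",  # hypothetical check id
            "status": status,
            "data": {"expected": expected, "loaded": loaded_keys},
        }],
    }
    # Assumption: 503 is used so that an HTTP liveness probe fails.
    return (200 if healthy else 503), body
```

With zero keys loaded, as in the logs in this report, `healthcheck_status("/healthcheck?expectedLoadedKeys=100", 0)` would return 503 in this sketch, while a plain `/healthcheck` request would still return 200.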

See the following log lines:

2023-02-07 13:27:14.678+00:00 | main | INFO  | Web3SignerApp | Web3Signer has started with args eth2
2023-02-07 13:27:14.699+00:00 | main | INFO  | Web3SignerApp | Version = web3signer/v23.1.0/linux-x86_64/-eclipseadoptium-openjdk64bitservervm-java-17
2023-02-07 13:27:17.755+00:00 | main | INFO  | Eth2SubCommand | Network: prater
Spec Name: PHASE0, Fork Epoch: 0, First Slot: 0
Spec Name: ALTAIR, Fork Epoch: 36660, First Slot: 1173120
Spec Name: BELLATRIX, Fork Epoch: 112260, First Slot: 3592320

2023-02-07 13:27:17.816+00:00 | main | WARN  | HikariConfig | HikariPool-1 - idleTimeout is close to or more than maxLifetime, disabling it.
2023-02-07 13:27:17.820+00:00 | main | INFO  | HikariDataSource | HikariPool-1 - Starting...
2023-02-07 13:27:18.612+00:00 | main | INFO  | HikariPool | HikariPool-1 - Added connection org.postgresql.jdbc.PgConnection@2e7157c7
2023-02-07 13:27:18.615+00:00 | main | INFO  | HikariDataSource | HikariPool-1 - Start completed.
2023-02-07 13:27:19.622+00:00 | main | INFO  | HikariDataSource | HikariPool-2 - Starting...
2023-02-07 13:27:19.717+00:00 | main | INFO  | HikariPool | HikariPool-2 - Added connection org.postgresql.jdbc.PgConnection@40709f9
2023-02-07 13:27:19.719+00:00 | main | INFO  | HikariDataSource | HikariPool-2 - Start completed.
Setting logging level to INFO
2023-02-07 13:27:20.729+00:00 | main | INFO  | MetricsHttpService | Starting metrics http service on 0.0.0.0:9001
2023-02-07 13:27:21.001+00:00 | pool-2-thread-1 | INFO  | SignerLoader | Loading signer configuration metadata files from .
2023-02-07 13:27:21.036+00:00 | pool-2-thread-1 | INFO  | SignerLoader | Signer configuration metadata files read in memory 0 in 00:00:00.016
2023-02-07 13:27:21.040+00:00 | ForkJoinPool-1-worker-1 | INFO  | SignerLoader | Parsing configuration metadata files
2023-02-07 13:27:21.049+00:00 | ForkJoinPool-1-worker-1 | INFO  | SignerLoader | Total configuration metadata files processed: 0
2023-02-07 13:27:21.049+00:00 | ForkJoinPool-1-worker-1 | INFO  | SignerLoader | Total signers loaded from configuration files: 0 in 00:00:00.007
2023-02-07 13:27:21.244+00:00 | pool-2-thread-1 | INFO  | JacksonVersion | Package versions: jackson-core=2.14.1, jackson-databind=2.14.1, jackson-dataformat-xml=2.14.1, jackson-datatype-jsr310=2.14.1, azure-core=1.34.0, Troubleshooting version conflicts: https://aka.ms/azsdk/java/dependency/troubleshoot
2023-02-07 13:27:21.361+00:00 | vert.x-eventloop-thread-0 | INFO  | MetricsHttpService | Metrics service started and listening on 0.0.0.0:9001
2023-02-07 13:27:24.223+00:00 | Thread-6 | INFO  | ClientSecretCredential | Azure Identity => getToken() result for scopes [https://vault.azure.net/.default]: SUCCESS
2023-02-07 13:27:24.225+00:00 | Thread-6 | INFO  | AccessTokenCache | Acquired a new access token.
2023-02-07 13:27:24.282+00:00 | reactor-http-epoll-1 | WARN  | SecretClientImpl | Failed to list secrets
Status code 403, "{"error":{"code":"Forbidden","message":"Client address is not authorized and caller is not a trusted service.\r\nClient address: 20.241.133.231\r\nCaller: appid=60221c99-6e1a-46e3-b1d8-f3ef9aa3e22e;oid=93d3cbe4-c25e-4a5d-974a-629d0306df20;iss=https://sts.windows.net/17255fb0-373b-4a1a-bd47-d211ab86df81/\r\nVault: staking-signer-dev-kv;location=eastus","innererror":{"code":"ForbiddenByFirewall"}}}"
2023-02-07 13:27:24.288+00:00 | main | ERROR | Runner | Error loading signers
java.util.concurrent.ExecutionException: com.azure.core.exception.HttpResponseException: Status code 403, "{"error":{"code":"Forbidden","message":"Client address is not authorized and caller is not a trusted service.\r\nClient address: 20.241.133.231\r\nCaller: appid=60221c99-6e1a-46e3-b1d8-f3ef9aa3e22e;oid=93d3cbe4-c25e-4a5d-974a-629d0306df20;iss=https://sts.windows.net/17255fb0-373b-4a1a-bd47-d211ab86df81/\r\nVault: staking-signer-dev-kv;location=eastus","innererror":{"code":"ForbiddenByFirewall"}}}"
.  .  .  .  .  .  
2023-02-07 13:27:25.838+00:00 | main | INFO  | Runner | Web3Signer has started with TLS disabled, and ready to handle signing requests on 0.0.0.0:9000
2023-02-07 13:28:10.494+00:00 | vert.x-eventloop-thread-0 | INFO  | LoggerHandlerImpl | 127.0.0.1 - - [Tue, 7 Feb 2023 13:28:10 GMT] "GET /healthcheck HTTP/1.1" 200 137 "-" "curl/7.81.0"
2023-02-07 13:28:15.105+00:00 | vert.x-eventloop-thread-0 | INFO  | LoggerHandlerImpl | 127.0.0.1 - - [Tue, 7 Feb 2023 13:28:15 GMT] "GET /healthcheck HTTP/1.1" 200 137 "-" "curl/7.81.0"

And the health check response via curl:

curl -i http://localhost:9000/healthcheck
HTTP/1.1 200 OK
vary: origin
content-type: application/json;charset=UTF-8
content-length: 137

{"status":"UP","checks":[{"id":"default-check","status":"UP"},{"id":"slashing-protection-db-health-check","status":"UP"}],"outcome":"UP"}

Thanks

@siladu siladu added the TeamCerberus Under active development by TeamCerberus @Consensys label Feb 15, 2023
@usmansaleem usmansaleem self-assigned this Mar 2, 2023
@jframe jframe closed this as completed Mar 24, 2023