[Telemetry] After ONIE install, the telemetry process inside telemetry container exits but docker stays up #16533
Comments
@dgsudharsan could you please describe the difference in behavior between the two SONiC versions?
In 202211, when installing from ONIE, the telemetry process exits, and the telemetry docker exits along with it because the telemetry process is defined as a critical process. In 202305, however, the telemetry docker does not exit.
Reproduced the issue locally on version 20230531.03. After ONIE installation, the telemetry process has indeed exited.
admin@sonic:/var/log$ docker exec telemetry supervisorctl status
Snippet of telemetry.log:
Investigating a proper fix.
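This happens because the telemetry service introduced cert authentication, but there is no TELEMETRY config in Config DB:
127.0.0.1:6379[4]> keys TELEMETRY*
(empty array)
127.0.0.1:6379[4]>
Therefore, we need to manually load the TELEMETRY config into Config DB. Two sample versions of telemetry.json follow: the first disables client authentication, the second enables it and specifies certificates.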
{
    "TELEMETRY": {
        "gnmi": {
            "client_auth": "false",
            "port": "50051",
            "log_level": "2"
        }
    }
}
{
    "TELEMETRY": {
        "certs": {
            "server_crt": "/etc/sonic/telemetry/streamingtelemetryserver.cer",
            "server_key": "/etc/sonic/telemetry/streamingtelemetryserver.key",
            "ca_crt": "/etc/sonic/telemetry/dsmsroot.cer"
        },
        "gnmi": {
            "client_auth": "true",
            "port": "50051",
            "log_level": "2"
        }
    }
}
Load the telemetry config into Config DB:
sudo config load telemetry.json -y
Then start the telemetry process:
docker exec telemetry supervisorctl start telemetry
After that, the telemetry issue above is resolved. A proper fix requires a mechanism that generates a default TELEMETRY config in Config DB.
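The two telemetry.json samples above differ only in whether certificates are supplied. As a minimal sketch, the file contents could be generated as below before running `sudo config load telemetry.json -y`. The helper name `build_telemetry_config` and its parameters are hypothetical and not part of SONiC; only the key names and values are taken from the samples in this thread.

```python
import json


def build_telemetry_config(certs=None, port="50051", log_level="2"):
    """Return a TELEMETRY table shaped like the samples in this thread.

    `certs` is an optional dict with server_crt, server_key and ca_crt paths.
    Hypothetical helper, for illustration only.
    """
    gnmi = {
        # Values are kept as strings, matching the samples above.
        "client_auth": "true" if certs else "false",
        "port": str(port),
        "log_level": str(log_level),
    }
    if certs:
        # Variant with cert authentication enabled.
        table = {"certs": dict(certs), "gnmi": gnmi}
    else:
        # Variant without client authentication.
        table = {"gnmi": gnmi}
    return {"TELEMETRY": table}


# Print the variant without client authentication as JSON text.
print(json.dumps(build_telemetry_config(), indent=4))
```

Redirecting the printed JSON into telemetry.json gives a file that can be passed to the `config load` command shown above.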
Loading a customized TELEMETRY config is still recommended. After the fix, if there is no TELEMETRY configuration in the Redis DB, the service uses the default TELEMETRY configuration.
#### Why I did it
Fix issue #16533: the telemetry service exits in the master and 202305 branches because there are no telemetry configs in the Redis DB.
#### How I did it
Enable a default config if there is no TELEMETRY config in the Redis DB.
#### How to verify it
After the fix, the telemetry service works in the following two scenarios:
1. With a TELEMETRY config in the Redis DB, the service loads its config from the DB.
2. With no TELEMETRY config in the Redis DB, the service uses the default config.
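The fallback described in the two scenarios can be sketched roughly as follows. This is not the code from the fix: `resolve_telemetry_config` and `DEFAULT_TELEMETRY` are hypothetical names, the Config DB is modeled as a plain dict, and the default values are an assumption borrowed from the first telemetry.json sample earlier in this thread, since the defaults chosen by the fix are not stated here.

```python
import copy

# Assumed defaults for illustration only; the values actually used by the
# fix are not stated in this thread.
DEFAULT_TELEMETRY = {
    "gnmi": {"client_auth": "false", "port": "50051", "log_level": "2"},
}


def resolve_telemetry_config(config_db):
    """Return the TELEMETRY table from `config_db`, or defaults if absent.

    `config_db` is a plain dict standing in for the contents of Config DB.
    """
    table = config_db.get("TELEMETRY")
    if table:
        # Scenario 1: a TELEMETRY config exists, so use it as-is.
        return table
    # Scenario 2: no TELEMETRY config (missing or empty), so fall back to a
    # copy of the defaults so callers cannot mutate the shared constant.
    return copy.deepcopy(DEFAULT_TELEMETRY)
```

With this shape, the service always has a usable config to start with, which is the property the fix is after: the process no longer exits just because the table is empty.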
Description
After installing through ONIE, the telemetry process inside the telemetry container exits, and sometimes its state is FATAL.
Sep 7 18:27:45.653772 r-anaconda-51 INFO telemetry#supervisord 2023-09-07 15:27:45,652 INFO exited: telemetry (exit status 0; not expected)
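When scanning a syslog or a techsupport dump for this symptom, the supervisord line above can be matched mechanically. The following is a small sketch, not an existing SONiC tool: the function name `parse_supervisord_exit` is hypothetical, and the pattern only covers the "exit status N" form seen in the log line above.

```python
import re

# Matches e.g. "exited: telemetry (exit status 0; not expected)".
EXIT_RE = re.compile(
    r"exited: (?P<name>\S+) "
    r"\(exit status (?P<status>\d+); (?P<expectation>not expected|expected)\)"
)


def parse_supervisord_exit(line):
    """Return process name, exit status and expectedness, or None if no match."""
    match = EXIT_RE.search(line)
    if match is None:
        return None
    return {
        "process": match.group("name"),
        "exit_status": int(match.group("status")),
        "expected": match.group("expectation") == "expected",
    }


sample = (
    "Sep 7 18:27:45.653772 r-anaconda-51 INFO telemetry#supervisord "
    "2023-09-07 15:27:45,652 INFO exited: telemetry "
    "(exit status 0; not expected)"
)
print(parse_supervisord_exit(sample))
```

Lines that do not contain an "exited: ... (exit status ...)" message return `None`, so the function can be applied to every line of a log file.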
Steps to reproduce the issue:
Describe the results you received:
The telemetry process exits. However, the docker stays up even though telemetry is a critical process.
Describe the results you expected:
The telemetry main process should not exit. If it does exit, the docker should exit as well.
Output of show version:
Output of show techsupport:
Additional information you deem important (e.g. issue happens only occasionally):
sonic_dump_r-bulldog-03_20230913_023753.tar.gz
sonic_dump_r-anaconda-51_20230907_183233.tar.gz