[BUG]: The docker discover step wrongly enters a failed state when the last container in the docker ps -a list is Exited and judged as timed out #1934

Closed
leemars opened this issue Nov 28, 2024 · 0 comments · Fixed by #1935
Labels
bug Something isn't working

Comments

Contributor

leemars commented Nov 28, 2024

Describe the bug

In a k3s + cri-dockerd environment, ilogtail keeps emitting a large volume of error logs after startup, and docker discover fails.

Reviewing the code shows that the following logic is involved:
fe95095f0a10e78dd2179c10af4ba1e5
c1d28d54f1a3a01bd841ce30593d78aa

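After inspectOneContainer decides that a container has timed out, it returns the timeout as an err, but for fetchAll a timeout should not actually matter.
This err is also overwritten on each iteration of the for loop in fetchAll, so what gets returned is the last err. If the last container happens to be in the Exited state and is judged as timed out, fetchAll returns a timeout err, which in turn causes docker discover to be disabled.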
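
The pattern described above can be reproduced with a minimal Go sketch. This is not the actual ilogtail code and not necessarily how #1935 fixes it: the container type, errInspectTimeout, inspectOne, fetchAllBuggy, and fetchAllFixed are all hypothetical names made up for illustration. It only shows why overwriting err inside the loop makes the result depend on the last container, and one possible way to avoid that.

```go
package main

import (
	"errors"
	"fmt"
)

// errInspectTimeout is a hypothetical sentinel error for "this exited
// container is treated as timed out". The real project may signal it differently.
var errInspectTimeout = errors.New("inspect time out container")

// container is a simplified, hypothetical stand-in for one `docker ps -a` entry.
type container struct {
	ID       string
	TimedOut bool // Exited and judged as timed out
	Broken   bool // inspect genuinely fails
}

// inspectOne returns the timeout as an ordinary error, as the issue describes.
func inspectOne(c container) error {
	switch {
	case c.TimedOut:
		return fmt.Errorf("%w %s", errInspectTimeout, c.ID)
	case c.Broken:
		return fmt.Errorf("inspect failed for container %s", c.ID)
	}
	return nil
}

// fetchAllBuggy overwrites err on every iteration, so the returned value
// reflects only the last container in the list.
func fetchAllBuggy(cs []container) error {
	var err error
	for _, c := range cs {
		err = inspectOne(c)
	}
	return err
}

// fetchAllFixed ignores timeouts and joins every other error, so the result
// no longer depends on container order. Requires Go 1.20+ for errors.Join.
func fetchAllFixed(cs []container) error {
	var errs []error
	for _, c := range cs {
		if err := inspectOne(c); err != nil && !errors.Is(err, errInspectTimeout) {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...) // nil when errs is empty
}
```

With a list that ends in a timed-out Exited container, fetchAllBuggy returns the timeout error while fetchAllFixed returns nil. If the same timed-out container comes first, fetchAllBuggy returns nil, which matches the order dependence reported here.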

iLogtail Running Environment

  • logtail_plugin.LOG:
2024-11-27 05:30:01 [INF] [container_discover_controller.go:197] [Init] input:param        docker discover:true        cri discover:false        static discover:false
2024-11-27 05:30:01 [INF] [container_discover_controller.go:223] [Init] init docker center, fetch all seconds:5m0s
2024-11-27 05:30:01 [INF] [container_discover_controller.go:233] [Init] init docker center, fecth all success timeout:1h40m0s
2024-11-27 05:30:01 [INF] [container_discover_controller.go:243] [Init] init docker center, client request timeout:30s
2024-11-27 05:30:01 [INF] [container_discover_controller.go:254] [Init] init docker center, max fetchOne count per second:200
2024-11-27 05:30:01 [WRN] [docker_center.go:672] [setLastError] AlarmType:DOCKER_CENTER_ALARM        message:inspect time out container 2bd2284a387ff7af09f15a7134979f3f278c2e53ae73fd8c3f31bc075686dc9b        error found:inspect time out container 2bd2284a387ff7af09f15a7134979f3f278c2e53ae73fd8c3f31bc075686dc9b
2024-11-27 05:30:01 [WRN] [docker_center.go:672] [setLastError] AlarmType:DOCKER_CENTER_ALARM        message:inspect time out container 2a38c17886d91034f05f0354be7d0dfde8c6be4b7c3238987fa8d8f9a3468cbd        error found:inspect time out container 2a38c17886d91034f05f0354be7d0dfde8c6be4b7c3238987fa8d8f9a3468cbd
2024-11-27 05:30:01 [WRN] [docker_center.go:672] [setLastError] AlarmType:DOCKER_CENTER_ALARM        message:inspect time out container 2c97e747d6f43cbf39646af799e273466d81fc57ad0e88ae4d859af9b1a2f139        error found:inspect time out container 2c97e747d6f43cbf39646af799e273466d81fc57ad0e88ae4d859af9b1a2f139
2024-11-27 05:30:01 [WRN] [docker_center.go:672] [setLastError] AlarmType:DOCKER_CENTER_ALARM        message:inspect time out container b7eb5d7f850b2147b19a89a57be14d7e71b4a3e076913cdb334c35b7d7b082c5        error found:inspect time out container b7eb5d7f850b2147b19a89a57be14d7e71b4a3e076913cdb334c35b7d7b082c5
2024-11-27 05:30:01 [WRN] [docker_center.go:672] [setLastError] AlarmType:DOCKER_CENTER_ALARM        message:inspect time out container c2fbb8a3ba1c2e48ccc74eb3fd36f73659e4705dec0ce715037731bc8c4cf189        error found:inspect time out container c2fbb8a3ba1c2e48ccc74eb3fd36f73659e4705dec0ce715037731bc8c4cf189
2024-11-27 05:30:01 [WRN] [docker_center.go:672] [setLastError] AlarmType:DOCKER_CENTER_ALARM        message:inspect time out container bbee7fa2be2f69b7db13b1dbc6a98ee1b113f2f2e33d80d7cf3df4201c47cafc        error found:inspect time out container bbee7fa2be2f69b7db13b1dbc6a98ee1b113f2f2e33d80d7cf3df4201c47cafc
2024-11-27 05:30:01 [WRN] [docker_center.go:672] [setLastError] AlarmType:DOCKER_CENTER_ALARM        message:inspect time out container 6dc6e3a8eec21764666a1c2efcd37cdffdc7279025a248b66224b7bfc1e5ecca        error found:inspect time out container 6dc6e3a8eec21764666a1c2efcd37cdffdc7279025a248b66224b7bfc1e5ecca
2024-11-27 05:30:01 [WRN] [docker_center.go:672] [setLastError] AlarmType:DOCKER_CENTER_ALARM        message:inspect time out container 51a89d29fc21a0065bdb9b221862f44e3a2c364acfce2930580502e4414c6b87        error found:inspect time out container 51a89d29fc21a0065bdb9b221862f44e3a2c364acfce2930580502e4414c6b87
2024-11-27 05:30:01 [WRN] [docker_center.go:672] [setLastError] AlarmType:DOCKER_CENTER_ALARM        message:inspect time out container 811f18cf987ab0e3055bf612197b52233c5a229b7212225f60142292b93c12d3        error found:inspect time out container 811f18cf987ab0e3055bf612197b52233c5a229b7212225f60142292b93c12d3
2024-11-27 05:30:01 [WRN] [docker_center.go:672] [setLastError] AlarmType:DOCKER_CENTER_ALARM        message:inspect time out container 47e58c1a0992c3bf6d07d60eeda4dfec362bc6bebdf25d3a4e2ac5923d4e027c        error found:inspect time out container 47e58c1a0992c3bf6d07d60eeda4dfec362bc6bebdf25d3a4e2ac5923d4e027c
2024-11-27 05:30:01 [WRN] [docker_center.go:672] [setLastError] AlarmType:DOCKER_CENTER_ALARM        message:inspect time out container 42f4697f5c8c2de9fa2ad9166e8a6b38741e87d75474bdf8c3a4b5842ebdca53        error found:inspect time out container 42f4697f5c8c2de9fa2ad9166e8a6b38741e87d75474bdf8c3a4b5842ebdca53
2024-11-27 05:30:01 [ERR] [container_discover_controller.go:260] [Init] AlarmType:DOCKER_CENTER_ALARM        fetch docker containers error, close docker discover, will retry
2024-11-27 05:30:01 [INF] [container_discover_controller.go:270] [Init] final:param        docker discover:false        cri discover:false        static discover:false