Telegraf service with S7Comm plugin stops 10s after starting if the PLC is not available #15609

Closed
GitTurboy opened this issue Jul 9, 2024 · 9 comments · Fixed by #15655
Labels
bug unexpected problem or unintended behavior

Comments

@GitTurboy

GitTurboy commented Jul 9, 2024

Relevant telegraf.conf

# Plugin for retrieving data from Siemens PLCs via the S7 protocol (RFC1006)
[[inputs.s7comm]]
  ## Parameters to contact the PLC (mandatory)
  ## The server is in the <host>[:port] format where the port defaults to 102
  ## if not explicitly specified.
  server = "10.100.35.1:102"
  rack = 0
  slot = 1

  pdu_size = 10 #462

  ## Timeout for requests
  timeout = "5s"

Logs from Telegraf

2024-07-09T03:14:19Z I! Starting Telegraf 1.30.0 brought to you by InfluxData the makers of InfluxDB
2024-07-09T03:14:19Z I! Available plugins: 233 inputs, 9 aggregators, 31 processors, 24 parsers, 60 outputs, 5 secret-stores
2024-07-09T03:14:19Z I! Loaded inputs: s7comm
2024-07-09T03:14:19Z I! Loaded aggregators: 
2024-07-09T03:14:19Z I! Loaded processors: 
2024-07-09T03:14:19Z I! Loaded secretstores: 
2024-07-09T03:14:19Z I! Loaded outputs: influxdb_v2
2024-07-09T03:14:19Z I! Tags enabled: 
2024-07-09T03:14:19Z I! [agent] Config: Interval:5s, Quiet:false, Hostname:"", Flush Interval:2s
2024-07-09T03:14:19Z D! [agent] Initializing plugins
2024-07-09T03:14:19Z D! [agent] Connecting outputs
2024-07-09T03:14:19Z D! [agent] Attempting connection to [outputs.influxdb_v2]
2024-07-09T03:14:19Z D! [agent] Successfully connected to outputs.influxdb_v2
2024-07-09T03:14:19Z D! [agent] Starting service inputs
2024-07-09T03:14:19Z D! [inputs.s7comm] Connecting to "10.100.35.1:102"...

System info

Telegraf 1.31.1

Docker

No response

Steps to reproduce

  1. Configure a PLC that cannot be reached from the server hosting Telegraf (OS: Windows 10)
  2. Start the Telegraf service
  3. View the service status and the log
    ...

Expected behavior

The service should keep running and periodically try to reconnect to the PLC.

Actual behavior

The service stopped.

Additional info

no

@GitTurboy GitTurboy added the bug unexpected problem or unintended behavior label Jul 9, 2024
@GitTurboy
Author

The host is Windows 10.

@powersj
Contributor

powersj commented Jul 9, 2024

Hi,

Please enable debug_connection = true in your s7comm config and provide the complete logs. I would like to see the full set of attempts made and the final shutdown, not just the first debug connection message.

In general, telegraf will fail to start up if it fails to connect to a device. This is the expected behavior as it makes it very clear that something is wrong to the user. It could be a bad password, connection, etc. We have added some connection error retry logic to some plugins and could possibly add this here, but I would like to see a complete set of logs first.

Thanks
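
For reference, a minimal sketch of the requested setting, reusing the connection parameters from the original report. The placement under `[[inputs.s7comm]]` follows the plugin's sample configuration; the other values are simply copied from the report and may not match the reporter's final setup:

```toml
[[inputs.s7comm]]
  server = "10.100.35.1:102"
  rack = 0
  slot = 1
  timeout = "5s"

  ## Log detailed connection messages for tracing issues
  debug_connection = true
```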

@powersj powersj added the waiting for response waiting for response from contributor label Jul 9, 2024
@GitTurboy
Author

GitTurboy commented Jul 10, 2024

> Hi,
>
> Please enable debug_connection = true in your s7comm config and provide the complete logs. I would like to see the full set of attempts made and the final shutdown, not just the first debug connection message.
>
> In general, telegraf will fail to start up if it fails to connect to a device. This is the expected behavior as it makes it very clear that something is wrong to the user. It could be a bad password, connection, etc. We have added some connection error retry logic to some plugins and could possibly add this here, but I would like to see a complete set of logs first.
>
> Thanks

You are welcome!
After setting debug_connection = true, I got the logs below:

2024-07-10T00:42:09Z I! Starting Telegraf 1.31.1 brought to you by InfluxData the makers of InfluxDB
2024-07-10T00:42:09Z I! Available plugins: 234 inputs, 9 aggregators, 32 processors, 26 parsers, 60 outputs, 5 secret-stores
2024-07-10T00:42:09Z I! Loaded inputs: s7comm
2024-07-10T00:42:09Z I! Loaded aggregators:
2024-07-10T00:42:09Z I! Loaded processors:
2024-07-10T00:42:09Z I! Loaded secretstores:
2024-07-10T00:42:09Z I! Loaded outputs: influxdb_v2
2024-07-10T00:42:09Z I! Tags enabled:
2024-07-10T00:42:09Z I! [agent] Config: Interval:5s, Quiet:false, Hostname:"", Flush Interval:2s
2024-07-10T00:42:09Z D! [agent] Initializing plugins
2024-07-10T00:42:09Z D! [agent] Connecting outputs
2024-07-10T00:42:09Z D! [agent] Attempting connection to [outputs.influxdb_v2]
2024-07-10T00:42:09Z D! [agent] Successfully connected to outputs.influxdb_v2
2024-07-10T00:42:09Z D! [agent] Starting service inputs
2024-07-10T00:42:09Z D! [inputs.s7comm] Connecting to "10.100.35.1:102"...

It looks like the program crashed at line 156 without logging any further information.

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Jul 10, 2024
@GitTurboy
Author

> connection error retry logic to some plugins

I think connection error retry logic for this plugin would be very useful given our plant network situation. Thanks.

@srebhan
Member

srebhan commented Jul 10, 2024

@GitTurboy just for my understanding, do you see an error message in the log or does it stop with D! [inputs.s7comm] Connecting to ...? In my tests I always see an error that the connection failed...

@srebhan srebhan self-assigned this Jul 10, 2024
@srebhan srebhan added the waiting for response waiting for response from contributor label Jul 10, 2024
@GitTurboy
Author

GitTurboy commented Jul 10, 2024

> @GitTurboy just for my understanding, do you see an error message in the log or does it stop with D! [inputs.s7comm] Connecting to ...? In my tests I always see an error that the connection failed...

When run as a service, the log is the same as what I posted (and the service stops!).
When run as a console app, the log contains:

2024-07-10T07:42:44Z W! Outputs are not used in testing mode!
2024-07-10T07:42:44Z I! Tags enabled:
2024-07-10T07:42:44Z D! [agent] Initializing plugins
2024-07-10T07:42:44Z D! [agent] Starting service inputs
2024-07-10T07:42:44Z D! [inputs.s7comm] Connecting to "10.100.35.1:102"...
2024-07-10T07:42:49Z E! [agent] Starting input inputs.s7comm: connecting to "10.100.35.1:102" failed: dial tcp 10.100.35.1:102: i/o timeout
2024-07-10T07:42:49Z D! [agent] Stopping service inputs
2024-07-10T07:42:49Z D! [agent] Input channel closed
2024-07-10T07:42:49Z D! [agent] Stopped Successfully

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Jul 10, 2024
@srebhan
Member

srebhan commented Jul 23, 2024

@GitTurboy please test the binary in PR #15655, available as soon as CI has finished the tests, and let me know if this fixes the issue! You should set startup_error_behavior = "retry" for the plugin so that it retries connecting in every gather cycle instead of failing.
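
A minimal sketch of how that option might be set, assuming a build that includes PR #15655 and the connection parameters from the original report:

```toml
[[inputs.s7comm]]
  server = "10.100.35.1:102"
  rack = 0
  slot = 1
  timeout = "5s"

  ## Keep Telegraf running if the PLC is unreachable at startup and
  ## retry the connection in every gather cycle
  startup_error_behavior = "retry"
```

With the agent interval of 5s shown in the startup logs, this would presumably mean one connection attempt per 5s gather cycle until the PLC becomes reachable.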

@powersj powersj added the waiting for response waiting for response from contributor label Jul 24, 2024
@GitTurboy
Author

GitTurboy commented Jul 25, 2024

> startup_error_behavior = "retry"

I have tried it and everything looks good. After observing it for several days, I will close this issue. Thank you!

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Jul 25, 2024
@srebhan
Member

srebhan commented Jul 25, 2024

@GitTurboy please don't close the issue, it will be closed automatically as soon as we merge the corresponding PR! Anyway, please let me know in here how your tests went, even though the issue might be closed already! In case of any problem feel free to reopen the issue!


3 participants