fix(mqtt): rework connection and message tracking #10696

kitlaan · 2022-02-22T01:07:14Z

My mqtt connection would randomly stall or disconnect (without any logging on the telegraf side).
It turns out there was a comedy of bugs, all adding up to trigger the behavior.

Weird connection/disconnection behavior

At first glance, it seemed that telegraf would randomly disconnect and reconnect, only to disconnect again. That led me to look at paho's client management, where I found that the plugin really should create a new client after each disconnect (specifically because the plugin does not use auto reconnect).

Reusing a Client is not completely safe. After calling Disconnect please create a new Client (NewClient()) rather than attempting to reuse the existing one (note that features such as SetAutoReconnect mean this is rarely necessary).
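
One way to follow that advice is to hold a factory rather than a long-lived client, and build a fresh instance on every connect. The sketch below is illustrative only: the `client` interface, the `consumer` type, and `fakeClient` are hypothetical stand-ins, not paho's API or the plugin's actual code. In real code the factory would presumably wrap paho's NewClient().

```go
package main

import (
	"errors"
	"fmt"
)

// client is a hypothetical stand-in for the small part of an MQTT client
// that this sketch needs. It is not paho's interface.
type client interface {
	Connect() error
	Disconnect()
}

// consumer keeps a factory instead of a long-lived client, so that every
// connection attempt starts from a fresh client instance.
type consumer struct {
	newClient func() client // assumption: would wrap the library's constructor
	current   client
}

// connect builds a new client and connects it.
func (c *consumer) connect() error {
	if c.current != nil {
		return errors.New("already connected")
	}
	cl := c.newClient()
	if err := cl.Connect(); err != nil {
		return err
	}
	c.current = cl
	return nil
}

// disconnect closes the current client and drops it, so it is never reused.
func (c *consumer) disconnect() {
	if c.current == nil {
		return
	}
	c.current.Disconnect()
	c.current = nil
}

// fakeClient is a test double that only records its identity and state.
type fakeClient struct {
	id        int
	connected bool
}

func (f *fakeClient) Connect() error { f.connected = true; return nil }
func (f *fakeClient) Disconnect()    { f.connected = false }

func main() {
	made := 0
	c := &consumer{newClient: func() client {
		made++
		return &fakeClient{id: made}
	}}
	_ = c.connect()
	c.disconnect()
	_ = c.connect()
	fmt.Println("clients created:", made)
}
```

Each connect/disconnect cycle in this sketch uses its own client, so no state from a previous connection can leak into the next one.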

Weird data loss (deadlock?)

Through code inspection, I noticed #10687. Sadly, fixing this did not resolve the problem. But...

Metric tracking

As part of testing #10684 I noticed that the m.acc and m.sem channels were slowly but steadily growing larger over time. Even with the dedup bug fixed, neither channel drained to zero as expected. It turns out that select picks a case at random when both cases are ready. Since one of those cases exits the for-loop, on average m.acc is not fully processed, and thus it fills up.
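
A minimal sketch of one way to avoid that starvation: drain everything that has already been delivered, without blocking, before competing for a new slot. The function names, the `int` delivery IDs, and the channel sizes are assumptions for illustration, not the plugin's actual code.

```go
package main

import "fmt"

// drainDelivered handles every delivery notification that is already queued.
// The default case keeps it from blocking once the channel is empty.
func drainDelivered(delivered <-chan int, onDelivered func(id int)) {
	for {
		select {
		case id := <-delivered:
			onDelivered(id)
		default:
			return
		}
	}
}

// acquireSlot reserves a slot in sem for a new message. Pending deliveries
// are drained first, so a ready send on sem cannot starve them.
func acquireSlot(sem chan struct{}, delivered <-chan int, onDelivered func(id int)) {
	for {
		drainDelivered(delivered, onDelivered)
		select {
		case sem <- struct{}{}:
			return // slot reserved; the caller can now process the message
		case id := <-delivered:
			onDelivered(id) // a delivery arrived while waiting; handle it and retry
		}
	}
}

func main() {
	sem := make(chan struct{}, 5)
	delivered := make(chan int, 5)
	for id := 1; id <= 3; id++ {
		sem <- struct{}{} // three messages in flight...
		delivered <- id   // ...all of them already delivered
	}
	release := func(id int) { <-sem } // each delivery frees one slot
	acquireSlot(sem, delivered, release)
	fmt.Println(len(sem), len(delivered))
}
```

In this run the three queued deliveries are always handled before the new slot is taken, whereas a single two-case select inside the loop could take the exit path first and leave deliveries behind.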

TL;DR

So altogether, this change cleans up both the client connection semantics and the metric tracking.

Required for all PRs:

Updated associated README.md.
Wrote appropriate unit tests.
Pull request title or commits are in conventional commit format

resolves #10687

telegraf-tiger · 2022-02-22T01:24:23Z

Download PR build artifacts for linux_amd64.tar.gz, darwin_amd64.tar.gz, and windows_amd64.zip.
Downloads for additional architectures and packages are available below.

👍 This pull request doesn't change the Telegraf binary size

Artifact URLs

DEB	RPM	TAR GZ	ZIP
amd64.deb	aarch64.rpm	darwin_amd64.tar.gz	windows_amd64.zip
arm64.deb	armel.rpm	darwin_arm64.tar.gz	windows_i386.zip
armel.deb	armv6hl.rpm	freebsd_amd64.tar.gz
armhf.deb	i386.rpm	freebsd_armv7.tar.gz
i386.deb	ppc64le.rpm	freebsd_i386.tar.gz
mips.deb	riscv64.rpm	linux_amd64.tar.gz
mipsel.deb	s390x.rpm	linux_arm64.tar.gz
ppc64el.deb	x86_64.rpm	linux_armel.tar.gz
riscv64.deb		linux_armhf.tar.gz
s390x.deb		linux_i386.tar.gz
		linux_mips.tar.gz
		linux_mipsel.tar.gz
		linux_ppc64le.tar.gz
		linux_riscv64.tar.gz
		linux_s390x.tar.gz
		static_linux_amd64.tar.gz

kitlaan added 2 commits February 21, 2022 19:41

Create new MQTT client after disconnect, per Paho dev notes

ee51de5

Clean up delivered messages before new ones

38fee6e

telegraf-tiger bot added the fix label (pr to fix corresponding bug) Feb 22, 2022

powersj approved these changes Sep 21, 2022

powersj added the ready for final review label (This pull request has been reviewed and/or tested by multiple users and is ready for a final review.) Sep 21, 2022

MyaLongmire approved these changes Sep 26, 2022

MyaLongmire merged commit 2b37d7e into influxdata:master Sep 26, 2022

kitlaan deleted the fix/mqtt-connect branch September 26, 2022 17:44

popey pushed a commit that referenced this pull request Oct 3, 2022

fix(inputs.mqtt_consumer): rework connection and message tracking (#1…

1b744ba

…0696) (cherry picked from commit 2b37d7e)

dba-leshop pushed a commit to dba-leshop/telegraf that referenced this pull request Oct 30, 2022

fix(inputs.mqtt_consumer): rework connection and message tracking (in…

5cc1557

…fluxdata#10696)

powersj mentioned this pull request Jan 4, 2023

Increased use of CPU time in version 1.25 #12422

Closed
