Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix mpd module hang/crash after suspend #3793

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

bretello
Copy link

@bretello bretello commented Nov 25, 2024

I've been having this annoying issue for a while, in which after waking up from a long suspend the MPD module is not updating anymore.

What I think is happening is that once the client goes to sleep the server TCP connection to the client expires (based on the values in /proc/sys/net/ipv4/tcp_keepalive_*), the server sends a RST but the client never receives it because it is asleep.

How to reproduce

This can be reproduced by killing the socket(s) on the server using ss like so

export MPD_HOST=<host>
export MPD_PORT=6600 # change port here if required
# suspend the client before executing the command below
sudo ss --kill "dst $MPD_HOST and dport $MPD_PORT" # WARNING: this will kill ALL connections to MPD_HOST:MPD_PORT

When the client wakes, the MPD module will not update anymore.

Second bug

While debugging this, I realized that killing the socket on the client (or just having the connection naturally expire) actually causes waybar to go on an infinite loop and eventually crash:

export  MPD_PORT=6600 # change port here if required
export  MPD_HOST=<host>

ss --kill "dst $MPD_HOST and port $MPD_HOST"

Results in waybar realizing that the idle connection has died, but reconnection fails, clicking on the mpd icon in waybar causes the crash (last line):

[warning] mpd: Idle: error: Software caused connection abort
[warning] mpd: Disconnected: error: Software caused connection abort
[warning] mpd: Disconnected: error: Software caused connection abort
[warning] mpd: Disconnected: error: Software caused connection abort
...
[warning] mpd: Disconnected: error: Software caused connection abort
[warning] mpd: Disconnected: error: Software caused connection abort
waybar: ../libmpdclient-2.22/src/status.c:358: mpd_status_get_consume: Assertion `status != NULL' failed.

Third bug (somehow unrelated?)

If there are any extra definitions in the waybar config, such as on-click-middle, single/right clicking on the mpd icon in waybar has no effect.

Fix proposal

I haven't found a way to actually make the client immediately realize that the mpd connection is dead on wake (short of setting a period Idle timer to check the connection?)

  • Reset the MPD unique_connection if there are any errors when using IDLE_RUN_NOIDLE_AND_CMD
  • Resetting the MPD unique_connection on errors in Idle::on_io (e.g. when the connection is killed like shown above (fixes second bug)
  • Replacing handlePlayPause with handleToggle (fixes third bug)

Reason for the third bug

The reason for the third bug is that when a waybar::AModule::AModule is instantiated, depending on the initialization values and the config, it connects a few slots, for example when a module has actions and/or some custom user actions are defined in the configuration, Gdk::BUTTON_PRESS_MASK
is added to the event box and AModule::handleToggle(GdkEventButton* const& e) is connected to event_box_.signal_button_press_event() (see https://github.com/Alexays/Waybar/blob/master/src/AModule.cpp#L50-L51)

handletoggle(e) calls handleUserEvent(e) (https://github.com/Alexays/Waybar/blob/master/src/AModule.cpp#L152, https://github.com/Alexays/Waybar/blob/master/src/AModule.cpp#L156), which does its processing and returns true, stopping other handlers from being invoked for the button press event.

In the current definition of waybar::modules::MPD::MPD(...), MPD::handlePlayPause is connected to event_box_.signal_button_press_event() (https://github.com/Alexays/Waybar/blob/master/src/modules/mpd/mpd.cpp#L43-L44), but this
is only ever called if there are no user-defined actions (e.g. on-click-middle) in the configuration. When there's no user-defined actions in the configuration, in that case handleToggle() is never connected in waybar::AModule::AModule(), event processing is not stopped by the AModule::handleToggle call and MPD::handlePlayPause is actually called.

This PR replaces handlePlayPause with handleToggle(), which handles play/pause/stop and then calls the parent handleToggle method.

@@ -50,7 +50,7 @@ inline int close(FILE* fp, pid_t pid) {
ret = waitpid(pid, &stat, WCONTINUED | WUNTRACED);

if (WIFEXITED(stat)) {
spdlog::debug("Cmd exited with code {}", WEXITSTATUS(stat));
spdlog::trace("Cmd exited with code {}", WEXITSTATUS(stat));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this (and below) to trace since they made debugging with -l debug very noisy. I'll be happy to revert this and/or open another PR in case we want separation of concerns.

@bretello bretello marked this pull request as draft November 25, 2024 21:00
@bretello

This comment was marked as outdated.

@bretello
Copy link
Author

bretello commented Dec 6, 2024

Update: this seemingly has to to with libmpdclient: running mpc idleloop shows the same problem after a long suspend. It looks like the underlying connection dies but the client never realizes this and keeps waiting for idle events.

@bretello bretello force-pushed the fix-mpd-module-hang-after-suspend branch from cbf546d to 7d222db Compare December 16, 2024 18:14
@bretello bretello marked this pull request as ready for review December 16, 2024 18:40
@bretello bretello changed the title fix mpd module hang after suspend fix mpd module hang/crash after suspend Dec 16, 2024
@bretello bretello marked this pull request as draft December 23, 2024 00:15
@bretello
Copy link
Author

Still has some hanging problems updating after a wake from sleep.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant