Journal logging doesn't reconnect if journald gets restarted #233

krobertson · 2017-05-26T19:38:39Z

We hit this issue twice today on a production system. We haven't identified the root cause yet, but this appears to be an ancillary issue.

The host was running out of memory, likely from journald. Journald attempted to compress and rotate logs but was unable to allocate the memory. It died and was restarted. However, we're also using the docker journald log driver along with journald's syslog transport. After the restart, all docker containers failed to continue logging, with EPIPE errors on stdout/stderr.

Looking at the journal code, it connects to journald on init but has no error handling that would reconnect if journald is restarted.

The log output we had was:

May 26th 2017, 05:34:03.204    System journal (/var/log/journal/) is 1.7G, max 2.6G, 931.8M free.
May 26th 2017, 05:34:04.220    Failed to initialize XZ encoder: code 5
May 26th 2017, 05:34:04.231    systemd-journald.service: Main process exited, code=dumped, status=6/ABRT
May 26th 2017, 05:34:04.232    Failed to compress (unnamed temporary file): Invalid argument
May 26th 2017, 05:34:04.242    Detected coredump of the journal daemon or PID 1, diverted to /var/lib/systemd/coredump/core.systemd-journal.0.9b634f7d87464833a67ca9124f25ab86.14979.1495802022000000.
May 26th 2017, 05:34:04.242    systemd-journald.service: Unit entered failed state.
May 26th 2017, 05:34:04.245    systemd-journald.service: Service has no hold-off time, scheduling restart.
May 26th 2017, 05:34:04.245    systemd-journald.service: Failed with result 'core-dump'.
May 26th 2017, 05:34:04.259    Stopped Flush Journal to Persistent Storage.
May 26th 2017, 05:34:04.259    Stopped Journal Service.
May 26th 2017, 05:34:04.259    Stopping Flush Journal to Persistent Storage...
May 26th 2017, 05:34:04.268    Starting Journal Service...

After that, we only get huge error spikes to New Relic with EPIPE.


lucab · 2017-05-29T07:32:22Z

See #218 for reference.

ssgreg · 2017-08-30T18:00:38Z

Guys, please check the implementation: https://github.com/ssgreg/journald

lucab mentioned this issue May 29, 2018

journal: use a connection-less socket #279

Merged

cachedout mentioned this issue Dec 18, 2018

Journalbeat Stops Reading Journals on Journald Rotation elastic/beats#9533

Closed

lucab closed this as completed in #279 Mar 21, 2019
