Recover rsyslog from 4xx error #14719
events=PROCESS_LOG_STDERR
priority=0
autorestart=true
stdout_events_enabled = true
I thought this flag needed to be on the service you want to monitor. I expected it under [program:awx-rsyslogd]
it feels like
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
and
stdout_events_enabled = true
stderr_events_enabled = true
are equivalent,
and the rsyslogd section already has
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
trying it now
An event type emitted when a process writes to stdout or stderr. The event will only be emitted if the file descriptor is not in capture mode and if stdout_events_enabled or stderr_events_enabled config options are set to true.
nvm
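For reference, a minimal sketch of how a listener might split the body of a PROCESS_LOG_STDERR event. It assumes the layout described in the supervisor docs: one line of space-separated key:value tokens, then the captured output. The name `split_log_payload` and the sample text are hypothetical, and newer supervisor versions may add extra tokens to the first line.

```python
def split_log_payload(payload: str):
    """Split a PROCESS_LOG_* payload into (metadata dict, captured data).

    Assumes the first line holds space-separated key:value tokens and
    everything after the first newline is the raw process output.
    """
    meta_line, _, data = payload.partition("\n")
    meta = dict(token.split(":", 1) for token in meta_line.split())
    return meta, data


# Hypothetical payload; the log text is made up for illustration.
sample = "processname:awx-rsyslogd groupname:awx-rsyslogd pid:123\nexample stderr line\n"
meta, data = split_log_payload(sample)
```

All token values come back as strings, so `meta["pid"]` would need an `int()` conversion before being passed to anything like `os.kill`.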
if headers["eventname"] == "PROCESS_STATE_FATAL":
headers.update(
dict(
[x.split(":") for x in sys.stdin.read(int(headers["len"])).split()]
For any historians who come back to this PR comment:
We aren't sure that this ever worked. If it did, it was at a time when the header and data were not read in the general while loop above.
This sys.stdin.read() would consume data beyond the current message boundary, possibly reading part of another message from supervisor. The logic that follows only "replies" once, which could lead to the supervisor buffer backing up.
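To illustrate the message-boundary point, here is a rough sketch of reading exactly one event at a time: the header line first, then only `len` characters of payload. The helper names (`parse_tokens`, `read_event`, `frame`) are hypothetical and not taken from this PR, and `frame` only fabricates input for the demo. It assumes ASCII payloads, since supervisor's `len` counts bytes while a text stream counts characters.

```python
import io


def parse_tokens(line: str) -> dict:
    """Turn 'key:value key:value' into a dict of strings."""
    return dict(token.split(":", 1) for token in line.split())


def read_event(stream):
    """Read exactly one event: the header line, then len characters of payload."""
    headers = parse_tokens(stream.readline())
    payload = stream.read(int(headers["len"]))
    return headers, payload


def frame(eventname: str, payload: str, serial: int) -> str:
    """Build a fake event for the demo (hypothetical helper, not a supervisor API)."""
    header = (
        f"ver:3.0 server:supervisor serial:{serial} pool:demo "
        f"poolserial:{serial} eventname:{eventname} len:{len(payload)}\n"
    )
    return header + payload


# Two events queued back to back, as they could be on the listener's stdin.
stream = io.StringIO(
    frame("PROCESS_STATE_FATAL", "processname:demo groupname:demo from_state:BACKOFF", 1)
    + frame("PROCESS_LOG_STDERR", "processname:demo groupname:demo pid:42\nhello\n", 2)
)

first_headers, first_payload = read_event(stream)
second_headers, second_payload = read_event(stream)
```

Because each call consumes one header line plus its declared payload and nothing more, the second event is still intact when the loop comes around to it.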
write_stderr(
f"{datetime.datetime.now(timezone.utc)} - sending SIGTERM to proc={headers} with data={headers}\n"
)
os.kill(headers["pid"], signal.SIGTERM)
A PROCESS_STATE_FATAL event is supervisor telling us that it has given up starting our process and will not try to start it again. There will be no pid in the header.
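A small sketch of guarding against the missing pid, assuming the payload layouts from the supervisor docs (PROCESS_STATE_FATAL carries processname, groupname and from_state, while PROCESS_STATE_RUNNING also carries pid). `terminate_if_running` is a hypothetical helper, and the kill function is injected so the example does not signal a real process.

```python
import signal


def parse_tokens(text: str) -> dict:
    """Turn 'key:value key:value' into a dict of strings."""
    return dict(token.split(":", 1) for token in text.split())


def terminate_if_running(payload: str, kill) -> bool:
    """Send SIGTERM only when the payload actually names a pid.

    `kill` is injected (os.kill in real use) so the sketch can be run safely.
    """
    pid = parse_tokens(payload).get("pid")
    if pid is None:
        return False  # e.g. PROCESS_STATE_FATAL: nothing left to signal
    kill(int(pid), signal.SIGTERM)  # token values are strings; os.kill needs an int
    return True


calls = []
record = lambda pid, sig: calls.append((pid, sig))

fatal = "processname:awx-rsyslogd groupname:awx-rsyslogd from_state:BACKOFF"
running = "processname:awx-rsyslogd groupname:awx-rsyslogd from_state:STARTING pid:2766"

fatal_sent = terminate_if_running(fatal, record)
running_sent = terminate_if_running(running, record)
```

With `os.kill` passed as `kill`, the FATAL case would simply be skipped instead of raising a KeyError on the missing pid.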
13685bd to c7a353c
tools/docker-compose/supervisor.conf
@@ -133,3 +165,7 @@ command = supervisor_stdout
buffer_size = 100
events = PROCESS_LOG
result_handler = supervisor_stdout:event_handler
stdout_logfile=/dev/stdout
[eventlistener:stdout]
What does this eventlistener do?
It comes from https://github.com/coderanger/supervisor-stdout
It seems to just dump supervisor logs to stdout so we can see them in the container log.
events=PROCESS_LOG_STDERR
priority=0
autorestart=true
stderr_logfile=/dev/stderr
I noticed this eventlistener doesn't specify a stdout_logfile. Is that so that restarting rsyslog will be seen in the docker container output?
stdout contains only the
READY
RESULT 2\nOK
messages, and I don't want them to spam a file.
stderr goes to the container output (which contains the messages about restarting the service).
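For context, those stdout messages are the listener's side of the supervisor event protocol. A minimal sketch of how they could be produced, with hypothetical helper names (`result_frame`, `acknowledge`) that are not from this PR:

```python
import io


def result_frame(body: str) -> str:
    """Format a listener reply: 'RESULT <length>' newline, then the body."""
    return f"RESULT {len(body)}\n{body}"


def acknowledge(out, ok: bool = True) -> None:
    """Reply to one event, then signal readiness for the next one."""
    out.write(result_frame("OK" if ok else "FAIL"))
    out.write("READY\n")
    out.flush()


# io.StringIO stands in for sys.stdout so the output can be inspected.
out = io.StringIO()
acknowledge(out)
```

Since supervisor parses this stream itself, anything meant for humans (such as the restart notices) has to go to stderr, which matches the split described above.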
bed006d to 7656bb2
@@ -8,13 +8,14 @@ pidfile = /var/run/supervisor/supervisor.rsyslog.pid
[program:awx-rsyslogd]
command = rsyslogd -n -i /var/run/awx-rsyslog/rsyslog.pid -f /var/lib/awx/rsyslog/rsyslog.conf
autorestart = true
startsecs = 30
startsecs = 0
Setting startsecs to 0 prevents PROCESS_STATE_FATAL.
9764f38 to 8f95784
Due to ansible#7560, the 'omhttp' module for rsyslog completely stops forwarding messages to the external log aggregator after receiving a 4xx error from it. This PR is a workaround for this problem: it restarts rsyslogd after detecting that rsyslog received a 4xx error.
Not every log message needs to be emitted as an event!
8f95784 to e9c9326
SUMMARY
Fixes #7560
Related to rsyslog/rsyslog#4348
The 'omhttp' module for rsyslog completely stops forwarding messages to the external log aggregator after receiving a 4xx error from it.
This PR is a workaround for this problem: it restarts rsyslogd after detecting that rsyslog received a 4xx error.
NOTE: this workaround will cause message loss! It's best to resolve the root cause of the 4xx.