Send "keep alive" (empty) messages to avoid idle timeouts when streaming #1187
Comments
AWX is one use case where this issue can cause unintended job failures.
^ Wow, your comment linked there is very close to exactly what this would be. This issue could be summarized as implementing a supported means of doing what you did as a part of the ansible-runner code. I think the trickiest thing is that we need to do this as a part of the
I've played with about 5 different solutions to this- my favorite so far is actually hiding this behavior entirely in the The pexpect
Just injecting surrogate events from a worker thread that's monitoring writes to the output we actually care about (logically or directly) is probably simpler, and also avoids adding further complexity to the pexpect wakeup code (which is already pretty gnarly IMO). The thread is cheap, and the logic can be mostly/completely isolated in One thing we do need to decide is if this should interact with
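For illustration, the worker-thread idea described above could look roughly like the following. This is only a sketch under stated assumptions, not the ansible-runner implementation: the class name `KeepaliveEmitter`, the `interval` parameter, and the `{"event": "keepalive"}` message shape are all hypothetical. A background thread tracks the time of the last real write and injects a surrogate message once the stream has been idle for a full interval.

```python
import threading
import time


class KeepaliveEmitter:
    """Hypothetical helper: inject a keepalive when the output goes idle.

    `write` is any callable that sends one message downstream; the
    message shape {"event": "keepalive"} is an assumption for this sketch.
    """

    def __init__(self, write, interval):
        if interval <= 0:
            raise ValueError("interval must be positive")
        self._write = write
        self._interval = interval
        self._lock = threading.Lock()
        self._last_write = time.monotonic()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def write(self, message):
        """Send real output and reset the idle timer."""
        with self._lock:
            self._write(message)
            self._last_write = time.monotonic()

    def _run(self):
        # Poll a few times per interval; Event.wait returns True once
        # stop() has been called, which ends the loop.
        while not self._stop.wait(self._interval / 4):
            with self._lock:
                idle = time.monotonic() - self._last_write
                if idle >= self._interval:
                    self._write({"event": "keepalive"})
                    self._last_write = time.monotonic()
```

Because real writes reset the timer, a busy job emits no keepalives at all; surrogate messages only appear during quiet periods. The lock keeps a keepalive from interleaving with a real message.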
The default ansible-runner/ansible_runner/config/_base.py Line 183 in c50532a
As a matter of the practical ask here, what people are really struggling with is K8S rules which will time out log connections after minutes or hours. In AWX there isn't even a mechanism to allow users to change the I'm not fully sure what your point 1 is saying. But I think it suggests that the expected behavior isn't clear. The documented functionality could be:
Either one would solve the problem, and the latter option would do so with fewer keep-alive messages in total. But again, I think we've got a grasp on the type of problem we want to solve so I'm not very worried about keep-alive message spam, should we do the former. What I am worried about is our ability to debug anything that goes wrong in
The related Jira: https://issues.redhat.com/browse/AAP-6482
I have the latest version of AWX, 21.12.0, running in OKE and I have the same problem. What is the procedure to implement the solution proposed in #1187?
@luckass1 With ansible/awx#13608, the idea is to pick it up automatically in some future release. But you should try it now! If you have your default EE (or a custom EE) updated with this patch, you could change the pod spec in the way done in that AWX PR. Go to instance groups in the UI and you should be able to edit the default group's pod spec there.
The problem we want to address is that we have `ansible-runner worker` running, talking to an `ansible-runner process` agent. Due to the infrastructure set up between them (usually K8S), people can have issues with timeout triggers dropping the connection and doing other bad things.
This proposes that ansible-runner itself sends do-nothing messages from worker-->process at a periodic interval, configured by the user. The default is probably to not send these messages.
We may also provide a do-nothing callback in the process code.
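As a rough sketch of what the receiving side could do with such messages, the snippet below reads a JSON-lines stream and drops do-nothing messages before they reach normal event handling. The function name `iter_events`, the optional `on_keepalive` callback, and the `{"event": "keepalive"}` shape are assumptions made for this example, not the actual ansible-runner streaming format.

```python
import io
import json


def iter_events(stream, on_keepalive=None):
    """Yield decoded messages from a JSON-lines stream, skipping keepalives.

    Hypothetical sketch: a keepalive is either an empty line or a
    message of the assumed form {"event": "keepalive"}.
    """
    for line in stream:
        line = line.strip()
        if not line:
            continue  # an empty line is treated as a do-nothing message
        message = json.loads(line)
        if message.get("event") == "keepalive":
            if on_keepalive is not None:
                on_keepalive(message)  # optional hook, does nothing by default
            continue
        yield message


if __name__ == "__main__":
    demo = io.StringIO('{"event": "runner_on_ok"}\n{"event": "keepalive"}\n')
    for event in iter_events(demo):
        print(event["event"])
```

Keeping the callback optional means existing consumers see no change in behavior, while anyone who wants to log or count keepalives for debugging can opt in.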