Failed to JSON parse a line from worker stream due to unexpected EOF(b'') #14693
Comments
Hello, we'll need some further information. Can you please set the following settings: AWX_CLEANUP_PATHS = False and RECEPTOR_RELEASE_WORK = False. Then check the /tmp/awx_<job_id>_/artifacts/<job_id>/job_events directory and confirm all the files there are valid JSON. If one isn't, please report back which one(s). This should all be done in the EE container in the task pod. Please also share the logs for the job pods; we're trying to confirm how far the job was able to run before it failed.
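A minimal sketch of that check, assuming python3 is available in the EE image (the pod name, container name, and job id are placeholders):

```
# loop over the job_events files and flag any that fail to parse as JSON
kubectl exec -it <awx-task-pod> -c <ee-container> -- bash -c '
for f in /tmp/awx_<job_id>_*/artifacts/<job_id>/job_events/*; do
  python3 -m json.tool "$f" > /dev/null 2>&1 || echo "invalid JSON: $f"
done'
```
|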
Can you also provide the receptor log and let us know what kind of Kubernetes you are using? |
Can you retry with the latest EE image for the control plane EE? You can do this by changing the imagePullPolicy in AWX to Always and then switching it back to IfNotPresent.
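If AWX is managed by the operator, one hedged way to do this is via the image_pull_policy field of the AWX custom resource (field name assumed from the awx-operator spec):

```
spec:
  image_pull_policy: Always   # temporarily force a re-pull; switch back to IfNotPresent afterwards
```
|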
Kubernetes v1.24.2, self-managed with kubeadm
I achieved the desired configuration by editing the AWX object. Specifically, I added the extra settings in the YAML file under the spec section. The adjustments were as follows:

```
extra_settings:
  - setting: RECEPTOR_RELEASE_WORK
    value: "False"
  - setting: AWX_CLEANUP_PATHS
    value: "False"
```
|
The directory is empty:

```
kubectl exec -it awx-task-78cbf7c589-bzgd8 -c awx-task -- bash
bash-5.1# ls tmp/awx_7976_m5_ejtdl/artifacts/7976/job_events/
bash-5.1#
```

Here you also have the logs from the job pod: automation-job-7976-z6vnr.log |
Your provided job output log looks good at first glance. Does the UI job output stdout page show all of those events? Where are you seeing the error? Can you provide a screenshot of it? |
In the provided screenshot, line 6950 appears as the final line visible within the user interface. As a next step, I plan to run an experiment by deploying the same AWX setup on EKS instead of my self-managed Kubernetes cluster. By the way, I found two more users with the same issue: https://www.reddit.com/r/awx/comments/176za7y/issue_with_json_parsing_error_in_awx_2320_on/ |
Same issue. Cluster version v1.24.12+rke2r1, AWX 23.3.1. The error occurs randomly; yesterday everything worked on custom Execution Environments. UPD: Hope this helps you. |
Hello @aryklein, thank you for providing those additional screenshots. Could you go to Settings > Troubleshooting Settings > Edit, and from there turn off temp dir cleanup and receptor release work. Then, in the control plane EE container (located in the task pod), get the artifacts directory and the stdout file for the work unit. You should be able to get the work unit ID from the API for that job run. Please provide us with the artifacts directory and the stdout file.
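A sketch of one way to retrieve those, assuming the job detail endpoint exposes a work_unit_id field and that receptor keeps work unit data under /tmp/receptor (both the field name and the paths are assumptions; placeholders are illustrative):

```
# 1) read the work unit id from the job detail endpoint
curl -sk -u admin:$AWX_PASSWORD https://<awx-host>/api/v2/jobs/<job_id>/ \
  | python3 -c 'import json,sys; print(json.load(sys.stdin).get("work_unit_id"))'

# 2) inspect the work unit directory inside the EE container of the task pod
kubectl exec -it <awx-task-pod> -c <ee-container> -- \
  ls -R /tmp/receptor/<node-name>/<work_unit_id>/
```
|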
@djyasin I think I did that here, right? |
I got this error as well: "Failed to JSON parse a line from worker stream. Error: Expecting value: line 1 column 1 (char 0) Line with invalid JSON data: b''" on AWX 23.4.0. I have no idea what is causing this; the logs don't tell me anything, and it occurs on different jobs. If anyone has an idea on how to troubleshoot this, please advise me. Thanks. |
+1 My details: the logs are the same in my case. It looks like the tasks are completed, but AWX fails when reading the status from the execution environment. Also, when Verbosity is set to debug, tasks are successful, but not always (~70% of the time). |
I recently migrated my AWX deployment to EKS, with Kubernetes version 1.25, and the issue has completely vanished. |
Same issue on k3s v1.28. Edit: it seems to be resolved after updating the inotify limits:
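The exact values weren't captured above; a commonly used increase looks like this (illustrative values, not necessarily the poster's):

```
# /etc/sysctl.d/99-inotify.conf
fs.inotify.max_user_instances = 8192
fs.inotify.max_user_watches = 524288
```

Apply with sysctl --system (or reboot).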
|
Same issue with: |
I'm experiencing the same issue in 23.3.0 and K8s 1.24. I will dig deeper to troubleshoot more and capture some logs after the holidays. |
Thanks! I was on 1.27 and inotify increases didn't change a thing. Updating to 1.28 helped. |
Hi, I have the same error on AWX 23.5.1. Template OUTPUT: I tried several different EE releases, but the problem is the same. |
@mattiaoui Upgrade your K3s to 1.28. Then follow this tip: |
Many thanks, after the upgrade all templates work 🤟🤟. |
In my case, a temporary "fix" was setting Debug (3) Verbosity on all my templates. After this change, all scheduled tasks were successful for about a week. But during the upgrade to k8s 1.29, I noticed there was an issue with kube-proxy on the node with AWX, because it was still in a CrashLoopBackOff state. It is basically the same as @marek1712 mentioned, but with other values. |
I think I have found the issue. All task controller pods run four containers: redis, task (awx image), rsyslog (awx image), and ee (awx-ee image). I noticed the ee container in the pods was running awx-ee:latest. I changed the deployment to use the same awx-ee version as the awx deployment, and that resolved the problem for me. I think it is something with the latest image running as a controller. I did not make any changes to my sysctl inodes or anything else. I am running version 23.2.0 and everything is working perfectly after matching the ee container to my running awx version. I suspect it might be a bug introduced in ansible/awx-ee; I see it was not updated since June, and in November and December it had some updates.
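A hedged sketch of pinning that container, assuming the awx-operator's control_plane_ee_image field (the tag shown is the poster's version; adjust to match your deployment):

```
spec:
  control_plane_ee_image: quay.io/ansible/awx-ee:23.2.0   # pin instead of :latest
```
|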
@chinochao it seems like a bug in ansible-runner (a Python package AWX uses). By the way, which awx-ee version do you use? |
I am using 23.2.0 for awx and awx-ee in the deployment. For the EE in the AWX UI, I have latest configured. It seems the issue is using awx-ee:latest in the controller task containers. |
I set control_plane_ee_image and the Execution Environment EE image to the same version, and tried both 23.2.0 and 23.3.1. The execution still fails with the JSON parse error. The error is more likely to occur when the job contains a lot of hosts. |
Can you provide the output from kubectl to make sure the awx-ee image is not using latest? Something like kubectl describe on one of the task pods.
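A minimal sketch of checking which image and tag each container in the task pod runs (pod name and namespace are placeholders):

```
kubectl get pod <awx-task-pod> -n <namespace> \
  -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.image}{"\n"}{end}'
```
|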
Looks like this issue is resolved for us as well, thanks @TheRealHaoLiu ❤️ |
Hi @TheRealHaoLiu, my AWX
I use a custom control plane EE image built from https://github.com/ansible/awx-ee
|
I set it and it fixed the issue for roughly a day, then it came back? Strange. |
I can replicate this "Failed to JSON parse a line from worker stream." issue very consistently when running job templates, with an awx-operator managed install of AWX, on EKS v1.22.17 (I know, it's old). I also have problems syncing inventory from a Git managed project -- some binascii / base64 padding errors. Both of these problems go away when I change nothing other than deploying to a local Kind cluster running v1.29.1 |
@mcapra enabling RECEPTOR_KUBE_SUPPORT_RECONNECT only works on these Kubernetes versions:
Make sure NOT to enable that if you are on 1.22.17.
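For those on a supported version, a hedged sketch of enabling it through the same extra_settings mechanism shown earlier in the thread (the quoting of the string value is an assumption about how the operator injects settings):

```
extra_settings:
  - setting: RECEPTOR_KUBE_SUPPORT_RECONNECT
    value: "'enabled'"
```
|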
One more note from our side, just in case you would like to get rid of all instances of this error message or you'd like to inform the user about this problem with a different message. |
If you have the error and are using a custom AWX-EE image: we can easily reproduce this issue, but only during fact gathering when we have multiple endpoints to connect to. |
I'm not sure if I'm doing something wrong, but I still get this error.
This: EDIT: I just ran this: Any idea what to do next? |
@marek1712, it's possible that the deployment isn't updating. A few items as food for thought:
I'm also running k3s, although 1.26.13 |
I am having the same problem even after replacing the "latest" tag with a version tag (23.8.1). I am using an RKE2 cluster and have set up AWX with the Helm chart. Kubernetes version: 1.29.2 |
@Satyam777-git, did you enable RECEPTOR_KUBE_SUPPORT_RECONNECT? What's your container-log-max-size? (The latter should only be a workaround.)
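For k3s/RKE2, the kubelet log rotation size can be raised via kubelet-arg in the distribution's config file; the values below are illustrative, not a confirmed fix:

```
# /etc/rancher/rke2/config.yaml  (k3s: /etc/rancher/k3s/config.yaml)
kubelet-arg:
  - "container-log-max-size=100Mi"
  - "container-log-max-files=5"
```
|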
we gotta start pinning awx-ee image in the release... |
ansible/awx-operator#1740 |
Just out of curiosity, are there plans to change the "latest" reference from awx-ee to receptor as well? |
Understanding that not everybody's environment is the same, is it fair to say that the baseline intended resolution for those experiencing this issue is to:
And that if one were to start fresh with a new AWX environment, images would now be bound to the DEFAULT_AWX_VERSION as opposed to latest, and enabling RECEPTOR_KUBE_SUPPORT_RECONNECT would still be required as part of the fix (due to the reasons described here: #11805 (comment) and here: ansible/awx-operator#1484), but should only be enabled on a case-by-case basis? We have a playbook that is responsible for moving many terabytes of data around from time to time, and while it doesn't fail after 4 hours as mentioned in #11805, it does generate a lot of output if we don't disable logging from rclone with the no-log flag, which caused us to run into this issue. We are running k3s v1.28.5, with AWX 23.5.1 and AWX operator 2.9.0. Thank you! |
Set RECEPTOR_KUBE_SUPPORT_RECONNECT to true, but I'm still getting the error. Do I need to specify all of the latest images? Does anyone have steps on how to view the full error log? Thanks. |
@mxs-weixiong please provide the result_traceback from /api/v2/jobs/<job_id>/ of the failed job |
Setting web and task replicas to 1 fixed this issue for me.
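If AWX is operator-managed, a hedged way to do that is via the replica fields on the AWX custom resource (field names assumed from the awx-operator spec):

```
spec:
  web_replicas: 1
  task_replicas: 1
```
|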
It could be a memory consumption issue: #15273 |
I have applied this solution but there are no changes. If this is a bug, is there any other solution, or do we have an older stable version? |
It could be a memory consumption issue; after applying this config, the issue was gone:
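The exact values weren't captured above; an illustrative example of the kind of change meant here, using the awx-operator's resource requirement fields (field names assumed from the operator spec, values hypothetical):

```
spec:
  task_resource_requirements:
    requests:
      memory: "2Gi"
    limits:
      memory: "4Gi"
```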
|
@kevrrnet nice to hear it works for you. Do you know what the defaults are for the values you changed here? |
The implementation in the linked ServerFault answer resolved this issue for me! |
Update on my previous post: |
Any update on this issue? Any suggestions? |
+1 |
Bug Summary
I've encountered an issue with some of my jobs in AWX 23.3.1, and I’m hoping to gather insights or solutions from anyone who might have faced a similar problem.
Error Message
The execution fails with an error message on the details tab:
Environment
AWX Version: 23.3.1
Deployment Platform: Kubernetes
Operator Version: 2.7.2
I'm inclined to think that this issue stems from a UI error during the process of logging data from the pod, although I'm not entirely certain.
Any ideas?
AWX version
23.3.1
Select the relevant components
Installation method
kubernetes
Modifications
no
Ansible version
ansible [core 2.16.0]
Operating system
Linux
Web browser
Firefox, Chrome
Steps to reproduce
Occurs randomly when executing some templates
Expected results
Finish the Ansible playbook execution without errors
Actual results
The jobs don't fail; they just finish the execution with the error:
Additional information
I tried using different Ansible EE versions (including the latest one)