Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[External Logger] Rsyslogd unexpectedly stop sending events to Splunk http collector #7560

Closed
ziouf opened this issue Jul 7, 2020 · 13 comments · Fixed by #14719
Closed

[External Logger] Rsyslogd unexpectedly stop sending events to Splunk http collector #7560

ziouf opened this issue Jul 7, 2020 · 13 comments · Fixed by #14719

Comments

@ziouf
Copy link

ziouf commented Jul 7, 2020

ISSUE TYPE
  • Bug Report
SUMMARY

Rsyslog daemon seam to unexpectedly stop publication of events to Splunk HTTP event collector.

ENVIRONMENT
  • AWX version: 11.2.0
  • AWX install method: docker on linux
  • Ansible version: 2.9.7
  • Operating System: RHEL 7
  • Web Browser: Firefox 78
STEPS TO REPRODUCE
  • Setup external logger to publish events on Splunk HTTP event collector
  • Run playbooks
  • Wait until events are not published anymore
EXPECTED RESULTS

Events published to Splunk HTTP event collector

ACTUAL RESULTS

I increased verbosity of logs by adding $DebugLevel 2 to /var/lib/awx/rsyslog/rsyslog.conf file. Like the following :

$DebugLevel 2
$WorkDirectory /var/lib/awx/rsyslog
$MaxMessageSize 700000
$IncludeConfig /var/lib/awx/rsyslog/conf.d/*.conf
main_queue(queue.spoolDirectory="/var/lib/awx" queue.maxdiskspace="1g" queue.type="Disk" queue.filename="awx-external-logger-backlog")
module(load="imuxsock" SysSock.Use="off")
input(type="imuxsock" Socket="/var/run/awx-rsyslog/rsyslog.sock" unlink="on")
template(name="awx" type="string" string="%rawmsg-after-pri%")
module(load="omhttp")
action(type="omhttp" server="xxx" serverport="8088" usehttps="on" allowunsignedcerts="on" skipverifyhost="on" action.resumeRetryCount="-1" template="awx" errorfile="/var/log/tower/rsyslog.err" action.resumeInterval="10" restpath="services/collector/event" httpheaderkey="Authorization" httpheadervalue="Splunk xxx")

And then restarted rsyslogd daemon by running:

docker container exec -ti awx_web supervisorctl -c /supervisor.conf restart tower-processes:awx-rsyslogd
ADDITIONAL INFORMATION

I get the following logs with 2020-07-07T09:30:38.645Z as the very last timestamp processed by Splunk. It seams that next events fail to publish because we can see that omHttp try commit action return non-zero code. But I don't see the root cause in logs.

...
2020-07-07T09:30:38.692724000Z  4238.685265456:imuxsock.c     : imuxsock.c: Message from UNIX socket: #4, size 2144
2020-07-07T09:30:38.692933000Z  4238.685275329:imuxsock.c     : parser.c: dropped LF at very end of message (DropTrailingLF is set)
2020-07-07T09:30:38.693141000Z  4238.685279952:imuxsock.c     : datetime.c: ParseTIMESTAMP3339: invalid year: 0, pszTS: '"'
2020-07-07T09:30:38.693357000Z  4238.685284023:imuxsock.c     : ruleset.c: processBATCH: batch of 1 elements must be processed
2020-07-07T09:30:38.693566000Z  4238.685286499:imuxsock.c     : ruleset.c: processBATCH: next msg 0: <14>{"event": {"@timestamp": "2020-07-07T09:30:38.645Z", "me
ssage": "Event data saved.", "host": 149, "level": "INFO", "logger_n
2020-07-07T09:30:38.693788000Z  4238.685289404:imuxsock.c     : rainerscript.c:     ACTION 0 [omhttp:action(type="omhttp" ...)]
2020-07-07T09:30:38.693998000Z  4238.685295419:imuxsock.c     : ruleset.c: executing action 0
2020-07-07T09:30:38.694215000Z  4238.685298043:imuxsock.c     : ../action.c: action 'action-0-omhttp': called, logging to omhttp (susp 0/0, direct q 1)
2020-07-07T09:30:38.694425000Z  4238.685301860:imuxsock.c     : ../action.c: action 'action-0-omhttp': is transactional - executing in commit phase
2020-07-07T09:30:38.694640000Z  4238.685304435:imuxsock.c     : ../action.c: actionPrepare[action-0-omhttp]: enter
2020-07-07T09:30:38.694857000Z  4238.685306679:imuxsock.c     : ../action.c: checking external state file
2020-07-07T09:30:38.695066000Z  4238.685308782:imuxsock.c     : ../action.c: done checking external state file, iRet=0
2020-07-07T09:31:00.031976000Z  RESULT 2
2020-07-07T09:31:00.032297000Z  OKREADY
2020-07-07T09:31:07.082980000Z  4267.081868585:imuxsock.c     : imuxsock.c: Message from UNIX socket: #4, size 1236
2020-07-07T09:31:07.083269000Z  4267.081886907:imuxsock.c     : parser.c: dropped LF at very end of message (DropTrailingLF is set)
2020-07-07T09:31:07.083499000Z  4267.081891938:imuxsock.c     : datetime.c: ParseTIMESTAMP3339: invalid year: 0, pszTS: '"'
2020-07-07T09:31:07.083730000Z  4267.081898096:imuxsock.c     : ruleset.c: processBATCH: batch of 1 elements must be processed
2020-07-07T09:31:07.083950000Z  4267.081901096:imuxsock.c     : ruleset.c: processBATCH: next msg 0: <14>{"event": {"@timestamp": "2020-07-07T09:31:07.080Z", "me
ssage": "Event data saved.", "host": 36, "level": "INFO", "logger_na
2020-07-07T09:31:07.084173000Z  4267.081904611:imuxsock.c     : rainerscript.c:     ACTION 0 [omhttp:action(type="omhttp" ...)]
2020-07-07T09:31:07.084389000Z  4267.081911077:imuxsock.c     : ruleset.c: executing action 0
2020-07-07T09:31:07.084602000Z  4267.081914375:imuxsock.c     : ../action.c: action 'action-0-omhttp': called, logging to omhttp (susp 0/0, direct q 1)
2020-07-07T09:31:07.084825000Z  4267.081919046:imuxsock.c     : ../action.c: action 'action-0-omhttp': is transactional - executing in commit phase
2020-07-07T09:31:07.085040000Z  4267.081922051:imuxsock.c     : ../action.c: actionPrepare[action-0-omhttp]: enter
2020-07-07T09:31:07.085251000Z  4267.081924724:imuxsock.c     : ../action.c: action 'action-0-omhttp': set suspended state to 0
2020-07-07T09:31:07.085470000Z  4267.081927362:imuxsock.c     : ruleset.c: END batch execution phase, entering to commit phase [processed 1 of 1 messages]
2020-07-07T09:31:07.085680000Z  4267.081930072:imuxsock.c     : ../action.c: actionCommitAllDirect: action 0, state 5, nbr to commit 1 isTransactional 1
2020-07-07T09:31:07.085904000Z  4267.081932513:imuxsock.c     : ../action.c: actionCommit[action-0-omhttp]: enter, 1 msgs
2020-07-07T09:31:07.086140000Z  4267.081934637:imuxsock.c     : ../action.c: actionCommit[action-0-omhttp]: processing...
2020-07-07T09:31:07.086364000Z  4267.081936725:imuxsock.c     : ../action.c: actionTryCommit[action-0-omhttp] enter
2020-07-07T09:31:07.086578000Z  4267.081938889:imuxsock.c     : ../action.c: actionPrepare[action-0-omhttp]: enter
2020-07-07T09:31:07.086804000Z  4267.081941372:imuxsock.c     : ../action.c: doTransaction: action 'action-0-omhttp', currIParam 1
2020-07-07T09:31:07.087017000Z  4267.081943473:imuxsock.c     : ../action.c: actionPrepare[action-0-omhttp]: enter
2020-07-07T09:31:07.087235000Z  4267.081945859:imuxsock.c     : ../action.c: doTransaction: action 0, processing msg 0, result -2218
2020-07-07T09:31:07.087458000Z  4267.081948059:imuxsock.c     : ../action.c: actionCommit[action-0-omhttp]: return actionTryCommit -2218
2020-07-07T09:31:07.087669000Z  4267.081950261:imuxsock.c     : ../action.c: actionCommit[action-0-omhttp]: done, iRet -2218
2020-07-07T09:31:07.090779000Z  4267.081952391:imuxsock.c     : ruleset.c: processBATCH: batch of 1 elements has been processed2020-07-07T09:31:07.091045000Z  4267.081957182:imuxsock.c     : imuxsock.c: --------imuxsock calling poll() on 2 fds
2020-07-07T09:31:07.098515000Z  4267.097736092:imuxsock.c     : imuxsock.c: Message from UNIX socket: #4, size 1911
2020-07-07T09:31:07.098803000Z  4267.097756048:imuxsock.c     : parser.c: dropped LF at very end of message (DropTrailingLF is set)
2020-07-07T09:31:07.099032000Z  4267.097761546:imuxsock.c     : datetime.c: ParseTIMESTAMP3339: invalid year: 0, pszTS: '"'
2020-07-07T09:31:07.099255000Z  4267.097767933:imuxsock.c     : ruleset.c: processBATCH: batch of 1 elements must be processed
2020-07-07T09:31:07.099472000Z  4267.097771126:imuxsock.c     : ruleset.c: processBATCH: next msg 0: <14>{"event": {"@timestamp": "2020-07-07T09:31:07.093Z", "message": "Event data saved.", "host": 670, "level": "INFO", "logger_n
2020-07-07T09:31:07.099712000Z  4267.097774770:imuxsock.c     : rainerscript.c:     ACTION 0 [omhttp:action(type="omhttp" ...)]
2020-07-07T09:31:07.099932000Z  4267.097781233:imuxsock.c     : ruleset.c: executing action 0
2020-07-07T09:31:07.100151000Z  4267.097784508:imuxsock.c     : ../action.c: action 'action-0-omhttp': called, logging to omhttp (susp 0/0, direct q 1)
2020-07-07T09:31:07.100367000Z  4267.097789530:imuxsock.c     : ../action.c: action 'action-0-omhttp': is transactional - executing in commit phase
2020-07-07T09:31:07.100580000Z  4267.097792267:imuxsock.c     : ../action.c: actionPrepare[action-0-omhttp]: enter
2020-07-07T09:31:07.100801000Z  4267.097795152:imuxsock.c     : ../action.c: action 'action-0-omhttp': set suspended state to 0
2020-07-07T09:31:07.101020000Z  4267.097797424:imuxsock.c     : ruleset.c: END batch execution phase, entering to commit phase [processed 1 of 1 messages]
2020-07-07T09:31:07.101231000Z  4267.097800529:imuxsock.c     : ../action.c: actionCommitAllDirect: action 0, state 5, nbr to commit 1 isTransactional 1
2020-07-07T09:31:07.101456000Z  4267.097803055:imuxsock.c     : ../action.c: actionCommit[action-0-omhttp]: enter, 1 msgs
2020-07-07T09:31:07.101669000Z  4267.097805090:imuxsock.c     : ../action.c: actionCommit[action-0-omhttp]: processing...
2020-07-07T09:31:07.101937000Z  4267.097807151:imuxsock.c     : ../action.c: actionTryCommit[action-0-omhttp] enter
2020-07-07T09:31:07.102149000Z  4267.097809295:imuxsock.c     : ../action.c: actionPrepare[action-0-omhttp]: enter
2020-07-07T09:31:07.102364000Z  4267.097811763:imuxsock.c     : ../action.c: doTransaction: action 'action-0-omhttp', currIParam 1
2020-07-07T09:31:07.102572000Z  4267.097813873:imuxsock.c     : ../action.c: actionPrepare[action-0-omhttp]: enter
2020-07-07T09:31:07.102790000Z  4267.097816217:imuxsock.c     : ../action.c: doTransaction: action 0, processing msg 0, result -2218
2020-07-07T09:31:07.103001000Z  4267.097818403:imuxsock.c     : ../action.c: actionCommit[action-0-omhttp]: return actionTryCommit -2218
2020-07-07T09:31:07.103215000Z  4267.097820562:imuxsock.c     : ../action.c: actionCommit[action-0-omhttp]: done, iRet -2218
2020-07-07T09:31:07.103433000Z  4267.097822641:imuxsock.c     : ruleset.c: processBATCH: batch of 1 elements has been processed
2020-07-07T09:31:07.103651000Z  4267.097827025:imuxsock.c     : imuxsock.c: --------imuxsock calling poll() on 2 fds
...

It somtimes works again after a manual trigger of Rsyslogd restart with this cmd, but never for a long time...

docker container exec -ti awx_web supervisorctl -c /supervisor.conf restart tower-processes:awx-rsyslogd

Can you help me to identify and fix the root cause ?
Thank you

Edit:
I found a first clue : Rsyslog error code 2218 is about failure caused by data passed to action. (https://www.rsyslog.com/rsyslog-error-2218/)
Maybe we can dig in this direction ?

@awxbot awxbot added the type:bug label Jul 7, 2020
@ryanpetrello
Copy link
Contributor

ryanpetrello commented Jul 7, 2020

@ziouf,

This is pretty weird - it might be worth opening an issue at https://github.com/rsyslog/rsyslog/issues/ with this output as well to see if they have any ideas/pointers (or they might have suggestions about how to get more detail about the failure).

@ziouf
Copy link
Author

ziouf commented Jul 8, 2020

Issue opened in rsyslog project issue tracker : rsyslog/rsyslog#4348

@ziouf
Copy link
Author

ziouf commented Jul 9, 2020

rsyslog/rsyslog#4348 (comment)

Possible that your message is not valid JSON (pulled the below from the debug log, unless it's getting cropped?) and the server you're sending to is throwing an error. Since you have action.resumeRetryCount="-1", this is getting retried (and keeps failing) forever, so you're stuck in a suspend loop.

2020-07-07T09:31:07.083950000Z  4267.081901096:imuxsock.c     : ruleset.c: processBATCH: next msg 0: <14>{"event": {"@timestamp": "2020-07-07T09:31:07.080Z", "me
ssage": "Event data saved.", "host": 36, "level": "INFO", "logger_na

Can you try to verify that your messages are all valid JSON? Or even better, use a template format that escapes your message and let rsyslog produce valid JSON?

@ryanpetrello Is it possible that AWX exposes a bad formatted json to rsyslog ?
How can I verify json format of blocking event ?

What do you think of the template suggestion?

@ziouf
Copy link
Author

ziouf commented Jul 9, 2020

I have found events in error in /var/log/tower/rsyslog.err file

The last one is :

{
  "request": {
    "url": "https://splunkhost.com:8088/services/collector/event",
    "postdata": "{\"event\": {\"@timestamp\": \"2020-07-07T09:30:53.780Z\", \"message\": \"Event data saved.\", \"host\": 686, \"level\": \"INFO\", \"logger_name\": \"awx.analytics.job_events\", \"id\": null, \"modified\": null, \"event\": \"runner_on_failed\", \"event_data\": {\"playbook\": \"AWX_Ansible/ansible_check_setup.yml\", \"playbook_uuid\": \"a7eb8878-345b-4237-bfce-9e5d18b0a9e9\", \"play\": \"Playbook pour test setup\", \"play_uuid\": \"0242ac12-0006-80ff-50f5-0000000008b9\", \"play_pattern\": \"all\", \"task\": \"get_ansible_facts\", \"task_uuid\": \"0242ac12-0006-80ff-50f5-0000000008bb\", \"task_action\": \"setup\", \"task_args\": \"\", \"task_path\": \"/tmp/awx_1440_enxdule_/project/AWX_Ansible/ansible_check_setup.yml:7\", \"host\": \"hostname.com\", \"remote_addr\": \"hostname.com\", \"res\": {\"msg\": \"winrm send_input failed; \\nstdout: \\nstderr W\\u0000i\\u0000n\\u0000d\\u0000o\\u0000w\\u0000s\\u0000 \\u0000P\\u0000o\\u0000w\\u0000e\\u0000r\\u0000S\\u0000h\\u0000e\\u0000l\\u0000l\\u0000 \\u0000s\\u0000'\\u0000e\\u0000s\\u0000t\\u0000 \\u0000a\\u0000r\\u0000r\\u0000\\udcea\\u0000t\\u0000\\udce9\\u0000 \\u0000a\\u0000v\\u0000e\\u0000c\\u0000 \\u0000l\\u0000'\\u0000e\\u0000r\\u0000r\\u0000e\\u0000u\\u0000r\\u0000 \\u0000s\\u0000u\\u0000i\\u0000v\\u0000a\\u0000n\\u0000t\\u0000e\\u0000\\udca0\\u0000:\\u0000 \\u0000\\r\\u0000\\n\\u0000 \\u0000I\\u0000m\\u0000p\\u0000o\\u0000s\\u0000s\\u0000i\\u0000b\\u0000l\\u0000e\\u0000 \\u0000d\\u0000e\\u0000 \\u0000c\\u0000h\\u0000a\\u0000r\\u0000g\\u0000e\\u0000r\\u0000 \\u0000l\\u0000e\\u0000 \\u0000f\\u0000i\\u0000c\\u0000h\\u0000i\\u0000e\\u0000r\\u0000 \\u0000o\\u0000u\\u0000 \\u0000l\\u0000'\\u0000a\\u0000s\\u0000s\\u0000e\\u0000m\\u0000b\\u0000l\\u0000y\\u0000 \\u0000'\\u0000S\\u0000y\\u0000s\\u0000t\\u0000e\\u0000m\\u0000.\\u0000M\\u0000a\\u0000n\\u0000a\\u0000g\\u0000e\\u0000m\\u0000e\\u0000n\\u0000t\\u0000.\\u0000A\\u0000u\\u0000t\\u0000o\\u0000m\\u0000a\\u0000t\\u0000i\\u0000o\\u0000n\\u0000,\\u0000 \\u0000V\\u0000e\\u0000r\\u0000s\\u0000i\\u0000o\\u0000n\\u0000=\\u00003\\u0000.\\u00000\\u0000.\\u00000\\u0000.\\u00000\\u0000,\\u0000 \\u0000C\\u0000u\\u0000l\\u0000t\\u0000u\\u0000r\\u0000e\\u0000=\\u0000n\\u0000e\\u0000u\\u0000t\\u0000r\\u0000a\\u0000l\\u0000,\\u0000 \\u0000P\\u0000u\\u0000b\\u0000l\\u0000i\\u0000c\\u0000K\\u0000e\\u0000y\\u0000T\\u0000o\\u0000k\\u0000e\\u0000n\\u0000=\\u00003\\u00001\\u0000b\\u0000f\\u00003\\u00008\\u00005\\u00006\\u0000a\\u0000d\\u00003\\u00006\\u00004\\u0000e\\u00003\\u00005\\u0000'\\u0000 \\u0000o\\u0000u\\u0000 \\u0000u\\u0000n\\u0000e\\u0000 \\u0000d\\u0000e\\u0000 \\u0000s\\u0000e\\u0000s\\u0000 \\u0000d\\u0000\\udce9\\u0000p\\u0000e\\u0000n\\u0000d\\u0000a\\u0000n\\u0000c\\u0000e\\u0000s\\u0000.\\u0000 \\u0000L\\u0000e\\u0000 \\u0000f\\u0000i\\u0000c\\u0000h\\u0000i\\u0000e\\u0000r\\u0000 \\u0000d\\u0000e\\u0000 \\u0000p\\u0000a\\u0000g\\u0000i\\u0000n\\u0000a\\u0000t\\u0000i\\u0000o\\u0000n\\u0000 \\u0000e\\u0000s\\u0000t\\u0000 \\u0000i\\u0000n\\u0000s\\u0000u\\u0000f\\u0000f\\u0000i\\u0000s\\u0000a\\u0000n\\u0000t\\u0000 \\u0000p\\u0000o\\u0000u\\u0000r\\u0000 \\u0000t\\u0000e\\u0000r\\u0000m\\u0000i\\u0000n\\u0000e\\u0000r\\u0000 \\u0000c\\u0000e\\u0000t\\u0000t\\u0000e\\u0000 \\u0000o\\u0000p\\u0000\\udce9\\u0000r\\u0000a\\u0000t\\u0000i\\u0000o\\u0000n\\u0000.\\u0000 \\u0000(\\u0000E\\u0000x\\u0000c\\u0000e\\u0000p\\u0000t\\u0000i\\u0000o\\u0000n\\u0000 \\u0000d\\u0000e\\u0000 \\u0000H\\u0000R\\u0000E\\u0000S\\u0000U\\u0000L\\u0000T\\u0000 \\u0000:\\u0000 \\u00000\\u0000x\\u00008\\u00000\\u00000\\u00007\\u00000\\u00005\\u0000A\\u0000F\\u0000)\\u0000\\r\\u0000\\n\\u0000\", \"_ansible_no_log\": false}, \"start\": \"2020-07-07T09:30:52.679120\", \"end\": \"2020-07-07T09:30:53.679388\", \"duration\": 1.000268, \"ignore_errors\": null, \"event_loop\": null, \"uuid\": \"802a661d-71a7-499d-aa0e-a280234a94c9\"}, \"failed\": true, \"changed\": false, \"uuid\": \"802a661d-71a7-499d-aa0e-a280234a94c9\", \"playbook\": \"AWX_Ansible/ansible_check_setup.yml\", \"play\": \"Playbook pour test setup\", \"role\": \"\", \"task\": \"get_ansible_facts\", \"counter\": 416, \"stdout\": \"\\u001b[0;31mfatal: [hostname.com]: FAILED! => {\\\"msg\\\": \\\"winrm send_input failed; \\\\nstdout: \\\\nstderr W\\\\u0000i\\\\u0000n\\\\u0000d\\\\u0000o\\\\u0000w\\\\u0000s\\\\u0000 \\\\u0000P\\\\u0000o\\\\u0000w\\\\u0000e\\\\u0000r\\\\u0000S\\\\u0000h\\\\u0000e\\\\u0000l\\\\u0000l\\\\u0000 \\\\u0000s\\\\u0000'\\\\u0000e\\\\u0000s\\\\u0000t\\\\u0000 \\\\u0000a\\\\u0000r\\\\u0000r\\\\u0000\\ufffd\\\\u0000t\\\\u0000\\ufffd\\\\u0000 \\\\u0000a\\\\u0000v\\\\u0000e\\\\u0000c\\\\u0000 \\\\u0000l\\\\u0000'\\\\u0000e\\\\u0000r\\\\u0000r\\\\u0000e\\\\u0000u\\\\u0000r\\\\u0000 \\\\u0000s\\\\u0000u\\\\u0000i\\\\u0000v\\\\u0000a\\\\u0000n\\\\u0000t\\\\u0000e\\\\u0000\\ufffd\\\\u0000:\\\\u0000 \\\\u0000\\\\r\\\\u0000\\\\n\\\\u0000 \\\\u0000I\\\\u0000m\\\\u0000p\\\\u0000o\\\\u0000s\\\\u0000s\\\\u0000i\\\\u0000b\\\\u0000l\\\\u0000e\\\\u0000 \\\\u0000d\\\\u0000e\\\\u0000 \\\\u0000c\\\\u0000h\\\\u0000a\\\\u0000r\\\\u0000g\\\\u0000e\\\\u0000r\\\\u0000 \\\\u0000l\\\\u0000e\\\\u0000 \\\\u0000f\\\\u0000i\\\\u0000c\\\\u0000h\\\\u0000i\\\\u0000e\\\\u0000r\\\\u0000 \\\\u0000o\\\\u0000u\\\\u0000 \\\\u0000l\\\\u0000'\\\\u0000a\\\\u0000s\\\\u0000s\\\\u0000e\\\\u0000m\\\\u0000b\\\\u0000l\\\\u0000y\\\\u0000 \\\\u0000'\\\\u0000S\\\\u0000y\\\\u0000s\\\\u0000t\\\\u0000e\\\\u0000m\\\\u0000.\\\\u0000M\\\\u0000a\\\\u0000n\\\\u0000a\\\\u0000g\\\\u0000e\\\\u0000m\\\\u0000e\\\\u0000n\\\\u0000t\\\\u0000.\\\\u0000A\\\\u0000u\\\\u0000t\\\\u0000o\\\\u0000m\\\\u0000a\\\\u0000t\\\\u0000i\\\\u0000o\\\\u0000n\\\\u0000,\\\\u0000 \\\\u0000V\\\\u0000e\\\\u0000r\\\\u0000s\\\\u0000i\\\\u0000o\\\\u0000n\\\\u0000=\\\\u00003\\\\u0000.\\\\u00000\\\\u0000.\\\\u00000\\\\u0000.\\\\u00000\\\\u0000,\\\\u0000 \\\\u0000C\\\\u0000u\\\\u0000l\\\\u0000t\\\\u0000u\\\\u0000r\\\\u0000e\\\\u0000=\\\\u0000n\\\\u0000e\\\\u0000u\\\\u0000t\\\\u0000r\\\\u0000a\\\\u0000l\\\\u0000,\\\\u0000 \\\\u0000P\\\\u0000u\\\\u0000b\\\\u0000l\\\\u0000i\\\\u0000c\\\\u0000K\\\\u0000e\\\\u0000y\\\\u0000T\\\\u0000o\\\\u0000k\\\\u0000e\\\\u0000n\\\\u0000=\\\\u00003\\\\u00001\\\\u0000b\\\\u0000f\\\\u00003\\\\u00008\\\\u00005\\\\u00006\\\\u0000a\\\\u0000d\\\\u00003\\\\u00006\\\\u00004\\\\u0000e\\\\u00003\\\\u00005\\\\u0000'\\\\u0000 \\\\u0000o\\\\u0000u\\\\u0000 \\\\u0000u\\\\u0000n\\\\u0000e\\\\u0000 \\\\u0000d\\\\u0000e\\\\u0000 \\\\u0000s\\\\u0000e\\\\u0000s\\\\u0000 \\\\u0000d\\\\u0000\\ufffd\\\\u0000p\\\\u0000e\\\\u0000n\\\\u0000d\\\\u0000a\\\\u0000n\\\\u0000c\\\\u0000e\\\\u0000s\\\\u0000.\\\\u0000 \\\\u0000L\\\\u0000e\\\\u0000 \\\\u0000f\\\\u0000i\\\\u0000c\\\\u0000h\\\\u0000i\\\\u0000e\\\\u0000r\\\\u0000 \\\\u0000d\\\\u0000e\\\\u0000 \\\\u0000p\\\\u0000a\\\\u0000g\\\\u0000i\\\\u0000n\\\\u0000a\\\\u0000t\\\\u0000i\\\\u0000o\\\\u0000n\\\\u0000 \\\\u0000e\\\\u0000s\\\\u0000t\\\\u0000 \\\\u0000i\\\\u0000n\\\\u0000s\\\\u0000u\\\\u0000f\\\\u0000f\\\\u0000i\\\\u0000s\\\\u0000a\\\\u0000n\\\\u0000t\\\\u0000 \\\\u0000p\\\\u0000o\\\\u0000u\\\\u0000r\\\\u0000 \\\\u0000t\\\\u0000e\\\\u0000r\\\\u0000m\\\\u0000i\\\\u0000n\\\\u0000e\\\\u0000r\\\\u0000 \\\\u0000c\\\\u0000e\\\\u0000t\\\\u0000t\\\\u0000e\\\\u0000 \\\\u0000o\\\\u0000p\\\\u0000\\ufffd\\\\u0000r\\\\u0000a\\\\u0000t\\\\u0000i\\\\u0000o\\\\u0000n\\\\u0000.\\\\u0000 \\\\u0000(\\\\u0000E\\\\u0000x\\\\u0000c\\\\u0000e\\\\u0000p\\\\u0000t\\\\u0000i\\\\u0000o\\\\u0000n\\\\u0000 \\\\u0000d\\\\u0000e\\\\u0000 \\\\u0000H\\\\u0000R\\\\u0000E\\\\u0000S\\\\u0000U\\\\u0000L\\\\u0000T\\\\u0000 \\\\u0000:\\\\u0000 \\\\u00000\\\\u0000x\\\\u00008\\\\u00000\\\\u00000\\\\u00007\\\\u00000\\\\u00005\\\\u0000A\\\\u0000F\\\\u0000)\\\\u0000\\\\r\\\\u0000\\\\n\\\\u0000\\\"}\\u001b[0m\", \"verbosity\": 0, \"start_line\": 207, \"end_line\": 208, \"created\": \"2020-07-07T09:30:53.679Z\", \"job\": 1440, \"host_name\": \"hostname.com\", \"parent_uuid\": \"0242ac12-0006-80ff-50f5-0000000008bb\", \"event_display\": \"Host Failed\", \"cluster_host_id\": \"awx\", \"tower_uuid\": null}}"
  },
  "response": {
    "status": 400,
    "message": "{\"text\":\"Invalid data format\",\"code\":6,\"invalid-event-number\":0}"
  }
}

postdata field content:

{
    "event": {
        "@timestamp": "2020-07-07T09:30:53.780Z",
        "message": "Event data saved.",
        "host": 686,
        "level": "INFO",
        "logger_name": "awx.analytics.job_events",
        "id": null,
        "modified": null,
        "event": "runner_on_failed",
        "event_data": {
            "playbook": "AWX_Ansible/ansible_check_setup.yml",
            "playbook_uuid": "a7eb8878-345b-4237-bfce-9e5d18b0a9e9",
            "play": "Playbook pour test setup",
            "play_uuid": "0242ac12-0006-80ff-50f5-0000000008b9",
            "play_pattern": "all",
            "task": "get_ansible_facts",
            "task_uuid": "0242ac12-0006-80ff-50f5-0000000008bb",
            "task_action": "setup",
            "task_args": "",
            "task_path": "/tmp/awx_1440_enxdule_/project/AWX_Ansible/ansible_check_setup.yml:7",
            "host": "hostname.com",
            "remote_addr": "hostname.com",
            "res": {
                "msg": "winrm send_input failed; \nstdout: \nstderr W\u0000i\u0000n\u0000d\u0000o\u0000w\u0000s\u0000 \u0000P\u0000o\u0000w\u0000e\u0000r\u0000S\u0000h\u0000e\u0000l\u0000l\u0000 \u0000s\u0000'\u0000e\u0000s\u0000t\u0000 \u0000a\u0000r\u0000r\u0000\udcea\u0000t\u0000\udce9\u0000 \u0000a\u0000v\u0000e\u0000c\u0000 \u0000l\u0000'\u0000e\u0000r\u0000r\u0000e\u0000u\u0000r\u0000 \u0000s\u0000u\u0000i\u0000v\u0000a\u0000n\u0000t\u0000e\u0000\udca0\u0000:\u0000 \u0000\r\u0000\n\u0000 \u0000I\u0000m\u0000p\u0000o\u0000s\u0000s\u0000i\u0000b\u0000l\u0000e\u0000 \u0000d\u0000e\u0000 \u0000c\u0000h\u0000a\u0000r\u0000g\u0000e\u0000r\u0000 \u0000l\u0000e\u0000 \u0000f\u0000i\u0000c\u0000h\u0000i\u0000e\u0000r\u0000 \u0000o\u0000u\u0000 \u0000l\u0000'\u0000a\u0000s\u0000s\u0000e\u0000m\u0000b\u0000l\u0000y\u0000 \u0000'\u0000S\u0000y\u0000s\u0000t\u0000e\u0000m\u0000.\u0000M\u0000a\u0000n\u0000a\u0000g\u0000e\u0000m\u0000e\u0000n\u0000t\u0000.\u0000A\u0000u\u0000t\u0000o\u0000m\u0000a\u0000t\u0000i\u0000o\u0000n\u0000,\u0000 \u0000V\u0000e\u0000r\u0000s\u0000i\u0000o\u0000n\u0000=\u00003\u0000.\u00000\u0000.\u00000\u0000.\u00000\u0000,\u0000 \u0000C\u0000u\u0000l\u0000t\u0000u\u0000r\u0000e\u0000=\u0000n\u0000e\u0000u\u0000t\u0000r\u0000a\u0000l\u0000,\u0000 \u0000P\u0000u\u0000b\u0000l\u0000i\u0000c\u0000K\u0000e\u0000y\u0000T\u0000o\u0000k\u0000e\u0000n\u0000=\u00003\u00001\u0000b\u0000f\u00003\u00008\u00005\u00006\u0000a\u0000d\u00003\u00006\u00004\u0000e\u00003\u00005\u0000'\u0000 \u0000o\u0000u\u0000 \u0000u\u0000n\u0000e\u0000 \u0000d\u0000e\u0000 \u0000s\u0000e\u0000s\u0000 \u0000d\u0000\udce9\u0000p\u0000e\u0000n\u0000d\u0000a\u0000n\u0000c\u0000e\u0000s\u0000.\u0000 \u0000L\u0000e\u0000 \u0000f\u0000i\u0000c\u0000h\u0000i\u0000e\u0000r\u0000 \u0000d\u0000e\u0000 \u0000p\u0000a\u0000g\u0000i\u0000n\u0000a\u0000t\u0000i\u0000o\u0000n\u0000 \u0000e\u0000s\u0000t\u0000 \u0000i\u0000n\u0000s\u0000u\u0000f\u0000f\u0000i\u0000s\u0000a\u0000n\u0000t\u0000 \u0000p\u0000o\u0000u\u0000r\u0000 \u0000t\u0000e\u0000r\u0000m\u0000i\u0000n\u0000e\u0000r\u0000 \u0000c\u0000e\u0000t\u0000t\u0000e\u0000 \u0000o\u0000p\u0000\udce9\u0000r\u0000a\u0000t\u0000i\u0000o\u0000n\u0000.\u0000 \u0000(\u0000E\u0000x\u0000c\u0000e\u0000p\u0000t\u0000i\u0000o\u0000n\u0000 \u0000d\u0000e\u0000 \u0000H\u0000R\u0000E\u0000S\u0000U\u0000L\u0000T\u0000 \u0000:\u0000 \u00000\u0000x\u00008\u00000\u00000\u00007\u00000\u00005\u0000A\u0000F\u0000)\u0000\r\u0000\n\u0000",
                "_ansible_no_log": false
            },
            "start": "2020-07-07T09:30:52.679120",
            "end": "2020-07-07T09:30:53.679388",
            "duration": 1.000268,
            "ignore_errors": null,
            "event_loop": null,
            "uuid": "802a661d-71a7-499d-aa0e-a280234a94c9"
        },
        "failed": true,
        "changed": false,
        "uuid": "802a661d-71a7-499d-aa0e-a280234a94c9",
        "playbook": "AWX_Ansible/ansible_check_setup.yml",
        "play": "Playbook pour test setup",
        "role": "",
        "task": "get_ansible_facts",
        "counter": 416,
        "stdout": "\u001b[0;31mfatal: [hostname.com]: FAILED! => {\"msg\": \"winrm send_input failed; \\nstdout: \\nstderr W\\u0000i\\u0000n\\u0000d\\u0000o\\u0000w\\u0000s\\u0000 \\u0000P\\u0000o\\u0000w\\u0000e\\u0000r\\u0000S\\u0000h\\u0000e\\u0000l\\u0000l\\u0000 \\u0000s\\u0000'\\u0000e\\u0000s\\u0000t\\u0000 \\u0000a\\u0000r\\u0000r\\u0000\ufffd\\u0000t\\u0000\ufffd\\u0000 \\u0000a\\u0000v\\u0000e\\u0000c\\u0000 \\u0000l\\u0000'\\u0000e\\u0000r\\u0000r\\u0000e\\u0000u\\u0000r\\u0000 \\u0000s\\u0000u\\u0000i\\u0000v\\u0000a\\u0000n\\u0000t\\u0000e\\u0000\ufffd\\u0000:\\u0000 \\u0000\\r\\u0000\\n\\u0000 \\u0000I\\u0000m\\u0000p\\u0000o\\u0000s\\u0000s\\u0000i\\u0000b\\u0000l\\u0000e\\u0000 \\u0000d\\u0000e\\u0000 \\u0000c\\u0000h\\u0000a\\u0000r\\u0000g\\u0000e\\u0000r\\u0000 \\u0000l\\u0000e\\u0000 \\u0000f\\u0000i\\u0000c\\u0000h\\u0000i\\u0000e\\u0000r\\u0000 \\u0000o\\u0000u\\u0000 \\u0000l\\u0000'\\u0000a\\u0000s\\u0000s\\u0000e\\u0000m\\u0000b\\u0000l\\u0000y\\u0000 \\u0000'\\u0000S\\u0000y\\u0000s\\u0000t\\u0000e\\u0000m\\u0000.\\u0000M\\u0000a\\u0000n\\u0000a\\u0000g\\u0000e\\u0000m\\u0000e\\u0000n\\u0000t\\u0000.\\u0000A\\u0000u\\u0000t\\u0000o\\u0000m\\u0000a\\u0000t\\u0000i\\u0000o\\u0000n\\u0000,\\u0000 \\u0000V\\u0000e\\u0000r\\u0000s\\u0000i\\u0000o\\u0000n\\u0000=\\u00003\\u0000.\\u00000\\u0000.\\u00000\\u0000.\\u00000\\u0000,\\u0000 \\u0000C\\u0000u\\u0000l\\u0000t\\u0000u\\u0000r\\u0000e\\u0000=\\u0000n\\u0000e\\u0000u\\u0000t\\u0000r\\u0000a\\u0000l\\u0000,\\u0000 \\u0000P\\u0000u\\u0000b\\u0000l\\u0000i\\u0000c\\u0000K\\u0000e\\u0000y\\u0000T\\u0000o\\u0000k\\u0000e\\u0000n\\u0000=\\u00003\\u00001\\u0000b\\u0000f\\u00003\\u00008\\u00005\\u00006\\u0000a\\u0000d\\u00003\\u00006\\u00004\\u0000e\\u00003\\u00005\\u0000'\\u0000 \\u0000o\\u0000u\\u0000 \\u0000u\\u0000n\\u0000e\\u0000 \\u0000d\\u0000e\\u0000 \\u0000s\\u0000e\\u0000s\\u0000 \\u0000d\\u0000\ufffd\\u0000p\\u0000e\\u0000n\\u0000d\\u0000a\\u0000n\\u0000c\\u0000e\\u0000s\\u0000.\\u0000 \\u0000L\\u0000e\\u0000 \\u0000f\\u0000i\\u0000c\\u0000h\\u0000i\\u0000e\\u0000r\\u0000 \\u0000d\\u0000e\\u0000 \\u0000p\\u0000a\\u0000g\\u0000i\\u0000n\\u0000a\\u0000t\\u0000i\\u0000o\\u0000n\\u0000 \\u0000e\\u0000s\\u0000t\\u0000 \\u0000i\\u0000n\\u0000s\\u0000u\\u0000f\\u0000f\\u0000i\\u0000s\\u0000a\\u0000n\\u0000t\\u0000 \\u0000p\\u0000o\\u0000u\\u0000r\\u0000 \\u0000t\\u0000e\\u0000r\\u0000m\\u0000i\\u0000n\\u0000e\\u0000r\\u0000 \\u0000c\\u0000e\\u0000t\\u0000t\\u0000e\\u0000 \\u0000o\\u0000p\\u0000\ufffd\\u0000r\\u0000a\\u0000t\\u0000i\\u0000o\\u0000n\\u0000.\\u0000 \\u0000(\\u0000E\\u0000x\\u0000c\\u0000e\\u0000p\\u0000t\\u0000i\\u0000o\\u0000n\\u0000 \\u0000d\\u0000e\\u0000 \\u0000H\\u0000R\\u0000E\\u0000S\\u0000U\\u0000L\\u0000T\\u0000 \\u0000:\\u0000 \\u00000\\u0000x\\u00008\\u00000\\u00000\\u00007\\u00000\\u00005\\u0000A\\u0000F\\u0000)\\u0000\\r\\u0000\\n\\u0000\"}\u001b[0m",
        "verbosity": 0,
        "start_line": 207,
        "end_line": 208,
        "created": "2020-07-07T09:30:53.679Z",
        "job": 1440,
        "host_name": "hostname.com",
        "parent_uuid": "0242ac12-0006-80ff-50f5-0000000008bb",
        "event_display": "Host Failed",
        "cluster_host_id": "awx",
        "tower_uuid": null
    }
}

I only changed hostname value for confidentiality reason.

I don't see JSON formatting error...

Edited because of mistake on my side

@ryanpetrello
Copy link
Contributor

Is it possible that AWX exposes a bad formatted json to rsyslog ?

I wouldn't think so - we're not configuring the template as JSON, so I'm not sure why rsyslogd would be attempting to interpret it as such:

template(name="awx" type="string" string="%rawmsg-after-pri%")

@IshwarKanse
Copy link

One of the customer is also running into missing job events issue with that invalid year messages in the awx-rsyslog.log file. Its using omfwd though instead of omhttp.

9431.199455280:main Q:Reg/w0  : main Q: queue.c: dequeued 0 consumable elements, szlog 0 sz phys 0
9431.199460781:main Q:Reg/w0  : queue.c: regular consumer finished, iret=4, szlog 0 sz phys 0
9431.199465381:main Q:Reg/w0  : wti.c: main Q:Reg/w0: worker IDLE, waiting for work.
9431.427717136:imuxsock.c     : imuxsock.c: Message from UNIX socket: #4, size 340
9431.427732037:imuxsock.c     : parser.c: dropped LF at very end of message (DropTrailingLF is set)
9431.427738637:imuxsock.c     : datetime.c: ParseTIMESTAMP3339: invalid year: 0, pszTS: '"'
9431.427758939:imuxsock.c     : strm 0x557645c69740: stream.c: file 6 strmFlush
9431.427764539:imuxsock.c     : strm 0x557645c69740: stream.c: strmFlushinternal: file 6(/var/lib/awx/awx-external-logger-backlog.00000001) flush, buflen 724
9431.427769239:imuxsock.c     : strm 0x557645c69740: stream.c: file 6(/var/lib/awx/awx-external-logger-backlog.00000001) doWriteInternal: bFlush 1
9431.427774140:imuxsock.c     : stream.c: strmPhysWrite, stream 0x557645c69740, len 724
9431.427790541:imuxsock.c     : strm 0x557645c69740: stream.c: file 6 write wrote 724 bytes
9431.427796641:imuxsock.c     : main Q: queue.c: write wrote 724 octets to disk, queue disk size now 1392 octets, EnqOnly:0
9431.427808042:imuxsock.c     : main Q: queue.c: EnqueueMsg advised worker start
9431.427813342:imuxsock.c     : imuxsock.c: --------imuxsock calling poll() on 2 fds

The protocol is TCP and the remote logging aggregator is Logpoint

action(type="omfwd" target="x.x.x.x" port="514" protocol="tcp" action.resumeRetryCount="-1" action.resumeInterval="20" template="awx")

With tcp protocol and omfwd, it doesn't generate a rsyslog.err log file.

@icdatev
Copy link

icdatev commented Jan 18, 2022

Same issue over here. Setting action.resumeRetryCount=10 doesn't change anything. rsyslogd stops sending events/logs to splunk. Only a restart of rsyslogd helps. Any further workarounds?

@icdatev
Copy link

icdatev commented Jan 19, 2022

@ryanpetrello Seems to be a problem because of the escaped postdata. If we unescape the postdata everything is fine. No 400 error. Btw.: This error comes out of Splunk.
Is it possible, that not every postdata is escaped? We could not figure this out because even with $DebugLevel 2 the postdata is not being logged out.
We get this error (because of the escaped postdata) when we do some ldap searches with community.general.ldap_search.

@icdatev
Copy link

icdatev commented Jan 21, 2022

Is it possible to modify the rsyslog-template (template(name="awx" type="string" string="%rawmsg-after-pri%")), so the double escape disappears again before the event gets send to splunk?

Splunk won't accept an escaped event!!!
It is no problem if just the message is escaped. But it is a problem when the whole postdata is escaped!

@icdatev
Copy link

icdatev commented Jan 21, 2022

Here are some news. We have found out, that the problem with Splunk is because of the (unknown? or user defined character) unicode charcters that start with \udc.

When we send i.e. just one \udc00 we geht {"text":"Invalid data format","code":6,"invalid-event-number":0} from Splunk.

@rbertol
Copy link

rbertol commented Nov 16, 2022

Hey @icdatev, are you able to get a workaround for this issue? If yes, can you please share it with us?

@icdatev
Copy link

icdatev commented Nov 17, 2022

@rbertol Our "workaround" is a automation, which recognizes the error and restarts the rsyslogd. That works good, the need of this is of course unsatisfying. The real problem is how rsyslogd reacts to 4xx errors. See also rsyslog/rsyslog#4348 (comment). On Red Hat 8 it is now fixed. Don't know about other distros.

@Crimrose
Copy link

Crimrose commented Dec 10, 2023

I still got this issue today with operator version 2.7.2. Do we have any ways to solve it?
Updated:

    - setting: LOG_AGGREGATOR_TCP_TIMEOUT
      value: 10

I modified this config and it works!

TheRealHaoLiu added a commit to TheRealHaoLiu/awx that referenced this issue Dec 11, 2023
Due to ansible#7560

'omhttp' module for rsyslog will completely stop forwarding message to external log aggregator after receiving a 4xx error from the external log aggregator

This PR is an "workaround" for this problem by restarting rsyslogd after detecting that rsyslog received a 4xx error
TheRealHaoLiu added a commit to TheRealHaoLiu/awx that referenced this issue Dec 12, 2023
Due to ansible#7560

'omhttp' module for rsyslog will completely stop forwarding message to external log aggregator after receiving a 4xx error from the external log aggregator

This PR is an "workaround" for this problem by restarting rsyslogd after detecting that rsyslog received a 4xx error
TheRealHaoLiu added a commit to TheRealHaoLiu/awx that referenced this issue Dec 12, 2023
Due to ansible#7560

'omhttp' module for rsyslog will completely stop forwarding message to external log aggregator after receiving a 4xx error from the external log aggregator

This PR is an "workaround" for this problem by restarting rsyslogd after detecting that rsyslog received a 4xx error

Update Dockerfile.j2
TheRealHaoLiu added a commit to TheRealHaoLiu/awx that referenced this issue Dec 12, 2023
Due to ansible#7560

'omhttp' module for rsyslog will completely stop forwarding message to external log aggregator after receiving a 4xx error from the external log aggregator

This PR is an "workaround" for this problem by restarting rsyslogd after detecting that rsyslog received a 4xx error
TheRealHaoLiu added a commit to TheRealHaoLiu/awx that referenced this issue Dec 14, 2023
Due to ansible#7560

'omhttp' module for rsyslog will completely stop forwarding message to external log aggregator after receiving a 4xx error from the external log aggregator

This PR is an "workaround" for this problem by restarting rsyslogd after detecting that rsyslog received a 4xx error
TheRealHaoLiu added a commit that referenced this issue Dec 14, 2023
Due to #7560

'omhttp' module for rsyslog will completely stop forwarding message to external log aggregator after receiving a 4xx error from the external log aggregator

This PR is an "workaround" for this problem by restarting rsyslogd after detecting that rsyslog received a 4xx error
djyasin added a commit to djyasin/awx that referenced this issue Jan 4, 2024
Fixed ldap url a bit and cleaned up a few things.

Changed conditional back to adn rather than or and changed var back to password rather than username.

fix wsrelay connection in ipv6 environments

Fix: ansible#14511 Add alt-text codeblock to Images for Userguide: main_menu.rst

Signed-off-by: Ratan Gulati <[email protected]>

Revised the proposed Alt text for main_menu.rst

Update insights.rst

Update insights.rst

Revised proposed alt text for insights.rst

Updated images - Workflow Templates chapter of Userguide.

Removed references to images that were deleted.

Fix: ansible#14523 Add alt-text codeblock to Images for workflow_template.rst (ansible#14604)

* add alt to images in workflow_templates.rst

Signed-off-by: Ratan Gulati <[email protected]>

* add alt to images in workflow_templates.rst

Signed-off-by: Ratan Gulati <[email protected]>

* Update workflow_templates.rst

* Revised proposed alt text for workflow_templates.rst

---------

Signed-off-by: Ratan Gulati <[email protected]>
Co-authored-by: TVo <[email protected]>

Added alt text for settings-menu.rst (ansible#14639)

* Re-do for PR ansible#14595 to fix CI issues.

* Added alt text to settings-menu.rst

* Update docs/docsite/rst/common/settings-menu.rst

Co-authored-by: Don Naro <[email protected]>

---------

Co-authored-by: Hao Liu <[email protected]>
Co-authored-by: Don Naro <[email protected]>

Docs: Include REST API reference content from swagger.json (ansible#14607)

Setting credential_type as required (ansible#14651)

* Setting credential_type as required

* Added test for missing credential_type in credential module

* Corrected test assertion

---------

Co-authored-by: Lucas Benedito <[email protected]>

Replaced with larger graphic. (ansible#14647)

Correctly handle case where unpartitioned table does not exist (ansible#14648)

Set subscription type as developer for developer subscriptions. (ansible#14584)

* Set subscription type as developer for developer subscriptions.

Signed-off-by: Tong He <[email protected]>

* Set subscription type as developer for developer subscription manifests.

Signed-off-by: Tong He <[email protected]>

* Remedy the wrong character to assign value.

Signed-off-by: Tong He <[email protected]>

* Reformat licensing.py by black.

Signed-off-by: Tong He <[email protected]>

---------

Signed-off-by: Tong He <[email protected]>

issue ansible#14653 heading does not render correctly (ansible#14665)

Adding the possibility to decode base64 decoded strings to Delinea's Devops Secret Vault (DSV) (ansible#14646)

Adding the possibility to decode base64 decoded strings to Delinea's Devops Secret Vault (DSV).
This is necessary as uploading files to DSV is not possible (and not meant to be) and files should be added base64 encoded.
The commit is making sure to remain backward compatible (no secret decoding), as a default is supplied.

This has been tested with DSV and works for secrets that are base64 encoded and secrets that are not base64 encoded (which is the default).

Signed-off-by: Steffen Scheib <[email protected]>

Added missing pointers to release notes (ansible#14659)

* Replaced with larger graphic.

* Revert "Replaced with larger graphic."

This reverts commit 1214b00.

* Added missing pointers to release notes.

Make vault init more idempotent (ansible#14664)

Currently if you cleanup docker volume for vault and bring docker-compose development back up with vault enabled we will not initialize vault because the secret files still exist.

This change will attempt to initialize vault reguardless and update the secret file if vault is initialized

Remove unused methods we attach to user model (ansible#14668)

Update RBAC docs, remove unused get_permissions (ansible#14492)

* Update RBAC docs, remove unused get_permissions

* Add back in section for get_roles_on_resource

Upgrade doc requirements (ansible#14669)

* upgrade when pip compiling doc reqs

* upgrade doc requirements

allow pytest --migrations to succeed (ansible#14663)

* allow pytest --migrations to succeed

* We actually subvert migrations from running in test via pytest.ini
  --no-migrations option. This has led to bit rot for the sqlite
  migrations happy path. This changeset pays off that tech debt and
  allows for an sqlite migration happy path.
* This paves the way for programatic invocation of individual migrations
  and weaving of the creation of resources (i.e. Instance, Job Template,
  etc). With this, a developer can instantiate various database states,
  trigger a migration, assert the state of the db, and then have pytest
  rollback all of that.
* I will note that in practice, running these migrations is dog shit
  slow BUT this work also opens up the possibility of saving and
  re-using sqlite3 database files. Normally, caching is not THE answer
  and causes more harm than good. But in this case, our migrations are
  mostly write-once (I say mostly because this change set violates
  that :) so cache invalidation isn't a major issue.

* functional test for migrations on sqlite

* We commonly subvert running migrations in test land. Test land uses
  sqlite. By not constantly exercising this code path it atrophies. The
  smoke test here is to continuously exercise that code path.
* Add ci test to run migration tests separately, they take =~ 2-3
  minutes each on my laptop.
* The smoke tests also serves as an example of how to write migration
  tests.

* run migration tests in ci

Fix awx collection publishing on galaxy (ansible#14642)

--location (-L) parameter will prompt curl to submit a new request if the URL is a redirect.

After moving to galaxy-NG without -L the curl falsely return 302 for any version

Co-authored-by: John Barker <[email protected]>

Fixing wsrelay connection loop (ansible#14692)

* Fixing wsrelay connection loop

* The loop was being interrupted when reaching the return statements, causing a race condition that would make nodes remain disconnected from their websockets
* Added log messages for the previous return state to improve the logging from this state.

* Added logging for malformed payload

* Update awx/main/wsrelay.py

Co-authored-by: Rick Elrod <[email protected]>

* Moved logmsg outside condition

---------

Co-authored-by: Lucas Benedito <[email protected]>
Co-authored-by: Rick Elrod <[email protected]>

Fix the bulk Job Launch Integration test in awx collection (ansible#14702)

* fix the integration tests

[CI] Reduce GHA timeouts from 6h default (ansible#14704)

* [CI] Reduce GHA timeouts from 6h default

The goal here is to never interfere with a real run (so most of the
timeout-minutes values seem rather high) but to avoid having 6h long
runs if something goes crazy and never ends.

Signed-off-by: Rick Elrod <[email protected]>

* Do bash hackery instead

Signed-off-by: Rick Elrod <[email protected]>

---------

Signed-off-by: Rick Elrod <[email protected]>

separate tox calls in readthedocs config (ansible#14673)

Add TLS certificate auth for HashiCorp Vault (ansible#14534)

* Add TLS certificate auth for HashiCorp Vault

Add support for AWX to authenticate with HashiCorp Vault using
TLS client certificates.

Also updates the documentation for the HashiCorp Vault secret management
plugins to include both the new TLS options and the missing Kubernetes
auth method options.

Signed-off-by: Andrew Austin <[email protected]>

* Refactor docker-compose vault for TLS cert auth

Add TLS configuration to the docker-compose Vault configuration and
use that method by default in vault plumbing.

This ensures that the result of bringing up the docker-compose stack
with vault enabled and running the plumb-vault playbook is a fully
working credential retrieval setup using TLS client cert authentication.

Signed-off-by: Andrew Austin <[email protected]>

* Remove incorrect trailing space

Co-authored-by: Hao Liu <[email protected]>

* Make vault init idempotent

- improve error handling for vault_initialization
- ignore error if vault cert auth is already configured
- removed unused register

* Add VAULT_TLS option

Make TLS for HashiCorp Vault optional and configurable via VAULT_TLS env var

* Add retries for vault init

Sometime it took longer for vault to fully come up and init will fail

---------

Signed-off-by: Andrew Austin <[email protected]>
Co-authored-by: Hao Liu <[email protected]>
Co-authored-by: Hao Liu <[email protected]>

remove unnecessary required flags for saml backend (ansible#14666)

Signed-off-by: Tyler Muir <[email protected]>

Dependabot for docsite requirements (ansible#14670)

Add django-ansible-base (ansible#14705)

* add django-ansible-base

Signed-off-by: jessicamack <[email protected]>

* add licenses

* add django-ansible-base

Signed-off-by: jessicamack <[email protected]>

* add licenses

* apply patch to fix permissions issue

---------

Signed-off-by: jessicamack <[email protected]>

Remove incorrectly formatted line from requirements.txt (ansible#14714)

remove git+ line

Fix updater bug due to missing newline at EOF (ansible#14713)

Fix undefined error in settings/logging/edit form (ansible#14715)

Fix undefined error in logging settings edit form

Update setuptools-scm (ansible#14716)

* properly format requirement

* upgrade setuptools_scm

* Revert "properly format requirement"

This reverts commit 4c87929.

* test ansible-runner package upgrade

* Revert "test ansible-runner package upgrade"

This reverts commit ba4b74f.

Adding hosts bulk deletion feature (ansible#14462)

* Adding hosts bulk deletion feature

Signed-off-by: Avi Layani <[email protected]>

* fix the type of the argument

Signed-off-by: Avi Layani <[email protected]>

* fixing activity_entry tracking

Signed-off-by: Avi Layani <[email protected]>

* Revert "fixing activity_entry tracking"

This reverts commit c8eab52.
Since the bulk_delete is not related to an inventory, only hosts which
can be from different inventories.

* get only needed vars to reduce memory consumption

Signed-off-by: Avi Layani <[email protected]>

* filtering the data to reduce memory increase the number of queries

Signed-off-by: Avi Layani <[email protected]>

* update the activity stream for inventories

Signed-off-by: Avi Layani <[email protected]>

* fix the changes dict initialiazation

Signed-off-by: Avi Layani <[email protected]>

---------

Signed-off-by: Avi Layani <[email protected]>

Remove superwatcher from docker-compose dev (ansible#14708)

When making changes to the application sometime you can accidentally cause FATAL state and cause the dev container to crash which will remove any ephemeral changes that you have made and is ANNOYING!

Recover rsyslog from 4xx error

Due to ansible#7560

'omhttp' module for rsyslog will completely stop forwarding message to external log aggregator after receiving a 4xx error from the external log aggregator

This PR is an "workaround" for this problem by restarting rsyslogd after detecting that rsyslog received a 4xx error

Correct misuse of stdxxx_event_enabled

Not every log messages need to be emitted as a event!

Send SIGKILL to rsyslog if hard cancellation is needed

Simplify RBAC get_roles_on_resource method (ansible#14710)

* Simplify RBAC get_roles_on_resource method

* Fix bug

* Fix query type bug

Narrow the actor types accepted for RBAC evaluations (ansible#14709)

* Narrow the scope of RBAC evaluations

* Update tests for RBAC method changes

* Simplify querset for credentials in org

* Fix call pattern to pass in team role obj

Use filtering/sorting from django-ansible-base (ansible#14726)

* Move filtering to DAB

* add comment to trigger building a new image

Signed-off-by: jessicamack <[email protected]>

* remove unneeded comment

Signed-off-by: jessicamack <[email protected]>

* remove unused imports

Signed-off-by: jessicamack <[email protected]>

* change mock import

Signed-off-by: jessicamack <[email protected]>

---------

Signed-off-by: jessicamack <[email protected]>
Co-authored-by: jessicamack <[email protected]>

add awx collection export tests

* Basic export tests
* Added test that highlights a problem with running Schedule exports as
  non-root user. We rely on the POST key in the OPTIONS response to
  determine the fields to export for a resource. The POST key is not
  present if a user does NOT have create privileges.
* Fixed up forwarding all headers from the API server back to the test
  code. This was causing a problem in awxkit code that checks for
  allowed HTTP Verbs in the headers.

refactor awxkit import code

* Move awxkit import code into a pytest fixture to better control when
  the import happens
* Ensure /awx_devel/awxkit is added to sys path before awxkit import
  runs

Fix twilio_backend.py to send SMS to multiple destinations. (ansible#14656)

AWX only sends Twilio notifications to one destination with the current version of code, but this is a bug. Fixed this bug for sending SMS to multiple destinations.

Persist schedule prompt on launch fields when editing (ansible#14736)

* persist schedule prompt on launch fields when editing

* Merge job template default credentials with schedule overrides in schedule prompt

* rename vars for clarity

* handle undefined defaultCredentials

---------

Co-authored-by: Michael Abashian <[email protected]>
djyasin pushed a commit to djyasin/awx that referenced this issue Sep 16, 2024
Due to ansible#7560

'omhttp' module for rsyslog will completely stop forwarding message to external log aggregator after receiving a 4xx error from the external log aggregator

This PR is an "workaround" for this problem by restarting rsyslogd after detecting that rsyslog received a 4xx error
djyasin pushed a commit to djyasin/awx that referenced this issue Nov 11, 2024
Due to ansible#7560

'omhttp' module for rsyslog will completely stop forwarding message to external log aggregator after receiving a 4xx error from the external log aggregator

This PR is an "workaround" for this problem by restarting rsyslogd after detecting that rsyslog received a 4xx error
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants