[Bug] zos_operator_action_query reported UnicodeDecodeError when the outstanding messages contain special characters #776
Comments
@cdlwhui - do you happen to have a full traceback (-vvvv) log? I wanted to see where in the stack it occurred, because we have come across this (not for this module) in later ansible-core versions and are looking into it. A full traceback would really help pinpoint which part of the stack to focus on. I suppose there is more to this block we can't see, e.g. a
Although this is not a solution, and you probably already know this, you could add
I do not have a full traceback log currently, but I can collect one the next time we hit the issue. I think I can also try creating an outstanding message with the same special content as above to recreate the problem. Yes,
I could also inject the noted block (below), but those characters seem to map to a UTF-8 code set. It is what I can't see that is likely causing the issue; for example, you can't see Any byte stream which cannot be assigned to a UTF-8 text encoding will break the reading of it, which is why I think there is more that is not visible. I have an idea where it is happening, and we can work from that for now, but if you do get a log, that would be super helpful.
My assumption, without a recreate or backtrace, is that this occurred when the response came back and was read by Python, which encountered a byte that cannot be decoded as UTF-8. I am therefore adding a 'waiting on dependency' tag.
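The failure mode assumed here can be reproduced in isolation. The following is a minimal Python sketch, not the collection's or ZOAU's actual code: the sample bytes and both helper names are made up for illustration. It shows that a strict UTF-8 decode raises UnicodeDecodeError on a byte that is never valid in UTF-8, while decoding with errors="replace" substitutes U+FFFD and lets processing continue.

```python
def decode_strict(raw: bytes) -> str:
    """Decode as UTF-8 and raise UnicodeDecodeError on any invalid byte."""
    return raw.decode("utf-8")


def decode_tolerant(raw: bytes) -> str:
    """Decode as UTF-8, replacing each undecodable byte with U+FFFD."""
    return raw.decode("utf-8", errors="replace")


# Hypothetical response bytes: 0xFF can never appear in valid UTF-8.
sample = b"outstanding message \xff end"

try:
    decode_strict(sample)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

tolerant = decode_tolerant(sample)
print(strict_failed, repr(tolerant))
```

Whether replacing bytes is acceptable depends on the caller: the replacement loses the original byte value, so it suits display and matching on the readable part of a message, not byte-exact round trips.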
@cdlwhui - the output you shared above obviously did not come back from Ansible, since it broke. How did you get that output? Is it from x3270? The next time it happens, in addition to a traceback, could you also run these commands from x3270
I'm curious as to how you retrieved the sysplex messages. Did you issue an operator command somehow and copy and paste the output? Was it from SDSF, ZOAU
Yes, I ran a test to generate the message, but it did not hit the issue; the query using zos_operator_action_query succeeded. I got the message using the 'd r,r' command. I can run the commands
I issued the operator command 'd r,r' from the console and copied the output. The corrupted message was on the queue, but nothing happened when I simply replied to it. It only disappeared once the LPAR was shut down and restarted.
Thank you, that's good to know. It means that ZOAU isn't corrupting anything, but instead is just processing whatever is on the message queue. I think there is a bigger question as to what can be done with "unprintable" characters here. ZOAU
The teams will be working on resolving this. On z/OS we always have to take into account that any byte stream which cannot be decoded as UTF-8 will break the contract imposed on us by Python, which requires text to be UTF-8 assignable. Thank you @cdlwhui; if you encounter this again, the requested information could help us while the teams try to resolve this. We understand the importance of this.
You are welcome.
See the issues below; they are all related to non-printable characters, essentially characters that don't correspond to a UTF-8 value. This is being addressed, as noted, in ZOAU 1.3, and the IBM z/OS Core collection will adopt ZOAU 1.3 in Q1 2024. There is a recreate involving a job submission you can see here. This work is also being tracked in JIRA 9687.
@cdlwhui - This work has been scheduled to be tested in iteration 4 (approximately Feb 26th). Reading over the history, I don't see that we were able to recreate this; if we knew the unprintable character, we could simulate the response. When it comes time to evaluate this issue in iteration 4, we have 2 choices: recreate it by understanding the unprintable character, or have you take an early development build and evaluate it yourself. Otherwise, we will have to close this issue and assume that, since this overlaps with other code, we were able to resolve it.
Thank you for the update @ddimatos. If you cannot recreate it, I think it should be OK to close this, and we will keep monitoring our environment with the fix.
@cdlwhui - all I can think of is putting a WTO on the console using some C and then reading it back; such a test case would probably have to go into our backlog for the time being.
The ZOAU command
I was unable to test this on our systems. Display commands didn't return unprintable characters, and it seems that WTO messages automatically filter any that we introduce ourselves:
From the WTO docs under the
For now, @cdlwhui, could you take an early build of v1.10.0-beta.1 and test on your end, if you have the opportunity?
I've managed to get unprintable characters to display to the console, but that's usually because the offending code has a bug and is displaying raw memory or something. I think there is some filtering going on, but it's not at all layers of the stack. ZOAU has recently started filtering
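As an illustration of what filtering at a single layer could look like, here is a small Python sketch that replaces unprintable characters in already-decoded text with a placeholder. The function name, the choice of "?" as the placeholder, and the decision to keep newlines and tabs are all assumptions made for this example; this is not how ZOAU or the collection implements its filtering.

```python
def scrub_unprintable(text: str, placeholder: str = "?") -> str:
    """Replace characters that str.isprintable() rejects, keeping newlines and tabs."""
    keep = {"\n", "\t"}
    return "".join(
        ch if ch.isprintable() or ch in keep else placeholder
        for ch in text
    )


# Hypothetical console text containing a NUL and an ESC control character.
dirty = "REPLY\x00 TEXT\x1b\nNEXT LINE"
clean = scrub_unprintable(dirty)
print(repr(clean))
```

A filter like this only helps once the bytes have been turned into a str, so it complements rather than replaces tolerant decoding at the point where the raw response is read.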
@rexemin We can take an early build of v1.10.0-beta.1 and test it in our environment. We had not hit the WTO message for months, but it occurred again just last week. We suspect it might be caused by an RRS test case but have not determined which one. We opened a case to ask for help identifying the cause. If we can recreate it, we will be able to test soon; otherwise, we may need to keep monitoring and test when it occurs again. For now, how do we get the early build v1.10.0-beta.1? And does it require a specific ZOAU version?
@cdlwhui - our You can install directly from our repo with either of these commands:
This version requires at least ZOAU 1.3.0.
Thank you @ddimatos. Is it possible to install this
@cdlwhui - while I have not tried to install a development branch using
Steps:
I believe the current behavior of
This issue has had no update from the user for more than 3 months and is still waiting on a response, so I'm closing it due to inactivity.
Is there an existing issue for this?
Are the dependencies a supported version?
IBM Z Open Automation Utilities: v1.2.2
IBM Enterprise Python: v3.11.x
IBM z/OS Ansible core Version: v1.5.0
ansible-version: v2.12.x
z/OS version: v2.5
Ansible module: zos_operator_action_query
Bug description
zos_operator_action_query failed when used to check a WTOR in the following playbook:
At the time of the failure, the outstanding messages in the sysplex were as below.
The playbook runs successfully when the following messages are not present, so they appear to be the root cause. The special characters caused the error, which blocked the whole playbook run and broke the automation job.
Is there a way to handle messages containing special characters without breaking the running task?
Playbook verbosity output.
Ansible configuration.
No response
Contents of the inventory
No response
Contents of group_vars or host_vars
No response