Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try to retrieve exit code of a failed service call #183

Merged
merged 2 commits into from
Dec 11, 2024

Conversation

marmarek
Copy link
Member

@marmarek marmarek commented Dec 3, 2024

When service call fails, the remote (service) side sends EOF + exit code
and closes the vchan immediately. If the client side tries to send
something, it will fail and exit immediately (with code 1). But reading
data that was queued in the vchan before its closing is still possible.
So, on send error check if there is anything interesting to receive
(especially exit code, but potentially also some service output) and, if
yes, don't exit immediately. Since the service exit code is sent last
(after all stdout/stderr data and their EOF), it's okay to check just
for remote_status.
This is relevant only to the service client side, as exit status of the
service-handling process is not relevant.

Fixes QubesOS/qubes-issues#9618

When service call fails, the remote (service) side sends EOF + exit code
and closes the vchan immediately. If the client side tries to send
something, it will fail and exit immediately (with code 1). But reading
data that was queued in the vchan before its closing is still possible.
So, on send error check if there is anything interesting to receive
(especially exit code, but potentially also some service output) and, if
yes, don't exit immediately. Since the service exit code is sent last
(after all stdout/stderr data and their EOF), it's okay to check just
for remote_status.
This is relevant only to the service client side, as exit status of the
service-handling process is not relevant.

Fixes QubesOS/qubes-issues#9618
Copy link

codecov bot commented Dec 3, 2024

Codecov Report

Attention: Patch coverage is 97.36842% with 1 line in your changes missing coverage. Please review.

Project coverage is 79.11%. Comparing base (5c028c0) to head (df25090).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
libqrexec/process_io.c 75.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #183      +/-   ##
==========================================
+ Coverage   79.05%   79.11%   +0.05%     
==========================================
  Files          54       54              
  Lines        9916     9953      +37     
==========================================
+ Hits         7839     7874      +35     
- Misses       2077     2079       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@qubesos-bot
Copy link

qubesos-bot commented Dec 3, 2024

OpenQA test summary

Complete test suite and dependencies: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2024121004-4.3&flavor=pull-requests

Test run included the following:

New failures, excluding unstable

Compared to: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2024111705-4.3&flavor=update

  • system_tests_extra

    • TC_00_QVCTest_whonix-workstation-17: test_010_screenshare (failure)
      ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^... AssertionError: 0 == 0
  • system_tests_kde_gui_interactive

    • gui_keyboard_layout: wait_serial (wait serial expected)
      # wait_serial expected: "echo -e '[Layout]\nLayoutList=us,de' | sud...

Failed tests

4 failures
  • system_tests_extra

    • TC_00_QVCTest_whonix-gateway-17: test_010_screenshare (failure)
      ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^... AssertionError: 0 == 0

    • TC_00_QVCTest_whonix-workstation-17: test_010_screenshare (failure)
      ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^... AssertionError: 0 == 0

  • system_tests_kde_gui_interactive

    • gui_keyboard_layout: wait_serial (wait serial expected)
      # wait_serial expected: "echo -e '[Layout]\nLayoutList=us,de' | sud...

    • gui_keyboard_layout: Failed (test died)
      # Test died: command 'test "$(cd ~user;ls e1*)" = "$(qvm-run -p wor...

Fixed failures

Compared to: https://openqa.qubes-os.org/tests/119126#dependencies

2 fixed
  • system_tests_audio@hw1

  • system_tests_basic_vm_qrexec_gui_zfs

    • switch_pool: Failed (test died)
      # Test died: command 'dnf install -y ./zfs-release.rpm' failed at /...

Unstable tests

  • system_tests_audio

    TC_20_AudioVM_PipeWire_fedora-40-xfce/test_260_audio_mic_enabled_switch_audiovm (1/5 times with errors)
    • job 117586 AssertionError: too short audio, expected 10s, got 0.00013605442176...
  • system_tests_audio@hw1

    TC_20_AudioVM_PipeWire_fedora-40-xfce/test_260_audio_mic_enabled_switch_audiovm (1/5 times with errors)
    • job 117586 AssertionError: too short audio, expected 10s, got 0.00013605442176...

Check if the service exit code is correctly retrieved even if the
the service terminates at the exact moment the qrexec-client tries to
send some data.
Try to win the race by initially sending SIGSTOP to the qrexec-client
process, and sending SIGCONT only after preparing both local and remote
data streams. qrexec-client will handle local data stream first, at
which point remote socket is already closed.

Similar issue applies to qrexec-client-vm, but since the implementation
is shared, one test is enough.

QubesOS/qubes-issues#9618
@marmarek marmarek merged commit df25090 into QubesOS:main Dec 11, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wrong qrexec-client exit code on failed service call
2 participants