Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timeout after 3 seconds #749

Closed
larus-argentatus-sec opened this issue Mar 15, 2024 · 14 comments · Fixed by #772
Closed

Timeout after 3 seconds #749

larus-argentatus-sec opened this issue Mar 15, 2024 · 14 comments · Fixed by #772
Labels
help wanted Extra attention is needed timeout Dangerzone Times Out
Milestone

Comments

@larus-argentatus-sec
Copy link

I recently installed Dangerzone (Version 0.6.0) from the Dangerzone Repo in Fedora 39 (Silverblue).

When converting a document (tested with .pdf and .png) the document fails to convert. Always at 49% (when it starts converting the document to pixels).

The error message is [ERROR] [doc 6XF2yi] 0% Command '['/usr/bin/podman', 'run', '--network', 'none', '-u', 'dangerzone', '--log-driver', 'none', '--security-opt', 'no-new-privileges', '--userns', 'keep-id', '--cap-drop', 'all', '--rm', '-i', 'dangerzone.rocks/dangerzone', '/usr/bin/python3', '-m', 'dangerzone.conversion.doc_to_pixels']' timed out after 3 seconds

Fedora Silverblue comes with Podman pre-installed. I am not sure if this is useful information.

Thank you and let me know if you need anything else.

@deeplow
Copy link
Contributor

deeplow commented Mar 21, 2024

Thank you for reporting the issue, we'll look into it, potentially next week. We do not officially support Fedora Silverblue, but the behavior may be similar to Fedora 39.

For reference @apyrgio I belive this issue happens on this line:

https://github.com/freedomofpress/dangerzone/blob/634523d/dangerzone/isolation_provider/base.py#L76

This means that somehow the conversion process is not terminating, it seems.

@deeplow deeplow added the timeout Dangerzone Times Out label Mar 21, 2024
@apyrgio
Copy link
Contributor

apyrgio commented Mar 22, 2024

Thanks for the report @larus-argentatus-sec. I guess we need a Fedora Silverblue machine to test it out, but I haven't got to it yet. In the meantime, can you give us the output of the following commands?

podman run hello-world
/usr/bin/podman run --network none -u dangerzone --log-driver none --security-opt no-new-privileges --userns keep-id --cap-drop all --rm -i dangerzone.rocks/dangerzone echo "It works"

Also, if you haven't done so already, can you please run Dangerzone from CLI (see dangerzone-cli command), and paste here its error output, when you convert the document?

@larus-argentatus-sec
Copy link
Author

Yes!

podman run hello-world

Resolved "hello-world" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
Trying to pull quay.io/podman/hello:latest...
Getting image source signatures
Copying blob 77dc5236f342 done   | 
Copying config 17b9204932 done   | 
Writing manifest to image destination
!... Hello Podman World ...!

         .--"--.           
       / -     - \         
      / (O)   (O) \        
   ~~~| -=(,Y,)=- |         
    .---. /`  \   |~~      
 ~/  o  o \~~~~.----. ~~   
  | =(X)= |~  / (O (O) \   
   ~~~~~~~  ~| =(Y_)=-  |   
  ~~~~    ~~~|   U      |~~ 

Project:   https://github.com/containers/podman
Website:   https://podman.io
Desktop:   https://podman-desktop.io
Documents: https://docs.podman.io
YouTube:   https://youtube.com/@Podman
X/Twitter: @Podman_io
Mastodon:  @[email protected]
It works

/usr/bin/podman run --network none -u dangerzone --log-driver none --security-opt no-new-privileges --userns keep-id --cap-drop all --rm -i dangerzone.rocks/dangerzone echo "It works"

It works

dangerzone-cli test.pdf


Assigning ID 'o1bAez' to doc '/var/home/x/Downloads/test.pdf'

Converting document to safe PDF
> /usr/bin/podman run --network none -u dangerzone --log-driver none --security-opt no-new-privileges --userns keep-id --cap-drop all --rm -i dangerzone.rocks/dangerzone /usr/bin/python3 -m dangerzone.conversion.doc_to_pixels
[doc o1bAez] 0% Converting page 1/1 to pixels
[doc o1bAez] 49% Converted document to pixels
ERROR An exception occurred while converting document 'o1bAez'
Traceback (most recent call last):
  File "/usr/lib/python3.12/site-packages/dangerzone/isolation_provider/base.py", line 76, in convert
    conversion_proc.wait(3)
  File "/usr/lib64/python3.12/subprocess.py", line 1264, in wait
    return self._wait(timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/subprocess.py", line 2043, in _wait
    raise TimeoutExpired(self.args, timeout)
subprocess.TimeoutExpired: Command '['/usr/bin/podman', 'run', '--network', 'none', '-u', 'dangerzone', '--log-driver', 'none', '--security-opt', 'no-new-privileges', '--userns', 'keep-id', '--cap-drop', 'all', '--rm', '-i', 'dangerzone.rocks/dangerzone', '/usr/bin/python3', '-m', 'dangerzone.conversion.doc_to_pixels']' timed out after 3 seconds
ERROR [doc o1bAez] 0% Command '['/usr/bin/podman', 'run', '--network', 'none', '-u', 'dangerzone', '--log-driver', 'none', '--security-opt', 'no-new-privileges', '--userns', 'keep-id', '--cap-drop', 'all', '--rm', '-i', 'dangerzone.rocks/dangerzone', '/usr/bin/python3', '-m', 'dangerzone.conversion.doc_to_pixels']' timed out after 3 seconds

Failed to convert document(s)
/var/home/x/Downloads/test.pdf

@apyrgio apyrgio mentioned this issue Mar 23, 2024
@apyrgio
Copy link
Contributor

apyrgio commented Mar 26, 2024

(@zyberzebra CC'ing you here, so that we can use a single issue for investigating this error)

Thank you both for the CLI output. We now know that this error happens here:

conversion_proc.wait(3)

This was introduced as a sanity check, in case the conversion had finished successfully, but a Podman/Docker process was still lingering. My guess is that this happens because we don't close the container's stderr, as we close its stdout:

# Ensure nothing else is read after all bitmaps are obtained
p.stdout.close()

Still, we haven't encountered yet a conversion run that has generated that much output to stderr. This is an indication that you are hitting an issue in the container, and the buggy stderr is the canary.

So, let's dig a bit deeper.

@larus-argentatus-sec: Can you verify in your system that you can convert a simple .doc?

Also, questions to all:

  1. Without going into much details, are the docs you're converting particularly large, use a specific file extension, or are coming from a single source?
  2. If you are comfortable with running CLI commands, could you please run Dangerzone from source (see BUILD.md, right up to the dev_scripts/dangerzone command), and check if the error occurs there as well?

@apyrgio
Copy link
Contributor

apyrgio commented Mar 26, 2024

My guess is that this happens because we don't close the container's stderr, as we close its stdout:

Thankfully, this is not the case. When a user runs Dangerzone, we pipe stderr to /dev/null (or the OS equivalent), so there's no need to close the stderr. Writing to stderr happens only on development runs, which is not the case here.

if getattr(sys, "dangerzone_dev", False) == True:
self.proc_stderr = subprocess.PIPE
else:
self.proc_stderr = subprocess.DEVNULL

The only operation that happens right before the conversion ends is printing the LibreOffice output to stderr:

# Write debug information
await DocumentToPixels.write_bytes(converter.captured_output, file=sys.stderr)

Again though, stderr points to /dev/null, which means that writing this output should finish real quick, definitely under a second. Also, in the output that @larus-argentatus-sec provided above, only one page is converted, which makes this scenario even more unlikely.

@apyrgio
Copy link
Contributor

apyrgio commented Mar 26, 2024

@larus-argentatus-sec I've installed Dangerzone using rpm-ostree. I assume you did the same thing as well, right? In any case, the sample document I provided you with seems to work in my installation.

@zyberzebra
Copy link

  1. Without going into much details, are the docs you're converting particularly large, use a specific file extension, or are coming from a single source?
  2. If you are comfortable with running CLI commands, could you please run Dangerzone from source (see BUILD.md, right up to the dev_scripts/dangerzone command), and check if the error occurs there as well?
  • pdf files
  • Size is between 20-50mb

@apyrgio
Copy link
Contributor

apyrgio commented Mar 26, 2024

Ok, so fairly large files. Are they in the order of 100s / 1000s of pages? If not, do they contain any multimedia?

@zyberzebra
Copy link

<400 but a lot of pictures

@apyrgio
Copy link
Contributor

apyrgio commented Mar 28, 2024

Thanks @zyberzebra. I just sanitized a 5000 pages document where each page has a picture. Still not being able to reproduce it. Will look into it more. In the meantime, any extra hint helps.

@apyrgio apyrgio changed the title F39 - Timeout Timeout after 3 seconds Mar 30, 2024
@apyrgio apyrgio added the help wanted Extra attention is needed label Mar 30, 2024
@apyrgio apyrgio pinned this issue Mar 30, 2024
@apyrgio
Copy link
Contributor

apyrgio commented Mar 30, 2024

The following is in response to #754 (comment):

i have had this problem, and by editing
dangerzone/isolation_provider/base.py

line 76 conversion_proc.wait(3) to '10' conversion worked.
I guess I should submit a MR to make this timeout something in settings?

@banerian Thanks a lot for this extra info. This shows that whatever is blocking the container from quitting, does not stay on indefinitely. However, the proper fix here is to find out what's the blocker in the first place. Waiting for an arbitrary number of seconds during each conversion will not cut it, I'm afraid.

I have some questions, if you have some time:

  1. What is your OS and Dangerzone version?

  2. Can you convert a simple document with no errors, or does that fail as well?

  3. Can you give us any specifics about the documents that you are converting and you hit this error?

    • Think document type (.pdf, .doc, etc.), number of pages, if they come from a particular source, etc.
    • Even better, if you can share a ⚠️ non-sensitive ⚠️ document with us that triggers this error, that would be amazing. You can hit me up at [email protected], if you don't want to share something publicly.
  4. Since you can edit the Dangerzone source, could you make the following change in the dangerzone/isolation_provider/container.py file, run dangerzone-cli, and send us its output?

    diff --git a/dangerzone/isolation_provider/container.py b/dangerzone/isolation_provider/container.py
    index cbadcc3..6553677 100644
    --- a/dangerzone/isolation_provider/container.py
    +++ b/dangerzone/isolation_provider/container.py
    @@ -165,7 +165,7 @@ class Container(IsolationProvider):
                 args,
                 stdin=subprocess.PIPE,
                 stdout=subprocess.PIPE,
    -            stderr=self.proc_stderr,
    +            #stderr=self.proc_stderr,
                 startupinfo=startupinfo,
             )

(the above applies to anyone affected by this issue, thanks in advance)

@banerian
Copy link

the OS is linux, debian 12, kernel in a VM6.1.0-18-amd64
dangerzone version is the deb package from FoP, version 0.6.0
the document used was a sketchy pdf i got in email, so i will not provide that.
it was a single page.

after resetting timeout to 3, and with the diff above, I got:


Converting document to safe PDF
> /usr/bin/podman run --network none -u dangerzone --log-driver none --security-opt no-new-privileges --userns keep-id --cap-drop all --rm -i dangerzone.rocks/dangerzone /usr/bin/python3 -m dangerzone.conversion.doc_to_pixels
Converting page 1/1 to pixels
[doc w338vV] 0% Converting page 1/1 to pixels
Converted document to pixels
[doc w338vV] 49% Converted document to pixels
ERROR An exception occurred while converting document 'w338vV'
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/dangerzone/isolation_provider/base.py", line 76, in convert
    conversion_proc.wait(3)
  File "/usr/lib/python3.11/subprocess.py", line 1262, in wait
    return self._wait(timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 1989, in _wait
    raise TimeoutExpired(self.args, timeout)
subprocess.TimeoutExpired: Command '['/usr/bin/podman', 'run', '--network', 'none', '-u', 'dangerzone', '--log-driver', 'none', '--security-opt', 'no-new-privileges', '--userns', 'keep-id', '--cap-drop', 'all', '--rm', '-i', 'dangerzone.rocks/dangerzone', '/usr/bin/python3', '-m', 'dangerzone.conversion.doc_to_pixels']' timed out after 3 seconds
ERROR [doc w338vV] 0% Command '['/usr/bin/podman', 'run', '--network', 'none', '-u', 'dangerzone', '--log-driver', 'none', '--security-opt', 'no-new-privileges', '--userns', 'keep-id', '--cap-drop', 'all', '--rm', '-i', 'dangerzone.rocks/dangerzone', '/usr/bin/python3', '-m', 'dangerzone.conversion.doc_to_pixels']' timed out after 3 seconds

Failed to convert document(s)

I also tried, using a MS docx file, and got similar:


Assigning ID 'zWqoem' to doc 'ErrorRecreation_Nov8.docx'

Converting document to safe PDF
> /usr/bin/podman run --network none -u dangerzone --log-driver none --security-opt no-new-privileges --userns keep-id --cap-drop all --rm -i dangerzone.rocks/dangerzone /usr/bin/python3 -m dangerzone.conversion.doc_to_pixels
Converting to PDF using LibreOffice
Converting page 1/3 to pixels
[doc zWqoem] 0% Converting page 1/3 to pixels
Converting page 2/3 to pixels
[doc zWqoem] 16% Converting page 2/3 to pixels
Converting page 3/3 to pixels
[doc zWqoem] 32% Converting page 3/3 to pixels
Converted document to pixels
[COMMAND] libreoffice --headless --safe-mode --convert-to pdf --outdir /tmp /tmp/input_file
convert /tmp/input_file as a Writer document -> /tmp/input_file.pdf using filter : writer_pdf_Export
[doc zWqoem] 49% Converted document to pixels
ERROR An exception occurred while converting document 'zWqoem'
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/dangerzone/isolation_provider/base.py", line 76, in convert
    conversion_proc.wait(3)
  File "/usr/lib/python3.11/subprocess.py", line 1262, in wait
    return self._wait(timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 1989, in _wait
    raise TimeoutExpired(self.args, timeout)
subprocess.TimeoutExpired: Command '['/usr/bin/podman', 'run', '--network', 'none', '-u', 'dangerzone', '--log-driver', 'none', '--security-opt', 'no-new-privileges', '--userns', 'keep-id', '--cap-drop', 'all', '--rm', '-i', 'dangerzone.rocks/dangerzone', '/usr/bin/python3', '-m', 'dangerzone.conversion.doc_to_pixels']' timed out after 3 seconds
ERROR [doc zWqoem] 0% Command '['/usr/bin/podman', 'run', '--network', 'none', '-u', 'dangerzone', '--log-driver', 'none', '--security-opt', 'no-new-privileges', '--userns', 'keep-id', '--cap-drop', 'all', '--rm', '-i', 'dangerzone.rocks/dangerzone', '/usr/bin/python3', '-m', 'dangerzone.conversion.doc_to_pixels']' timed out after 3 seconds

Failed to convert document(s)
ErrorRecreation_Nov8.docx

@apyrgio
Copy link
Contributor

apyrgio commented Apr 1, 2024

Thank you very much for the updated info. From the logs you have provided above, it seems that the container does not log anything unexpected to stderr, which was my main suspicion. More importantly, with the logs for the last .docx file, we now know that the container reaches this line:

await DocumentToPixels.write_bytes(converter.captured_output, file=sys.stderr)

This line is the last one before the conversion exits (along with the finalizer of asyncio.run()). This bug starts to look more like a container termination delay.


I'll try to sum up here all the data points from this thread:

  1. The bug occurs when the spawned conversion process takes more than 3 seconds to terminate.
  2. It occurs in Debian, Fedora, and Windows, so it's not OS-related
  3. It occurs with various document types, reportedly .pdf and .docx
  4. It affects both small and large documents, as well as single page ones.
  5. The conversion process does not block while writing to stdout / stderr.
  6. The container does not remain stuck indefinitely, as a user has reported that changing the timeout to 10 seconds makes the conversion resume.

@apyrgio apyrgio added this to the 0.6.1 milestone Apr 1, 2024
apyrgio added a commit that referenced this issue Apr 9, 2024
Previously, we always assumed that the spawned process would quit
within 3 seconds. This was an arbitrary call, and did not work in
practice.

We can improve our standing here by doing the following:

1. Make `Popen.wait()` calls take a generous amount of time (since they
   are usually on the sad path), and handle any timeout errors that they
   throw. This way, a slow conversion process cleanup does not take too
   much of our users time, nor is it reported as an error.
2. Always make sure that once the conversion of doc to pixels is over,
   the corresponding process will finish within a reasonable amount of
   time as well.

Fixes #749
apyrgio added a commit that referenced this issue Apr 17, 2024
Previously, we always assumed that the spawned process would quit
within 3 seconds. This was an arbitrary call, and did not work in
practice.

We can improve our standing here by doing the following:

1. Make `Popen.wait()` calls take a generous amount of time (since they
   are usually on the sad path), and handle any timeout errors that they
   throw. This way, a slow conversion process cleanup does not take too
   much of our users time, nor is it reported as an error.
2. Always make sure that once the conversion of doc to pixels is over,
   the corresponding process will finish within a reasonable amount of
   time as well.

Fixes #749
@eloquence eloquence unpinned this issue Apr 25, 2024
@apyrgio
Copy link
Contributor

apyrgio commented Apr 30, 2024

Update on this situation. We have merged a fix for the bulk of the cases, which will be available in 0.6.1. We've also uncovered a minor bug that affects documents with more than 10,000 pages, or whose dimensions exceed 10,000 pixels on either side (#791). If you have such a document, from now on it will fail immediately with a proper error message, instead of getting stuck for 15 seconds.

(That is, we assume that we should fail these types of documents. If you have a valid use case for them, do let us know)

We will send soon a PR that fixes the above, and hopefully after that we will be able to release 0.6.1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed timeout Dangerzone Times Out
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants