Remove timeouts that are not proportional to the pages #327
Comments
Timeout for office documents
This may be the only thing that's out of the scope of this issue. I noticed that there are two situations where timeouts cannot be proportional to the number of pages. This is because they happen before we have a PDF document, and thus before we can even calculate its size. Images should be fine, but office documents can be large:
dangerzone/container/dangerzone.py Lines 197 to 200 in 56b5b98
I tested converting a 500-slide document and it went past a minute on my system. I don't think [...] Another way to solve this is to make the timeout proportional to the document's file size, or simply to remove the timeout here.
I agree, that's why I suggested above that we can have a per-MB timeout. But we'll see...
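A per-MB timeout of the kind suggested above could look roughly like the following. This is only a sketch: `TIMEOUT_PER_MB`, `TIMEOUT_MIN`, and both function names are hypothetical, and the values are placeholders rather than anything taken from the Dangerzone code.

```python
import math
import os

# Hypothetical values, chosen for illustration only.
TIMEOUT_PER_MB = 10.0  # seconds allowed per (started) MiB of input
TIMEOUT_MIN = 30.0     # floor, so that tiny files still get a usable budget
MIB = 1024 * 1024


def timeout_for_size(size_bytes: int) -> float:
    """Return a timeout in seconds that grows with the input size."""
    # Round up so that a partially filled MiB counts as a whole one.
    size_mb = math.ceil(size_bytes / MIB)
    return max(TIMEOUT_MIN, size_mb * TIMEOUT_PER_MB)


def timeout_for_file(path: str) -> float:
    """Convenience wrapper that reads the size from the filesystem."""
    return timeout_for_size(os.path.getsize(path))
```

Because the file size is known before any conversion starts, this kind of timeout can be used for office documents even though their page count is not yet available.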
Introduce proportional timeouts in the container code, where the conversion logic runs. Previously, we had a single timeout for each command (120 seconds), which didn't scale well with either the number of pages in a document or the size of the document.

In this commit, we look into each operation and try to figure out the following:
1. What's the number of pages we will operate on?
2. How large is the document?

Knowing the above, we can break down a command into multiple operations, at least conceptually. Given a number of operations and a sane timeout value per operation (10 seconds), we can multiply the two and arrive at a timeout that fits the command better.

Refs #327
Introduce proportional timeouts in the container code, where the conversion logic runs. Previously, we had a single timeout for each command (120 seconds), which didn't scale well with either the number of pages in a document or the size of the document.

In this commit, we look into each operation and try to figure out the following:
1. What's the number of pages we will operate on?
2. How large is the document?

Knowing the above, we can break down a command into multiple operations, at least conceptually. Given a number of operations and a sane timeout value per operation (10 seconds), we can multiply the two and arrive at a timeout that fits the command better.

Fixes #306
Fixes #314
Refs #327
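The "number of operations times a per-operation timeout" idea from the commit message can be sketched as follows. The 10-second value comes from the commit message; how a command is broken down into operations is an assumption made here for illustration (one operation per page or per started MiB, whichever is larger), and the names `count_operations` and `calculate_timeout` are hypothetical.

```python
import math
from typing import Optional

TIMEOUT_PER_OP = 10.0  # seconds per operation, as quoted in the commit message
MIB = 1024 * 1024


def count_operations(size_bytes: int, num_pages: Optional[int] = None) -> int:
    """Estimate how many conceptual operations a command performs.

    Assumption: one operation per started MiB of input, or one per page
    when the page count is known, whichever is larger, and at least one.
    """
    ops = max(1, math.ceil(size_bytes / MIB))
    if num_pages is not None:
        ops = max(ops, num_pages)
    return ops


def calculate_timeout(size_bytes: int, num_pages: Optional[int] = None) -> float:
    """Multiply the operation count by the per-operation timeout."""
    return count_operations(size_bytes, num_pages) * TIMEOUT_PER_OP
```

Under these assumptions, a 3 MiB document with 50 pages gets 500 seconds, while the same document with an unknown page count falls back to the size-based 30 seconds.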
The Dangerzone container currently has two types of timeouts:
dangerzone/container/dangerzone.py
Lines 27 to 31 in 56b5b98
1. DEFAULT_TIMEOUT is a timeout that applies to any command that runs in a Dangerzone container.
2. COMPRESSION_TIMEOUT is a timeout that applies to a specific command (compressing a PDF) and that is proportional to the number of pages in the PDF.

The second type of timeout should be our end goal. The reason is that the first type of timeout is page-agnostic, meaning that operations on large PDFs may surpass it. There are several issues that exemplify this problem:
pdfunite times out on M1 Pro #314

The reason we're still stuck with the first type of timeout is that (until recently) we didn't have a way to know the number of pages of a PDF beforehand. Now that we do, we can do the following:
[...] .docx, .xlsx) will have to use a page-agnostic timeout.
[...] main branch but works on our new implementation.
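The two timeout types described in this issue, and the choice between them, can be sketched as follows. The 120-second value for `DEFAULT_TIMEOUT` is the per-command figure quoted in this thread; the 10-second per-page value for `COMPRESSION_TIMEOUT` is an assumption, and `pick_timeout` and `run_command` are hypothetical helpers, not functions from the Dangerzone code.

```python
import subprocess
import sys
from typing import List, Optional

DEFAULT_TIMEOUT = 120.0     # seconds, page-agnostic (value quoted in the thread)
COMPRESSION_TIMEOUT = 10.0  # seconds per page (assumed value)


def pick_timeout(num_pages: Optional[int]) -> float:
    """Use a page-proportional timeout when the page count is known."""
    if num_pages is None:
        # E.g. office documents, before a PDF exists to count pages in.
        return DEFAULT_TIMEOUT
    return COMPRESSION_TIMEOUT * max(1, num_pages)


def run_command(args: List[str], timeout: float) -> bool:
    """Run a command; return False if it exceeded the timeout."""
    try:
        # subprocess.run() kills the child and raises TimeoutExpired
        # once `timeout` seconds have elapsed.
        subprocess.run(args, check=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False
    return True


if __name__ == "__main__":
    # A 30-page PDF gets 300 seconds instead of the fixed 120.
    print(pick_timeout(30))
    print(run_command([sys.executable, "-c", "pass"], pick_timeout(None)))
```

With a fixed 120-second budget, a large PDF may time out even though each page is processed at a normal pace; the proportional variant scales the budget with the amount of work.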