Abort/timeout long running requests #5241
Labels
bug
Something isn't working
Comments
PSeitz
added a commit
that referenced
this issue
Sep 9, 2024
On very large datasets the fixed timeouts are too low for some queries. This PR adds a setting to configure the timeout. Two settings are introduced:
- `request_timeout` on the node config
- `QW_REQUEST_TIMEOUT` env parameter

Currently there are two timeouts when doing a distributed search request, one from chitchat when opening a channel and one from the search client. The timeout is applied to both. (That means all chitchat connections have the same `request_timeout` applied, not only search nodes.)

Related: #5241
PSeitz
added a commit
that referenced
this issue
Sep 9, 2024
On very large datasets the fixed timeouts are too low for some queries. This PR adds a setting to configure the timeout. Two settings are introduced:
- `request_timeout` on the node config
- `QW_REQUEST_TIMEOUT` env parameter

Currently there are two timeouts when doing a distributed search request, one from quickwit cluster when opening a channel and one from the search client. The timeout is applied to both. (That means all chitchat connections have the same `request_timeout` applied, not only search nodes.)

Related: #5241
PSeitz
added a commit
that referenced
this issue
Sep 9, 2024
On very large datasets the fixed timeouts are too low for some queries. This PR adds a setting to configure the timeout. Two settings are introduced:
- `request_timeout` on the node config
- `QW_REQUEST_TIMEOUT` env parameter

Currently there are two timeouts when doing a distributed search request, one from quickwit cluster when opening a channel and one from the search client. The timeout is applied to both. (That means all cluster connections have the same `request_timeout` applied, not only search nodes.)

Related: #5241
PSeitz
added a commit
that referenced
this issue
Sep 17, 2024
* add request_timeout config

  On very large datasets the fixed timeouts are too low for some queries. This PR adds a setting to configure the timeout. Two settings are introduced:
  - `request_timeout` on the node config
  - `QW_REQUEST_TIMEOUT` env parameter

  Currently there are two timeouts when doing a distributed search request, one from quickwit cluster when opening a channel and one from the search client. The timeout is applied to both. (That means all cluster connections have the same `request_timeout` applied, not only search nodes.)

  Related: #5241
* move timeout to search config, add timeout tower layer
* cancel search after timeout
* use tokio::timeout
* use global timeoutlayer
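As a rough illustration of how the two settings named above could be combined into one effective timeout, here is a minimal sketch using only the Rust standard library. Several things in it are assumptions and not taken from the PR: the precedence (environment value over node config), the value format (whole seconds), the 30-second default, and the names `resolve_request_timeout` and `DEFAULT_REQUEST_TIMEOUT`.

```rust
use std::time::Duration;

/// Fallback used when neither setting is present.
/// Assumption: the real default is not stated in this thread.
pub const DEFAULT_REQUEST_TIMEOUT: Duration = Duration::from_secs(30);

/// Resolves the effective request timeout.
///
/// `env_value` is the raw content of `QW_REQUEST_TIMEOUT` (if set) and
/// `config_value` is the `request_timeout` from the node config (if set).
/// Assumption: the env value wins and is expressed in whole seconds.
pub fn resolve_request_timeout(
    env_value: Option<&str>,
    config_value: Option<Duration>,
) -> Result<Duration, String> {
    if let Some(raw) = env_value {
        // Reject anything that is not a non-negative integer number of seconds.
        let secs: u64 = raw
            .trim()
            .parse()
            .map_err(|_| format!("invalid QW_REQUEST_TIMEOUT value: {raw:?}"))?;
        // A zero timeout would make every request fail immediately.
        if secs == 0 {
            return Err("QW_REQUEST_TIMEOUT must be greater than zero".to_string());
        }
        return Ok(Duration::from_secs(secs));
    }
    // No env override: use the config value, or the assumed default.
    Ok(config_value.unwrap_or(DEFAULT_REQUEST_TIMEOUT))
}
```

A caller could pass `std::env::var("QW_REQUEST_TIMEOUT").ok().as_deref()` as the first argument. Keeping the environment lookup outside the function makes the resolution logic easy to test.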
Fixed by #5402
Currently, long-running requests may continue to consume a large amount of resources even though the original request has already timed out in an HTTP layer above. We should add a configurable timeout to abort such requests in Quickwit.
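To make the problem concrete, here is a minimal sketch of a timeout that also stops the underlying work instead of only giving up on waiting for it. It uses standard-library threads and a shared cancellation flag; this is a swapped-in technique for illustration, not what Quickwit does (the linked commits mention `tokio::timeout` and a tower timeout layer). The names `run_with_timeout`, `RunError`, and `scan_splits` are hypothetical.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq)]
pub enum RunError {
    /// The deadline elapsed before the work produced a result.
    TimedOut,
    /// The worker ended (for example by panicking) without sending a result.
    WorkerFailed,
}

/// Runs `work` on a separate thread and waits at most `timeout` for it.
/// On timeout the shared flag is set so that cooperative work can stop early
/// instead of consuming resources for a caller that has already given up.
pub fn run_with_timeout<T, F>(timeout: Duration, work: F) -> Result<T, RunError>
where
    T: Send + 'static,
    F: FnOnce(&AtomicBool) -> T + Send + 'static,
{
    let cancelled = Arc::new(AtomicBool::new(false));
    let worker_flag = Arc::clone(&cancelled);
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let result = work(&worker_flag);
        // The receiver may be gone after a timeout; ignoring the error is fine.
        let _ = tx.send(result);
    });
    match rx.recv_timeout(timeout) {
        Ok(value) => Ok(value),
        Err(mpsc::RecvTimeoutError::Timeout) => {
            cancelled.store(true, Ordering::Relaxed);
            Err(RunError::TimedOut)
        }
        Err(mpsc::RecvTimeoutError::Disconnected) => Err(RunError::WorkerFailed),
    }
}

/// Hypothetical stand-in for a long search: processes splits one by one and
/// checks the cancellation flag between splits. Returns the number processed.
pub fn scan_splits(num_splits: u32, per_split: Duration, cancelled: &AtomicBool) -> u32 {
    let mut processed = 0;
    for _ in 0..num_splits {
        if cancelled.load(Ordering::Relaxed) {
            break;
        }
        thread::sleep(per_split); // simulated per-split work
        processed += 1;
    }
    processed
}
```

The cancellation here is cooperative: work that never checks the flag keeps running after the caller has received `TimedOut`, which is the situation described above.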
There are two different timeouts configured:
Tonic Timeout
Tonic timeouts return errors like:
This timeout originates here:
Tower Timeout
There's another similar timeout, from tower
The tower timeout is defined here:
Timeout behavior also currently differs by dispatch type: local dispatch never times out.
Retries
The current behavior is to retry directly after a timeout (`cluster_client.rs::leaf_search`). This could add additional load on an overloaded node. The first step would be to properly transform the error type to keep the semantics.
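A sketch of what keeping the semantics could look like: the transport-level error is converted into a search-level error that still distinguishes a timeout from other failures, so the retry logic can decide per variant. All types and names here (`TransportError`, `LeafSearchError`, `leaf_search_with_retry`) are hypothetical and not Quickwit's actual types. Skipping the retry on timeout is one possible policy, chosen here only to illustrate the point about overloaded nodes.

```rust
/// Hypothetical stand-in for errors surfaced by the RPC layer.
#[derive(Debug, Clone, PartialEq)]
pub enum TransportError {
    DeadlineExceeded,
    Unavailable(String),
    Other(String),
}

/// Hypothetical search-level error that preserves the timeout semantics.
#[derive(Debug, Clone, PartialEq)]
pub enum LeafSearchError {
    Timeout,
    Unavailable(String),
    Internal(String),
}

impl From<TransportError> for LeafSearchError {
    fn from(err: TransportError) -> Self {
        match err {
            // Keep the timeout as its own variant instead of a generic error.
            TransportError::DeadlineExceeded => LeafSearchError::Timeout,
            TransportError::Unavailable(msg) => LeafSearchError::Unavailable(msg),
            TransportError::Other(msg) => LeafSearchError::Internal(msg),
        }
    }
}

impl LeafSearchError {
    /// Assumed policy: only retry when the node was unreachable. A timeout
    /// suggests the node may be overloaded, so retrying at once could add load.
    pub fn is_retryable(&self) -> bool {
        matches!(self, LeafSearchError::Unavailable(_))
    }
}

/// Calls `attempt` up to `max_attempts` times, stopping on success or on the
/// first non-retryable error. `attempt` receives the zero-based attempt index
/// and returns a hit count on success.
pub fn leaf_search_with_retry<F>(
    mut attempt: F,
    max_attempts: usize,
) -> Result<u64, LeafSearchError>
where
    F: FnMut(usize) -> Result<u64, TransportError>,
{
    let mut last_error = LeafSearchError::Internal("no attempt was made".to_string());
    for attempt_idx in 0..max_attempts {
        match attempt(attempt_idx) {
            Ok(hits) => return Ok(hits),
            Err(err) => {
                let err = LeafSearchError::from(err);
                if !err.is_retryable() {
                    return Err(err);
                }
                last_error = err;
            }
        }
    }
    Err(last_error)
}
```

With this shape, a timed-out leaf search is reported as `LeafSearchError::Timeout` after a single attempt, while an unreachable node is still retried up to `max_attempts` times.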