Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When QueryContext is executed with a deadline, underlying Trino query does not get canceled #133

Open
justin-hu-ai opened this issue Jan 16, 2025 · 5 comments

Comments

@justin-hu-ai
Copy link

justin-hu-ai commented Jan 16, 2025

From this line here, I would expect that we send a request with "DELETE" to actually stop the underlying query. But we don't?

I am able to do this using my GRPC service:

	timeoutCtx, cancel := context.WithTimeout(ctx, 15*time.Second)
	defer cancel()

	rows, queryErr := s.buildAndExecutePaginatedFramesQuery(
		...
	)

       if queryErr != nil {
		// First check the context errors, then fall back to the query error.
		if timeoutCtx.Err() == context.DeadlineExceeded {
			return nil, status.Errorf(
				codes.DeadlineExceeded,
				"Request timed out",
			)
		}

		return nil, queryErr
	}

I verified that the DeadlineExceeded error is being returned from my RPC call, but when exec-ing into the shell and describing the system.runtime.queries table, my query remains in RUNNING for a bit. When attempting to run expensive queries, it feels like (and from the shell output) that the query continues to run even though timeout has been canceled.

@nineinchnick
Copy link
Member

To cancel the query you have to close the result set:

case <-ctx.Done():

You always should do that explicitly, even if you get an error when reading results. We're actually a bit too eager about this, and we still try to delete a query even if all results have been consumed correctly: #102

@justin-hu-ai
Copy link
Author

justin-hu-ai commented Jan 16, 2025

rows, queryErr = s.trinoDB.QueryContext(ctx, query, args...)

re: you saying

You always should do that explicitly, even if you get an error when reading results

so you're saying that rows could be non-nil if the error from QueryContext actually exists?

That doesn't make sense right, since here if we see an error, we return nil. This means we can't call Close() on the result set (the rows)

Unless you're referring to calling Close on something else entirely

@nineinchnick
Copy link
Member

Right, we should close it internally before returning an error.

@justin-hu-ai
Copy link
Author

justin-hu-ai commented Jan 16, 2025

Right, we should close it internally before returning an error.

Oh, so you're saying there is a problem in the library? Sorry I thought you were saying that I need to make a change on my side 😅

I can put up a PR to fix, if that is our conclusion here?

@nineinchnick
Copy link
Member

I wasn't sure initially if it's a problem in the driver, but you explained it well. If you could prepare a PR that would be awesome. We have tests for timeouts, but they don't check if the query was cancelled. We can start with adjusting them, or adding new tests, and then adding more cleanup before returning errors at specific moments of the query lifecycle.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants