-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SQL: Introduce an async querying mode for SQL #73267
Conversation
This adds an async query mode to SQL. It (re)uses the same request and response async-specific EQL object parameters. Also similar to EQL, the running search task can have its state monitored and canceled and its results stored and deleted, with intermediary responses not supported (the entire result is available once search finished). The initial query and subsequent pagination/scrolling requests will both be started in the async mode.
@elasticmachine ok to test |
Note: text formatting not yet supported, that will be applied with a subsequent PR. |
Pinging @elastic/es-ql (Team:QL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM in general. I think we can move some classes that we moved to xpack.ql.async
all the way to xpack.core.async
since ML expressed desire to also use them.
@@ -300,6 +303,7 @@ | |||
"cluster:monitor/xpack/searchable_snapshots/stats", | |||
"cluster:monitor/xpack/security/saml/metadata", | |||
"cluster:monitor/xpack/spatial/stats", | |||
SQL_ASYNC_GET_STATUS_ACTION_NAME, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main reason is that the list is generated when the test is first introduced, i.e. the test reports all missing names when this list is empty. Also a few other reasons:
- Many names are not available to this test, e.g. autoscaling, enrich, grok processor, reindex
- Some names are computed, e.g. all the xpack usage actions and shard level actions (
[s]
) and quite a few others - I find a sorted list of consistent literal strings easier to read and useful to scan through
- Sometimes it is useful to repeat the literal string in tests which can itself act as a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, it sounds like we have several good enough reasons to continue using strings here instead of constants, and we should probably continue doing it in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced the var with the string (and added the var location as a comment).
@@ -5,7 +5,7 @@ | |||
* 2.0. | |||
*/ | |||
|
|||
package org.elasticsearch.xpack.eql.async; | |||
package org.elasticsearch.xpack.ql.async; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep this generic and move it into org.elasticsearch.xpack.core.async so ML could use it. The same for StoredAsyncTask and
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @imotov!
Will wait one more review and depending on the amount of changes required will move the classes either part of this, or of a next PR that I have to open anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two classes have been moved.
@@ -95,4 +95,15 @@ protected synchronized void onFailure(Exception e) { | |||
public void cancelTask(TaskManager taskManager, Runnable runnable, String reason) { | |||
taskManager.cancelTaskAndDescendants(this, reason, true, ActionListener.wrap(runnable)); | |||
} | |||
|
|||
public static QlStatusResponse getStatusResponse(StoredAsyncTask<?> asyncTask) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would introduce it one level above so we can keep this task generic and usable by ML.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Nice work! Left a couple of minor comments.
Long startTimeMillis, | ||
long expirationTimeMillis, | ||
RestStatus completionStatus) { | ||
public interface AsyncStatus { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imho, I would extract this to its own file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the interface tightly bound to the outer class (i.e. no independent applicability), that's why I was also using it as a prefix wherever the interface is used. But if you think it'd be better practice to extract it, I can still rename it and do it.
(Thanks for this long-PR review!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll leave it up to you, not insisting, it's just that from the name AsyncStatus
it seems like a generic iface and could also belong to a separate file.
testCase("user2", "user1"); | ||
} | ||
|
||
private void testCase(String user, String other) throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, to make it more clear:
private void testCase(String user, String other) throws Exception { | |
private void testCase(String user, String otherUser) throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally I've tried to keep as close to the original EQL tests as possible (since many of them can't be deduplicated EQL-SQL). But this is small, I'll change it.
- move StoredAsyncResponse and StoredAsyncTask classes from ql to core.async packages. - rename function parameter.
Replace constant with hardcoded string.
jenkins re-test this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with some minor comments.
builder.field("id", id); | ||
builder.field("is_running", isRunning); | ||
builder.field("is_partial", isPartial); | ||
if (startTimeMillis != null) { // start time is available only for a running eql search |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eql
-> eql/sql
?
builder.timeField("start_time_in_millis", "start_time", startTimeMillis); | ||
} | ||
builder.timeField("expiration_time_in_millis", "expiration_time", expirationTimeMillis); | ||
if (isRunning == false) { // completion status is available only for a completed eql search |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eql
-> eql/sql
?
There are, also, other eql
mentions in the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, will fix them with the following PR.
@@ -144,6 +145,10 @@ public void query(List<Attribute> output, QueryContainer query, String index, Ac | |||
l = new ScrollActionListener(listener, client, cfg, output, query); | |||
} | |||
|
|||
if (cfg.task() != null && cfg.task().isCancelled()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why checking for a cancelled task here (before the actual search and after the listener
flavor is created) and not at the start of the method? Does it matter what kind of listener has its onFailure
method called for a TaskCancelledException
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a couple of places where the task is checked against being canceled within SQL, both placed before "leaving" SQL:
- before resolving the index;
- here, before running the search.
It makes sense to do it there because (1) compared to the distributed part, the SQL code will run quick; and (2) there's anyways a race between the cancelation and task execution -- one could add more checkpoints throughout SQL, but it won't practically improve much (like reactivity to the cancelation).
This adds an async query mode to SQL.
It (re)uses the same request and response async-specific EQL object
parameters.
Also similar to EQL, the running search task can have its state
monitored and canceled and its results stored and deleted, with
intermediary responses not supported (the entire result is available
once search finished).
The initial query and subsequent pagination/scrolling requests will both
be started in the async mode.