fix: pull queries available on /query rest & ws endpoint #3820
Conversation
fixes: confluentinc#3672 by providing an alternative way of issuing pull queries that does NOT log

Makes pull queries available on the `/query` RESTful and Websocket endpoints, in the same way that push queries are.

Note: this change does not _remove_ pull query support from the `/ksql` endpoint, nor does it switch the CLI over to use the `/query` endpoint. The CLI continues to use the `/ksql` endpoint for pull queries.

Push and pull queries to the `/query` rest endpoint now return the schema of the rows in the first message. This is required as the 'DESCRIBE' that the CLI was previously running to get column headers doesn't work for pull queries yet. (Known issue: confluentinc#3495). This is similar to the pattern used by the websocket endpoint, which also sends the schema in the first message.

In addition, I've hidden null fields and added a 'header' row to return the schema of the data. The output now looks like:

```json
[{"header":{"queryId":"someId","schema":"`USERID` STRING, `PAGEID` STRING, `VIEWTIME` BIGINT, `ROWKEY` STRING"}},
{"row":{"columns":["USER_1","PAGE_1",1,"1"]}},
{"row":{"columns":["USER_2","PAGE_2",2,"2"]}},
{"finalMessage":"Limit Reached"}]
```

BREAKING CHANGE: the response from the RESTful API for push queries has changed: it now returns a line with the schema and query id in a `header` field and null fields are not included in the payload. The CLI is backwards compatible with older versions of the server, though it won't output column headings from older versions.
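To illustrate how a client might turn the `schema` string from the `header` message into column headings, here is a minimal sketch. The class `HeaderSchemaSketch` and its `columnNames` method are hypothetical and not part of ksqlDB. The sketch assumes a flat schema of back-quoted names followed by simple types, as in the example above; nested types such as structs would need a real parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a ksqlDB class: pulls column names out of a header schema string.
final class HeaderSchemaSketch {

  // Matches one back-quoted column name, e.g. `USERID`.
  private static final Pattern QUOTED_NAME = Pattern.compile("`([^`]+)`");

  private HeaderSchemaSketch() {
  }

  // Assumption: the schema is flat, so every back-quoted token is a top-level column name.
  static List<String> columnNames(final String schema) {
    final List<String> names = new ArrayList<>();
    final Matcher matcher = QUOTED_NAME.matcher(schema);
    while (matcher.find()) {
      names.add(matcher.group(1));
    }
    return names;
  }

  public static void main(final String[] args) {
    final String schema =
        "`USERID` STRING, `PAGEID` STRING, `VIEWTIME` BIGINT, `ROWKEY` STRING";
    // Prints: USERID | PAGEID | VIEWTIME | ROWKEY
    System.out.println(String.join(" | ", columnNames(schema)));
  }
}
```

Against a server that sends no `header` message there is no schema string to parse, which matches the note that column headings are not shown for older servers.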
LGTM. Please get another +1 as I am not really familiar with the web socket code.
    output.write("\n".getBytes(StandardCharsets.UTF_8));
    output.flush();
  }

  private StreamedRow buildHeader() {
    // Push queries only return value columns, but query metadata schema includes key and meta:
By this comment, do you mean that you want the schema to contain only the value columns, but that they currently also return the meta columns, and that is what you are working on fixing?
Nope. The schema in the query metadata contains key and meta columns. But push queries only return value columns. Hence in this method I create a new schema with only the value columns.
The longer term fix is to store the right schema in the query metadata, but that's outside the scope of this PR.
The outstanding issue with schemas is for pull queries, not push. Pull queries return key columns, but the returned schema currently doesn't include this.
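The idea of deriving a value-only schema from the query metadata schema can be sketched roughly as follows. This is not the PR's `buildHeader()` code: the `Column` and `Namespace` types, the method names, and the sample columns are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these types are ksqlDB classes.
final class ValueSchemaSketch {

  // Which part of a row a column belongs to.
  enum Namespace { KEY, VALUE, META }

  static final class Column {
    final String name;
    final String type;
    final Namespace namespace;

    Column(final String name, final String type, final Namespace namespace) {
      this.name = name;
      this.type = type;
      this.namespace = namespace;
    }
  }

  private ValueSchemaSketch() {
  }

  // Keeps only the value columns, preserving their original order.
  static List<Column> valueColumnsOnly(final List<Column> metadataSchema) {
    final List<Column> valueColumns = new ArrayList<>();
    for (final Column column : metadataSchema) {
      if (column.namespace == Namespace.VALUE) {
        valueColumns.add(column);
      }
    }
    return valueColumns;
  }

  // Renders columns in the same style as the header's schema string.
  static String format(final List<Column> columns) {
    final List<String> parts = new ArrayList<>();
    for (final Column column : columns) {
      parts.add("`" + column.name + "` " + column.type);
    }
    return String.join(", ", parts);
  }

  public static void main(final String[] args) {
    // Illustrative metadata schema containing meta, key and value columns.
    final List<Column> metadataSchema = Arrays.asList(
        new Column("ROWTIME", "BIGINT", Namespace.META),
        new Column("ROWKEY", "STRING", Namespace.KEY),
        new Column("USERID", "STRING", Namespace.VALUE),
        new Column("PAGEID", "STRING", Namespace.VALUE)
    );
    // Prints: `USERID` STRING, `PAGEID` STRING
    System.out.println(format(valueColumnsOnly(metadataSchema)));
  }
}
```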
LGTM. Just one clarification.
@@ -316,18 +314,6 @@ private void handleStreamedQuery(
      final String query,
      final SqlBaseParser.QueryStatementContext ignored
  ) {
    final RestResponse<KsqlEntityList> explainResponse = restClient
Once we have #3495, is the plan to introduce some header to control whether the schema is part of the response or not? It may make sense for the CLI to always request the schema, but it may add a constant overhead for queries from real applications.
yeah parameter makes sense.
test this please
retest this please
I rekicked it and it still failed with checkstyle
* fix: pull queries available on `/query` rest & ws endpoint (cherry picked from commit e2321f5)
Description
fixes: #3672 by providing an alternative way of issuing pull queries that does NOT log
fixes: #3806
We should look to include #3819 as part of this fix.
Makes pull queries available on the `/query` RESTful and Websocket endpoints, in the same way that push queries are.
Note: this change does not remove pull query support from the `/ksql` endpoint, nor does it switch the CLI over to use the `/query` endpoint. The CLI continues to use the `/ksql` endpoint for pull queries.
Push and pull queries to the `/query` rest endpoint now return the schema of the rows in the first message.
This is required as the 'DESCRIBE' that the CLI was previously running to get column headers doesn't work for pull queries yet. (Known issue: #3495).
This is similar to the pattern used by the websocket endpoint, which also sends the schema in the first message.
In addition, I've hidden null fields and added a 'header' row to return the schema of the data. The output now looks like the JSON sample shown in the commit message above.
Note: With this PR the payload from the rest endpoint continues to be invalid JSON, just like it is for push queries. #3819 addresses this.
BREAKING CHANGE: the response from the RESTful API for push queries has changed: it now returns a line with the schema and query id in a `header` field and null fields are not included in the payload.
The CLI is backwards compatible with older versions of the server, though it won't output column headings from older versions.
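One way a client could stay compatible with both response shapes is to treat the `header` message as optional. The sketch below is only an illustration under that assumption: the `Message` type stands in for an already-decoded response message and is hypothetical, as are the `render` method and the `columns:` prefix. It is not the CLI's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a client that tolerates servers with and without a header message.
final class HeaderCompatSketch {

  // Stand-in for one decoded message; exactly one field is expected to be non-null.
  static final class Message {
    final String headerSchema;
    final List<Object> rowColumns;
    final String finalMessage;

    private Message(
        final String headerSchema,
        final List<Object> rowColumns,
        final String finalMessage
    ) {
      this.headerSchema = headerSchema;
      this.rowColumns = rowColumns;
      this.finalMessage = finalMessage;
    }

    static Message header(final String schema) {
      return new Message(schema, null, null);
    }

    static Message row(final Object... columns) {
      return new Message(null, Arrays.asList(columns), null);
    }

    static Message end(final String text) {
      return new Message(null, null, text);
    }
  }

  private HeaderCompatSketch() {
  }

  // Emits a heading line only when a header is present; rows are rendered either way.
  static List<String> render(final List<Message> messages) {
    final List<String> lines = new ArrayList<>();
    for (final Message message : messages) {
      if (message.headerSchema != null) {
        lines.add("columns: " + message.headerSchema);
      } else if (message.rowColumns != null) {
        lines.add(message.rowColumns.toString());
      } else if (message.finalMessage != null) {
        lines.add(message.finalMessage);
      }
    }
    return lines;
  }

  public static void main(final String[] args) {
    // Newer server: the first message carries the schema.
    final List<Message> newer = Arrays.asList(
        Message.header("`USERID` STRING, `VIEWTIME` BIGINT"),
        Message.row("USER_1", 1),
        Message.end("Limit Reached"));
    // Older server: no header, so no heading line is produced.
    final List<Message> older = Arrays.asList(
        Message.row("USER_1", 1),
        Message.end("Limit Reached"));
    render(newer).forEach(System.out::println);
    render(older).forEach(System.out::println);
  }
}
```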
Note: This PR does NOT address issue #3663. At the moment I have a feeling an admin client is still made per-request.
Outstanding tasks:
`/query` endpoint.
`/ksql` endpoint.
Testing done
Lots of testing, including manual
Reviewer checklist