revise some more
Chittaranjan Prasad committed May 26, 2021
1 parent 98747cd commit 02c1540
Showing 1 changed file with 78 additions and 61 deletions.
139 changes: 78 additions & 61 deletions docs/developer-guide/ksqldb-reference/select-pull-query.md
@@ -15,40 +15,61 @@ Synopsis
```sql
SELECT select_expr [, ...]
FROM table
WHERE key_column=key [AND ...]
[AND window_bounds];
[ WHERE where_condition ]
[ AND window_bounds ];
```

Description
-----------

Pulls the current value from the materialized view and terminates. The result
of this statement isn't persisted in a {{ site.ak }} topic and is printed out
only in the console.

Pull queries enable you to fetch the current state of a materialized view.
Because materialized views are incrementally updated as new events arrive,
pull queries run with predictably low latency. They're a great match for
request/response flows. For asynchronous application flows, see
of this statement is not persisted in a {{ site.ak }} topic and is printed out
only in the console. Because materialized views are incrementally updated as new
events arrive, pull queries run with predictably low latency. They are a great match
for request/response flows. For asynchronous application flows, see
[Push Queries](select-push-query.md).

Execute a pull query by sending an HTTP request to the ksqlDB REST API, and
You can execute a pull query by sending an HTTP request to the ksqlDB REST API, and
the API responds with a single response.

The WHERE clause must contain a value for each primary-key column to retrieve and may
optionally include bounds on `WINDOWSTART` and `WINDOWEND` if the materialized view is windowed.
For more information, see
[Time and Windows in ksqlDB](../../concepts/time-and-windows-in-ksqldb-queries.md).

Example
-------
You can issue a pull query against a derived table that was created by using
the [CREATE TABLE AS SELECT](../../ksqldb-reference/create-table-as-select)
statement. First, create a table named `GRADES` by using a [CREATE TABLE](../../ksqldb-reference/create-table)
You can issue a pull query against any table that was created by using
a [CREATE TABLE AS SELECT](../../ksqldb-reference/create-table-as-select)
statement. Currently, we do not support pull queries against tables created
by using a [CREATE TABLE](../../ksqldb-reference/create-table) statement.

WHERE Clause Guidelines
-----------------------

By default, only key lookups are enabled. They have the following requirements:
- Key column(s) must use an equality comparison to a literal (e.g. `KEY = 'abc'`).
- On windowed tables, `WINDOWSTART` and `WINDOWEND` can be optionally compared to literals.
For more information on windowed tables, see [Time and Windows in ksqlDB](../../concepts/time-and-windows-in-ksqldb-queries.md).

You can enable table scans to loosen the restrictions on the `WHERE` clause or eliminate
the `WHERE` clause altogether. Table scans can be turned on for pull queries running in the
current CLI session with the command `SET 'ksql.query.pull.table.scan.enabled'='true';`.
They can also be enabled by default by setting a server configuration property with
`ksql.query.pull.table.scan.enabled=true`. Once table scans are enabled, the following
additional expressions are allowed:
- Key column(s) using range comparisons to literals.
- Non-key columns used alone, without key references.
- Columns compared to other columns.
- References to subsets of columns from a multi-column key.
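
As a rough sketch of two of the expression types listed above, the following CLI session enables table scans and then filters on columns other than the key. It uses the `TOP_TEN_RANKS` table from the Examples section and assumes that table retains the `ID` and `RANK` columns of `GRADES`; the threshold values are illustrative only.

```sql
-- Enable table scans for pull queries in the current CLI session.
SET 'ksql.query.pull.table.scan.enabled'='true';

-- Non-key column used alone, without a key reference.
SELECT * FROM TOP_TEN_RANKS
  WHERE RANK <= 3;

-- One column compared to another column.
SELECT * FROM TOP_TEN_RANKS
  WHERE ID = RANK;
```

Without the `SET` statement (or the equivalent server property), both queries should be rejected, because neither is a key lookup.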

!!! note
Table scan based queries are just the next incremental step for ksqlDB pull queries.
In future releases, we will continue pushing the envelope of new query capabilities and
greater performance and efficiency.

Examples
--------
Pull queries against a table `TOP_TEN_RANKS` created by using a
[CREATE TABLE AS SELECT](../../ksqldb-reference/create-table-as-select) statement:
First, create a table named `GRADES` by using a [CREATE TABLE](../../ksqldb-reference/create-table)
statement:
```sql
CREATE TABLE GRADES (ID INT PRIMARY KEY, GRADE STRING, RANK INT)
WITH (kafka_topic = 'test_topic', value_format = 'JSON', partitions = 1);
WITH (kafka_topic = 'test_topic', value_format = 'JSON', partitions = 4);
```
Then, create a derived table named `TOP_TEN_RANKS` by using a
[CREATE TABLE AS SELECT](../../ksqldb-reference/create-table-as-select) statement:
@@ -58,64 +79,60 @@ CREATE TABLE TOP_TEN_RANKS
FROM GRADES
WHERE RANK <= 10;
```
You can fetch the current state of your materialized view `TOP_TEN_RANKS` by using a pull query:
```sql
SELECT * FROM TOP_TEN_RANKS;
```
If you want to look up only the student with `ID = 5` in the materialized view:
If you want to look up only the student with `ID = 5` in the `TOP_TEN_RANKS` table using a pull query:
```sql
SELECT * FROM TOP_TEN_RANKS
WHERE ID = 5;
```
If the materialized view `pageviews_by_region` is windowed:
After enabling table scans, you can fetch the current state of your `TOP_TEN_RANKS` table using a pull query:
```sql
SELECT * FROM pageviews_by_region
WHERE regionId = 'Region_1'
AND 1570051876000 <= WINDOWSTART AND WINDOWEND <= 1570138276000;
SELECT * FROM TOP_TEN_RANKS;
```

If the `pageviews_by_region` table was created as an aggregation of multiple columns,
then each key column must be present in the WHERE clause. The following example shows how to
query the table if `countryId` and `regionId` were both key columns:

If you want to look up the students whose IDs lie in the range `(4, 8)`:
```sql
SELECT * FROM pageviews_by_region
WHERE countryId = 'USA' AND regionId = 'Region_1'
AND 1570051876000 <= WINDOWSTART AND WINDOWEND <= 1570138276000;
```

When writing logical expressions using `WINDOWSTART` or `WINDOWEND`, you can use ISO-8601
formatted datestrings to represent date times. For example, the previous
query is equivalent to the following:

```sql
SELECT * FROM pageviews_by_region
WHERE regionId = 'Region_1'
AND '2019-10-02T21:31:16' <= WINDOWSTART AND WINDOWEND <= '2019-10-03T21:31:16';
SELECT * FROM TOP_TEN_RANKS
WHERE ID > 4 AND ID < 8;
```

You can specify time zones within the datestring. For example,
`2017-11-17T04:53:45-0330` is in the Newfoundland time zone. If no time zone is
specified within the datestring, then timestamps are interpreted in the UTC
time zone.

If no bounds are placed on `WINDOWSTART` or `WINDOWEND`, rows are returned for all windows
in the windowed table.

You can also issue pull queries against materialized views that are created using
joining multiple tables:
Pull queries against a table `INNER_JOIN` that is created by joining multiple tables:
```sql
CREATE TABLE LEFT_TABLE (ID BIGINT PRIMARY KEY, NAME varchar, VALUE bigint)
WITH (kafka_topic='left_topic', value_format='json', partitions=4);
WITH (kafka_topic='left_topic', value_format='JSON', partitions=4);
```
```sql
CREATE TABLE RIGHT_TABLE (ID BIGINT PRIMARY KEY, F1 varchar, F2 bigint)
WITH (kafka_topic='right_topic', value_format='json', partitions=4);
WITH (kafka_topic='right_topic', value_format='JSON', partitions=4);
```
```sql
CREATE TABLE INNER_JOIN AS SELECT L.ID, NAME, VALUE, F1, F2 FROM LEFT_TABLE L JOIN RIGHT_TABLE R ON L.ID = R.ID;
```
You can fetch the current state of your materialized view `INNER_JOIN` by using a pull query:
You can fetch the current state of your table `INNER_JOIN` by using a pull query:
```sql
SELECT * FROM INNER_JOIN [ WHERE where_condition ];
```

Pull queries against a windowed table `NUMBER_OF_TESTS` created by aggregating a stream `STUDENTS`:
```sql
CREATE STREAM STUDENTS (ID STRING KEY, SCORE INT)
WITH (kafka_topic='students_topic', value_format='JSON', partitions=4);
```
```sql
CREATE TABLE NUMBER_OF_TESTS AS
SELECT ID, COUNT(1) AS COUNT
FROM STUDENTS
WINDOW TUMBLING(SIZE 1 SECOND)
GROUP BY ID;
```
Look up the number of tests taken by a student with `ID='10'`:
```sql
SELECT *
FROM NUMBER_OF_TESTS
WHERE ID='10';
```
Look up the number of tests taken by a student with `ID='10'`
in the window range `100 <= WindowStart AND WindowEnd <= 16000`:
```sql
SELECT *
FROM NUMBER_OF_TESTS
WHERE ID='10' AND 100 <= WindowStart AND WindowEnd <= 16000;
```
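
Because the comparisons on `WINDOWSTART` and `WINDOWEND` are optional, a lookup can presumably bound only one side of the window range. The following is a minimal sketch, assuming the `NUMBER_OF_TESTS` table above and numeric window bounds in epoch milliseconds, as in the previous query; the value `100` is illustrative.

```sql
-- Key lookup with only a lower bound on the window start.
SELECT *
  FROM NUMBER_OF_TESTS
  WHERE ID='10' AND WINDOWSTART >= 100;
```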
