docs: improved sql/dql/join in en #2238
Merged
12 commits
05ec84e docs: fixed format error in zh version (michelle-qinqin)
6672538 docs: change splice to join (michelle-qinqin)
1f9ab26 docs: updated the table in zh (michelle-qinqin)
3313198 docs: added sql examples and translated them (michelle-qinqin)
20657be docs: fixed the statement template on line 22 (michelle-qinqin)
c987150 docs: fixed syntax error and improve the example format (michelle-qinqin)
24ae84e docs: improved zh version (michelle-qinqin)
171d70f docs: modified the online serving to request (michelle-qinqin)
c05a34b Merge branch 'main' of https://github.com/michelle-qinqin/OpenMLDB in… (michelle-qinqin)
1ff2892 doc: fixed the problem in win and select overview (michelle-qinqin)
db26366 doc: capitalize the mode name (michelle-qinqin)
905c158 docs: improved line 116 (michelle-qinqin)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,12 +1,13 @@ | ||
# JOIN Clause | ||
|
||
OpenMLDB currently supports only one **JoinType** of `LAST JOIN`. | ||
OpenMLDB currently only supports `LAST JOIN`. | ||
|
||
LAST JOIN can be seen as a special kind of LEFT JOIN. On the premise that the JOIN condition is met, each row of the left table is spelled with a last row that meets the condition. LAST JOIN is divided into unsorted splicing and sorted splicing. | ||
`LAST JOIN` can be seen as a special kind of `LEFT JOIN`. On the premise that the JOIN condition is met, each row of the left table is joined with the last row of the right table that meets the condition. There are two types of `LAST JOIN`: unsorted join and sorted join. | ||
|
||
- Unsorted splicing refers to the direct splicing without sorting the right table. | ||
- Sorting and splicing refers to sorting the right table first, and then splicing. | ||
- The unsorted join will join two tables directly without sorting the right table. | ||
- The sorted join will sort the right table first, and then join two tables. | ||
|
||
Like `LEFT JOIN`, `LAST JOIN` returns all rows in the left table, even if there are no matched rows in the right table. | ||
## Syntax | ||
|
||
``` | ||
|
@@ -18,51 +19,227 @@ JoinType ::= 'LAST' | |
## SQL Statement Template | ||
|
||
```sql | ||
SELECT ... FROM table_ref LAST JOIN table_ref; | ||
SELECT ... FROM table_ref LAST JOIN table_ref ON expression; | ||
``` | ||
|
||
## Boundary Description | ||
## Description | ||
|
||
| SELECT statement elements | state | direction | | ||
| :------------- | --------------- | :----------------------------------------------------------- | | ||
| JOIN Clause | Only LAST JOIN is supported | Indicates that the data source multiple tables JOIN. OpenMLDB currently only supports LAST JOIN. During Online Serving, you need to follow [The usage specification of LAST JOIN under Online Serving](../deployment_manage/ONLINE_SERVING_REQUIREMENTS.md#online-serving usage specification of last-join) | | ||
| `SELECT` Statement Elements | Offline Mode | Online Preview Mode | Online Request Mode | Note | | ||
|:-----------------------------------------------------------|--------------|---------------------|---------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | ||
| JOIN Clause | **``✓``** | **``✓``** | **``✓``** | The Join clause indicates that the data source comes from multiple joined tables. OpenMLDB currently only supports LAST JOIN. For Online Request Mode, please follow [the specification of LAST JOIN under Online Request Mode](../deployment_manage/ONLINE_SERVING_REQUIREMENTS.md#online-servinglast-join) | | ||
|
||
### LAST JOIN without ORDER BY | ||
|
||
#### Example: **LAST JOIN Unsorted Concatenation** | ||
|
||
```sql | ||
-- desc: simple spelling query without ORDER BY | ||
|
||
SELECT t1.col1 as t1_col1, t2.col1 as t2_col2 from t1 LAST JOIN t2 ON t1.col1 = t2.col1 | ||
``` | ||
### LAST JOIN without ORDER BY | ||
|
||
When `LAST JOIN` is spliced without sorting, the first hit data row is spliced | ||
#### Example of the Computation Logic | ||
|
||
[Figure 7: last join without order] | ||
The unsorted `LAST JOIN` concatenates every row of the left table with the last matched row of the right table. | ||
|
||
[Figure 7: last join without order] | ||
|
||
Take the second row of the left table as an example, the right table that meets the conditions is unordered, there are 2 hit conditions, select the last one `5, b, 2020-05-20 10:11:12` | ||
|
||
Take the second row of the left table as an example. The right table is unordered, and there are 2 matched rows. The last one, `5, b, 2020-05-20 10:11:12`, will be joined with the second row of the left table. | ||
The final result is shown in the figure below. | ||
[Figure 8: last join without order result] | ||
|
||
The final result is shown in the figure above. | ||
```{note} | ||
To obtain the above JOIN result, please follow [the specification of LAST JOIN under Online Request mode](../deployment_manage/ONLINE_SERVING_REQUIREMENTS.md#online-servinglast-join), as in the SQL example below, even if you are using offline mode. | ||
Otherwise, you may not obtain the above result because the underlying storage order is not deterministic, although the result is still correct. | ||
``` | ||
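The unsorted matching rule can be modeled outside OpenMLDB with a minimal Python sketch. It is an illustration only, not how OpenMLDB executes the join: plain tuples `(id, col1, std_ts)` stand in for table rows, the helper name `last_join_unsorted` is hypothetical, and the right table is assumed to be stored in exactly the order of the list.

```python
# Illustrative model only: Python lists stand in for OpenMLDB tables.
# Rows are (id, col1, std_ts). The order of t2 is an assumed storage order.
t1 = [
    (1, "a", 20200520101112),
    (2, "b", 20200520101114),
    (3, "c", 20200520101116),
]
t2 = [
    (1, "a", 20200520101112),
    (2, "a", 20200520101113),
    (3, "b", 20200520101113),
    (4, "c", 20200520101114),
    (5, "b", 20200520101112),
    (6, "c", 20200520101113),
]


def last_join_unsorted(left, right):
    """Pair each left row with the last right row (in list order) sharing col1."""
    last_by_key = {}
    for row in right:
        last_by_key[row[1]] = row  # a later row overwrites an earlier one
    # .get() returns None when no right row matches, like a LEFT JOIN
    return [(left_row, last_by_key.get(left_row[1])) for left_row in left]


unsorted_result = last_join_unsorted(t1, t2)
for left_row, right_row in unsorted_result:
    print(left_row, right_row)
```

Under that storage-order assumption, the second left row (`b`) is paired with the right row whose `id` is 5. Reordering the `t2` list changes which row is picked, which is why the storage order matters for an unsorted `LAST JOIN`.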
|
||
### LAST JOIN with ORDER BY | ||
#### SQL Example | ||
|
||
#### Example: LAST JOIN Sorting And Splicing | ||
The following SQL commands created the left table t1 as mentioned above and inserted corresponding data. | ||
To make the results easy to check, it is recommended to create an index on `col1` and use `std_ts` as the timestamp. Creating t1 without an index also works, since it doesn't affect the concatenation in this case. | ||
```sql | ||
>CREATE TABLE t1 (id INT, col1 STRING,std_ts TIMESTAMP,INDEX(KEY=col1,ts=std_ts)); | ||
SUCCEED | ||
>INSERT INTO t1 values(1,'a',20200520101112); | ||
SUCCEED | ||
>INSERT INTO t1 values(2,'b',20200520101114); | ||
SUCCEED | ||
>INSERT INTO t1 values(3,'c',20200520101116); | ||
SUCCEED | ||
>SELECT * from t1; | ||
---- ------ ---------------- | ||
id col1 std_ts | ||
---- ------ ---------------- | ||
1 a 20200520101112 | ||
2 b 20200520101114 | ||
3 c 20200520101116 | ||
---- ------ ---------------- | ||
|
||
3 rows in set | ||
``` | ||
The following SQL commands created the right table t2 as mentioned above and inserted corresponding data. | ||
|
||
```{note} | ||
The storage order of data rows is not necessarily the same as their insert order. And the storage order will influence the matching order when JOIN. | ||
In this example, we want to realize the storage order of t2 as the above figure displayed, which will lead to a result that is convenient to check. | ||
Review comment: realize => implement |
||
To guarantee the storage order of t2, please create the following index, do not set `ts`, and insert the data sequentially, one row at a time. | ||
A detailed explanation is in [columnindex](https://openmldb.ai/docs/en/main/reference/sql/ddl/CREATE_TABLE_STATEMENT.html#columnindex). | ||
``` | ||
```sql | ||
>CREATE TABLE t2 (id INT, col1 string,std_ts TIMESTAMP,INDEX(KEY=col1)); | ||
SUCCEED | ||
>INSERT INTO t2 values(1,'a',20200520101112); | ||
SUCCEED | ||
>INSERT INTO t2 values(2,'a',20200520101113); | ||
SUCCEED | ||
>INSERT INTO t2 values(3,'b',20200520101113); | ||
SUCCEED | ||
>INSERT INTO t2 values(4,'c',20200520101114); | ||
SUCCEED | ||
>INSERT INTO t2 values(5,'b',20200520101112); | ||
SUCCEED | ||
>INSERT INTO t2 values(6,'c',20200520101113); | ||
SUCCEED | ||
>SELECT * from t2; | ||
---- ------ ---------------- | ||
id col1 std_ts | ||
---- ------ ---------------- | ||
2 a 20200520101113 | ||
1 a 20200520101112 | ||
5 b 20200520101112 | ||
3 b 20200520101113 | ||
6 c 20200520101113 | ||
4 c 20200520101114 | ||
---- ------ ---------------- | ||
|
||
6 rows in set | ||
``` | ||
The result of `SELECT` with `LAST JOIN` is shown below. | ||
```sql | ||
> SELECT * from t1 LAST JOIN t2 ON t1.col1 = t2.col1; | ||
---- ------ ---------------- ---- ------ ---------------- | ||
id col1 std_ts id col1 std_ts | ||
---- ------ ---------------- ---- ------ ---------------- | ||
1 a 20200520101112 2 a 20200520101113 | ||
2 b 20200520101114 5 b 20200520101112 | ||
3 c 20200520101116 6 c 20200520101113 | ||
---- ------ ---------------- ---- ------ ---------------- | ||
|
||
3 rows in set | ||
``` | ||
If you create t1 without an index, the result of `JOIN` is the same, but the order of the rows in the `SELECT` result is different. | ||
```sql | ||
> SELECT * from t1 LAST JOIN t2 ON t1.col1 = t2.col1; | ||
---- ------ ---------------- ---- ------ ---------------- | ||
id col1 std_ts id col1 std_ts | ||
---- ------ ---------------- ---- ------ ---------------- | ||
3 c 20200520101116 6 c 20200520101113 | ||
1 a 20200520101112 2 a 20200520101113 | ||
2 b 20200520101114 5 b 20200520101112 | ||
---- ------ ---------------- ---- ------ ---------------- | ||
|
||
3 rows in set | ||
``` | ||
|
||
```SQL | ||
-- desc: Simple spelling query with ORDER BY | ||
SELECT t1.col1 as t1_col1, t2.col1 as t2_col2 from t1 LAST JOIN t2 ORDER BY t2.std_ts ON t1.col1 = t2.col1 | ||
```{note} | ||
The execution of `LAST JOIN` can be optimized by an index. If an index matches the `ORDER BY` and the conditions in the `LAST JOIN` clause, its `ts` will be used as the implicit order for unsorted `LAST JOIN`. If no such index exists, the implicit order is the storage order. But the storage order of a table without an index is unpredictable. | ||
If the `ts` was not given when create index, OpenMLDB uses the time when the data was inserted as `ts`. | ||
Review comment: create => creating |
||
``` | ||
|
||
When `LAST JOIN` is configured with `Order By`, the right table is sorted by Order, and the last hit data row is spliced. | ||
|
||
|
||
### LAST JOIN with ORDER BY | ||
|
||
#### Example of the Computation Logic | ||
|
||
When `LAST JOIN` is configured with `ORDER BY`, the right table is sorted by the specified order, and the last matched data row will be joined. | ||
|
||
[Figure 9: last join with order] | ||
|
||
Taking the second row of the left table as an example, there are 2 items in the right table that meet the conditions. After sorting by `std_ts`, select the last item `3, b, 2020-05-20 10:11:13` | ||
Taking the second row of the left table as an example, there are 2 rows in the right table that meet the conditions. After sorting by `std_ts`, the last row `3, b, 2020-05-20 10:11:13` will be joined. | ||
|
||
[Figure 10: last join with order result] | ||
|
||
The final result is shown in the figure above. | ||
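The sorted variant can be sketched the same way. This is a minimal Python model under stated assumptions, not OpenMLDB's implementation: rows are tuples `(id, col1, std_ts)`, the helper name `last_join_sorted` is hypothetical, and ties on the sort key are not modeled.

```python
from operator import itemgetter

# Illustrative model only: Python lists stand in for OpenMLDB tables.
# Rows are (id, col1, std_ts).
t1 = [
    (1, "a", 20200520101112),
    (2, "b", 20200520101114),
    (3, "c", 20200520101116),
]
t2 = [
    (1, "a", 20200520101112),
    (2, "a", 20200520101113),
    (3, "b", 20200520101113),
    (4, "c", 20200520101114),
    (5, "b", 20200520101112),
    (6, "c", 20200520101113),
]


def last_join_sorted(left, right, order_index):
    """Pair each left row with the matching right row that sorts last."""
    result = []
    for left_row in left:
        # Collect the right rows whose col1 equals the left row's col1
        matches = [r for r in right if r[1] == left_row[1]]
        # Sort the matches by the ORDER BY column, then keep the last one
        matches.sort(key=itemgetter(order_index))
        result.append((left_row, matches[-1] if matches else None))
    return result


# order_index=2 corresponds to ORDER BY std_ts
sorted_result = last_join_sorted(t1, t2, order_index=2)
for left_row, right_row in sorted_result:
    print(left_row, right_row)
```

Because the matches are sorted before the last one is taken, the outcome no longer depends on the order of the `t2` list: the second left row (`b`) is paired with the right row whose `id` is 3, the one with the largest `std_ts`.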
|
||
#### SQL Example | ||
|
||
|
||
The following SQL commands created the left table t1 as mentioned above and inserted corresponding data. | ||
```SQL | ||
>CREATE TABLE t1 (id INT, col1 STRING,std_ts TIMESTAMP); | ||
SUCCEED | ||
>INSERT INTO t1 values(1,'a',20200520101112); | ||
SUCCEED | ||
>INSERT INTO t1 values(2,'b',20200520101114); | ||
SUCCEED | ||
>INSERT INTO t1 values(3,'c',20200520101116); | ||
SUCCEED | ||
>SELECT * from t1; | ||
---- ------ ---------------- | ||
id col1 std_ts | ||
---- ------ ---------------- | ||
1 a 20200520101112 | ||
2 b 20200520101114 | ||
3 c 20200520101116 | ||
---- ------ ---------------- | ||
|
||
3 rows in set | ||
``` | ||
The following SQL commands created the right table t2 as mentioned above and inserted corresponding data. | ||
|
||
```sql | ||
>CREATE TABLE t2 (id INT, col1 string,std_ts TIMESTAMP); | ||
SUCCEED | ||
>INSERT INTO t2 values(1,'a',20200520101112); | ||
SUCCEED | ||
>INSERT INTO t2 values(2,'a',20200520101113); | ||
SUCCEED | ||
>INSERT INTO t2 values(3,'b',20200520101113); | ||
SUCCEED | ||
>INSERT INTO t2 values(4,'c',20200520101114); | ||
SUCCEED | ||
>INSERT INTO t2 values(5,'b',20200520101112); | ||
SUCCEED | ||
>INSERT INTO t2 values(6,'c',20200520101113); | ||
SUCCEED | ||
>SELECT * from t2; | ||
---- ------ ---------------- | ||
id col1 std_ts | ||
---- ------ ---------------- | ||
2 a 20200520101113 | ||
1 a 20200520101112 | ||
5 b 20200520101112 | ||
3 b 20200520101113 | ||
6 c 20200520101113 | ||
4 c 20200520101114 | ||
---- ------ ---------------- | ||
|
||
6 rows in set | ||
``` | ||
The result of `SELECT` with `LAST JOIN` is shown below. | ||
```sql | ||
>SELECT * from t1 LAST JOIN t2 ORDER BY t2.std_ts ON t1.col1 = t2.col1; | ||
---- ------ ---------------- ---- ------ ---------------- | ||
id col1 std_ts id col1 std_ts | ||
---- ------ ---------------- ---- ------ ---------------- | ||
1 a 20200520101112 2 a 20200520101113 | ||
2 b 20200520101114 3 b 20200520101113 | ||
3 c 20200520101116 4 c 20200520101114 | ||
---- ------ ---------------- ---- ------ ---------------- | ||
``` | ||
|
||
### LAST JOIN with No Matched Rows | ||
The following example shows the result of LAST JOIN with no matched rows. | ||
|
||
Please insert a new row into t1 (created in [Example of LAST JOIN with ORDER BY](#LAST JOIN with ORDER BY)) as follows, then run the `LAST JOIN` command. | ||
|
||
```sql | ||
>INSERT INTO t1 values(4,'d',20220707111111); | ||
SUCCEED | ||
>SELECT * from t1 LAST JOIN t2 ORDER BY t2.std_ts ON t1.col1 = t2.col1; | ||
---- ------ ---------------- ------ ------ ---------------- | ||
id col1 std_ts id col1 std_ts | ||
---- ------ ---------------- ------ ------ ---------------- | ||
4 d 20220707111111 NULL NULL NULL | ||
3 c 20200520101116 4 c 20200520101114 | ||
1 a 20200520101112 2 a 20200520101113 | ||
2 b 20200520101114 3 b 20200520101113 | ||
---- ------ ---------------- ------ ------ ---------------- | ||
``` |
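The NULL padding for unmatched left rows can be modeled with a short Python sketch. It is an illustration under stated assumptions rather than OpenMLDB's behavior in full: dictionaries stand in for rows, `None` plays the role of SQL `NULL`, the names `last_join_with_nulls` and `RIGHT_COLUMNS` are hypothetical, and the order of the output rows is not modeled.

```python
# Illustrative model only: None plays the role of SQL NULL.
RIGHT_COLUMNS = ("id", "col1", "std_ts")

t1 = [
    {"id": 1, "col1": "a", "std_ts": 20200520101112},
    {"id": 2, "col1": "b", "std_ts": 20200520101114},
    {"id": 3, "col1": "c", "std_ts": 20200520101116},
    {"id": 4, "col1": "d", "std_ts": 20220707111111},  # has no match in t2
]
t2 = [
    {"id": 1, "col1": "a", "std_ts": 20200520101112},
    {"id": 2, "col1": "a", "std_ts": 20200520101113},
    {"id": 3, "col1": "b", "std_ts": 20200520101113},
    {"id": 4, "col1": "c", "std_ts": 20200520101114},
    {"id": 5, "col1": "b", "std_ts": 20200520101112},
    {"id": 6, "col1": "c", "std_ts": 20200520101113},
]


def last_join_with_nulls(left, right):
    """Sorted LAST JOIN on col1 that pads unmatched left rows with None."""
    rows = []
    for left_row in left:
        matches = sorted(
            (r for r in right if r["col1"] == left_row["col1"]),
            key=lambda r: r["std_ts"],
        )
        # Every left row is kept; a missing match becomes an all-None right row
        picked = matches[-1] if matches else dict.fromkeys(RIGHT_COLUMNS)
        rows.append((left_row, picked))
    return rows


padded_result = last_join_with_nulls(t1, t2)
for left_row, right_row in padded_result:
    print(left_row["id"], left_row["col1"], right_row["id"], right_row["col1"], right_row["std_ts"])
```

The row with `col1 = 'd'` still appears in the output, with every right-side column set to `None`, mirroring the `NULL` values in the query result.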
The physical storage order of rows is not necessarily the same as their insertion order. However, the storage order will affect the matching order when performing the LAST JOIN.