From 307f899a4afa221bd90a0ae9dd9b6511bf0b949e Mon Sep 17 00:00:00 2001 From: Manfred Moser Date: Wed, 10 Jan 2024 15:36:15 -0800 Subject: [PATCH] Add docs for json_table Co-authored-by: Michael Eby --- docs/src/main/sphinx/functions/json.md | 216 +++++++++++++++++++++++++ 1 file changed, 216 insertions(+) diff --git a/docs/src/main/sphinx/functions/json.md b/docs/src/main/sphinx/functions/json.md index 2066d57998dd..7402dba514ba 100644 --- a/docs/src/main/sphinx/functions/json.md +++ b/docs/src/main/sphinx/functions/json.md @@ -1138,6 +1138,222 @@ FROM customers | 102 | 'missing' | | 103 | 'missing' | + +(json-table)= +## json_table + +The `json_table` table function extracts a table from a JSON value. Use this +function to transform JSON data into a relational format, making it easier to +query and analyze. + +```text +JSON_TABLE( + json_input, + json_path [ AS path_name ] + [ PASSING value AS parameter_name [, ...] ] + COLUMNS ( + column_definition [, ...] ) + [ PLAN ( json_table_specific_plan ) + | PLAN DEFAULT ( json_table_default_plan ) ] + [ { ERROR | EMPTY } ON ERROR ] +) +``` + +The `COLUMNS` clause supports the following `column_definition` arguments: + +```text +column_name FOR ORDINALITY +| column_name type + [ FORMAT JSON [ ENCODING { UTF8 | UTF16 | UTF32 } ] ] + [ PATH json_path ] + [ { WITHOUT | WITH { CONDITIONAL | UNCONDITIONAL } } [ ARRAY ] WRAPPER ] + [ { KEEP | OMIT } QUOTES [ ON SCALAR STRING ] ] + [ { ERROR | NULL | EMPTY { [ARRAY] | OBJECT } | DEFAULT expression } ON EMPTY ] + [ { ERROR | NULL | DEFAULT expression } ON ERROR ] +| NESTED [ PATH ] json_path [ AS path_name ] COLUMNS ( column_definition [, ...] ) +``` + +`json_input` is a character string or a binary string. It must contain a single +JSON item. + +`json_path` is a string literal containing the path mode specification and the +path expression. It follows the syntax rules described in +{ref}`json-path-syntax-and-semantics`. + +```text +'strict ($.price + $.tax)?(@ > 99.9)' +'lax $[0 to 1].floor()?(@ > 10)' +``` + +In the `PASSING` clause, pass values as named parameters that the `json_path` +expression can reference. + +```text +PASSING orders.totalprice AS O_PRICE, + orders.tax % 10 AS O_TAX +``` + +Use named parameters to reference the values in the path expression. Prefix +named parameters with `$`. + +```text +'lax $?(@.price > $O_PRICE || @.tax > $O_TAX)' +``` + +You can also pass JSON values in the `PASSING` clause. Use `FORMAT JSON` to +specify the format and `ENCODING` to specify the encoding: + +```text +PASSING orders.json_desc FORMAT JSON AS o_desc, + orders.binary_record FORMAT JSON ENCODING UTF16 AS o_rec +``` + +The `json_path` value is case-sensitive. The SQL identifiers are uppercase. Use +quoted identifiers in the `PASSING` clause: + +```text +'lax $.$KeyName' PASSING nation.name AS KeyName --> ERROR; no passed value found +'lax $.$KeyName' PASSING nation.name AS "KeyName" --> correct +``` + +The `PLAN` clause specifies how to join columns from different paths. Use +`OUTER` or `INNER` to define how to join parent paths with their child paths. +Use `CROSS` or `UNION` to join sibling. + +`COLUMNS` defines the schema of your table. Each `column_definition` specifies +how to extract and format your `json_input` value into a relational column. + +`PLAN` is an optional clause to control how to process and join nested JSON +data. + +`ON ERROR` specifies how to handle processing errors. `ERROR ON ERROR` throws an +error. `EMPTY ON ERROR` returns an empty result set. + +`column_name` specifies a column name. + +`FOR ORDINALITY` adds incremental row numbers to the table, starting at `1`. + +`NESTED PATH` extracts data from nested levels of a `json_input` value. Each +`NESTED PATH` clause can contain `column_definition` values. + +The `json_table` function returns a result set that you can use like any other +table in your queries. You can join the result set with other tables or +combine multiple arrays from your JSON data. + +You can also process nested JSON objects without parsing the data multiple +times. + +### Examples + +The following query uses `json_table` to extract values from an array of JSON +objects and return them as rows in a table with three columns: + +```sql +SELECT + * +FROM + json_table( + '[ + {"id":1,"name":"Africa","wikiDataId":"Q15"}, + {"id":2,"name":"Americas","wikiDataId":"Q828"}, + {"id":3,"name":"Asia","wikiDataId":"Q48"}, + {"id":4,"name":"Europe","wikiDataId":"Q51"} + ]', + 'strict $' COLUMNS ( + NESTED PATH 'strict $[*]' COLUMNS ( + id integer PATH 'strict $.id', + name varchar PATH 'strict $.name', + wiki_data_id varchar PATH 'strict $."wikiDataId"' + ) + ) + ); +``` + +| id | child | wiki_data_id | +| -- | --------- | ------------- | +| 1 | Africa | Q1 | +| 2 | Americas | Q828 | +| 3 | Asia | Q48 | +| 4 | Europe | Q51 | + +The following query uses `json_table` to extract values from an array of nested +JSON objects: + +```sql +SELECT + * +FROM + json_table( + '[ + {"continent": "Asia", "countries": [ + {"name": "Japan", "population": 125.7}, + {"name": "Thailand", "population": 71.6} + ]}, + {"continent": "Europe", "countries": [ + {"name": "France", "population": 67.4}, + {"name": "Germany", "population": 83.2} + ]} + ]', + 'lax $' COLUMNS ( + NESTED PATH 'lax $[*]' COLUMNS ( + continent varchar PATH 'lax $.continent', + NESTED PATH 'lax $.countries[*]' COLUMNS ( + country varchar PATH 'lax $.name', + population double PATH 'lax $.population' + ) + ) + )); +``` + +| continent | country | population | +| ---------- | --------- | ------------- | +| Asia | Japan | 125.7 | +| Asia | Thailand | 71.6 | +| Europe | France | 67.4 | +| Europe | Germany | 83.2 | + +The following query uses `PLAN` to specify an `OUTER` join between a parent path +and a child path: + +```sql +SELECT + * +FROM + JSON_TABLE( + '[]', + 'lax $' AS "root_path" + COLUMNS( + a varchar(1) PATH 'lax "A"', + NESTED PATH 'lax $[*]' AS "nested_path" + COLUMNS (b varchar(1) PATH 'lax "B"')) + PLAN ("root_path" OUTER "nested_path")) +``` + +| a | b | +| ---- | ---- | +| A | null | + +The following query uses `PLAN` to specify an `INNER` join between a parent path +and a child path: + +```sql +SELECT + * +FROM + JSON_TABLE( + '[]', + 'lax $' AS "root_path" + COLUMNS( + a varchar(1) PATH 'lax "A"', + NESTED PATH 'lax $[*]' AS "nested_path" + COLUMNS (b varchar(1) PATH 'lax "B"')) + PLAN ("root_path" INNER "nested_path")) +``` + +| a | b | +| ---- | ---- | +| null | null | + (json-array)= ## json_array