diff --git a/README.md b/README.md index a94bd5fadf39b..8b9a98118540d 100644 --- a/README.md +++ b/README.md @@ -68,7 +68,7 @@ - [Prepared SQL Statement Syntax](sql/prepare.md) - [Utility Statements](sql/util.md) - [TiDB SQL Syntax Diagram](https://pingcap.github.io/sqlgram/) - - [JSON Functions and Generated Column](sql/json-functions-generated-column.md) + - [Generated Columns](sql/generated-columns.md) - [Connectors and APIs](sql/connection-and-APIs.md) - [TiDB Transaction Isolation Levels](sql/transaction-isolation.md) - [Error Codes and Troubleshooting](sql/error.md) diff --git a/sql/generated-columns.md b/sql/generated-columns.md new file mode 100644 index 0000000000000..3d30525e93637 --- /dev/null +++ b/sql/generated-columns.md @@ -0,0 +1,69 @@ +--- +title: Generated Columns +summary: Learn how to use generated columns +category: user guide +--- + +# Generated Columns + +TiDB supports generated columns as part of MySQL 5.7 compatibility. One of the primary use cases for generated columns is to extract data out of a JSON data type and enable it to be indexed. + +## Index JSON using generated column + +In both MySQL 5.7 and TiDB, columns of type JSON can not be indexed directly. i.e. The following table structure is **not supported**: + +```sql +CREATE TABLE person ( + id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, + name VARCHAR(255) NOT NULL, + address_info JSON, + KEY (address_info) +); +``` + +In order to index a JSON column, you must first extract it as a generated column. Using the `city` generated column as an example, you are then able to add an index: + +```sql +CREATE TABLE person ( + id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, + name VARCHAR(255) NOT NULL, + address_info JSON, + city VARCHAR(64) AS (JSON_UNQUOTE(JSON_EXTRACT(address_info, '$.city'))) VIRTUAL, + KEY (city) +); +``` + +In this table, the `city` column is a **generated column**. As the name implies, the column is generated from other columns in the table, and cannot be assigned a value when inserted or updated. The column is also _virtual_ in that it does not require any storage or memory, and is generated on demand. The index on `city` however is _stored_ and uses the same structure as other indexes of the type `varchar(64)`. + +You can use the index on the generated column in order to speed up the following statement: + +```sql +SELECT name, id FROM person WHERE city = 'Beijing'; +``` + +If no data exists at path `$.city`, `JSON_EXTRACT` returns `NULL`. If you want to enforce a constraint that `city` must be `NOT NULL`, you can define the virtual column as follows: + +```sql +CREATE TABLE person ( + id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, + name VARCHAR(255) NOT NULL, + address_info JSON, + city VARCHAR(64) AS (JSON_UNQUOTE(JSON_EXTRACT(address_info, '$.city'))) VIRTUAL NOT NULL, + KEY (city) +); +``` + +Both `INSERT` and `UPDATE` statements check virtual column definitions. Rows that do not pass validation return errors: + +```sql +mysql> INSERT INTO person (name, address_info) VALUES ('Morgan', JSON_OBJECT('Country', 'Canada')); +ERROR 1048 (23000): Column 'city' cannot be null +``` + +## Limitations + +The current limitations of JSON and generated columns are as follows: + +- You cannot add the generated column in the storage type of `STORED` through `ALTER TABLE`. +- You cannot create an index on the generated column through `ALTER TABLE`. +- Not all [JSON functions](json-functions.md) are supported. diff --git a/sql/json-functions-generated-column.md b/sql/json-functions-generated-column.md deleted file mode 100644 index 0fa72c7569789..0000000000000 --- a/sql/json-functions-generated-column.md +++ /dev/null @@ -1,122 +0,0 @@ ---- -title: JSON Functions and Generated Column -summary: Learn how to use JSON functions and generated column to handle scenarios with uncertain schema. -category: user guide ---- - -# JSON Functions and Generated Column - -## About - -To be compatible with MySQL 5.7 or later and better support the document store, TiDB supports JSON in the latest version. In TiDB, a document is a set of Key-Value pairs, encoded as a JSON object. You can use the JSON datatype in a TiDB table and create indexes for the JSON document fields using generated columns. In this way, you can flexibly deal with the business scenarios with uncertain schema and are no longer limited by the read performance and the lack of support for transactions in traditional document databases. - -## JSON functions - -The support for JSON in TiDB mainly refers to the user interface of MySQL 5.7. For example, you can create a table that includes a JSON field to store complex information: - -```sql -CREATE TABLE person ( - id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, - name VARCHAR(255) NOT NULL, - address_info JSON -); -``` - -When you insert data into a table, you can deal with those data with uncertain schema like this: - -```sql -INSERT INTO person (name, address_info) VALUES ("John", '{"city": "Beijing"}'); -``` - -You can insert JSON data into the table by inserting a legal JSON string into the column corresponding to the JSON field. TiDB will then parse the text and save it in a more compact and easy-to-access binary form. - -You can also convert other data type into JSON using CAST: - -```sql -INSERT INTO person (name, address_info) VALUES ("John", CAST('{"city": "Beijing"}' AS JSON)); -INSERT INTO person (name, address_info) VALUES ("John", CAST('123' AS JSON)); -INSERT INTO person (name, address_info) VALUES ("John", CAST(123 AS JSON)); -``` - -Now, if you want to query all the users living in Beijing from the table, you can simply use the following SQL statement: - -```sql -SELECT id, name FROM person WHERE JSON_EXTRACT(address_info, '$.city') = 'Beijing'; -``` - -TiDB supports the `JSON_EXTRACT` function which is exactly the same as in MySQL. The function is to extract the `city` field from the `address_info` document. The second argument is a "path expression" and is used to specify which field to extract. See the following few examples to help you understand the "path expression": - -```sql -SET @person = '{"name":"John","friends":[{"name":"Forest","age":16},{"name":"Zhang San","gender":"male"}]}'; - -SELECT JSON_EXTRACT(@person, '$.name'); -- gets "John" -SELECT JSON_EXTRACT(@person, '$.friends[0].age'); -- gets 16 -SELECT JSON_EXTRACT(@person, '$.friends[1].gender'); -- gets "male" -SELECT JSON_EXTRACT(@person, '$.friends[2].name'); -- gets NULL -``` - -In addition to inserting and querying data, TiDB also supports editing JSON. In general, TiDB currently supports the following JSON functions in MySQL 5.7: - -- [JSON_EXTRACT](https://dev.mysql.com/doc/refman/5.7/en/json-search-functions.html#function_json-extract) -- [JSON_ARRAY](https://dev.mysql.com/doc/refman/5.7/en/json-creation-functions.html#function_json-array) -- [JSON_OBJECT](https://dev.mysql.com/doc/refman/5.7/en/json-creation-functions.html#function_json-object) -- [JSON_SET](https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-set) -- [JSON_REPLACE](https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-replace) -- [JSON_INSERT](https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-insert) -- [JSON_REMOVE](https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-remove) -- [JSON_TYPE](https://dev.mysql.com/doc/refman/5.7/en/json-attribute-functions.html#function_json-type) -- [JSON_UNQUOTE](https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-unquote) -- [JSON_MERGE](https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-merge) -- [JSON_CONTAINS](https://dev.mysql.com/doc/refman/5.7/en/json-search-functions.html#function_json-contains) -- [JSON_CONTAINS_PATH](https://dev.mysql.com/doc/refman/5.7/en/json-search-functions.html#function_json-contains-path) -- [JSON_LENGTH](https://dev.mysql.com/doc/refman/5.7/en/json-attribute-functions.html#function_json-length) - -You can get the general use of these functions directly from the function name. These functions in TiDB behave the same as in MySQL 5.7. For more information, see the [JSON Functions document of MySQL 5.7](https://dev.mysql.com/doc/refman/5.7/en/json-functions.html). If you are a user of MySQL 5.7, you can migrate to TiDB seamlessly. - -Currently TiDB does not support all the JSON functions in MySQL 5.7. You can track our progress in adding this functionality in [TiDB #7546](https://github.com/pingcap/tidb/issues/7546). - -## Index JSON using generated column - -The full table scan is executed when you query a JSON field. When you run the `EXPLAIN` statement in TiDB, the results show that it is full table scan. Then, can you index the JSON field? - -First, this type of index is wrong: - -```sql -CREATE TABLE person ( - id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, - name VARCHAR(255) NOT NULL, - address_info JSON, - KEY (address_info) -); -``` - -This is not because of technical impossibility but because the direct comparison of JSON itself is meaningless. Although we can agree on some comparison rules, such as `ARRAY` is bigger than all `OBJECT`, it is useless. Therefore, as what is done in MySQL 5.7, TiDB prohibits the direct creation of index on JSON field, but you can index the fields in the JSON document in the form of generated column: - -```sql -CREATE TABLE person ( - id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, - name VARCHAR(255) NOT NULL, - address_info JSON, - city VARCHAR(64) AS (JSON_UNQUOTE(JSON_EXTRACT(address_info, '$.city'))) VIRTUAL, - KEY (city) -); -``` - -In this table, the `city` column is a **generated column**. As the name implies, the column is generated by other columns in the table, and cannot be assigned a value when inserted or updated. For generating a column, you can specify it as `VIRTUAL` to prevent it from being explicitly saved in the record, but by other columns when needed. This is particularly useful when the column is wide and you need to save storage space. With this generated column, you can create an index on it, and it looks the same with other regular columns. In query, you can run the following statements: - -```sql -SELECT name, id FROM person WHERE city = 'Beijing'; -``` - -In this way, you can create an index. - -> **Note**: In the JSON document, if the field in the specified path does not exist, the result of `JSON_EXTRACT` will be `NULL`. The value of the generated column with index is also `NULL`. If this is not what you want to see, you can add a `NOT NULL` constraint on the generated column. In this way, when the value of the `city` field is `NULL` after you insert data, it can be detected. - -## Limitations - -The current limitations of JSON and generated column are as follows: - -- You cannot add the generated column in the storage type of `STORED` through `ALTER TABLE`. -- You cannot create an index on the generated column through `ALTER TABLE`. - -The above functions and some other JSON functions are under development. diff --git a/sql/json-functions.md b/sql/json-functions.md index ee52706900a77..b9ae307a4bab6 100644 --- a/sql/json-functions.md +++ b/sql/json-functions.md @@ -6,22 +6,64 @@ category: user guide # JSON Functions -| Function Name and Syntactic Sugar | Description | -| ---------- | ------------------ | -| [JSON_EXTRACT(json_doc, path[, path] ...)][json_extract]| Return data from a JSON document, selected from the parts of the document matched by the `path` arguments | -| [JSON_UNQUOTE(json_val)][json_unquote] | Unquote JSON value and return the result as a `utf8mb4` string | -| [JSON_TYPE(json_val)][json_type] | Return a `utf8mb4` string indicating the type of a JSON value | -| [JSON_SET(json_doc, path, val[, path, val] ...)][json_set] | Insert or update data in a JSON document and return the result | -| [JSON_INSERT(json_doc, path, val[, path, val] ...)][json_insert] | Insert data into a JSON document and return the result | -| [JSON_REPLACE(json_doc, path, val[, path, val] ...)][json_replace] | Replace existing values in a JSON document and return the result | -| [JSON_REMOVE(json_doc, path[, path] ...)][json_remove] | Remove data from a JSON document and return the result | -| [JSON_MERGE(json_doc, json_doc[, json_doc] ...)][json_merge] | Merge two or more JSON documents and return the merged result | -| [JSON_OBJECT(key, val[, key, val] ...)][json_object] | Evaluate a (possibly empty) list of key-value pairs and return a JSON object containing those pairs | -| [JSON_ARRAY([val[, val] ...])][json_array] | Evaluate a (possibly empty) list of values and return a JSON array containing those values | -| -> | Return value from JSON column after evaluating path; the syntactic sugar of `JSON_EXTRACT(doc, path_literal)` | -| ->> | Return value from JSON column after evaluating path and unquoting the result; the syntactic sugar of `JSON_UNQUOTE(JSONJSON_EXTRACT(doc, path_literal))` | +TiDB supports most of the JSON functions that shipped with the GA release of MySQL 5.7. Additional JSON functions were added to MySQL 5.7 after its release, and not all are available in TiDB (see [unsupported functions](#unsupported-functions)). + +## Functions that create JSON values + +| Function Name and Syntactic Sugar | Description | +| --------------------------------- | ----------- | +| [JSON_ARRAY([val[, val] ...])][json_array] | Evaluates a (possibly empty) list of values and returns a JSON array containing those values | +| [JSON_OBJECT(key, val[, key, val] ...)][json_object] | Evaluates a (possibly empty) list of key-value pairs and returns a JSON object containing those pairs | + +## Functions that search JSON values + +| Function Name and Syntactic Sugar | Description | +| --------------------------------- | ----------- | +| [JSON_CONTAINS(target, candidate[, path])][json_contains] | Indicates by returning 1 or 0 whether a given candidate JSON document is contained within a target JSON document | +| [JSON_CONTAINS_PATH(json_doc, one_or_all, path[, path] ...)][json_contains_path] | Returns 0 or 1 to indicate whether a JSON document contains data at a given path or paths | +| [JSON_EXTRACT(json_doc, path[, path] ...)][json_extract]| Returns data from a JSON document, selected from the parts of the document matched by the `path` arguments | +| [->][json_short_extract] | Returns the value from a JSON column after the evaluating path; the syntactic sugar of `JSON_EXTRACT(doc, path_literal)` | +| [->>][json_short_extract_unquote] | Returns the value from a JSON column after the evaluating path and unquoting the result; the syntactic sugar of `JSON_UNQUOTE(JSON_EXTRACT(doc, path_literal))` | + +## Functions that modify JSON values + +| Function Name and Syntactic Sugar | Description | +| --------------------------------- | ----------- | +| [JSON_INSERT(json_doc, path, val[, path, val] ...)][json_insert] | Inserts data into a JSON document and returns the result | +| [JSON_MERGE(json_doc, json_doc[, json_doc] ...)][json_merge] | Merges two or more JSON documents and returns the merged result | +| [JSON_REMOVE(json_doc, path[, path] ...)][json_remove] | Removes data from a JSON document and returns the result | +| [JSON_REPLACE(json_doc, path, val[, path, val] ...)][json_replace] | Replaces existing values in a JSON document and returns the result | +| [JSON_SET(json_doc, path, val[, path, val] ...)][json_set] | Inserts or updates data in a JSON document and returns the result | +| [JSON_UNQUOTE(json_val)][json_unquote] | Unquotes a JSON value and returns the result as a string | + +## Functions that return JSON value attributes + +| Function Name and Syntactic Sugar | Description | +| --------------------------------- | ----------- | +| [JSON_LENGTH(json_doc[, path])][json_length] | Returns the length of a JSON document, or, if a path argument is given, the length of the value within the path | +| [JSON_TYPE(json_val)][json_type] | Returns a string indicating the type of a JSON value | + +## Unsupported functions + +The following JSON functions are unsupported in TiDB. You can track the progress in adding them in [TiDB #7546](https://github.com/pingcap/tidb/issues/7546): + +* `JSON_APPEND` and its alias `JSON_ARRAY_APPEND` +* `JSON_ARRAY_INSERT` +* `JSON_DEPTH` +* `JSON_KEYS` +* `JSON_MERGE_PATCH` +* `JSON_MERGE_PRESERVE`, use the alias `JSON_MERGE` instead +* `JSON_PRETTY` +* `JSON_QUOTE` +* `JSON_SEARCH` +* `JSON_STORAGE_SIZE` +* `JSON_VALID` +* `JSON_ARRAYAGG` +* `JSON_OBJECTAGG` [json_extract]: https://dev.mysql.com/doc/refman/5.7/en/json-search-functions.html#function_json-extract +[json_short_extract]: https://dev.mysql.com/doc/refman/5.7/en/json-search-functions.html#operator_json-column-path +[json_short_extract_unquote]: https://dev.mysql.com/doc/refman/5.7/en/json-search-functions.html#operator_json-inline-path [json_unquote]: https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-unquote [json_type]: https://dev.mysql.com/doc/refman/5.7/en/json-attribute-functions.html#function_json-type [json_set]: https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-set @@ -31,3 +73,10 @@ category: user guide [json_merge]: https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-merge [json_object]: https://dev.mysql.com/doc/refman/5.7/en/json-creation-functions.html#function_json-object [json_array]: https://dev.mysql.com/doc/refman/5.7/en/json-creation-functions.html#function_json-array +[json_keys]: https://dev.mysql.com/doc/refman/5.7/en/json-search-functions.html#function_json-keys +[json_length]: https://dev.mysql.com/doc/refman/5.7/en/json-attribute-functions.html#function_json-length +[json_valid]: https://dev.mysql.com/doc/refman/5.7/en/json-attribute-functions.html#function_json-valid +[json_quote]: https://dev.mysql.com/doc/refman/5.7/en/json-creation-functions.html#function_json-quote +[json_contains]: https://dev.mysql.com/doc/refman/5.7/en/json-search-functions.html#function_json-contains +[json_contains_path]: https://dev.mysql.com/doc/refman/5.7/en/json-search-functions.html#function_json-contains-path +[json_arrayagg]: https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_json-arrayagg