Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support RANGE INTERVAL partitioning #35683

Closed
mjonss opened this issue Jun 23, 2022 · 0 comments · Fixed by #35662
Closed

Support RANGE INTERVAL partitioning #35683

mjonss opened this issue Jun 23, 2022 · 0 comments · Fixed by #35662
Assignees
Labels
type/enhancement The issue or PR belongs to an enhancement. type/new-feature

Comments

@mjonss
Copy link
Contributor

mjonss commented Jun 23, 2022

Enhancement

RANGE table partitioning can be very verbose when having many partitions with the same range interval, so TiDB should extend RANGE with INTERVAL which would take this thousand row CREATE TABLE:

CREATE TABLE t (id bigint unsigned, data varchar(255))
PARTITION BY RANGE (id)
(PARTITION pNull VALUES LESS THAN (0),
 PARTITION p0 VALUES LESS THAN (1000000),
 PARTITION p1 VALUES LESS THAN (2000000),
...
 PARTITION p998 VALUES LESS THAN (999000000),
 PARTITION pMax VALUES LESS THAN (MAXVALUE))

and allow it to be written as something like

CREATE TABLE t (id bigint unsigned, data varchar(255))
PARTITION BY RANGE (id) INTERVAL (1000000)
FIRST PARTITION LESS THAN (1000000)
LAST PARTITION LESS THAN (999000000)
NULL PARTITION
MAXVALUE PARTITION

Also maintenance/ALTER commands needs to be extended, like

-- No change to existing ADD/DROP PARTITION, but simplified creation of new partitions, extending the range partitioning scheme,
-- this would add partitions for every interval from current LAST PARTITION expr to the new expr
ALTER TABLE t LAST PARTITION LESS THAN (expr)
-- And simplified DROP PARTITION, which will drop all partitions less than the matched FIRST partition (not touching the NULL partition if exists)
ALTER TABLE t FIRST PARTITION LESS THAN (expr)

-- REORGANIZE PARTITION (only syntax supported, not yet implemented in TiDB)
-- For the FIRST PARTITION, reorganize all partitions before the new FIRST PARTITION expr into the new FIRST PARTITION
ALTER TABLE MERGE FIRST PARTITION LESS THAN (expr)
-- For the MAXVALUE PARTITION, create new partitions up to the new expr (which will become the new LAST PARTITION expr) and reorganize the data in the old MAXVALUE PARTITION into the new partitions, including a new MAXVALUE PARTITION
ALTER TABLE SPLIT MAXVALUE PARTITION LESS THAN (expr)

This can be implemented without changing any stored meta data or impact on backwards compatibility, by translating the new syntax (together with existing table metadata for ALTER commands) into normal CREATE/ALTER TABLE commands.

Suggested syntax

CREATE TABLE

CREATE TABLE ... PARTITION BY RANGE [COLUMNS] (partition_col [, partition_col...])
INTERVAL (<interval expr>)
FIRST PARTITION LESS THAN (expr)
LAST PARTITION LESS THAN (expr)
[NULL PARTITION]
[MAXVALUE PARTITION]

Where interval expr for temporal column types are defined as for MySQL but without the INTERVAL token. Supported time units are YEAR, QUARTER, MONTH, WEEK, DAY, HOUR, MINUTE and SECOND. Examples:

expr unit
1 YEAR
1 QUARTER
1 MONTH
1 DAY
30 MINUTE

Full example for daily partitions for year 2021 and 2022:

CREATE TABLE t (id bigint, dt datetime, data varchar(255))
PARTITION BY RANGE COLUMNS (dt)
INTERVAL (1 DAY)
FIRST PARTITION LESS THAN ('2021-02-01')
LAST PARTITION LESS THAN ('2023-01-01')

Since this is a very compact syntax, we will not support per partition options like Placement Rules (table level options will still be ok)

Adding / splitting partitions

In addition to ADD PARTITION, we also support adding more partitions by simply alter the LAST PARTITION expr:

ALTER TABLE <table_name> LAST PARTITION LESS THAN (expr)

Where expr is the same as for CREATE TABLE above.
Since this is a very compact syntax, we will not support per partition options like Placement Rules (table level options will still be ok)

And if there is a MAXVALUE partition, we need to split it (new command instead of REORGANIZE to avoid needing the generated partitioning name and make automation easier):

ALTER TABLE <table_name> SPLIT MAXVALUE PARTITION LESS THAN (expr)

Since REORGANIZE PARTITION is not yet supported, this new SPLIT command will only be parsed, but needs REORGANIZE PARTITION to be implemented)

Drop partition / merge partitions

ALTER TABLE <table_name> DROP PARTITION <list of partitions> will still be extended to drop all partitions before a specified partition

ALTER TABLE table_name FIRST PARTITION LESS THAN (expr)

Which will remove all partitions before the matching partition (except for a possible NULL partition)

Merge partitions would two or more partitions into a new FIRST PARTITION without deleting any data:

ALTER TABLE table_name MERGE FIRST PARTITION LESS THAN (expr)

Only to be implemented in parser until REORGANIZE PARTITION is implemented, same as SPLIT PARTITION

Limitations

  • Only for integer columns/partition expressions and date/datetime column types in RANGE COLUMNS.
  • Automatically add or drop partitions is not a part of this enhancement (internally it may mix DML and DDL).
  • Since we will not change any metadata, for the new ALTER commands, we need to rediscover the interval, null partition, maxvalue partition, first partition and last partition, including verifying that all partitions would be the same as if they were generated with a CREATE TABLE statement with those interval arguments. If such validation fail, we will not execute the ALTER command and return an error. Partition names and partition level options are not taken into consideration when comparing with the discovered/generated table and the actual table.
  • Only changes to CREATE TABLE and ALTER TABLE syntax
    • No change in internal metadata, DDL functionality, just accepting syntactic sugar which will generate the same operations as its more verbose syntax.
    • No impact on SHOW CREATE TABLE (will still show full partition definition). Simplification of SHOW CREATE TABLE is a separate issue.
    • No impact on other tools like TiCDC etc. i.e. the new INTERVAL syntax will be converted to existing DDL jobs, including an updated statement string replacing the syntactic sugar part with the generated equivalent of the compatible syntax.
    • No support for per partition options like Placement Rules (still OK to have on Table level).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/enhancement The issue or PR belongs to an enhancement. type/new-feature
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants