Allow DELETE on compressed chunks without decompression #6882
Merged
svenklemm force-pushed the compressed_delete branch from c22c271 to bf07fc7 on May 10, 2024 13:38
svenklemm force-pushed the compressed_delete branch from bf07fc7 to 611a361 on May 20, 2024 02:42
svenklemm force-pushed the compressed_delete branch 2 times, most recently from 69e55d1 to 2af01aa on July 28, 2024 11:49
svenklemm force-pushed the compressed_delete branch from 2af01aa to 73d9558 on July 29, 2024 05:30
akuzm approved these changes on Jul 29, 2024
antekresic approved these changes on Aug 7, 2024
Looks good overall. A couple of nits, and a test with foreign keys, which disable this optimization, would be nice.
Approving assuming the test will be added.
svenklemm force-pushed the compressed_delete branch 7 times, most recently from af1330b to 5a37192 on August 10, 2024 08:29
When the constraints of a DELETE on a compressed chunk fully cover the batches, we can optimize the DELETE to work directly on the compressed batches and skip the expensive decompression step. This optimization is disabled when we detect any JOINs.
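The idea in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the TimescaleDB implementation: `Batch`, `delete_rows`, and `row_filter` are hypothetical names, a batch is modelled as a segmentby value plus a plain list of rows, and "fully covered" is simplified to "the DELETE filters only on the segmentby column". The JOIN check mentioned above is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Batch:
    """Hypothetical stand-in for one compressed batch."""
    device: str   # segmentby value shared by every row in the batch
    rows: list    # (ts, value) tuples; stored compressed in a real system


def delete_rows(batches, device, row_filter=None):
    """Delete rows for `device`, optionally narrowed by `row_filter`.

    Returns (remaining_batches, number_of_batches_decompressed).
    """
    remaining = []
    decompressed = 0
    for batch in batches:
        if batch.device != device:
            # The segmentby value rules this batch out without reading rows.
            remaining.append(batch)
        elif row_filter is None:
            # The predicate covers every row in the batch, so the whole
            # batch is dropped without decompressing it.
            continue
        else:
            # Only some rows may match: fall back to row-level work.
            decompressed += 1
            kept = [row for row in batch.rows if not row_filter(row)]
            if kept:
                remaining.append(Batch(batch.device, kept))
    return remaining, decompressed


batches = [
    Batch("a", [(1, 1.0), (2, 2.0)]),
    Batch("b", [(1, 3.0)]),
]

# Roughly: DELETE ... WHERE device = 'a'
after_full, n_full = delete_rows(batches, "a")

# Roughly: DELETE ... WHERE device = 'a' AND ts = 1
after_partial, n_partial = delete_rows(batches, "a", lambda row: row[0] == 1)
```

In the first call no batch is decompressed because the filter alone decides the fate of every row in a batch. In the second call the matching batch has to be opened, since only some of its rows qualify.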
svenklemm force-pushed the compressed_delete branch from 5a37192 to d2bd5ca on August 10, 2024 10:25
pallavisontakke added a commit to pallavisontakke/timescaledb that referenced this pull request on Sep 20, 2024
This release contains performance improvements and bug fixes since the 2.16.1 release. We recommend that you upgrade at the next available opportunity.

**Features**
* timescale#6882: Allow DELETE on the compressed chunks without decompression.
* timescale#7033 Use MERGE statement on CAgg Refresh
* timescale#7126: Add functions to show the compression information.
* timescale#7147: Vectorize partial aggregation for `sum
* timescale#7204: Track additional extensions in telemetry.
* timescale#7207: Refactor the `decompress_batches_scan` functions for easier maintenance.
* timescale#7209: Add a function to drop the `osm` chunk.

**Bugfixes**
* timescale#7187: Fix the string literal length for the `compressed_data_info` function.
* timescale#7191: Fix creating default indexes on chunks when migrating the data.
* timescale#7195: Fix the `segment by` and `order by` checks when dropping a column from a compressed hypertable.
* timescale#7201: Use the generic extension description when building `apt` and `rpm` loader packages.
* timescale#7227: Add an index to the `compression_chunk_size` catalog table.
* timescale#7229: Fix the foreign key constraints where the index and the constraint column order are different.
* timescale#7230: Do not propagate the foreign key constraints to the `osm` chunk.
* timescale#7234: Release the cache after accessing the cache entry.
* timescale#7258 Force English in the pg_config command executed by cmake to avoid unexpected building errors
* timescale#7270 Fix memory leak in compressed DML batch filtering

**Thanks**
* @MiguelTubio for reporting and fixing a Windows build error
* @posuch for reporting the misleading extension description in the generic loader packages.
pallavisontakke added a commit to pallavisontakke/timescaledb that referenced this pull request on Sep 25, 2024
pallavisontakke added a commit to pallavisontakke/timescaledb that referenced this pull request on Sep 26, 2024
pallavisontakke added a commit to pallavisontakke/timescaledb that referenced this pull request on Sep 30, 2024
pallavisontakke added a commit to pallavisontakke/timescaledb that referenced this pull request on Oct 8, 2024
This release contains performance improvements and bug fixes since the 2.16.1 release. We recommend that you upgrade at the next available opportunity.

**Features**
* timescale#6882: Allow DELETE on the compressed chunks without decompression.
* timescale#7033 Use MERGE statement on CAgg Refresh
* timescale#7126: Add functions to show the compression information.
* timescale#7147: Vectorize partial aggregation for `sum
* timescale#7200: Vectorize common aggregate functions like `min`, `max`, `sum`, `avg`, `stddev`, `variance` for compressed columns of arithmetic types, when there is grouping on segmentby columns or no grouping.
* timescale#7204: Track additional extensions in telemetry.
* timescale#7207: Refactor the `decompress_batches_scan` functions for easier maintenance.
* timescale#7209: Add a function to drop the `osm` chunk.
* timescale#7275: Add support for RETURNING clause for MERGE
* timescale#7295 Support ALTER TABLE SET ACCESS METHOD on hypertable

**Bugfixes**
* timescale#7187: Fix the string literal length for the `compressed_data_info` function.
* timescale#7191: Fix creating default indexes on chunks when migrating the data.
* timescale#7195: Fix the `segment by` and `order by` checks when dropping a column from a compressed hypertable.
* timescale#7201: Use the generic extension description when building `apt` and `rpm` loader packages.
* timescale#7227: Add an index to the `compression_chunk_size` catalog table.
* timescale#7229: Fix the foreign key constraints where the index and the constraint column order are different.
* timescale#7230: Do not propagate the foreign key constraints to the `osm` chunk.
* timescale#7234: Release the cache after accessing the cache entry.
* timescale#7258 Force English in the pg_config command executed by cmake to avoid unexpected building errors
* timescale#7270 Fix memory leak in compressed DML batch filtering
* timescale#7286: Fix index column check while searching for index
* timescale#7290 Add check for NULL offset for caggs built on top of caggs
* timescale#7301 Make foreign key behaviour for hypertables consistent
* timescale#7318: Fix chunk skipping range filtering
* timescale#7320 Set license specific extension comment in install script

**Thanks**
* @MiguelTubio for reporting and fixing a Windows build error
* @posuch for reporting the misleading extension description in the generic loader packages.
* @snyrkill for discovering and reporting the issue
pallavisontakke added a commit that referenced this pull request on Oct 8, 2024
This release adds support for PostgreSQL 17, significantly improves the performance of continuous aggregate refreshes, and contains performance improvements for analytical queries and delete operations over compressed hypertables. We recommend that you upgrade at the next available opportunity.

**Highlighted features in TimescaleDB v2.17.0**
* Full PostgreSQL 17 support for all existing features. TimescaleDB v2.17 is available for PostgreSQL 14, 15, 16, and 17.
* Significant performance improvements for continuous aggregate policies: continuous aggregate refresh is now using `merge` instead of deleting old materialized data and re-inserting. This update can decrease dramatically the amount of data that must be written on the continuous aggregate in the presence of a small number of changes, reduce the `i/o` cost of refreshing a continuous aggregate, and generate fewer Write-Ahead Logs (`WAL`). Overall, continuous aggregate policies will be more lightweight, use less system resources, and complete faster.
* Increased performance for real-time analytical queries over compressed hypertables: we are excited to introduce additional Single Instruction, Multiple Data (`SIMD`) vectorization optimization to our engine by supporting vectorized execution for queries that group by using the `segment_by` column(s) and aggregate using the basic aggregate functions (`sum`, `count`, `avg`, `min`, `max`). Stay tuned for more to come in follow-up releases! Support for grouping on additional columns, filtered aggregation, vectorized expressions, and `time_bucket` is coming soon.
* Improved performance of deletes on compressed hypertables when a large amount of data is affected. This improvement speeds up operations that delete whole segments by skipping the decompression step. It is enabled for all deletes that filter by the `segment_by` column(s).

**PostgreSQL 14 deprecation announcement**
We will continue supporting PostgreSQL 14 until April 2025. Closer to that time, we will announce the specific version of TimescaleDB in which PostgreSQL 14 support will not be included going forward.

**Features**
* #6882: Allow delete of full segments on compressed chunks without decompression.
* #7033: Use `merge` statement on continuous aggregates refresh.
* #7126: Add functions to show the compression information.
* #7147: Vectorize partial aggregation for `sum(int4)` with grouping on `segment by` columns.
* #7204: Track additional extensions in telemetry.
* #7207: Refactor the `decompress_batches_scan` functions for easier maintenance.
* #7209: Add a function to drop the `osm` chunk.
* #7275: Add support for the `returning` clause for `merge`.
* #7200: Vectorize common aggregate functions like `min`, `max`, `sum`, `avg`, `stddev`, `variance` for compressed columns of arithmetic types, when there is grouping on `segment by` columns or no grouping.

**Bug fixes**
* #7187: Fix the string literal length for the `compressed_data_info` function.
* #7191: Fix creating default indexes on chunks when migrating the data.
* #7195: Fix the `segment by` and `order by` checks when dropping a column from a compressed hypertable.
* #7201: Use the generic extension description when building `apt` and `rpm` loader packages.
* #7227: Add an index to the `compression_chunk_size` catalog table.
* #7229: Fix the foreign key constraints where the index and the constraint column order are different.
* #7230: Do not propagate the foreign key constraints to the `osm` chunk.
* #7234: Release the cache after accessing the cache entry.
* #7258: Force English in the `pg_config` command executed by `cmake` to avoid the unexpected building errors.
* #7270: Fix the memory leak in compressed DML batch filtering.
* #7286: Fix the index column check while searching for the index.
* #7290: Add check for null offset for continuous aggregates built on top of continuous aggregates.
* #7301: Make foreign key behavior for hypertables consistent.
* #7318: Fix chunk skipping range filtering.
* #7320: Set the license specific extension comment in the install script.

**Thanks**
* @MiguelTubio for reporting and fixing the Windows build error.
* @posuch for reporting the misleading extension description in the generic loader packages.
* @snyrkill for discovering and reporting the issue with continuous aggregates built on top of continuous aggregates.
svenklemm added a commit that referenced this pull request on Oct 8, 2024
Signed-off-by: Pallavi Sontakke <[email protected]>
Signed-off-by: Yannis Roussos <[email protected]>
Signed-off-by: Sven Klemm <[email protected]>
Co-authored-by: Yannis Roussos <[email protected]>
Co-authored-by: atovpeko <[email protected]>
Co-authored-by: Sven Klemm <[email protected]>
kpan2034 pushed a commit to kpan2034/timescaledb that referenced this pull request on Oct 11, 2024