Improve parallel workers for decompression #5655

sotirissl · 2023-05-03T19:24:40Z

So far, we have set the number of desired workers for decompression to 1.
If a query touches only one chunk, we end up with one worker in a
parallel plan. Only if the query touches multiple chunks does PostgreSQL
spin up multiple workers. These workers could then be used to process
the data of one chunk.

This patch removes our custom worker calculation and relies on
PostgreSQL logic to calculate the desired degree of parallelism.
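
As a rough illustration of the PostgreSQL logic the patch relies on, the C sketch below approximates how the planner derives a worker count from relation size: one worker once the relation reaches min_parallel_table_scan_size, one more each time the size triples, capped by max_parallel_workers_per_gather. The function name sketch_parallel_workers and its parameters are hypothetical. The real compute_parallel_worker() also considers index pages and a per-table parallel_workers setting, which this simplified sketch leaves out.

```c
#include <assert.h>
#include <limits.h>

/*
 * Simplified sketch modeled on PostgreSQL's compute_parallel_worker().
 * Hypothetical helper: it ignores index pages and the per-table
 * parallel_workers storage parameter, and takes the GUC values as arguments.
 */
static int
sketch_parallel_workers(double heap_pages, int min_scan_size_pages, int max_workers)
{
	assert(heap_pages >= 0 && max_workers >= 0);

	/* Relations below the size threshold get no parallel workers. */
	if (heap_pages < min_scan_size_pages)
		return 0;

	int threshold = min_scan_size_pages > 1 ? min_scan_size_pages : 1;
	int workers = 1;

	/* Add one worker each time the relation size triples. */
	while (heap_pages >= (double) threshold * 3)
	{
		workers++;
		threshold *= 3;
		if (threshold > INT_MAX / 3)
			break; /* avoid integer overflow on the next multiplication */
	}

	/* Never plan more workers than the per-gather limit allows. */
	return workers < max_workers ? workers : max_workers;
}
```

Under these assumptions, a relation of 100000 pages with a threshold of 0 and a limit of 4 yields 4 workers, because the size-based count exceeds the limit and the cap applies.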

Co-authored-by: Jan Kristof Nidzwetzki [email protected]

Benchmark

Below is an example that uses EXPLAIN ANALYZE on a table that has
only one chunk and is bigger than the one used in the test case.

Before this PR, PostgreSQL allocated only 1 worker per chunk.
After this PR, PostgreSQL uses 4 workers per chunk
(once the settings in the script encourage the use of parallel plans).
The total execution time is also lower after this PR
than before it.
So big tables see a performance improvement
when PostgreSQL chooses parallel plans.
The measurements below were taken on a machine with 4 CPUs.

Execution Time	Before This PR	After This PR
seq_scan	2492.898 ms	1726.454 ms
index_scan	4889.353 ms	4245.768 ms
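
To put the two rows in perspective, the C sketch below computes the speedup factor and the percentage of execution time saved from a before/after pair. The helper names speedup and time_saved_percent are hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>

/* Ratio of old to new execution time; values above 1.0 mean the new plan is faster. */
static double
speedup(double before_ms, double after_ms)
{
	assert(before_ms > 0 && after_ms > 0);
	return before_ms / after_ms;
}

/* Share of the original execution time that was saved, in percent. */
static double
time_saved_percent(double before_ms, double after_ms)
{
	assert(before_ms > 0);
	return 100.0 * (before_ms - after_ms) / before_ms;
}
```

Applied to the measurements above, this gives a speedup of roughly 1.44x (about 31% less time) for the sequential scan and roughly 1.15x (about 13% less time) for the index scan.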

CREATE TABLE f_sensor_data(
      time timestamptz not null,
      sensor_id integer not null,
      cpu double precision null,
      temperature double precision null
    );

SELECT FROM create_hypertable('f_sensor_data','time');
SELECT set_chunk_time_interval('f_sensor_data', INTERVAL '1 year');

SELECT * FROM _timescaledb_internal.create_chunk('f_sensor_data',' {"time": [181900977000000, 515024000000000]}');

INSERT INTO f_sensor_data
SELECT
    time AS time,
    sensor_id,
    100.0,
    36.6
FROM
    generate_series('1980-01-01 00:00'::timestamp, '1980-02-28 12:00', INTERVAL '1 day') AS g1(time),
    generate_series(1, 170000, 1 ) AS g2(sensor_id)
ORDER BY
    time;

ALTER TABLE f_sensor_data SET (timescaledb.compress, timescaledb.compress_segmentby='sensor_id' ,timescaledb.compress_orderby = 'time DESC');

SELECT compress_chunk(i) FROM show_chunks('f_sensor_data') i;

-- Encourage use of parallel plans
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
SET min_parallel_table_scan_size TO '0';

\set explain 'EXPLAIN (ANALYZE, VERBOSE, COSTS OFF)'

SHOW min_parallel_table_scan_size;
SHOW max_parallel_workers;
SHOW max_parallel_workers_per_gather;

SET max_parallel_workers_per_gather = 4;
SHOW max_parallel_workers_per_gather;
:explain
SELECT sum(cpu) FROM f_sensor_data;

-- Encourage use of Index Scan

SET enable_seqscan = false;
SET enable_indexscan = true;
SET min_parallel_index_scan_size = 0;
SET min_parallel_table_scan_size = 0;

CREATE INDEX ON f_sensor_data (time, sensor_id);
:explain
SELECT * FROM f_sensor_data WHERE sensor_id > 100;

TimescaleDB before This PR

                                                                                                                                                                         QUERY PLAN                                                                                                                                                                         
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate (actual time=2487.184..2492.814 rows=1 loops=1)
   Output: sum(_hyper_3_3_chunk.cpu)
   ->  Gather (actual time=2486.827..2492.801 rows=2 loops=1)
         Output: (PARTIAL sum(_hyper_3_3_chunk.cpu))
         Workers Planned: 1
         Workers Launched: 1
         ->  Partial Aggregate (actual time=2480.490..2480.493 rows=1 loops=2)
               Output: PARTIAL sum(_hyper_3_3_chunk.cpu)
               Worker 0:  actual time=2474.895..2474.898 rows=1 loops=1
               ->  Parallel Append (actual time=0.045..1937.103 rows=5015000 loops=2)
                     Worker 0:  actual time=0.059..1884.817 rows=4566600 loops=1
                     ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_3_3_chunk (actual time=0.044..1417.154 rows=5015000 loops=2)
                           Output: _hyper_3_3_chunk.cpu
                           Worker 0:  actual time=0.057..1392.702 rows=4566600 loops=1
                           ->  Parallel Seq Scan on _timescaledb_internal.compress_hyper_4_4_chunk (actual time=0.029..150.251 rows=85000 loops=2)
                                 Output: compress_hyper_4_4_chunk."time", compress_hyper_4_4_chunk.sensor_id, compress_hyper_4_4_chunk.cpu, compress_hyper_4_4_chunk.temperature, compress_hyper_4_4_chunk._ts_meta_count, compress_hyper_4_4_chunk._ts_meta_sequence_num, compress_hyper_4_4_chunk._ts_meta_min_1, compress_hyper_4_4_chunk._ts_meta_max_1
                                 Worker 0:  actual time=0.038..134.648 rows=77400 loops=1
 Planning Time: 1.043 ms
 Execution Time: 2492.898 ms
(19 rows)

                                                                                                                                                                   QUERY PLAN                                                                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather (actual time=0.821..4303.116 rows=10024100 loops=1)
   Output: _hyper_3_3_chunk."time", _hyper_3_3_chunk.sensor_id, _hyper_3_3_chunk.cpu, _hyper_3_3_chunk.temperature
   Workers Planned: 1
   Workers Launched: 1
   ->  Parallel Append (actual time=0.068..2469.097 rows=5012050 loops=2)
         Worker 0:  actual time=0.071..2720.591 rows=5279320 loops=1
         ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_3_3_chunk (actual time=0.067..1958.993 rows=5012050 loops=2)
               Output: _hyper_3_3_chunk."time", _hyper_3_3_chunk.sensor_id, _hyper_3_3_chunk.cpu, _hyper_3_3_chunk.temperature
               Worker 0:  actual time=0.070..2163.342 rows=5279320 loops=1
               ->  Parallel Index Scan using compress_hyper_4_4_chunk__compressed_hypertable_4_sensor_id__ts on _timescaledb_internal.compress_hyper_4_4_chunk (actual time=0.049..130.157 rows=84950 loops=2)
                     Output: compress_hyper_4_4_chunk."time", compress_hyper_4_4_chunk.sensor_id, compress_hyper_4_4_chunk.cpu, compress_hyper_4_4_chunk.temperature, compress_hyper_4_4_chunk._ts_meta_count, compress_hyper_4_4_chunk._ts_meta_sequence_num, compress_hyper_4_4_chunk._ts_meta_min_1, compress_hyper_4_4_chunk._ts_meta_max_1
                     Index Cond: (compress_hyper_4_4_chunk.sensor_id > 100)
                     Worker 0:  actual time=0.047..153.037 rows=89480 loops=1
 Planning Time: 1.043 ms
 Execution Time: 4889.353 ms
(15 rows)

TimescaleDB after This PR

                                                                                                                                                                         QUERY PLAN                                                                                                                                                                         
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate (actual time=1721.317..1726.368 rows=1 loops=1)
   Output: sum(_hyper_1_1_chunk.cpu)
   ->  Gather (actual time=1720.522..1726.355 rows=5 loops=1)
         Output: (PARTIAL sum(_hyper_1_1_chunk.cpu))
         Workers Planned: 4
         Workers Launched: 4
         ->  Partial Aggregate (actual time=1701.820..1701.823 rows=1 loops=5)
               Output: PARTIAL sum(_hyper_1_1_chunk.cpu)
               Worker 0:  actual time=1706.170..1706.172 rows=1 loops=1
               Worker 1:  actual time=1707.349..1707.351 rows=1 loops=1
               Worker 2:  actual time=1706.207..1706.210 rows=1 loops=1
               Worker 3:  actual time=1671.192..1671.194 rows=1 loops=1
               ->  Parallel Append (actual time=0.067..1386.520 rows=2006000 loops=5)
                     Worker 0:  actual time=0.072..1395.645 rows=2078334 loops=1
                     Worker 1:  actual time=0.058..1372.202 rows=1852128 loops=1
                     Worker 2:  actual time=0.081..1416.501 rows=1766578 loops=1
                     Worker 3:  actual time=0.082..1328.991 rows=2154798 loops=1
                     ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_1_1_chunk (actual time=0.066..1046.828 rows=2006000 loops=5)
                           Output: _hyper_1_1_chunk.cpu
                           Worker 0:  actual time=0.071..1068.026 rows=2078334 loops=1
                           Worker 1:  actual time=0.057..1047.539 rows=1852128 loops=1
                           Worker 2:  actual time=0.079..1074.672 rows=1766578 loops=1
                           Worker 3:  actual time=0.081..987.521 rows=2154798 loops=1
                           ->  Parallel Seq Scan on _timescaledb_internal.compress_hyper_2_2_chunk (actual time=0.045..140.669 rows=34000 loops=5)
                                 Output: compress_hyper_2_2_chunk."time", compress_hyper_2_2_chunk.sensor_id, compress_hyper_2_2_chunk.cpu, compress_hyper_2_2_chunk.temperature, compress_hyper_2_2_chunk._ts_meta_count, compress_hyper_2_2_chunk._ts_meta_sequence_num, compress_hyper_2_2_chunk._ts_meta_min_1, compress_hyper_2_2_chunk._ts_meta_max_1
                                 Worker 0:  actual time=0.050..139.442 rows=35226 loops=1
                                 Worker 1:  actual time=0.033..123.829 rows=31392 loops=1
                                 Worker 2:  actual time=0.059..241.316 rows=29942 loops=1
                                 Worker 3:  actual time=0.056..89.220 rows=36522 loops=1
 Planning Time: 1.232 ms
 Execution Time: 1726.454 ms
(31 rows)

                                                                                                                                                                   QUERY PLAN                                                                                                                                                                   
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather (actual time=2.556..3478.430 rows=10024100 loops=1)
   Output: _hyper_1_1_chunk."time", _hyper_1_1_chunk.sensor_id, _hyper_1_1_chunk.cpu, _hyper_1_1_chunk.temperature
   Workers Planned: 4
   Workers Launched: 4
   ->  Parallel Append (actual time=0.110..1626.333 rows=2004820 loops=5)
         Worker 0:  actual time=0.099..1973.618 rows=2504904 loops=1
         Worker 1:  actual time=0.105..1982.349 rows=2310558 loops=1
         Worker 2:  actual time=0.114..1979.178 rows=2450506 loops=1
         Worker 3:  actual time=0.112..1962.645 rows=2375340 loops=1
         ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_1_1_chunk (actual time=0.109..1295.361 rows=2004820 loops=5)
               Output: _hyper_1_1_chunk."time", _hyper_1_1_chunk.sensor_id, _hyper_1_1_chunk.cpu, _hyper_1_1_chunk.temperature
               Worker 0:  actual time=0.098..1620.239 rows=2504904 loops=1
               Worker 1:  actual time=0.103..1564.843 rows=2310558 loops=1
               Worker 2:  actual time=0.112..1554.423 rows=2450506 loops=1
               Worker 3:  actual time=0.111..1547.799 rows=2375340 loops=1
               ->  Parallel Index Scan using compress_hyper_2_2_chunk__compressed_hypertable_2_sensor_id__ts on _timescaledb_internal.compress_hyper_2_2_chunk (actual time=0.079..80.981 rows=33980 loops=5)
                     Output: compress_hyper_2_2_chunk."time", compress_hyper_2_2_chunk.sensor_id, compress_hyper_2_2_chunk.cpu, compress_hyper_2_2_chunk.temperature, compress_hyper_2_2_chunk._ts_meta_count, compress_hyper_2_2_chunk._ts_meta_sequence_num, compress_hyper_2_2_chunk._ts_meta_min_1, compress_hyper_2_2_chunk._ts_meta_max_1
                     Index Cond: (compress_hyper_2_2_chunk.sensor_id > 100)
                     Worker 0:  actual time=0.069..97.910 rows=42456 loops=1
                     Worker 1:  actual time=0.069..112.020 rows=39162 loops=1
                     Worker 2:  actual time=0.082..82.709 rows=41534 loops=1
                     Worker 3:  actual time=0.078..94.887 rows=40260 loops=1
 Planning Time: 1.111 ms
 Execution Time: 4245.768 ms
(24 rows)

codecov · 2023-05-03T19:48:05Z

Codecov Report

Merging #5655 (086acc3) into main (10cab43) will decrease coverage by 0.06%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #5655      +/-   ##
==========================================
- Coverage   87.86%   87.80%   -0.06%     
==========================================
  Files         234      234              
  Lines       54993    54982      -11     
  Branches    12116    12114       -2     
==========================================
- Hits        48317    48277      -40     
- Misses       4826     4843      +17     
- Partials     1850     1862      +12

Impacted Files	Coverage Δ
src/import/allpaths.c	`74.03% <100.00%> (ø)`
tsl/src/nodes/decompress_chunk/decompress_chunk.c	`89.70% <100.00%> (-0.82%)`	⬇️

... and 27 files with indirect coverage changes


github-actions · 2023-05-08T15:15:45Z

@mahipv, @sb230132: please review this pull request.


akuzm · 2023-05-08T15:55:18Z

tsl/src/nodes/decompress_chunk/decompress_chunk.c

@@ -578,10 +578,12 @@ ts_decompress_chunk_generate_paths(PlannerInfo *root, RelOptInfo *chunk_rel, Hyp

 	/*
 	 * since we rely on parallel coordination from the scan below


What does it mean that "we rely on parallel coordination from the scan below"?

Removed. We now use the PostgreSQL function create_plain_partial_paths() to create parallel plans.

akuzm · 2023-05-08T15:55:42Z

tsl/src/nodes/decompress_chunk/decompress_chunk.c

+	int parallel_workers =
+		compute_parallel_worker(chunk_rel, chunk_rel->pages, -1, max_parallel_workers_per_gather);


Can we have an index scan here? Might make sense to pass a proper value for index pages, and add a test for it.

Regarding index scans, TimescaleDB relies on PostgreSQL here,
so no changes have been made.

We use the function set_plain_rel_pathlist()
which calls PostgreSQL function create_index_paths()
https://github.com/timescale/timescaledb/blob/main/src/import/allpaths.c#L129-L130

create_index_paths() calls get_index_paths() which calls build_index_paths()
Finally build_index_paths() function constructs zero or more partial IndexPaths

https://github.com/postgres/postgres/blob/ccd3623256220b944d9da00df75d91ef4d550362/src/backend/optimizer/path/indxpath.c#L235
https://github.com/postgres/postgres/blob/ccd3623256220b944d9da00df75d91ef4d550362/src/backend/optimizer/path/indxpath.c#L733
https://github.com/postgres/postgres/blob/ccd3623256220b944d9da00df75d91ef4d550362/src/backend/optimizer/path/indxpath.c#L855
https://github.com/postgres/postgres/blob/ccd3623256220b944d9da00df75d91ef4d550362/src/backend/optimizer/path/indxpath.c#L816

Regarding the parallel index scan test: a test case has been added,
which shows that PostgreSQL plans 2 workers for a Parallel Index Scan.

                                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Gather
   Output: _hyper_37_71_chunk."time", _hyper_37_71_chunk.sensor_id, _hyper_37_71_chunk.cpu, _hyper_37_71_chunk.temperature
   Workers Planned: 2
   ->  Parallel Append
         ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_37_71_chunk
               Output: _hyper_37_71_chunk."time", _hyper_37_71_chunk.sensor_id, _hyper_37_71_chunk.cpu, _hyper_37_71_chunk.temperature
               ->  Parallel Index Scan using compress_hyper_38_72_chunk__compressed_hypertable_38_sensor_id_ on _timescaledb_internal.compress_hyper_38_72_chunk
                     Output: compress_hyper_38_72_chunk."time", compress_hyper_38_72_chunk.sensor_id, compress_hyper_38_72_chunk.cpu, compress_hyper_38_72_chunk.temperature, compress_hyper_38_72_chunk._ts_meta_count, compress_hyper_38_72_chunk._ts_meta_sequence_num, compress_hyper_38_72_chunk._ts_meta_min_1, compress_hyper_38_72_chunk._ts_meta_max_1
                     Index Cond: (compress_hyper_38_72_chunk.sensor_id > 100)
(9 rows)

akuzm · 2023-05-08T15:57:08Z

tsl/test/sql/compression.sql

+SELECT sum(cpu) FROM f_sensor_data;
+
+SET max_parallel_workers_per_gather = 2;
+SHOW max_parallel_workers_per_gather;
+:explain
+SELECT sum(cpu) FROM f_sensor_data;
+
+SET max_parallel_workers_per_gather = 4;
+SHOW max_parallel_workers_per_gather;
+:explain
+SELECT sum(cpu) FROM f_sensor_data;


What do we gain by testing 1, 2, 4, maybe just 4 is enough? For simplicity.

Removed. Only one test case, with 4 parallel workers, has been kept.


konskov · 2023-05-12T13:55:26Z

tsl/test/sql/compression.sql

+
+SET min_parallel_table_scan_size TO '1';
+
+\set explain 'EXPLAIN (VERBOSE, COSTS OFF)'


just a thought, as I have not experimented to find the appropriate pattern matching: maybe adding an appropriate sed pattern in test/runner.sh would help to remove the test flakiness that ANALYZE introduces and thus allow ANALYZE to show that workers have been planned for the following queries?

I have removed ANALYZE
because the row counts reported for each parallel worker change between runs
and the tests were failing.

konskov · 2023-05-30T15:02:31Z

.unreleased/5655_decompression_workers.txt

@@ -0,0 +1 @@
+Implements: Improve the number of parallel workers for decompression 


Suggested change

Implements: Improve the number of parallel workers for decompression

Implements: #5655 Improve the number of parallel workers for decompression

konskov · 2023-05-30T15:42:40Z

tsl/src/nodes/decompress_chunk/decompress_chunk.c

@@ -879,7 +871,8 @@ ts_decompress_chunk_generate_paths(PlannerInfo *root, RelOptInfo *chunk_rel, Hyp
 														  list_make2(path, uncompressed_path),
 														  NIL /* pathkeys */,
 														  req_outer,
-														  parallel_workers,
+														  Max(path->parallel_workers,


You could perhaps also add a test for partially compressed chunks. Up to you

Done. I added a test for partially compressed chunks.

konskov

please don't forget to update the .unreleased entry to include the PR number!


@ajcanterbury

This release contains performance improvements for compressed hypertables and continuous aggregates and bug fixes since the 2.11.2 release. We recommend that you upgrade at the next available opportunity.

This release moves all internal functions from the _timescaledb_internal schema into the _timescaledb_functions schema. This separates code from internal data objects and improves security by allowing more restrictive permissions for the code schema. If you are calling any of those internal functions you should adjust your code as soon as possible. This version also includes a compatibility layer that allows calling them in the old location but that layer will be removed in 2.14.0.

**PostgreSQL 12 support removal announcement**
Following the deprecation announcement for PostgreSQL 12 in TimescaleDB 2.10, PostgreSQL 12 is not supported starting with TimescaleDB 2.12. Currently supported PostgreSQL major versions are 13, 14 and 15. PostgreSQL 16 support will be added with a following TimescaleDB release.

**Features**
* timescale#5137 Insert into index during chunk compression
* timescale#5150 MERGE support on hypertables
* timescale#5515 Make hypertables support replica identity
* timescale#5586 Index scan support during UPDATE/DELETE on compressed hypertables
* timescale#5596 Support for partial aggregations at chunk level
* timescale#5599 Enable ChunkAppend for partially compressed chunks
* timescale#5655 Improve the number of parallel workers for decompression
* timescale#5758 Enable altering job schedule type through `alter_job`
* timescale#5805 Make logrepl markers for (partial) decompressions
* timescale#5809 Relax invalidation threshold table-level lock to row-level when refreshing a Continuous Aggregate
* timescale#5839 Support CAgg names in chunk_detailed_size
* timescale#5852 Make set_chunk_time_interval CAggs aware
* timescale#5868 Allow ALTER TABLE ... REPLICA IDENTITY (FULL|INDEX) on materialized hypertables (continuous aggregates)
* timescale#5875 Add job exit status and runtime to log
* timescale#5909 CREATE INDEX ONLY ON hypertable creates index on chunks

**Bugfixes**
* timescale#5860 Fix interval calculation for hierarchical CAggs
* timescale#5894 Check unique indexes when enabling compression
* timescale#5951 _timescaledb_internal.create_compressed_chunk doesn't account for existing uncompressed rows
* timescale#5988 Move functions to _timescaledb_functions schema
* timescale#5788 Chunk_create must add an existing table or fail
* timescale#5872 Fix duplicates on partially compressed chunk reads
* timescale#5918 Fix crash in COPY from program returning error
* timescale#5990 Place data in first/last function in correct mctx
* timescale#5991 Call eq_func correctly in time_bucket_gapfill
* timescale#6015 Correct row count in EXPLAIN ANALYZE INSERT .. ON CONFLICT output
* timescale#6035 Fix server crash on UPDATE of compressed chunk
* timescale#6044 Fix server crash when using duplicate segmentby column
* timescale#6045 Fix segfault in set_integer_now_func
* timescale#6053 Fix approximate_row_count for CAggs
* timescale#6081 Improve compressed DML datatype handling
* timescale#6084 Propagate parameter changes to decompress child nodes

**Thanks**
* @ajcanterbury for reporting a problem with lateral joins on compressed chunks
* @alexanderlaw for reporting multiple server crashes
* @lukaskirner for reporting a bug with monthly continuous aggregates
* @mrksngl for reporting a bug with unusual user names
* @willsbit for reporting a crash in time_bucket_gapfill

@ajcanterbury

This release contains performance improvements for compressed hypertables and continuous aggregates and bug fixes since the 2.11.2 release. We recommend that you upgrade at the next available opportunity. This release moves all internal functions from the _timescaleb_internal schema into the _timescaledb_functions schema. This separates code from internal data objects and improves security by allowing more restrictive permissions for the code schema. If you are calling any of those internal functions you should adjust your code as soon as possible. This version also includes a compatibility layer that allows calling them in the old location but that layer will be removed in 2.14.0. **PostgreSQL 12 support removal announcement** Following the deprecation announcement for PostgreSQL 12 in TimescaleDB 2.10, PostgreSQL 12 is not supported starting with TimescaleDB 2.12. Currently supported PostgreSQL major versions are 13, 14 and 15. PostgreSQL 16 support will be added with a following TimescaleDB release. **Features** * #5137 Insert into index during chunk compression * #5150 MERGE support on hypertables * #5515 Make hypertables support replica identity * #5586 Index scan support during UPDATE/DELETE on compressed hypertables * #5596 Support for partial aggregations at chunk level * #5599 Enable ChunkAppend for partially compressed chunks * #5655 Improve the number of parallel workers for decompression * #5758 Enable altering job schedule type through `alter_job` * #5805 Make logrepl markers for (partial) decompressions * #5809 Relax invalidation threshold table-level lock to row-level when refreshing a Continuous Aggregate * #5839 Support CAgg names in chunk_detailed_size * #5852 Make set_chunk_time_interval CAggs aware * #5868 Allow ALTER TABLE ... 
REPLICA IDENTITY (FULL|INDEX) on materialized hypertables (continuous aggregates) * #5875 Add job exit status and runtime to log * #5909 CREATE INDEX ONLY ON hypertable creates index on chunks **Bugfixes** * #5860 Fix interval calculation for hierarchical CAggs * #5894 Check unique indexes when enabling compression * #5951 _timescaledb_internal.create_compressed_chunk doesn't account for existing uncompressed rows * #5988 Move functions to _timescaledb_functions schema * #5788 Chunk_create must add an existing table or fail * #5872 Fix duplicates on partially compressed chunk reads * #5918 Fix crash in COPY from program returning error * #5990 Place data in first/last function in correct mctx * #5991 Call eq_func correctly in time_bucket_gapfill * #6015 Correct row count in EXPLAIN ANALYZE INSERT .. ON CONFLICT output * #6035 Fix server crash on UPDATE of compressed chunk * #6044 Fix server crash when using duplicate segmentby column * #6045 Fix segfault in set_integer_now_func * #6053 Fix approximate_row_count for CAggs * #6081 Improve compressed DML datatype handling * #6084 Propagate parameter changes to decompress child nodes **Thanks** * @ajcanterbury for reporting a problem with lateral joins on compressed chunks * @alexanderlaw for reporting multiple server crashes * @lukaskirner for reporting a bug with monthly continuous aggregates * @mrksngl for reporting a bug with unusual user names * @willsbit for reporting a crash in time_bucket_gapfill

@ajcanterbury

This release contains performance improvements for compressed hypertables and continuous aggregates and bug fixes since the 2.11.2 release. We recommend that you upgrade at the next available opportunity. This release moves all internal functions from the _timescaleb_internal schema into the _timescaledb_functions schema. This separates code from internal data objects and improves security by allowing more restrictive permissions for the code schema. If you are calling any of those internal functions you should adjust your code as soon as possible. This version also includes a compatibility layer that allows calling them in the old location but that layer will be removed in 2.14.0. **PostgreSQL 12 support removal announcement** Following the deprecation announcement for PostgreSQL 12 in TimescaleDB 2.10, PostgreSQL 12 is not supported starting with TimescaleDB 2.12. Currently supported PostgreSQL major versions are 13, 14 and 15. PostgreSQL 16 support will be added with a following TimescaleDB release. **Features** * #5137 Insert into index during chunk compression * #5150 MERGE support on hypertables * #5515 Make hypertables support replica identity * #5586 Index scan support during UPDATE/DELETE on compressed hypertables * #5596 Support for partial aggregations at chunk level * #5599 Enable ChunkAppend for partially compressed chunks * #5655 Improve the number of parallel workers for decompression * #5758 Enable altering job schedule type through `alter_job` * #5805 Make logrepl markers for (partial) decompressions * #5809 Relax invalidation threshold table-level lock to row-level when refreshing a Continuous Aggregate * #5839 Support CAgg names in chunk_detailed_size * #5852 Make set_chunk_time_interval CAggs aware * #5868 Allow ALTER TABLE ... 
REPLICA IDENTITY (FULL|INDEX) on materialized hypertables (continuous aggregates) * #5875 Add job exit status and runtime to log * #5909 CREATE INDEX ONLY ON hypertable creates index on chunks **Bugfixes** * #5860 Fix interval calculation for hierarchical CAggs * #5894 Check unique indexes when enabling compression * #5951 _timescaledb_internal.create_compressed_chunk doesn't account for existing uncompressed rows * #5988 Move functions to _timescaledb_functions schema * #5788 Chunk_create must add an existing table or fail * #5872 Fix duplicates on partially compressed chunk reads * #5918 Fix crash in COPY from program returning error * #5990 Place data in first/last function in correct mctx * #5991 Call eq_func correctly in time_bucket_gapfill * #6015 Correct row count in EXPLAIN ANALYZE INSERT .. ON CONFLICT output * #6035 Fix server crash on UPDATE of compressed chunk * #6044 Fix server crash when using duplicate segmentby column * #6045 Fix segfault in set_integer_now_func * #6053 Fix approximate_row_count for CAggs * #6081 Improve compressed DML datatype handling * #6084 Propagate parameter changes to decompress child nodes **Thanks** * @ajcanterbury for reporting a problem with lateral joins on compressed chunks * @alexanderlaw for reporting multiple server crashes * @lukaskirner for reporting a bug with monthly continuous aggregates * @mrksngl for reporting a bug with unusual user names * @willsbit for reporting a crash in time_bucket_gapfill

This release contains performance improvements for compressed hypertables and continuous aggregates and bug fixes since the 2.11.2 release. We recommend that you upgrade at the next available opportunity.

This release moves all internal functions from the _timescaledb_internal schema into the _timescaledb_functions schema. This separates code from internal data objects and improves security by allowing more restrictive permissions for the code schema. If you are calling any of those internal functions you should adjust your code as soon as possible. This version also includes a compatibility layer that allows calling them in the old location but that layer will be removed in 2.14.0.

**PostgreSQL 12 support removal announcement**
Following the deprecation announcement for PostgreSQL 12 in TimescaleDB 2.10, PostgreSQL 12 is not supported starting with TimescaleDB 2.12. Currently supported PostgreSQL major versions are 13, 14 and 15. PostgreSQL 16 support will be added with a following TimescaleDB release.

**Features**
* #5137 Insert into index during chunk compression
* #5150 MERGE support on hypertables
* #5515 Make hypertables support replica identity
* #5586 Index scan support during UPDATE/DELETE on compressed hypertables
* #5596 Support for partial aggregations at chunk level
* #5599 Enable ChunkAppend for partially compressed chunks
* #5655 Improve the number of parallel workers for decompression
* #5758 Enable altering job schedule type through `alter_job`
* #5805 Make logrepl markers for (partial) decompressions
* #5809 Relax invalidation threshold table-level lock to row-level when refreshing a Continuous Aggregate
* #5839 Support CAgg names in chunk_detailed_size
* #5852 Make set_chunk_time_interval CAggs aware
* #5868 Allow ALTER TABLE ... REPLICA IDENTITY (FULL|INDEX) on materialized hypertables (continuous aggregates)
* #5875 Add job exit status and runtime to log
* #5909 CREATE INDEX ONLY ON hypertable creates index on chunks

**Bugfixes**
* #5860 Fix interval calculation for hierarchical CAggs
* #5894 Check unique indexes when enabling compression
* #5951 _timescaledb_internal.create_compressed_chunk doesn't account for existing uncompressed rows
* #5988 Move functions to _timescaledb_functions schema
* #5788 Chunk_create must add an existing table or fail
* #5872 Fix duplicates on partially compressed chunk reads
* #5918 Fix crash in COPY from program returning error
* #5990 Place data in first/last function in correct mctx
* #5991 Call eq_func correctly in time_bucket_gapfill
* #6015 Correct row count in EXPLAIN ANALYZE INSERT .. ON CONFLICT output
* #6035 Fix server crash on UPDATE of compressed chunk
* #6044 Fix server crash when using duplicate segmentby column
* #6045 Fix segfault in set_integer_now_func
* #6053 Fix approximate_row_count for CAggs
* #6081 Improve compressed DML datatype handling
* #6084 Propagate parameter changes to decompress child nodes
* #6102 Schedule compression policy more often

**Thanks**
* @ajcanterbury for reporting a problem with lateral joins on compressed chunks
* @alexanderlaw for reporting multiple server crashes
* @lukaskirner for reporting a bug with monthly continuous aggregates
* @mrksngl for reporting a bug with unusual user names
* @willsbit for reporting a crash in time_bucket_gapfill

github-actions bot assigned sotirissl May 3, 2023

sotirissl marked this pull request as ready for review May 8, 2023 15:15

github-actions bot requested review from mahipv and sb230132 May 8, 2023 15:15

akuzm reviewed May 8, 2023

konskov reviewed May 12, 2023

tsl/test/sql/compression.sql (outdated, resolved)

konskov reviewed May 12, 2023

konskov self-requested a review May 30, 2023 14:33

konskov reviewed May 30, 2023

konskov approved these changes May 30, 2023

akuzm approved these changes May 30, 2023

sotirissl mentioned this pull request Jun 2, 2023

Sotirissl/new decompression workers #5696

Closed

sotirissl merged commit 1a93c2d into timescale:main Jun 2, 2023

sotirissl mentioned this pull request Jul 4, 2023

Ensure partial paths are always created #5855

Closed

svenklemm mentioned this pull request Sep 20, 2023

Release 2.12.0 #6086

Merged

		@@ -578,10 +578,12 @@ ts_decompress_chunk_generate_paths(PlannerInfo *root, RelOptInfo *chunk_rel, Hyp

		/*
		 * since we rely on parallel coordination from the scan below

		int parallel_workers =
			compute_parallel_worker(chunk_rel, chunk_rel->pages, -1, max_parallel_workers_per_gather);
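
To illustrate what relying on PostgreSQL's logic means for the `compute_parallel_worker` call above, here is a simplified, standalone C sketch of the heap-page part of that heuristic: no workers for a relation smaller than `min_parallel_table_scan_size`, then one additional worker each time the relation size triples, capped by the maximum passed in. The function name `sketch_parallel_workers` and its parameters are hypothetical and exist only for this example. The real function takes a `RelOptInfo` and also considers index pages and a per-table `parallel_workers` setting, so this is an approximation under those assumptions, not the planner's code.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical, simplified model of how PostgreSQL derives the number of
 * parallel workers from the heap size of a relation.
 *
 * heap_pages     - size of the relation (here: one chunk) in pages
 * min_scan_pages - min_parallel_table_scan_size expressed in pages
 * max_workers    - upper bound, e.g. max_parallel_workers_per_gather
 */
int
sketch_parallel_workers(double heap_pages, int min_scan_pages, int max_workers)
{
	int threshold;
	int workers = 1;

	assert(max_workers >= 0);

	/* Relations below the scan-size threshold get no parallel workers. */
	if (heap_pages < min_scan_pages)
		return 0;

	/* Start from the threshold (at least one page) ... */
	threshold = min_scan_pages > 1 ? min_scan_pages : 1;

	/* ... and add one worker every time the relation is 3x larger. */
	while (heap_pages >= (double) threshold * 3)
	{
		workers++;
		threshold *= 3;
		if (threshold > INT_MAX / 3)
			break; /* avoid integer overflow on the next multiplication */
	}

	/* Never ask for more workers than the caller allows. */
	return workers < max_workers ? workers : max_workers;
}
```

Under this model, a single chunk of 27648 pages with the default 8 MB threshold (1024 pages of 8 kB) would request 4 workers when `max_parallel_workers_per_gather` permits it. Lowering `min_parallel_table_scan_size`, as the test does, makes even very small chunks eligible for a parallel plan.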


		SET min_parallel_table_scan_size TO '1';

		\set explain 'EXPLAIN (VERBOSE, COSTS OFF)'

		@@ -0,0 +1 @@
		Implements: Improve the number of parallel workers for decompression

	Implements: Improve the number of parallel workers for decompression
	Implements: #5655 Improve the number of parallel workers for decompression
