
[Bug]: Deletion from compressed chunks performance and behaviour #5802

Open
mickael-choisnet opened this issue Jun 19, 2023 · 17 comments

@mickael-choisnet

mickael-choisnet commented Jun 19, 2023

What type of bug is this?

Performance issue

What subsystems and features are affected?

Compression

What happened?

I did some tests regarding the new feature that allows deleting rows from compressed chunks.
I have a few concerns about performance that I would like to share.

The use case is this:
We need to be able to purge data every day, for up to about 3000 devices, covering over a year of data each.
We have a hypertable with 14-day chunks containing index readings (one per day per device for this case, but in our real-world data we also have one per hour per device).
I generated one month of random data for 3 million meters as a sample to run my tests and then project the results.

Here is what I observed:

  • Delete from compressed chunk

    • Target the compressed chunk directly with a delete statement:
    DELETE FROM _timescaledb_internal.compress_hyper_131_316_chunk AS measure
    USING replay_meter_deleted_staging AS meter
    WHERE meter.id = measure.meter_id AND meter.fluid_id = measure.fluid_id;
    • Result message: DELETE 3000
      3000 is the number of IDs targeted for deletion, but 42,000 rows are actually affected
    • Duration: ~13 sec
    • No autovacuum is triggered afterwards
    • When triggered, the compression policy job does nothing (it terminates quickly without doing any work). Were the rows really deleted? It seems not
    • The SELECT query plan is unaffected and the rows seem to have been deleted (they no longer appear in the results), but the chunk size is still the same
      (The chunk size is not really affected for 3000 devices, but for 6000 it shows a slight decrease when deleting from the hypertable)
  • Delete from hypertable

    • Target the hypertable with a delete statement:
    DELETE FROM measures_simple AS measure
    USING replay_meter_deleted_staging AS meter
    WHERE meter.id = measure.meter_id AND meter.fluid_id = measure.fluid_id
    AND measure.measure_date >= '2023-04-27' AND measure.measure_date < '2023-05-11';
    • Result message: DELETE 42000
    • Duration: ~2 min
    • An autovacuum is triggered for 2 chunks, the original and its compressed counterpart (~3 min/chunk)
    • Without triggering the compression policy job, the SELECT query plan changes and becomes very inefficient
    • When triggered, the compression job takes some time (~2 min) and the SELECT query plan is back to normal

It seems the intended way to delete rows from compressed chunks is through the hypertable directly; however, it takes quite some time.
For 1 chunk of 14 days: 2 min delete + 3 min autovacuum + 2 min compression job = 7 min.
If I project this result over 18 months, which roughly matches our present use case but will grow with time, it goes up to ~270 min (~4h).

It is certainly faster than decompressing/deleting/compressing the chunks, but it is still not really efficient for our use case (we have 3 hypertables of this size to handle and 2 smaller ones).
Would it be possible to skip the autovacuum part?
And what about the 2 min of the delete operation, if the compression policy job needs the same amount of time to effectively delete the data afterwards? Is there a way to save time here?
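For reference, the per-chunk decompress/delete/recompress alternative mentioned above would look roughly like this (a sketch using a chunk name taken from the query plans below; decompress_chunk/compress_chunk are the standard TimescaleDB functions):

-- Baseline alternative: fully decompress the chunk, delete, then recompress.
SELECT decompress_chunk('_timescaledb_internal._hyper_130_311_chunk');

DELETE FROM measures_simple AS measure
USING replay_meter_deleted_staging AS meter
WHERE meter.id = measure.meter_id AND meter.fluid_id = measure.fluid_id
AND measure.measure_date >= '2023-04-27' AND measure.measure_date < '2023-05-11';

SELECT compress_chunk('_timescaledb_internal._hyper_130_311_chunk');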

Furthermore, it seems that targeting the compressed chunk directly can achieve the same result faster.
It just lacks the final removal of the data from the chunk through the job (or so it seems).
That would leave us with just a ~10 sec operation + a 2 min compression job.
Was deleting data by targeting the compressed chunks directly intended to be supported?
Could you explore this removal method to speed up the overall process?

For the record, I created a table with 18 months of random data, tried to delete rows for 1500 devices, and gave up after the delete statement had been running for 1h30.

I hesitated between filing this as a bug (deletion from the compressed chunk seems to be an option but not fully implemented) and as an enhancement because, in the end, the functionality is operational. Please feel free to adjust as you see fit.

TimescaleDB version affected

2.11.0

PostgreSQL version used

PostgreSQL 14.5 (Debian 14.5-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit

What operating system did you use?

Debian GNU/Linux 10 (buster) (10.13)

What installation method did you use?

Deb/Apt

What platform did you run on?

Microsoft Azure Cloud

Relevant log output and stack trace

Here is the SELECT query EXPLAIN before any delete: https://explain.depesz.com/s/rTeH

'Sort  (cost=38116.71..38866.71 rows=300000 width=53) (actual time=0.443..0.453 rows=150 loops=1)'
'  Output: m.id, i_1.measure_date, i_1.received_date, i_1.value, i_1.meter_id, i_1.fluid_id, i_1.source, i_1.device_type, i_1.device_id'
'  Sort Key: i_1.measure_date DESC'
'  Sort Method: quicksort  Memory: 46kB'
'  ->  Nested Loop  (cost=0.98..4818.30 rows=300000 width=53) (actual time=0.070..0.390 rows=150 loops=1)'
'        Output: m.id, i_1.measure_date, i_1.received_date, i_1.value, i_1.meter_id, i_1.fluid_id, i_1.source, i_1.device_type, i_1.device_id'
'        ->  Nested Loop  (cost=0.43..155.75 rows=100 width=40) (actual time=0.034..0.079 rows=5 loops=1)'
'              Output: m.id, mid.val'
'              Inner Unique: true'
'              ->  Function Scan on pg_catalog.jsonb_array_elements_text mid  (cost=0.00..1.00 rows=100 width=32) (actual time=0.010..0.011 rows=5 loops=1)'
'                    Output: mid.val'
'                    Function Call: jsonb_array_elements_text(''["1000001", "1500001", "2000001", "2500001", "3000001"]''::jsonb)'
'              ->  Index Only Scan using meter_pkey on public.meter m  (cost=0.43..1.55 rows=1 width=8) (actual time=0.012..0.012 rows=1 loops=5)'
'                    Output: m.id'
'                    Index Cond: (m.id = mid.val)'
'                    Heap Fetches: 0'
'        ->  Append  (cost=0.54..16.63 rows=3000 width=45) (actual time=0.019..0.058 rows=30 loops=5)'
'              ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_130_312_chunk i_1  (cost=0.54..0.54 rows=1000 width=45) (actual time=0.018..0.019 rows=4 loops=5)'
'                    Output: i_1.measure_date, i_1.received_date, i_1.value, i_1.meter_id, i_1.fluid_id, i_1.source, i_1.device_type, i_1.device_id'
'                    Filter: ((i_1.measure_date >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (i_1.measure_date < ''2023-05-01 00:00:00''::timestamp without time zone) AND (m.id = i_1.meter_id))'
'                    Rows Removed by Filter: 10'
'                    ->  Index Scan using compress_hyper_131_317_chunk__compressed_hypertable_131_meter_i on _timescaledb_internal.compress_hyper_131_317_chunk  (cost=0.43..0.54 rows=1 width=222) (actual time=0.013..0.013 rows=1 loops=5)'
'                          Output: compress_hyper_131_317_chunk.measure_date, compress_hyper_131_317_chunk.received_date, compress_hyper_131_317_chunk.value, compress_hyper_131_317_chunk.meter_id, compress_hyper_131_317_chunk.fluid_id, compress_hyper_131_317_chunk.source, compress_hyper_131_317_chunk.device_type, compress_hyper_131_317_chunk.device_id, compress_hyper_131_317_chunk._ts_meta_count, compress_hyper_131_317_chunk._ts_meta_sequence_num, compress_hyper_131_317_chunk._ts_meta_min_1, compress_hyper_131_317_chunk._ts_meta_max_1'
'                          Index Cond: ((compress_hyper_131_317_chunk.meter_id = m.id) AND (compress_hyper_131_317_chunk.fluid_id = 1))'
'                          Filter: ((compress_hyper_131_317_chunk._ts_meta_max_1 >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (compress_hyper_131_317_chunk._ts_meta_min_1 < ''2023-05-01 00:00:00''::timestamp without time zone))'
'              ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_130_311_chunk i_2  (cost=0.54..0.54 rows=1000 width=45) (actual time=0.014..0.017 rows=14 loops=5)'
'                    Output: i_2.measure_date, i_2.received_date, i_2.value, i_2.meter_id, i_2.fluid_id, i_2.source, i_2.device_type, i_2.device_id'
'                    Filter: ((i_2.measure_date >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (i_2.measure_date < ''2023-05-01 00:00:00''::timestamp without time zone) AND (m.id = i_2.meter_id))'
'                    ->  Index Scan using compress_hyper_131_316_chunk__compressed_hypertable_131_meter_i on _timescaledb_internal.compress_hyper_131_316_chunk  (cost=0.43..0.54 rows=1 width=222) (actual time=0.012..0.012 rows=1 loops=5)'
'                          Output: compress_hyper_131_316_chunk.measure_date, compress_hyper_131_316_chunk.received_date, compress_hyper_131_316_chunk.value, compress_hyper_131_316_chunk.meter_id, compress_hyper_131_316_chunk.fluid_id, compress_hyper_131_316_chunk.source, compress_hyper_131_316_chunk.device_type, compress_hyper_131_316_chunk.device_id, compress_hyper_131_316_chunk._ts_meta_count, compress_hyper_131_316_chunk._ts_meta_sequence_num, compress_hyper_131_316_chunk._ts_meta_min_1, compress_hyper_131_316_chunk._ts_meta_max_1'
'                          Index Cond: ((compress_hyper_131_316_chunk.meter_id = m.id) AND (compress_hyper_131_316_chunk.fluid_id = 1))'
'                          Filter: ((compress_hyper_131_316_chunk._ts_meta_max_1 >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (compress_hyper_131_316_chunk._ts_meta_min_1 < ''2023-05-01 00:00:00''::timestamp without time zone))'
'              ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_130_310_chunk i_3  (cost=0.54..0.54 rows=1000 width=45) (actual time=0.014..0.016 rows=12 loops=5)'
'                    Output: i_3.measure_date, i_3.received_date, i_3.value, i_3.meter_id, i_3.fluid_id, i_3.source, i_3.device_type, i_3.device_id'
'                    Filter: ((i_3.measure_date >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (i_3.measure_date < ''2023-05-01 00:00:00''::timestamp without time zone) AND (m.id = i_3.meter_id))'
'                    ->  Index Scan using compress_hyper_131_314_chunk__compressed_hypertable_131_meter_i on _timescaledb_internal.compress_hyper_131_314_chunk  (cost=0.43..0.54 rows=1 width=222) (actual time=0.012..0.012 rows=1 loops=5)'
'                          Output: compress_hyper_131_314_chunk.measure_date, compress_hyper_131_314_chunk.received_date, compress_hyper_131_314_chunk.value, compress_hyper_131_314_chunk.meter_id, compress_hyper_131_314_chunk.fluid_id, compress_hyper_131_314_chunk.source, compress_hyper_131_314_chunk.device_type, compress_hyper_131_314_chunk.device_id, compress_hyper_131_314_chunk._ts_meta_count, compress_hyper_131_314_chunk._ts_meta_sequence_num, compress_hyper_131_314_chunk._ts_meta_min_1, compress_hyper_131_314_chunk._ts_meta_max_1'
'                          Index Cond: ((compress_hyper_131_314_chunk.meter_id = m.id) AND (compress_hyper_131_314_chunk.fluid_id = 1))'
'                          Filter: ((compress_hyper_131_314_chunk._ts_meta_max_1 >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (compress_hyper_131_314_chunk._ts_meta_min_1 < ''2023-05-01 00:00:00''::timestamp without time zone))'
'Planning Time: 1.215 ms'
'Execution Time: 0.527 ms'

Here is the same plan after deleting directly from the compressed chunk: https://explain.depesz.com/s/AHtc

'Sort  (cost=38116.71..38866.71 rows=300000 width=53) (actual time=0.390..0.397 rows=80 loops=1)'
'  Output: m.id, i_1.measure_date, i_1.received_date, i_1.value, i_1.meter_id, i_1.fluid_id, i_1.source, i_1.device_type, i_1.device_id'
'  Sort Key: i_1.measure_date DESC'
'  Sort Method: quicksort  Memory: 36kB'
'  ->  Nested Loop  (cost=0.98..4818.30 rows=300000 width=53) (actual time=0.077..0.355 rows=80 loops=1)'
'        Output: m.id, i_1.measure_date, i_1.received_date, i_1.value, i_1.meter_id, i_1.fluid_id, i_1.source, i_1.device_type, i_1.device_id'
'        ->  Nested Loop  (cost=0.43..155.75 rows=100 width=40) (actual time=0.038..0.082 rows=5 loops=1)'
'              Output: m.id, mid.val'
'              Inner Unique: true'
'              ->  Function Scan on pg_catalog.jsonb_array_elements_text mid  (cost=0.00..1.00 rows=100 width=32) (actual time=0.010..0.011 rows=5 loops=1)'
'                    Output: mid.val'
'                    Function Call: jsonb_array_elements_text(''["1000000", "1500000", "2000000", "2500000", "3000000"]''::jsonb)'
'              ->  Index Only Scan using meter_pkey on public.meter m  (cost=0.43..1.55 rows=1 width=8) (actual time=0.013..0.013 rows=1 loops=5)'
'                    Output: m.id'
'                    Index Cond: (m.id = mid.val)'
'                    Heap Fetches: 0'
'        ->  Append  (cost=0.54..16.63 rows=3000 width=45) (actual time=0.020..0.052 rows=16 loops=5)'
'              ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_130_312_chunk i_1  (cost=0.54..0.54 rows=1000 width=45) (actual time=0.019..0.020 rows=4 loops=5)'
'                    Output: i_1.measure_date, i_1.received_date, i_1.value, i_1.meter_id, i_1.fluid_id, i_1.source, i_1.device_type, i_1.device_id'
'                    Filter: ((i_1.measure_date >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (i_1.measure_date < ''2023-05-01 00:00:00''::timestamp without time zone) AND (m.id = i_1.meter_id))'
'                    Rows Removed by Filter: 10'
'                    ->  Index Scan using compress_hyper_131_317_chunk__compressed_hypertable_131_meter_i on _timescaledb_internal.compress_hyper_131_317_chunk  (cost=0.43..0.54 rows=1 width=222) (actual time=0.013..0.014 rows=1 loops=5)'
'                          Output: compress_hyper_131_317_chunk.measure_date, compress_hyper_131_317_chunk.received_date, compress_hyper_131_317_chunk.value, compress_hyper_131_317_chunk.meter_id, compress_hyper_131_317_chunk.fluid_id, compress_hyper_131_317_chunk.source, compress_hyper_131_317_chunk.device_type, compress_hyper_131_317_chunk.device_id, compress_hyper_131_317_chunk._ts_meta_count, compress_hyper_131_317_chunk._ts_meta_sequence_num, compress_hyper_131_317_chunk._ts_meta_min_1, compress_hyper_131_317_chunk._ts_meta_max_1'
'                          Index Cond: ((compress_hyper_131_317_chunk.meter_id = m.id) AND (compress_hyper_131_317_chunk.fluid_id = 1))'
'                          Filter: ((compress_hyper_131_317_chunk._ts_meta_max_1 >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (compress_hyper_131_317_chunk._ts_meta_min_1 < ''2023-05-01 00:00:00''::timestamp without time zone))'
'              ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_130_311_chunk i_2  (cost=0.54..0.54 rows=1000 width=45) (actual time=0.011..0.011 rows=0 loops=5)'
'                    Output: i_2.measure_date, i_2.received_date, i_2.value, i_2.meter_id, i_2.fluid_id, i_2.source, i_2.device_type, i_2.device_id'
'                    Filter: ((i_2.measure_date >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (i_2.measure_date < ''2023-05-01 00:00:00''::timestamp without time zone) AND (m.id = i_2.meter_id))'
'                    ->  Index Scan using compress_hyper_131_316_chunk__compressed_hypertable_131_meter_i on _timescaledb_internal.compress_hyper_131_316_chunk  (cost=0.43..0.54 rows=1 width=222) (actual time=0.010..0.010 rows=0 loops=5)'
'                          Output: compress_hyper_131_316_chunk.measure_date, compress_hyper_131_316_chunk.received_date, compress_hyper_131_316_chunk.value, compress_hyper_131_316_chunk.meter_id, compress_hyper_131_316_chunk.fluid_id, compress_hyper_131_316_chunk.source, compress_hyper_131_316_chunk.device_type, compress_hyper_131_316_chunk.device_id, compress_hyper_131_316_chunk._ts_meta_count, compress_hyper_131_316_chunk._ts_meta_sequence_num, compress_hyper_131_316_chunk._ts_meta_min_1, compress_hyper_131_316_chunk._ts_meta_max_1'
'                          Index Cond: ((compress_hyper_131_316_chunk.meter_id = m.id) AND (compress_hyper_131_316_chunk.fluid_id = 1))'
'                          Filter: ((compress_hyper_131_316_chunk._ts_meta_max_1 >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (compress_hyper_131_316_chunk._ts_meta_min_1 < ''2023-05-01 00:00:00''::timestamp without time zone))'
'              ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_130_310_chunk i_3  (cost=0.54..0.54 rows=1000 width=45) (actual time=0.014..0.017 rows=12 loops=5)'
'                    Output: i_3.measure_date, i_3.received_date, i_3.value, i_3.meter_id, i_3.fluid_id, i_3.source, i_3.device_type, i_3.device_id'
'                    Filter: ((i_3.measure_date >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (i_3.measure_date < ''2023-05-01 00:00:00''::timestamp without time zone) AND (m.id = i_3.meter_id))'
'                    ->  Index Scan using compress_hyper_131_314_chunk__compressed_hypertable_131_meter_i on _timescaledb_internal.compress_hyper_131_314_chunk  (cost=0.43..0.54 rows=1 width=222) (actual time=0.012..0.012 rows=1 loops=5)'
'                          Output: compress_hyper_131_314_chunk.measure_date, compress_hyper_131_314_chunk.received_date, compress_hyper_131_314_chunk.value, compress_hyper_131_314_chunk.meter_id, compress_hyper_131_314_chunk.fluid_id, compress_hyper_131_314_chunk.source, compress_hyper_131_314_chunk.device_type, compress_hyper_131_314_chunk.device_id, compress_hyper_131_314_chunk._ts_meta_count, compress_hyper_131_314_chunk._ts_meta_sequence_num, compress_hyper_131_314_chunk._ts_meta_min_1, compress_hyper_131_314_chunk._ts_meta_max_1'
'                          Index Cond: ((compress_hyper_131_314_chunk.meter_id = m.id) AND (compress_hyper_131_314_chunk.fluid_id = 1))'
'                          Filter: ((compress_hyper_131_314_chunk._ts_meta_max_1 >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (compress_hyper_131_314_chunk._ts_meta_min_1 < ''2023-05-01 00:00:00''::timestamp without time zone))'
'Planning Time: 1.367 ms'
'Execution Time: 0.480 ms'

Here is the same plan if we do not run the compression job after deleting from the hypertable: https://explain.depesz.com/s/951

'Gather Merge  (cost=32273560.87..32287083.49 rows=117588 width=53) (actual time=41696.719..41711.036 rows=150 loops=1)'
'  Output: m.id, i_2.measure_date, i_2.received_date, i_2.value, i_2.meter_id, i_2.fluid_id, i_2.source, i_2.device_type, i_2.device_id'
'  Workers Planned: 1'
'  Workers Launched: 1'
'  ->  Sort  (cost=32272560.86..32272854.83 rows=117588 width=53) (actual time=41656.539..41656.559 rows=75 loops=2)'
'        Output: m.id, i_2.measure_date, i_2.received_date, i_2.value, i_2.meter_id, i_2.fluid_id, i_2.source, i_2.device_type, i_2.device_id'
'        Sort Key: i_2.measure_date DESC'
'        Sort Method: quicksort  Memory: 36kB'
'        Worker 0:  actual time=41617.105..41617.126 rows=70 loops=1'
'          Sort Method: quicksort  Memory: 34kB'
'          JIT:'
'            Functions: 22'
'            Options: Inlining true, Optimization true, Expressions true, Deforming true'
'            Timing: Generation 4.140 ms, Inlining 89.233 ms, Optimization 219.889 ms, Emission 138.196 ms, Total 451.458 ms'
'        ->  Hash Join  (cost=157.15..32260302.51 rows=117588 width=53) (actual time=793.006..41656.429 rows=75 loops=2)'
'              Output: m.id, i_2.measure_date, i_2.received_date, i_2.value, i_2.meter_id, i_2.fluid_id, i_2.source, i_2.device_type, i_2.device_id'
'              Hash Cond: (i_2.meter_id = m.id)'
'              Worker 0:  actual time=1221.237..41616.973 rows=70 loops=1'
'              ->  Parallel Append  (cost=0.15..19003779.11 rows=3534716843 width=45) (actual time=406.284..36262.610 rows=44949000 loops=2)'
'                    Worker 0:  actual time=447.969..36591.625 rows=39543308 loops=1'
'                    ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_130_311_chunk i_2  (cost=0.15..261652.35 rows=1764706000 width=45) (actual time=224.282..17965.182 rows=20979000 loops=2)'
'                          Output: i_2.measure_date, i_2.received_date, i_2.value, i_2.meter_id, i_2.fluid_id, i_2.source, i_2.device_type, i_2.device_id'
'                          Filter: ((i_2.measure_date >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (i_2.measure_date < ''2023-05-01 00:00:00''::timestamp without time zone))'
'                          Worker 0:  actual time=447.967..32847.851 rows=39543308 loops=1'
'                          ->  Parallel Seq Scan on _timescaledb_internal.compress_hyper_131_316_chunk  (cost=0.00..261652.35 rows=1764706 width=222) (actual time=224.250..11130.016 rows=1498500 loops=2)'
'                                Output: compress_hyper_131_316_chunk.measure_date, compress_hyper_131_316_chunk.received_date, compress_hyper_131_316_chunk.value, compress_hyper_131_316_chunk.meter_id, compress_hyper_131_316_chunk.fluid_id, compress_hyper_131_316_chunk.source, compress_hyper_131_316_chunk.device_type, compress_hyper_131_316_chunk.device_id, compress_hyper_131_316_chunk._ts_meta_count, compress_hyper_131_316_chunk._ts_meta_sequence_num, compress_hyper_131_316_chunk._ts_meta_min_1, compress_hyper_131_316_chunk._ts_meta_max_1'
'                                Filter: ((compress_hyper_131_316_chunk._ts_meta_max_1 >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (compress_hyper_131_316_chunk._ts_meta_min_1 < ''2023-05-01 00:00:00''::timestamp without time zone) AND (compress_hyper_131_316_chunk.fluid_id = 1))'
'                                Worker 0:  actual time=447.917..19819.440 rows=2824522 loops=1'
'                    ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_130_310_chunk i_3  (cost=0.14..244923.47 rows=1762941000 width=45) (actual time=0.080..23243.589 rows=35964000 loops=1)'
'                          Output: i_3.measure_date, i_3.received_date, i_3.value, i_3.meter_id, i_3.fluid_id, i_3.source, i_3.device_type, i_3.device_id'
'                          Filter: ((i_3.measure_date >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (i_3.measure_date < ''2023-05-01 00:00:00''::timestamp without time zone))'
'                          ->  Parallel Seq Scan on _timescaledb_internal.compress_hyper_131_320_chunk  (cost=0.00..244923.47 rows=1762941 width=222) (actual time=0.044..11814.146 rows=2997000 loops=1)'
'                                Output: compress_hyper_131_320_chunk.measure_date, compress_hyper_131_320_chunk.received_date, compress_hyper_131_320_chunk.value, compress_hyper_131_320_chunk.meter_id, compress_hyper_131_320_chunk.fluid_id, compress_hyper_131_320_chunk.source, compress_hyper_131_320_chunk.device_type, compress_hyper_131_320_chunk.device_id, compress_hyper_131_320_chunk._ts_meta_count, compress_hyper_131_320_chunk._ts_meta_sequence_num, compress_hyper_131_320_chunk._ts_meta_min_1, compress_hyper_131_320_chunk._ts_meta_max_1'
'                                Filter: ((compress_hyper_131_320_chunk._ts_meta_max_1 >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (compress_hyper_131_320_chunk._ts_meta_min_1 < ''2023-05-01 00:00:00''::timestamp without time zone) AND (compress_hyper_131_320_chunk.fluid_id = 1))'
'                    ->  Parallel Seq Scan on _timescaledb_internal._hyper_130_312_chunk i_1  (cost=0.00..823619.07 rows=7068843 width=45) (actual time=364.592..5148.690 rows=11976000 loops=1)'
'                          Output: i_1.measure_date, i_1.received_date, i_1.value, i_1.meter_id, i_1.fluid_id, i_1.source, i_1.device_type, i_1.device_id'
'                          Filter: ((i_1.measure_date >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (i_1.measure_date < ''2023-05-01 00:00:00''::timestamp without time zone) AND (i_1.fluid_id = 1))'
'                          Rows Removed by Filter: 29940000'
'                    ->  Custom Scan (DecompressChunk) on _timescaledb_internal._hyper_130_312_chunk i_1  (cost=0.00..0.00 rows=1000 width=45) (actual time=0.005..0.006 rows=0 loops=1)'
'                          Output: i_1.measure_date, i_1.received_date, i_1.value, i_1.meter_id, i_1.fluid_id, i_1.source, i_1.device_type, i_1.device_id'
'                          Filter: ((i_1.measure_date >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (i_1.measure_date < ''2023-05-01 00:00:00''::timestamp without time zone) AND (i_1.fluid_id = 1))'
'                          ->  Parallel Seq Scan on _timescaledb_internal.compress_hyper_131_319_chunk  (cost=0.00..0.00 rows=1 width=222) (actual time=0.002..0.002 rows=0 loops=1)'
'                                Output: compress_hyper_131_319_chunk.measure_date, compress_hyper_131_319_chunk.received_date, compress_hyper_131_319_chunk.value, compress_hyper_131_319_chunk.meter_id, compress_hyper_131_319_chunk.fluid_id, compress_hyper_131_319_chunk.source, compress_hyper_131_319_chunk.device_type, compress_hyper_131_319_chunk.device_id, compress_hyper_131_319_chunk._ts_meta_count, compress_hyper_131_319_chunk._ts_meta_sequence_num, compress_hyper_131_319_chunk._ts_meta_min_1, compress_hyper_131_319_chunk._ts_meta_max_1'
'                                Filter: ((compress_hyper_131_319_chunk._ts_meta_max_1 >= ''2023-04-01 00:00:00''::timestamp without time zone) AND (compress_hyper_131_319_chunk._ts_meta_min_1 < ''2023-05-01 00:00:00''::timestamp without time zone) AND (compress_hyper_131_319_chunk.fluid_id = 1))'
'              ->  Hash  (cost=155.75..155.75 rows=100 width=40) (actual time=0.215..0.216 rows=5 loops=2)'
'                    Output: m.id, mid.val'
'                    Buckets: 1024  Batches: 1  Memory Usage: 9kB'
'                    Worker 0:  actual time=0.283..0.285 rows=5 loops=1'
'                    ->  Nested Loop  (cost=0.43..155.75 rows=100 width=40) (actual time=0.138..0.208 rows=5 loops=2)'
'                          Output: m.id, mid.val'
'                          Inner Unique: true'
'                          Worker 0:  actual time=0.194..0.277 rows=5 loops=1'
'                          ->  Function Scan on pg_catalog.jsonb_array_elements_text mid  (cost=0.00..1.00 rows=100 width=32) (actual time=0.069..0.070 rows=5 loops=2)'
'                                Output: mid.val'
'                                Function Call: jsonb_array_elements_text(''["1000002", "1500002", "2000002", "2500002", "3000002"]''::jsonb)'
'                                Worker 0:  actual time=0.111..0.112 rows=5 loops=1'
'                          ->  Index Only Scan using meter_pkey on public.meter m  (cost=0.43..1.55 rows=1 width=8) (actual time=0.025..0.025 rows=1 loops=10)'
'                                Output: m.id'
'                                Index Cond: (m.id = mid.val)'
'                                Heap Fetches: 0'
'                                Worker 0:  actual time=0.030..0.030 rows=1 loops=5'
'Planning Time: 1.738 ms'
'JIT:'
'  Functions: 44'
'  Options: Inlining true, Optimization true, Expressions true, Deforming true'
'  Timing: Generation 6.897 ms, Inlining 104.813 ms, Optimization 430.022 ms, Emission 276.432 ms, Total 818.164 ms'
'Execution Time: 41714.283 ms'

How can we reproduce the bug?

Here are the scripts used to produce the results.

-- CREATE METADATA TABLE
DROP TABLE IF EXISTS public.meter;
CREATE TABLE IF NOT EXISTS public.meter(
    id TEXT,
    fluid_id SMALLINT NOT NULL,
    date_install TIMESTAMP,
    serial_number TEXT,
    PRIMARY KEY(id)
);
INSERT INTO meter(id, fluid_id, date_install, serial_number)
SELECT 
    CAST(gs.id AS TEXT) AS id, 
    1 AS fluid_id,
    '2022-01-01'::TIMESTAMP + (floor(random() * 364) || ' DAYS')::INTERVAL AS date_install,
    floor(random() * 10000000 + 1000000) AS serial_number
FROM (SELECT generate_series(1000000,4000000) AS id) gs;


-- CREATE HYPERTABLE TABLE
DROP TABLE IF EXISTS measures_simple;
CREATE TABLE public.measures_simple (
    measure_date TIMESTAMP NOT NULL,
    received_date TIMESTAMP NOT NULL,
    value DOUBLE PRECISION,
    meter_id TEXT NOT NULL,
    fluid_id SMALLINT NOT NULL,
    source SMALLINT NOT NULL,
    device_type SMALLINT NULL,
    device_id TEXT NULL
);
SELECT create_hypertable('measures_simple', 'measure_date', chunk_time_interval => INTERVAL '14 DAYS', create_default_indexes => false);
ALTER TABLE measures_simple SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'meter_id,fluid_id'
);
SELECT add_compression_policy(hypertable => 'measures_simple', compress_after => INTERVAL '30 DAYS', if_not_exists => true);


-- CREATE PROCEDURE TO FAKE DATA
CREATE OR REPLACE PROCEDURE public.load_measures_simple (IN date_start TIMESTAMP, IN date_end TIMESTAMP, IN gs_start INTEGER, IN gs_end INTEGER) AS
$BODY$
DECLARE
    inserted_at TIMESTAMP;
BEGIN
    SELECT current_timestamp::TIMESTAMP INTO inserted_at;

    WITH RECURSIVE some_days AS (
        SELECT date_start AS rdate 
        UNION ALL 
        SELECT rdate + INTERVAL '1 DAY'
        FROM some_days
        WHERE rdate < date_end
    )
    INSERT INTO measures_simple(measure_date, received_date, value, meter_id, fluid_id, source, device_type, device_id)
    SELECT
        d.rdate + (floor(random() * 22) || ' HOURS')::INTERVAL AS measure_date, 
        inserted_at AS received_date, 
        floor(random() * 10000 + 1000) AS value, 
        CAST(gs.id as TEXT) AS meter_id, 
        1 AS fluid_id, 
        floor(random() * 10 + 1) AS source, 
        floor(random() * 8 + 1) AS device_type, 
        'ABCDEF' AS device_id
    FROM (SELECT generate_series(gs_start, gs_end) AS id) gs
    INNER JOIN lateral (SELECT * FROM some_days) d ON true;
END;
$BODY$ 
LANGUAGE PLPGSQL;


-- FAKE DATA FOR ABOUT A MONTH FOR 3 000 000 meters
CALL load_measures_simple(date_start => '2023-04-01', date_end => '2023-05-12', gs_start => 1000000, gs_end => 3999999);


-- MANUALLY COMPRESS HYPERTABLE CHUNKS
DO
$BODY$
    DECLARE
        chunk   RECORD;
    BEGIN
        FOR chunk IN SELECT show_chunks('measures_simple', older_than => ('15 DAYS')::INTERVAL) AS name
        LOOP
            RAISE NOTICE '% - Compressing chunk "%"...', timeofday(), chunk.name;
            PERFORM compress_chunk(chunk.name, if_not_compressed => true);
            RAISE NOTICE '% - chunk "%" Compressed.', timeofday(), chunk.name;
            COMMIT;
        END LOOP;
    END;
$BODY$;


-- CREATE HYPERTABLE INDEX
DROP INDEX IF EXISTS ix_measures_simple_meter_id_fluid_id_measure_date_received_date;
CREATE INDEX ix_measures_simple_meter_id_fluid_id_measure_date_received_date ON public.measures_simple (meter_id, fluid_id, measure_date desc, received_date desc);


-- TRIGGER THE COMPRESSION JOB (just to be sure nothing's pending)
DO
$BODY$
    DECLARE
        jid  INT;
    BEGIN
        SELECT job_id INTO jid FROM timescaledb_information.jobs WHERE hypertable_name = 'measures_simple' AND proc_name = 'policy_compression';
        CALL run_job(jid);
        --PERFORM alter_job(jid, next_start=> current_timestamp);
    END;
$BODY$;


-- CREATE A TABLE WITH THE IDS TO DELETE FROM HYPERTABLE
DROP TABLE IF EXISTS public.replay_meter_deleted_staging;
CREATE TABLE public.replay_meter_deleted_staging(
    id TEXT NOT NULL,
    fluid_id SMALLINT NOT NULL
);
INSERT INTO replay_meter_deleted_staging(id, fluid_id)
SELECT 
    CAST(gs.id AS TEXT) AS id, 
    1 AS fluid_id
FROM (SELECT generate_series(1000000,3999999,1000) AS id) gs;
DROP INDEX IF EXISTS ix_replay_meter_deleted_staging_id_fluid_id;
CREATE INDEX ix_replay_meter_deleted_staging_id_fluid_id ON public.replay_meter_deleted_staging (id, fluid_id);


-- FIRST CASE: DELETE TARGETING COMPRESSED CHUNK DIRECTLY
-- compress_hyper_131_316_chunk -> get the name of the chunk through internal timescale tables
DELETE FROM _timescaledb_internal.compress_hyper_131_316_chunk AS measure
USING replay_meter_deleted_staging AS meter
WHERE meter.id = measure.meter_id AND meter.fluid_id = measure.fluid_id;
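-- (A sketch of how the compressed chunk name used above can be looked up via
-- the internal catalog; these are internal _timescaledb_catalog tables, so
-- column names may change between versions.)
SELECT comp.schema_name || '.' || comp.table_name AS compressed_chunk_name
FROM _timescaledb_catalog.chunk c
JOIN _timescaledb_catalog.chunk comp ON comp.id = c.compressed_chunk_id
JOIN _timescaledb_catalog.hypertable h ON h.id = c.hypertable_id
WHERE h.table_name = 'measures_simple';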


-- SECOND CASE: DELETE TARGETING HYPERTABLE
DELETE FROM measures_simple AS measure
USING replay_meter_deleted_staging AS meter
WHERE meter.id = measure.meter_id AND meter.fluid_id = measure.fluid_id
AND measure.measure_date >= '2023-04-27' AND measure.measure_date < '2023-05-11';


-- SELECT QUERY USED TO CHECK IF THE PLAN CHANGE
EXPLAIN(ANALYZE,VERBOSE)
SELECT m.id, i.*
FROM measures_simple i
INNER JOIN meter m ON m.id = i.meter_id
INNER JOIN jsonb_array_elements_text('["1000000","1500000","2000000","2500000","3000000"]') mid(val) ON mid.val = m.id
WHERE i.measure_date >= '2023-04-01' AND i.measure_date < '2023-05-01'
    AND i.fluid_id = 1
ORDER BY i.measure_date DESC;
@sb230132
Contributor

Hi @mickael-choisnet
Thanks for the detailed report.
Deleting directly from a compressed chunk is not recommended. As you mentioned, deleting from a hypertable is time-consuming because of several internal operations: decompress + delete + recompress + vacuum.

What you are asking is whether there is a way to skip the vacuum, which is a good suggestion.
Vacuum is run to reclaim space that is no longer needed after the delete. However, I agree that the end user should be given an option to run the vacuum or not via a GUC variable.
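(In the meantime, autovacuum can already be disabled per table with PostgreSQL's standard storage parameter. A minimal sketch, using one of the chunk names from your report purely as an illustration and not as a recommendation:)

-- Chunks are regular tables, so the per-table autovacuum switch applies to
-- them; the compressed counterpart would need the same setting. Whether
-- disabling autovacuum on internal chunk tables is advisable is a separate
-- question.
ALTER TABLE _timescaledb_internal._hyper_130_311_chunk
    SET (autovacuum_enabled = false);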

Let me discuss this with the team and check what can be done. Thanks.

@mickael-choisnet
Author

Hi @sb230132,

Thanks for the quick reply and for asking about the possibility of skipping the vacuum when needed (it is triggered after the job anyway, and without triggering it the chunk can no longer be queried efficiently).

Regarding what you explained, I wonder if it would be possible to keep whatever behavior is implemented when deleting directly from the compressed chunk, as it is very fast and seems efficient with respect to the SELECT results and plan,
and to move the whole decompress + delete + recompress + vacuum process into the compression policy job?

I'm not sure I understand why 2 minutes are spent on the delete and then 2 more during the next job run, if the delete already took care of decompress + delete + recompress.

@sb230132
Contributor

Hi @mickael-choisnet

Any UPDATE/DELETE on a compressed chunk results in decompression (~2 min) + the update/delete.
The compression policy job then performs the recompression (~2 min) + the vacuum (~3 min).

Hope it's clear now.

@mickael-choisnet
Author

Hi @sb230132,

Thanks for the explanations.
I've done more tests last Friday and I have some new observations I would like to share with you.

It looks like the chunk is decompressed whether or not there are rows to delete.
A good example is running the same delete statement twice: the second time there is nothing left to delete, but the query still goes through the decompression.
Another case is inserting data into only one chunk and then executing a delete statement across the entire hypertable.
When we have to delete all data related to an ID, it could save a lot of time if the process could determine whether there is anything to delete before starting the decompression.
Is it possible to improve the process to do so?
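As a workaround on our side, a cheap existence check before issuing the DELETE could skip the statement entirely when nothing matches. A minimal sketch against the tables from my reproduction script (a client-side check, not a TimescaleDB feature):

-- Only run the expensive DELETE when the staging table actually matches rows;
-- the plain SELECT can use the segmentby index on the compressed chunks and
-- does not change the chunk status.
DO
$BODY$
BEGIN
    IF EXISTS (
        SELECT 1
        FROM measures_simple m
        JOIN replay_meter_deleted_staging r
          ON r.id = m.meter_id AND r.fluid_id = m.fluid_id
    ) THEN
        DELETE FROM measures_simple AS measure
        USING replay_meter_deleted_staging AS meter
        WHERE meter.id = measure.meter_id
          AND meter.fluid_id = measure.fluid_id;
    END IF;
END;
$BODY$;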

I tried deleting various amounts of rows and the duration was always the same.
This leads me to believe that the whole chunk is decompressed in order to proceed with the deletion.
I remember something that was mentioned to me about an improvement made to the insert process here.
In the delete scenario there are no new rows, but could something similar be done for the partial decompress/recompress?

@sb230132
Contributor

Hi @mickael-choisnet

Decompression happens only if there are rows matching the WHERE clause. Please see below:
the second delete does nothing since all matching rows have already been deleted.

For a DELETE statement with segmentby columns in the WHERE clause, only the affected segments of the compressed chunks are decompressed and deleted. The next delete will do nothing.

The same applies to UPDATE.

test1=# begin;
BEGIN
Time: 2.906 ms
test1=*# select table_name, case when status=1 then 'compressed' else 'uncompressed' end as compression_status from _timescaledb_catalog.chunk where hypertable_id = 1;
    table_name    | compression_status
------------------+--------------------
 _hyper_1_3_chunk | compressed
 _hyper_1_2_chunk | compressed
 _hyper_1_1_chunk | compressed
(3 rows)

Time: 3.217 ms
test1=*# delete from metrics_compressed where device_id = 3 and time >= '2000-01-06 05:30:00+05:30'::timestamp with time zone AND time < '2000-01-13 05:30:00+05:30'::timestamp with time zone;
DELETE 5038
Time: 65.287 ms
test1=*# select table_name, case when status=1 then 'compressed' else 'uncompressed' end as compression_status from _timescaledb_catalog.chunk where hypertable_id = 1;
    table_name    | compression_status
------------------+--------------------
 _hyper_1_3_chunk | compressed
 _hyper_1_2_chunk | uncompressed
 _hyper_1_1_chunk | compressed
(3 rows)

Time: 3.339 ms
test1=*# delete from metrics_compressed where device_id = 3 and time >= '2000-01-06 05:30:00+05:30'::timestamp with time zone AND time < '2000-01-13 05:30:00+05:30'::timestamp with time zone;
DELETE 0
Time: 6.887 ms
test1=*# select table_name, case when status=1 then 'compressed' else 'uncompressed' end as compression_status from _timescaledb_catalog.chunk where hypertable_id = 1;
    table_name    | compression_status
------------------+--------------------
 _hyper_1_3_chunk | compressed
 _hyper_1_2_chunk | uncompressed
 _hyper_1_1_chunk | compressed
(3 rows)

Time: 1.665 ms
test1=*# rollback;
ROLLBACK
Time: 3.449 ms

If you are still not convinced, please provide a test case so I can understand better. I will try to provide more details based on your test case.

@mickael-choisnet
Author

Hi @sb230132,

Indeed, a direct delete with a WHERE clause targeting the meter_id column does not decompress the chunks.
It still takes ~1 minute to delete 0 rows with 3 chunks of ~5.9 GB, though.
The chunks are still compressed at the end, so is that the time needed to scan all the chunks?

But my use case is a DELETE that uses another table to target the rows.
In this specific case, it seems all the chunks are decompressed (I've just run the delete statement on my database and the chunk status was 9 for all the chunks at the end of the command, whereas it was 1 before).

You can use what I suggested in my reproduction steps; for your test case it would be something like:

DROP TABLE IF EXISTS public.device_to_delete;
CREATE TABLE public.device_to_delete(
    id TEXT NOT NULL
);
INSERT INTO device_to_delete(id)
SELECT 3;
DROP INDEX IF EXISTS ix_device_to_delete_id;
CREATE INDEX ix_device_to_delete_id ON public.device_to_delete (id);

DELETE FROM metrics_compressed AS metrics
USING device_to_delete AS device
WHERE device.id = metrics.device_id;
-- you can add a condition on the date 

Here are the traces from my case:

testing-db=# select table_name, case when status=1 then 'compressed' else 'uncompressed' end as compression_status from _timescaledb_catalog.chunk where hypertable_id = 138;
      table_name      | compression_status
----------------------+--------------------
 _hyper_138_348_chunk | compressed
 _hyper_138_349_chunk | compressed
 _hyper_138_350_chunk | compressed
(3 rows)

Time: 0.481 ms
testing-db=# select * from replay_meter_deleted_staging;
   id    | fluid_id
---------+----------
 1000001 |        1
(1 row)

Time: 0.352 ms
testing-db=# select count(*) from measures_simple m join replay_meter_deleted_staging r on r.id = m.meter_id and r.fluid_id = m.fluid_id;
 count
-------
     0
(1 row)

Time: 2.701 ms
releve-data-analytics-testing-db=# delete from measures_simple as measure using replay_meter_deleted_staging as meter where meter.id = measure.meter_id and meter.fluid_id = measure.fluid_id;
DELETE 0
Time: 790867.776 ms (13:10.868)
releve-data-analytics-testing-db=# select table_name, case when status=1 then 'compressed' else 'uncompressed' end as compression_status from _timescaledb_catalog.chunk where hypertable_id = 138;
      table_name      | compression_status
----------------------+--------------------
 _hyper_138_348_chunk | uncompressed
 _hyper_138_349_chunk | uncompressed
 _hyper_138_350_chunk | uncompressed
(3 rows)

Time: 0.712 ms

@sb230132
Contributor

Hi @mickael-choisnet
Thanks for the testcase.

You are right. Unfortunately, as of today, the optimisation is only implemented for WHERE predicates with simple conditions on constant literals. We have a PR on the roadmap to extend the optimisation to any kind of WHERE predicate. Once that PR is implemented, we should be able to resolve this issue.
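Until then, one possible workaround is to issue one DELETE per staged id so that the WHERE clause only contains segmentby columns compared against constants. A rough sketch against the tables from the reproduction script (whether values inlined via dynamic SQL qualify for the optimisation exactly like hand-written literals is an assumption to verify):

-- Hypothetical workaround: loop over the staging table and delete per meter,
-- so each DELETE has only segmentby columns with constant values in its
-- WHERE clause. format() inlines the values as literals.
DO
$BODY$
DECLARE
    r RECORD;
BEGIN
    FOR r IN SELECT id, fluid_id FROM replay_meter_deleted_staging
    LOOP
        EXECUTE format(
            'DELETE FROM measures_simple WHERE meter_id = %L AND fluid_id = %s',
            r.id, r.fluid_id);
    END LOOP;
END;
$BODY$;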

@mickael-choisnet
Author

Hi @sb230132

Thanks for the input.
Do you know if this will be part of the next release (2.11.1 or 2.11.2), or if it needs more time to be completed?

I'm eager to test that version; it could significantly improve the performance of my scenario, and we may finally be able to implement the solution we want.

@sb230132
Contributor

Hi @mickael-choisnet
I am not sure when it will get prioritised. Definitely not in 2.11.1 or 2.11.2.


Dear Author,

This issue has been automatically marked as stale due to lack of activity. With only the issue description that is currently provided, we do not have enough information to take action. If you have or find the answers we would need, please reach out. Otherwise, this issue will be closed in 30 days.

Thank you!

@wrobell

wrobell commented Nov 12, 2023

@sb230132 @jnidzwetzki Is this ticket a duplicate of some other ticket? Is the "where" predicate improvement still on your roadmap?

Whatever it is, and whatever your decision, IMHO this ticket does not deserve automatic culling.

@jnidzwetzki
Contributor

Hello @wrobell,

Thanks for getting back to us. I can confirm that the issue is still in the backlog. However, it seems that the issue was mislabeled and, as a result, the stale bot processed it last night. I have corrected the labels and the issue will remain open until it is resolved. I apologize for any inconvenience this may have caused.

@wrobell

wrobell commented Nov 12, 2023

thank you

@yarkoyarok

Our team ran into this issue as well. In our case it is very convenient to drop segments from compressed chunks directly, using the segment_by field: the DELETE query then takes 43 ms, whereas deleting exactly the same amount of data from a single chunk through the hypertable takes about 1 min 30 s. So we decided to use the drop-segments technique.

This led us to a proposal:

Could the TimescaleDB query planner use this internally somehow? From a distance, the algorithm seems simple:

If the WHERE clause consists only of segment_by and time columns and completely covers some compressed chunk(s), then the data can be dropped directly by segment_by from the compressed chunk(s) without decompression.

PS. @sb230132 mentioned that

Deleting directly from a compressed chunk is not recommended.

If you have a bit of time, could you elaborate a bit more on the possible risks of such a deletion?

Our team would be very grateful.

Currently I see only one possible problem: if a chunk has recently received new data and has not been recompressed, some tuples may not be compressed, and the chunk is marked as

timescaledb/src/chunk.h, lines 312 to 316 in e1676ce:

/*
* A chunk is in this state when it is compressed but also has uncompressed tuples
* in the uncompressed chunk.
*/
#define CHUNK_STATUS_COMPRESSED_PARTIAL 8

By the way, is it possible to obtain such chunk state via SQL?
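From the catalog queries shown earlier in this thread, I would guess something like the following exposes those bits (a sketch against the internal catalog, so the exact meaning of the bits may change between versions):

-- status is a bit field: 1 = compressed, 8 = compressed but with additional
-- uncompressed tuples (the CHUNK_STATUS_COMPRESSED_PARTIAL flag quoted above).
SELECT c.table_name,
       c.status,
       (c.status & 1) > 0 AS is_compressed,
       (c.status & 8) > 0 AS is_partially_compressed
FROM _timescaledb_catalog.chunk c
JOIN _timescaledb_catalog.hypertable h ON h.id = c.hypertable_id
WHERE h.table_name = 'measures_simple';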

@antekresic
Contributor

antekresic commented Dec 9, 2024

Hi there.

It seems this issue was addressed in version 2.17. The PR in question: #6882

Have you had the time to check it out?

@antekresic antekresic self-assigned this Dec 9, 2024
@mickael-choisnet
Author

Hi,

Unfortunately, I haven't had time lately to check the latest improvements made by the team.
I don't think I'll have time to look into this before next year.

Thanks for the news though; I'll try to test this version as soon as possible, as it could well be a must-have if it's an effective way to delete data from compressed chunks.

@antekresic
Contributor

Feel free to chime in here with your findings.

Thanks!
