[Bug]: Deletion in compressed chunks led to database size increase of nearly 20x #6196
Comments
Hi @Oidlichtnwoada, thanks for reaching out. I understand this issue is not of your making, so the extra time you spent reporting it is really appreciated. I think that is probably why VACUUM appears to be ineffective: if the majority of the chunk remains uncompressed and only a minor amount is deleted, VACUUM has no noticeable effect on the chunk size, which is dominated by the uncompressed data. Could you please share the explain plan for the delete statement you are running? That would help us understand the situation better. As a potential workaround, we would suggest updating and then recompressing each chunk one by one. This would keep the size reasonable, as at most one chunk would be uncompressed at a time. Hope that helps.
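For what it's worth, here is a minimal sketch of that per-chunk approach, assuming the hypertable and DELETE predicate from this report; decompress_chunk, compress_chunk and show_chunks are regular TimescaleDB functions, but the batching loop itself is only illustrative and should be tested on a copy first:

-- Decompress, delete, and recompress one chunk at a time so that at most
-- one chunk is held uncompressed; the COMMIT between chunks releases the
-- intermediate space before moving on (requires PostgreSQL 11+ and running
-- the DO block outside an explicit transaction).
DO $$
DECLARE
    chunk regclass;
BEGIN
    FOR chunk IN SELECT show_chunks('measurements')
    LOOP
        PERFORM decompress_chunk(chunk, if_compressed => true);
        EXECUTE format(
            'DELETE FROM %s WHERE variable NOT IN (SELECT id FROM other_table)',
            chunk);
        PERFORM compress_chunk(chunk, if_not_compressed => true);
        COMMIT;
    END LOOP;
END $$;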
This is what I've been told, but I do not find it to be the case. I have also been having disk space issues trying to use the new feature of deleting directly from compressed chunks. If I do it the old manual way, decompressing the affected chunks and then deleting the rows, everything works as expected. With the automatic process, resource usage goes crazy: CPU, RAM, and disk. Something isn't right.
Certain execution plans generated by Timescale might be tricky. I am struggling with something similar (I hope to raise a separate ticket soon). @djzurawski, please compare the plan for
vs
Is there a big difference? (A shot in the dark, as the table schemas are unspecified.)
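The two statements to compare are not shown above, so as a stand-in, here is how one could capture and compare plans for the reported NOT IN delete and a NOT EXISTS rewrite (the rewrite is only equivalent if other_table.id contains no NULLs; table and column names follow the report):

-- ANALYZE actually executes the DELETE, so wrap each run in a transaction
-- and roll it back after reading the plan.
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
    DELETE FROM measurements m
    WHERE m.variable NOT IN (SELECT id FROM other_table);
ROLLBACK;

BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
    DELETE FROM measurements m
    WHERE NOT EXISTS (SELECT 1 FROM other_table o WHERE o.id = m.variable);
ROLLBACK;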
Our resource usage also broke completely. I would not have released the deletion feature for compressed chunks, as there are only two user groups:
#6309 might be related.
Same behavior observed. Running a DELETE operation shows big jumps in RAM, disk, and CPU usage even when targeting a single sample.
I understand the increase in disk usage, since the chunk needs to be decompressed, but even after the deletion completes we see bad query performance on the chunks whose data was deleted. This issue also seems similar to #5802. Is there any update on fixing deletion from compressed chunks, or does anyone have a recommended workaround that avoids the performance degradation?
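To see which chunks the DELETE left uncompressed (and how much they grew), TimescaleDB's chunk-level compression stats can be queried; a sketch assuming the "measurements" hypertable from this report:

-- Per-chunk compression status and sizes; 'Uncompressed' rows are candidates
-- for recompression once the DELETE has finished.
SELECT chunk_name,
       compression_status,
       pg_size_pretty(before_compression_total_bytes) AS uncompressed_size,
       pg_size_pretty(after_compression_total_bytes)  AS compressed_size
FROM chunk_compression_stats('measurements')
ORDER BY chunk_name;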
There have been recent optimizations to UPDATE/DELETE operations released with version 2.17. Those should improve the resource usage when running such operations. Have you had the chance to check if the situation improved since? |
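For anyone checking whether they are on a release with those optimizations, the installed extension version can be inspected and upgraded with standard commands (per the TimescaleDB upgrade instructions, the upgrade should be run first thing in a fresh session):

-- Show the TimescaleDB version installed in the current database
SELECT extversion FROM pg_extension WHERE extname = 'timescaledb';

-- Upgrade the extension to the latest installed version
ALTER EXTENSION timescaledb UPDATE;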
What type of bug is this?
Crash, Performance issue
What subsystems and features are affected?
Compression
What happened?
We have a hypertable "measurements" with 2300 6-hour chunks. It was 25 GB compressed (according to hypertable_size; the whole database used 30 GB). We ran a DELETE statement on measurements to remove the measurements of deleted projects in that hypertable. This took our whole cluster down: right after executing the DELETE, the database suddenly had a size of 330 GB in Longhorn (mostly the hypertable, as queried with hypertable_size) and it was still growing, but the HDD space of all nodes was already full. This effectively killed our whole Kubernetes cluster, because each volume replica in Longhorn filled up the HDD of the underlying node (not even the volume size limit in Longhorn was enforced). Could you please mention in the docs that during a DELETE the database size will increase rapidly, as it seems that every chunk decompresses itself? I am not sure whether that is intended; it would be smarter to decompress, delete data, and recompress chunks in batches rather than decompressing everything, which leads to massive HDD usage.
We run the Docker image timescale/timescaledb-ha:pg14.9-ts2.11.2-all with a Longhorn volume.
Furthermore, once chunks are decompressed, Longhorn cannot reclaim the unused HDD space, even after VACUUM is executed. Is a VACUUM FULL with downtime necessary? After the decompression the size stays large in Longhorn, and even after a filesystem trim nothing can be reclaimed. There must be some error in the PG/TimescaleDB implementation.
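On the space-reclamation point: a plain VACUUM only makes dead space reusable inside PostgreSQL; handing blocks back to the filesystem (and therefore to Longhorn) normally requires a table rewrite such as VACUUM FULL, which takes an ACCESS EXCLUSIVE lock while it runs. A sketch of doing that chunk by chunk from psql, assuming the "measurements" hypertable:

-- Generate one VACUUM FULL per chunk and execute each with psql's \gexec,
-- so only one chunk is locked and rewritten at a time.
SELECT format('VACUUM FULL %s', c) FROM show_chunks('measurements') AS c \gexec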
Thanks in advance for the response.
The delete statement was something like:
DELETE FROM measurements m WHERE m.variable NOT IN (SELECT id FROM other_table);
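As a side note, one way to bound how much data gets decompressed per statement (an assumption on my part, not something verified in this thread) is to run the same DELETE in time-bounded batches, so chunk exclusion limits the work to the chunks covering each window; the column name "time" and the window bounds below are hypothetical:

-- Hypothetical batch: only chunks overlapping this window are touched,
-- so only that window's worth of chunks is decompressed at once.
DELETE FROM measurements m
WHERE m.variable NOT IN (SELECT id FROM other_table)
  AND m."time" >= TIMESTAMPTZ '2023-01-01'
  AND m."time" <  TIMESTAMPTZ '2023-01-02';
-- Repeat with the next window, recompressing the affected chunks in between.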
TimescaleDB version affected
2.11.2
PostgreSQL version used
14.9
What operating system did you use?
docker
What installation method did you use?
Docker
What platform did you run on?
On prem/Self-hosted
Relevant log output and stack trace
No response
How can we reproduce the bug?