aborted full compaction group ... compaction in progress ... file exists #8559
From my current analysis it appears that in some cases compactions freeze up. The aborting messages come from the compactor retrying the compaction and failing each time, and restarting the server clears the condition. @phemmer how many logical cores do you have on the machine? What about RAM?
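(A minimal sketch, for anyone checking whether they are in the same state, of how to look for temporary files left behind by an aborted compaction; `/var/lib/influxdb/data` is only the default data directory and is used here as an example, adjust to your install.)

```sh
# List any in-progress/leftover compaction temp files (.tsm.tmp) under each shard.
# Replace /var/lib/influxdb/data with your actual data directory if it differs.
find /var/lib/influxdb/data -name '*.tsm.tmp' -exec ls -lh {} +
```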
@phemmer the next time it happens would you be able to
8 cores, 16 GB RAM.
Yes, usually a span of about 5 minutes.
Sometimes deleting/dropping, but not regularly.
I only have one DB.
Sorry, forgot to add: 303 shards.
Thanks for the quick responses. I'm currently trying to come up with a similar workload to see if I can reproduce. Do you have lots of measurements or just a few? You mentioned occasional deletes/drops - which command do you typically use for those?
11 measurements. I use
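(The reply above is cut off before the command name, so for context, these are the kinds of InfluxQL delete/drop statements the question is about — a sketch using the 1.x influx CLI, with the database and measurement names made up for illustration.)

```sh
# Hypothetical examples of drop/delete commands; 'telemetry', 'old_metrics'
# and 'requests' are placeholder names, not taken from this issue.
influx -database 'telemetry' -execute 'DROP MEASUREMENT "old_metrics"'
influx -database 'telemetry' -execute 'DELETE FROM "requests" WHERE time < now() - 30d'
```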
@phemmer
I might be able to. I think I've figured out why I'm having so many issues with InfluxDB, and why the performance is so abysmal: writing points with timestamps in the past doesn't seem to be handled well, even when those points are written to a completely different database. So I'm going to have to spin up a second InfluxDB instance with a completely separate data storage path for my analytical results, so that it doesn't tank the main InfluxDB. I can run this second instance with your patch since I'm not worried about losing data there. I just mention it because it won't be the same configuration the issue was reported on. There'll still be
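(A minimal sketch of what such a second, isolated instance might look like, assuming InfluxDB 1.x; the config file name, directories, and port are placeholders, not the reporter's actual setup.)

```sh
# Run a second influxd whose config points at separate meta/data/wal directories
# and a non-default HTTP port, so it cannot collide with the main instance.
# /etc/influxdb/influxdb-analytics.conf is hypothetical and would contain e.g.:
#   [meta] dir = "/var/lib/influxdb-analytics/meta"
#   [data] dir = "/var/lib/influxdb-analytics/data"
#          wal-dir = "/var/lib/influxdb-analytics/wal"
#   [http] bind-address = ":8087"
influxd -config /etc/influxdb/influxdb-analytics.conf
```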
@phemmer I have managed to repro the issue without needing incoming writes. Merely dropping measurements and getting in the way of compactions is enough to trigger it, so I think your alternative setup would be fine for testing the issue. As for historical data: that's an interesting suggestion, and we will certainly look into it. Do you feel historical write performance has degraded compared to
Fixed via #8562
Bug report

System info:
Version: e3918e0
OS: Linux

Steps to reproduce:

Expected behavior:
No compaction errors in the log.

Actual behavior:
Repeated "aborted full compaction group ... compaction in progress ... file exists" errors in the log.

Additional info:
Issue being opened as requested here.
The issue continues until I restart InfluxDB.
This has happened to me several times now. I have not identified a pattern for when it happens, though I have been experiencing a ton of performance issues with InfluxDB, so I'm not sure if it's related. Basically, InfluxDB has only 16GB of data, ~20k series, and only about 80KB/s of writes (according to the "http" measurement in the _internal db), yet it's experiencing horrible performance: queries and writes frequently time out, constantly high CPU usage, high disk IO (~18MB/s reads and ~19MB/s writes), etc. So maybe compactions are going so slow that one is kicking off before the other finishes, dunno.
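(For reference, a sketch of how a write-throughput figure like the ~80KB/s above can be pulled from the _internal database with the 1.x influx CLI; "httpd" and "writeReqBytes" are the stock monitoring measurement/field names as I understand them, so treat them as assumptions if your version differs.)

```sh
# Approximate HTTP write throughput (bytes/sec) over the last hour, derived from
# the cumulative writeReqBytes counter that influxd records in _internal.
influx -database '_internal' -execute \
  'SELECT non_negative_derivative(max("writeReqBytes"), 1s) FROM "httpd" WHERE time > now() - 1h GROUP BY time(1m)'
```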