Loki uses too much memory #696

Closed
jgaa opened this issue Jun 24, 2019 · 5 comments
Labels
component/loki, type/bug (Something is not working as expected)

Comments

@jgaa

jgaa commented Jun 24, 2019

Describe the bug
I have deployed Loki and Grafana with docker-compose, running Loki in its own container. Log collection from GCP seems to work, and querying the last few minutes works fine.

However, when I use the logcli program to pull the logs to investigate a system failure, Loki rapidly allocates all the memory on the machine and dies. The log file written by logcli is empty, suggesting that Loki tries to hold all the relevant data in memory before producing any output.

In this query I ask for just 10,000 log messages (which covers roughly a few minutes):

logcli --addr http://127.0.0.1:3100 query --since 3h  --limit 10000  '{app="db",region="europe-west2"}' > jgaa.log

Loki crashes after 45 seconds, when it fails to allocate > 64 GB of memory.
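For anyone reproducing this without logcli, here is a rough sketch of the same query issued straight against Loki's HTTP API. It is only an approximation: it assumes a Loki version that serves /loki/api/v1/query_range (older releases exposed /api/prom/query instead) and GNU date for the timestamp arithmetic; the label selector and limit are simply copied from the command above.

# Same 3-hour, 10000-entry query sent directly to the HTTP API (assumed endpoint, see note above)
START=$(date -d '3 hours ago' +%s)000000000   # start of the range, Unix nanoseconds
END=$(date +%s)000000000                      # end of the range (now), Unix nanoseconds

curl -G -s 'http://127.0.0.1:3100/loki/api/v1/query_range' \
  --data-urlencode 'query={app="db",region="europe-west2"}' \
  --data-urlencode 'limit=10000' \
  --data-urlencode "start=${START}" \
  --data-urlencode "end=${END}" > jgaa.json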

Environment:

  • Infrastructure: Linode standard instance with 64 GB RAM, 16 cores, 630 GB SSD
  • Deployment tool: docker-compose

Screenshots, promtail config, or terminal output
(screenshot: lokimem)

@jgaa
Author

jgaa commented Jun 24, 2019

Interestingly, the data-volume with the indexes and chunks is just 718 MB.

jgaa@loki:~/loki$ du -sh data/
718M    data/

@steven-sheehy
Contributor

Duplicate of #613

@jgaa
Author

jgaa commented Jun 25, 2019

It looks like a bug, not a memory optimization issue.

I re-created the data volume to zap all the data, redeployed the GCP cluster and promtail, and after just a few minutes I tried to look back 6 hours in Grafana with the query {app=~"db|streams",federation="jgaamem5"}. Loki immediately allocated all the memory on the host and died in less than 60 seconds. The new data volume held just 2.1 MB of data at that point.

So it looks to me like the memory allocations are unrelated to the chunks it is fetching (or it is fetching the same chunk(s) again and again in a loop).
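One way to confirm that the growth tracks the query itself is to watch Loki's memory while reproducing. This is only a sketch: it assumes the container is named loki and that Loki's HTTP port (3100 by default) is reachable from the host, where Loki exposes its Prometheus metrics, including the standard Go runtime metrics, at /metrics.

docker stats loki --no-stream   # container-level memory usage as reported by Docker

# Heap allocation as reported by Loki's own /metrics endpoint
curl -s http://127.0.0.1:3100/metrics | grep '^go_memstats_heap_alloc_bytes'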

@cyriltovena
Contributor

cyriltovena commented Jun 27, 2019

Interestingly, the data-volume with the indexes and chunks is just 718 MB.

Don't forget that logs are compressed by a factor of at least 8x; when we run a query we have to decompress them. I'm currently reproducing the issue and working on it. I'll keep you posted; if you find anything else interesting, let us know.

Also feel free to join our slack: https://grafana.slack.com/messages/CEPJRLQNL
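As a rough back-of-envelope check on those numbers (a sketch that simply takes the ~8x compression figure above and the 718 MB on-disk size at face value):

# Even fully decompressed, the stored data should only expand to a few GB
echo $((718 * 8))   # => 5744 (MB), roughly 5.6 GiB; still far below the 64 GB the host has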

@cyriltovena
Contributor

This should be fixed by 0.2.
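For anyone following along with the docker-compose setup described above, a minimal sketch of picking up the fix once 0.2 is released, assuming the standard grafana/loki image and a compose service named loki:

docker-compose pull loki && docker-compose up -d loki   # pull the newer image and recreate only the Loki container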

@chaudum added the type/bug (Something is not working as expected) label on Jun 14, 2023