Reduce memory pressure when reading large topic from beginning #1046

Closed
SimpleApp opened this issue Feb 12, 2018 · 13 comments

Comments

@SimpleApp

SimpleApp commented Feb 12, 2018

Versions

Please specify real version numbers or git SHAs, not just "Latest" since that changes fairly regularly.
Sarama Version: 3b1b388
Kafka Version: 0.11.1
Go Version: go1.9.3 darwin/amd64

sarama-cluster Version: v2.1.12 (3001c2453136632aa3219a58ea3795bb584b83b5)

Configuration

The problem arises in multiple configurations.

Logs

N/A

Problem Description

Upon start of a daemon, I'm reading a fairly large topic (containing large messages, around 1-2MB each) from the beginning, "draining" all the stored messages (using it like a poor man's DB table). I know the limits of this approach and I'm fine with them, but there is one problem I didn't expect:

Reading all those messages makes the process memory rise tremendously, and the memory is only given back after many minutes (of doing nothing). I'm worried that under a lower memory limit (using Kubernetes), the process would simply crash when trying to read the topic.

My current understanding of the issue so far, from reading similar issues, is that the Go garbage collector doesn't give memory back very easily, but I'm wondering whether Sarama itself reuses memory for the decoding operations from a memory pool, or whether it asks the runtime for new buffers for every message.

In particular, these lines in broker.go (around line 530) made me wonder:

buf := make([]byte, decodedHeader.length-4)
bytesReadBody, err := io.ReadFull(b.conn, buf)
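
For context, here is a minimal sketch of the kind of "drain from the beginning" consumer described above. It uses plain Sarama's partition consumer rather than sarama-cluster, and the broker address, topic name, and partition number are placeholders, not values from this report:

package main

import (
	"log"

	"github.com/Shopify/sarama"
)

func main() {
	config := sarama.NewConfig()

	// Broker address is a placeholder.
	consumer, err := sarama.NewConsumer([]string{"localhost:9092"}, config)
	if err != nil {
		log.Fatal(err)
	}
	defer consumer.Close()

	// Start at the oldest available offset to drain the topic from the beginning.
	// Topic name and partition are placeholders.
	pc, err := consumer.ConsumePartition("events", 0, sarama.OffsetOldest)
	if err != nil {
		log.Fatal(err)
	}
	defer pc.Close()

	for msg := range pc.Messages() {
		_ = msg // decode, keep what you need, then drop the reference
	}
}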
@eapache
Contributor

eapache commented Feb 12, 2018

As you correctly note, Sarama does not use a memory pool and simply relies on Go's garbage collector. You are also correct that the garbage collector does not return memory to the OS very frequently. However, Go's allocator is very good about reusing memory internally so I wouldn't expect enormous memory usage or growth; it's not something I've seen in similar use cases before.
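
One way to check whether that memory is actually in use or just being held idle by the runtime is runtime.ReadMemStats. A small sketch (not from this thread) that prints in-use versus idle/released heap:

package main

import (
	"fmt"
	"runtime"
)

// printHeap reports heap actively in use versus heap the runtime is holding
// idle; idle spans may not have been returned to the OS yet.
func printHeap() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("HeapInuse=%d MiB HeapIdle=%d MiB HeapReleased=%d MiB\n",
		m.HeapInuse>>20, m.HeapIdle>>20, m.HeapReleased>>20)
}

func main() {
	printHeap()
}

A large HeapIdle alongside a small HeapInuse would point at "the GC hasn't returned memory to the OS yet" rather than a leak.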

I would note that your messages are very large (1-2MB as you mention), which means the default Sarama configuration will not work well for you to begin with. Have you increased the value of Consumer.Fetch.Default? https://github.com/Shopify/sarama/blob/v1.15.0/config.go#L177-L182.

Note that the default value of that config was recently bumped to 1MB (after the version you're using): #1024 which is probably still too small for your use case. If your messages are as large as you claim, I would try setting it to 8MB.
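
A sketch of what that might look like, assuming config is the sarama.NewConfig() you pass to the consumer (the 8MB value is just the suggestion above, not a tested number):

config := sarama.NewConfig()
// Messages here are 1-2MB, so raise the per-request fetch size well above that.
config.Consumer.Fetch.Default = 8 * 1024 * 1024 // 8MB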

@SimpleApp
Author

SimpleApp commented Feb 12, 2018

Thanks for the answer. Indeed I had to increase it (to 20MB actually, but I think 5MB would have been enough).

From what I see, it doesn't seem to reuse the memory at all: it consumes up to 2GB of memory, then after a few minutes (of total inactivity, since all the daemon does is react to consumed messages) goes back to 200MB. Which means the total active memory shouldn't be much more than 200MB.

Also, I set metrics.UseNilMetrics = true, because I feared that metrics gathering was the culprit, but it didn't change anything.
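
For reference, a minimal sketch of how that flag can be set, as a file living alongside the daemon's main, so it runs before any Sarama config or client is created:

package main

import metrics "github.com/rcrowley/go-metrics"

func init() {
	// Ask go-metrics to hand out no-op metrics everywhere.
	metrics.UseNilMetrics = true
}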

@eapache
Contributor

eapache commented Feb 12, 2018

Hmm... is your code doing anything unusual or potentially long-running with the messages the consumer returns to you? It's possible that those hold references back to the original packet data, so holding on to them might prevent anything from being reclaimed.

@SimpleApp
Author

I should add that the memory consumed while draining the topic seems proportional to the size of the topic (which is another thing I didn't expect).

@eapache
Contributor

eapache commented Feb 12, 2018

Probably the most useful investigation you could do right now is to use Go's built-in memory profiling tools (https://golang.org/pkg/runtime/pprof/) to figure out more precisely where the bulk of the memory is being allocated, and why it isn't being reused or reclaimed. If there's a bug, that will show us where it is; if it's just the GC being lazy, that will show us where to use memory pools (I suspect the decode process is probably worse in this regard than reading the raw packet).
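
A minimal sketch of writing a heap profile with runtime/pprof; the file name and where you call it are up to you, and the result can be inspected with go tool pprof:

package main

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

// dumpHeapProfile writes a heap profile to path; inspect it afterwards with
// `go tool pprof <binary> <path>`.
func dumpHeapProfile(path string) {
	f, err := os.Create(path)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	runtime.GC() // get up-to-date allocation statistics before writing
	if err := pprof.WriteHeapProfile(f); err != nil {
		log.Fatal(err)
	}
}

func main() {
	// Call this after the topic has been drained, then compare in-use
	// memory against total allocations in the profile.
	dumpHeapProfile("heap.prof")
}

If the daemon already runs an HTTP server, importing net/http/pprof and fetching /debug/pprof/heap is an equivalent option.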

@SimpleApp
Author

Messages are just JSON-encoded structures, which I decode and then extract some info from to store in a local map.

It doesn't do anything else. In particular, if this were a memory leak on my part, the memory couldn't be reclaimed after some time of inactivity.
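
A hedged sketch of that pattern, with a hypothetical record type (the fields and JSON keys are illustrative, not from this report): decoding into a small struct and copying only the fields that go into the map means nothing keeps the 1-2MB msg.Value buffer alive afterwards.

package main

import (
	"encoding/json"
	"fmt"

	"github.com/Shopify/sarama"
)

// record is a hypothetical message layout; only ID and Name are kept.
type record struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

// storeRecord decodes a consumed message and copies only the needed fields
// into index; json.Unmarshal copies the string data, so the map holds no
// reference to the original message buffer.
func storeRecord(index map[string]string, msg *sarama.ConsumerMessage) {
	var r record
	if err := json.Unmarshal(msg.Value, &r); err != nil {
		return // ignore malformed messages in this sketch
	}
	index[r.ID] = r.Name
}

func main() {
	index := make(map[string]string)
	storeRecord(index, &sarama.ConsumerMessage{Value: []byte(`{"id":"42","name":"example"}`)})
	fmt.Println(index) // map[42:example]
}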

@SimpleApp
Author

I did use pprof, and 99% of the memory is allocated by the decode function in broker.go.

@SimpleApp
Author

One interesting update: we've actually run the daemon in a more memory-constrained environment, using Kubernetes to limit the allowed memory to 1GB, and it doesn't seem to crash. Still, I'm curious to understand why the memory buffers inside the Go runtime aren't reused.

@eapache
Contributor

eapache commented Feb 12, 2018

I have no idea why this is behaving the way it is, this is getting into the guts of the Go memory allocator and GC which are well beyond my knowledge.

If the program runs fine in a more constrained system, then my best guess is that Go has decided allocating more (since it's available) is easier than reusing the existing chunks for some reason.

I'm happy to take pull requests to e.g. use a sync.Pool or something if they come with benchmarks or profiles showing an improvement, but I don't think there's much more I can do for your particular case without a lot more research.
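
For what it's worth, a rough sketch of what a pooled read buffer for the allocation quoted earlier could look like; this is not Sarama code, just an illustration of the sync.Pool idea:

package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

var bufPool = sync.Pool{
	// 1MB starting capacity is an arbitrary choice for this sketch.
	New: func() interface{} { return make([]byte, 0, 1<<20) },
}

// readBody reads an n-byte response body into a pooled buffer instead of
// calling make for every response. The caller must hand the slice back with
// bufPool.Put(buf[:0]) once nothing references it any more.
func readBody(conn io.Reader, n int) ([]byte, error) {
	buf := bufPool.Get().([]byte)
	if cap(buf) < n {
		buf = make([]byte, 0, n) // pooled buffer too small; grow
	}
	buf = buf[:n]
	if _, err := io.ReadFull(conn, buf); err != nil {
		bufPool.Put(buf[:0])
		return nil, err
	}
	return buf, nil
}

func main() {
	body, err := readBody(bytes.NewReader([]byte("fake response body")), 18)
	fmt.Println(string(body), err)
	bufPool.Put(body[:0])
}

The awkward part, and presumably why Sarama just calls make, is that decoded messages may keep referencing the buffer, so deciding when it is safe to Put it back is non-trivial.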

@eapache eapache closed this as completed Feb 12, 2018
@eapache eapache mentioned this issue Feb 28, 2018
@chandradeepak
Contributor

@SimpleApp @eapache, I have faced a similar issue too, and this is how we solved it.

See if it makes any difference for you:

  1. By default Sarama buffers 256 messages per partition. In your case, with 1-2MB messages and 4 partitions, that is close to 1GB, which would explain what you are seeing. Reduce Config.ChannelBufferSize (in the Sarama config) to something small like 10; that way each partition buffers only about 10MB, and 4 partitions about 40MB (see the sketch after this list).

https://github.com/Shopify/sarama/blob/master/config.go#L262

  2. Try not to read a lot of data from the Messages() channel all at once. If you don't read, not much is stored; if you keep reading, there is a chance you read faster than the Go garbage collector frees memory. Also try setting the pointer you use for sarama.ConsumerMessage to nil so Go can garbage collect it immediately.
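
The sketch referenced in item 1, assuming config is the same sarama.NewConfig() object discussed earlier in this thread:

config := sarama.NewConfig()
// Default is 256 buffered messages per partition; with 1-2MB messages that
// alone can hold hundreds of MB across a few partitions.
config.ChannelBufferSize = 10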

@l2eady

l2eady commented Apr 24, 2019

@chandradeepak Hi, I have the same issue: I have a consumer that consumes a lot of data from consumer.Messages(), and you said to try not to read a lot of data from that channel. How can I handle this case? Do you have any suggestions for me, please?

@chandradeepak
Contributor

@l2eady
consumer.Messages() is a read channel. If you don't read from it, it applies back pressure automatically.

Somewhere in your code you should have something like:

for {
	select {
	case msg := <-consumer.Messages():
		// do something with msg
	}
}

If that "do something" hangs, the loop won't go around to read the next message, and that slows consumption down.

@l2eady

l2eady commented Apr 25, 2019

@chandradeepak Thanks for the information. I already do it that way, but it still doesn't solve my memory leak problem. I traced the memory with pprof, and it shows that Sarama uses a lot of memory in the broker file (sendAndReceive, and also github.com/rcrowley/go-metrics NewExpDecaySample).
