
Metrics library leaks memory #897

Closed
gdutliuyun827 opened this issue Jun 20, 2017 · 13 comments

@gdutliuyun827

Versions

Please specify real version numbers or git SHAs, not just "Latest" since that changes fairly regularly.
Sarama Version: 46dabe0f34a162d428640a0e9714eea9079e7ee2
Kafka Version: kafka_2.11-0.10.1.0.tgz
Go Version: 1.8

Configuration

What configuration values are you using for Sarama and Kafka?

Logs

When filing an issue please provide logs from Sarama and Kafka if at all
possible. You can set sarama.Logger to a log.Logger to capture Sarama debug
output.

Problem Description

The problem is in the metrics library; the relevant function is shown below.
Calling Broker.Open creates a new StandardMeter, but Broker.Close does not destroy it.

func NewMeter() Meter {
    if UseNilMetrics {
        return NilMeter{}
    }
    m := newStandardMeter()
    arbiter.Lock()
    defer arbiter.Unlock()
    arbiter.meters = append(arbiter.meters, m)
    if !arbiter.started {
        arbiter.started = true
        go arbiter.tick()
    }
    return m
}
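
A minimal sketch (not taken from Sarama; the metric name and loop count are illustrative) of the mechanism: every meter created through NewMeter is appended to the package-level arbiter, so even meters that belong to a throwaway registry are never garbage collected.

package main

import (
    "fmt"
    "runtime"

    metrics "github.com/rcrowley/go-metrics"
)

func main() {
    var before, after runtime.MemStats
    runtime.GC()
    runtime.ReadMemStats(&before)

    for i := 0; i < 100000; i++ {
        r := metrics.NewRegistry() // what a fresh sarama.Config would hold
        metrics.GetOrRegisterMeter("incoming-byte-rate", r)
        // r goes out of scope here, but the meter is still referenced
        // by the global arbiter inside go-metrics.
    }

    runtime.GC()
    runtime.ReadMemStats(&after)
    fmt.Printf("heap grew by ~%d KB despite discarding every registry\n",
        (after.HeapAlloc-before.HeapAlloc)/1024)
}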

@eapache
Contributor

eapache commented Jun 20, 2017

The metrics library does not indicate that we need to do anything to clean up unused meters, so I would expect the go garbage-collector to take care of them eventually, once the broker itself is GCed or re-opened. @slaunay perhaps you can shed some light?

@slaunay
Contributor

slaunay commented Jun 20, 2017

The meter abstraction relies on a single goroutine for computing rates instead of one goroutine per meter. This means that each meter is referenced by the global meterArbiter struct and indeed never gets garbage collected 😢, see:
https://github.com/rcrowley/go-metrics/blob/f770e6f5e91a8770cecee02d5d3f7c00b023b4df/meter.go#L230

This would lead to memory leaks:

  • every time a Broker is closed and reopened with a different broker id (otherwise the meter would be reused; not sure how often this happens, it probably depends on the application)
  • closing an AsyncProducer and creating a new one with a different Config that uses a new metrics.Registry, which creates new meters per Broker as well as for ProduceRequest (record-send-rate meters)

It does not seem possible in go-metrics to unregister the meter and get that reference removed (even if we do not unregister the metrics in Sarama today)...
Note that we do not use timer metrics but that would result in the same issue as the StandardTimer struct uses a meter under the hood...

Looks like this needs to be fixed in go-metrics first and we would also have to use Registry#Unregister(string) or something, see:
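
For illustration, a hedged sketch of what such cleanup could look like once go-metrics releases meters on unregistration; the helper and the metric name scheme are hypothetical, not Sarama's actual code.

package main

import (
    "fmt"

    metrics "github.com/rcrowley/go-metrics"
)

// unregisterBrokerMetrics is a hypothetical helper that Broker.Close could
// call; it only frees memory once go-metrics actually stops unregistered meters.
func unregisterBrokerMetrics(registry metrics.Registry, brokerID int32) {
    for _, name := range []string{"incoming-byte-rate", "outgoing-byte-rate"} {
        registry.Unregister(fmt.Sprintf("%s-for-broker-%d", name, brokerID))
    }
}

func main() {
    r := metrics.NewRegistry()
    metrics.GetOrRegisterMeter("incoming-byte-rate-for-broker-1", r)
    unregisterBrokerMetrics(r, 1)
}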

@lday0321

Until go-metrics fixes this leak, I think disabling metrics collection with metrics.UseNilMetrics = true will be a workaround.
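
A sketch of that workaround, assuming a local broker address; UseNilMetrics must be set before any Sarama config or client is created, because NewMeter checks it at construction time.

package main

import (
    "log"

    "github.com/Shopify/sarama"
    metrics "github.com/rcrowley/go-metrics"
)

func main() {
    // Disable metric collection globally so NewMeter returns NilMeter{}
    // and nothing is added to the arbiter.
    metrics.UseNilMetrics = true

    config := sarama.NewConfig()
    client, err := sarama.NewClient([]string{"localhost:9092"}, config)
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()
}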

@soar-zhengjian

Me too. Frequently calling Client.GetOffset and Client.Close causes a memory leak, because Client.GetOffset calls Broker.Open.
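
A sketch of that usage pattern (broker address and topic name are assumptions): each iteration builds a new Config, hence a new metrics.Registry, and the meters registered while opening brokers are never released by the arbiter.

package main

import (
    "log"

    "github.com/Shopify/sarama"
)

func main() {
    for i := 0; i < 1000; i++ {
        config := sarama.NewConfig() // fresh metrics.Registry each time
        client, err := sarama.NewClient([]string{"localhost:9092"}, config)
        if err != nil {
            log.Fatal(err)
        }
        // GetOffset opens a Broker under the hood, registering new meters.
        if _, err := client.GetOffset("my-topic", 0, sarama.OffsetNewest); err != nil {
            log.Println(err)
        }
        client.Close() // the meters stay referenced by go-metrics' arbiter
    }
}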

@slaunay
Contributor

slaunay commented Aug 31, 2017

@lday0321 You can also work around that memory leak without requiring changes upstream. Disabling the metrics by default only to re-enable them later sounds confusing to me from the end user's perspective.

@soar-zhengjian I believe that would be the case if you were to "reopen" a Broker with a different broker id every time.
Do you have some numbers on the amount of memory leaked in your application over time for that use case?

@eapache eapache changed the title When frequence Call Broker.Open and Broker.Close, May Cause Memory Leak Metrics library leaks memory Aug 31, 2017
@eapache eapache added the bug label Aug 31, 2017
@lday0321

lday0321 commented Sep 1, 2017

@slaunay according to my test, after "reopening" the Broker 778827 times, pprof shows the memory leak is 939.05MB (almost 1 KB per reopen)

Entering interactive mode (type "help" for commands)
(pprof) top 20 sarama
939.05MB of 949.56MB total (98.89%)
Dropped 205 nodes (cum <= 4.75MB)
      flat  flat%   sum%        cum   cum%
  486.54MB 51.24% 51.24%   912.56MB 96.10%  github.com/lday0321/myProj/vendor/github.com/rcrowley/go-metrics.newStandardMeter
  150.51MB 15.85% 67.09%   150.51MB 15.85%  github.com/lday0321/myProj/vendor/github.com/rcrowley/go-metrics.NewEWMA1
  140.01MB 14.74% 81.83%   140.01MB 14.74%  github.com/lday0321/myProj/vendor/github.com/rcrowley/go-metrics.NewEWMA15
  135.51MB 14.27% 96.10%   135.51MB 14.27%  github.com/lday0321/myProj/vendor/github.com/rcrowley/go-metrics.NewEWMA5
   26.48MB  2.79% 98.89%   939.05MB 98.89%  github.com/lday0321/myProj/vendor/github.com/rcrowley/go-metrics.NewMeter
         0     0% 98.89%   939.05MB 98.89%  github.com/lday0321/myProj/vendor/github.com/Shopify/sarama.(*Broker).Open.func1
         0     0% 98.89%   612.02MB 64.45%  github.com/lday0321/myProj/vendor/github.com/Shopify/sarama.getOrRegisterBrokerMeter
         0     0% 98.89%   939.05MB 98.89%  github.com/lday0321/myProj/vendor/github.com/Shopify/sarama.withRecover
         0     0% 98.89%   939.05MB 98.89%  github.com/lday0321/myProj/vendor/github.com/rcrowley/go-metrics.(*StandardRegistry).GetOrRegister
         0     0% 98.89%   939.05MB 98.89%  github.com/lday0321/myProj/vendor/github.com/rcrowley/go-metrics.GetOrRegisterMeter
         0     0% 98.89%   939.05MB 98.89%  reflect.Value.Call
         0     0% 98.89%   939.05MB 98.89%  reflect.Value.call
         0     0% 98.89%   939.05MB 98.89%  runtime.call32
         0     0% 98.89%   939.05MB 98.89%  runtime.goexit

@slaunay
Contributor

slaunay commented Sep 1, 2017

Thanks for providing those statistics @lday0321.
I believe we use 4 meters shared among all brokers and 4 meters specific to each broker, which means the leak per meter would be around ~100 B (828/8 B) for a single-broker cluster, probably less for a multi-broker cluster.

Are you allocating a new Client that references a new Config and transitively a new metrics.Registry for each iteration in your use/test case?

@lday0321

lday0321 commented Sep 1, 2017

Yes, I use a new client each time. Actually, I use sarama-cluster's NewConsumer method.

@slaunay
Contributor

slaunay commented Sep 1, 2017

You should be able to mostly avoid the memory leak by reusing either your own cluster.Client struct or cluster.Config struct, as the latter embeds the sarama.Config struct that references a single metrics.Registry, rather than creating a new one each time.

Using a single Client could also speed up your application (cached metadata, already established connection to brokers, ...) but can also bring side effects depending on your use case.
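
A sketch of that suggestion, assuming the bsm/sarama-cluster API with hypothetical broker address, group and topic names: a single cluster.Config (and therefore a single embedded sarama.Config and metrics.Registry) is shared across consumers instead of being recreated per consumer.

package main

import (
    "log"

    cluster "github.com/bsm/sarama-cluster"
)

func main() {
    brokers := []string{"localhost:9092"}
    config := cluster.NewConfig() // embeds sarama.Config and its single metrics.Registry

    for _, topic := range []string{"topic-a", "topic-b"} {
        consumer, err := cluster.NewConsumer(brokers, "my-group", []string{topic}, config)
        if err != nil {
            log.Fatal(err)
        }
        defer consumer.Close()
        // range over consumer.Messages() in a goroutine per consumer...
    }
}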

Hopefully the memory leak can be fixed in the metrics library soon, as the maintainer has recently shown interest in rcrowley/go-metrics#197 😀.

@lday0321

lday0321 commented Sep 3, 2017

@slaunay thanks for your suggestion! Is it still good practice to use a single Client if I need to set up thousands of consumers for thousands of topics? Or any suggestions for this kind of case?

@slaunay
Contributor

slaunay commented Sep 6, 2017

@lday0321 I think reusing cluster.Client and cluster.Config as much as possible makes sense, but if you have thousands of consumers and topics you probably do not want to do that in a single Go application.

That being said, if you have that many topics in a single cluster then Apache Kafka might not be the best technology; an AMQP broker like RabbitMQ could be a better fit depending on your use case.

@mihasya

mihasya commented Nov 28, 2017

(the PR for being able to stop unused meters/counters has been merged rcrowley/go-metrics#206)

@eapache
Contributor

eapache commented Dec 5, 2017

Fixed (or enough, anyway) by #991.
