Use a sync.Pool for bitmapContainers used in ToBytes() #246
Conversation
One of our applications maintains a lot of small compressed bitmaps in memory and occasionally serializes them to disk. When they are serialized, we notice a large bump in memory allocations (and accompanying CPU/GC activity) due to creating a new `bitmapContainer` each time. This commit modifies `roaringArray` to use a `sync.Pool` of `*bitmapContainer` in `writeTo()` instead of creating a new one each time. Since the bitmapContainers can now be reused, a `clear()` method is also added to reset their state.

@lemire please let me know if you have alternative thoughts here. The goal is to avoid allocating a new `*bitmapContainer` (which is really a `[1024]uint64` and an `int`) every time we call `ToBytes()` on a compressed bitmap.
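For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of what the PR describes. The `bitmapContainer` stand-in, the `bitmapContainerPool` variable, and the field names below are illustrative, not the actual types from the roaring package:

```go
package main

import (
	"fmt"
	"sync"
)

// Illustrative stand-in for roaring's internal bitmapContainer:
// essentially a [1024]uint64 plus a cardinality counter.
type bitmapContainer struct {
	cardinality int
	bitmap      []uint64
}

// clear resets a pooled container so it can be reused safely.
func (bc *bitmapContainer) clear() {
	bc.cardinality = 0
	for i := range bc.bitmap {
		bc.bitmap[i] = 0
	}
}

var bitmapContainerPool = sync.Pool{
	New: func() interface{} {
		return &bitmapContainer{bitmap: make([]uint64, (1<<16)/64)}
	},
}

func main() {
	// Borrow a container instead of allocating a fresh 8 KiB bitmap.
	bc := bitmapContainerPool.Get().(*bitmapContainer)

	// ... use bc as scratch space while serializing ...
	bc.bitmap[0] |= 1
	bc.cardinality++

	// Reset it and hand it back so later calls can reuse the allocation.
	bc.clear()
	bitmapContainerPool.Put(bc)
	fmt.Println("container returned to the pool")
}
```

The `clear()` before `Put()` is what the PR's new method is for: `sync.Pool` hands objects back in whatever state they were returned, so stale bits from a previous serialization would otherwise leak into the next one.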
Looks reasonable to me with the minor caveat that we may hold on to a bitmap container in the pool needlessly for a very long time (forever).
Ping @alldroll ... or anyone... can we discuss this?
I'd like to get feedback from a few others...
A better solution would be to avoid the allocation entirely...
Prior to Go 1.13, pools were cleared at the start of every GC. The behavior has changed since then, but the release notes seem to indicate that objects will still be collected from the pool.
Yes, agreed. I don't know enough about the internals to suggest a solution though.
@kevinconaway The notes regarding the GC are reassuring. This answers my main objection. I will give people time to react...
I have some thoughts regarding this issue: instead of allocating a new container on the heap each time, we can construct the `bitmapContainer` by value:

```go
func newBitmapContainerByValue() bitmapContainer {
	// Back the bitmap slice with a fixed-size array instead of make().
	var bitmapBuf [(1 << 16) / 64]uint64
	return bitmapContainer{
		bitmap: bitmapBuf[:],
	}
}
```

And now we can call this constructor instead. Here is the benchmark:

```go
func BenchmarkSerializationSequential(b *testing.B) {
	b.ReportAllocs()
	b.StopTimer()
	s := NewBitmap()
	initsize := 650000
	for i := 0; i < initsize; i++ {
		s.Add(uint32(i))
	}
	s.RunOptimize()
	buf := make([]byte, 0, s.GetSerializedSizeInBytes())
	b.StartTimer()
	for j := 0; j < b.N; j++ {
		w := bytes.NewBuffer(buf[:0])
		s.WriteTo(w)
	}
}
```

```
// old
BenchmarkSerializationSequential-8    545766   1841 ns/op   8416 B/op   13 allocs/op
// proposed from PR
BenchmarkSerializationSequential-8    955692   1159 ns/op    224 B/op   12 allocs/op
// my solution
BenchmarkSerializationSequential-8   1525407    782 ns/op    224 B/op   12 allocs/op
```

What do you think about it?
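(Aside: the `B/op` and `allocs/op` columns above come from the `b.ReportAllocs()` call in the benchmark; to reproduce numbers like these with Go's standard tooling, run the benchmark from the package directory, e.g.:

```
go test -run='^$' -bench=BenchmarkSerializationSequential -benchmem
```

The `-benchmem` flag is redundant here since the benchmark already calls `ReportAllocs()`, but it is harmless and enables the memory columns for all benchmarks.)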
+1 for the benchmark and the better performance, but I am thinking about something even simpler... let me see...
Can you guys review my proposal at #247?
Closing in favor of #247.