-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add memcached support to index cache #1881
Add memcached support to index cache #1881
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couple nits
otherwise LGTM
a302126
to
38a9a41
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, awesome work, thanks for this 👍 Generally looking good, some suggestions so far, plus response to open questions.
Is this change set feasible to be admitted into Thanos?
Yes, happy to deploy this on our setup as well.
How should memcached config options specified? Through CLI flags or config file?
I believe the same as tracing
and objstore.config
- so config file.
Should MaxGetMultiBatchConcurrency be global (like right now) or a per MultiGet() call limit? I would be more lean towards the latter (see TODO in pkg/cacheutil/memcached_client.go)
Good question, I think in both cases we can congest at some point (local resources vs memcached server resources). With global we can at least know that we are short of workers. (: Thoughts? @squat @brancz ?
In any case it would be nice to have observabiity: latency metrics / traces for those.
) | ||
|
||
// MemcachedClient is a high level client to interact with memcached. | ||
type MemcachedClient interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need another interface?
I think we way we do it is to accept interfaces but return structs as advised by Go (:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need another interface?
The interface is currently used to mock it in pkg/store/cache/memcached_test.go
. It may also be need to add multi-tenancy support in Cortex: an option may be having 1 single underlying memcached client and wrapping it through a proxy for each tenant having the proxy prefixing the keys with the tenant ID (but no decision has been taken and I'm open to suggestions).
I think we way we do it is to accept interfaces but return structs as advised by Go (:
Right, done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like a YAGNI here. Does it mean creating distributing each client implementation for each block? (: Sounds like something we can add unless we know clear decision here, but let's discuss
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not getting if the comment is about (a) removing the MemcachedClient
interface or (b) merging memcachedClient
with MemcachedIndexCache
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about c
merging this with IndeCache as this layer might be a bit shallow, but the idea of using this for other caches like chunks is tempting.
I think I am happy with this for this iteration (:
pkg/cacheutil/memcached_client.go
Outdated
) | ||
|
||
const ( | ||
defaultTimeout = 100 * time.Millisecond |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not too shorty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe. Generally, we want a cache being significantly faster than the backend storage. Assuming a 1ms networking latency to comunicate to a memcached cluster running within the same cloud region, 100ms should be enough (ie. it's what we set in our Cortex clusters).
However, I do see that it's probably a low value for a default. What would you suggest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's gather some data and start with something (:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In our Cortex clusters running on GCE we measure a 99th memcached latency of about 15ms (pretty constant) and max latency between 50ms and 250ms. I agree a default timeout should be conservative, so I'm going to set it to 500ms here, and then can be adjusted by the user (ie. in our cluster I don't see a good reason of setting it above 100ms, given we want to skip the cache if it's too slow for any reason).
f744f29
to
a54f6bd
Compare
@bwplotka Thanks for your initial review! I've addressed your feedback (or commented otherwise), added the config file and written some doc. May you take another look, please? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To respond on #1881 (comment)
Batch size feels ok to me 👍 So no batches by default.
Ok, IMO we do lots of micro-optimizations which makes this PR quite hard to review and iterate over. But given they were mostly based on Cortex code I think I am fine with adding this as long as there will be someone to maintain this code other than a few of us from maintainer team. We can definitely consider adding someone from your side as maintainers as well (:
It's LGTM from me, modulo small suggestions.
Would be nice to see some review from others like @GiedriusS @brancz, @squat or @metalmatze but it takes some effort (:
docs/components/store.md
Outdated
type: in-memory | ||
config: | ||
# Maximum number of bytes the cache can contain (defaults to 250MB). | ||
max_size_bytes: 262144000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how hard would be to support strings based sizes? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps the same code from kingpin
can be reused here: https://godoc.org/github.com/alecthomas/units#Base2Bytes ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've introduced storecache.Bytes
, whose yaml marshalling is based on units.Base2Bytes
, to allow to configure in-memory index cache size using byte units.
What's your take?
|
||
uid := ulid.MustNew(1, nil) | ||
|
||
tests := map[string]struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually we do list for deterministic order, but that's fine here
) | ||
|
||
// MemcachedClient is a high level client to interact with memcached. | ||
type MemcachedClient interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like a YAGNI here. Does it mean creating distributing each client implementation for each block? (: Sounds like something we can add unless we know clear decision here, but let's discuss
pkg/cacheutil/memcached_client.go
Outdated
batchEnd = len(keys) | ||
} | ||
|
||
c.getMultiQueue <- &memcachedGetMultiBatch{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We lost the discussion on GitHub but I guess at the end we decided on per client concurrency for now?
Thinking loudly: if we choose this I would say we need good observability: how to find latency caused by starvation of this queue? The latency metric is per single call to Memcache (which is fair for from mameched client perspective).
I think tracing would help as we already create some spans here e.g in getMultiSingle
so should be enough..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe our current gate https://github.com/thanos-io/thanos/blob/master/pkg/store/gate.go could be adopted here somehow? It would be nice to have some metric here or in getMultiSingle
because otherwise all users would need to have some kind of tracing enabled in case this becomes a problem (and most of the time these things do, as we all know :P).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the metric standpoint, that's a good point. Do you think @pracucci we can adapt the existing way of queuing requests via gate
in this way? as suggested by @GiedriusS ? I don't have a strong opinion but can see those pros & cons:
Advantages:
- Reuse
- Latency + queue size metric
- We can switch from global to per request limit easily
gate
here and there.
Downsides:
- Each gated request requires a new goroutine. Bit more goroutine overhead and coordination.
I think for this iteration I would be happy with what we have now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally see a big value in having such metrics. Also the gate would simplify the implementation (less code).
We lost the discussion on GitHub but I guess at the end we decided on per client concurrency for now?
Re-iterating on this, I think the current implementation is confusing. We're currently limiting the batched GetMulti()
but not non-batched GetMulti()
. For example, if you have MaxGetMultiBatchSize=1000
and MaxAsyncConcurrency=10
and you concurrently call 100 times MemcachedClient.GetMulti(1K keys)
we end up with 100 concurrent underlying requests (each with 1K keys for a total of 100K keys). However, if you call once MemcachedClient.GetMulti(100K keys)
we end up with 10 concurrent underlying requests, each with 1K keys, until all 100K keys are fetched.
I see three options:
- Keep it as is (but weird)
- Enforce the concurrency limit even to non-batched requests
- Pro: the concurrency limit is an effective max number of concurrent fetch requests
- Con: a single request with a bunch of keys will slow down everything else because of the queue
- Switch to a per-request gate
- Pro: a single request with a bunch of keys doesn't slow down others
- Con: there's no way to set a cap on the max number of connections towards memcached
I'm currently more keen to #2, cause it looks the safest option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think @pracucci we can adapt the existing way of queuing requests via gate in this way? as suggested by @GiedriusS ?
I think so and I believe it would make sense. Observability - at this stage - is more important than goroutine overhead and we could implement it in a way that goroutines are created only for batched requests (so as far as the batching is disabled or requests do not exceed the max batch size, there will be no goroutine overhead).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this commit I've:
- Switched the max get multi concurrency to a gate
- Applied the gate also to non batched requests
What's your take?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
||
// Wait for all batch results. In case of error, we keep | ||
// track of the last error occurred. | ||
items := make([]map[string]*memcache.Item, 0, numResults) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe sync.Pool
could be used here? Because this will be invoked quite a lot so it should help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, but fine to play with it with some microbenchmark on later PRs.
Default("250MB").Bytes() | ||
|
||
indexCacheConfig := extflag.RegisterPathOrContent(cmd, "index-cache.config", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (: Thanks, I think there are few bits that we can tweak/optimize later on with some microbenchmarks like adding Gate
or pooling
but this is fine for now (:
) | ||
|
||
// MemcachedClient is a high level client to interact with memcached. | ||
type MemcachedClient interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about c
merging this with IndeCache as this layer might be a bit shallow, but the idea of using this for other caches like chunks is tempting.
I think I am happy with this for this iteration (:
pkg/cacheutil/memcached_client.go
Outdated
batchEnd = len(keys) | ||
} | ||
|
||
c.getMultiQueue <- &memcachedGetMultiBatch{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the metric standpoint, that's a good point. Do you think @pracucci we can adapt the existing way of queuing requests via gate
in this way? as suggested by @GiedriusS ? I don't have a strong opinion but can see those pros & cons:
Advantages:
- Reuse
- Latency + queue size metric
- We can switch from global to per request limit easily
gate
here and there.
Downsides:
- Each gated request requires a new goroutine. Bit more goroutine overhead and coordination.
I think for this iteration I would be happy with what we have now.
|
||
// Wait for all batch results. In case of error, we keep | ||
// track of the last error occurred. | ||
items := make([]map[string]*memcache.Item, 0, numResults) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, but fine to play with it with some microbenchmark on later PRs.
"github.com/alecthomas/units" | ||
) | ||
|
||
// Bytes is a data type which supports yaml serialization/deserialization |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this 👍
ee88dd5
to
54eefc4
Compare
Thanks @bwplotka for your extensive review and sorry for this big PR. Next time I will address changes in a more iterative way. I went through a self review and further manual tests and it looks good to me, but I will be happy to re-iterate on it in case of further feedback.
Sure. I'm OK maintaining it. Feel free to assign me memcached-related issues in the next future. |
1249341
to
cf41f71
Compare
Last small conflict needs to be solved before the final review. Thanks for your work! |
…mory cache backend because generic Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
…op() Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Co-Authored-By: Bartlomiej Plotka <[email protected]> Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
…ng in MemcachedClient Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
…is to have a global limit on the max concurrent batches Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
…ment Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
cf41f71
to
f1482f2
Compare
@GiedriusS I've rebased |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge on green 👍
Thanks @pracucci |
This PR proposes to introduce memcached support for the index cache. Few reasons why memcached may make sense:
Open questions:
MaxGetMultiBatchConcurrency
be global (like right now) or a perMultiGet()
call limit? I would be more lean towards the latter (see TODO inpkg/cacheutil/memcached_client.go
)Changes
MemcachedIndexCache
support, with the backend client inspired by Cortex, with some substantial differences:SetAsync()
instead ofSet()
Verification
Manual and unit tests.