At Grafana Labs, we see that about 38% of "read" object storage API calls are issued by the compactor in Mimir clusters with a very large number of tenants. Surprisingly, this high volume of API calls is not caused by downloading blocks to compact, but by checking whether a block's meta.json exists (HEAD request). This issue is about reducing the number of HEAD API calls issued by the compactor to object storage.
Where does the compactor call the object storage client's Exists()?
- TenantDeletionMarkExists
  - 1 per tenant, per compactor shard size, every -compactor.compaction-interval
- block.Delete()
  - 2 per deleted block
- block.MarkForDeletion()
  - 1 per deleted block
- MetaFetcher.loadMeta()
  - 1 per block in the tenant's storage, per tenant, per compactor shard size, every -compactor.compaction-interval
  - Called before the local cache is checked
Who is the culprit?
Looking at metrics, I've estimated that, at Grafana Labs, 96% of the calls from the compactor to the object storage client's Exists() are caused by MetaFetcher.loadMeta().
Example: assuming -compactor.compaction-interval=30m, if you have 10k tenants with 365 blocks each (one per day of retention), the number of HEAD API calls issued by MetaFetcher.loadMeta() on a daily basis is (24h / 30m) * 10k * 365 = 175.2M, roughly 175M.
History of the Exists() call in MetaFetcher.loadMeta()
Inherited from Thanos, since the inception of MetaFetcher.loadMeta() (PR)
I noticed that the sync metas code in the compactor (before MetaFetcher replaced it in PR) didn't call Exists() before checking the cached meta.json.
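To illustrate why the ordering matters, here is a minimal Go simulation that counts HEAD calls when Exists() runs before the cache lookup versus after it. Everything in it is hypothetical: the `bucket` type, the two load functions, and the in-memory cache are stand-ins for illustration and do not reflect the actual Mimir or Thanos code.

```go
package main

import "fmt"

// bucket is a hypothetical stand-in for an object storage client.
// It counts every Exists() call as one HEAD request.
type bucket struct {
	objects   map[string]bool
	headCalls int
}

func (b *bucket) Exists(name string) bool {
	b.headCalls++
	return b.objects[name]
}

// loadExistsFirst calls Exists() before consulting the local cache,
// which is the ordering described in this issue.
func loadExistsFirst(b *bucket, cache map[string]bool, name string) bool {
	if !b.Exists(name) {
		return false
	}
	cache[name] = true // pretend meta.json was downloaded and cached
	return true
}

// loadCacheFirst consults the cache and only calls Exists() on a miss.
func loadCacheFirst(b *bucket, cache map[string]bool, name string) bool {
	if cache[name] {
		return true
	}
	if !b.Exists(name) {
		return false
	}
	cache[name] = true
	return true
}

// countHeadCalls runs the given number of sync cycles over all blocks
// and returns how many HEAD requests were issued in total.
func countHeadCalls(load func(*bucket, map[string]bool, string) bool, blocks []string, syncs int) int {
	b := &bucket{objects: map[string]bool{}}
	for _, name := range blocks {
		b.objects[name] = true
	}
	cache := map[string]bool{}
	for i := 0; i < syncs; i++ {
		for _, name := range blocks {
			load(b, cache, name)
		}
	}
	return b.headCalls
}

func main() {
	blocks := []string{"block-a/meta.json", "block-b/meta.json", "block-c/meta.json"}
	fmt.Println("exists first:", countHeadCalls(loadExistsFirst, blocks, 48))
	fmt.Println("cache first: ", countHeadCalls(loadCacheFirst, blocks, 48))
}
```

In this simulation, 3 blocks synced 48 times cost 144 HEAD calls with the Exists()-first ordering and 3 with the cache-first ordering. The trade-off is that a cache-first lookup no longer notices, through this check alone, that a cached block has been removed from storage, so a real change would need to account for that in some other way.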