
memoize s3 client #5377

Merged
merged 5 commits into main on Sep 4, 2024
Conversation

trinity-1686a
Contributor

Description

Memoize S3 clients, keeping them until they seem unused. I don't think we ever use more than a single S3StorageConfig, so we could store a single value, but a map we garbage-collect seems less confusing if we start having multiple configs at once (a rough sketch of the idea follows the issue references below).
fix #4933
fix #5236
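A minimal sketch of the memoization described above, with hypothetical names (S3Config, S3Client, ClientCache) standing in for the real S3StorageConfig and SDK client; the actual PR code differs. The map holds weak handles, so a client is dropped once nothing references it anymore, and dead entries are swept on access:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, Weak};

// Hypothetical stand-ins for S3StorageConfig and the SDK client.
#[derive(Clone, PartialEq, Eq, Hash)]
struct S3Config {
    region: Option<String>,
    endpoint: Option<String>,
}

struct S3Client {
    config: S3Config,
}

// Clients keyed by config. Entries are weak, so a client is freed once the
// last strong handle is dropped; dead entries are garbage-collected on access.
struct ClientCache {
    clients: Mutex<HashMap<S3Config, Weak<S3Client>>>,
}

impl ClientCache {
    fn get_or_create(&self, config: &S3Config) -> Arc<S3Client> {
        let mut clients = self.clients.lock().unwrap();
        // "gc" step: drop entries whose client is no longer referenced.
        clients.retain(|_, weak| weak.strong_count() > 0);
        if let Some(client) = clients.get(config).and_then(Weak::upgrade) {
            return client;
        }
        let client = Arc::new(S3Client {
            config: config.clone(),
        });
        clients.insert(config.clone(), Arc::downgrade(&client));
        client
    }
}
```

With a single S3StorageConfig the map only ever holds one entry, so it behaves like the single cached value mentioned above while staying correct if several configs show up at once.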

How was this PR tested?

Ran a cluster and looked at the searcher logs. Didn't look at the janitor logs, though there is no reason for them to behave differently.


github-actions bot commented Sep 3, 2024

On SSD:

Average search latency is 1.01x that of the reference (lower is better).
Ref run id: 3235, ref commit: aac8b49

On GCS:

Average search latency is 0.983x that of the reference (lower is better).
Ref run id: 3236, ref commit: aac8b49

@trinity-1686a force-pushed the trinity/memoize-s3client branch from b85cfa3 to 68207c4 on September 3, 2024 14:46
use crate::{
DebouncedStorage, S3CompatibleObjectStorage, Storage, StorageFactory, StorageResolverError,
};

/// S3 compatible object storage resolver.
pub struct S3CompatibleObjectStorageFactory {
storage_config: S3StorageConfig,
// we cache the S3Client so we don't rebuild one every time we need to connect to S3.
Contributor

Even without this caching we didn't create a new client for every connection as long as the Storage object was reused.

Contributor Author

it wasn't reused across requests

Contributor

This makes me wonder if the best approach wouldn't be to cache the resolved storages in the storage resolver. That would solve the problem once and for all, for every storage type. But I think this solution is already good, as it solves the linked issues.

Contributor Author

the storages are tied to a Uri (the S3Client isn't), which changes for each index. It would be interesting to share the debouncer though, which isn't done here; that could make concurrent requests slightly more efficient.
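For comparison, a minimal sketch of the resolver-level cache suggested above, again with hypothetical names (CachingStorageResolver, and the Storage trait reduced to a marker); it keys on the URI rather than on the config:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Marker trait standing in for the real Storage trait.
trait Storage: Send + Sync {}

// Caches the resolved Storage per URI, so every backend benefits, not just S3.
struct CachingStorageResolver {
    cache: Mutex<HashMap<String, Arc<dyn Storage>>>,
}

impl CachingStorageResolver {
    fn resolve(
        &self,
        uri: &str,
        build: impl FnOnce() -> Arc<dyn Storage>,
    ) -> Arc<dyn Storage> {
        let mut cache = self.cache.lock().unwrap();
        // Reuse the cached Storage for this URI, or build and cache a new one.
        cache.entry(uri.to_string()).or_insert_with(build).clone()
    }
}
```

As the comment above points out, such a cache is keyed per index URI and, unless entries are evicted, grows with the number of indexes, whereas the per-config client map stays small.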

@trinity-1686a enabled auto-merge (squash) September 4, 2024 13:04
@trinity-1686a merged commit 0820c90 into main on Sep 4, 2024
5 checks passed
@trinity-1686a deleted the trinity/memoize-s3client branch September 4, 2024 13:18
@PSeitz mentioned this pull request Sep 10, 2024
Successfully merging this pull request may close these issues.

Janitor logs a lot. Reduce the amount of logs using S3 region defined in storage config region