memoize s3 client #5377
quickwit/quickwit-storage/src/object_storage/s3_compatible_storage.rs
b85cfa3 to 68207c4
quickwit/quickwit-storage/src/object_storage/s3_compatible_storage_resolver.rs
quickwit/quickwit-storage/src/object_storage/s3_compatible_storage.rs
use crate::{
    DebouncedStorage, S3CompatibleObjectStorage, Storage, StorageFactory, StorageResolverError,
};

/// S3 compatible object storage resolver.
pub struct S3CompatibleObjectStorageFactory {
    storage_config: S3StorageConfig,
    // we cache the S3Client so we don't rebuild one every time we need to connect to S3.
Even without this caching, we didn't create a new client for every connection as long as the Storage object was reused.
It wasn't reused across requests.
This makes me wonder if the best approach wouldn't be to cache the resolved storages in the storage resolver. This would solve the problem once and for all, for all storage types. But I think this solution is already good, as it does solve the linked issues.
The storages are tied to a Uri (the S3Client isn't), and the Uri changes for each index. It would be interesting to be able to share the debouncer though, which isn't done here. That could make concurrent requests slightly more efficient.
Description
Memoize S3 clients, keeping them until they seem unused. I don't think we ever use more than a single S3StorageConfig, so we could store a single value, but a map that we garbage-collect seems less confusing if we start having multiple configs at once.
fix #4933
fix #5236
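
For readers who want a picture of the "map we gc" idea from the description, here is a minimal sketch. It is not the code from this PR: `DemoConfig`, `DemoClient`, and `ClientCache` are hypothetical stand-ins for `S3StorageConfig`, the S3 client, and whatever the factory actually stores, and using `Weak` pointers to decide that a client "seems unused" is an assumption made for illustration.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, Weak};

/// Hypothetical stand-in for `S3StorageConfig`; only used as a map key here.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct DemoConfig {
    pub endpoint: Option<String>,
    pub region: Option<String>,
}

/// Hypothetical stand-in for a client that is expensive to build.
#[derive(Debug)]
pub struct DemoClient {
    pub config: DemoConfig,
}

/// Memoizes one client per config. The map only holds `Weak` pointers, so a
/// client is freed once no caller holds it, and its stale entry is purged on
/// the next lookup (assumed GC policy, not necessarily the one in the PR).
#[derive(Default)]
pub struct ClientCache {
    clients: Mutex<HashMap<DemoConfig, Weak<DemoClient>>>,
    builds: AtomicUsize,
}

impl ClientCache {
    /// Returns the cached client for `config`, building one if none is alive.
    pub fn get_or_build(&self, config: &DemoConfig) -> Arc<DemoClient> {
        let mut clients = self.clients.lock().unwrap();
        // Garbage-collect entries whose client is no longer referenced.
        clients.retain(|_, weak| weak.strong_count() > 0);
        if let Some(client) = clients.get(config).and_then(Weak::upgrade) {
            return client;
        }
        let client = Arc::new(DemoClient { config: config.clone() });
        clients.insert(config.clone(), Arc::downgrade(&client));
        self.builds.fetch_add(1, Ordering::Relaxed);
        client
    }

    /// Number of clients actually built so far (for illustration only).
    pub fn build_count(&self) -> usize {
        self.builds.load(Ordering::Relaxed)
    }
}
```

With a single config in use, the map never holds more than one entry, which matches the remark that a single stored value would also work; the map only starts to matter if several configs coexist.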
How was this PR tested?
Ran a cluster and looked at the searcher logs. I didn't look at the janitor logs, though there is no reason for the janitor to behave differently.