Add ObjectStore::list_with_offset (#3970) #3973

Merged: 4 commits, Mar 30, 2023

@tustvold tustvold commented Mar 28, 2023

Which issue does this PR close?

Closes #3970
Closes #3975

Rationale for this change

See ticket. I originally proposed adding a generic list_opts method, but decided against it because:

  • List with delimiter is a different operation with a different return type (it returns prefixes)
  • Looking at what the various stores support, there aren't any other obvious candidates for list options
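
As an illustration of the first point, here is a minimal std-only sketch of why a delimited listing does not fit behind the same return type as a flat one. `DelimitedListing`, `list_flat` and `list_delimited` are hypothetical names that model object keys as plain strings; they are assumptions made for this example, not the crate's types or methods.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the result of a delimited listing: it carries
/// the "directory-like" prefixes in addition to the objects themselves.
#[derive(Debug, PartialEq)]
struct DelimitedListing {
    common_prefixes: Vec<String>,
    objects: Vec<String>,
}

/// Flat listing: every key under `prefix`, however deeply nested.
fn list_flat(keys: &[&str], prefix: &str) -> Vec<String> {
    keys.iter()
        .filter(|key| key.starts_with(prefix))
        .map(|key| key.to_string())
        .collect()
}

/// Delimited listing: only the direct children of `prefix`. Deeper keys are
/// collapsed into one common prefix per child "directory".
fn list_delimited(keys: &[&str], prefix: &str) -> DelimitedListing {
    let mut common_prefixes = BTreeSet::new();
    let mut objects = Vec::new();
    for key in keys {
        // Skip keys that are not under `prefix` at all.
        let Some(rest) = key.strip_prefix(prefix) else {
            continue;
        };
        match rest.split_once('/') {
            // A further delimiter means the key lives in a child "directory".
            Some((dir, _)) => {
                common_prefixes.insert(format!("{prefix}{dir}/"));
            }
            // No further delimiter: the key is a direct child object.
            None => objects.push(key.to_string()),
        }
    }
    DelimitedListing {
        common_prefixes: common_prefixes.into_iter().collect(),
        objects,
    }
}

fn main() {
    let keys = [
        "data/a.parquet",
        "data/2023/x.parquet",
        "data/2023/y.parquet",
        "other/z.parquet",
    ];
    println!("{:?}", list_flat(&keys, "data/"));
    println!("{:?}", list_delimited(&keys, "data/"));
}
```

In this model the flat listing is a plain sequence of keys, while the delimited one needs a second field for the prefixes, which is the mismatch the bullet describes.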

What changes are included in this PR?

Are there any user-facing changes?

@tustvold tustvold marked this pull request as draft March 28, 2023 23:13
@github-actions github-actions bot added the object-store Object Store Interface label Mar 28, 2023
prefix: Option<&Path>,
offset: &Path,
) -> Result<BoxStream<'_, Result<ObjectMeta>>> {
let offset = offset.clone();
Contributor Author

It is possible to construct the lifetimes to avoid this clone, but it seems unnecessary: we're likely performing a network call here, so the overhead of a string clone is not going to be relevant.
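
The trade-off described in this comment can be sketched with a std-only model. Here a boxed iterator of `String` keys stands in for the `BoxStream` of `ObjectMeta`, and the offset is assumed to be exclusive (only keys strictly greater than it are returned). Both choices are assumptions made for illustration, not the code in this PR.

```rust
/// Std-only model of an offset-filtered listing. `keys` stands in for the
/// stream returned by a plain listing.
fn list_with_offset<'a>(
    keys: impl Iterator<Item = String> + 'a,
    offset: &str,
) -> Box<dyn Iterator<Item = String> + 'a> {
    // Copy the offset into an owned String so the returned iterator does not
    // borrow from `offset`. Avoiding the copy would mean tying the lifetime
    // of `offset` into the return type as well.
    let offset = offset.to_owned();
    Box::new(keys.filter(move |key| key.as_str() > offset.as_str()))
}

fn main() {
    let keys = vec!["a/1".to_string(), "a/2".to_string(), "a/3".to_string()];
    let listing = {
        let offset = String::from("a/1");
        list_with_offset(keys.into_iter(), &offset)
        // `offset` is dropped here; the listing still works because it owns
        // its own copy.
    };
    for key in listing {
        println!("{key}");
    }
}
```

The one-off string copy keeps the signature simple, which is the point of the comment: next to a network round trip, that copy is negligible.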

@@ -371,11 +371,33 @@ pub trait ObjectStore: std::fmt::Display + Send + Sync + Debug + 'static {
///
/// Prefixes are evaluated on a path segment basis, i.e. `foo/bar/` is a prefix of `foo/bar/x` but not of
/// `foo/bar_baz/x`.
///
/// Note: the order of returned [`ObjectMeta`] is not guaranteed
Member

I admit I totally forgot about this detail. Is this an artifact of the requests not being in a particular order? Or that local filesystems make no guarantees about order? Or that the order isn't guaranteed to be consistent between object stores?

It does seem like S3 and GCS (and even Azure) return in a particular order.

Contributor Author

We may be able to guarantee it in that case

Contributor Author

@tustvold tustvold Mar 29, 2023

I had a brief play; this is hard to implement on a filesystem, as lexicographic sorting doesn't group "directories" together. Consider the case of a/b.file and a/a/b.file: a true lexicographic sort would be as presented, but a lexicographic sort of directories would not.
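
A minimal sketch of how the two orderings can diverge, using hypothetical paths chosen for this illustration rather than the ones in the comment. It assumes that "lexicographic sort of directories" means comparing paths one `/`-separated segment at a time, which is the order a directory walk with sorted entries would produce. The paths differ at a `.` versus a `/`, and `.` (0x2E) sorts before `/` (0x2F) in a byte-wise comparison.

```rust
/// Sort the full paths as plain strings (byte-wise lexicographic order).
fn sort_as_strings(paths: &[&str]) -> Vec<String> {
    let mut sorted: Vec<String> = paths.iter().map(|p| p.to_string()).collect();
    sorted.sort();
    sorted
}

/// Sort by comparing paths one segment at a time, so that everything under
/// one "directory" stays together.
fn sort_by_segments(paths: &[&str]) -> Vec<String> {
    let mut sorted: Vec<String> = paths.iter().map(|p| p.to_string()).collect();
    sorted.sort_by(|l, r| l.split('/').cmp(r.split('/')));
    sorted
}

fn main() {
    // Hypothetical paths: a directory `a` and a sibling directory `a.b`.
    let paths = ["a/b.file", "a.b/c.file"];
    println!("as strings:  {:?}", sort_as_strings(&paths));
    println!("by segments: {:?}", sort_by_segments(&paths));
}
```

Compared as strings, `a.b/c.file` comes first; compared by segments, `a` sorts before `a.b`, so `a/b.file` comes first. A filesystem-backed store that walks directories would therefore need an extra sort over full paths to match a byte-wise ordering.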

@wjones127
Member

This API looks good to me! I agree that there don't seem to be any other parameters of interest for list operations.

@tustvold
Contributor Author

tustvold commented Mar 29, 2023

77e275c adds an initial implementation, although the behaviour of the listing API with escaped paths is a little perplexing... This could be a localstack bug though

@tustvold tustvold marked this pull request as ready for review March 30, 2023 15:52
@tustvold tustvold requested a review from wjones127 March 30, 2023 15:53
Member

@wjones127 wjones127 left a comment

This looks good. I'd be happy to add GCS as a follow up.

@tustvold tustvold merged commit dc07f94 into apache:master Mar 30, 2023
Successfully merging this pull request may close these issues.

  • Document ObjectStore::list Ordering
  • Add option to start listing at a particular key
2 participants