Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarifying store list_dir / listdir details in implementation vs. spec #161

Open
rabernat opened this issue Oct 27, 2022 · 6 comments
Open
Labels
bug Something isn't working

Comments

@rabernat
Copy link
Contributor

Here's what the V3 Spec currently says about the store list_dir method

``list_dir`` - Retrieve all keys and prefixes with a given prefix and
which do not contain the character "/" after the given prefix.

    | Parameters: `prefix`
    | Output: set of `keys` and set of `prefixes`

    For example, if a store contains the keys "a/b", "a/c", "a/d/e",
    "a/f/g", then ``list_dir("a/")`` would return keys "a/b" and "a/c"
    and prefixes "a/d/" and "a/f/". ``list_dir("b/")`` would return
    the empty set.

https://github.com/alimanfoo/zarr-specs/blob/b63e64dcf530f195b19691e06a44c85916f33291/docs/core/v3.0.rst?plain=1#L1344-L1353

There are three inconsistencies between this and the current zarr-python V3 implementation:

  • The function is named listdir (not list_dir) in the implementation
  • The listdir function does not include the prefix. So listdir("a/") returns b, not a/b. This is pretty important for custom store implementations.
  • The implementation do not return a trailing slash for prefixes, while the spec seems to recommend this.

I think we should make the spec and the implementation consistent.

cc @jstriebel and @alimanfoo, who are currently revising the V3 spec.

@jstriebel jstriebel added this to ZEP1 Nov 10, 2022
@jstriebel jstriebel moved this to In Discussion in ZEP1 Nov 10, 2022
@jstriebel jstriebel moved this from In Discussion to Needs PR in ZEP1 Nov 15, 2022
@jstriebel jstriebel added the core-protocol-v3.0 Issue relates to the core protocol version 3.0 spec label Nov 16, 2022
@jstriebel jstriebel moved this from Needs PR to In Review in ZEP1 Dec 12, 2022
@jstriebel jstriebel moved this from In Review to Needs PR in ZEP1 Feb 9, 2023
@jstriebel
Copy link
Member

@rabernat Do you have any preferences, e.g. should the python implementation change or rather the spec? I don't have a strong opinion which version is more appropriate, but agree that using this coherently would be nice.

@jstriebel
Copy link
Member

After re-reading this part of the spec, I think that the / suffix is useful in the API, especially to make this definition as simple as it is:

Retrieve all keys and prefixes with a given prefix and
which do not contain the character "/" after the given prefix.

@rabernat @joshmoore Should we turn this into a zarr-python issue then? (Related to zarr-developers/zarr-python#1290)

@jstriebel jstriebel removed this from ZEP1 Feb 22, 2023
@jstriebel jstriebel added bug Something isn't working and removed core-protocol-v3.0 Issue relates to the core protocol version 3.0 spec labels Feb 22, 2023
@rabernat
Copy link
Contributor Author

I agree it is a python issue.

I'm still unsure of how to distinguish keys vs. prefixes and whether this needs to be specified in the spec. This is related to #200. Ideally we should suggest a specific algorithm for efficiently traversing a store using listdir. Right now the zarr-python implementation makes tons of unnecessary list and __contains__ calls.

@joshmoore
Copy link
Member

I'm still unsure of how to distinguish keys vs. prefixes and whether this needs to be specified in the spec.

👍 This would be good to capture and discuss for v3.

@jstriebel jstriebel added this to ZEP1 Mar 3, 2023
@jstriebel jstriebel moved this to In Discussion in ZEP1 Mar 3, 2023
@jstriebel jstriebel moved this from In Discussion to Needs PR in ZEP1 Mar 3, 2023
@jstriebel jstriebel added the core-protocol-v3.0 Issue relates to the core protocol version 3.0 spec label Mar 3, 2023
@jstriebel jstriebel removed this from ZEP1 Mar 13, 2023
@jstriebel jstriebel added this to ZEP1 Mar 13, 2023
@jstriebel jstriebel moved this to In Discussion in ZEP1 Mar 13, 2023
@jstriebel
Copy link
Member

@rabernat @joshmoore Do you have any concrete proposals for the current spec? Happy to do a review, but I'm not quite sure what exactly should be changed.

@rabernat
Copy link
Contributor Author

Basically the store needs a way to communicate whether the thing it is returning is a file or a directory. One way to solve this would to have prefixes / directories always return a trailing / in a list operation. But that feels like an implementation detail, rather than a spec issue.

@jstriebel jstriebel removed this from ZEP1 Mar 15, 2023
@jstriebel jstriebel removed the core-protocol-v3.0 Issue relates to the core protocol version 3.0 spec label Mar 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants