
Fix S3 recursion for fake directory #199

Merged

Conversation

GeorgeSabu
Contributor

@GeorgeSabu GeorgeSabu commented Feb 3, 2022

Closes #198

@pjbull
Member

pjbull commented Feb 3, 2022

@GeorgeSabu Can you update the tests to pass? Looks like you just need to run black to format the file you changed (you can use the make lint command too).

@pjbull pjbull merged commit f951e21 into drivendataorg:master Feb 3, 2022
pjbull added a commit that referenced this pull request Feb 3, 2022
@GeorgeSabu
Contributor Author

GeorgeSabu commented Feb 3, 2022

@pjbull It looks like some tests are failing. Shall I try fixing them?

@pjbull
Member

pjbull commented Feb 3, 2022

@GeorgeSabu Yep, that'd be great! You can re-open this PR or start a new one. (Sorry, I wasn't paying attention and thought I was clicking the button to run the tests, but hit the merge one instead.)

I thought it was going to be a simple fix to the tests, but it seems a little more involved. At the very least, you'll want to update the MockBoto3Paginator to match what the real S3 API results look like. Here's where it is defined:
https://github.com/drivendataorg/cloudpathlib/blob/f951e21e5480065a93eb03e8459ffea2f40f4921/tests/mock_clients/mock_s3.py/#L192-L226

And from a quick look, I think you'll at least want something like this, which adds the Size key and creates a "fake" directory like we see on S3.

class MockBoto3Paginator:
    def __init__(self, root, per_page=2):
        self.root = root
        self.per_page = per_page

    def paginate(self, Bucket=None, Prefix="", Delimiter=None):
        new_dir = self.root / Prefix

        items = [f for f in new_dir.iterdir() if not f.name.startswith(".")]

        for ix in range(0, len(items), self.per_page):
            page = items[ix : ix + self.per_page]
            dirs = [
                {"Prefix": str(_.relative_to(self.root).as_posix())} for _ in page if _.is_dir()
            ]
            files = [
                {
                    "Key": str(_.relative_to(self.root).as_posix()),
                    # stat the actual path (the original relative_to(...) form
                    # resolved against the cwd, not the mock root)
                    "Size": _.stat().st_size if _.exists() else 123,
                }
                for _ in page
                if _.is_file()
            ]

            # s3 can have "fake" directories where size is 0, but it is listed in "Contents" (see #198)
            # add one in here for testing
            if dirs:
                fake_dir = dirs.pop(-1)
                # entries under "Contents" use "Key", not "Prefix"
                files.append({"Key": fake_dir["Prefix"], "Size": 0})

            yield {"CommonPrefixes": dirs, "Contents": files}
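For context, that zero-size entry under "Contents" is exactly what trips the recursion in #198. Here is a minimal, illustrative sketch (not cloudpathlib's actual fix; the `split_entries` helper and the sample page are made up for this example) of how a client might separate such "fake" directory entries from real files when consuming a paginator page:

```python
# A page shaped like a boto3 list_objects_v2 paginator result, including a
# zero-byte "fake" directory object as created by the S3 console (see #198).
page = {
    "CommonPrefixes": [{"Prefix": "data/raw/"}],
    "Contents": [
        {"Key": "data/file.txt", "Size": 123},
        {"Key": "data/fake_dir/", "Size": 0},  # "fake" directory entry
    ],
}


def split_entries(page):
    """Separate real file keys from directory-like entries in one page."""
    dirs = [p["Prefix"] for p in page.get("CommonPrefixes", [])]
    files = []
    for obj in page.get("Contents", []):
        # Zero-byte keys ending in "/" are directory placeholders; treating
        # them as files is what leads to the recursion error when listing.
        if obj["Key"].endswith("/") and obj["Size"] == 0:
            dirs.append(obj["Key"])
        else:
            files.append(obj["Key"])
    return dirs, files


dirs, files = split_entries(page)
```

Here the placeholder ends up in `dirs` alongside the real common prefix, and only genuine objects remain in `files`.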

Successfully merging this pull request may close these issues: Recursion error on "fake" s3 directories (#198)