Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Help with moving bandersnatch to python3.12 #172

Closed
cooperlees opened this issue Apr 21, 2024 · 7 comments
Closed

Help with moving bandersnatch to python3.12 #172

cooperlees opened this issue Apr 21, 2024 · 7 comments

Comments

@cooperlees
Copy link

Hi,

I am the maintainer of bandersnatch, but I have very little s3 skills in general - so would love help moving to >= 3.12 + S3Path and cleaning up our usage of s3path which I feel we hacked around issues rather than fixing bugs here when it was contributed.

Code

pypa/bandersnatch#1710

Feel free to close if you feel this is an abuse of using issues. I'd also love a list of how you'd approach the move if you don't have time to do it, as I am confused in the order I should potentially take and if there is any hope maintaining backwards compatibility or if I should give up with that and just move to >= 3.12 only for S3 support?

@liormizr
Copy link
Owner

Hi @cooperlees

As long that we don't have a lot of open issues to manage I'm OK with keeping this issue as a platform for helping you guys plan the refactor.

The beauty in s3path is that you don't need to know S3, you just need to think about it like a normal file system and use the regular pathlib API.
In this way you can also switch between S3Path and PosixPath when you want to do a local test.

Now about your code, in general I don't see any reason to keep your our S3Path object.
The glob method is optimize now and you are not really using the mkdir method that you write.

Except for that all looks greate
You are not using s3path for copy, this makes sense (it's more optimize then the API that we have from pathlib).
The only think that I change except for the local s3path inherent is not to use s3path privets
For example you have this code:

        resource, _ = path._accessor.configuration_map.get_configuration(path)

In python 3.12 it will brake
I recommend changing to this:

        import s3path
        resource, _ = s3path.configuration_map.get_configuration(path)

@cooperlees
Copy link
Author

not to use s3path privets

I tried to do this like you've said (Please correct on PR if I misuderstood) in regards to configuration_map but it seems to not hold authentication settings when using it this way.
PR: https://github.com/pypa/bandersnatch/pull/1724/files
Example CI Fails: https://github.com/pypa/bandersnatch/actions/runs/9055904268/job/24877694652?pr=1724

botocore.exceptions.NoCredentialsError: Unable to locate credentials

Do we need a public API to do this with auth settings correct? Or does bandersnatch's CI need to set auth settings more than ENV variables and in the test initialization:
https://github.com/pypa/bandersnatch/blob/main/src/bandersnatch/tests/conftest.py#L195

@liormizr
Copy link
Owner

Hi @cooperlees
I deployed a new version so you will have s3path.configuration_map in older Python versions
(s3path version 0.5.7)

About the issue that you have with boto credentials, can you give me a unit test (without your ecosystem) to understand that is the issue here?
I'm using s3 in integration tests and moto in unit tests and I don't see this issue

@cooperlees
Copy link
Author

cooperlees commented May 18, 2024

Thanks for the response. I guess moto mocks more than we do here. I don't know how to simplify the error situation, but I feel it's pretty clear it's something that works when we use the private instances that have been initialized with the auth settings we need vs. your public methods that do not. We just setup auth for minio and hit it via s3 APIs, the integration testing isn't that complex I feel for someone who understands s3 and s3path (which is not me).

I'll go try update to 0.5.7 and see if that helps, but I don't think it will.

All I want to do is move bandersnatch to 3.12, and s3path has made that difficult :(

@cooperlees
Copy link
Author

Sadly same errors with 0.5.7 - I'm not sure what I am meant to do differently with it

@liormizr
Copy link
Owner

@cooperlees
Maybe I'm missing something but it looks like something is weird with your setup
See the code here
The "public" PI that I added is the privet one, it's not really different
Maybe you changed your boto3 version? looks like something in the setup...

@cooperlees
Copy link
Author

Ok, good news. I sat down a read your PRs for 0.5.7 and saw what I misunderstood and we got further! I was able to move to the new public API everywhere (missed a few but they are in the PR linked below).

I now even have CI "passing" in a 3.12 environment, so I thank you for all your help.

I think we only have 1 remaining big item (apart from other unitest private usages I'll look into separately after working this out) tho. I'm sure you don't want us to be using from s3path.accessor import _generate_prefix, so what is your recommendation here for this code?

  • Sadly we do use the delete and glob through the "storage plugin" APIs, so I need to maintain them. It does look like we don't but we do (unless I've read things wrong - again, s3 was 100% contributed and I'm learning here) indirectly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants