-
Notifications
You must be signed in to change notification settings - Fork 364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support request_kwargs
param for HTTPFilesystem
#1142
Comments
ok after writing all this maybe |
Looks like we can just pass |
Correct, get() and related calls take extra kwargs that get passed eventually to the session.get . You might also want to consider creating two filesystem instances, one with the extra header information and one without (like two s3 instances at different endpoints).
looks like a bug, should be |
Should we close this issue, @martindurant? Not sure what exactly you mean by 'looks like a bug, should be **kw' |
I mean that on this line, I think it should be |
HTTPFilesystem supports a
client_kwargs
param, which is passed to the aiohttp ClientSession object constructed for all requests. This allowed us to pass auth credentials to all requests made in the session, which worked when trying to access data behind auth (such as EarthData login).However, this sent the credentials to all requests from the session, and it was not possible to restrict which requests the credentials get sent for. While this leaked credentials, it was fine for most cases as the extra Authorization header present does not make a big difference.
However, if you are making calls to AWS S3 presigned URLs, they will reject your request saying there is now an additional unexpected
Authorization
header present. This was the primary issue causing earthdata login and fsspec to not play well together when the user was running code from inside AWS us-west-2, as they will be eventually redirected to using presigned S3 URLs. More detailed information about this problem is captured in nsidc/earthaccess#188.The solution is to use earthdata tokens, but only send them on the initial request. If we use the existing
client_kwargs
param, they will be sent on every request, including redirects. Instead, we need a way to find a way to modify the params of the originating request only, and let redirects play out as they might.Adding a
request_kwargs
that is merged into the.get
calls (like here:filesystem_spec/fsspec/implementations/http.py
Line 242 in 45de5b5
The text was updated successfully, but these errors were encountered: