Skip to content
This repository has been archived by the owner on Nov 5, 2019. It is now read-only.

Enable BYO http library #128

Closed
2 of 7 tasks
dhermes opened this issue Feb 4, 2015 · 55 comments
Closed
2 of 7 tasks

Enable BYO http library #128

dhermes opened this issue Feb 4, 2015 · 55 comments
Assignees
Milestone

Comments

@dhermes
Copy link
Contributor

dhermes commented Feb 4, 2015

If this is a duplicate I apologize.

ISTM this requires

@dhermes
Copy link
Contributor Author

dhermes commented Aug 31, 2015

@nathanielmanistaatgoogle I think as a first start, using a first-class Http object (owned by oauth2client) as the output of Credentials.authorize. Right now, the wrapped http object just gets monkey patched and has a tenuous relationship with the original Http object and the original Credentials object.

@dhermes
Copy link
Contributor Author

dhermes commented Aug 31, 2015

Here are all places an httplib2.Http object is created:

$ git log --pretty=%H -1  # current HEAD
6e3c0c58f5a3ba18dc69bce9c49a975738c46c74
$ git grep 'Http('
oauth2client/appengine.py:        return self.credentials.authorize(httplib2.Http(*args, **kwargs))
oauth2client/client.py:            h = httplib2.Http()
oauth2client/client.py:                http = httplib2.Http()
oauth2client/client.py:        http = httplib2.Http()
oauth2client/client.py:_cached_http = httplib2.Http(MemoryCache())
oauth2client/client.py:            http = httplib2.Http()
oauth2client/client.py:            http = httplib2.Http()
oauth2client/flask_util.py:        return self.credentials.authorize(httplib2.Http(*args, **kwargs))
samples/oauth2_for_devices.py:youtube = build("youtube", "v3", http=credentials.authorize(httplib2.Http()))
scripts/run_system_tests.py:    http = credentials.authorize(httplib2.Http())
tests/test_appengine.py:        http = httplib2.Http()
tests/test_appengine.py:        http = httplib2.Http()
tests/test_appengine.py:        http = httplib2.Http()
tests/test_appengine.py:            http = httplib2.Http()

I also searched for .Http (to make sure Http( wasn't missing anything, e.g. aliasing a class name. It doesn't appear to be an issue (only real different results are in tests/test_flask_util.py). I also did git grep 'httplib2' | grep import to make sure that the standalone Http wasn't imported as some other name.

@theacodes
Copy link
Contributor

To summarize from googleapis/google-cloud-python#1346:

  • We want to use urllib3 as our official http client library. urllib3 now supports app engine, so there shouldn't be any issues there.
  • We will implement a PoolManager wrapper to implement the retry and auth logic that's currently monkey-patched into httplib2.Http.

@shazow
Copy link

shazow commented Jan 5, 2016

We will implement a PoolManager wrapper to implement the retry and auth logic that's currently monkey-patched into httplib2.Http.

urllib3's modern retry logic is fairly configurable, please let us know if there's anything missing that you'd need. Maybe for auth too.

@theacodes
Copy link
Contributor

urllib3's modern retry logic is fairly configurable,

Will do. Mostly this logic handles this case:

  1. User makes a request to an authentication-required resource with expired credentials.
  2. The API says 401: Unauthorized, indicating that the credentials have expired.
  3. The credentials get refreshed and the entire request is retried.

Not sure if this should be the responsibility of the http library.

@shazow
Copy link

shazow commented Jan 5, 2016

Take a look at our Retry object, worst case our goal is that you should be able to extend it and provide your own with a matching interface (might need to replace type-checking with duck-typing where we forgot).

@theacodes
Copy link
Contributor

Awesome. Even more fuel to this fire I'm setting to httplib2. 👍

@dhermes
Copy link
Contributor Author

dhermes commented Jan 5, 2016

Ha. It's going to be a long road to excise it. It's also tied too deeply into google-api-python-client.

@shazow
Copy link

shazow commented Jan 5, 2016

Wish we did this when I was still at Google, I would have jumped on to help. :)

@theacodes
Copy link
Contributor

@dhermes oh, fair point about google-api-python-client. That library is in maintenance-only mode. We should consider some sort of fallback for httplib2 here for that. But for gcloud-python we can leave httplib2 firmly behind.

@theacodes
Copy link
Contributor

Okay, so I'm on board with @dhermes' original plan. We'll leave httplib2 as the default library here, but have a well-defined interface so that gcloud-python can move to a different library.

We're about 6% away from full coverage, so this isn't far off.

@bjmc
Copy link
Contributor

bjmc commented Jan 17, 2016

Any idea how soon this is happening? SNI support is a must-have feature for us, and even aside from that, we'd much prefer to use requests over httplib2.

@dhermes
Copy link
Contributor Author

dhermes commented Aug 1, 2016

Currently working on finding all the places where the httplib2 surface area is assumed:

  • http.request attribute / method is present
  • particular signature of httplib2.Http.request is assumed
  • response is an httplib2.Response object
  • ... more to come

Then I will factor out the access into functions in oauth2client.transport.

@jonparrott @nathanielmanistaatgoogle Any suggestions for a systematic way of checking this? I am going to introduce a mock httplib2 module and then just raise AttributeError and other so I can trace code paths and figure out what they assume.

@theacodes
Copy link
Contributor

introduce a mock httplib2 module and then just raise AttributeError and other so I can trace code paths and figure out what they assume.

Considering we have 100% coverage that's a great idea.

Keep in mind that for BYO the only essential point is Credentials.authorize. What we use internally for refreshing doesn't matter. I'd love to see those to concerns separated.

@dhermes
Copy link
Contributor Author

dhermes commented Aug 1, 2016

Well methods like _refresh won't matter but some public ones it will?

@theacodes
Copy link
Contributor

I've always thought it was a bit wonky that you have to pass in an http object to like every method on Credentials. I wonder how @nathanielmanistaatgoogle feels about that.

@dhermes
Copy link
Contributor Author

dhermes commented Aug 1, 2016

We could make it optional or break it, but leave it optional for a generation?

@bjmc
Copy link
Contributor

bjmc commented Aug 1, 2016

We have a working (for our uses, at least) implementation of this using requests in place of httplib2.

@dhermes
Copy link
Contributor Author

dhermes commented Aug 1, 2016

Thanks!

@dhermes
Copy link
Contributor Author

dhermes commented Aug 11, 2016

@jonparrott Maybe a good path forward would be to add retry hooks into httplib2 and then require a minimum version.

@dhermes
Copy link
Contributor Author

dhermes commented Aug 11, 2016

They'd then have to use our interface instead of the familiar interface of their http library. I'm not cool with that.

Good point

@theacodes
Copy link
Contributor

Maybe a good path forward would be to add retry hooks into httplib2 and then require a minimum version.

Long-term, sure. But I'd rather not waste any more time with httplib2. Adding proper retry hooks to that library is lipstick on a pig at this point.

@dhermes
Copy link
Contributor Author

dhermes commented Aug 11, 2016

Ha good point. The monkey patching is just gross.

@theacodes
Copy link
Contributor

I agree, which is why we should warn users about httplib2 and make urllib3 the prefered "user transport". There are so many problems with it (ssl being the biggest glaring issue).

@bjmc
Copy link
Contributor

bjmc commented Aug 11, 2016

Why urllib3 and not requests? Are you just trying to stick with a lower-level library? requests gives you lots of nice stuff for free.

@theacodes
Copy link
Contributor

@bjmc requests is sugar on top of urllib3. For the internal use of this library, urllib3 is the way to go. We'd still support requests as an option to credentials.authorize.

@bjmc
Copy link
Contributor

bjmc commented Aug 11, 2016

requests is sugar on top of urllib3.

Sure, that's what I mean when I say it's "lower-level". It just seems like you're doing a fair amount of fiddling with URL/HTTP-related things (e.g.) and maybe requests would spare you from doing that yourself? But maybe not, I haven't looked at it in detail.

@theacodes
Copy link
Contributor

Probably not much moreso than urllib3 would.

On Thu, Aug 11, 2016 at 1:32 PM Brendan McCollam [email protected]
wrote:

requests is sugar on top of urllib3.

Sure, that's what I mean when I say "lower-level". It just seems like
you're doing a fair amount of fiddling with URL/HTTP-related things (e.g.
https://github.com/google/oauth2client/pull/622/files#diff-f752155b1078f859b6c1ce32715d39ddR300)
and maybe requests would spare you from doing that yourself? But maybe not.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#128 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAPUc1TyrLgmLtdBNVnDJYXZSSp-mAvfks5qe4bKgaJpZM4Db5O8
.

@theacodes
Copy link
Contributor

@dhermes my thoughts and ideas for moving this forward are the same as my previous comments here and here.

In summary we should strictly separate the concepts of "transport" (internal http client used to refresh credentials/etc.) and the "adapter" (the means by which we provide the auth token to the user's http client). As I mentioned before, the biggest drawback is that it will be possible for the http client that this library uses and the http client that the user's code uses can be different. I am personally okay with this situation - it's actually more common than you think for a single application to use multiple http clients because of dependent libraries.

I'm happy to hear yours and @nathanielmanistaatgoogle's thoughts on this before we move forward.

If we do move forward the work items would look like this:

  • make transport self-contained - it can construct its own http client instead of needing it passed in.
  • expose a way to set the transport, e.g. oauth2client.transport.set_transport()
  • remove all http arguments from Credentials methods.
  • add the new adapters module and move the existing httplib2 authorize code there.
  • switch transport to use urllib3 instead of httplib2.
  • add an adapter for urllib3.
  • add an adapter for requests.

I can start on this myself later this month once I get a few other things off my plate, or, if someone beats me to it I'm happy to review.

@nathanielmanistaatgoogle
Copy link
Contributor

Two reactions right now:

  1. Are we forecasting kind of an ad hoc polymorphism in which the user can only choose among certain HTTP libraries, a parametric polymorphism in which the user can supply any implementation of an HTTP library interface that we define, or a best-of-both-worlds in which we still provide an interface and we provide implementations of it for the HTTP libraries toward which we'd want to steer users? If the second or third, what's the interface?
  2. Do we have to use global state? Doesn't that introduce a diamond problem in which library B tells oauth2client to use transport b, library C tells oauth2client to use transport c, and library A wants to use library B and library C at the same time but can't?

@theacodes
Copy link
Contributor

For 1, leaning towards the far latter: we'll provide an interface and provide implementations based on urllib3 (the new default) and httplib2 (for legacy reasons). As far as the interface - I'm not sure yet, but once we get the two in place I'll codify the interface in whichever way you feel is most appropriate.

For 2, I thought about this a bit and I can think of two ways forward:

A. Global state (e.g. oauth2client.transport.set_transport(...)) which sets the background transport for the entire app. I don't believe there would ever be a library making this call - this is a call that should be made by the final consuming application. Any libraries built on top of oauth2client should be oblivious to the background transport (libraries should only care about the authentication transport, which will be handled by Credentials.authorize and the adapter classes).
B. Make transport or background_transport a constructor argument to Credentials. I'm not a huge fan of this because it complicates our serialization/deserialization story.

I'm open to arguments for/against either or alternatives.

@theacodes
Copy link
Contributor

bump, now that I have gotten a few things off my plate I can actually continue some of this work if I can get agreement between @nathanielmanistaatgoogle and @dhermes on the direction.

@dhermes
Copy link
Contributor Author

dhermes commented Sep 13, 2016

I am still in disagreement on having a separate internal transport. It seems the main (only?) reason for this is to allow auto-magic method calls that don't require an http argument. But this is a stark trade-off for making users chase around different transports, especially if their application requires very detailed instrumenting.

@theacodes
Copy link
Contributor

theacodes commented Sep 13, 2016

It seems the main (only?) reason for this is to allow auto-magic method calls that don't require an http argument.

Definitely isn't the only reason. Some reasons:

  • Isolate this library from having to support every http transport for its internal purposes.
  • Give users who want to use oauth2client with new http libraries a smaller interface to implement.
  • Allow usage with legacy libraries without having to use a legacy transport(e.g. google-api-python-client).

users chase around different transports

It's not verify difficult for a user to do:

>>> oauth2client.transport.set_transport(oauth2client.transport.urllib3)
>>> ...
>>> http = urllib3.PoolManager()
>>> http = credentials.authorize(http)

which means the background transport and their http client library are one and the same. In fact, this is a much better situation than a lot of Python libraries which more or less force you to use one transport. For some libraries this transport is requests and that's okay, but some libraries (like us right now) force users to use httplib2 with no clean way to change it.

I'm interested to see @nathanielmanistaatgoogle's thoughts, but as a way to move forward how would you feel I went forward with the separation plan but left the door open to unifying it later?

@dhermes
Copy link
Contributor Author

dhermes commented Sep 13, 2016

Isolate this library from having to support every http transport for its internal purposes.

But if we define a common interface, then the people trying to use a novel http transport just need to consult our handy-dandy documentation for adding a new transport. Then they can adhere to the interface we define and none of our keep needs to care what kind of duck it is.

To make this easier, we could define HTTP and Response base classes so that everyone can subclass them, but that wouldn't be necessary.

@dhermes
Copy link
Contributor Author

dhermes commented Sep 13, 2016

Allow usage with legacy libraries without having to use a legacy transport

I'm not sure how that would be possible. Seems like it'll be a lot of work to make the "external calls" be not-httplib2, even if the internal calls use something else by default.

@theacodes
Copy link
Contributor

theacodes commented Sep 13, 2016

I'm not sure how that would be possible. Seems like it'll be a lot of work to make the "external calls" be not-httplib2, even if the internal calls use something else by default.

Not sure what you mean, but ideally the situation would work exactly as it does today - just with refresh using the internal transport instead of whatever http object is passed to authorize.

@dhermes
Copy link
Contributor Author

dhermes commented Sep 13, 2016

I see. The only entry point into google-api-python-client is passing an authorized credentials object, so the transport isn't really encoded in that library (other than the signature).

@theacodes
Copy link
Contributor

true.

@theacodes
Copy link
Contributor

Chatted with @dhermes, I'm going to go forward without separation. We can resume the discussion if needed later.

@theacodes
Copy link
Contributor

Thank you for creating this issue, however, this project is deprecatedand we will only be addressing critical security issues. You can read moreabout this deprecation here.

If you need support or help using this library, we recommend that you ask yourquestion on StackOverflow.

If you still think this issue is relevant and should be addressed, pleasecomment and let us know!

@dhermes
Copy link
Contributor Author

dhermes commented May 24, 2018

Some great memories with this issue @theacodes!

@shazow
Copy link

shazow commented May 24, 2018

Turns out the real issues were the friends we made along the way.

@theacodes
Copy link
Contributor

theacodes commented May 24, 2018 via email

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants