Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add proxy URI schema #17318

Closed
wants to merge 1 commit into from
Closed

Add proxy URI schema #17318

wants to merge 1 commit into from

Conversation

jhernand
Copy link
Contributor

@jhernand jhernand commented Apr 19, 2018

The Ruby URI parser doesn't decode the percent encoded characters in the URI, in particular it doesn't decode the password which is frequently used when specifying proxy addresses and authentication. For example, the following code:

require 'uri'
proxy = URI.parse('http://myuser:%24%[email protected]:3128')
puts proxy.password

Produces the following output:

%24%3fxxxx

But some gems, in particular rest-client and kubeclient, expect it to decode those characters, as they use the value returned by the password method directly, and thus they fail to authenticate against the proxy server when the password contains percent encoded characters.

To address this issue this patch adds a new proxy URI schema that almost identical to the http schema, but that decodes the password before returning it. Users can use this schema instead of http when
they need to use percent encoded characters in the password. For example, the user can type in the UI the following proxy URL:

proxy://myuser:%24%[email protected]:3128

And the new schema will automatically translate %24%3fxxxx into
$?xxxx.

Fixes https://bugzilla.redhat.com/1566615.

@jhernand
Copy link
Contributor Author

@moolitayer @blomquisg please review.

I know that this kind of change is unlikely to be merged. The alternative is to add to the UI new fields for the proxy user name and password, modify the backend to support that, modify the kubeclient gem so that it supports specifying the proxy user name and password independently of the URL, and maybe do the same for rest-client.

Copy link
Contributor

@imtayadeway imtayadeway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Ruby URI parser doesn't decode the percent encoded characters in the userinfo part of URIs

AFAICT, the Ruby parser doesn't decode any characters in any part of the URI - that's left up to the client.

But according to RFC 3986 it should decode the percent encoded characters, and produce the following output:

Again, AFAICT, RFC 3986 doesn't make any pronouncements about how a URI parser should be implemented, it mere talks about the form/syntax of URIs.

As such, I think this patch results in surprising behavior, and I'd recommend looking into another approach.

@jhernand
Copy link
Contributor Author

Yes, I agree this isn't the right solution. May be good for an emergency, but not as the definitive solution. I wanted to share it anyhow. I am marking it as WIP to make sure it isn't merged.

@jhernand jhernand changed the title Patch URI to decode password [WIP] Patch URI to decode password Apr 19, 2018
@miq-bot miq-bot added the wip label Apr 19, 2018
@jhernand jhernand changed the title [WIP] Patch URI to decode password Add proxy URI schema Apr 20, 2018
@jhernand
Copy link
Contributor Author

jhernand commented Apr 20, 2018

@imtayadeway I have updated the pull request to use a less intrusive mechanism: it adds a new proxy URI schema that only differs from http in the automatic decoding of the password. This can be used then directly by the user, typing proxy://... instead of http://..., so it will only affect (positively, I think) those users that do it explicitly. I have also updated the pull request name and description accordingly.

@miq-bot miq-bot removed the wip label Apr 20, 2018
@jhernand
Copy link
Contributor Author

@miq-bot add_label gaprindashvili/yes

@jhernand
Copy link
Contributor Author

@moolitayer @blomquisg @imtayadeway can you please review again?

@moolitayer
Copy link

The direction Looks good. The only drawback is that users would have to know about this.

@miq-bot assign @agrare

module URI
class PROXY < HTTP
def password
CGI.unescape(super())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to handle gracefully this case:

uri = URI('http://192.168.122.40:3128') # uri.password is nil
uri = URI('proxy://192.168.122.40:3128') # uri.password raises exception

even though we don't expect the users to input no password when using the proxy protocol.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@agrare
Copy link
Member

agrare commented Apr 23, 2018

@jhernand isn't the problem that the client is using a field of a URI without unescaping it? Could we either fix the clients or unescape the password before passing it to the client?

@jhernand
Copy link
Contributor Author

jhernand commented Apr 23, 2018

@agrare the client (the kubeclient gem) isn't aware at all of the fact that the URI may contain a user name or a password, it just blindly passes the proxy URI to the rest-client gem, which then extracts the proxy user name and password without decoding them. We can't unscape the password before passing it, because rest-client calls URI.parse which throws an exception if the URI contains characters like $ or ? in the userinfo.

The right solution would require all this:

  1. Add to the UI two new text boxes for the proxy user name and password.
  2. Add the corresponding columns to the database to save them.
  3. Add to the kubeclient gem the capability to receive explicitly the proxy user name and proxy password, separate from the proxy host name and port.
  4. Add that same capability to the rest-client gem (or else change kubeclient to use a different gem that already supports that).
  5. Update the core to use the new versions of these gems.

Copy link

@moolitayer moolitayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍
Seems like a reasonable solution given this is targeted to Gaprindashvili

@agrare
Copy link
Member

agrare commented Apr 23, 2018

IDK this still seems like the wrong place to be doing this. If rest-client is calling URI.parse, then uses a field in the URI without unescaping it that sounds like a bug in rest-client. Did you try opening an issue on rest-client?

I'm going to assign this over to @Fryguy for review

@agrare agrare assigned Fryguy and unassigned agrare Apr 23, 2018
@jhernand
Copy link
Contributor Author

Here is the rest-client issue: rest-client/rest-client/issues/661
And here the kubeclient issue: ManageIQ/kubeclient/issues/317

If you think that the fix should go in those gems, please close this pull request.


module URI
class PROXY < HTTP
def password
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might also want to handle percent-encoded username
(RFC just says whole userinfo field may contain pct-encoded chars; username encoding does happen in the wild for example when using email as username https://stackoverflow.com/questions/10050877/url-username-with)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense. Can you comment that also in the rest-client and kubeclient issues?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you monkey patching an existing class? If so I am against this approach. This is a known thing about the URI class, and it's the callers responsibility to do the CGI escape/unescape thing.

I would not mind if you created a subclass, though not sure what it would be named. Id also argue that new subclass should live in a utility library like more_core_extensions or manageiq_gems_pending.

@bdunne @jrafanie Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of the name PROXY, but I like the idea.

Id also argue that new subclass should live in a utility library like more_core_extensions

👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, PROXY is confusing to me.

There's already find_proxy and use_proxy methods that leverage environment variables HTTP_PROXY, FTP_PROXY, NO_PROXY, which are completely different things than this. See here.

@cben
Copy link
Contributor

cben commented Apr 24, 2018

We can certainly change kubeclient.
rest-client seems unmaintained at present, zero activity since Sep 2017, issues and PRs not getting a response :-(

  • RFC 3986 also says:

    Use of the format "user:password" in the userinfo field is deprecated.

    Can we communicate proxy user & password to kubeclient / rest-client by another channels outside the URL? Looking at rest-client code, apparently not, all code paths expect a URL string :-(
    EDIT: forget this point, it's not helpful direction and there is a whole ecosystem of communicating proxy settings by a single URL string.

If we can't get a PR into rest-client (and see it released), I'd feel fine about somehow monkey-patching rest-client itself to handle http scheme right.
I can see how we might prefer this workaround for backporting, but it requires educating users...

Also, there might be a bug in ruby stdlib. When one doesn't specify any proxy, it's autodected from environment variables (by find_proxy), and at first glance I think that code path simply uses URI.parse.

@jhernand
Copy link
Contributor Author

jhernand commented Apr 24, 2018

My concerns about fixing this in kubeclient and rest-client are two:

  1. It will take probably a long time to have them fixed and released, and meanwhile we have users with the problem waiting for a solution.

  2. There may be users of those gem that are using proxy password like my%3fpass. I mean password that, by accident, contain something that looks like a percent encoded character. If we apply the fix then those passwords will stop working because they will be translated into my?pass. I admit this is very unlikely.

The Ruby URI parser doesn't decode the percent encoded characters in the
URI, in particular it doesn't decode the password which is frequently
used when specifying proxy addresses and authentication. For example,
the following code:

```ruby
require 'uri'
proxy = URI.parse('http://myuser:%24%[email protected]:3128')
puts proxy.password
```

Produces the following output:

```
%24%3fxxxx
```

But some gems, in particular `rest-client` and `kubeclient`, expect it
to decode those characters, as they use the value returned by the
`password` method directly, and thus they fail to authenticate against
the proxy server when the password contains percent encoded characters.

To address this issue this patch adds a new `proxy` URI schema that
almost identical to the `http` schema, but that decodes the password
before returning it. Users can use this schema instead of `http` when
they need to use percent encoded characters in the password. For
example, the user can type in the UI the following proxy URL:

proxy://myuser:%24%[email protected]:3128

And the new schema will automatically translate `%24%3fxxxx` into
`$?xxxx`.

Signed-off-by: Juan Hernandez <[email protected]>
@miq-bot
Copy link
Member

miq-bot commented Apr 24, 2018

Some comments on commit https://github.com/jhernand/manageiq/commit/73102d7e7ec1921c4c8658183408767ece7be198

config/initializers/proxy_uri.rb

  • ⚠️ - 8 - Detected puts. Remove all debugging statements.

@miq-bot
Copy link
Member

miq-bot commented Apr 24, 2018

Checked commit https://github.com/jhernand/manageiq/commit/73102d7e7ec1921c4c8658183408767ece7be198 with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
2 files checked, 0 offenses detected
Everything looks fine. 🍰

@JPrause
Copy link
Member

JPrause commented May 2, 2018

@miq-bot add_label blocker

@miq-bot miq-bot added the blocker label May 2, 2018
@JPrause
Copy link
Member

JPrause commented May 3, 2018

@jhernand is there anything else needed for this PR?

@jhernand
Copy link
Contributor Author

jhernand commented May 3, 2018

@JPrause it needs to be reviewed and either merged or rejected.

Copy link
Contributor

@cben cben left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I don't see any other direction we can realistically ship.

(IMHO it'd be OK to change behavior of http: scheme, but implementation would be more invasive and riskier to backport. This is good balance.)

@JPrause
Copy link
Member

JPrause commented May 24, 2018

@Fryguy it appears this is ready to be merged. This is for a blocker issue.

@Fryguy
Copy link
Member

Fryguy commented May 24, 2018

I'm sorry, but I had never seen this issue for some reason, so I apologise for the extremely delayed response. It only came onto my radar again because it was marked as a blocker. I would love to get input from @bdunne and @jrafanie.

It's hard to tell from the history here if this is a new URI class or monkey patching an existing one. I'm not a fan of either to be honest, and feel like this is more simply handled by just changing the callers.

@jrafanie
Copy link
Member

We can't assume that the password method should always be decoded, such as if we are doing URI building, right? It seems like we need password to be decoded when using the password for actual connections but should stay encoded if we're building URIs.

This sounds very familiar....

If I'm understanding this correctly, it seems very similar to how IPv6 addresses need to be wrapped in [] in URIs but when consumed by clients that need to make socket connections, it must be unwrapped. The URI class and even addressable (an alternative URI library) have two methods for get/set with the wrapped/unwrapped value, host/host= and hostname, hostname=

# parse only valid URIs (IPv6 address is wrapped with []

# INVALID
irb(main):028:0> URI.parse("https://::1/blah")
URI::InvalidURIError: bad URI(is not URI?): https://::1/blah
...

#VALID
irb(main):011:0> u = URI.parse("https://[::1]/blah")
=> #<URI::HTTPS https://[::1]/blah>


# hostname reader unwraps ipv6 address for use in socket connections:
irb(main):012:0> u.hostname
=> "::1"

# host reader wraps ipv6 address for use in URI building
irb(main):013:0> u.host
=> "[::1]"


# host setter raises on invalid unwrapped ipv6 addresses
irb(main):027:0> u.host = "::1"
URI::InvalidComponentError: bad component(expected host component): ::1

# host setter only works if the ipv6 address is wrapped
irb(main):016:0> u.host = "[::2]"
=> "[::2]"

# hostname setter will wrap ipv6 addresses whether it's wrapped or not
irb(main):021:0> u.hostname = "::1"
=> "::1"
irb(main):022:0> u.host
=> "[::1]"
irb(main):023:0> u.hostname
=> "::1"

irb(main):024:0> u.hostname = "[::2]"
=> "[::2]"
irb(main):025:0> u.host
=> "[::2]"
irb(main):026:0> u.hostname
=> "::2"
...

The only way to fix this was to have a "encoded/decoded getter/setter" in the URI/addressable libraries, ensure the .build method honored either wrapped/unwrapped IPv6 addresses and update all clients to use hostname reader for socket connections ("::1") and host reader for URI building.

This is not easy.

Feel free to look at the various PRs that were required to get this working for IPv6:
https://github.com/pulls?q=is%3Apr+author%3Ajrafanie+archived%3Afalse+ipv6+is%3Aclosed+sort%3Acreated-asc

Of note:
ruby URI::Generic.build with ipv6 support

As a temporary measure, I added a monkey patch prepended module to get the ruby fix above so all of my manageiq client code can use the correct syntax (while I waited for that fix to be merged and released)

Either way, clients needed to know when to use host and when to use hostname for wrapped(uri building) or unwrapped (socket connections)

excon
rest client 1
rest client 2
ovirt

There were many others.

@jrafanie
Copy link
Member

We can't assume that the password method should always be decoded, such as if we are doing URI building, right? It seems like we need password to be decoded when using the password for actual connections but should stay encoded if we're building URIs.

what I mean by this is this

u.password fed into another URI.build... will fail if the password is decoded
In the same way as URI.parse("https://....#{u.password}..." will fail if the password is decoded.

Copy link
Member

@jrafanie jrafanie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comments, I don't think this is going to work.

@JPrause
Copy link
Member

JPrause commented May 29, 2018

@jhernand can you review the comments above.

@jhernand
Copy link
Contributor Author

jhernand commented Jun 4, 2018

@Fryguy @jrafanie if adding the new proxy://... scheme isn't acceptable, then the only solution I see is to stop using the rest-client gem, as it is that gem that assumes that it can use the proxy user name and password directly, without decoding them. Unfortunately that gem seems to be currently unmaintained and there was no response to the issue that I opened.

@cben what do you think it will cost (in time) to replace rest-client (in the kubeclient gem) with some other HTTP library that doesn't have this problem?

@jrafanie
Copy link
Member

jrafanie commented Jun 4, 2018

@jhernand rest-client may in fact not be actively maintained anymore but from the looks of the issue that was opened, it might just be a single line that needs CGI.unescape.

https://github.com/rest-client/rest-client/blob/f450a0f086f1cd1049abbef2a2c66166a1a9ba71/lib/restclient/request.rb#L453-L454

It seems far easier to verify this is the only place that needs it, make the change, add a test and open a PR, than to change the underlying http library in all of kubeclient. I have reached out to @ab in the past and can do it again if you have a simple PR that we need to get merged.

@jrafanie
Copy link
Member

jrafanie commented Jun 5, 2018

I opened a PR to rest-client, let's see if we can get it merged and released: rest-client/rest-client#665

@JPrause
Copy link
Member

JPrause commented Oct 11, 2018

@miq-bot add_label hammer/yes

jlsherrill added a commit to jlsherrill/katello that referenced this pull request Feb 12, 2019
rest-client alongside URIs parse method introduces a bug
where by usernames or passwords cannot have symbols in them.
More Information:
 rest-client/rest-client#661
 ManageIQ/manageiq#17318
jlsherrill added a commit to Katello/katello that referenced this pull request Feb 13, 2019
rest-client alongside URIs parse method introduces a bug
where by usernames or passwords cannot have symbols in them.
More Information:
 rest-client/rest-client#661
 ManageIQ/manageiq#17318
@jrafanie
Copy link
Member

@jhernand can this be closed? I opened rest-client/rest-client#665 on rest-client... they haven't reviewed/merged/commented on it, so we monkey patched it here: #18105

I think that monkey patch makes this PR no longer needed.

@jhernand
Copy link
Contributor Author

Not needed, I am closing it.

@jhernand jhernand closed this Feb 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants