Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix chunked encoding with SSL via proxy #4179

Closed
wants to merge 1 commit into from
Closed

Conversation

jamespic
Copy link
Contributor

This is a patch applying @bpitman's proposed fix for #3844.

Assuming a proxy is running on port 8875 (docker run --rm --name='tinyproxy' -p 8875:8888 dannydirect/tinyproxy:latest ANY works for me), the issue can be reproduced with

requests.put('https://www.google.com', data=iter('Hello World'), proxies={'https': 'http://localhost:8875'})

@Lukasa
Copy link
Member

Lukasa commented Jun 27, 2017

Why aren't we taking the approach I suggested in #3844?

@jamespic
Copy link
Contributor Author

@Lukasa I don't know the codebase that well, so I couldn't figure out exactly how the approach you'd proposed would fit in with the existing code - I tried getting rid of the chunked branch and just adding a chunked parameter to the conn.urlopen, but it lead to a failed test that I couldn't make sense of. Whereas @bpitman's patch worked and passed the tests.

@Lukasa
Copy link
Member

Lukasa commented Jun 27, 2017

What test failed? Can you provide the test output?

@jamespic
Copy link
Contributor Author

Sure thing

_________________________________________________________________________________ TestTimeout.test_total_timeout_connect[timeout0] _________________________________________________________________________________
[gw0] linux -- Python 3.5.2 /home/james/workspace/requests/.tox/py35/bin/python

self = <tests.test_requests.TestTimeout object at 0x7f42473c8828>, timeout = (0.1, 0.1)

    @pytest.mark.parametrize(
        'timeout', (
            (0.1, 0.1),
            Urllib3Timeout(connect=0.1, read=0.1)
        ))
    def test_total_timeout_connect(self, timeout):
        try:
>           requests.get(TARPIT, timeout=timeout)

tests/test_requests.py:2077: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
requests/api.py:72: in get
    return request('get', url, params=params, **kwargs)
requests/api.py:58: in request
    return session.request(method=method, url=url, **kwargs)
requests/sessions.py:502: in request
    resp = self.send(prep, **send_kwargs)
requests/sessions.py:612: in send
    r = adapter.send(request, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <requests.adapters.HTTPAdapter object at 0x7f4247413b70>, request = <PreparedRequest [GET]>, stream = False, timeout = <urllib3.util.timeout.Timeout object at 0x7f4247413b38>, verify = True, cert = None
proxies = OrderedDict()

    def send(self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None):
        """Sends PreparedRequest object. Returns Response object.
    
            :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
            :param stream: (optional) Whether to stream the request content.
            :param timeout: (optional) How long to wait for the server to send
                data before giving up, as a float, or a :ref:`(connect timeout,
                read timeout) <timeouts>` tuple.
            :type timeout: float or tuple or urllib3 Timeout object
            :param verify: (optional) Either a boolean, in which case it controls whether
                we verify the server's TLS certificate, or a string, in which case it
                must be a path to a CA bundle to use
            :param cert: (optional) Any user-provided SSL certificate to be trusted.
            :param proxies: (optional) The proxies dictionary to apply to the request.
            :rtype: requests.Response
            """
    
        conn = self.get_connection(request.url, proxies)
    
        self.cert_verify(conn, request.url, verify, cert)
        url = self.request_url(request, proxies)
        self.add_headers(request)
    
        chunked = not (request.body is None or 'Content-Length' in request.headers)
    
        if isinstance(timeout, tuple):
            try:
                connect, read = timeout
                timeout = TimeoutSauce(connect=connect, read=read)
            except ValueError as e:
                # this may raise a string formatting error.
                err = ("Invalid timeout {0}. Pass a (connect, read) "
                       "timeout tuple, or a single float to set "
                       "both timeouts to the same value".format(timeout))
                raise ValueError(err)
        elif isinstance(timeout, TimeoutSauce):
            pass
        else:
            timeout = TimeoutSauce(connect=timeout, read=timeout)
    
        try:
            resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout,
                chunked=chunked
            )
    
        except (ProtocolError, socket.error) as err:
            raise ConnectionError(err, request=request)
    
        except MaxRetryError as e:
            if isinstance(e.reason, ConnectTimeoutError):
                # TODO: Remove this in 3.0.0: see #2811
                if not isinstance(e.reason, NewConnectionError):
                    raise ConnectTimeout(e, request=request)
    
            if isinstance(e.reason, ResponseError):
                raise RetryError(e, request=request)
    
            if isinstance(e.reason, _ProxyError):
                raise ProxyError(e, request=request)
    
>           raise ConnectionError(e, request=request)
E           requests.exceptions.ConnectionError: HTTPConnectionPool(host='10.255.255.1', port=80): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f42473bb5f8>: Failed to establish a new connection: [Errno 113] No route to host',))

requests/adapters.py:458: ConnectionError
=========================================================================== 1 failed, 481 passed, 12 skipped, 2 xpassed in 33.83 seconds ===========================================================================

Weirdly it only failed in Python 3. The same test passed in Python 2.

@Lukasa
Copy link
Member

Lukasa commented Jun 27, 2017

That failure should be unrelated: it seems to be changing from a connection timeout to an authoritative statement that no route exists to that host. Mind pushing the branch up so I can take a look at your change?

@jamespic
Copy link
Contributor Author

@Lukasa
Copy link
Member

Lukasa commented Jun 27, 2017

Ok, so I think this test was accidentally relying on an environment behaviour. Mind pushing it to a branch that gets CI'd to see if our CI reproduces it? If it does I'll dive in myself.

@jamespic
Copy link
Contributor Author

Yep, absolutely.

@forrestchang
Copy link
Contributor

I merge your code and test the code, but it's still hanging.

import requests
resp = requests.get('https://baidu.com', proxies={'https': 'localhost:8899'})

I tried curl with this proxy server, it worked.

@HoneyBaked
Copy link

We are running in this to as well, and the patch resolved our problem/ Is there a plan to get this merged in to master?

@stromnet
Copy link

stromnet commented Sep 17, 2018

Any update on this? I got the same problem with chunked requests (which I think I can avoid, I accidentally used chunked by providing a None value in the data dict).


Here is my analysis (done before I found #3844 and this PR):
Minimal reproduced case:

Python 3.7.0 (Mac OS)
urllib3 1.23
requests 2.19.1

Install tinyproxy (brew install tinyproxy), no configuration required.
Launch with tinyproxy -d

Python code:

import requests
print(requests.post('https://status.github.com',
					data={'a':None},
					proxies={'https': 'http://127.0.0.1:8888'}))

Result:

Traceback (most recent call last):
  File "/Users/johan/reqbug/venv/lib/python3.7/site-packages/requests/adapters.py", line 471, in send
    low_conn.endheaders()
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1224, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1016, in _send_output
    self.send(msg)
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 956, in send
    self.connect()
  File "/Users/johan/reqbug/venv/lib/python3.7/site-packages/urllib3/connection.py", line 356, in connect
    ssl_context=context)
  File "/Users/johan/reqbug/venv/lib/python3.7/site-packages/urllib3/util/ssl_.py", line 359, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 412, in wrap_socket
    session=session
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 850, in _create
    self.do_handshake()
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 1108, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:1045)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 4, in <module>
    proxies={'https': 'http://localhost:8888'}))
  File "/Users/johan/reqbug/venv/lib/python3.7/site-packages/requests/api.py", line 112, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/Users/johan/reqbug/venv/lib/python3.7/site-packages/requests/api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Users/johan/reqbug/venv/lib/python3.7/site-packages/requests/sessions.py", line 512, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/johan/reqbug/venv/lib/python3.7/site-packages/requests/sessions.py", line 622, in send
    r = adapter.send(request, **kwargs)
  File "/Users/johan/reqbug/venv/lib/python3.7/site-packages/requests/adapters.py", line 503, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: [SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:1045)

The key here is the None value in the data, which makes the HTTPAdapter pick the chunked route:
https://github.com/requests/requests/blame/master/requests/adapters.py#L449

This means relying on urllib3 internal functions.
The above blame indicats this was written 5 years ago.
In urllib3, 4 years ago, the code changed and (re?)introduced the _prepare_proxy call:

https://github.com/urllib3/urllib3/blame/master/src/urllib3/connectionpool.py#L592

By adding the following code just after adapters.py:453 (conn._get_conn(..)), the request works as expected:

                is_new_proxy_conn = conn.proxy is not None and not getattr(low_conn, 'sock', None)
                if is_new_proxy_conn:
                    conn._prepare_proxy(low_conn)

@lmvlmv
Copy link
Contributor

lmvlmv commented Jul 1, 2019

Commenting to bump. Am running into this when using chunked uploads viathe POST method using a proxy. Can confirm the above 3 liner is functional. Any chance this PR can get approved and pushed into the full module?

@jayvdb
Copy link
Contributor

jayvdb commented Nov 2, 2019

@jamespic, this needs to be rebased, or closed if it has been resolved by other merged changes, esp #5128 which says it solves this problem.

@ynouri
Copy link

ynouri commented Jan 10, 2020

I was experiencing the same issue (chunked encoding through an HTTPS proxy tunnel resulting in an SSL unknown protocol error) on a test case very similar to the one described by @stromnet above with requests 2.22.0.

I upgraded to a9ee0ee and the issue disappeared 👍 . After spending some time analyzing the issue I'm confident that the fix came from #5128 as suggested by @jayvdb .

Since as of Jan 10, 2020 the fix is not released I used this command to upgrade requests to head of master branch: pip install git+https://github.com/psf/requests.git --upgrade

@jamespic jamespic closed this Jan 10, 2020
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 30, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants