Problems about the proxies #6

Closed
duozhang opened this issue Feb 3, 2023 · 19 comments
@duozhang

duozhang commented Feb 3, 2023

Thanks for this project!

When I set the proxies parameter, I get this error:

(CurlError("Failed to perform, ErrCode: 35, Reason: 'error:100000f7:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER'"),)

The proxy requires a username and password, and I set the parameter like this:

auth = ["username","pwd"],

@perklet
Collaborator

perklet commented Feb 3, 2023

Could you please try the format http://username:[email protected]:3128
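A minimal sketch of building a proxy URL in that format, assuming the credentials may contain reserved characters such as `@` or `:` that need percent-encoding. The helper name `build_proxy_url` and the host `proxy.example.com` are hypothetical and only serve as illustration:

```python
from urllib.parse import quote


def build_proxy_url(host, port, username=None, password=None, scheme="http"):
    """Return a proxy URL, percent-encoding the credentials if present."""
    if username is None:
        return f"{scheme}://{host}:{port}"
    # safe="" forces reserved characters such as "@" and ":" to be encoded,
    # so they cannot be confused with the URL's own separators.
    user = quote(username, safe="")
    pwd = quote(password or "", safe="")
    return f"{scheme}://{user}:{pwd}@{host}:{port}"


# Hypothetical host and credentials, for illustration only.
proxy_url = build_proxy_url("proxy.example.com", 3128, "username", "p@ss:word")
proxies = {"https": proxy_url}
print(proxies)
```

The resulting dictionary can then be passed as the `proxies` argument of the request.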

@duozhang
Author

duozhang commented Feb 3, 2023

> Could you please try the format http://username:[email protected]:3128

Thanks for your reply. I tried it and the error still exists.

Out [2]: (CurlError("Failed to perform, ErrCode: 35, Reason: 'error:100000f7:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER'"),)

@duozhang
Author

duozhang commented Feb 3, 2023

> Could you please try the format http://username:[email protected]:3128

I use Python 3.7.16 on Windows 11 (x64).

@perklet
Collaborator

perklet commented Feb 3, 2023

OK, more details would be helpful.

Does it work for proxies without authentication?
Can you provide the url you were trying to access or the full code snippet?

@duozhang
Author

duozhang commented Feb 3, 2023

> OK, more details would be helpful.
>
> Does it work for proxies without authentication? Can you provide the url you were trying to access or the full code snippet?

It works without authentication.
The URL is https://tls.peet.ws/api/all

        try:
            r = requests.get(
                url = "https://kawayiyi.com/tls",
                allow_redirects = False,
                #proxies = self.myProxies,
                #auth = self.sAUTH,
                impersonate = "chrome101"
            )
            #r =  requests.get( url="https://kawayiyi.com/tls", impersonate="chrome101", proxies = self.myProxies)
            return r
        except Exception as e:
            pass

@perklet
Collaborator

perklet commented Feb 3, 2023

TL;DR

You should change {"https": "https://localhost:3128"} to {"https": "http://localhost:3128"}.

Full explanation

I believe you are misusing https proxies, which is a very common pitfall.

A typical https-over-http proxy is a proxy capable of tunneling https traffic. The client first sends a plain http CONNECT request to the proxy, and then a tunnel is established between the client and the web server via the proxy. The format should be:

proxies = {"https": "http://localhost:3128"}
            ^^^^^   ^^^^^^
                     Notice: http, not https

The connections:

Client <--http--> proxy <--https--> web
    \              /
     +----https---+

Because the CONNECT is plain http, the proxy address should start with http://, not https://.
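To make the plain http CONNECT step concrete, here is a rough sketch of the bytes a client sends to the proxy before any TLS is spoken. The helper `build_connect_request` is hypothetical, and the credentials are placeholders; real clients such as libcurl add further headers:

```python
import base64


def build_connect_request(target_host, target_port, username=None, password=None):
    """Build the plain-text CONNECT request sent to an http tunneling proxy."""
    authority = f"{target_host}:{target_port}"
    lines = [f"CONNECT {authority} HTTP/1.1", f"Host: {authority}"]
    if username is not None:
        # Basic proxy authentication: base64("username:password").
        raw = f"{username}:{password or ''}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        lines.append(f"Proxy-Authorization: Basic {token}")
    # An empty line terminates the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_connect_request("kawayiyi.com", 443, "username", "pwd")
print(request.decode("ascii"))
```

Only after the proxy answers this request with a 2xx status does the client start the TLS handshake with the target site through the tunnel. If the client instead opens with a TLS handshake toward a proxy that expects this plain-text request, the proxy's plain-text reply is what gets reported as a wrong SSL version.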

A rare https-over-https proxy is one where the client establishes an https connection to the proxy itself. In this case, the address should be:

proxies = {"https": "https://localhost:3128"}
            ^^^^^   ^^^^^^
                     Notice: https

The connections:

Client <--https--> proxy <--https--> web
    \              /
     +----https---+

So, you are using an https-over-http proxy as an https-over-https proxy, hence the SSL error.

Actually, in requests, this error is spelled out more clearly:

r = requests.get("https://kawayiyi.com/tls", proxies={"https": "https://elastic:elastic@localhost:3128"})

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/brew/lib/python3.10/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/opt/brew/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/opt/brew/lib/python3.10/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/brew/lib/python3.10/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/opt/brew/lib/python3.10/site-packages/requests/adapters.py", line 559, in send
    raise ProxyError(e, request=request)
requests.exceptions.ProxyError: HTTPSConnectionPool(host='kawayiyi.com', port=443): 
Max retries exceeded with url: /tls 
(Caused by ProxyError('Your proxy appears to only use HTTP and not HTTPS, try changing your proxy URL to be HTTP. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#https-proxy-error-http-proxy', SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:997)'))))

The error message literally says: "Your proxy appears to only use HTTP and not HTTPS, try changing your proxy URL to be HTTP"

It seems that the error message for curl_cffi should be updated to clarify this issue.
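As an illustration of what such a clearer message could look like, here is a small sketch that turns the raw error text into a hint. This is not curl_cffi's actual behaviour; the function `proxy_error_hint` is hypothetical:

```python
from urllib.parse import urlsplit


def proxy_error_hint(error_message, proxy_url):
    """Return a hint when the error looks like an http proxy addressed as https."""
    parts = urlsplit(proxy_url)
    if "WRONG_VERSION_NUMBER" in error_message and parts.scheme.lower() == "https":
        # Suggest the same proxy URL with the scheme swapped to http.
        suggestion = parts._replace(scheme="http").geturl()
        return (
            "Your proxy appears to speak plain HTTP, not HTTPS. "
            f"If it is an ordinary tunneling proxy, try {suggestion}"
        )
    return None


message = (
    "Failed to perform, ErrCode: 35, Reason: "
    "'error:100000f7:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER'"
)
hint = proxy_error_hint(message, "https://elastic:elastic@localhost:3128")
print(hint)
```

The hint is only a heuristic: a genuine HTTPS proxy can fail for other reasons, so the function returns `None` unless both the error text and the URL scheme match.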

@duozhang
Author

duozhang commented Feb 3, 2023

Oh, it works! Many thanks for your excellent work.

duozhang closed this as completed Feb 3, 2023
perklet changed the title from "Problems about the proxies" to "Problems about the proxies(Add better documentation for http proxy, like requests)" Feb 3, 2023
perklet changed the title from "Problems about the proxies(Add better documentation for http proxy, like requests)" to "Add better documentation for http proxy, like requests" Feb 3, 2023
perklet self-assigned this Feb 3, 2023
@biscuitsan

https://github.com/yifeikong/curl_cffi/blob/v0.6.0b7/curl_cffi/requests/session.py#L348
Please don't force this, a simple warning, or a check that can be disabled with a parameter is fine, but in this case I know what I'm trying to do.

@perklet
Collaborator

perklet commented Jan 30, 2024

> https://github.com/yifeikong/curl_cffi/blob/v0.6.0b7/curl_cffi/requests/session.py#L348 Please don't force this, a simple warning, or a check that can be disabled with a parameter is fine, but in this case I know what I'm trying to do.

It has been relaxed on the main branch and will be released in the next version.

@rlaphoenix
Contributor

On my end, using a NordVPN proxy (format https://service_user:[email protected]:89), which is an HTTPS proxy, it fails no matter what I try. The same proxy works as-is with the https:// scheme in requests.

The username and password are not the account credentials but the Service Credentials from the Nord Dashboard.

  • "https" proxies key, with http:// in proxy string: RequestsError: Failed to perform, ErrCode: 56, Reason: 'Recv failure: Connection was reset'. This may be a libcurl error, See https://curl.se/libcurl/c/libcurl-errors.html first for more details.
  • "https" proxies key, with https:// in proxy string (with the http:// check removed): RequestsError: Failed to perform, ErrCode: 60, Reason: 'SSL certificate problem: unable to get local issuer certificate'. This may be a libcurl error, See https://curl.se/libcurl/c/libcurl-errors.html first for more details.
    Doing the exact same request without the proxy to the same URL results in a normal and expected 403 Error as it is geoblocked.
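For reference, the numeric codes quoted in this thread map to libcurl error names as sketched below. The lookup table only covers the three codes seen here, and the helper `curl_error_name` is hypothetical:

```python
import re

# Only the codes that appear in this thread; see libcurl-errors for the full list.
LIBCURL_ERRORS = {
    35: "CURLE_SSL_CONNECT_ERROR",         # TLS handshake failed
    56: "CURLE_RECV_ERROR",                # failure receiving network data
    60: "CURLE_PEER_FAILED_VERIFICATION",  # certificate could not be verified
}


def curl_error_name(message):
    """Extract 'ErrCode: N' from an error message and look up its name."""
    match = re.search(r"ErrCode:\s*(\d+)", message)
    if match is None:
        return None
    return LIBCURL_ERRORS.get(int(match.group(1)), "unknown, see libcurl-errors")


print(curl_error_name("Failed to perform, ErrCode: 56, Reason: 'Recv failure'"))
print(curl_error_name("Failed to perform, ErrCode: 60, Reason: 'SSL certificate problem'"))
```

Code 60 in the second case indicates that a TLS connection to the proxy was attempted but its certificate could not be checked against a CA bundle.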

@perklet
Collaborator

perklet commented Feb 17, 2024

Do you have any non-ASCII characters in your path of working directory or virtualenv?

@rlaphoenix
Contributor

Not that I can tell, no

@perklet
Collaborator

perklet commented Feb 17, 2024

What about adding verify=False?

@rlaphoenix
Contributor

> What about adding verify=False?

Tried that too, but it just gave a generic error.

perklet changed the title from "Add better documentation for http proxy, like requests" to "Problems about the proxies" Feb 24, 2024
@rlaphoenix
Contributor

image
Someone else is having the exact same problem as me surrounding the use of NordVPN proxies.

@coletdjnz
Contributor

curl_cffi is not setting PROXY_CAINFO, only CAINFO. Hence the SSL errors.

https://github.com/yifeikong/curl_cffi/blob/418e452c99dee5da176f0b0a768337cd5509c4c5/curl_cffi/curl.py#L273

try session.curl.setopt(CurlOpt.PROXY_CAINFO, certifi.where())

@rlaphoenix
Contributor

rlaphoenix commented Apr 1, 2024

> curl_cffi is not setting PROXY_CAINFO, only CAINFO. Hence the SSL errors.
>
> https://github.com/yifeikong/curl_cffi/blob/418e452c99dee5da176f0b0a768337cd5509c4c5/curl_cffi/curl.py#L273
>
> try session.curl.setopt(CurlOpt.PROXY_CAINFO, certifi.where())

That does in fact work on v0.6.2 with 0 modifications and may be the real fix for everyone in this issue, including the OP. Replacing https:// with http:// in the proxy connection URI is not the real fix and only works if the proxy happens to support both; NordVPN explicitly turned off HTTP-only proxies a few years ago.

Here is how I set the proxy:

session.proxies.update({"all": "https://user:[email protected]:89"})

# now on the same thread that you will download on, do:
session.curl.setopt(CurlOpt.PROXY_CAINFO, session.curl._cacert)

(Notice that I use session.curl._cacert instead of certifi.where(), since the latter isn't exactly what the code does. It seems the code may have alternative ways to specify the CA, so it is likely best to reuse whatever it used instead of taking it from certifi directly.)

However, it's very slow. Like very slow. I get about 400-600 kb/s according to my own (likely inaccurate) speed calculation but it is definitely S.L.O.W. I can normally get about 2-3 mb/s easily on requests. It also gives me the imo entirely invalid warning of:

RuntimeWarning: You may be using http proxy WRONG, the prefix should be 'http://' not 'https://', see:
https://github.com/yifeikong/curl_cffi/issues/6

@perklet
Collaborator

perklet commented Apr 1, 2024

> It also gives me the imo entirely invalid warning

Mistakenly writing http tunneling proxy as https:// is a question that I have been asked millions of times. I believe putting the warning here does more good than evil. But perhaps it should be able to be disabled.

ret = self.setopt(CurlOpt.CAINFO, self._cacert)

Thanks for the pointer, I will add this in future versions.

> it's very slow. Like very slow.

That's definitely a problem worth investigating. Could you please provide more details?

@rlaphoenix
Contributor

rlaphoenix commented Apr 1, 2024

> Mistakenly writing http tunneling proxy as https:// is a question that I have been asked millions of times.

It's not a mistake to do this, and if a user does it then it's their own fault. This is not something that should be a warning when it will affect anyone who does in fact need to use https:// for an HTTPS proxy.

The fix

I just want to quickly note that this will need to be handled by Curl class in _ensure_cacert method. Doing it manually via session.curl.setopt will not work once the Curl.reset() method is called, i.e. on the second request of the Session/curl object. I can't even do a patched-class approach because the Session object doesn't honor it anyway when used in different threads as mentioned in the doc-string.

Possibly true fixes

  1. Get rid of that http vs. https warning; it generally causes more confusion than it prevents. It leads people to believe their proxy is not an HTTPS proxy when it is one. And if it is an HTTPS proxy, we are now bombarded with annoying warnings. I get the idea as to why this warning is in place, but it generally serves no real purpose. You're hand-holding the people who just copy-paste, and annoying anyone who needs HTTPS proxies.
  2. In Curl._ensure_cacert, just copy the two lines to also set the same _cacert property on the PROXY_CAINFO curlopt.
  3. In Session.__init__, instead of asking for a Curl object, ask for a callable that can be called to initialize a Curl object. That way I can pass e.g. Session(curl=PatchedCurl, ...), where PatchedCurl is a class, and it can be used on construction of the session and on any thread. If I need to pass arguments to PatchedCurl, or even just the Curl class, I can do so with functools.partial, e.g. Session(curl=functools.partial(PatchedCurl, debug=True), ...). You then use that curl argument as if it's a function or a class's constructor, i.e., self._curl = curl(). You can still pass arguments manually if you need to.
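Proposals 2 and 3 can be illustrated with a toy model. Everything below (`ToyCurl`, `ToySession`, `PatchedCurl`, the string constants, the certificate paths) is a hypothetical stand-in and not curl_cffi's real implementation; it only assumes, as described above, that `reset()` wipes previously set options:

```python
import functools

# Stand-ins for CurlOpt.CAINFO and CurlOpt.PROXY_CAINFO (toy constants).
CAINFO = "CAINFO"
PROXY_CAINFO = "PROXY_CAINFO"


class ToyCurl:
    """Toy curl handle whose options are wiped by reset()."""

    def __init__(self, cacert="/hypothetical/cacert.pem"):
        self._cacert = cacert
        self._is_cert_set = False
        self.options = {}

    def setopt(self, option, value):
        self.options[option] = value

    def reset(self):
        self.options.clear()
        self._is_cert_set = False

    def _ensure_cacert(self):
        if not self._is_cert_set:
            self.setopt(CAINFO, self._cacert)
            self._is_cert_set = True

    def perform(self):
        self._ensure_cacert()
        return dict(self.options)  # snapshot of the options used for this request


class PatchedCurl(ToyCurl):
    """Proposal 2: set the proxy CA bundle wherever the normal one is set."""

    def _ensure_cacert(self):
        if not self._is_cert_set:
            self.setopt(PROXY_CAINFO, self._cacert)
        super()._ensure_cacert()


class ToySession:
    """Proposal 3: accept a callable that builds the curl handle."""

    def __init__(self, curl=ToyCurl):
        self.curl = curl()

    def request(self):
        result = self.curl.perform()
        self.curl.reset()  # options set manually are lost here
        return result


manual = ToySession()
manual.curl.setopt(PROXY_CAINFO, manual.curl._cacert)
manual_first, manual_second = manual.request(), manual.request()

factory = functools.partial(PatchedCurl, cacert="/hypothetical/ca.pem")
patched = ToySession(curl=factory)
patched_first, patched_second = patched.request(), patched.request()

print(PROXY_CAINFO in manual_first, PROXY_CAINFO in manual_second)    # True False
print(PROXY_CAINFO in patched_first, PROXY_CAINFO in patched_second)  # True True
```

In this model the manually set option survives only the first request, while the patched handle reapplies it after every reset, which is the behaviour the proposals aim for.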

Speed issue

As for this issue, it seems to be related to setting the PROXY_CAINFO too often. If I set it in such a way that it just so happens to set only after a reset() like the current CAINFO, then speeds seem to be as-expected. Either that or it's randomly slow and randomly fast and just so happened to go fast when I changed my code in such a way. Either way I don't think it's something to worry about unless someone else has the problem too.
