Feature/impersonate 6.0 #163

Merged
merged 19 commits into from
Dec 31, 2023

Conversation

perklet
Collaborator

@perklet perklet commented Nov 25, 2023

TODOs

  • add chrome 116 - 120
  • support websockets

ETA:

https://github.com/users/yifeikong/projects/1/views/3

Update:

Other build options are delayed; none of them can be built directly...

@@ -40,6 +40,8 @@ To install beta releases:

## Usage

Use the latest impersonate versions, do NOT copy `chrome110` here without changing.
Contributor

I think this section could be improved

README.md Outdated
Supported impersonate versions, as supported by [curl-impersonate](https://github.com/lwthiker/curl-impersonate):
Supported impersonate versions, as supported by my [fork](https://github.com/yifeikong/curl-impersonate) of [curl-impersonate](https://github.com/lwthiker/curl-impersonate):

However, only Chrome-like browsers are supported.
Contributor

You could be more explicit here about why other browsers are not supported.

I agree, it would be great to have some documentation explaining why we don't support Firefox or the other browsers supported by curl-impersonate.

Possibly simply adding a link to #59 (comment) will be enough

Collaborator Author

Apologies for the misunderstanding, I'll make sure it's well documented in the new version.

@Kwsswart Kwsswart Dec 21, 2023

Please never apologize! Thank you for the work on this, mate, it's an amazing repository! I'm only reviewing to try to help somewhat ^^

ret = lib.curl_ws_recv(self._curl, buffer, n, n_recv, p_frame)
self._check_error(ret, "WS_RECV")
frame = p_frame[0]
# print(frame.offset, frame.bytesleft)
Contributor

nit: remove commented print
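
For context on the `frame.offset` and `frame.bytesleft` fields printed in the snippet under review: libcurl may deliver one WebSocket frame in several chunks, and `bytesleft` reports how much of the payload is still pending. The following is only a minimal sketch of reassembling such chunks, not the code from this PR. `Frame`, `recv_message`, and the simulated chunk source are hypothetical stand-ins and no real socket is involved.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Frame:
    """Hypothetical stand-in for the two curl_ws_frame fields used here."""

    offset: int     # position of this chunk within the frame payload
    bytesleft: int  # payload bytes still pending after this chunk


def recv_message(recv_chunk: Callable[[], Tuple[bytes, Frame]]) -> bytes:
    """Call recv_chunk() until a chunk reports bytesleft == 0, then join the parts."""
    chunks: List[bytes] = []
    while True:
        data, frame = recv_chunk()
        chunks.append(data)
        if frame.bytesleft == 0:
            break
    return b"".join(chunks)


# Simulated delivery of one 10-byte payload split into two chunks.
_pending = iter([
    (b"Hello ", Frame(offset=0, bytesleft=4)),
    (b"Foo!", Frame(offset=6, bytesleft=0)),
])
message = recv_message(lambda: next(_pending))
```

In this sketch the loop stops only when `bytesleft` reaches zero, so a caller never sees a partial payload.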

@T-256
Contributor

T-256 commented Dec 10, 2023

@T-256
Contributor

T-256 commented Dec 12, 2023

FYI I added win32 build for testing in my fork

@Snusmumr1000
Contributor

FYI, WebSocket support doesn't seem to work properly.

import asyncio

from curl_cffi import requests

# Synchronous session: send two messages and check each echoed reply.
with requests.Session() as s:
    w = s.connect("ws://localhost:8765")
    w.send(b"Foo")
    reply = w.recv()
    print(reply)
    assert reply == b"Hello Foo!"

    w.send(b"Bar")
    reply = w.recv()
    print(reply)

    assert reply == b"Hello Bar!"


# Asynchronous session: same exchange, with a pause before the second read.
async def async_examples():
    async with requests.AsyncSession() as s:
        w = await s.connect("ws://localhost:8765")
        await w.asend(b"Bar")
        reply = await w.arecv()
        print(reply)
        await w.asend(b"Test")
        # Use asyncio.sleep so the pause does not block the event loop.
        await asyncio.sleep(5)
        reply2 = await w.arecv()
        print(reply2)
        assert reply == b"Hello Bar!"
        assert reply2 == b"Hello Test!"


asyncio.run(async_examples())

Consider this test: there seem to be problems reading incoming WebSocket messages after the first one has been received.

@perklet
Collaborator Author

perklet commented Dec 20, 2023

@Snusmumr1000 That's why it's still a draft.

@Snusmumr1000
Contributor

@yifeikong ok, but that wasn't mentioned anywhere, so probably somebody will find this precaution useful.

README-zh.md Outdated

- chrome99
- chrome100
- chrome101
- chrome104
- chrome107
- chrome110
- chrome116
- chrome117

Currently curl-impersonate only supports Chrome up to version 116. Are we not worried about already offering support up to 120 when upstream doesn't handle it yet?

Ref. https://github.com/lwthiker/curl-impersonate?tab=readme-ov-file#supported-browsers

On that note, would it not make more sense to offer support for Firefox?

Collaborator Author

According to the usage table here, most users are on the latest versions of Chrome and Safari. Under a strict blocking strategy, it's reasonable for a site to simply block users on any older browser version.

116 is much newer than 110, but it does not make things significantly better, especially since their fingerprints are actually the same. The interesting part is 117, when ECH was added.

Actually, I have been working on this in my fork of curl-impersonate. Hopefully I can get it landed before Chrome 120 becomes mainstream. I've just been too busy with other stuff recently.
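
To make the "use the latest version" advice concrete, here is a minimal sketch that picks the newest Chrome target from a list of names like the one in the README diff above. The helper `latest_chrome_target` is hypothetical and not part of curl_cffi; it only parses the version number out of each name.

```python
import re
from typing import Iterable


def latest_chrome_target(targets: Iterable[str]) -> str:
    """Return the chromeNNN name with the highest numeric version."""
    versions = []
    for name in targets:
        match = re.fullmatch(r"chrome(\d+)", name)
        if match:
            versions.append((int(match.group(1)), name))
    if not versions:
        raise ValueError("no chrome targets found")
    # Compare numerically: a plain string sort would rank "chrome99" above "chrome117".
    return max(versions)[1]


targets = [
    "chrome99", "chrome100", "chrome101", "chrome104",
    "chrome107", "chrome110", "chrome116", "chrome117",
]
latest = latest_chrome_target(targets)
```

The resulting name could then be passed wherever an impersonate target is expected, instead of hard-coding `chrome110`.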

Collaborator Author

As for Firefox, it's really challenging to pack an additional .so file in a Python wheel. There are two options to work around this:

  1. Release another package, e.g. curl_cffi_ff, as suggested by one of our users
  2. Try to use BoringSSL (Chrome) to emulate NSS (Firefox)

At least one of them should work, I just haven't had time to try them out. Maybe I can experiment with them during the Chinese New Year.

I do agree that it would be nice to have Chrome 117+ available sooner, as this will help a lot with the more challenging sites. (I already saw that you are well on the way through the different versions on your fork.)

As for Firefox, probably the easier of the two options you mentioned would be to simply release a new package for Firefox curl, but this would require maintaining both packages simultaneously, which seems like a lot more effort on your part.

I would love to get closer to this project, although I am extremely new to it. If there are any smaller issues for me to explore and help out on, let me know and I will try to tackle them in my free time.

Contributor

I would be open to trying out the multiple package option. The opencv_python project builds 4 different packages (each with slightly different configurations) out of the same base repo, so I think it should be possible to minimize the maintenance overhead.

Contributor

A related option could be to factor out the ffi binding portion into its own standalone package, build Chrome and Firefox versions of that, and have curl_cffi import the bindings packages. This way, the requests/async interfaces that curl_cffi provides don't need to be duplicated.
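
A rough sketch of how the "import whichever bindings package is installed" idea could look, using only the standard library. The package names in `CANDIDATE_BINDINGS` are purely hypothetical, since no such packages exist as of this discussion, and `load_bindings` is an illustrative helper rather than anything in curl_cffi.

```python
import importlib
from types import ModuleType
from typing import Iterable

# Hypothetical names for separately built bindings packages.
CANDIDATE_BINDINGS = ("curl_cffi_bindings_chrome", "curl_cffi_bindings_firefox")


def load_bindings(candidates: Iterable[str] = CANDIDATE_BINDINGS) -> ModuleType:
    """Return the first importable module among candidates, in order."""
    tried = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            tried.append(name)
    raise ImportError("no bindings package found; tried: " + ", ".join(tried))
```

With something like this, the high-level requests and async interfaces would call `load_bindings()` once at import time and stay identical regardless of which build is installed.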

Contributor

IMO, simulating NSS on BoringSSL should take priority over maintaining multiple packages.
This may require some patches to BoringSSL, but I think it's worth investigating.

Collaborator Author

Try whatever you like. I'm open to merging both, since they don't actually conflict.

@perklet
Collaborator Author

perklet commented Dec 27, 2023

@Snusmumr1000 It seems that the local echo server implementation is behaving incorrectly. Our implementation is actually correct. See the latest commit for the fixed server and the updated tests and examples.

@perklet perklet merged commit 9d83c94 into main Dec 31, 2023
0 of 3 checks passed
@perklet perklet deleted the feature/impersonate-6.0 branch February 3, 2024 08:02
5 participants