Refactor CDP #422

Open: karlseguin wants to merge 16 commits into main

karlseguin (Contributor) commented Feb 12, 2025

This includes / is built on top of #408

CDP is now a struct, with a cleaner separation between Server, Client and CDP. Server <-> Client <-> CDP -> Browser. The CDP instance is the state.

Optimized how CDP results and events are sent to the client: only a single allocation is now required.

Each CDP message is now processed with an arena, which is re-used for each message. Small ad-hoc message-specific allocations, like the aux_data, should now be more efficient.
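
A minimal sketch of the per-message arena idea, assuming a hypothetical `MessageLoop` struct and `handle` function (neither name comes from this PR). The arena is reset at the start of each message while keeping its capacity, so scratch allocations for later messages can reuse memory the arena already holds.

```zig
const std = @import("std");

/// Hypothetical per-connection state; not the PR's actual CDP struct.
const MessageLoop = struct {
    message_arena: std.heap.ArenaAllocator,

    fn init(backing: std.mem.Allocator) MessageLoop {
        return .{ .message_arena = std.heap.ArenaAllocator.init(backing) };
    }

    fn deinit(self: *MessageLoop) void {
        self.message_arena.deinit();
    }

    /// Handles one message. Everything allocated from the arena here is only
    /// valid until the next call, which resets the arena but keeps its capacity.
    fn handle(self: *MessageLoop, raw: []const u8) !usize {
        _ = self.message_arena.reset(.retain_capacity);
        const arena = self.message_arena.allocator();

        // Stand-in for message-specific scratch data such as aux_data.
        const scratch = try std.fmt.allocPrint(arena, "{{\"echo\":\"{s}\"}}", .{raw});
        return scratch.len;
    }
};

pub fn main() !void {
    var loop = MessageLoop.init(std.heap.page_allocator);
    defer loop.deinit();

    for ([_][]const u8{ "Page.enable", "DOM.enable" }) |msg| {
        const n = try loop.handle(msg);
        std.debug.print("{s}: {d} scratch bytes\n", .{ msg, n });
    }
}
```

Nothing allocated from the arena needs an individual free; the trade-off is that no pointer into it may outlive the message that produced it.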

Tried to remove some panics and uses of undefined, or at least reduce the time that a field is undefined. Relatedly, tried to remove the need for stable pointers (i.e. Browser.init now returns a Browser, rather than needing to be given a *Browser).

Improved parsing of CDP messages from the client.

Removed the newly added browser.currentPage(), since the CDP already has direct access to the session. (browser.currentPage() would make it harder to implement the "TODO allow multiple sessions per browser.")

I failed to get any unit test coverage for CDP, but I think it's important to do, as more and more logic is being added to it. Part of the blocker is global types. I'd consider introducing a global mock Browser/Session:

const Browser = if (builtin.is_test) MockBrowser else @import("../browser/browser.zig").Browser;

But I haven't given up hope on being able to quickly start a lightweight jsruntime.Env in unit tests.

To get a sense of the changes, I think you can look at dom.discardSearchResults which shows:
1 - how to get typed params
2 - how to access / modify the state
3 - how to write a result

Adding HTTP & websocket awareness to the TCP server.

HTTP server handles `GET /json/version` and websocket upgrade requests.

Conceptually, websocket handling is the same code as before, but receiving
data will parse the websocket frames and writing data will wrap it in
a websocket frame.
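
As an illustration of the "wrap it in a websocket frame" step, here is a minimal sketch of building a server-to-client frame header as laid out in RFC 6455. The function name `writeTextFrameHeader` is hypothetical and this is not the PR's writer; it assumes a single unfragmented text frame and relies on the fact that server frames are not masked.

```zig
const std = @import("std");

/// Writes an unmasked, final (FIN=1) text-frame header for a payload of
/// `payload_len` bytes into `buf` and returns the part of `buf` that was used.
/// Hypothetical helper, for illustration only.
fn writeTextFrameHeader(buf: *[10]u8, payload_len: usize) []const u8 {
    buf[0] = 0x80 | 0x01; // FIN bit + text opcode
    if (payload_len <= 125) {
        // Short payload: the length fits in the second byte.
        buf[1] = @intCast(payload_len);
        return buf[0..2];
    }
    if (payload_len <= 65535) {
        // Marker 126 followed by a 16-bit big-endian length.
        buf[1] = 126;
        std.mem.writeInt(u16, buf[2..4], @intCast(payload_len), .big);
        return buf[0..4];
    }
    // Marker 127 followed by a 64-bit big-endian length.
    buf[1] = 127;
    std.mem.writeInt(u64, buf[2..10], @intCast(payload_len), .big);
    return buf[0..10];
}

pub fn main() void {
    var buf: [10]u8 = undefined;
    const header = writeTextFrameHeader(&buf, 300);
    std.debug.print("{d} header bytes\n", .{header.len});
}
```

The payload itself would follow the header unchanged. Frames sent by the client are masked, so the receiving side additionally has to read a 4-byte masking key and XOR the payload with it.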

The previous `Ctx` was split into a `Server` and a `Client`. This was
largely done to make it easy to write unit tests, since the `Client` is
a generic, all its dependencies (i.e. the server) can be mocked out. This
also makes it a bit nicer to know if there is or isn't a client (via the
server's client optional).

Added a MemoryPool for the Send object (I thought that was a nice touch!)
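
For readers unfamiliar with `std.heap.MemoryPool`, a small sketch of the pattern follows. The `Send` struct here is a hypothetical stand-in with made-up fields, not the PR's actual type; the point is only that fixed-size objects are created from and returned to a pool rather than a general-purpose allocator.

```zig
const std = @import("std");

/// Hypothetical stand-in for the PR's Send object; the fields are assumptions.
const Send = struct {
    data: []const u8,
    sent: usize = 0,
};

pub fn main() !void {
    var pool = std.heap.MemoryPool(Send).init(std.heap.page_allocator);
    defer pool.deinit();

    // Take a slot for an outgoing message, then return it once "sent".
    const first = try pool.create();
    first.* = .{ .data = "{\"id\":1}" };
    const first_addr = @intFromPtr(first);
    pool.destroy(first);

    // A destroyed slot becomes available to later create() calls.
    const second = try pool.create();
    defer pool.destroy(second);
    second.* = .{ .data = "{\"id\":2}" };

    std.debug.print("slot reused: {}\n", .{@intFromPtr(second) == first_addr});
}
```

This suits a server that creates one short-lived object per write: the per-message cost is a free-list push and pop rather than a round trip through the backing allocator.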

Removed the macOS hack on accept/conn completion usage.

Known issues:
- When framing an outgoing message, the entire message has to be duped. This
is no worse than how it was before, but it should be possible to eliminate
this in the future. Probably not part of this PR.

- Websocket parsing will reject continuation frames. I don't know of a single
client that will send a fragmented message (websocket has its own
message fragmentation), but we should probably still support this just in
case.

- I don't think the receive, timeout and close completions can safely be
re-used like we're doing. I believe they need to be associated with a specific
client socket.

- A new connection creates a new browser session. I think this is right (??),
but for the very first connection, we're throwing out a perfectly usable session. I'm
thinking this might require a change to how Browser/Sessions work.

- zig build test won't compile. This branch reproduces the issue with none
of these changes:
https://github.com/karlseguin/browser/tree/broken_test_build

(or, as a diff to main):
lightpanda-io/browser@main...karlseguin:broken_test_build
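
To make the continuation-frame limitation in the list above concrete, here is a hedged sketch of how a parser might classify the first byte of a frame and reject fragmented messages. `classifyFirstByte`, `Opcode` and `FrameError` are hypothetical names, and this is not the PR's reader. It assumes the RFC 6455 layout: the high bit is FIN, the low four bits are the opcode, and opcode 0 is a continuation frame.

```zig
const std = @import("std");

const FrameError = error{ FragmentedMessage, UnsupportedOpcode };
const Opcode = enum { text, binary, close, ping, pong };

/// Classifies the first byte of a websocket frame. Anything that is part of a
/// fragmented message (FIN=0, or a continuation opcode) is rejected.
fn classifyFirstByte(byte0: u8) FrameError!Opcode {
    const fin = (byte0 & 0x80) != 0;
    const opcode = byte0 & 0x0F;
    if (opcode == 0x0 or !fin) return error.FragmentedMessage;
    return switch (opcode) {
        0x1 => Opcode.text,
        0x2 => Opcode.binary,
        0x8 => Opcode.close,
        0x9 => Opcode.ping,
        0xA => Opcode.pong,
        else => error.UnsupportedOpcode,
    };
}

pub fn main() void {
    const samples = [_]u8{ 0x81, 0x01, 0x80 };
    for (samples) |b| {
        if (classifyFirstByte(b)) |op| {
            std.debug.print("0x{x}: {s}\n", .{ b, @tagName(op) });
        } else |err| {
            std.debug.print("0x{x}: rejected ({s})\n", .{ b, @errorName(err) });
        }
    }
}
```

Supporting fragmentation would mean buffering the payloads of a FIN=0 frame and its continuation frames until a frame with FIN=1 arrives, rather than returning an error.
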

Move more logic into the reader. Avoid copying partial messages in
cases where we know that the buffer is large enough.

This is mostly groundwork for trying to add support for continuation
frames.

CDP is now a struct which contains its own state: a browser and a session.

When a client connection is made and successfully upgrades, the client creates
the CDP instance. There is now a cleaner separation between Server, Client and
CDP.

Removed a number of allocations, especially when writing results/events from
CDP to the client. Improved input message parsing. Tried to remove some usage
of undefined.

@francisbouvier (Member)

Hi @karlseguin, first of all thanks for this PR, it's impressive.

It seems that there is a regression (leak) in memory usage. When I launch 1000 runs of the puppeteer demo on my MBA M2, looking at Activity Monitor I have:

  • 190MB with this PR (release mode)
  • 14MB on main (release mode)

karlseguin (Contributor, Author) commented Feb 17, 2025

> It seems that there is a regression (leak) on memory usage. When I launch 1000 runs of the puppeteer demo on my MBA M2, looking Activity Monitor I have:

In the page CDP code, I was creating a copy of the page, thus copying the arena, which causes issues. I had removed session.currentPage(). I've re-added it, as it returns &self.page, which helps avoid this issue.
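
A simplified sketch of why copying a struct that owns an arena is a problem, using hypothetical `Page` and `Session` types that are far smaller than the real ones. A by-value copy carries its own snapshot of the arena's bookkeeping, so memory allocated through the copy is invisible to the original and is not released when the original is deinitialized. Returning a pointer, as `currentPage()` does with `&self.page`, keeps a single owner.

```zig
const std = @import("std");

/// Hypothetical, heavily reduced versions of the real types.
const Page = struct {
    arena: std.heap.ArenaAllocator,
};

const Session = struct {
    page: Page,

    fn currentPage(self: *Session) *Page {
        return &self.page;
    }
};

pub fn main() !void {
    var session = Session{
        .page = .{ .arena = std.heap.ArenaAllocator.init(std.heap.page_allocator) },
    };
    defer session.page.arena.deinit();

    // By-value copy: the copy has its own snapshot of the arena state.
    var copy = session.page;
    _ = try copy.arena.allocator().alloc(u8, 64);
    std.debug.print("owner tracks {d} bytes after the copy allocates\n", .{
        session.page.arena.queryCapacity(),
    });
    copy.arena.deinit(); // without this, the copy's memory would leak

    // Pointer: allocations are tracked by the one arena the session owns.
    const page = session.currentPage();
    _ = try page.arena.allocator().alloc(u8, 64);
    std.debug.print("owner tracks {d} bytes after the pointer allocates\n", .{
        session.page.arena.queryCapacity(),
    });
}
```

In this sketch the copy is taken while the arena is still empty, which keeps the example safe to run; copying an arena that already holds buffers would also let two owners free the same memory.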

There's still a leak on Mac from the cancel callback not being called and these allocations not being freed, but this predates the PR.

Fixes issue where CDP closes the client, but client still registers a recv
operation.

@francisbouvier (Member)

Great, that fixes the leak, thanks.

I'm reviewing the code itself but it's a pretty big PR so it takes some time :)
