Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Puppeteer issues #20179

Closed
oscarotero opened this issue Aug 16, 2023 · 8 comments · Fixed by #25387
Closed

Puppeteer issues #20179

oscarotero opened this issue Aug 16, 2023 · 8 comments · Fixed by #25387
Assignees
Labels
bug Something isn't working correctly node API polyfill Related to various "node:*" modules APIs node compat

Comments

@oscarotero
Copy link
Contributor

The post to announce the release of Deno 1.35 says that npm:puppeteer is now supported.

I've been doing some tests and found a couple of issues with this package. This is the code I'm using:

import puppeteer from "npm:[email protected]";
import { Cache } from "npm:@puppeteer/[email protected]";

const options = {
  browser: "chrome",
  cacheDirectory: Deno.cwd() + "/_puppeteer",
};

// Install the binary if it's not already installed
try {
  await Deno.stat(options.cacheDirectory);
} catch (error) {
  if (error instanceof Deno.errors.NotFound) {
    const command = new Deno.Command(Deno.execPath(), {
      args: ["run", "-A", "npm:[email protected]/install.js"],
      stdout: "inherit",
      env: {
        PUPPETEER_CACHE_DIR: options.cacheDirectory,
        PUPPETEER_PRODUCT: options.browser,
      },
    });

    await command.output();
  } else {
    throw error;
  }
}

// Search the binary path
const cache = new Cache(options.cacheDirectory);

const installedBrowser = cache.getInstalledBrowsers().find((info) => {
  return info.browser === options.browser;
});

if (!installedBrowser) {
  throw new Error(`Browser ${options.browser} not installed`);
}

// Launch a browser
const browser = await puppeteer.launch({
  product: options.product,
  headless: "new",
  executablePath: installedBrowser.executablePath,
});

// Open a page
const page = await browser.newPage();
await page.setContent(`
<html>
  <head>
    <title>Testing</title>
  </head>
  <body>
    <h1>Title</h1>
  </body>
</html>
`);

// Evaluate some code
const result = await page.evaluate(() => {
  return Array.from(document.querySelectorAll("title, h1")).map((el) =>
    el.outerHTML
  );
});
console.log(result);

// Close everything
page.close();
await browser.close();

After running this code, I see the following:

➜  puppeteer deno run -A main.js
Warning: Not implemented: ClientRequest.options.createConnection
[ "<title>Testing</title>", "<h1>Title</h1>" ]

and the script keeps running for 30 seconds or so and cannot be closed, even with Ctrl + C. I'm not sure if the missing support for ClientRequest.options.createConnection makes the script to keep alive for 30 seconds after finishing.

I've tried the same script in Node (with some modifications) and it everything run smoothly:

// main.mjs

import { cwd } from "node:process";
import puppeteer from "puppeteer";
import { Cache } from "@puppeteer/browsers";

const options = {
  browser: "chrome",
  cacheDirectory: cwd() + "/_puppeteer",
};

const cache = new Cache(options.cacheDirectory);

const installedBrowser = cache.getInstalledBrowsers().find((info) => {
  return info.browser === options.browser;
});

if (!installedBrowser) {
  throw new Error(`Browser ${options.browser} not installed`);
}

const browser = await puppeteer.launch({
  product: options.product,
  headless: "new",
  executablePath: installedBrowser.executablePath,
});


const page = await browser.newPage();
await page.setContent(`
<html>
  <head>
    <title>Testing</title>
  </head>
  <body>
    <h1>Title</h1>
  </body>
</html>
`);

const result = await page.evaluate(() => {
  return Array.from(document.querySelectorAll("title, h1")).map((el) =>
    el.outerHTML
  );
});
console.log(result);

await page.close();

await browser.close();
@bartlomieju bartlomieju added bug Something isn't working correctly node compat labels Aug 27, 2023
@3raser95
Copy link

3raser95 commented Nov 6, 2023

I'm trying to run a similar script with AWS Lambda and it also shows the Warning: Not implemented: ClientRequest.options.createConnection in the logs and then it just times out while trying to run the puppeteer.launch. I've tried pretty much everything I've been able to find on the internet but nothing seems to be able to fix this issue.

@jespertheend
Copy link
Contributor

I think this might be the same as #19507

@guillaume86
Copy link

guillaume86 commented Jan 31, 2024

I noticed the same issues, the warning is not preventing puppeteer to work, but the 30s delay in finishing the process is annoying.

Here's a test case which I think exposes the cause of the delay:

import puppeteer from "npm:[email protected]";

Deno.test("A", async (t) => {
  const browser = await puppeteer.launch({
    headless: "new",
    handleSIGHUP: Deno.build.os !== "windows"
  });

  try {
    await t.step("B", async () => {});
  } finally {
    await browser.close();
  }
});

which outputs:

PS > deno test -A --trace-ops
running 1 test from ./src/main_test.ts
A ...
------- output -------
Warning: Not implemented: ClientRequest.options.createConnection
----- output end -----
  B ... ok (1ms)
A ... FAILED (430ms)

 ERRORS 

A => ./src/main_test.ts:3:6
error: Leaking async ops:
  - 1 async operation to op_read was started in this test, but never completed. The operation was started here:
    at handleOpCallTracing (ext:core/00_infra.js:125:42)
    at Object.op_read (ext:core/00_infra.js:302:21)
    at TcpConn.read (ext:deno_net/01_net.js:136:26)
    at TCP.#read (ext:deno_node/internal_binding/stream_wrap.ts:222:44)
    at TCP.#read (ext:deno_node/internal_binding/stream_wrap.ts:250:17)
    at eventLoopTick (ext:core/01_core.js:63:7)
  - 1 async operation to sleep for a duration was started in this test, but never completed. This is often caused by not cancelling a `setTimeout` or `setInterval` call. The operation was started here:
    at handleOpCallTracing (ext:core/00_infra.js:125:42)
    at op_sleep (ext:core/00_infra.js:302:21)
    at runAfterTimeout (ext:deno_web/02_timers.js:234:20)
    at initializeTimer (ext:deno_web/02_timers.js:192:3)
    at setTimeout (ext:deno_web/02_timers.js:336:10)
    at Timeout.<computed> (ext:deno_node/internal/timers.mjs:67:7)
    at new Timeout (ext:deno_node/internal/timers.mjs:54:37)
    at setTimeout (node:timers:16:10)
    at WebSocket.close (file:///C:/Users/guillaume.lecomte/AppData/Local/deno/npm/registry.npmjs.org/ws/8.16.0/lib/websocket.js:328:24)
    at NodeWebSocketTransport.close (file:///C:/Users/guillaume.lecomte/AppData/Local/deno/npm/registry.npmjs.org/puppeteer-core/21.10.0/lib/esm/puppeteer/node/NodeWebSocketTransport.js:51:18)

 FAILURES 

A => ./src/main_test.ts:3:6

FAILED | 0 passed (1 step) | 1 failed (447ms)

error: Test failed

@guillaume86
Copy link

Thanks to #20868 (comment) ,
I can confirm that the issue originates from the npm ws package, and just replacing it with the Deno standard WebSocket fixes the issues (both the warning and exit delay).

Now what would be the proper way to get this fixed upstream? Is there a mecanism for Deno specific overrides? Or the only solution is a fork? @dsherret

@marvinhagemeister
Copy link
Contributor

Related #19507

@kt3k
Copy link
Member

kt3k commented Aug 30, 2024

Also seems related to puppeteer/puppeteer#11839

The snippet given in the above issue (which patches NodeWebSocketTransport.create) seems working around this issue.

@kt3k
Copy link
Member

kt3k commented Aug 30, 2024

The below script reproduces the situation where ws.close() takes 30 seconds before finishing only using ws package.

import WebSocket from "npm:[email protected]";

const server = `
  Deno.serve({ port: 3000 }, (req) => {
    if (req.headers.get("upgrade") != "websocket") {
      return new Response(null, { status: 501 });
    }
    const { socket, response } = Deno.upgradeWebSocket(req);
    socket.addEventListener("message", (event) => {
      socket.send("pong");
    });
    return response;
  });
`;

const cmd = new Deno.Command("deno", { args: ["eval", server] });

const p = cmd.spawn();

setTimeout(() => {
  const ws = new WebSocket("ws://localhost:3000");
  ws.addEventListener("message", async (ev) => {
    console.log("received", ev.data);
    console.log("killing the server");
    p.kill();
    setTimeout(() => {
      console.log("closing client websocket");
      ws.close();
    }, 1000);
  });
  ws.addEventListener("open", () => {
    console.log("sending ping")
    ws.send("ping");
  });
}, 2000);

ws.close() tries to send some data to the peer, but if the peer is already gone, that request seems stuck (the resource used is fetchUpgradedStream).

ws package set 30 seconds timeout for that call here https://github.com/websockets/ws/blob/019f28ff1ffddfcdc428d1de5ecd98648057a2ab/lib/websocket.js#L332
That finishes the stalled write request after 30 sec

@kt3k
Copy link
Member

kt3k commented Sep 3, 2024

Looks like we don't propagate the closed state of hyper::upgrade::Upgraded to the paired tokio::io::DuplexStream in op_node_http_fetch_response_upgrade, and that seems causing the Deno.TcpConn/net.Socket writable even after the actual websocket connection is closed. I think I found a fix for it. I'll send a PR soon.

kt3k added a commit that referenced this issue Sep 5, 2024
…ion is closed (#25387)

This change fixes the handling of upgraded socket from `node:http` module.

In `op_node_http_fetch_response_upgrade`, we create DuplexStream paired
with `hyper::upgrade::Upgraded`. When the connection is closed from the
server, the read result from `Upgraded` becomes 0. However because we
don't close the paired DuplexStream at that point, the Socket object in
JS side keeps alive even after the server closed. That caused the issue
#20179

This change fixes it by closing the paired DuplexStream when the
`Upgraded` stream returns 0 read result.

closes #20179
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working correctly node API polyfill Related to various "node:*" modules APIs node compat
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants