Headless/full Java browser with support for downloading files, working with cookies, retrieving HTML and simulating real user input. Powered by Node.js with Puppeteer and/or Playwright. The main focus is on ease of use and high-level methods. Add it to your project with Maven/Gradle/Sbt/Leiningen (Java 8 or higher required).
try(PlaywrightWindow window = HB.newWin()){
window.load("https://example.com");
// ...
}
All examples here.
Note that the first run may take a bit because Node.js and its modules get installed into your current working dir under ./headless-browser.
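To tell in advance whether a run is likely to trigger that installation, you can check for the directory yourself. The following is a minimal sketch that only assumes the `./headless-browser` location mentioned above; the class `FirstRunCheck` and its method are hypothetical and not part of this library, and an existing directory does not guarantee that the installation completed.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class FirstRunCheck {

    // Returns true if a "headless-browser" directory exists inside the given working dir.
    // Assumption: its presence hints that Node.js was already installed by an earlier run.
    public static boolean installDirExists(Path workingDir) {
        return Files.isDirectory(workingDir.resolve("headless-browser"));
    }

    public static void main(String[] args) {
        // An empty relative path resolves to the current working directory.
        Path cwd = Paths.get("").toAbsolutePath();
        if (installDirExists(cwd)) {
            System.out.println("Found ./headless-browser, the install step was probably done before.");
        } else {
            System.out.println("No ./headless-browser yet, expect the first run to take longer.");
        }
    }
}
```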
On newer Playwright versions you might need to install additional dependencies manually on your machine, which requires root permissions. Normally those dependencies are pre-installed. If they are missing, you will notice because an exception is thrown. For details, print the debug output to System.out.
cd ./headless-browser/node-js/node-js-working-dir && ../node-js-installation/bin/npx playwright install-deps
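The same step can be launched from Java with `ProcessBuilder`. This is a minimal sketch, assuming the directory layout named in the command above (the `node-js-working-dir` and `node-js-installation/bin/npx` paths under `./headless-browser/node-js`); the class `PlaywrightDeps` and its methods are hypothetical helpers, not part of this library. Absolute paths are used so that the npx location does not depend on the process working directory.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class PlaywrightDeps {

    // Builds the argument list for "npx playwright install-deps",
    // using the bundled npx below the given project directory.
    public static List<String> command(Path projectDir) {
        Path npx = projectDir
                .resolve("headless-browser/node-js/node-js-installation/bin/npx")
                .toAbsolutePath().normalize();
        return Arrays.asList(npx.toString(), "playwright", "install-deps");
    }

    // Prepares the process so it runs inside the Node.js working directory.
    public static ProcessBuilder processBuilder(Path projectDir) {
        Path workingDir = projectDir
                .resolve("headless-browser/node-js/node-js-working-dir")
                .toAbsolutePath().normalize();
        ProcessBuilder pb = new ProcessBuilder(command(projectDir));
        pb.directory(workingDir.toFile());
        pb.inheritIO(); // forward npx output to this process' console
        return pb;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Assumption: the project directory is the current dir unless passed as first argument.
        Path projectDir = Paths.get(args.length > 0 ? args[0] : ".");
        int exitCode = processBuilder(projectDir).start().waitFor();
        System.out.println("install-deps exited with code " + exitCode);
    }
}
```

Since installing system dependencies requires root permissions, the Java process itself would have to be started with sufficient rights for this to succeed.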
- High-level methods for:
- downloading files.
- working with cookies.
- retrieving HTML.
- simulating real user input.
- Integrated evasions for headless detection:
HB.newWinBuilder().headless(true).makeUndetectable(true)...
- Easy access to Node.js from within Java:
new NodeContext().executeJavaScript("console.log('Hello!');");
- HTML handling via Jsoup and JSON with Gson.
How good are the evasions?
try (PlaywrightWindow w = HB.newWinBuilder()
.headless(true).makeUndetectable(true).buildPlaywrightWindow())
{
w.load("https://infosimples.github.io/detect-headless/");
w.makeScreenshot(new File("screenshot.png"), true);
}
catch (Exception e) {e.printStackTrace();}
Last checked 18.06.2024.
Playwright is the default and recommended browser driver, because it supports downloads and more of its features have been ported to Java. Check out JG-Browser for a browser written completely in Java.
Name | JS-Engine | Downloads |
---|---|---|
Playwright | Node.js/V8 | Yes |
Puppeteer | Node.js/V8 | No |
You can find their versions in this class, which also allows you to set custom versions. (JS = JavaScript; Downloads = whether the browser is able to download files other than HTML/XML/PDF.)
If you have never contributed before, we recommend this Beginners Article. If you are planning to make big changes, create an issue first and explain what you want to do. Thank you in advance for every contribution! If you don't know how to import a GitHub project, check out this guide: IntelliJ IDEA Cloning Guide.
- Written in Java, with JDK 8, inside of IntelliJ IDEA
- Built with Maven, goals: clean package
Name/Link | Usage | License |
---|---|---|
Playwright | Emulates different types of browsers | License |
Puppeteer | Emulates different types of browsers | License |
Node.js | Enables executing JavaScript code | License |
Jsoup | Used to load pages and modify their HTML code | License |