
Osiris-Team/HBrowser


HBrowser

Headless/full Java browser with support for downloading files, working with cookies, retrieving HTML and simulating real user input, built on Node.js with Puppeteer and/or Playwright. The main focus is on ease of use and high-level methods. Add this to your project with Maven/Gradle/Sbt/Leiningen (Java 8 or higher required).

try (PlaywrightWindow window = HB.newWin()) {
    window.load("https://example.com");
    // ...
}

All examples here. Note that the first run may take a while, because Node.js and its modules are installed into your current working directory under ./headless-browser.
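Because the installation lands in ./headless-browser inside the current working directory, a quick check for that directory can hint at whether the next start will trigger the installation. The following is a minimal sketch using only the Java standard library. The class FirstRunCheck and its methods are hypothetical helpers, not part of HBrowser, and the presence of the directory alone does not guarantee that the installation is complete.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of HBrowser.
public class FirstRunCheck {
    // Directory name taken from the note above.
    public static final String INSTALL_DIR_NAME = "headless-browser";

    // Resolves the expected installation directory inside the given working directory.
    public static Path installDir(Path workingDir) {
        return workingDir.resolve(INSTALL_DIR_NAME);
    }

    // True if the directory exists; this is only a hint, not proof of a complete install.
    public static boolean looksInstalled(Path workingDir) {
        return Files.isDirectory(installDir(workingDir));
    }

    public static void main(String[] args) {
        Path cwd = Paths.get("").toAbsolutePath();
        if (looksInstalled(cwd)) {
            System.out.println("Found " + installDir(cwd));
        } else {
            System.out.println("No " + installDir(cwd) + " yet; the first run will likely take longer.");
        }
    }
}
```

Running this from the directory in which your application starts gives the same working directory that the note above refers to.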

Important

On newer Playwright versions you might need to install additional dependencies manually on your machine, which requires root permissions. Normally those dependencies are pre-installed; if they are missing, you will notice because an exception is thrown. For details, print the debug output to System.out.

cd ./headless-browser/node-js/node-js-working-dir && ../node-js-installation/bin/npx playwright install-deps

Features

  • High-Level methods for...
    • downloading files.
    • working with cookies.
    • retrieving HTML.
    • simulating real user input.
  • Integrated evasions for headless detection: HB.newWinBuilder().headless(true).makeUndetectable(true)...
  • Easy access to Node.js from within Java: new NodeContext().executeJavaScript("console.log('Hello!');");
  • HTML handling via Jsoup and JSON with Gson.

How good are the evasions?


try (PlaywrightWindow w = HB.newWinBuilder()
        .headless(true).makeUndetectable(true).buildPlaywrightWindow()) {
    // Load a headless-detection test page and save a screenshot of the result.
    w.load("https://infosimples.github.io/detect-headless/");
    w.makeScreenshot(new File("screenshot.png"), true);
} catch (Exception e) {
    e.printStackTrace();
}

Last checked 18.06.2024.

Drivers

Playwright is the default and recommended browser driver, since it supports downloads and more of its features have been ported to Java. Check out JG-Browser for a browser written completely in Java.

| Name       | JS-Engine  | Downloads |
|------------|------------|-----------|
| Playwright | Node.js/V8 | Yes       |
| Puppeteer  | Node.js/V8 | No        |

You can find their versions in this class, which also allows you to set custom versions. (JS = JavaScript; Downloads = whether the browser is able to download files other than HTML/XML/PDF.)

Contribute/Build

Beginners

If you have never contributed before, we recommend this Beginners Article. If you are planning to make big changes, create an issue first, where you explain what you want to do. Thank you in advance for every contribution! If you don't know how to import a GitHub project, check out this guide: IntelliJ IDEA Cloning Guide

Build-Details

Libraries

| Name/Link  | Usage                                         | License |
|------------|-----------------------------------------------|---------|
| Playwright | Emulates different types of browsers          | License |
| Puppeteer  | Emulates different types of browsers          | License |
| Node.js    | Enables executing JavaScript code             | License |
| Jsoup      | Used to load pages and modify their HTML code | License |
