This project provides basic sanity checks for HTML files.
It is helpful for HTML generated from Asciidoctor, Markdown or other formats, as converters usually don’t check for missing images or broken links.
It can be used as a Gradle plugin. A standalone Java version and a graphical UI are planned for future releases.
Use the following snippet inside a Gradle build file:
buildscript {
    repositories {
        jcenter()
    }
    dependencies {
        classpath 'org.aim42:HtmlSanityCheck-gradle-plugin:0.8.0-SNAPSHOT'
    }
}
apply plugin: 'org.aim42.HtmlSanityCheck-gradle-plugin'
The plugin adds a new task named htmlSanityCheck.
This task exposes a few properties as part of its configuration:
sourceDir |
(mandatory) directory where the html files are located. Type: File. Default: |
sourceDocuments |
(optional) an override to process several source files, which may be a subset of all
files available in |
checkingResultsDir |
(optional) directory where the checking results are written to.
Defaults to |
checkExternalLinks |
(optional, planned) if set to "true", external references are checked too.
Defaults to |
apply plugin: 'org.aim42.HtmlSanityCheck-gradle-plugin'
htmlSanityCheck {
    sourceDir = new File( "$buildDir/docs" )
    // files to check - in Set-notation
    sourceDocuments = [ "one-file.html", "another-file.html", "index.html"]
    // where to put results of sanityChecks...
    checkingResultsDir = new File( "$buildDir/report/htmlchecks" )
    checkExternalLinks = false
}
Finds all '<a href="XYZ">' where XYZ is not defined.
<a href="#missing">internal anchor</a>
...
<h2 id="missinG">Bookmark-Header</h2>
In this example, the bookmark is misspelled.
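The check described above can be illustrated with a minimal sketch. It is not the plugin's implementation: the plugin relies on the Jsoup parser, while this example swaps in plain regular expressions from the Java standard library so that it stays self-contained. The class name `BrokenInternalLinkSketch` and the method `findBrokenLinks` are hypothetical, and the sketch assumes double-quoted attributes and ignores anchors defined via `name`.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the plugin.
final class BrokenInternalLinkSketch {

    // Captures the target of href="#target" (assumes double-quoted attributes).
    private static final Pattern LOCAL_HREF = Pattern.compile("href=\"#([^\"]+)\"");
    // Captures the value of id="value"; the leading whitespace avoids matching e.g. data-id.
    private static final Pattern ID_ATTR = Pattern.compile("\\sid=\"([^\"]+)\"");

    // Returns every local link target that has no element with a matching id.
    static List<String> findBrokenLinks(String html) {
        Set<String> ids = new HashSet<>();
        Matcher idMatcher = ID_ATTR.matcher(html);
        while (idMatcher.find()) {
            ids.add(idMatcher.group(1));
        }

        List<String> broken = new ArrayList<>();
        Matcher hrefMatcher = LOCAL_HREF.matcher(html);
        while (hrefMatcher.find()) {
            String target = hrefMatcher.group(1);
            if (!ids.contains(target)) { // comparison is case-sensitive
                broken.add(target);
            }
        }
        return broken;
    }

    public static void main(String[] args) {
        String html = "<a href=\"#missing\">internal anchor</a>"
                    + "<h2 id=\"missinG\">Bookmark-Header</h2>";
        System.out.println(findBrokenLinks(html)); // prints [missing]
    }
}
```

Because the comparison is case-sensitive, `missing` and `missinG` do not match, which is exactly the misspelling shown in the example.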
Images, referenced in '<img src="XYZ"…' tags, refer to external files. The existence of these files is checked by the plugin.
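As a rough illustration of this check, the following sketch collects the `src` values of image tags and tests whether each one exists relative to a base directory. It is an assumption-laden example, not the plugin's code: it uses regular expressions instead of Jsoup, skips `http`, `https` and `data` sources, and does not decode URL escapes. `MissingImageSketch` and `findMissingImages` are hypothetical names.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the plugin.
final class MissingImageSketch {

    // Captures the src value of an <img> tag (assumes double-quoted attributes).
    private static final Pattern IMG_SRC =
            Pattern.compile("<img\\b[^>]*?\\ssrc=\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // Returns all local image references that do not exist below baseDir.
    static List<String> findMissingImages(String html, Path baseDir) {
        List<String> missing = new ArrayList<>();
        Matcher matcher = IMG_SRC.matcher(html);
        while (matcher.find()) {
            String src = matcher.group(1);
            // Remote and inline images are out of scope for a local file check.
            if (src.startsWith("http://") || src.startsWith("https://") || src.startsWith("data:")) {
                continue;
            }
            if (!Files.exists(baseDir.resolve(src))) {
                missing.add(src);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Checks a sample snippet against the current working directory.
        String html = "<img src=\"images/logo.png\" alt=\"logo\">";
        System.out.println(findMissingImages(html, Paths.get(".")));
    }
}
```

Resolving each reference against the directory of the HTML file mirrors how a browser would look up a relative image path.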
If any bookmark (id) is defined more than once, any anchor linking to it will be confused :-)
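A minimal sketch of how repeated definitions could be detected: collect every `id` value and report those seen more than once. This is an illustration under assumptions (regex-based, double-quoted attributes), not the plugin's Jsoup-based implementation. `DuplicateIdSketch` and `findDuplicateIds` are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the plugin.
final class DuplicateIdSketch {

    // Captures the value of id="value"; the leading whitespace avoids matching e.g. data-id.
    private static final Pattern ID_ATTR = Pattern.compile("\\sid=\"([^\"]+)\"");

    // Returns the ids that occur more than once, in sorted order.
    static Set<String> findDuplicateIds(String html) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new TreeSet<>();
        Matcher matcher = ID_ATTR.matcher(html);
        while (matcher.find()) {
            String id = matcher.group(1);
            if (!seen.add(id)) { // add() returns false if the id was already present
                duplicates.add(id);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        String html = "<h2 id=\"intro\">A</h2><h2 id=\"intro\">B</h2><h2 id=\"usage\">C</h2>";
        System.out.println(findDuplicateIds(html)); // prints [intro]
    }
}
```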
Image-tags should contain an alt-attribute that the browser displays when the original image file cannot be found or cannot be rendered. Having alt-attributes is good and defensive style.
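One way such a check might look, sketched with the Java standard library only: find every image tag and report those without an `alt` attribute. The plugin itself builds on Jsoup, so this regex-based version is a stand-in, and it assumes attribute values contain no `>` character. `MissingAltSketch` and `findImagesWithoutAlt` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the plugin.
final class MissingAltSketch {

    // Matches a whole <img ...> tag (assumes no '>' inside attribute values).
    private static final Pattern IMG_TAG =
            Pattern.compile("<img\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // Matches an alt attribute inside a tag.
    private static final Pattern ALT_ATTR =
            Pattern.compile("\\salt\\s*=", Pattern.CASE_INSENSITIVE);

    // Returns the image tags that carry no alt attribute.
    static List<String> findImagesWithoutAlt(String html) {
        List<String> withoutAlt = new ArrayList<>();
        Matcher matcher = IMG_TAG.matcher(html);
        while (matcher.find()) {
            String tag = matcher.group();
            if (!ALT_ATTR.matcher(tag).find()) {
                withoutAlt.add(tag);
            }
        }
        return withoutAlt;
    }

    public static void main(String[] args) {
        String html = "<img src=\"a.png\" alt=\"logo\"><img src=\"b.png\">";
        System.out.println(findImagesWithoutAlt(html)); // prints [<img src="b.png">]
    }
}
```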
In addition to checking HTML, this project serves as an example for arc42.
Please see our software architecture documentation.
This tiny piece rests on incredible groundwork:
- Jsoup HTML parser and analysis toolkit - robust and easy-to-use.
- IntelliJ IDEA - my (Gernot) best (programming) friend.
- Of course, Groovy, Gradle, JUnit and Spockframework.
- The plugin heavily relies on code provided by the Gradle project.
- Inspiration on code organization, implementation and testing of the plugin came from the Asciidoctor-Gradle-Plugin by @AAlmiray.
- Code for string similarity calculation by Ralph Rice.
- Initial implementation, maintenance and documentation by Gernot Starke.
Several sources provided help during development:
- The code4reference tutorial on Gradle custom plugins, part 1 and part 2.
- Of course, the Jsoup API documentation.
Please report issues or suggestions.
Want to improve the plugin? Fork our repository and send a pull request.