Extract key-value metadata from HTML comments
In the world of printed books, front matter is the stuff at the beginning of the book like the title page, foreword, preface, table of contents, etc. In the world of computer programming, frontmatter is metadata at the top of a file. The term was (probably) popularized by the Jekyll static site generator.
Unlike YAML frontmatter though, HTML frontmatter lives inside plain old HTML comments, so it will be quietly ignored by tools/browsers that don't know about it.
Download node at nodejs.org and install it, if you haven't already.
npm install html-frontmatter --save
Given an HTML or Markdown file that looks like this:
<!--
title: GitHub Integration
keywords: github, git, npm, enterprise
published: 2014-10-02
description: npmE works with GitHub!
-->
<h1>Hello World</h1>
And code like this:
var fm = require('html-frontmatter')
var frontmatter = fm(fs.readFileSync('github.md', 'utf-8'))
Here's what you'll get:
{
title: "GitHub Integration",
keywords: "github, git, npm, enterprise",
published: "Wed Oct 01 2014 17:00:00 GMT-0700 (PDT)",
description: "npmE works with GitHub!"
}
If you have a long string (like a description) and want it to span multiple lines, simply indent each subsequent line with 2 or more spaces:
<!--
description: This is a long string that
wraps and goes on forever
and wraps some more
-->
Your values can contain colons. No worries.
<!--
title: How I roll: or, the life of a wheel
-->
Your values can include shallow arrays
<!--
title: This post has tags
tags: [100, this is a string, true]
-->
Is equivalent to:
<!--
title: This post has tags
tags: [
100,
this is a string,
true
]
# note: the closing bracket of an array must be indented by 2 spaces or more
-->
And will return:
{
title: "This post has tags",
tags: [100, 'this is a string', true]
}
- Boolean "true" and "false" strings are converted to Boolean.
- Numeric strings are converted to Number.
- Strings in YMD-ish format are converted to Date objects.
html-frontmatter exposes the regular expression it uses to detect presence
of frontmatter as a property named pattern
. You can use it to
conditionally parse frontmatter:
var fm = require('html-frontmatter')
var content = "A string that doesn't have frontmatter in it"
if (content.match(fm.pattern)) {
// nope
}
npm install
npm test
# ✓ extracts metadata from colon-delimited comments at the top of an HTML string
# ✓ returns null if frontmatter is not found
# ✓ handles values that contain colons
# ✓ handles line-wrapped values
# ✓ cleans up excess whitespace
# ✓ ignores comments that are not at the top of the file
# ✓ allows newlines before comments
# ✓ ignores comment lines starting with hashes (#)
# ✓ allows single-line comments
# ✓ does not include additional comments
# ✓ coerces boolean strings into Booleans
# ✓ coerces numeric strings into Numbers
# ✓ coerces YMD-ish date strings into Dates
# ✓ exposes its regex pattern as `pattern`
# ✓ handles missing right-hand-value
# ✓ handles shallow arrays