diff --git a/README.md b/README.md index b2147d04e..d3691cb04 100644 --- a/README.md +++ b/README.md @@ -45,6 +45,10 @@ Table of Contents [![npm](https://img.shields.io/npm/dw/@adobe/helix-pipeline.svg)](https://www.npmjs.com/package/@adobe/helix-pipeline) [![Greenkeeper badge](https://badges.greenkeeper.io/adobe/helix-pipeline.svg)](https://greenkeeper.io/) [![Known Vulnerabilities](https://snyk.io/test/github/adobe/hypermedia-pipeline/badge.svg?targetFile=package.json)](https://snyk.io/test/github/adobe/hypermedia-pipeline?targetFile=package.json) +## Helix Markdown + +The Helix Pipeline supports some [Markdown extensions](docs/markdown.md). + ## Anatomy of a Pipeline A pipeline consists of the following main parts: diff --git a/docs/markdown.md b/docs/markdown.md new file mode 100644 index 000000000..5d4f8f621 --- /dev/null +++ b/docs/markdown.md @@ -0,0 +1,299 @@ +# Markdown Features in Project Helix + +Project Helix uses [GitHub Flavored Markdown](https://github.github.com/gfm/) (GFM) with the following extensions: + +## Sections + +Use the section marker `---`, preceded and followed by a blank line, to create a section break. This is a change from GFM, where `---` denotes a [thematic break](https://github.github.com/gfm/#thematic-breaks). + +```markdown + +This is one section. + +--- + +This is another section. + +--- + +And this is a third section. + +``` + +You can still create thematic breaks in Markdown by using any of the alternative syntaxes: + +> A line consisting of 0-3 spaces of indentation, followed by a sequence of three or more matching `_` or `*` characters, each followed optionally by any number of spaces or tabs, forms a thematic break. + +## Section Metadata (or Midmatter) + +GitHub allows adding metadata to a document (although this is not part of the GFM spec) by placing a YAML block, enclosed by a pair of `---` lines, at the beginning of the document. This is known as Markdown frontmatter. + +```markdown +--- +key: value +--- + +This is my document.
+``` + +Helix extends this notion by allowing YAML blocks at the beginning of any section. + +```markdown + +This is one section. + +--- +key: value +--- + +This is another section. It has some metadata. + +--- + +And this is a third section. + +``` + +Because it looks like Markdown "frontmatter", but is allowed in the middle of the document, we call it "midmatter". + +## External Embeds + +Helix Markdown supports embedding external content in a Markdown document. We support a number of embedding syntaxes that were originally introduced for other software and allow a certain degree of interoperability: + +**Note: Helix Pipeline will only process embeds if the URL matches a known whitelist. This is for reasons of security and to guard against accidental embeds.** + + +### IA Writer-Style Embeds + +The [IA Writer Markdown editing app](https://ia.net/writer) for desktop and mobile operating systems [introduced a system called content blocks](https://ia.net/writer/support/general/content-blocks), and this embed style is inspired by that system. + +An embed is: + +- a whitelisted URL +- in a separate paragraph +- that's all. + +```markdown + +https://www.youtube.com/watch?v=KOxbO0EI4MA + +``` + +IA Writer-Style Embeds are the simplest and recommended way of creating embeds. + +### Gatsby-Style Embeds + +This embed style is also supported by a number of Gatsby plugins like [gatsby-remark-embed-video](https://github.com/borgfriend/gatsby-remark-embed-video). The implementation in Helix shares no code with these plugins. + +An embed is: + +- an inline code block +- in a separate paragraph +- containing a keyword, a colon `:`, and a whitelisted URL + +```markdown + +`video: https://www.youtube.com/embed/2Xc9gXyf2G4` + +``` + +In the example above, the keyword is `video`, but any keyword like `embed`, `external`, `link`, `media` is allowed. + +### Image-Style Embeds + +The only notion Markdown has of external content embedded in the document is images.
This embed syntax extends that idea by allowing embeds using the image syntax. + +An embed is: + +- an image `![]()` +- with a whitelisted URL +- in a separate paragraph + +```markdown + +![](https://www.youtube.com/watch?v=KOxbO0EI4MA) + +``` + +### Link + Image-Style Embeds + +The downside of the three embed approaches above is that they do not work on GitHub and don't have a preview. This is solved by Link + Image-Style embeds, albeit at the cost of a more convoluted syntax. + +An embed is: + +- a link to a whitelisted URL +- in a separate paragraph +- with a preview image as the only child + +```markdown + +[![Audi R8](http://img.youtube.com/vi/KOxbO0EI4MA/0.jpg)](https://www.youtube.com/watch?v=KOxbO0EI4MA "Audi R8") + +``` + +## Internal Embeds + +Helix also supports internal embeds, similar to [IA Writer's content blocks](https://ia.net/writer/support/general/content-blocks), with the following changes: + +- any of the external embed syntaxes is supported, not just IA Writer-Style embeds +- whitelisted URLs must be relative and end with `.md` or `.html` + +## Data Embeds and Markdown Templates + +Helix also supports the embedding of tabular or list data. This is useful for embedding data from external sources such as: + +- Google Sheets (not yet supported) +- Microsoft Excel Online (not yet supported) +- Google Calendar (not yet supported) +- RSS Feeds (not yet supported) + +Instead of just dumping the data as a Markdown table, Helix fetches the data, finds placeholders in the current Markdown document (or section, if the document has sections), and fills the placeholders with the data from the data source. + +If the data source has more than one entry, e.g. multiple rows in a spreadsheet, multiple events in a calendar, or multiple posts in an RSS feed, then the content of the document (or section) will be replicated for each entry.
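The fill-and-replicate behaviour described in the two paragraphs above can be sketched in plain JavaScript. This is only an illustration under stated assumptions, not the pipeline's implementation: the helper names `getPath`, `fillTemplate`, and `replicate`, as well as the sample rows, are hypothetical, and the sketch works on plain strings rather than on the Markdown syntax tree.

```javascript
// Hypothetical helper: resolve a dotted path such as "mileage.from" on an object.
function getPath(obj, path) {
  return path
    .split('.')
    .reduce((value, key) => (value == null ? undefined : value[key]), obj);
}

// Hypothetical helper: replace every {{placeholder}} in the template
// with the matching value from a single data entry.
function fillTemplate(template, entry) {
  return template.replace(/\{\{([^{}]+)\}\}/g, (_, expr) => String(getPath(entry, expr.trim())));
}

// Replicate the template once per entry of the data source.
function replicate(template, entries) {
  return entries.map((entry) => fillTemplate(template, entry));
}

// Sample rows (assumed for illustration only).
const rows = [
  { make: 'Nissan', model: 'Sunny', mileage: { from: 100000 } },
  { make: 'Honda', model: 'FR-V', mileage: { from: 50000 } },
];

const output = replicate('{{make}} {{model}} ({{mileage.from}} km)', rows);
console.log(output.join('\n'));
```

Each entry yields one filled copy of the template, which mirrors how a document or section is repeated once per row, event, or post.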
+ +Helix maintains a separate whitelist of URLs that indicate embeddable data sources, so any of the embed syntaxes described above can be used. In the following examples, we will use the IA Writer-style syntax. + +### Example + +Consider a fictional list of used cars, maintained in a spreadsheet: + +| Make | Model | Year | Image | Mileage (from) | Mileage (to) | +| ------- | ------ | ---- | ----------- | -------------- | ------------ | +| Nissan | Sunny | 1992 | nissan.jpg | 100000 | 150000 | +| Renault | Scenic | 2000 | renault.jpg | 75000 | 100000 | +| Honda | FR-V | 2005 | honda.png | 50000 | 150000 | + +This list would be represented as a JSON document like this: + +```json +[ + { + "make": "Nissan", + "model": "Sunny", + "year": 1992, + "image": "nissan.jpg", + "mileage": { + "from": 100000, + "to": 150000 + } + }, + { + "make": "Renault", + "model": "Scenic", + "year": 2000, + "image": "renault.jpg", + "mileage": { + "from": 75000, + "to": 100000 + } + }, + { + "make": "Honda", + "model": "FR-V", + "year": 2005, + "image": "honda.png", + "mileage": { + "from": 50000, + "to": 150000 + } + } +] +``` + +The following examples use the above data. + +#### Placeholders + +To create a data embed, add an embeddable URL to the document and use placeholders like `{{make}}` or `{{model}}` in the document. + +```markdown + + +https://docs.google.com/spreadsheets/d/e/2PACX-1vQ78BeYUV4gFee4bSxjN8u86aV853LGYZlwv1jAUMZFnPn5TnIZteDJwjGr2GNu--zgnpTY1E_KHXcF/pubhtml + +1. My car: [![{{make}} {{model}}]({{image}})](cars-{{year}}.md) + +``` + +This will create the following HTML (boilerplate omitted): + +```html +
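+<ol>
+  <li>My car: <a href="cars-1992.md"><img src="nissan.jpg" alt="Nissan Sunny"></a></li>
+  <li>My car: <a href="cars-2000.md"><img src="renault.jpg" alt="Renault Scenic"></a></li>
+  <li>My car: <a href="cars-2005.md"><img src="honda.png" alt="Honda FR-V"></a></li>
+</ol>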
+``` + +Each of the placeholders has been replaced with entries from the table above. + +#### Dot Notation + +If you want to address nested properties (this will only be useful for some data sources), use a dot notation like `{{mileage.from}}`: + +```markdown + +https://docs.google.com/spreadsheets/d/e/2PACX-1vQ78BeYUV4gFee4bSxjN8u86aV853LGYZlwv1jAUMZFnPn5TnIZteDJwjGr2GNu--zgnpTY1E_KHXcF/pubhtml + +# My {{make}} {{model}} + +![{{make}} {{model}}]({{image}}) + +Built in {{year}}. Driven from {{mileage.from}} km to {{mileage.to}} km. +``` + +to generate the following HTML: + +```html +

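+<h1>My Nissan Sunny</h1>
+<p><img src="nissan.jpg" alt="Nissan Sunny"></p>
+<p>Built in 1992. Driven from 100000 km to 150000 km.</p>
+
+<h1>My Renault Scenic</h1>
+<p><img src="renault.jpg" alt="Renault Scenic"></p>
+<p>Built in 2000. Driven from 75000 km to 100000 km.</p>
+
+<h1>My Honda FR-V</h1>
+<p><img src="honda.png" alt="Honda FR-V"></p>
+<p>Built in 2005. Driven from 50000 km to 150000 km.</p>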

+``` + +#### Sections + +If you have content that you don't want to see repeated for each data element, break it out in separate sections like this: + +```markdown + +## My Cars + +--- + +https://docs.google.com/spreadsheets/d/e/2PACX-1vQ78BeYUV4gFee4bSxjN8u86aV853LGYZlwv1jAUMZFnPn5TnIZteDJwjGr2GNu--zgnpTY1E_KHXcF/pubhtml + +- [![{{make}} {{model}}]({{image}})](cars-{{year}}.md) + +``` + +The first section, containing the "My Cars" headline won't be repeated in the generated HTML: + +```html +
+

My Cars

+
+
+ +
+``` + +# Notes for Developers + +The new node types are documented in the [MDAST schema documentation](mdast.md). The types mentioned in this document are: + +- `section` +- `embed` +- `dataEmbed` + diff --git a/package-lock.json b/package-lock.json index d7b92eb8c..5c65c19ca 100644 --- a/package-lock.json +++ b/package-lock.json @@ -1,6 +1,6 @@ { "name": "@adobe/helix-pipeline", - "version": "6.7.4", + "version": "6.7.5", "lockfileVersion": 1, "requires": true, "dependencies": { @@ -1505,10 +1505,9 @@ } }, "acorn": { - "version": "7.1.0", - "resolved": "https://registry.npmjs.org/acorn/-/acorn-7.1.0.tgz", - "integrity": "sha512-kL5CuoXA/dgxlBbVrflsflzQ3PAas7RYZB52NOm/6839iVYJgKMJ3cQJD+t2i5+qFa8h3MDpEOJiS64E8JLnSQ==", - "dev": true + "version": "7.1.1", + "resolved": "https://registry.npmjs.org/acorn/-/acorn-7.1.1.tgz", + "integrity": "sha512-add7dgA5ppRPxCFJoAGfMDi7PIBXq1RtGo7BhbLaxwrXPOmw8gq48Y9ozT01hUKy9byMjlR20EJhu5zlkErEkg==" }, "acorn-globals": { "version": "6.0.0", @@ -1517,13 +1516,6 @@ "requires": { "acorn": "^7.1.1", "acorn-walk": "^7.1.1" - }, - "dependencies": { - "acorn": { - "version": "7.1.1", - "resolved": "https://registry.npmjs.org/acorn/-/acorn-7.1.1.tgz", - "integrity": "sha512-add7dgA5ppRPxCFJoAGfMDi7PIBXq1RtGo7BhbLaxwrXPOmw8gq48Y9ozT01hUKy9byMjlR20EJhu5zlkErEkg==" - } } }, "acorn-jsx": { @@ -2027,8 +2019,7 @@ }, "kind-of": { "version": "6.0.2", - "resolved": "https://registry.npmjs.org/kind-of/-/kind-of-6.0.2.tgz", - "integrity": "sha512-s5kLOcnH0XqDO+FvuaLX8DDjZ18CGFk7VygH40QoKPUQhW4e2rvM0rwUq0t8IQDOwYSeLK01U90OjzBTme2QqA==", + "resolved": "", "dev": true } } @@ -2744,6 +2735,12 @@ "through": "^2.3.6" } }, + "minimist": { + "version": "1.2.0", + "resolved": "https://registry.npmjs.org/minimist/-/minimist-1.2.0.tgz", + "integrity": "sha1-o1AIsg9BOD7sH7kU9M1d95omQoQ=", + "dev": true + }, "strip-json-comments": { "version": "3.0.1", "resolved": "https://registry.npmjs.org/strip-json-comments/-/strip-json-comments-3.0.1.tgz", @@ -2766,6 +2763,17 
@@ "requires": { "array-ify": "^1.0.0", "dot-prop": "^3.0.0" + }, + "dependencies": { + "dot-prop": { + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/dot-prop/-/dot-prop-3.0.0.tgz", + "integrity": "sha1-G3CK8JSknJoOfbyteQq6U52sEXc=", + "dev": true, + "requires": { + "is-obj": "^1.0.0" + } + } } }, "component-emitter": { @@ -3206,9 +3214,9 @@ } }, "kind-of": { - "version": "6.0.2", - "resolved": "https://registry.npmjs.org/kind-of/-/kind-of-6.0.2.tgz", - "integrity": "sha512-s5kLOcnH0XqDO+FvuaLX8DDjZ18CGFk7VygH40QoKPUQhW4e2rvM0rwUq0t8IQDOwYSeLK01U90OjzBTme2QqA==", + "version": "6.0.3", + "resolved": "https://registry.npmjs.org/kind-of/-/kind-of-6.0.3.tgz", + "integrity": "sha512-dcS1ul+9tmeD95T+x28/ehLgd9mENa3LsvDTtzm3vyBEO7RPptvAD+t44WVXaUjTBRcrpFeFlC8WCruUR456hw==", "dev": true } } @@ -3300,12 +3308,18 @@ "integrity": "sha512-vIOSyOXkMx81ghEalh4MLBtDHMx1bhKlaqHDMqM2yeitJ996SLOk5mGdDpI9ifJAgokred8Rmu219fX4OltqXw==" }, "dot-prop": { - "version": "3.0.0", - "resolved": "https://registry.npmjs.org/dot-prop/-/dot-prop-3.0.0.tgz", - "integrity": "sha1-G3CK8JSknJoOfbyteQq6U52sEXc=", - "dev": true, + "version": "5.2.0", + "resolved": "https://registry.npmjs.org/dot-prop/-/dot-prop-5.2.0.tgz", + "integrity": "sha512-uEUyaDKoSQ1M4Oq8l45hSE26SnTxL6snNnqvK/VWx5wJhmff5z0FUVJDKDanor/6w3kzE3i7XZOk+7wC0EXr1A==", "requires": { - "is-obj": "^1.0.0" + "is-obj": "^2.0.0" + }, + "dependencies": { + "is-obj": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/is-obj/-/is-obj-2.0.0.tgz", + "integrity": "sha512-drqDG3cbczxxEJRoOXcOjtdp1J/lyp1mNn0xaznRs8+muBhgQcrnbspox5X5fOw0HnMnbfDzvnEMEtqDEJEo8w==" + } } }, "duplexer2": { @@ -4267,8 +4281,7 @@ }, "kind-of": { "version": "6.0.2", - "resolved": "https://registry.npmjs.org/kind-of/-/kind-of-6.0.2.tgz", - "integrity": "sha512-s5kLOcnH0XqDO+FvuaLX8DDjZ18CGFk7VygH40QoKPUQhW4e2rvM0rwUq0t8IQDOwYSeLK01U90OjzBTme2QqA==", + "resolved": "", "dev": true } } @@ -4631,9 +4644,9 @@ } }, "kind-of": { - "version": "6.0.2", 
- "resolved": "https://registry.npmjs.org/kind-of/-/kind-of-6.0.2.tgz", - "integrity": "sha512-s5kLOcnH0XqDO+FvuaLX8DDjZ18CGFk7VygH40QoKPUQhW4e2rvM0rwUq0t8IQDOwYSeLK01U90OjzBTme2QqA==", + "version": "6.0.3", + "resolved": "https://registry.npmjs.org/kind-of/-/kind-of-6.0.3.tgz", + "integrity": "sha512-dcS1ul+9tmeD95T+x28/ehLgd9mENa3LsvDTtzm3vyBEO7RPptvAD+t44WVXaUjTBRcrpFeFlC8WCruUR456hw==", "dev": true }, "micromatch": { @@ -6031,11 +6044,6 @@ "xml-name-validator": "^3.0.0" }, "dependencies": { - "acorn": { - "version": "7.1.1", - "resolved": "https://registry.npmjs.org/acorn/-/acorn-7.1.1.tgz", - "integrity": "sha512-add7dgA5ppRPxCFJoAGfMDi7PIBXq1RtGo7BhbLaxwrXPOmw8gq48Y9ozT01hUKy9byMjlR20EJhu5zlkErEkg==" - }, "request": { "version": "2.88.2", "resolved": "https://registry.npmjs.org/request/-/request-2.88.2.tgz", @@ -7139,9 +7147,9 @@ } }, "minimist": { - "version": "1.2.0", - "resolved": "https://registry.npmjs.org/minimist/-/minimist-1.2.0.tgz", - "integrity": "sha1-o1AIsg9BOD7sH7kU9M1d95omQoQ=", + "version": "1.2.5", + "resolved": "https://registry.npmjs.org/minimist/-/minimist-1.2.5.tgz", + "integrity": "sha512-FM9nNUYrRBAELZQT3xeZQ7fmMOBg6nWNmJKTcgsJeaLstP/UODVpGsr5OhXhhXg6f+qtJ8uiZ+PUxkDWcgIXLw==", "dev": true }, "minimist-options": { @@ -7453,9 +7461,9 @@ }, "dependencies": { "kind-of": { - "version": "6.0.2", - "resolved": "https://registry.npmjs.org/kind-of/-/kind-of-6.0.2.tgz", - "integrity": "sha512-s5kLOcnH0XqDO+FvuaLX8DDjZ18CGFk7VygH40QoKPUQhW4e2rvM0rwUq0t8IQDOwYSeLK01U90OjzBTme2QqA==", + "version": "6.0.3", + "resolved": "https://registry.npmjs.org/kind-of/-/kind-of-6.0.3.tgz", + "integrity": "sha512-dcS1ul+9tmeD95T+x28/ehLgd9mENa3LsvDTtzm3vyBEO7RPptvAD+t44WVXaUjTBRcrpFeFlC8WCruUR456hw==", "dev": true } } @@ -10643,8 +10651,7 @@ "dependencies": { "minimist": { "version": "1.2.0", - "resolved": false, - "integrity": "sha1-o1AIsg9BOD7sH7kU9M1d95omQoQ=", + "resolved": "", "dev": true } } @@ -13687,8 +13694,7 @@ }, "kind-of": { "version": "6.0.2", - 
"resolved": "https://registry.npmjs.org/kind-of/-/kind-of-6.0.2.tgz", - "integrity": "sha512-s5kLOcnH0XqDO+FvuaLX8DDjZ18CGFk7VygH40QoKPUQhW4e2rvM0rwUq0t8IQDOwYSeLK01U90OjzBTme2QqA==", + "resolved": "", "dev": true } } @@ -14564,11 +14570,25 @@ "resolved": "https://registry.npmjs.org/unist-util-position/-/unist-util-position-3.1.0.tgz", "integrity": "sha512-w+PkwCbYSFw8vpgWD0v7zRCl1FpY3fjDSQ3/N/wNd9Ffa4gPi8+4keqt99N3XW6F99t/mUzp2xAhNmfKWp95QA==" }, + "unist-util-remove": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/unist-util-remove/-/unist-util-remove-2.0.0.tgz", + "integrity": "sha512-HwwWyNHKkeg/eXRnE11IpzY8JT55JNM1YCwwU9YNCnfzk6s8GhPXrVBBZWiwLeATJbI7euvoGSzcy9M29UeW3g==", + "requires": { + "unist-util-is": "^4.0.0" + }, + "dependencies": { + "unist-util-is": { + "version": "4.0.2", + "resolved": "https://registry.npmjs.org/unist-util-is/-/unist-util-is-4.0.2.tgz", + "integrity": "sha512-Ofx8uf6haexJwI1gxWMGg6I/dLnF2yE+KibhD3/diOqY2TinLcqHXCV6OI5gFVn3xQqDH+u0M625pfKwIwgBKQ==" + } + } + }, "unist-util-remove-position": { "version": "2.0.1", "resolved": "https://registry.npmjs.org/unist-util-remove-position/-/unist-util-remove-position-2.0.1.tgz", "integrity": "sha512-fDZsLYIe2uT+oGFnuZmy73K6ZxOPG/Qcm+w7jbEjaFcJgbQ6cqjs/eSPzXhsmGpAsWPkqZM9pYjww5QTn3LHMA==", - "dev": true, "requires": { "unist-util-visit": "^2.0.0" } diff --git a/package.json b/package.json index 44a4161cc..c2c36540a 100644 --- a/package.json +++ b/package.json @@ -45,8 +45,7 @@ "semantic-release": "17.0.4", "sinon": "9.0.1", "unist-builder": "2.0.3", - "unist-util-inspect": "5.0.1", - "unist-util-remove-position": "2.0.1" + "unist-util-inspect": "5.0.1" }, "dependencies": { "@adobe/helix-fetch": "1.4.1", @@ -57,6 +56,7 @@ "callsites": "^3.1.0", "clone": "^2.1.2", "dompurify": "2.0.8", + "dot-prop": "^5.2.0", "ferrum": "^1.2.0", "fs-extra": "^8.1.0", "github-slugger": "^1.2.1", @@ -76,6 +76,8 @@ "strip-markdown": "^3.1.0", "unified": "^8.3.2", "unist-util-map": "^2.0.0", + 
"unist-util-remove": "^2.0.0", + "unist-util-remove-position": "2.0.1", "unist-util-select": "^3.0.0", "unist-util-visit": "^2.0.0", "uri-js": "^4.2.2", diff --git a/src/defaults/html.pipe.js b/src/defaults/html.pipe.js index 8855dbbad..69611d72d 100644 --- a/src/defaults/html.pipe.js +++ b/src/defaults/html.pipe.js @@ -41,6 +41,8 @@ const addHeaders = require('../html/add-headers'); const timing = require('../utils/timing'); const sanitize = require('../html/sanitize'); const removeHlxProps = require('../html/removeHlxProps'); +const dataEmbeds = require('../html/fetch-data'); +const dataSections = require('../html/data-sections'); /* eslint newline-per-chained-call: off */ @@ -67,6 +69,7 @@ const htmlpipe = (cont, context, action) => { .use(parse).expose('parse') .use(parseFrontmatter) .use(embeds) + .use(dataEmbeds) .use(smartypants) .use(iconize) .use(sections) @@ -74,6 +77,7 @@ const htmlpipe = (cont, context, action) => { .use(unwrapSoleImages) .use(selectstrain) .use(selecttest) + .use(dataSections) .use(html).expose('html') .use(sanitize).when(paranoid) .use(cont) diff --git a/src/html/data-sections.js b/src/html/data-sections.js new file mode 100644 index 000000000..3be8cdee2 --- /dev/null +++ b/src/html/data-sections.js @@ -0,0 +1,204 @@ +/* + * Copyright 2020 Adobe. All rights reserved. + * This file is licensed to you under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. You may obtain a copy + * of the License at http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software distributed under + * the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR REPRESENTATIONS + * OF ANY KIND, either express or implied. See the License for the specific language + * governing permissions and limitations under the License. 
+ */ +const { selectAll } = require('unist-util-select'); +const remove = require('unist-util-remove'); +const visit = require('unist-util-visit'); +const { + deepclone, trySlidingWindow, map, list, reject, contains, is, pipe, setdefault, +} = require('ferrum'); +const removePosition = require('unist-util-remove-position'); +const dotprop = require('dot-prop'); +const { merge } = require('../utils/cache-helper'); + +const pattern = /{{([^{}]+)}}/g; +/** + * Copied from 'unist-util-map' and promisified. + * @param tree + * @param iteratee + * @returns {Promise} + */ +async function pmap(tree, iteratee) { + async function preorder(node) { + async function bound(child) { + return preorder(child); + } + const { children } = node; + const newNode = { ...await iteratee(node) }; + + if (children) { + newNode.children = await Promise.all(children.map(bound)); + } + return newNode; + } + return preorder(tree); +} + +/** + * Finds all MDAST nodes that have a placeholder value and calls + * a user-provided callback function. 
+ * @param {MDAST} section an MDAST node + * @param {*} handlefn a callback function to handle the placeholder + */ +function findPlaceholders(section, handlefn) { + visit(section, (node) => { + if (node.value && pattern.test(node.value)) { + handlefn(node, 'value'); + } + if (node.alt && pattern.test(node.alt)) { + handlefn(node, 'alt'); + } + if (node.url && pattern.test(node.url)) { + handlefn(node, 'url'); + } + if (node.title && pattern.test(node.title)) { + handlefn(node, 'title'); + } + }); +} + +/** + * Determines if an MDAST node contains placeholders like `{{foo}}` + * @param {MDAST} section + */ +function hasPlaceholders(section) { + try { + findPlaceholders(section, () => { + throw new Error('Placeholder detected'); + }); + return false; + } catch { + return true; + } +} + +/** + * @param {MDAST} section + */ +function fillPlaceholders(section) { + if (!section.meta || (!section.meta.embedData && !Array.isArray(section.meta.embedData))) { + return; + } + const data = section.meta.embedData; + // required to make deepclone below work + removePosition(section); + + const children = data.reduce((p, value) => { + const workingcopy = deepclone(section); + + findPlaceholders(workingcopy, (node, prop) => { + if (typeof node[prop] === 'string') { + node[prop] = node[prop].replace(pattern, (_, expr) => dotprop.get(value, expr)); + } + }); + return [...p, ...workingcopy.children]; + }, []); + + section.children = children; + delete section.meta.embedData; +} + +function normalizeLists(section) { + function cleanupLists([first, second]) { + // two consecutive, identical lists + if (first.type === 'list' && second.type === 'list' + && first.ordered === second.ordered + && first.start === second.start + && first.spread === second.spread) { + // move the children of the first to the second list + second.children = [...first.children, ...second.children]; + first.children = []; + // mark the first list to be emptied + return first; + } + return null; + } + + // get 
the sequence of lists to be removed + const emptiedLists = pipe( + trySlidingWindow(section.children, 2), + map(cleanupLists), + reject(is(null)), + list, + ); + + // perform the cleanup + section.children = pipe( + section.children, // take all existing children + reject((child) => contains(is(child))(emptiedLists)), // reject if they are in the list above + list, // make a nice array because the rest of the world doesn't like iterators yet + ); +} + +async function fillDataSections(context, { downloader, logger }) { + const { content: { mdast } } = context; + async function extractData(section) { + return pmap(section, async (node) => { + if (node.type === 'dataEmbed') { + const task = downloader.getTaskById(`dataEmbed:${node.url}`); + const downloadeddata = await task; + if (downloadeddata.status !== 200) { + logger.warn(`Bad status code (${downloadeddata.status}) for data embed ${node.url}`); + return node; + } + try { + const json = JSON.parse(downloadeddata.body); + if (!Array.isArray(json)) { + logger.warn(`Expected array for data embed ${node.url}, got ${typeof json}`); + return node; + } + section.meta.embedData = json; + + // remember that we are using this source so that we can compute the + // surrogate key later + setdefault(context.content, 'sources', []); + context.content.sources.push(node.url); + + // pass the cache control header through + const res = setdefault(context, 'response', {}); + const headers = setdefault(res, 'headers', {}); + + headers['Cache-Control'] = merge( + headers['Cache-Control'], + downloadeddata.headers.get('cache-control'), + ); + } catch (e) { + logger.warn(`Unable to parse JSON for data embed ${node.url}: ${e.message}`); + return node; + } + } + return node; + }); + } + + async function applyDataSections(section) { + await extractData(section); + remove(section, 'dataEmbed'); + fillPlaceholders(section); + normalizeLists(section); + } + + const dataSections = selectAll('section', mdast); + + // extract data from all 
sections + await Promise.all(dataSections + .filter(hasPlaceholders) + .map(applyDataSections)); + + if (dataSections.length === 0 && hasPlaceholders(mdast)) { + // extract data from the root node (in case there are no sections) + await applyDataSections(mdast); + } + + remove(mdast, 'dataEmbed'); +} + +module.exports = fillDataSections; diff --git a/src/html/fetch-data.js b/src/html/fetch-data.js new file mode 100644 index 000000000..b693c0e0c --- /dev/null +++ b/src/html/fetch-data.js @@ -0,0 +1,33 @@ +/* + * Copyright 2020 Adobe. All rights reserved. + * This file is licensed to you under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. You may obtain a copy + * of the License at http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software distributed under + * the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR REPRESENTATIONS + * OF ANY KIND, either express or implied. See the License for the specific language + * governing permissions and limitations under the License. 
+ */ +const { selectAll } = require('unist-util-select'); +const { + pipe, map, uniq, list, +} = require('ferrum'); + +function fetch({ content: { mdast } }, { downloader, logger, secrets: { DATA_EMBED_SERVICE } }) { + const fetches = pipe( + selectAll('dataEmbed', mdast), + map((node) => node.url), + uniq, + map((url) => { + logger.info(`fetching ${DATA_EMBED_SERVICE}/${url}`); + return downloader.fetch({ + uri: `${DATA_EMBED_SERVICE}/${url}`, + id: `dataEmbed:${url}`, + }); + }), + ); + list(fetches); +} + +module.exports = fetch; diff --git a/src/html/find-embeds.js b/src/html/find-embeds.js index 2625001af..1107ea867 100644 --- a/src/html/find-embeds.js +++ b/src/html/find-embeds.js @@ -91,8 +91,14 @@ function internalImgEmbed({ type, children }, base, contentext, resourceext) { return false; } -function embed(uri, node, whitelist = '', logger) { - if ((uri.scheme === 'http' || uri.scheme === 'https') && mm.some(uri.host, whitelist.split(','))) { +function embed(uri, node, whitelist = '', datawhitelist = '', logger) { + if ((uri.scheme === 'http' || uri.scheme === 'https') && mm.some(uri.host, datawhitelist.split(','))) { + const children = [{ ...node }]; + node.type = 'dataEmbed'; + node.children = children; + node.url = URI.serialize(uri); + delete node.value; + } else if ((uri.scheme === 'http' || uri.scheme === 'https') && mm.some(uri.host, whitelist.split(','))) { const children = [{ ...node }]; node.type = 'embed'; node.children = children; @@ -116,16 +122,23 @@ function internalembed(uri, node, extension) { } function find({ content: { mdast }, request: { extension, url } }, - { logger, secrets: { EMBED_WHITELIST, EMBED_SELECTOR }, request: { params: { path } } }) { + { + logger, secrets: { + EMBED_WHITELIST, + EMBED_SELECTOR, + DATA_EMBED_WHITELIST, + }, + request: { params: { path } }, + }) { const resourceext = `.${extension}`; const contentext = p.extname(path); map(mdast, (node, _, parent) => { if (node.type === 'inlineCode' && 
gatsbyEmbed(node.value)) { - embed(gatsbyEmbed(node.value), node, EMBED_WHITELIST, logger); + embed(gatsbyEmbed(node.value), node, EMBED_WHITELIST, DATA_EMBED_WHITELIST, logger); } else if (node.type === 'paragraph' && iaEmbed(node, parent)) { - embed(iaEmbed(node, parent), node, EMBED_WHITELIST, logger); + embed(iaEmbed(node, parent), node, EMBED_WHITELIST, DATA_EMBED_WHITELIST, logger); } else if (node.type === 'paragraph' && imgEmbed(node)) { - embed(imgEmbed(node), node, EMBED_WHITELIST, logger); + embed(imgEmbed(node), node, EMBED_WHITELIST, DATA_EMBED_WHITELIST, logger); } else if (node.type === 'inlineCode' && internalGatsbyEmbed(node.value, url, contentext, resourceext)) { internalembed(internalGatsbyEmbed(node.value, url, contentext, resourceext), node, `.${EMBED_SELECTOR}.${extension}`); diff --git a/src/schemas/mdast.schema.json b/src/schemas/mdast.schema.json index 731825fea..2f0912f0e 100644 --- a/src/schemas/mdast.schema.json +++ b/src/schemas/mdast.schema.json @@ -48,6 +48,7 @@ "footnote", "footnoteReference", "embed", + "dataEmbed", "section", "icon" ], @@ -79,6 +80,7 @@ "footnote": "A footnote", "footnoteReference": "A reference to a footnote", "embed": "Content embedded from another page, identified by the `url` attribute.", + "dataEmbed": "Data embedded from another data source (API), identified by the `url` attribute.", "section": "A section within the document. Sections serve as a high-level structure of a single markdown document and can have their own section-specific front matter metadata.", "icon": "An SVG icon, identified by the syntax `:foo:`" }, diff --git a/src/schemas/secrets.schema.json b/src/schemas/secrets.schema.json index db04b082f..d87498baa 100644 --- a/src/schemas/secrets.schema.json +++ b/src/schemas/secrets.schema.json @@ -37,14 +37,24 @@ }, "EMBED_WHITELIST": { "type": "string", - "description": "Comma-separated list of allowed hostnames for embeds. Supports `*.example.com` as a subdomain wildcard. 
Use `*` to allow all embeds (potentially insecure)", - "default": "www.youtube.com, spark.adobe.com, unsplash.com/photos, soundcloud.com" + "description": "Comma-separated list of allowed hostnames for embeds. Supports `*.example.com` as a subdomain wildcard. Use `*` to allow all embeds (potentially insecure and conflicting with `DATA_EMBED_WHITELIST`)", + "default": "www.youtube.com, spark.adobe.com, unsplash.com, soundcloud.com" + }, + "DATA_EMBED_WHITELIST": { + "type": "string", + "description": "Comma-separated list of allowed hostnames for data embeds. Supports `*.example.com` as a subdomain wildcard. Use `*` to allow all embeds (potentially insecure and conflicting with `EMBED_WHITELIST`)", + "default": "docs.google.com" }, "EMBED_SERVICE": { "type": "string", "description": "URL of an Embed Service that takes the appended URL and returns an embeddable HTML representation.", "default": "https://adobeioruntime.net/api/v1/web/helix/helix-services/embed@v1" }, + "DATA_EMBED_SERVICE": { + "type": "string", + "description": "URL of a DataEmbed Service that takes the appended URL and returns an iterable JSON representation.", + "default": "https://adobeioruntime.net/api/v1/web/helix/helix-services/data-embed@v1" + }, "EMBED_SELECTOR": { "type": "string", "description": "Selector to be used when resolving internal embeds.", diff --git a/src/utils/Downloader.js b/src/utils/Downloader.js index 2086ee949..495a41223 100644 --- a/src/utils/Downloader.js +++ b/src/utils/Downloader.js @@ -82,7 +82,7 @@ class Downloader { * Schedules a task that fetches a web resource. * @param {object} opts options. * @param {object} opts.uri URI to download - * @param {object} opts.options Fetch options passed to the underling helix-fetch. + * @param {object} opts.options Fetch options passed to the underlying helix-fetch. * @param {string} opts.id Some id to later identify the task. 
* @param {number} opts.timeout Override global timeout * @param {boolean} opts.errorOn404 Treat 404 as error. diff --git a/src/utils/cache-helper.js b/src/utils/cache-helper.js new file mode 100644 index 000000000..60d4068ae --- /dev/null +++ b/src/utils/cache-helper.js @@ -0,0 +1,52 @@ +/* + * Copyright 2020 Adobe. All rights reserved. + * This file is licensed to you under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. You may obtain a copy + * of the License at http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software distributed under + * the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR REPRESENTATIONS + * OF ANY KIND, either express or implied. See the License for the specific language + * governing permissions and limitations under the License. + */ +function directives(expression = '') { + const retval = expression.split(',') + .map((s) => s.trim()) + .filter((s) => !!s) + .map((s) => s.split('=')) + .map(([ + directive, + value]) => [directive, Number.isNaN(Number.parseInt(value, 10)) + ? true + : Number.parseInt(value, 10)]) + .reduce((obj, [directive, value]) => { + obj[directive] = value; + return obj; + }, {}); + return retval; +} + +function format(dirs = {}) { + return Object.entries(dirs) + .map(([directive, value]) => ((value === true || value === 1) ? directive : `${directive}=${value}`)) + .join(', '); +} + +function merge(in1 = '', in2 = '') { + const dirs1 = typeof in1 === 'string' ? directives(in1) : in1; + const dirs2 = typeof in2 === 'string' ? directives(in2) : in2; + + const keys = [...Object.keys(dirs1), ...Object.keys(dirs2)]; + + const mergeval = keys.reduce((merged, key) => { + merged[key] = Math.min( + dirs1[key] || Number.MAX_SAFE_INTEGER, + dirs2[key] || Number.MAX_SAFE_INTEGER, + ); + return merged; + }, {}); + + return typeof in1 === 'string' ? 
format(mergeval) : mergeval; +} + +module.exports = { directives, format, merge }; diff --git a/test/fixtures/example-embeds.html b/test/fixtures/example-embeds.html new file mode 100644 index 000000000..67f702bd5 --- /dev/null +++ b/test/fixtures/example-embeds.html @@ -0,0 +1,245 @@ +

Hypermedia Pipeline

+

Is foo bar?

+
+

+ This project provides helper functions and default implementations for + creating Hypermedia Processing Pipelines. +

+

+ It uses reducers and continuations to create a simple processing pipeline + that can pre-and post-process HTML, JSON, and other hypermedia. +

+

Status

+

+ codecovCircleCIGitHub licenseGitHub issuesnpmGreenkeeper badge +

+

Anatomy of a Pipeline

+

A pipeline consists of following main parts:

+ +

+ Each step of the pipeline is processing a single payload object, that will + slowly accumulate the return values of the functions above + through Object.assign. +

+

See below for the anatomy of a payload.

+

+ Typically, there is one pipeline for each content type supported and + pipeline are identified by file name, e.g. +

+ +

Building a Pipeline

+

+ A pipeline builder can be created by creating a CommonJS module that exports + a function pipe which accepts following arguments and returns a + Pipeline function. +

+ +

+ This project’s main entry provides a helper function for pipeline + construction and a few helper functions, so that a basic pipeline can be + constructed like this: +

+
// the pipeline itself const pipeline = require("@adobe/hypermedia-pipeline"); // helper functions and log const { adaptOWRequest, adaptOWResponse, log } = require('@adobe/hypermedia-pipeline/src/defaults/default.js'); module.exports.pipe = function(cont, params, secrets, logger = log) { logger.debug("Constructing Custom Pipeline"); return pipeline() .pre(adaptOWRequest) // optional: turns OpenWhisk-style arguments into a proper payload .once(cont) // required: execute the continuation function .post(adaptOWResponse) // optional: turns the Payload into an OpenWhisk-style response } 
+

+ In a typical pipeline, you will add additional processing steps as + .pre(require('some-module')) or as + .post(require('some-module')). +

+

The Main Function

+

+ The main function is typically a pure function that converts the + request, context, and + content properties of the payload into a + response object. +

+

+ In most scenarios, the main function is compiled from a template in a + templating language like HTL, JST, or JSX. +

+

+ Typically, there is one template (and thus one main function) for each + content variation of the file type. Content variations are identified by a + selector (the piece of the file name before the file extension, e.g. in + example.navigation.html the selector would be + navigation). If no selector is provided, the template is the + default template for the pipeline. +

+

Examples of possible template names include:

+ +

(Optional) The Wrapper Function

+

+ Sometimes it is neccessary to pre-process the payload in a template-specific + fashion. This wrapper function (often called “Pre-JS” for brevity sake) + allows the full transformation of the pipeline’s payload. +

+

+ Compared to the pipeline-specific pre-processing functions which handle the + request, content, and response, the focus of the wrapper function is + implementing business logic needed for the main template function. This + allows for a clean separation between: +

+
    +
  1. + presentation (in the main function, often expressed in declarative + templates) +
  2. +
  3. + business logic (in the wrapper function, often expressed in imperative + code) +
  4. +
  5. + content-type specific implementation (in the pipeline, expressed in + functional code) +
  6. +
+

A simple implementation of a wrapper function would look like this:

+
// All wrapper functions must export `pre` // The functions takes following arguments: // - `cont` (the continuation function, i.e. the main template function) // - `payload` (the payload of the pipeline) module.exports.pre = (cont, payload) => { const {request, content, context, response} = payload; // modifying the payload content before invoking the main function content.hello = 'World'; const modifiedpayload = {request, content, context, response}; // invoking the main function with the new payload. Capturing the response // payload for further modification const responsepayload = cont(modifiedpayload); // Adding a value to the payload response const modifiedresponse = modifiedpayload.response; modifiedresponse.hello = 'World'; return Object.assign(modifiedpayload, modifiedresponse); } 
+

Pre-Processing Functions

+

Pre-Processing functions are meant to:

+ +

Post-Processing Functions

+

Post-Processing functions are meant to:

+ +

Anatomy of the Payload

+

Following main properties exist:

+ +

The request object

+ +

The content object

+ +

The response object

+ +

The context object

+

TBD: used for stuff that is neither content, request, or response

+

The error object

+

+ This object is only set when there has been an error during pipeline + processing. Any step in the pipeline may set the error object. + Subsequent steps should simply skip any processing if they encounter an + error object. +

+

+ Alternatively, steps can attempt to handle the error object, + for instance by generating a formatted error message and leaving it in + response.body. +

+

The only known property in error is

+ +
diff --git a/test/fixtures/example-embeds.md b/test/fixtures/example-embeds.md new file mode 100644 index 000000000..5455fde3c --- /dev/null +++ b/test/fixtures/example-embeds.md @@ -0,0 +1,182 @@ +# Hypermedia Pipeline + +--- + +https://docs.google.com/spreadsheets/d/e/2PACX-1vQ78BeYUV4gFee4bSxjN8u86aV853LGYZlwv1jAUMZFnPn5TnIZteDJwjGr2GNu--zgnpTY1E_KHXcF/pubhtml + +Is foo {{foo}}? + +--- + +This project provides helper functions and default implementations for creating Hypermedia Processing Pipelines. + +It uses reducers and continuations to create a simple processing pipeline that can pre-and post-process HTML, JSON, and other hypermedia. + +# Status + +[![codecov](https://img.shields.io/codecov/c/github/adobe/hypermedia-pipeline.svg)](https://codecov.io/gh/adobe/hypermedia-pipeline) +[![CircleCI](https://img.shields.io/circleci/project/github/adobe/hypermedia-pipeline.svg)](https://circleci.com/gh/adobe/parcel-plugin-htl) +[![GitHub license](https://img.shields.io/github/license/adobe/hypermedia-pipeline.svg)](https://github.com/adobe/hypermedia-pipeline/blob/master/LICENSE.txt) +[![GitHub issues](https://img.shields.io/github/issues/adobe/hypermedia-pipeline.svg)](https://github.com/adobe/hypermedia-pipeline/issues) +[![npm](https://img.shields.io/npm/dw/@adobe/hypermedia-pipeline.svg)](https://www.npmjs.com/package/@adobe/hypermedia-pipeline) [![Greenkeeper badge](https://badges.greenkeeper.io/adobe/hypermedia-pipeline.svg)](https://greenkeeper.io/) + +## Anatomy of a Pipeline + +A pipeline consists of following main parts: + +- pre-processing functions +- the main response generating function +- an optional wrapper function +- post-processing functions + +Each step of the pipeline is processing a single payload object, that will slowly accumulate the `return` values of the functions above through `Object.assign`. + +See below for the anatomy of a payload. 
+ +Typically, there is one pipeline for each content type supported and pipeline are identified by file name, e.g. + +- `html.pipe.js` – creates HTML documents with the `text/html` content-type +- `json.pipe.js` – creates JSON documents with the `application/json` content-type + +### Building a Pipeline + +A pipeline builder can be created by creating a CommonJS module that exports a function `pipe` which accepts following arguments and returns a Pipeline function. + +- `cont`: the main function that will be executed as a continuation of the pipeline +- `params`: a map of parameters that are interpreted at runtime +- `secrets`: a map of protected configuration parameters like API keys that should be handled with care. By convention, all keys in `secret` are in ALL_CAPS_SNAKE_CASE. +- `logger`: a [Winston](https://www.github.com/winstonjs/winston) logger + +This project's main entry provides a helper function for pipeline construction and a few helper functions, so that a basic pipeline can be constructed like this: + +```javascript +// the pipeline itself +const pipeline = require("@adobe/hypermedia-pipeline"); +// helper functions and log +const { adaptOWRequest, adaptOWResponse, log } = require('@adobe/hypermedia-pipeline/src/defaults/default.js'); + +module.exports.pipe = function(cont, params, secrets, logger = log) { + logger.debug("Constructing Custom Pipeline"); + + return pipeline() + .pre(adaptOWRequest) // optional: turns OpenWhisk-style arguments into a proper payload + .once(cont) // required: execute the continuation function + .post(adaptOWResponse) // optional: turns the Payload into an OpenWhisk-style response +} +``` + +In a typical pipeline, you will add additional processing steps as `.pre(require('some-module'))` or as `.post(require('some-module'))`. + +### The Main Function + +The main function is typically a pure function that converts the `request`, `context`, and `content` properties of the payload into a `response` object. 
+ +In most scenarios, the main function is compiled from a template in a templating language like HTL, JST, or JSX. + +Typically, there is one template (and thus one main function) for each content variation of the file type. Content variations are identified by a selector (the piece of the file name before the file extension, e.g. in `example.navigation.html` the selector would be `navigation`). If no selector is provided, the template is the default template for the pipeline. + +Examples of possible template names include: + +- `html.jsx` (compiled to `html.js`) – default for the HTML pipeline +- `html.navigation.jst` (compiled to `html.navigation.js`) – renders the navigation +- `dropdown.json.js` (not compiled) – creates pure JSON output +- `dropdown.html.htl` (compiled to `dropdown.html.js`) – renders the dropdown component + + +### (Optional) The Wrapper Function + +Sometimes it is neccessary to pre-process the payload in a template-specific fashion. This wrapper function (often called "Pre-JS" for brevity sake) allows the full transformation of the pipeline's payload. + +Compared to the pipeline-specific pre-processing functions which handle the request, content, and response, the focus of the wrapper function is implementing business logic needed for the main template function. This allows for a clean separation between: + +1. presentation (in the main function, often expressed in declarative templates) +2. business logic (in the wrapper function, often expressed in imperative code) +3. content-type specific implementation (in the pipeline, expressed in functional code) + +A simple implementation of a wrapper function would look like this: + +```javascript +// All wrapper functions must export `pre` +// The functions takes following arguments: +// - `cont` (the continuation function, i.e. 
the main template function) +// - `payload` (the payload of the pipeline) +module.exports.pre = (cont, payload) => { + const {request, content, context, response} = payload; + + // modifying the payload content before invoking the main function + content.hello = 'World'; + const modifiedpayload = {request, content, context, response}; + + // invoking the main function with the new payload. Capturing the response + // payload for further modification + + const responsepayload = cont(modifiedpayload); + + // Adding a value to the payload response + const modifiedresponse = modifiedpayload.response; + modifiedresponse.hello = 'World'; + + return Object.assign(modifiedpayload, modifiedresponse); +} +``` + +### Pre-Processing Functions + +Pre-Processing functions are meant to: + +- parse and process request parameters +- fetch and parse the requested content +- transform the requested content + +### Post-Processing Functions + +Post-Processing functions are meant to: + +- process and transform the response + +## Anatomy of the Payload + +Following main properties exist: + +- `request` +- `content` +- `response` +- `context` +- `error` + +### The `request` object + +- `params`: a map of request parameters +- `headers`: a map of HTTP headers + +### The `content` object + +- `body`: the unparsed content body as a `string` +- `mdast`: the parsed [Markdown AST](https://github.com/syntax-tree/mdast) +- `meta`: a map metadata properties, including + - `title`: title of the document + - `intro`: a plain-text introduction or description + - `type`: the content type of the document +- `htast`: the HTML AST +- `html`: a string of the content rendered as HTML +- `children`: an array of top-level elements of the HTML-rendered content + +### The `response` object + +- `body`: the unparsed response body as a `string` +- `headers`: a map of HTTP response headers +- `status`: the HTTP status code + +### The `context` object + +TBD: used for stuff that is neither content, request, or 
response + +### The `error` object + +This object is only set when there has been an error during pipeline processing. Any step in the pipeline may set the `error` object. Subsequent steps should simply skip any processing if they encounter an `error` object. + +Alternatively, steps can attempt to handle the `error` object, for instance by generating a formatted error message and leaving it in `response.body`. + +The only known property in `error` is + +- `message`: the error message diff --git a/test/fixtures/example.html b/test/fixtures/example.html new file mode 100644 index 000000000..94833f359 --- /dev/null +++ b/test/fixtures/example.html @@ -0,0 +1,239 @@ +

Hypermedia Pipeline

+

+ This project provides helper functions and default implementations for + creating Hypermedia Processing Pipelines. +

+

+ It uses reducers and continuations to create a simple processing pipeline that + can pre-and post-process HTML, JSON, and other hypermedia. +

+

Status

+

+ codecovCircleCIGitHub licenseGitHub issuesnpmGreenkeeper badge +

+

Anatomy of a Pipeline

+

A pipeline consists of following main parts:

+ +

+ Each step of the pipeline is processing a single payload object, that will + slowly accumulate the return values of the functions above + through Object.assign. +

+

See below for the anatomy of a payload.

+

+ Typically, there is one pipeline for each content type supported and pipeline + are identified by file name, e.g. +

+ +

Building a Pipeline

+

+ A pipeline builder can be created by creating a CommonJS module that exports a + function pipe which accepts following arguments and returns a + Pipeline function. +

+ +

+ This project’s main entry provides a helper function for pipeline construction + and a few helper functions, so that a basic pipeline can be constructed like + this: +

+
// the pipeline itself const pipeline = require("@adobe/hypermedia-pipeline"); // helper functions and log const { adaptOWRequest, adaptOWResponse, log } = require('@adobe/hypermedia-pipeline/src/defaults/default.js'); module.exports.pipe = function(cont, params, secrets, logger = log) { logger.debug("Constructing Custom Pipeline"); return pipeline() .pre(adaptOWRequest) // optional: turns OpenWhisk-style arguments into a proper payload .once(cont) // required: execute the continuation function .post(adaptOWResponse) // optional: turns the Payload into an OpenWhisk-style response } 
+

+ In a typical pipeline, you will add additional processing steps as + .pre(require('some-module')) or as + .post(require('some-module')). +

+

The Main Function

+

+ The main function is typically a pure function that converts the + request, context, and + content properties of the payload into a + response object. +

+

+ In most scenarios, the main function is compiled from a template in a + templating language like HTL, JST, or JSX. +

+

+ Typically, there is one template (and thus one main function) for each content + variation of the file type. Content variations are identified by a selector + (the piece of the file name before the file extension, e.g. in + example.navigation.html the selector would be + navigation). If no selector is provided, the template is the + default template for the pipeline. +

+

Examples of possible template names include:

+ +

(Optional) The Wrapper Function

+

+ Sometimes it is neccessary to pre-process the payload in a template-specific + fashion. This wrapper function (often called “Pre-JS” for brevity sake) allows + the full transformation of the pipeline’s payload. +

+

+ Compared to the pipeline-specific pre-processing functions which handle the + request, content, and response, the focus of the wrapper function is + implementing business logic needed for the main template function. This allows + for a clean separation between: +

+
    +
  1. + presentation (in the main function, often expressed in declarative + templates) +
  2. +
  3. + business logic (in the wrapper function, often expressed in imperative code) +
  4. +
  5. + content-type specific implementation (in the pipeline, expressed in + functional code) +
  6. +
+

A simple implementation of a wrapper function would look like this:

+
// All wrapper functions must export `pre` // The functions takes following arguments: // - `cont` (the continuation function, i.e. the main template function) // - `payload` (the payload of the pipeline) module.exports.pre = (cont, payload) => { const {request, content, context, response} = payload; // modifying the payload content before invoking the main function content.hello = 'World'; const modifiedpayload = {request, content, context, response}; // invoking the main function with the new payload. Capturing the response // payload for further modification const responsepayload = cont(modifiedpayload); // Adding a value to the payload response const modifiedresponse = modifiedpayload.response; modifiedresponse.hello = 'World'; return Object.assign(modifiedpayload, modifiedresponse); } 
+

Pre-Processing Functions

+

Pre-Processing functions are meant to:

+ +

Post-Processing Functions

+

Post-Processing functions are meant to:

+ +

Anatomy of the Payload

+

Following main properties exist:

+ +

The request object

+ +

The content object

+ +

The response object

+ +

The context object

+

TBD: used for stuff that is neither content, request, or response

+

The error object

+

+ This object is only set when there has been an error during pipeline + processing. Any step in the pipeline may set the error object. + Subsequent steps should simply skip any processing if they encounter an + error object. +

+

+ Alternatively, steps can attempt to handle the error object, for + instance by generating a formatted error message and leaving it in + response.body. +

+

The only known property in error is

+ diff --git a/test/testCacheHelper.js b/test/testCacheHelper.js new file mode 100644 index 000000000..ae355a87b --- /dev/null +++ b/test/testCacheHelper.js @@ -0,0 +1,84 @@ +/* + * Copyright 2020 Adobe. All rights reserved. + * This file is licensed to you under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. You may obtain a copy + * of the License at http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software distributed under + * the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR REPRESENTATIONS + * OF ANY KIND, either express or implied. See the License for the specific language + * governing permissions and limitations under the License. + */ +/* eslint-env mocha */ +const assert = require('assert'); +const { directives, format, merge } = require('../src/utils/cache-helper'); + +describe('Cache Helper Tests (surrogate)', () => { + it('directive parses directives', () => { + assert.deepStrictEqual(directives('max-age=300'), { + 'max-age': 300, + }); + + assert.deepStrictEqual(directives('s-maxage=300, max-age=300'), { + 's-maxage': 300, + 'max-age': 300, + }); + + assert.deepStrictEqual(directives('s-maxage=300, max-age=300, public'), { + 's-maxage': 300, + 'max-age': 300, + public: true, + }); + + assert.deepStrictEqual(directives(''), {}); + + assert.deepStrictEqual(directives(), {}); + }); + + it('format formats directives', () => { + assert.equal(format({}), ''); + assert.equal(format(undefined), ''); + + assert.equal(format({ + 'max-age': 300, + public: true, + }), 'max-age=300, public'); + }); + + it('merge merges two directives', () => { + assert.deepEqual(merge({}, {}), {}); + + assert.deepEqual(merge({ + 'max-age': 300, + }, { + 's-maxage': 300, + }), { + 's-maxage': 300, + 'max-age': 300, + }); + + assert.deepEqual(merge({ + 'max-age': 300, + 's-maxage': 600, + }, { + 's-maxage': 300, + 'max-age': 600, + 
}), { + 's-maxage': 300, + 'max-age': 300, + }); + + assert.deepEqual(merge({ + public: true, + }, { + private: true, + }), { + public: true, + private: true, + }); + + assert.equal(merge('max-age=300, public', 'max-age=600'), 'max-age=300, public'); + + assert.equal(merge(), ''); + }); +}); diff --git a/test/testDataEmbeds.js b/test/testDataEmbeds.js new file mode 100644 index 000000000..9c959f789 --- /dev/null +++ b/test/testDataEmbeds.js @@ -0,0 +1,386 @@ +/* + * Copyright 2018 Adobe. All rights reserved. + * This file is licensed to you under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. You may obtain a copy + * of the License at http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software distributed under + * the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR REPRESENTATIONS + * OF ANY KIND, either express or implied. See the License for the specific language + * governing permissions and limitations under the License. 
+ */ +/* eslint-env mocha */ +const assert = require('assert'); +const path = require('path'); +const fs = require('fs-extra'); +const { dom: { assertEquivalentNode } } = require('@adobe/helix-shared'); +const { logging } = require('@adobe/helix-testutils'); +const nock = require('nock'); +const { JSDOM } = require('jsdom'); +const { pipe } = require('../src/defaults/html.pipe.js'); +const coerce = require('../src/utils/coerce-secrets'); +const Downloader = require('../src/utils/Downloader.js'); + +const params = { + path: '/hello.md', + __ow_method: 'get', + owner: 'trieloff', + __ow_headers: { + 'X-Forwarded-Port': '443', + 'X-CDN-Request-Id': '2a208a89-e071-44cf-aee9-220880da4c1e', + 'Fastly-Client': '1', + 'X-Forwarded-Host': 'runtime.adobe.io', + 'Upgrade-Insecure-Requests': '1', + Host: 'controller-a', + Connection: 'close', + 'Fastly-SSL': '1', + 'X-Request-Id': 'RUss5tPdgOfw74a68aNc24FeTipGpVfW', + 'X-Branch': 'master', + 'Accept-Language': 'en-US, en;q=0.9, de;q=0.8', + 'X-Forwarded-Proto': 'https', + 'Fastly-Orig-Accept-Encoding': 'gzip', + 'X-Varnish': '267021320', + DNT: '1', + 'X-Forwarded-For': + '192.147.117.11, 157.52.92.27, 23.235.46.33, 10.64.221.107', + 'X-Host': 'www.primordialsoup.life', + Accept: + 'text/html, application/xhtml+xml, application/xml;q=0.9, image/webp, image/apng, */*;q=0.8', + 'X-Real-IP': '10.64.221.107', + 'X-Forwarded-Server': 'cache-lcy19249-LCY, cache-iad2127-IAD', + 'Fastly-Client-IP': '192.147.117.11', + 'Perf-Br-Req-In': '1529585370.116', + 'X-Timer': 'S1529585370.068237,VS0,VS0', + 'Fastly-FF': + 'dc/x3e9z8KMmlHLQr8BEvVMmTcpl3y2YY5y6gjSJa3g=!LCY!cache-lcy19249-LCY, dc/x3e9z8KMmlHLQr8BEvVMmTcpl3y2YY5y6gjSJa3g=!LCY!cache-lcy19227-LCY, dc/x3e9z8KMmlHLQr8BEvVMmTcpl3y2YY5y6gjSJa3g=!IAD!cache-iad2127-IAD, dc/x3e9z8KMmlHLQr8BEvVMmTcpl3y2YY5y6gjSJa3g=!IAD!cache-iad2133-IAD', + 'Accept-Encoding': 'gzip', + 'User-Agent': + 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 
Safari/537.36', + }, + repo: 'soupdemo', + ref: 'master', + selector: 'md', +}; + +const secrets = { + REPO_RAW_ROOT: 'https://raw.githubusercontent.com/', + EMBED_WHITELIST: '*.youtube.com', + DATA_EMBED_WHITELIST: 'docs.google.com', +}; + +const logger = logging.createTestLogger({ + // tune this for debugging + level: 'debug', +}); + + +const crequest = { + extension: 'html', + url: '/test/test.html', +}; + +const doc1 = fs.readFileSync(path.resolve(__dirname, 'fixtures/example.md')).toString(); +const data1 = 'none'; +const html1 = fs.readFileSync(path.resolve(__dirname, 'fixtures/example.html')).toString(); + +const doc2 = fs.readFileSync(path.resolve(__dirname, 'fixtures/example-embeds.md')).toString(); +const data2 = [{ foo: 'bar' }]; +const html2 = fs.readFileSync(path.resolve(__dirname, 'fixtures/example-embeds.html')).toString(); + +describe('Integration Test with Data Embeds', () => { + afterEach(() => { + nock.restore(); + }); + + beforeEach(() => { + nock.restore(); + nock.activate(); + nock.cleanAll(); + }); + + async function testEmbeds(data, markdown, html, status = 200) { + nock('https://raw.githubusercontent.com') + .get('/adobe/test-repo/master/fstab.yaml') + .reply(() => [404]); + + nock('https://adobeioruntime.net') + .defaultReplyHeaders({ + 'Cache-Control': 'max-age=3600', + }) + .get(/.*/) + .reply(() => [status, data]); + + const action = coerce({ + request: { params }, + secrets, + logger, + }); + + const context = { + request: crequest, + content: { + body: markdown, + }, + }; + + action.downloader = new Downloader(context, action, { forceHttp1: true }); + action.logger = logger; + + const result = await pipe( + (mycontext) => { + if (!mycontext.response) { + mycontext.response = {}; + } + mycontext.response.status = 200; + mycontext.response.body = mycontext.content.document.body.innerHTML; + }, + context, + action, + ); + assert.equal(result.response.status, 200, result.error); + assert.equal(result.response.headers['Content-Type'], 
'text/html'); + assertEquivalentNode( + result.response.document.body, + new JSDOM(html).window.document.body, + ); + + return result; + } + + it('html.pipe handles non-JSON responses gracefully', async () => testEmbeds( + 'This is not a JSON document!', + ` +https://docs.google.com/spreadsheets/d/e/2PACX-1vQ78BeYUV4gFee4bSxjN8u86aV853LGYZlwv1jAUMZFnPn5TnIZteDJwjGr2GNu--zgnpTY1E_KHXcF/pubhtml + +1. My car: [![{{make}} {{model}}]({{image}})](cars-{{year}}.md)`, + `
    +
  1. My car:{{make}} {{model}}
  2. +
`, + 200, + )); + + it('html.pipe handles non-Array responses gracefully', async () => testEmbeds( + { thisis: 'not a JSON array' }, + ` +https://docs.google.com/spreadsheets/d/e/2PACX-1vQ78BeYUV4gFee4bSxjN8u86aV853LGYZlwv1jAUMZFnPn5TnIZteDJwjGr2GNu--zgnpTY1E_KHXcF/pubhtml + +1. My car: [![{{make}} {{model}}]({{image}})](cars-{{year}}.md)`, + `
    +
  1. My car:{{make}} {{model}}
  2. +
`, + 200, + )); + + it('data embeds generate a surrogate key', async () => { + const res1 = await testEmbeds( + [ + { + make: 'Nissan', model: 'Sunny', year: 1992, image: 'nissan.jpg', + }, + { + make: 'Renault', model: 'Scenic', year: 2000, image: 'renault.jpg', + }, + { + make: 'Honda', model: 'FR-V', year: 2005, image: 'honda.png', + }, + ], + ` +https://docs.google.com/spreadsheets/d/e/2PACX-1vQ78BeYUV4gFee4bSxjN8u86aV853LGYZlwv1jAUMZFnPn5TnIZteDJwjGr2GNu--zgnpTY1E_KHXcF/pubhtml + +1. My car: [![{{make}} {{model}}]({{image}})](cars-{{year}}.md)`, + `
    +
  1. My car:Nissan Sunny
  2. +
  3. My car:Renault Scenic
  4. +
  5. My car:Honda FR-V
  6. +
`, + ); + + const res2 = await testEmbeds( + [ + { + make: 'Nissan', model: 'Sunny', year: 1992, image: 'nissan.jpg', + }, + { + make: 'Renault', model: 'Scenic', year: 2000, image: 'renault.jpg', + }, + { + make: 'Honda', model: 'FR-V', year: 2005, image: 'honda.png', + }, + ], + ` +https://docs.google.com/spreadsheets/d/e/someotheruri/pubhtml + +1. My car: [![{{make}} {{model}}]({{image}})](cars-{{year}}.md)`, + `
    +
  1. My car:Nissan Sunny
  2. +
  3. My car:Renault Scenic
  4. +
  5. My car:Honda FR-V
  6. +
`, + ); + + assert.equal(res1.response.headers['Surrogate-Key'], 'PbTcuh0tIarmUOZM'); + assert.equal(res2.response.headers['Surrogate-Key'], 'IkqgcxcG5+q8/cOT'); + + assert.equal(res1.response.headers['Cache-Control'], 'max-age=3600'); + }); + + it('html.pipe processes data embeds in main document', async () => testEmbeds( + [ + { + make: 'Nissan', model: 'Sunny', year: 1992, image: 'nissan.jpg', + }, + { + make: 'Renault', model: 'Scenic', year: 2000, image: 'renault.jpg', + }, + { + make: 'Honda', model: 'FR-V', year: 2005, image: 'honda.png', + }, + ], + ` +https://docs.google.com/spreadsheets/d/e/2PACX-1vQ78BeYUV4gFee4bSxjN8u86aV853LGYZlwv1jAUMZFnPn5TnIZteDJwjGr2GNu--zgnpTY1E_KHXcF/pubhtml + +1. My car: [![{{make}} {{model}}]({{image}})](cars-{{year}}.md)`, + `
    +
  1. My car:Nissan Sunny
  2. +
  3. My car:Renault Scenic
  4. +
  5. My car:Honda FR-V
  6. +
`, + )); + + it('html.pipe processes data embeds with dot notation', async () => testEmbeds( + [ + { + make: 'Nissan', model: 'Sunny', year: 1992, image: 'nissan.jpg', mileage: { from: 100000, to: 150000 }, + }, + { + make: 'Renault', model: 'Scenic', year: 2000, image: 'renault.jpg', mileage: { from: 75000, to: 100000 }, + }, + { + make: 'Honda', model: 'FR-V', year: 2005, image: 'honda.png', mileage: { from: 50000, to: 150000 }, + }, + ], + ` +https://docs.google.com/spreadsheets/d/e/2PACX-1vQ78BeYUV4gFee4bSxjN8u86aV853LGYZlwv1jAUMZFnPn5TnIZteDJwjGr2GNu--zgnpTY1E_KHXcF/pubhtml + +# My {{make}} {{model}} + +![{{make}} {{model}}]({{image}}) + +Built in {{year}}. Driven from {{mileage.from}} km to {{mileage.to}} km. +`, + `

My Nissan Sunny

+

Nissan Sunny

+

Built in 1992. Driven from 100000 km to 150000 km.

+ +

My Renault Scenic

+

Renault Scenic

+

Built in 2000. Driven from 75000 km to 100000 km.

+ +

My Honda FR-V

+

Honda FR-V

+

Built in 2005. Driven from 50000 km to 150000 km.

`, + )); + + it('html.pipe processes data embeds in sections', async () => testEmbeds( + [ + { + make: 'Nissan', model: 'Sunny', year: 1992, image: 'nissan.jpg', + }, + { + make: 'Renault', model: 'Scenic', year: 2000, image: 'renault.jpg', + }, + { + make: 'Honda', model: 'FR-V', year: 2005, image: 'honda.png', + }, + ], + ` +## My Cars + +--- + +https://docs.google.com/spreadsheets/d/e/2PACX-1vQ78BeYUV4gFee4bSxjN8u86aV853LGYZlwv1jAUMZFnPn5TnIZteDJwjGr2GNu--zgnpTY1E_KHXcF/pubhtml + +- [![{{make}} {{model}}]({{image}})](cars-{{year}}.md) +`, + `
+

My Cars

+
+
+ +
`, + )); + + it('html.pipe handles error responses gracefully', async () => testEmbeds( + [ + { + make: 'Nissan', model: 'Sunny', year: 1992, image: 'nissan.jpg', + }, + { + make: 'Renault', model: 'Scenic', year: 2000, image: 'renault.jpg', + }, + { + make: 'Honda', model: 'FR-V', year: 2005, image: 'honda.png', + }, + ], + ` +https://docs.google.com/spreadsheets/d/e/2PACX-1vQ78BeYUV4gFee4bSxjN8u86aV853LGYZlwv1jAUMZFnPn5TnIZteDJwjGr2GNu--zgnpTY1E_KHXcF/pubhtml + +1. My car: [![{{make}} {{model}}]({{image}})](cars-{{year}}.md)`, + `
    +
  1. My car:{{make}} {{model}}
  2. +
`, + 404, + )).timeout(10000); + + it('embed processing works with big files, even when there are few embeds', async () => { + await testEmbeds(data2, doc2, html2); + await testEmbeds(data2, doc2, html2); + await testEmbeds(data2, doc2, html2); + await testEmbeds(data2, doc2, html2); + await testEmbeds(data2, doc2, html2); + await testEmbeds(data2, doc2, html2); + await testEmbeds(data2, doc2, html2); + await testEmbeds(data2, doc2, html2); + await testEmbeds(data2, doc2, html2); + await testEmbeds(data2, doc2, html2); + }).timeout(20000); + + it('embed processing works with big files, even when there are no embeds', async () => { + await testEmbeds(data1, doc1, html1); + await testEmbeds(data1, doc1, html1); + await testEmbeds(data1, doc1, html1); + await testEmbeds(data1, doc1, html1); + await testEmbeds(data1, doc1, html1); + await testEmbeds(data1, doc1, html1); + await testEmbeds(data1, doc1, html1); + await testEmbeds(data1, doc1, html1); + await testEmbeds(data1, doc1, html1); + await testEmbeds(data1, doc1, html1); + }).timeout(20000); + + it('html.pipe handles error responses gracefully', async () => testEmbeds( + [ + { + make: 'Nissan', model: 'Sunny', year: 1992, image: 'nissan.jpg', + }, + { + make: 'Renault', model: 'Scenic', year: 2000, image: 'renault.jpg', + }, + { + make: 'Honda', model: 'FR-V', year: 2005, image: 'honda.png', + }, + ], + ` +https://docs.google.com/spreadsheets/d/e/2PACX-1vQ78BeYUV4gFee4bSxjN8u86aV853LGYZlwv1jAUMZFnPn5TnIZteDJwjGr2GNu--zgnpTY1E_KHXcF/pubhtml + +1. My car: [![{{make}} {{model}}]({{image}})](cars-{{year}}.md)`, + `
    +
  1. My car:{{make}} {{model}}
  2. +
`, + 404, + )).timeout(10000); +});