Commit

Add linkTitle syntax. Change the scope of code span escaping and its metadata behavior. Cleanup. Improve docs.
martincizek committed Dec 5, 2019
1 parent edece12 commit e00c32c
Showing 23 changed files with 339 additions and 90 deletions.
5 changes: 0 additions & 5 deletions .babelrc

This file was deleted.

1 change: 1 addition & 0 deletions .eslintignore
@@ -1,3 +1,4 @@
dist
tools/output
test/mytest*
test/browser/*
62 changes: 53 additions & 9 deletions README.md
@@ -73,11 +73,16 @@ The current full options are:
breakWww: false,
breaker: '<!-- -->',
allowedTransformations: [ 'entities', 'commonmark' ],
allowAddHttpScheme: false
allowAddHttpScheme: false,
inImage: false,
},
table: true, // default false
emphasisNonDelimiters: { // default true
maxIntrawordUnderscoreRun: false
maxIntrawordUnderscoreRun: false,
},
linkTitle: { // default true
delimiters: [ '"', '\'', '()' ],
alwaysEscapeDelimiters: [],
},
}
```
@@ -93,10 +98,12 @@ The predefined syntaxes are available as members of `GfmEscape.Syntax`:
the `isEncodable(input)` and `wouldBeUnaltered(input)` methods on the
`Syntax.cmAutolink` object.
- `codeSpan`: text rendered `` `here` ``.
- `linkTitle`: text rendered `[text](destination "here")` or
`[text](destination 'here')` or `[text](destination (here))`.
`input`: the string to escape. Please note that correct escaping is currently
only guaranteed when the input is trimmed and normalized in terms of whitespace.
The library does not perform whitespace normalizing on its own, as it is often
ensured by the source's origin, e.g. `textContent` of a normalized HTML DOM.
Manual normalizing can be done with `input.trim().replace(/[ \t\n\r]+/g, ' ')`.
If it is intended to keep the source somewhat organized in lines, the minimum
@@ -110,19 +117,41 @@ no defaults, i.e. they are falsy by default. The following contexts are available:
```js
{
inLink: true, // indicates suppressing nested links
inImage: true, // similar to inLink for ![this image text](img.png)
inTable: true, // indicates extra escaping of table contents
}
```
When escaping, `metadata` is an extra input-output parameter that collects
metadata about the actual escaping. Currently `metadata` is used for
`codeSpan` syntax, where two output parameters `delimiter` and `space` are passed:
`codeSpan` syntax and `linkTitle` syntax.
```js
const escaper = new GfmEscape({ table: true }, GfmEscape.Syntax.codeSpan);
const x = {};
const x = {}; // not necessary as the surrounding delimiter is always '`'
const context = { inTable: true };
const output = escaper.escape('`array|string`', context, x);
console.log(`${x.delimiter}${x.space}${output}${x.space}${x.delimiter}`);
// `` `array\|string` ``
const escaped = escaper.escape('`array|string`', context, x);
console.log(`\`${escaped}\``); // `` `array\|string` ``
console.log(`${x.extraBacktickString.length} backticks and ${x.extraSpace.length} spaces added.`);
// 1 backticks and 1 spaces added.
const linkTitleEscaper = new GfmEscape({}, GfmEscape.Syntax.linkTitle);
const linkMeta = {}; // needed as we let GfmEscape decide the surrounding delimiter
let title = linkTitleEscaper.escape('cool "link \'title\'"', {}, linkMeta);
console.log(`${linkMeta.startDelimiter}${title}${linkMeta.endDelimiter}`);
// (cool "link 'title'")
title = linkTitleEscaper.escape('average link title', {}, linkMeta);
console.log(`${linkMeta.startDelimiter}${title}${linkMeta.endDelimiter}`);
// "average link title"
const rigidLinkTitleEscaper = new GfmEscape({
linkTitle: {
delimiters: '"',
}
}, GfmEscape.Syntax.linkTitle);
// metadata not necessary, as the surrounding delimiter will always be '"'
title = rigidLinkTitleEscaper.escape('cool "link \'title\'"');
console.log(`"${title}"`);
// "cool \"link 'title'\""
```
#### Escaping options: `strikethrough`
@@ -167,6 +196,10 @@ Suboptions:
- `allowAddHttpScheme`: add `http://` scheme when a transformation needs it to
work. E.g. `*www.orchi.tech,*` would become `\*<http://www.orchi.tech>,\*`
with the `commonmark` transformation.
- `inImage`: suggest if extended autolink treatment should be applied within
image text. Although the CommonMark spec says links are interpreted and just
the stripped plain text part renders to the `alt` attribute, cmark-gfm actually
does not do it for extended autolinks, so the default is false.
_How to choose the options_:
1. Consider rendering details of the target Markdown flavor. Backtranslation
@@ -195,7 +228,7 @@ not to escape them. E.g. in `My account is joe_average.`, the underscore stays
unescaped as `joe_average`, not ~~`joe\_average`~~.
Suboptions:
* `maxIntrawordUnderscoreRun`: if defined, it sets the maximum length of intraword
- `maxIntrawordUnderscoreRun`: if defined, it sets the maximum length of intraword
underscores to be kept as is. E.g. for `1` and input `joe_average or joe__average`,
the output would be `joe_average or joe\_\_average`. This is helpful for some renderers
like Redcarpet. Defaults to `undefined`.
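The `maxIntrawordUnderscoreRun` rule described above can be illustrated with a minimal sketch. This is not the library's implementation: the function name `escapeIntrawordUnderscores` is hypothetical, word characters are assumed to be ASCII letters and digits only, and any non-numeric limit is assumed to mean "no limit".

```javascript
// Hypothetical illustration only; gfm-escape's real logic lives in its replaces.
// Assumption: "intraword" means the underscore run sits between ASCII letters/digits.
function escapeIntrawordUnderscores(input, maxRun) {
  return input.replace(/([A-Za-z0-9])(_+)(?=[A-Za-z0-9])/g, (match, before, run) => (
    // Assumption: a non-numeric maxRun means runs of any length are kept as is.
    typeof maxRun !== 'number' || run.length <= maxRun
      ? match
      : before + run.replace(/_/g, '\\_')
  ));
}

console.log(escapeIntrawordUnderscores('joe_average or joe__average', 1));
// joe_average or joe\_\_average
```

The sketch leaves non-intraword underscores untouched; in the library those are handled by the regular emphasis escaping.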
@@ -206,6 +239,16 @@ Defaults to `false`, i.e. table pipes are not escaped. If enabled, rendering of
delimiter rows is suppressed by escaping its pipes and all pipes are escaped when in
table context.
#### Escaping options: `linkTitle`
Suboptions:
- `delimiters`: array of allowed delimiters to be chosen from or a single delimiter.
Delimiters are `"`, `'` and `()`. When more delimiters are allowed, GfmEscape picks
the least interfering one. The picked delimiter is returned in metadata, as shown
in the example above.
- `alwaysEscapeDelimiters`: array of delimiters that are always escaped.
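How "least interfering" is decided is not spelled out in this diff. One plausible heuristic, sketched below purely as an assumption, is to count how many characters each allowed delimiter would force the escaper to escape and pick the cheapest, with ties going to the earlier entry. The helper `pickDelimiter` is hypothetical; only the `startDelimiter` and `endDelimiter` metadata names come from the README example.

```javascript
// Hypothetical heuristic, not taken from the gfm-escape source.
const DELIMITER_CHARS = { '"': /"/g, "'": /'/g, '()': /[()]/g };

function pickDelimiter(title, allowed = ['"', "'", '()']) {
  let best = null;
  let bestCost = Infinity;
  for (const delimiter of allowed) {
    // Cost = number of characters in the title that collide with this delimiter.
    const cost = (title.match(DELIMITER_CHARS[delimiter]) || []).length;
    if (cost < bestCost) {
      best = delimiter;
      bestCost = cost;
    }
  }
  // '()' has distinct opening and closing characters; quotes use the same one twice.
  return { startDelimiter: best[0], endDelimiter: best[best.length - 1] };
}

console.log(pickDelimiter('cool "link \'title\'"')); // { startDelimiter: '(', endDelimiter: ')' }
console.log(pickDelimiter('average link title'));    // { startDelimiter: '"', endDelimiter: '"' }
```

Under this assumption the two results agree with the delimiters shown in the README example above.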
## GFM escaping details
Terminology:
@@ -255,6 +298,7 @@ implementation of GFM Spec, we have found a few interesting details...
- `cmark_gfm-005`: Backslash escape in link destination, e.g.
`[foo](http://orchi.tech/foo\&lowbar;bar)` does not prevent entity reference
from interpreting in rendered HTML. We use entity encoding instead, i.e. `&amp;`.
The same applies to link titles.
## TODO
30 changes: 15 additions & 15 deletions package-lock.json


2 changes: 1 addition & 1 deletion package.json
@@ -18,7 +18,7 @@
"url": "git+https://github.com/orchitech/gfm-escape.git"
},
"dependencies": {
"union-replacer": "^1.1.0"
"union-replacer": "^1.2.0"
},
"devDependencies": {
"@babel/core": "^7.7.2",
17 changes: 15 additions & 2 deletions rollup.config.js
@@ -10,13 +10,26 @@ export default [
commonjs(),
resolve(),
babel({
exclude: ['node_modules/**', 'tools/output/**', 'src/utils/**'],
exclude: ['node_modules/**', 'tools/output/**'],
presets: [
[
'@babel/env', {
modules: false,
targets: {
browsers: '> 1%, IE 11, not op_mini all, not dead',
node: 8,
esmodules: false,
},
useBuiltIns: false,
},
],
],
}),
],
output: [
{ file: pkg.main, format: 'cjs' },
{ file: pkg.module, format: 'es' },
{ file: pkg.browser, name: 'gfmescape', format: 'umd' },
{ file: pkg.browser, name: 'GfmEscape', format: 'umd' },
],
},
];
11 changes: 8 additions & 3 deletions src/GfmEscape.js
@@ -3,13 +3,14 @@
* that aims to be truly complete and survive back-translation.
* @author Martin Cizek, Orchitech Solutions
* @see https://github.com/orchitech/gfm-escape#readme
* @license
* @license MIT
* @module
*/

import UnionReplacer from 'union-replacer';
import Syntax from './Syntax';
import defaultSetup from './defaultSetup';
import applyProcessors from './utils/applyProcessors';

class GfmEscape {
/**
@@ -24,6 +25,8 @@ class GfmEscape {
this.syntax = syntax;
this.opts = opts ? { ...opts } : {};
this.replacer = new UnionReplacer('gm');
this.preprocessors = [];
this.postprocessors = [];
setup(syntax).forEach(([replace, enabled]) => {
if (enabled) {
replace.call(this);
@@ -33,13 +36,15 @@ class GfmEscape {
this.cache = {};
}

escape(str, gfmContext = {}, metadata = {}) {
escape(input, gfmContext = {}, metadata = {}) {
const escapeCtx = {
escape: this,
gfmContext,
metadata,
};
return this.replacer.replace(str, escapeCtx);
let str = applyProcessors.call(escapeCtx, input, this.preprocessors);
str = this.replacer.replace(str, escapeCtx);
return applyProcessors.call(escapeCtx, str, this.postprocessors);
}
}
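
The new `escape()` flow runs preprocessors, then the replacer, then postprocessors, all sharing one escape context as `this`. The source of `utils/applyProcessors` is not part of the shown hunks, so the sketch below assumes it is a simple left-to-right reduction; `escapeSketch` and the two sample processors are hypothetical stand-ins.

```javascript
// Assumed shape of a helper like utils/applyProcessors (its source is not shown here).
function applyProcessors(str, processors) {
  // `this` is the per-call escape context; it is forwarded to every processor.
  return processors.reduce((acc, processor) => processor.call(this, acc), str);
}

// Hypothetical processors: one records metadata, one decorates the output.
const preprocessors = [function scan(input) {
  this.metadata.length = input.length;
  return input;
}];
const postprocessors = [function wrap(output) {
  return `[${output}]`;
}];

function escapeSketch(input, metadata = {}) {
  const ctx = { metadata };
  let str = applyProcessors.call(ctx, input, preprocessors);
  str = str.replace(/\|/g, '\\|'); // stand-in for this.replacer.replace(str, ctx)
  return applyProcessors.call(ctx, str, postprocessors);
}

const meta = {};
console.log(escapeSketch('a|b', meta)); // [a\|b]
console.log(meta.length); // 3
```

Preprocessors see the raw input and postprocessors see the escaped output, which lets a syntax scan the original text and write metadata before any escapes are added.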

4 changes: 3 additions & 1 deletion src/Syntax.js
@@ -1,16 +1,18 @@
import BaseSyntax from './syntax/BaseSyntax';
import TextSyntax from './syntax/TextSyntax';
import LinkDestinationSyntax from './syntax/LinkDestinationSyntax';
import LinkTitleSyntax from './syntax/LinkTitleSyntax';
import CmAutolinkSyntax from './syntax/CmAutolinkSyntax';
import CodeSpanSyntax from './syntax/CodeSpanSyntax';

const Syntax = BaseSyntax;

/**
* Enumeration of GFM syntaxes used within {@link gfmSetupDefault}.
* GFM syntaxes used within {@link gfmSetupDefault}.
*/
Syntax.text = new TextSyntax();
Syntax.linkDestination = new LinkDestinationSyntax();
Syntax.linkTitle = new LinkTitleSyntax();
Syntax.cmAutolink = new CmAutolinkSyntax();
Syntax.codeSpan = new CodeSpanSyntax();

7 changes: 5 additions & 2 deletions src/defaultSetup.js
@@ -5,13 +5,15 @@ import entityBackslashReplace from './replaces/entityBackslashReplace';
import entityEntityReplace from './replaces/entityEntityReplace';
import extAutolinkReplace from './replaces/extAutolinkReplace';
import inlineReplace from './replaces/inlineReplace';
import linkDestinationSpecialsReplace from './replaces/linkDestinationSpecialsReplace';
import linkDestinationReplace from './replaces/linkDestinationReplace';
import linkReplace from './replaces/linkReplace';
import linkTitleReplace from './replaces/linkTitleReplace';
import strikethroughReplace from './replaces/strikethroughReplace';
import tableDelimiterRowReplace from './replaces/tableDelimiterRowReplace';
import tablePipeReplace from './replaces/tablePipeReplace';
import CodeSpanSyntax from './syntax/CodeSpanSyntax';
import LinkDestinationSyntax from './syntax/LinkDestinationSyntax';
import LinkTitleSyntax from './syntax/LinkTitleSyntax';

const gfmSetupDefault = (s) => [
[codeSpanReplace, s.name === CodeSpanSyntax.name],
@@ -20,7 +22,8 @@ const gfmSetupDefault = (s) => [
[tableDelimiterRowReplace, s.blocksInterpreted],
[blockReplace, s.blocksInterpreted],
[tablePipeReplace, true],
[linkDestinationSpecialsReplace, s.name === LinkDestinationSyntax.name],
[linkDestinationReplace, s.name === LinkDestinationSyntax.name],
[linkTitleReplace, s.name === LinkTitleSyntax.name],
[linkReplace, s.isLink],
[entityEntityReplace, s.isLink],
[entityBackslashReplace, s.inlinesInterpreted],
40 changes: 26 additions & 14 deletions src/replaces/codeSpanReplace.js
@@ -1,21 +1,33 @@
const BACKTICK_RUN_RE = /(^`*)|(`+(?!.))|`+/;
const longestBacktickString = (str) => {
const m = str.match(/`+/g);
return m
? m.reduce((longest, current) => (
current.length > longest.length ? current : longest
), '')
: '';
};

function processBacktickRun({ match: [m, start, end] }) {
if (start !== undefined || m.length >= this.metadata.delimiter.length) {
this.metadata.delimiter = `${m}\``;
}
if (start !== undefined) {
this.metadata.space = start.length > 0 ? ' ' : '';
}
if (end && !this.metadata.space) {
this.metadata.space = ' ';
}
return m;
const SHOULD_ADD_SPACE_RE = /^`|^[ \r\n].*?[^ \r\n].*[ \r\n]$|`$/;

function scanDelimiters(input) {
const x = this.metadata;
x.extraBacktickString = longestBacktickString(input);
x.extraSpace = SHOULD_ADD_SPACE_RE.test(input) ? ' ' : '';
return input;
}

function addDelimiterExtras(output) {
const x = this.metadata;
const before = x.extraBacktickString + x.extraSpace;
const after = x.extraSpace + x.extraBacktickString;
return `${before}${output}${after}`;
}

/**
* Escape parentheses <, > and whitespace either as entites or in URL encoding.
* Adjust leading and trailing code span part according to contents and
* set metadata.
*/
export default function codeSpanReplace() {
this.replacer.addReplacement(BACKTICK_RUN_RE, processBacktickRun, true);
this.preprocessors.push(scanDelimiters);
this.postprocessors.unshift(addDelimiterExtras);
}
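
The effect of the new pre/postprocessor pair can be tried outside the library with a standalone sketch. It reuses the regular expression from the hunk above but is otherwise an illustration: `wrapCodeSpan` is a hypothetical helper that also adds the surrounding single backticks, which the caller does in the README example.

```javascript
// Standalone illustration; the function names are hypothetical, not library API.
function longestBacktickRun(str) {
  const runs = str.match(/`+/g) || [];
  return runs.reduce((longest, run) => (run.length > longest.length ? run : longest), '');
}

// Same pattern as SHOULD_ADD_SPACE_RE in the hunk above.
const NEEDS_SPACE_RE = /^`|^[ \r\n].*?[^ \r\n].*[ \r\n]$|`$/;

function wrapCodeSpan(content) {
  // The delimiter must be longer than any backtick run inside the content.
  const extraBackticks = longestBacktickRun(content);
  // Padding keeps a leading or trailing backtick from merging with the delimiter.
  const extraSpace = NEEDS_SPACE_RE.test(content) ? ' ' : '';
  const open = '`' + extraBackticks + extraSpace;
  const close = extraSpace + extraBackticks + '`';
  return open + content + close;
}

console.log(wrapCodeSpan('plain'));          // `plain`
console.log(wrapCodeSpan('`array|string`')); // `` `array|string` ``
```

Making the delimiter one backtick longer than the longest inner run is what lets content containing backticks survive rendering as a single code span.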
1 change: 0 additions & 1 deletion src/replaces/extAutolink/ExtWebAutolinkTransformers.js
@@ -45,7 +45,6 @@ class ExtWebAutolinkTransformers {
/**
* Escape characters in extended web autolink match according to the callers settings,
* so that it is interpreted correctly in GFM.
* XXX needed / broken?
* @param {String} str Link match portion to be escaped.
* @private
*/