feat: New option 'linking.limitByAlternatives' (#137)
See README.md for details.

* test: New baseline.
about-code authored Dec 25, 2020
1 parent 37c7b1e commit 98cb9d0
Showing 39 changed files with 308 additions and 88 deletions.
74 changes: 36 additions & 38 deletions README.md
@@ -35,9 +35,8 @@
- [Multiple Glossaries](#multiple-glossaries)
- [Sorting your glossaries](#sorting-your-glossaries)
- [Cross Linking](#cross-linking)
- [Term-Based Cross-Linking](#term-based-cross-linking)
- [Explicit Cross-Linking](#explicit-cross-linking)
- [Too many links](#too-many-links)
- [Term-Based Auto-Linking](#term-based-auto-linking)
- [Identifier-based Cross-Linking](#identifier-based-cross-linking)
- [Generating Files](#generating-files)
- [Index](#index)
- [Lists](#lists)
@@ -379,57 +378,47 @@ The i18n-object is passed *as is* to the collator function. Thus you can use add

> **Note:** With `file` being a glob `termHint` and `sort` are being ignored. So you may still declare a dedicated glossary item with a `file` *path* if you need these options.
**Too many links**
**Too many links?**

What may happen with term-based linking and *globs* is, that once a lot of headings become terms, there might be *too many links* generated.
If this is an issue for you explore options like `linking.mentions` or `linking.headingDepths` and the other `linking.*` [options] to control linkify behavior.
What may happen with term-based linking and *globs* is, that once a lot of headings become terms, there might be *too many links* generated. If this is an issue for you explore [`linking.*`][opt-linking] [options] like `linking.mentions`, `linking.limitByAlternatives` or `linking.headingDepths` to tweak linkify behavior.

### ID-based Cross-Links
### Identifier-based Cross-Linking

If the same section heading exists more than once - for example you have multiple pages being structured by a certain template - then you might want to link to one heading in particular.
<!--
While [aliases] sometimes might be the better option, they also require you to use that particular phrase whenever you refer to that
-->
Another feature we've added is support for [pandoc's heading ids][pandoc-heading-ids]. These allow you to assign anchor ids which do not depend on the heading phrase. That makes them more stable to use for references than auto-generated IDs (slugs).
If the same section heading exists more than once then you might want to link to one heading in particular.
While you should consider using an [alias] to make use of term-based auto-linking, there might be situations where you wish to have manually declared links.

**Since v5.0.0** we've added support for manual cross-linking through [pandoc's concept of heading ids][pandoc-heading-ids]. These allow you to assign identifiers which are more stable for referencing than auto-generated IDs derived from the heading phrase (slugs).

> **Note:** Pandoc's identifier syntax is not standardized in [CommonMark].
[Sample]: document `./pages/page1.md` declares a heading

*/pages/page1.md*
~~~md
## User Story {#p1-story}
~~~

with heading-id `#p1-story`. **Given that `#p1-story` is *unique* across all documents** you can have a link
```md
## User Story {#s-241}
```

with heading-id `#s-241`. **Given that `#s-241` is *unique* across all documents** you can use it as a link reference

~~~md
[any phrase](#p1-story)
~~~
```md
[any phrase](#s-241)
```

in any file being processed. [glossarify-md] will resolve the id into a relative path to `page1.md`:
in any file being processed and [glossarify-md] will resolve the relative path:

*/README.md*
~~~
[any phrase](./pages/page1.md#p1-story)
~~~

*/pages/page2.md*
~~~
[any phrase](./page1.md#p1-story)
~~~
```
[any phrase](./pages/page1.md#s-241)
```

<!--
If you are a technical writer and some pages follow a page template with the same standardized headings being used repeatedly, then this is similar to a term having dozens of definitions. By default, when [glossarify-md] finds the term in text it turns it into a link to one definition but adds superscript links to all of its definitions. You may not want this if you don't really care about terminology but term-based cross-linking, only. Therefore you can set a limit for superscript links or if that limit is negative, tell glossarify-md to not generate a link at all if there are more than -1 * limit definitions.
-->
*/pages/page2.md*

```
[any phrase](./page1.md#s-241)
```

<!--
- **`linking.mentions`**
allows to control whether `"all"` mentions of a heading phrase should be turned into a link or only the `"first-in-paragraph"`, for example. Just keep in mind that a *paragraph* is to be understood as a *Markdown paragraph* as specified in [CommonMark].
- **`linking.headingDepths`**
allows you to control which kind of headings to link up. A configuration `[1,2,3]` tells [glossarify-md] to only linkify headings of kind `# One`, `## Two` or `### Three` but not `#### Four`, `##### Five`, `###### Six`.
- **`glossaries.file`**
if you have used a glob pattern you may be able to define a file name pattern to reduce the number of files participating in cross-linking
-->
## Generating Files

### Index
@@ -793,6 +782,8 @@ need to add those.

#### `linking.paths`

[opt-linking]: #linkingpaths

- **Range:** `"relative" | "absolute"`

Whether to create absolute or relative link-urls to the glossary.
@@ -830,6 +821,13 @@ Default is `[2,3,4,5,6]`.

In case you have modified [`indexing.headingDepths`](#indexingheadingdepths), be aware that this option only makes sense if it is a *full subset* of the items in [`indexing.headingDepths`](#indexingheadingdepths).

#### `linking.limitByAlternatives`

- **Range:** `number` in -95 - +95
- **Since:** v5.0.0

If there are multiple definitions of a term or heading then this option can be used to limit the number of links to alternative definitions. When using a positive value, then the system creates links to alternative definitions *but no more than...*. If the number is negative then the numerical amount indicates to *not create a term-link at all once there are more than...* definitions of a term. This option may be helpful in certain cases where terms appear to have many alternative definitions but just because they are headings of pages that follow a certain page template and thus are repeatedly "defined".
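
To illustrate the semantics described above, here is a minimal JavaScript sketch. It is not the code of [glossarify-md]; the names `clampLimit` and `planLinks` are hypothetical. It assumes that values outside the documented range are clamped to ±95, and that a negative limit which is *not* exceeded leaves the number of links to alternative definitions uncapped (the text above does not spell that case out).

```js
// Hypothetical helpers for illustration only; not part of glossarify-md's API.

const MAX_LIMIT = 95; // documented range is -95 to +95

// Clamp out-of-range values to the nearest bound, keeping the sign.
function clampLimit(limit) {
    return Math.abs(limit) > MAX_LIMIT ? Math.sign(limit) * MAX_LIMIT : limit;
}

// Decide what to render for a term that has `countDefinitions` definitions.
function planLinks(countDefinitions, limit) {
    const max = clampLimit(limit);
    if (max < 0 && countDefinitions > Math.abs(max)) {
        // Negative limit exceeded: do not create a term-link at all.
        return { linkTerm: false, alternativeLinks: 0, truncated: false };
    }
    if (countDefinitions <= 1) {
        // A single definition has no alternatives to link to.
        return { linkTerm: true, alternativeLinks: 0, truncated: false };
    }
    // Assumption: a negative limit that is not exceeded does not cap the links.
    const cap = max < 0 ? countDefinitions : max;
    return {
        linkTerm: true,
        alternativeLinks: Math.min(countDefinitions, cap),
        truncated: countDefinitions > cap
    };
}
```

For example, `planLinks(12, 10)` links the term and reports ten links to alternative definitions with `truncated: true`, whereas `planLinks(6, -5)` reports `linkTerm: false`.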

#### `outDir`

- **Range:** `string`
13 changes: 10 additions & 3 deletions conf/v5/schema.json
@@ -81,7 +81,8 @@
"baseUrl": "",
"paths": "relative",
"mentions": "all",
"headingDepths": [2,3,4,5,6]
"headingDepths": [2,3,4,5,6],
"limitByAlternatives": 10
}
},
"outDir": {
@@ -173,15 +174,15 @@
"properties": {
"groupByHeadingDepth": {
"description": "Level of detail by which to group occurrences of terms or syntactic elements in generated files (Range [min, max]: [0, 6]). For example, use 0 to not group at all; 1 to group things at the level of document titles, etc. Configures the indexer. The option affects any files generated from the internal AST node index.",
"type": "number",
"type": "integer",
"minimum": 0,
"maximum": 6
},
"headingDepths": {
"description": "An array with items in a range of 1-6 denoting the depths of headings that should be indexed. Excluding some headings from indexing is mostly a performance optimization, only. You can just remove the option from your config or stick with defaults. Change defaults only if you are sure that you do not want to have cross-document links onto headings at a particular depth, no matter whether the link was created automatically or written manually.\nThe relation to 'linking.headingDepths' is that *this* is about \"knowing the link targets\" whereas the other is about \"creating links\" ...based on knowledge about link targets. Yet, indexing of headings is further required for existing (cross-)links like `[foo](#heading-id)` and resolving the path to where a heading with such id was declared, so for example `[foo](../document.md#heading-id)`.",
"type": "array",
"items": {
"type": "number",
"type": "integer",
"minimum": 1,
"maximum": 6
}
@@ -251,6 +252,12 @@
"minimum": 1,
"maximum": 6
}
},
"limitByAlternatives": {
"description": "If there are multiple definitions of a term or heading then this option can be used to limit the number of links to alternative definitions. When using a positive value, then the system creates links to alternative definitions *but no more than...*. If the number is negative then the numerical amount indicates to *not create a term-link at all once there are more than...* definitions of a term. This option may be helpful in certain cases where terms appear to have many alternative definitions but just because they are headings of pages that follow a certain page template and thus are repeatedly defined.",
"type": "integer",
"minimum": -95,
"maximum": 95
}
}
},
51 changes: 23 additions & 28 deletions doc/templates/README.md
@@ -353,56 +353,42 @@ The i18n-object is passed *as is* to the collator function. Thus you can use add
**Too many links?**

What may happen with term-based linking and *globs* is, that once a lot of headings become terms, there might be *too many links* generated.
If this is an issue for you explore options like `linking.mentions` or `linking.headingDepths` and the other `linking.*` [options] to control linkify behavior.
What may happen with term-based linking and *globs* is, that once a lot of headings become terms, there might be *too many links* generated. If this is an issue for you explore [`linking.*`][opt-linking] [options] like `linking.mentions`, `linking.limitByAlternatives` or `linking.headingDepths` to tweak linkify behavior.

### ID-based Cross-Links
### Identifier-based Cross-Linking

If the same section heading exists more than once - for example you have multiple pages being structured by a certain template - then you might want to link to one heading in particular.
<!--
While [aliases] sometimes might be the better option, they also require you to use that particular phrase whenever you refer to that
-->
Another feature we've added is support for [pandoc's heading ids][pandoc-heading-ids]. These allow you to assign anchor ids which do not depend on the heading phrase. That makes them more stable to use for references than auto-generated IDs (slugs).
If the same section heading exists more than once then you might want to link to one heading in particular.
While you should consider using an [alias] to make use of term-based auto-linking, there might be situations where you wish to have manually declared links.

**Since v5.0.0** we've added support for manual cross-linking through [pandoc's concept of heading ids][pandoc-heading-ids]. These allow you to assign identifiers which are more stable for referencing than auto-generated IDs derived from the heading phrase (slugs).

> **Note:** Pandoc's identifier syntax is not standardized in [CommonMark].
[Sample]: document `./pages/page1.md` declares a heading

*/pages/page1.md*
~~~md
## User Story {#p1-story}
## User Story {#s-241}
~~~

with heading-id `#p1-story`. **Given that `#p1-story` is *unique* across all documents** you can have a link
with heading-id `#s-241`. **Given that `#s-241` is *unique* across all documents** you can use it as a link reference

~~~md
[any phrase](#p1-story)
[any phrase](#s-241)
~~~

in any file being processed. [glossarify-md] will resolve the id into a relative path to `page1.md`:
in any file being processed and [glossarify-md] will resolve the relative path:

*/README.md*
~~~
[any phrase](./pages/page1.md#p1-story)
[any phrase](./pages/page1.md#s-241)
~~~

*/pages/page2.md*
~~~
[any phrase](./page1.md#p1-story)
[any phrase](./page1.md#s-241)
~~~

<!--
If you are a technical writer and some pages follow a page template with the same standardized headings being used repeatedly, then this is similar to a term having dozens of definitions. By default, when [glossarify-md] finds the term in text it turns it into a link to one definition but adds superscript links to all of its definitions. You may not want this if you don't really care about terminology but term-based cross-linking, only. Therefore you can set a limit for superscript links or if that limit is negative, tell glossarify-md to not generate a link at all if there are more than -1 * limit definitions.
-->


<!--
- **`linking.mentions`**
allows to control whether `"all"` mentions of a heading phrase should be turned into a link or only the `"first-in-paragraph"`, for example. Just keep in mind that a *paragraph* is to be understood as a *Markdown paragraph* as specified in [CommonMark].
- **`linking.headingDepths`**
allows you to control which kind of headings to link up. A configuration `[1,2,3]` tells [glossarify-md] to only linkify headings of kind `# One`, `## Two` or `### Three` but not `#### Four`, `##### Five`, `###### Six`.
- **`glossaries.file`**
if you have used a glob pattern you may be able to define a file name pattern to reduce the number of files participating in cross-linking
-->

## Generating Files

### Index
@@ -761,6 +747,8 @@ need to add those.

#### `linking.paths`

[opt-linking]: #linkingpaths

- **Range:** `"relative" | "absolute"`

Whether to create absolute or relative link-urls to the glossary.
@@ -798,6 +786,13 @@ Default is `[2,3,4,5,6]`.

In case you have modified [`indexing.headingDepths`](#indexingheadingdepths), be aware that this option only makes sense if it is a *full subset* of the items in [`indexing.headingDepths`](#indexingheadingdepths).

#### `linking.limitByAlternatives`

- **Range:** `number` in -95 - +95
- **Since:** v5.0.0

If there are multiple definitions of a term or heading then this option can be used to limit the number of links to alternative definitions. When using a positive value, then the system creates links to alternative definitions *but no more than...*. If the number is negative then the numerical amount indicates to *not create a term-link at all once there are more than...* definitions of a term. This option may be helpful in certain cases where terms appear to have many alternative definitions but just because they are headings of pages that follow a certain page template and thus are repeatedly "defined".

#### `outDir`

- **Range:** `string`
20 changes: 16 additions & 4 deletions lib/linker.js
@@ -123,7 +123,10 @@ function getLinkifyVisitor(context, vFile, indexEntriesMap) {
*/
function linkifyAst(paragraphNode, headingNode, indexEntrySet, context, vFile) {
const {linking} = context.conf;
const hasMultipleDefs = indexEntrySet.length > 1;
const countDefinitions = indexEntrySet.length;
const maxAlternativeDefs = linking.limitByAlternatives;
const hasAlternativeDefs = countDefinitions > 1;
const hasMoreAlternativeDefs = countDefinitions > maxAlternativeDefs;
const termNodes = indexEntrySet.map(e => e.node);
const termNode = termNodes[0];
const maxReplacements = (linking.mentions === "first-in-paragraph") ? 1 : Infinity;
@@ -134,16 +137,25 @@ function linkifyAst(paragraphNode, headingNode, indexEntrySet, context, vFile) {
return;
}

if (hasMultipleDefs) {
linkNodes = termNodes.map(getSuperscriptLinkMapper(context, vFile));
if (maxAlternativeDefs < 0 && countDefinitions > Math.abs(maxAlternativeDefs)) {
return;
}
if (hasAlternativeDefs) {
linkNodes = termNodes
.filter((e, idx) => idx < maxAlternativeDefs)
.map(getSuperscriptLinkMapper(context, vFile));
if (hasMoreAlternativeDefs) {
linkNodes.push(html("<sup>...</sup>"));
}
}

// Search for term and insert a term occurrence node
// Omit termHint if there are multiple definitions and superscript links
const newParagraphNode = findReplace(paragraphNode, termNode.regex, (linkText) => {
if (countReplacements >= maxReplacements) {
return text(linkText);
}
if (!hasMultipleDefs && termNode.hint) {
if (!hasAlternativeDefs && termNode.hint) {
if (/\$\{term\}/.test(termNode.hint)) {
linkText = termNode.hint.replace("${term}", linkText);
} else {
11 changes: 10 additions & 1 deletion lib/model/context.js
@@ -15,10 +15,19 @@ class Context {
this.conf = conf;

conf.baseDir = toForwardSlash(conf.baseDir || "");
conf.outDir = toForwardSlash(conf.outDir || "");
conf.outDir = toForwardSlash(conf.outDir || "");
conf.glossaries = conf.glossaries.map(conf => new Glossary(conf));

// Excluding certain headingDepths in (cross-)linking
conf.indexing.headingDepths = arrayToMap(conf.indexing.headingDepths);
conf.linking.headingDepths = arrayToMap(conf.linking.headingDepths);

// limit link creation for alternative definitions
const altLinks = conf.linking.limitByAlternatives;
if (Math.abs(altLinks) > 95) {
conf.linking.limitByAlternatives = Math.sign(altLinks) * 95;
}

if (conf.generateFiles.listOfFigures) {
conf.generateFiles.listOfFigures = Object.assign({ class: "figure", title: "Figures"}, conf.generateFiles.listOfFigures);
conf.generateFiles.listOf.push(conf.generateFiles.listOfFigures);
@@ -0,0 +1,5 @@
# Testing `linking.limitByAlternatives`

## Ambiguous

Definition 1
@@ -0,0 +1,5 @@
# Testing `linking.limitByAlternatives`

## Ambiguous

Definition 2
@@ -0,0 +1,5 @@
# Testing `linking.limitByAlternatives`

## Ambiguous

Definition 3
@@ -0,0 +1,5 @@
# Testing `linking.limitByAlternatives`

## Ambiguous

Definition 4
@@ -0,0 +1,5 @@
# Testing `linking.limitByAlternatives`

## Ambiguous

Definition 5
@@ -0,0 +1,5 @@
# Testing `linking.limitByAlternatives`

## Ambiguous

Definition 6
@@ -0,0 +1,5 @@
# Document

GIVEN option `linking.limitByAlternatives: -5` WITH a negative limit
AND a term *Ambiguous* with more definitions than the mathematical amount of the limit
THEN the term MUST NOT be linkified at all.
@@ -0,0 +1,11 @@
{
"$schema": "../../../../../conf/v5/schema.json",
"baseDir": ".",
"outDir": "../../../../output-actual/config-linking/limitByAlternatives/negative",
"glossaries": [
{"file": "./*.md" }
],
"linking": {
"limitByAlternatives": -5
}
}
@@ -0,0 +1,5 @@
# Testing `linking.limitByAlternatives`

## Ambiguous

Definition 1
@@ -0,0 +1,5 @@
# Testing `linking.limitByAlternatives`

## Ambiguous

Definition 2
@@ -0,0 +1,5 @@
# Testing `linking.limitByAlternatives`

## Ambiguous

Definition 3
@@ -0,0 +1,5 @@
# Testing `linking.limitByAlternatives`

## Ambiguous

Definition 4
@@ -0,0 +1,5 @@
# Testing `linking.limitByAlternatives`

## Ambiguous

Definition 5