Reintroduce anchor detection as configurable step #25

Merged: 1 commit, Mar 10, 2024
25 changes: 22 additions & 3 deletions README.md
@@ -8,15 +8,18 @@ $ npm install --save markdown-link-extractor
```
## API

### markdownLinkExtractor(markdown)
### markdownLinkExtractor(markdown, checkAnchors = false)

Parameters:

* `markdown` text in markdown format.
* `checkAnchors` whether heading anchors should also be extracted (default `false`).

Returns:

* an array containing the URLs from the links found.
* an object with the following properties:
  * `.anchors`: an array of anchor tag strings (e.g. `[ "#foo", "#bar" ]`); only filled if `checkAnchors` is set to `true`.
  * `.links`: an array containing the URLs from the links found.
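
As a rough illustration of how the returned object might be consumed, the sketch below checks in-document hash links against the collected anchors. The `result` object is hard-coded sample data in the documented shape, not real extractor output. `findDeadAnchors` is a hypothetical helper, and the sketch assumes that hash links such as `#foo` show up in `.links`.

```javascript
// Hypothetical helper: returns hash links that have no matching heading anchor.
// Assumes `result` has the documented shape: { links: string[], anchors: string[] }.
function findDeadAnchors(result) {
  const known = new Set(result.anchors);
  return result.links.filter(link => link.startsWith('#') && !known.has(link));
}

// Sample data standing in for the output of markdownLinkExtractor(markdown, true).
const result = {
  links: ['https://www.example.com', '#foo', '#missing'],
  anchors: ['#foo', '#bar'],
};

console.log(findDeadAnchors(result)); // [ '#missing' ]
```

External URLs are ignored here; only links that start with `#` are compared against the anchor list.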

## Examples

@@ -26,10 +29,26 @@ const markdownLinkExtractor = require('markdown-link-extractor');

const markdown = readFileSync('README.md', {encoding: 'utf8'});

const links = markdownLinkExtractor(markdown);
const { links } = markdownLinkExtractor(markdown);
links.forEach(link => console.log(link));
```

## Upgrading to v5.0.0

- anchor link extraction has been reintroduced. Be careful when upgrading from a version <`3.x`: the `extended` parameter was removed, and the `checkAnchors` parameter now occupies its position.

Code that looked like this:

```
const links = markdownLinkExtractor(str);
```

Should change to this:

```
const { links } = markdownLinkExtractor(str);
```
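
For code that has to work across this upgrade, one option is a small shim that accepts either return shape. The sketch below is only an illustration: `normalizeExtractorResult` is a hypothetical name, and the sample values stand in for real extractor output.

```javascript
// Hypothetical shim: accepts either the pre-v5 return value (an array of URLs)
// or the v5 return value ({ links, anchors }) and always yields the v5 shape.
function normalizeExtractorResult(result) {
  if (Array.isArray(result)) {
    return { links: result, anchors: [] };
  }
  return { links: result.links || [], anchors: result.anchors || [] };
}

// Sample values standing in for real extractor output.
const v4Result = ['https://www.example.com'];
const v5Result = { links: ['https://www.example.com'], anchors: ['#foo'] };

console.log(normalizeExtractorResult(v4Result).links);   // [ 'https://www.example.com' ]
console.log(normalizeExtractorResult(v5Result).anchors); // [ '#foo' ]
```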

## Upgrading to v4.0.0

- anchor link extraction no longer supported
24 changes: 22 additions & 2 deletions index.js
@@ -3,13 +3,33 @@
const { marked } = require('marked');
const htmlLinkExtractor = require('html-link-extractor');

module.exports = function markdownLinkExtractor(markdown, extended = false) {
module.exports = function markdownLinkExtractor(markdown, checkAnchors = false) {
Review comment (contributor, PR author):

Alternatively, we could pass in an object. That way we don't confuse people who are used to `extended`, and we stay future-proof if more configuration options are added.

  const anchors = [];
  if(checkAnchors) {
    const renderer = {
      heading(text, level, raw, slugger) {
        if (this.options.headerIds) {
          var id = this.options.headerPrefix + slugger.slug(raw);

          anchors.push(`#${id}`);

          return "<h" + level + " id=\"" + id + "\">" + text + "</h" + level + ">\n";
        } // ignore IDs


        return "<h" + level + ">" + text + "</h" + level + ">\n";
      }
    };

    marked.use({ renderer });
  }

  marked.setOptions({
    mangle: false, // don't escape autolinked email address with HTML character references.
  });


  const html = marked(markdown);
  const links = htmlLinkExtractor(html);
  return links;
  return { links, anchors };
};
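
The review comment on the new signature suggests accepting an options object instead of a boolean. A minimal sketch of how such an argument could be normalized while still accepting the boolean from this PR is shown below. `normalizeOptions` is a hypothetical helper and is not part of the merged change.

```javascript
// Hypothetical option normalizer; not part of the merged change.
// Accepts either the boolean used in this PR or an options object.
function normalizeOptions(arg = {}) {
  if (typeof arg === 'boolean') {
    return { checkAnchors: arg };
  }
  // Defaults first, caller-supplied values override them.
  return { checkAnchors: false, ...arg };
}

console.log(normalizeOptions());                       // { checkAnchors: false }
console.log(normalizeOptions(true));                   // { checkAnchors: true }
console.log(normalizeOptions({ checkAnchors: true })); // { checkAnchors: true }
```

With this shape, further options could be added later without changing the function signature again.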
36 changes: 17 additions & 19 deletions test/markdown-link-extractor.test.js
@@ -6,88 +6,86 @@ var markdownLinkExtractor = require('../');
describe('markdown-link-extractor', function () {

it('should return an empty array when no links are present', function () {
var links = markdownLinkExtractor('No links here');
var { links } = markdownLinkExtractor('No links here');
expect(links).to.be.an('array');
expect(links).to.have.length(0);
});

it('should extract links with emojis', function () {
var links = markdownLinkExtractor('**[📣 Foo!](https://www.example.com)**');
var { links } = markdownLinkExtractor('**[📣 Foo!](https://www.example.com)**');
expect(links).to.be.an('array');
expect(links).to.have.length(1);
expect(links[0]).to.be('https://www.example.com');
});

it('should extract a link in a [tag](http://example.com)', function () {
var links = markdownLinkExtractor('[example](http://www.example.com)');
var { links } = markdownLinkExtractor('[example](http://www.example.com)');
expect(links).to.be.an('array');
expect(links).to.have.length(1);
expect(links[0]).to.be('http://www.example.com');
});

it('should extract a hash link in [foobar](#foobar)', function () {
var links = markdownLinkExtractor('[foobar](#foobar)');
expect(links).to.be.an('array');
expect(links).to.have.length(1);
expect(links[0]).to.be('#foobar');
});

it('should extract a link from inline html <a href="http://foo.bar.test">foo</a>', function () {
var links = markdownLinkExtractor('<a href="http://foo.bar.test">foo</a>');
var { links } = markdownLinkExtractor('<a href="http://foo.bar.test">foo</a>');
expect(links).to.be.an('array');
expect(links).to.have.length(1);
expect(links[0]).to.be('http://foo.bar.test');
});

it('should extract mailto: link from <[email protected]>', function () {
var links = markdownLinkExtractor('<[email protected]>)');
var { links } = markdownLinkExtractor('<[email protected]>)');
expect(links).to.be.an('array');
expect(links).to.have.length(1);
expect(links[0]).to.be('mailto:[email protected]');
});

it('should extract a link in a with escaped braces [tag](http://example.com\(1\))', function () {
var links = markdownLinkExtractor('[XMLHttpRequest](http://msdn.microsoft.com/library/ie/ms535874\\(v=vs.85\\).aspx)');
var { links } = markdownLinkExtractor('[XMLHttpRequest](http://msdn.microsoft.com/library/ie/ms535874\\(v=vs.85\\).aspx)');
expect(links).to.be.an('array');
expect(links).to.have.length(1);
expect(links[0]).to.be('http://msdn.microsoft.com/library/ie/ms535874(v=vs.85).aspx');
});

it('should extract an image link in a ![tag](http://example.com/image.jpg)', function () {
var links = markdownLinkExtractor('![example](http://www.example.com/image.jpg)');
var { links } = markdownLinkExtractor('![example](http://www.example.com/image.jpg)');
expect(links).to.be.an('array');
expect(links).to.have.length(1);
expect(links[0]).to.be('http://www.example.com/image.jpg');
});

it('should extract an image link in a ![tag](foo/image.jpg)', function () {
var links = markdownLinkExtractor('![example](foo/image.jpg)');
var { links } = markdownLinkExtractor('![example](foo/image.jpg)');
expect(links).to.be.an('array');
expect(links).to.have.length(1);
expect(links[0]).to.be('foo/image.jpg');
});

it('should extract two image links', function () {
var links = markdownLinkExtractor('![img](http://www.example.test/hello.jpg) ![img](hello.jpg)');
var { links } = markdownLinkExtractor('![img](http://www.example.test/hello.jpg) ![img](hello.jpg)');
expect(links).to.be.an('array');
expect(links).to.have.length(2);
expect(links[0]).to.be('http://www.example.test/hello.jpg');
expect(links[1]).to.be('hello.jpg');
});

it('should extract a bare link http://example.com', function () {
var links = markdownLinkExtractor('This is a link: http://www.example.com');
var { links } = markdownLinkExtractor('This is a link: http://www.example.com');
expect(links).to.be.an('array');
expect(links).to.have.length(1);
expect(links[0]).to.be('http://www.example.com');
});

it('should extract multiple links', function () {
var links = markdownLinkExtractor('This is an [example](http://www.example.com). Hope it [works](http://www.example.com/works)');
var { links } = markdownLinkExtractor('This is an [example](http://www.example.com). Hope it [works](http://www.example.com/works)');
expect(links).to.be.an('array');
expect(links).to.have.length(2);
expect(links[0]).to.be('http://www.example.com');
expect(links[1]).to.be('http://www.example.com/works');
});

});
it('should collect anchor tags', function () {
var { anchors } = markdownLinkExtractor('# foo\n# foo', true);
expect(anchors).to.eql(['#foo','#foo-1']);
});

});
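
The last test expects two identical headings to produce `#foo` and `#foo-1`. The sketch below shows one way such de-duplication can work. It is a simplified stand-in written for illustration and is not marked's slugger: `createSlugger` is a hypothetical name, and the character filtering is an assumption.

```javascript
// Simplified stand-in for a heading slugger; NOT marked's implementation.
// Lowercases, trims, drops characters other than letters, digits, whitespace,
// hyphens and underscores, then turns runs of whitespace into hyphens.
function createSlugger() {
  const seen = new Map(); // base slug -> number of times it has been used
  return function slug(text) {
    const base = text
      .toLowerCase()
      .trim()
      .replace(/[^\p{L}\p{N}\s_-]/gu, '')
      .replace(/\s+/g, '-');
    const count = seen.get(base) || 0;
    seen.set(base, count + 1);
    // First occurrence keeps the plain slug; repeats get a numeric suffix.
    return count === 0 ? base : `${base}-${count}`;
  };
}

const slug = createSlugger();
const anchors = ['foo', 'foo', 'Hello World'].map(h => `#${slug(h)}`);
console.log(anchors); // [ '#foo', '#foo-1', '#hello-world' ]
```

This sketch does not guard against a heading whose text already equals a generated slug such as `foo-1`; a production slugger would need to handle that collision.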