glossary

glossary is a JavaScript module that extracts keywords from text (aka "term extraction" or "auto tagging"). It takes a string of text and returns an array of terms that are relevant to the content:

var glossary = require("glossary");

var keywords = glossary.extract("Her cake shop is the best in the business");

console.log(keywords); // ["cake", "shop", "cake shop", "business"]

glossary is standalone and uses part-of-speech analysis to extract the relevant terms.

install

For Node.js with npm:

npm install glossary

API

blacklisting

Use blacklist to remove unwanted terms from any extraction:

var glossary = require("glossary")({
   blacklist: ["library", "script", "api", "function"]
});

var keywords = glossary.extract("JavaScript color conversion library");

console.log(keywords); // ["color", "conversion"]

minimum frequency

Use minFreq to keep only the terms that occur at least that many times in the text:

var glossary = require("glossary")({ minFreq: 2 });

var keywords = glossary.extract("Kasey's pears are the best pears in Canada");

console.log(keywords); // ["pears"]
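To make the effect of minFreq concrete, here is a minimal sketch of frequency filtering over a list of candidate terms. It is not glossary's actual implementation: filterByFrequency is a hypothetical helper, and the candidate list is written by hand rather than produced by the tagger.

```javascript
// Hypothetical helper, not part of glossary: keep only the terms that
// occur at least minFreq times in a list of candidate terms.
function filterByFrequency(candidates, minFreq) {
  var counts = Object.create(null);
  candidates.forEach(function (term) {
    counts[term] = (counts[term] || 0) + 1;
  });
  return Object.keys(counts).filter(function (term) {
    return counts[term] >= minFreq;
  });
}

// Hand-written candidates, for illustration only.
var candidates = ["Kasey", "pears", "pears", "Canada"];
console.log(filterByFrequency(candidates, 2)); // ["pears"]
```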

sub-terms

Use collapse to remove terms that are sub-terms of other terms:

var glossary = require("glossary")({ collapse: true });

var keywords = glossary.extract("The Middle East crisis is getting worse");

console.log(keywords); // ["Middle East crisis"]
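One way to picture collapsing is as a filter that drops any term whose words appear, in order, inside a longer term. The sketch below illustrates that idea only; collapseSubTerms is a hypothetical helper, the input list is hand-written, and glossary's own logic may differ.

```javascript
// Hypothetical helper, not part of glossary: drop every term whose words
// appear as a contiguous run inside a different term. Padding with
// spaces makes the match respect word boundaries ("East" vs "Eastern").
function collapseSubTerms(terms) {
  return terms.filter(function (term) {
    return !terms.some(function (other) {
      return other !== term &&
        (" " + other + " ").indexOf(" " + term + " ") !== -1;
    });
  });
}

// Hand-written candidates, for illustration only.
var terms = ["Middle", "East", "crisis", "Middle East", "East crisis", "Middle East crisis"];
console.log(collapseSubTerms(terms)); // ["Middle East crisis"]
```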

verbose output

Use verbose to also get the count of each term:

var glossary = require("glossary")({ verbose: true });

var keywords = glossary.extract("The pears from the farm are good");

console.log(keywords); // [ { word: 'pears', count: 1 }, { word: 'farm', count: 1 } ]
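Because each verbose entry carries a count, the result can be ranked with ordinary array methods. The sketch below uses sample data shaped like the output shown above; the counts are made up, and topTerms is a hypothetical helper rather than part of glossary.

```javascript
// Sample data shaped like the verbose output shown above; the counts
// are made up for illustration.
var keywords = [
  { word: "farm", count: 1 },
  { word: "pears", count: 3 },
  { word: "orchard", count: 2 }
];

// Hypothetical helper: sort a copy by descending count, then return
// the n most frequent words.
function topTerms(keywords, n) {
  return keywords
    .slice()
    .sort(function (a, b) { return b.count - a.count; })
    .slice(0, n)
    .map(function (k) { return k.word; });
}

console.log(topTerms(keywords, 2)); // ["pears", "orchard"]
```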

propers

glossary uses jspos for POS tagging. It's inspired by the Python module topia.termextract.
