Nestor

Machine-augmented annotation for technical text

You can do it; your machine can help.

Purpose

NLP in technical domains requires context sensitivity. Whether in medical notes, engineering work orders, or social/behavioral coding, experts often use jargon and specialized vocabulary with overloaded meanings. Off-the-shelf NLP systems have great difficulty parsing this kind of text.

The common solution is to contextualize NLP models. For instance, medical NLP has been greatly advanced by the advent of labeled, bio-specific datasets, which have domain-relevant named-entity tags and vocabulary sets. Unfortunately for analysts of these types of data, creating resources like this is incredibly time-consuming. This is where Nestor comes in.

Quick Links

Nestor and all of its associated GUIs and projects are in the public domain (see the License). For more information and to provide feedback, please open an issue, submit a pull request, or email us at [email protected].

How does it work?

See the Getting Started page.

This application was originally designed to help manufacturers "tag" their maintenance work-order data according to the methods being researched by the Knowledge Extraction and Applications project at NIST. The goal is to help build context-rich labels in data sets that previously were too unstructured or filled with jargon to analyze. The current build is in very early alpha, so please be patient when using this application. If you have any questions, please do not hesitate to contact us (see Who are we?).

Why?

There is often a large amount of maintenance data already available for use in Smart Manufacturing systems, but in a currently unusable form: service tickets and maintenance work orders (MWOs). Nestor is a toolkit for using Natural Language Processing (NLP) with efficient user interaction to perform structured data extraction with minimal annotation time cost. For further reading, see [@sexton2017hybrid; @sharp2017toward].

Features

Documentation is contained in the /docs subdirectory.

Rank keywords found in your data by importance, saving you time
Suggest term unification by similarity (e.g. spelling), for quick review
Basic entity relationship builder, to assist assembling problem code and taxonomy definitions
Structured data output as named-entity tags, whether in readable (comma-separated) or computation-friendly (sparse-matrix) form
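
As a rough illustration of the ideas behind the features above, here is a minimal sketch in plain Python. It is not Nestor's API: the function names, the sample work orders, the smoothed TF-IDF scoring, and the use of `difflib` for spelling similarity are all assumptions made for this example, and Nestor's own implementation may differ.

```python
import math
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical work-order text, invented for illustration only.
work_orders = [
    "replaced hydraulic hose on press",
    "hydraulic leak at press, replaced hose",
    "hydrolic pump noisy, replaced pump",
    "operator reported leak near pump",
]


def tokenize(text):
    """Lowercase, split on whitespace, and strip trailing punctuation."""
    return [w.strip(".,").lower() for w in text.split() if w.strip(".,")]


def rank_terms(docs):
    """Rank terms by TF-IDF summed over all documents (an assumed scoring)."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = Counter()
    for doc in docs:
        for term, count in Counter(doc).items():
            idf = math.log((1 + n) / (1 + doc_freq[term])) + 1  # smoothed IDF
            scores[term] += (count / len(doc)) * idf
    return [term for term, _ in scores.most_common()]


def suggest_aliases(terms, threshold=0.8):
    """Suggest term pairs whose character-level similarity meets the threshold."""
    return [
        (a, b)
        for a, b in combinations(terms, 2)
        if SequenceMatcher(None, a, b).ratio() >= threshold
    ]


def tag_documents(docs, vocab):
    """Return tags per document as readable strings and as sparse coordinates."""
    index = {term: col for col, term in enumerate(vocab)}
    readable = [", ".join(sorted(t for t in set(doc) if t in index)) for doc in docs]
    sparse = sorted(
        (row, index[t]) for row, doc in enumerate(docs) for t in set(doc) if t in index
    )
    return readable, sparse


docs = [tokenize(text) for text in work_orders]
ranked = rank_terms(docs)                # most "important" terms first
suggestions = suggest_aliases(ranked)    # candidate spelling variants to review
readable, sparse = tag_documents(docs, ["hose", "leak", "pump"])

print(ranked[:5])
print(suggestions)
print(readable)  # one comma-separated tag string per work order
print(sparse)    # (document row, vocabulary column) pairs for nonzero entries
```

In this sketch a human reviewer would still confirm each suggested pair (for example a misspelling next to its correct form) before merging terms, which mirrors the quick-review workflow described above.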

Planned:

Customizable entity types and rules
Export to NER training formats
Command-line app and REST API

Who are we?

This toolkit is a part of the Knowledge Extraction and Application for Smart Manufacturing (KEA) project, within the Systems Integration Division at NIST.

Projects that use Nestor

Various Nestor GUIs
nestor exploratory data analysis (dashboard, viz, etc.)

Points of Contact

Email the dev team at [email protected]
Rachael Sexton (@rtbs-dev), Nestor Technical Lead
Michael Brundage, Principal Investigator

Why "KEA"?

The KEA project seeks to better frame data collection and transformation systems within smart manufacturing as collaborations between human experts and the machines they partner with, to more efficiently utilize the digital and human resources available to manufacturers. Kea (Nestor notabilis), on the other hand, are the world's only alpine parrots, finding their home on the South Island of New Zealand. Known for their intelligence and ability to solve puzzles through the use of tools, they will often work together to reach their goals, which is especially important in their harsh, mountainous habitat.

Development/Contribution Guidelines

More to come, but the primary requirement is the use of Poetry. Plugins such as taskipy and poetry-dynamic-versioning are installed as development dependencies through Poetry. If you are not using conda environments, poetry-dynamic-versioning may need to be installed into the global Python installation.

Notebooks should be kept git-friendly with Jupytext.
