Building a Character Recognition Pipeline with a frontend interface #33

Open
1 of 5 tasks
fyang3 opened this issue Jul 2, 2021 · 1 comment

fyang3 commented Jul 2, 2021

The character class in the gender_analysis toolkit automatically generates a character list (each character’s name, nicknames, and pronouns) from an input document, and takes in user feedback to produce a manually disambiguated list. The pipeline uses a human-AI collaboration approach that combines NLTK’s Named Entity Recognition (NER), Neuralcoref’s coreference resolution model, and a manual disambiguation interface. For the gender analysis web interface, we’d like to build a frontend that delivers the core functionality of the pipeline:

MVP:

  • A user selects a document through our document model
  • The backend pipeline automatically outputs a list of character names with their associated nicknames and pronoun probabilities, based on THIS_NOTEBOOK
  • A frontend disambiguation interface lets the user validate and correct the pipeline outputs through a dropdown list (or a similar design)
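
To make the MVP a bit more concrete, here is a minimal sketch of what one row of the pipeline output could look like and how a user correction from the dropdown might be applied. Everything in it is hypothetical: `CharacterCandidate`, `merge_candidates`, and the field names are not part of gender_analysis, the sample names are illustrative, and treating pronoun probabilities as normalized counts is an assumption rather than what the notebook necessarily does.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CharacterCandidate:
    """Hypothetical shape of one pipeline output row (not the toolkit's API)."""
    name: str
    nicknames: set = field(default_factory=set)
    pronoun_counts: Counter = field(default_factory=Counter)

    def pronoun_probabilities(self):
        # Assumption: probabilities are pronoun counts normalized to sum to 1.
        total = sum(self.pronoun_counts.values())
        if total == 0:
            return {}
        return {p: c / total for p, c in self.pronoun_counts.items()}


def merge_candidates(keep, absorb):
    """Apply a user correction saying `absorb` is the same character as `keep`."""
    return CharacterCandidate(
        name=keep.name,
        # The absorbed entry's name becomes a nickname of the kept character.
        nicknames=keep.nicknames | absorb.nicknames | {absorb.name},
        pronoun_counts=keep.pronoun_counts + absorb.pronoun_counts,
    )


# Illustrative data only: two entries the user decides are one character.
elizabeth = CharacterCandidate("Elizabeth Bennet", {"Lizzy"}, Counter({"she": 3}))
eliza = CharacterCandidate("Eliza", set(), Counter({"she": 1, "he": 1}))
merged = merge_candidates(elizabeth, eliza)
```

The frontend would only need to send back which entries to merge (or rename); the backend could then recompute the pronoun probabilities from the combined counts.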

Nice-to-have:

  • Output a resolved text based on the results of the character identification-disambiguation pipeline
  • Use the resolved text for further analyses similar to proximity analysis and frequency analysis
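
For the resolved-text idea, one simple starting point could be alias substitution: once the user has confirmed the character list, every nickname in the text is replaced with the canonical name. The sketch below assumes a plain alias-to-name mapping and a hypothetical helper `resolve_aliases`; it does not resolve pronouns, so it is a much narrower step than what Neuralcoref does.

```python
import re


def resolve_aliases(text, alias_to_name):
    """Replace each known alias with its canonical character name."""
    if not alias_to_name:
        return text
    # Try longer aliases first so a multi-word alias wins over its prefix.
    aliases = sorted(alias_to_name, key=len, reverse=True)
    # Word boundaries keep "Eliza" from matching inside "Elizabeth".
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(a) for a in aliases) + r")\b"
    )
    return pattern.sub(lambda m: alias_to_name[m.group(1)], text)


# Illustrative mapping, as it might come out of the disambiguation step.
resolved = resolve_aliases(
    "Lizzy smiled. Eliza left.",
    {"Lizzy": "Elizabeth Bennet", "Eliza": "Elizabeth Bennet"},
)
```

With every mention normalized to one name, frequency counts per character become a simple string count over the resolved text.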

fyang3 added the enhancement (New feature or request) label Jul 2, 2021

kenalba commented Jul 9, 2021
