
Bi-LSTM - CRF Named Entity Recognition model for Korean (Keras)


YongTaek/KoreaNER

 
 


KoreaNER / Korean Named Entities (한국어 개체명)


Evaluation & Comparison:

Corpus: National Institute of Korean Language (ROK) NER Corpus (국립국어원 개체명 인식용 말뭉치)

Category  KoNER/코너 (2016)            Annie (2016)                 KoreaNER
          Precision  Recall  F-Score   Precision  Recall  F-Score   Precision  Recall  F-Score
DT        0.894      0.880   0.887     0.6373     0.7785  0.7009    0.94       0.94    0.94
LC        0.793      0.853   0.822     0.5822     0.8782  0.7002    0.71       0.76    0.73
OG        0.824      0.772   0.797     0.7624     0.7087  0.7346    0.73       0.63    0.68
PS        0.915      0.885   0.899     0.8834     0.6127  0.7236    0.80       0.75    0.78
TI        0.872      0.810   0.840     0.5441     0.8810  0.6727    0.98       0.89    0.93

source
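The F-Score values in the comparison appear to be the usual F1 score, the harmonic mean of precision and recall. The sketch below assumes that definition and shows how one F-Score entry can be recomputed from its precision and recall. The function name `f_score` is hypothetical and not part of this repository. Because the published figures are rounded, a recomputed value may differ from the listed one in the last digit.

```python
def f_score(precision: float, recall: float) -> float:
    """Return the F1 score (harmonic mean of precision and recall).

    Assumption: the F-Score column uses the balanced F1 definition.
    """
    if precision + recall == 0:
        # Avoid division by zero when nothing was predicted or found.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute the DT row for KoNER and for Annie from the comparison above.
koner_dt = round(f_score(0.894, 0.880), 3)
annie_dt = round(f_score(0.6373, 0.7785), 4)
print(koner_dt, annie_dt)
```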


Future improvements:

  • Add a gazetteer
  • Add specific features for PS/LC (person and location) entities
  • Provide a web API
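As an illustration of the gazetteer item above, here is a minimal sketch of how gazetteer lookups could be turned into per-token features for a sequence tagger. It is not taken from this repository: the function `gazetteer_features`, the `gaz_<TAG>` feature names, the `max_len` parameter, and the tiny example gazetteer are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple


def gazetteer_features(
    tokens: List[str],
    gazetteer: Dict[str, Set[Tuple[str, ...]]],
    max_len: int = 3,
) -> List[Dict[str, int]]:
    """Return one feature dict per token.

    Feature "gaz_<TAG>" is 1 when the token lies inside a token span
    (up to max_len tokens long) that is listed in the gazetteer for TAG.
    """
    features = [{f"gaz_{tag}": 0 for tag in gazetteer} for _ in tokens]
    for start in range(len(tokens)):
        for length in range(1, max_len + 1):
            end = start + length
            if end > len(tokens):
                break
            span = tuple(tokens[start:end])
            for tag, entries in gazetteer.items():
                if span in entries:
                    # Mark every token covered by the matched entry.
                    for i in range(start, end):
                        features[i][f"gaz_{tag}"] = 1
    return features


# Hypothetical gazetteer and sentence, for illustration only.
example_gazetteer = {
    "LC": {("서울",), ("부산",)},
    "OG": {("국립국어원",)},
}
tokens = ["국립국어원", "은", "서울", "에", "있다"]
feats = gazetteer_features(tokens, example_gazetteer)
print(feats)
```

Features like these would typically be concatenated with the word and character embeddings before the Bi-LSTM layer, though how they would be wired into this model is left open here.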

References:

Character-Aware Neural Language Models

Boosting Named Entity Recognition with Neural Character Embeddings

Attending To Characters In Neural Sequence Labeling Models

Neural Architectures for Named Entity Recognition

Bidirectional LSTM-CRF Models for Sequence Tagging

Character-level Convolutional Networks for Text Classification

A Syllable-based Technique for Word Embeddings of Korean Words

Open source projects (GitHub):

CharCNN Pytorch

word2vec-keras-in-gensim

anago

annie

DeepSequenceClassification

autumn_ner

kchar

deep-named-entity-recognition

Sequence Tagging with Tensorflow
