BioELMo is a biomedical version of Embeddings from Language Models (ELMo), pre-trained on PubMed abstracts.

Andy-jqa/bioelmo

bioelmo

BioELMo is a biomedical version of Embeddings from Language Models (ELMo), pre-trained on PubMed abstracts. It was pre-trained on 10M recent PubMed abstracts (2.46B tokens in total) and achieves an average forward and backward perplexity of 31.37 on a held-out test set. As shown in our paper, BioELMo encodes biomedical entity-type and relational information well.
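For readers unfamiliar with the metric, the sketch below illustrates one way an averaged forward and backward perplexity can be computed. It assumes the two directions are combined by averaging their mean per-token negative log-likelihoods before exponentiating; the exact averaging used for the figure above is not spelled out here, so treat this as an illustration only. The function names and toy probabilities are hypothetical.

```python
import math


def mean_nll(token_probs):
    """Mean negative log-likelihood (in nats) of the observed tokens."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def averaged_perplexity(forward_probs, backward_probs):
    """Perplexity from the average of the forward and backward mean NLL.

    Assumption: the directions are averaged in log space, then exponentiated.
    """
    avg_loss = 0.5 * (mean_nll(forward_probs) + mean_nll(backward_probs))
    return math.exp(avg_loss)


# Toy probabilities that each direction assigns to the true tokens
# of a 4-token sentence (made-up values, not BioELMo outputs).
forward = [0.25, 0.25, 0.25, 0.25]
backward = [0.25, 0.25, 0.25, 0.25]
ppl = averaged_perplexity(forward, backward)
print(ppl)
```

A lower value means the language model is, on average, less surprised by held-out text.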

Download Weights

You can use BioELMo as a fixed-feature extractor for downstream tasks using these weights:

Download TensorFlow Checkpoints

You can further fine-tune BioELMo on other corpora using the TensorFlow checkpoints. See this for details.

Usage

BioELMo is used the same way as ELMo. Please visit https://github.com/allenai/bilm-tf for instructions.
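When ELMo-style embeddings serve as fixed features, the biLM layers are usually collapsed into one vector per token by a learned scalar mix: softmax-normalized layer weights and a global scale. In bilm-tf this step is handled by `weight_layers`. The sketch below reproduces only the arithmetic in plain Python, using made-up toy activations in place of real BioELMo outputs. The function names, the three-layer setup, and the 2-dimensional vectors are assumptions for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scalar_mix(layers, raw_weights, gamma=1.0):
    """Combine biLM layers into one vector per token.

    layers: list of L layers, each a list of token vectors of equal size.
    raw_weights: L unnormalized layer weights (learned downstream).
    gamma: global scale (learned downstream).
    """
    if len(layers) != len(raw_weights):
        raise ValueError("need exactly one weight per layer")
    s = softmax(raw_weights)
    n_tokens = len(layers[0])
    dim = len(layers[0][0])
    return [
        [
            gamma * sum(s[j] * layers[j][t][d] for j in range(len(layers)))
            for d in range(dim)
        ]
        for t in range(n_tokens)
    ]


# Toy activations: 3 layers, 2 tokens, 2 dimensions (hypothetical values).
layers = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
    [[3.0, 0.0], [0.0, 3.0]],
]
features = scalar_mix(layers, raw_weights=[0.0, 0.0, 0.0])
print(features)
```

With all raw weights equal, the mix reduces to a plain average of the layers. A downstream model would instead learn `raw_weights` and `gamma` while keeping the BioELMo weights frozen.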

Probing Experiments

Please visit https://github.com/Andy-jqa/probing_biomed_embeddings (currently under construction) for the code of the probing experiments described in our paper.

Citation

Please cite the following paper if you use BioELMo:

@inproceedings{jin2019probing,
  title={Probing Biomedical Embeddings from Language Models},
  author={Jin, Qiao and Dhingra, Bhuwan and Cohen, William and Lu, Xinghua},
  booktitle={Proceedings of the 3rd Workshop on Evaluating Vector Space Representations for NLP},
  pages={82--89},
  year={2019}
}
