K-RET is a flexible biomedical relation extraction (RE) system that allows any pre-trained BERT-based model (e.g., SciBERT or BioBERT) to be injected with knowledge in the form of knowledge graphs, drawn from a single source or from multiple sources simultaneously. This knowledge can be applied to all contextualizing tokens or only to the tokens of the candidate relation, for both single- and multi-token entities.
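At its core, the injection step attaches knowledge-graph triples to entity tokens in a sentence before it reaches the BERT-based encoder. The following is a minimal, illustrative sketch of that idea only; the helper name inject_knowledge and the dictionary-based toy graph are hypothetical and do not correspond to the actual K-RET code.

# Illustrative sketch of knowledge injection (hypothetical names, not the
# actual K-RET implementation). KG triples are appended after entity tokens
# before the sentence is fed to the BERT-based encoder.
toy_kg = {
    "aspirin": [("aspirin", "is_a", "anti-inflammatory drug")],
    "warfarin": [("warfarin", "is_a", "anticoagulant")],
}

def inject_knowledge(tokens, kg, entity_spans, entities_only=True):
    """Append KG triples after candidate-entity tokens (or after any
    matching token when entities_only is False)."""
    augmented = []
    for i, tok in enumerate(tokens):
        augmented.append(tok)
        if (not entities_only or i in entity_spans) and tok.lower() in kg:
            for _, rel, obj in kg[tok.lower()]:
                augmented += [rel] + obj.split()
    return augmented

print(inject_knowledge(["Aspirin", "interacts", "with", "warfarin"],
                       toy_kg, entity_spans={0, 3}))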
Our academic paper, which describes K-RET in detail, can be found here.
The uer folder corresponds to an updated version of the toolkit developed by Zhao et al. (2019), available here.
To make predictions on new data, you need both a baseline model and one of our pre-trained models. If you wish to train a new model on your own data, you only need a baseline model, which can be either of the models referenced in our academic paper.
After downloading a baseline model, for instance SciBERT, you need to convert it with the uer toolkit. To do so, run the following example, adapting the paths as needed for a different baseline model or directory layout.
cd K-RET/uer/
python3 convert_bert_from_huggingface_to_uer.py --input_model_path ../models/pre_trained_model_scibert/scibert_scivocab_uncased/pytorch_model.bin --output_model_path ../models/pre_trained_model_scibert/output_model.bin
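The same script applies to other baselines; for instance, a BioBERT conversion might look like the following (the BioBERT directory names are illustrative, so adjust them to wherever you unpacked the download):

python3 convert_bert_from_huggingface_to_uer.py --input_model_path ../models/pre_trained_model_biobert/biobert_v1.1_pubmed/pytorch_model.bin --output_model_path ../models/pre_trained_model_biobert/output_model.bin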
Available versions of the best-performing pre-trained models are as follows:
The training details are described in our academic paper.
Our project includes a code adaptation of the K-BERT model, available here. Use the K-RET image available on Docker Hub to set up the rest of the experimental environment.
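A typical environment setup with the image might look like this (the image name below is a placeholder; use the one listed on the Docker Hub page):

docker pull <dockerhub-user>/k-ret
docker run --gpus all -it -v $(pwd):/K-RET <dockerhub-user>/k-ret /bin/bash

To fine-tune K-RET on the DDI corpus with ChEBI as the knowledge source, run: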
CUDA_VISIBLE_DEVICES='1,2,3' python3 -u run_classification.py \
--pretrained_model_path ./models/pre_trained_model_scibert/output_model.bin \
--config_path ./models/pre_trained_model_scibert/scibert_scivocab_uncased/config.json \
--vocab_path ./models/pre_trained_model_scibert/scibert_scivocab_uncased/vocab.txt \
--train_path ./datasets/ddi_corpus/train.tsv \
--dev_path ./datasets/ddi_corpus/dev.tsv \
--test_path ./datasets/ddi_corpus/test.tsv \
--class_weights True \
--weights "[0.234, 3.377, 4.234, 6.535, 24.613]" \
--epochs_num 30 \
--batch_size 32 \
--kg_name "['ChEBI']" \
--output_model_path ./outputs/scibert_ddi.bin | tee ./outputs/scibert_ddi.log &
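The --weights values roughly track the inverse frequency of the five DDI classes. If you adapt K-RET to another corpus, comparable weights can be derived with a sketch like the one below (the label column position in train.tsv is an assumption, and inverse class frequency is one common weighting scheme, not necessarily how the paper's weights were computed):

# Sketch: per-class weights as inverse relative frequencies.
# Assumes the label sits in the first tab-separated column of train.tsv;
# adjust label_col to match your corpus.
from collections import Counter

def class_weights(train_path, label_col=0):
    counts = Counter()
    with open(train_path) as f:
        next(f)  # skip the header row; drop this line if your file has none
        for line in f:
            counts[line.rstrip("\n").split("\t")[label_col]] += 1
    total = sum(counts.values())
    n = len(counts)
    # weight_c = total / (n_classes * count_c): rarer classes get larger weights
    return {c: total / (n * k) for c, k in counts.items()}

print(class_weights("./datasets/ddi_corpus/train.tsv"))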
For more options, check run.sh; for additional configuration settings (e.g., max_number_entities and contextual_knowledge), check brain/config.py.
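The exact settings live in brain/config.py in the repository; as a rough illustration only, the two options mentioned above could take this shape (the values here are placeholders, not the shipped defaults):

# Illustrative placeholders -- consult brain/config.py for the real defaults.
max_number_entities = 2        # cap on KG entities injected per sentence
contextual_knowledge = False   # True: inject into contextualizing tokens too;
                               # False: only into the candidate-relation tokens

To evaluate a fine-tuned model on the test set, run: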
CUDA_VISIBLE_DEVICES='0' python3 -u run_classification.py \
--pretrained_model_path ./models/pre_trained_model_scibert/output_model.bin \
--config_path ./models/pre_trained_model_scibert/scibert_scivocab_uncased/config.json \
--vocab_path ./models/pre_trained_model_scibert/scibert_scivocab_uncased/vocab.txt \
--train_path ./datasets/ddi_corpus/train.tsv \
--dev_path ./datasets/ddi_corpus/dev.tsv \
--class_weights True \
--weights "[0.234, 3.377, 4.234, 6.535, 24.613]" \
--test_path ./datasets/ddi_corpus/test.tsv \
--epochs_num 30 \
--batch_size 32 \
--kg_name "[]" \
--testing True \
--to_test_model ./outputs/scibert_ddi.bin \
| tee ./outputs/ddi_results.log &
Finally, process the raw predictions in the log into a results file aligned with the test set:

python3 src/process_results.py ./outputs/ddi_results.log ./datasets/ddi_corpus/test.tsv ddi_results.tsv
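The real post-processing logic lives in src/process_results.py; as a hedged sketch, a step of this shape pairs each predicted label from the log with the corresponding test instance and writes them to a TSV (the one-label-per-line log format and the column layout are assumptions):

# Sketch of log-to-TSV post-processing (hypothetical log format: one
# predicted label per line; the real parsing lives in src/process_results.py).
import csv
import sys

def merge(log_path, test_path, out_path):
    with open(log_path) as f:
        preds = [line.strip() for line in f if line.strip()]
    with open(test_path) as f:
        rows = list(csv.reader(f, delimiter="\t"))[1:]  # skip the header row
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        for row, pred in zip(rows, preds):
            writer.writerow(row + [pred])

if __name__ == "__main__":
    merge(sys.argv[1], sys.argv[2], sys.argv[3])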
- Diana Sousa and Francisco M. Couto. 2022. K-RET: Knowledgeable Biomedical Relation Extraction System. Bioinformatics.