This repository contains the code for reproducing the paper ECG-Chat: A Large ECG-Language Model for Cardiac Disease Diagnosis [Paper]
Our model uses five public datasets, which can be downloaded from:
- MIMIC-IV-ECG
- Chapman-Shaoxing-Ningbo (CSD)
- Shandong Provincial Hospital (SPH)
- PTB-XL
- CPSC2018 (if it is not accessible, the training dataset can be downloaded here)
We provide preprocessing code for these datasets, which covers extracting the waveform data, converting it to WFDB format, and so on. Taking the SPH dataset as an example:
python ./data/preprocess/preprocess_sph.py --data-dir /path/to/sph
You can also copy the .csv files in data/ to your dataset folders. However, you still need to convert the SPH and CPSC2018 data to WFDB format.
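For readers who want to see what a WFDB conversion involves, here is a minimal sketch. It is not the repository's preprocessing code. It assumes the ECG has already been loaded into a NumPy array with 12 leads, that the signal is in millivolts, and that the sampling rate is 500 Hz; check all three against the dataset documentation. The helper names `to_wfdb_layout` and `write_wfdb_record` are hypothetical, and the third-party `wfdb` package must be installed for the write step.

```python
import numpy as np

# Conventional 12-lead order; an assumption, verify against the dataset.
LEAD_NAMES = ["I", "II", "III", "aVR", "aVL", "aVF",
              "V1", "V2", "V3", "V4", "V5", "V6"]


def to_wfdb_layout(signal, n_leads=12):
    """Return a float64 array shaped (n_samples, n_leads), as WFDB expects."""
    arr = np.asarray(signal, dtype=np.float64)
    if arr.ndim != 2:
        raise ValueError("expected a 2-D array of ECG samples")
    if arr.shape[1] == n_leads:
        return arr
    if arr.shape[0] == n_leads:
        # Input was (n_leads, n_samples); transpose and make it contiguous.
        return np.ascontiguousarray(arr.T)
    raise ValueError(f"no axis of length {n_leads} in shape {arr.shape}")


def write_wfdb_record(record_name, signal, fs=500, write_dir="."):
    """Write one ECG as a WFDB header/signal pair (.hea and .dat)."""
    import wfdb  # third-party package: pip install wfdb

    p_signal = to_wfdb_layout(signal)
    n_leads = p_signal.shape[1]
    wfdb.wrsamp(
        record_name,
        fs=fs,                      # assumed sampling rate in Hz
        units=["mV"] * n_leads,     # assumed physical unit
        sig_name=LEAD_NAMES,
        p_signal=p_signal,
        fmt=["16"] * n_leads,       # 16-bit storage format
        write_dir=write_dir,
    )
```

A call such as `write_wfdb_record("example_0001", ecg_array, write_dir="/path/to/sph/wfdb")` would then produce `example_0001.hea` and `example_0001.dat`; the record name and output directory here are placeholders.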
The translated version of the PTB-XL dataset was obtained from Fairseq-signals.
The ECG-Instruct datasets of ECG-Chat are provided in llava/playground/data/. ecg_instruct_45k.json is the combination of diagnosis.json and conversation.json. We also share the prompts used to build these two datasets in llava/playground/data/prompts/.
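To illustrate how two instruction files can be combined into one, here is a minimal sketch using only the standard library. It assumes each file holds a top-level JSON list of records; the authors may have built ecg_instruct_45k.json differently (for example, with shuffling or filtering). The function name `merge_instruction_files` is hypothetical.

```python
import json


def merge_instruction_files(paths, out_path):
    """Concatenate JSON lists from `paths` and write them to `out_path`.

    Returns the number of records written.
    """
    merged = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            records = json.load(f)
        # Assumption: every input file is a top-level JSON list.
        if not isinstance(records, list):
            raise ValueError(f"{path} does not contain a JSON list")
        merged.extend(records)
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(merged, f, ensure_ascii=False, indent=2)
    return len(merged)
```

For example, `merge_instruction_files(["diagnosis.json", "conversation.json"], "merged.json")` writes the records of both files, in order, to `merged.json`.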
Because of their large size, the file new_record_list.csv for MIMIC-IV-ECG and the pretraining dataset pretrain_mimic.json for our project are hosted separately and can be downloaded here.
To train and evaluate the ECG CoCa model, please use the scripts in open_clip/.
To pretrain and fine-tune the ECG-Chat model, please use the scripts in llava/.
The code for report generation evaluation and RAG is coming soon.
The implementation of the ECG data augmentation methods comes from torch_ecg. We also used the CKEPE prompt proposed in MERL to evaluate the zero-shot classification ability of our model.
If you find our work useful for your research, please cite it using this BibTeX:
@misc{zhao2024ecgchatlargeecglanguagemodel,
title={ECG-Chat: A Large ECG-Language Model for Cardiac Disease Diagnosis},
author={Yubao Zhao and Tian Zhang and Xu Wang and Puyu Han and Tong Chen and Linlin Huang and Youzhu Jin and Jiaju Kang},
year={2024},
eprint={2408.08849},
archivePrefix={arXiv},
primaryClass={eess.SP},
url={https://arxiv.org/abs/2408.08849},
}
If you have questions about this repo, please submit an issue or contact [email protected].