
YubaoZhao/ECG-Chat

ECG-Chat: A Large ECG-Language Model for Cardiac Disease Diagnosis

This repository contains the code for reproducing the paper ECG-Chat: A Large ECG-Language Model for Cardiac Disease Diagnosis [Paper]

Usage

Prepare Datasets

We use five public datasets in our model. They can be downloaded from:

We provide preprocessing code for these datasets, covering steps such as extracting the waveform data and converting it to WFDB format. Taking the SPH dataset as an example:

python ./data/preprocess/preprocess_sph.py --data-dir /path/to/sph

Alternatively, you can copy the .csv files in data/ to your dataset folders. In that case you still need to convert the SPH and CPSC2018 data to WFDB format.
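Before training, it can help to confirm that every SPH or CPSC2018 record has actually been converted. The following is a minimal sketch, not part of this repository: it assumes that a converted WFDB record consists of a .hea header and a .dat signal file sitting next to the source file with the same base name, and that you pass in the extension of the original files yourself. The function name find_unconverted and both assumptions are hypothetical; the preprocessing scripts may write their output elsewhere.

```python
from pathlib import Path


def find_unconverted(data_dir, source_ext):
    """Return source files that lack a matching WFDB .hea/.dat pair.

    Assumption (not confirmed by the repository): converted records are
    written beside the source file and share its base name.
    """
    missing = []
    for src in sorted(Path(data_dir).rglob(f"*{source_ext}")):
        header = src.with_suffix(".hea")  # WFDB header file
        signal = src.with_suffix(".dat")  # WFDB signal file
        if not (header.exists() and signal.exists()):
            missing.append(src)
    return missing
```

An empty list returned for /path/to/sph would suggest that the conversion is complete under these assumptions.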

The translated version of the PTB-XL dataset was obtained from Fairseq-signals.

The ECG-Instruct datasets of ECG-Chat are provided in llava/playground/data/. ecg_instruct_45k.json is the combination of diagnosis.json and conversation.json. We also share the prompts used to build these two datasets in llava/playground/data/prompts/.
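If you edit diagnosis.json or conversation.json and want to rebuild a combined file, the merge can be sketched as below. This is an illustration only: it assumes each file holds a top-level JSON list of records, and it performs a plain concatenation. Whether the original ecg_instruct_45k.json was also shuffled or deduplicated is not stated here, and the function name merge_instruction_files is hypothetical.

```python
import json


def merge_instruction_files(paths, out_path):
    """Concatenate several JSON list files into one JSON list file.

    Assumption: every input file contains a top-level JSON list.
    Returns the number of records written.
    """
    merged = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            records = json.load(f)
        if not isinstance(records, list):
            raise ValueError(f"{path} does not contain a top-level JSON list")
        merged.extend(records)  # keep the original order, no deduplication
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(merged, f, ensure_ascii=False, indent=2)
    return len(merged)
```

Called with the paths of diagnosis.json and conversation.json, it writes the records of the first file followed by those of the second. Writing to a new file name rather than overwriting ecg_instruct_45k.json keeps the released version intact.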

Because of their large size, the file new_record_list.csv for MIMIC-IV-ECG and the pretraining dataset pretrain_mimic.json are not stored in this repository. They can be downloaded here.

Train the Models

To train and evaluate the ECG CoCa model, please use the scripts in open_clip/.

To pretrain and fine-tune the ECG-Chat model, please use the scripts in llava/.

The code for report generation evaluation and RAG is coming soon.

The implementation of the ECG data augmentation methods comes from torch_ecg. We also used the CKEPE prompt proposed in MERL to evaluate the zero-shot classification ability of our model.

Citation

If you find our work useful for your research, please cite it using this BibTeX:

@misc{zhao2024ecgchatlargeecglanguagemodel,
      title={ECG-Chat: A Large ECG-Language Model for Cardiac Disease Diagnosis}, 
      author={Yubao Zhao and Tian Zhang and Xu Wang and Puyu Han and Tong Chen and Linlin Huang and Youzhu Jin and Jiaju Kang},
      year={2024},
      eprint={2408.08849},
      archivePrefix={arXiv},
      primaryClass={eess.SP},
      url={https://arxiv.org/abs/2408.08849}, 
}

If you have questions about this repo, please submit an issue or contact [email protected].

Acknowledgement

  • OpenCLIP: the codebase we used to build our ECG CoCa model.
  • LLaVA: we used the code and architecture of LLaVA to build our ECG-Chat model.
