
Some questions about the Text-based MLC #1

Open
Eldo-rado opened this issue May 1, 2023 · 2 comments

@Eldo-rado

Hi 👋, thanks for your great work! I have some questions about the Text-based MLC that I'd like to confirm.

  1. When using MIMIC-CXR for pre-training, are the labels for multi-label classification extracted by the CheXpert labeler? (See the sketch after this list for the kind of multi-hot targets I mean.)
  2. In the code (class pretrain_dataset), I found that all of the MIMIC-CXR data is used for the multi-label classification pre-training. Won't this cause information leakage in the subsequent downstream tasks?
    Thank you in advance. I am looking forward to hearing from you!
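
For context, a minimal sketch of how a CheXpert-format label file distributed with MIMIC-CXR could be turned into multi-hot targets for multi-label classification. The CSV filename, column layout, and the choice to map uncertain (-1.0) and missing entries to 0 are assumptions for illustration, not necessarily the repository's actual pipeline:

```python
import pandas as pd
import torch

# The 14 CheXpert observation classes commonly used as MLC targets.
CHEXPERT_CLASSES = [
    "Atelectasis", "Cardiomegaly", "Consolidation", "Edema",
    "Enlarged Cardiomediastinum", "Fracture", "Lung Lesion", "Lung Opacity",
    "No Finding", "Pleural Effusion", "Pleural Other", "Pneumonia",
    "Pneumothorax", "Support Devices",
]

def load_multihot_labels(csv_path: str) -> dict:
    """Map each study_id to a 14-dim multi-hot tensor.

    Assumed CSV layout: one row per study with columns subject_id, study_id,
    and one column per class containing 1.0 (positive), 0.0 (negative),
    -1.0 (uncertain), or NaN (not mentioned). Here uncertain and missing
    entries are both treated as negative.
    """
    df = pd.read_csv(csv_path)
    labels = {}
    for _, row in df.iterrows():
        vec = torch.tensor(
            [1.0 if row.get(c) == 1.0 else 0.0 for c in CHEXPERT_CLASSES]
        )
        labels[int(row["study_id"])] = vec
    return labels
```

Whether uncertain annotations are zeroed ("U-Zeros") or set to positive ("U-Ones") is a common design choice when deriving binary targets from CheXpert-style labels; the sketch above uses U-Zeros.
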
@chenzcv7 (Owner) commented May 4, 2023

Hi, thanks for your interest in our work and sorry for the late reply!

  1. The labels for MIMIC-CXR are included in the official dataset files (they can be downloaded from https://physionet.org/content/mimic-cxr/2.0.0/).
  2. The text-based MLC will not cause information leakage, since 1) we only use the training split of MIMIC-CXR for pretraining, and 2) the pretraining text-based MLC task enforces the alignment of images with their paired text-formed labels, where the labels play a role similar to the report. The downstream diagnosis classification task, in contrast, performs multi-label classification based on one-hot-formed labels. (A rough illustration of this distinction is sketched below.)
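
As a rough illustration of the distinction above: during pretraining the image is paired with a text-formed label sequence (the names of the positive findings, playing a role similar to the report), whereas the downstream task is supervised with the one-hot/multi-hot vector directly. The label-to-text conversion below is an assumption for illustration, not the repository's exact code:

```python
import torch

# Illustrative subset of finding classes (names assumed for the example).
classes = ["Atelectasis", "Cardiomegaly", "Pleural Effusion", "Pneumonia"]

def labels_to_text(multihot, class_names):
    """Text-formed label for the text-based MLC pretraining objective:
    the positive finding names joined into a short sequence."""
    positives = [n for n, v in zip(class_names, multihot.tolist()) if v == 1.0]
    return ", ".join(positives) if positives else "no finding"

# A study positive for Atelectasis and Pleural Effusion.
multihot = torch.tensor([1.0, 0.0, 1.0, 0.0])

# Pretraining: the image is aligned with the text-formed label sequence.
text_label = labels_to_text(multihot, classes)  # "Atelectasis, Pleural Effusion"

# Downstream diagnosis classification: the same study is supervised with the
# one-hot/multi-hot vector directly (e.g. via a per-class BCE loss).
onehot_target = multihot                        # tensor([1., 0., 1., 0.])
print(text_label, onehot_target)
```
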

@Eldo-rado (Author)

Thanks for your reply!
I would also like to confirm: is the text-based MLC trained with the whole network in Fig. 8? (The multi-label classification performance based on the updated feature $f_g^{kv}$ can also be effectively enhanced.) If so, why is it called a pretraining task?
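
To make the question concrete, here is a generic sketch of a multi-label classification head attached to a pooled global feature such as $f_g^{kv}$, trained with a per-class sigmoid/BCE loss so that gradients can reach the backbone. The feature dimension, class count, and end-to-end setup are assumptions for illustration, not necessarily the Fig. 8 architecture:

```python
import torch
import torch.nn as nn

class TextBasedMLCHead(nn.Module):
    """Generic multi-label classification head over a pooled global feature.

    Illustrative only: the 768-dim feature, 14 classes, and backpropagating
    into the backbone are assumptions, not the paper's exact design.
    """
    def __init__(self, feat_dim: int = 768, num_classes: int = 14):
        super().__init__()
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, f_g_kv: torch.Tensor) -> torch.Tensor:
        # f_g_kv: (batch, feat_dim) updated global feature; returns class logits.
        return self.classifier(f_g_kv)

# Multi-label loss: one sigmoid/BCE term per class.
criterion = nn.BCEWithLogitsLoss()
head = TextBasedMLCHead()
f_g_kv = torch.randn(4, 768, requires_grad=True)   # stand-in for backbone output
targets = torch.randint(0, 2, (4, 14)).float()     # multi-hot targets
loss = criterion(head(f_g_kv), targets)
loss.backward()  # gradients flow back into f_g_kv, i.e. into the backbone
```
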
