image-text pre-trained #7

Open
Tzx11 opened this issue Mar 11, 2024 · 1 comment

Comments

@Tzx11

Tzx11 commented Mar 11, 2024

Nice work! I have two questions.
After reading the paper, my understanding is that the prior consists of an image encoder and a text encoder. Do the image-text pre-trained weights contain only the image encoder and text encoder weights?
Also, how do I load the image-text pre-trained weights into models for other medical downstream tasks?

@QtacierP
Owner

In this repository, I have made available two sets of pre-trained weights. The first set contains only the vision-encoder weights, based on ResNet50. The architecture is the standard ResNet50 from the TorchVision library, with the exception of the last fully connected (FC) layer. These weights can be downloaded from this link and fine-tuned directly on downstream tasks.
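As a rough illustration of the fine-tuning route, here is a minimal sketch of loading a vision-encoder checkpoint into a TorchVision ResNet50. It is not code from this repository. The helper names `filter_backbone_keys` and `load_backbone`, the checkpoint path, the optional key prefix, and the guess that the file may nest its weights under a `"state_dict"` key are all assumptions, so check the actual checkpoint layout before relying on it.

```python
from collections import OrderedDict


def filter_backbone_keys(state_dict, prefix="", drop_prefixes=("fc.",)):
    """Strip an optional key prefix and drop classifier-head entries.

    Hypothetical helper: the prefix and the "fc." head name are assumptions
    about how the checkpoint keys are laid out.
    """
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        if prefix and key.startswith(prefix):
            key = key[len(prefix):]
        if key.startswith(tuple(drop_prefixes)):
            continue  # skip the final FC layer, which is task specific
        cleaned[key] = value
    return cleaned


def load_backbone(checkpoint_path, num_classes):
    """Build a TorchVision ResNet50 and load the encoder weights into it."""
    import torch
    from torchvision.models import resnet50

    # New FC head sized for the downstream task, randomly initialised.
    model = resnet50(weights=None, num_classes=num_classes)
    state_dict = torch.load(checkpoint_path, map_location="cpu")
    # Assumption: weights may be nested under a "state_dict" key.
    if isinstance(state_dict, dict) and "state_dict" in state_dict:
        state_dict = state_dict["state_dict"]
    # strict=False tolerates the missing FC weights.
    result = model.load_state_dict(filter_backbone_keys(state_dict), strict=False)
    print("missing:", result.missing_keys, "unexpected:", result.unexpected_keys)
    return model
```

If the load succeeds as intended, only the `fc.weight` and `fc.bias` entries should be reported as missing. Anything else in the missing or unexpected lists suggests the key names need a different prefix.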

The second set of weights is for text-image joint representation and is hosted on Google Drive. Such weights are typically used for zero-shot tasks. To support this, I have included scripts and code for zero-shot classification in the repository.
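For readers unfamiliar with zero-shot classification using joint image-text embeddings, the general idea can be sketched as follows. This is a generic CLIP-style illustration, not the repository's script. The function names, the temperature value, and the toy embeddings are assumptions. In practice the embeddings would come from the image and text encoders.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(image_embedding, text_embeddings, temperature=0.07):
    """Score an image embedding against one text embedding per class.

    text_embeddings maps a class name to the embedding of its text prompt.
    The temperature is an assumed value, not taken from the repository.
    """
    names = list(text_embeddings)
    logits = [
        cosine_similarity(image_embedding, text_embeddings[name]) / temperature
        for name in names
    ]
    # Numerically stable softmax over the class logits.
    peak = max(logits)
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    return {name: e / total for name, e in zip(names, exps)}


# Toy embeddings for illustration only.
probs = zero_shot_classify(
    [1.0, 0.0],
    {"disease": [0.9, 0.1], "healthy": [0.1, 0.9]},
)
print(max(probs, key=probs.get))
```

The class whose text prompt embedding lies closest to the image embedding receives the highest probability, so no task-specific training is needed.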
