CLIP_VisualPrompting

Prerequisites: install CLIP by following the instructions in the CLIP repo.

Usage

The example matches two images, pan_1.png and pan_2.png, against the texts ["an image of the handle of a pan", "an image of the cooking area of a pan"]. Running the script produces the probability matrix [[0.6423, 0.3527], [0.3517, 0.6433]] as the final scores.

Acknowledgement

We borrow the optimal transport function from SuperGlue.
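
For readers unfamiliar with optimal transport, the idea can be sketched with a plain Sinkhorn iteration, which alternately normalises the rows and columns of a positive score matrix until both sum to one. This is only an illustrative simplification, not the function used in this repo: SuperGlue's version works in log space on PyTorch tensors, and the name `sinkhorn`, the iteration count, and the `toy_scores` values below are all made up for this example.

```python
import math


def sinkhorn(scores, n_iters=50):
    """Alternately normalise rows and columns of exp(scores).

    Hypothetical helper for illustration; not the repository's function.
    """
    # Exponentiate so that every entry is strictly positive.
    k = [[math.exp(s) for s in row] for row in scores]
    for _ in range(n_iters):
        # Make each row sum to one.
        k = [[v / sum(row) for v in row] for row in k]
        # Make each column sum to one.
        col_sums = [sum(col) for col in zip(*k)]
        k = [[v / c for v, c in zip(row, col_sums)] for row in k]
    return k


# Made-up similarity scores (one row per image, one column per text).
# These numbers are NOT taken from the repository or its example output.
toy_scores = [[2.0, 1.0], [0.5, 1.5]]
plan = sinkhorn(toy_scores)

for row in plan:
    print([round(v, 4) for v in row])
```

Unlike a per-row softmax, which only makes each row sum to one, this normalisation also balances the columns, so no single text can absorb most of the probability mass from every image.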