Inconsistency between the paper and the code #6

zmgy107 · 2024-11-19T13:22:22Z

Hi! The Modality Interaction Task section of the paper says that the visual modality is used as the query and the text modality as the key and value. However, in lines 497-499 of "model_init.py" in the provided code, the first input to CrossAttention is 'text_tokens', and CrossAttention treats its first input as the query. Is this an error in the provided code?
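To make the difference concrete, here is a minimal toy sketch of single-head cross-attention in plain Python. It is not the repository's CrossAttention: the function `cross_attention`, the toy vectors, and the absence of learned projections are all my own assumptions for illustration. It only shows that the output has one row per query token, so swapping which modality comes first changes what the layer produces.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(query, key_value):
    # Hypothetical toy attention: no learned projections, one head,
    # and key_value serves as both keys and values.
    d = len(query[0])
    out = []
    for q in query:
        # Scaled dot-product score of this query token against every key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in key_value]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * v[i] for w, v in zip(weights, key_value)) for i in range(d)])
    return out

# Toy inputs (assumed shapes): 3 visual tokens and 2 text tokens, dimension 2.
visual_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_tokens = [[0.5, 0.5], [1.0, -1.0]]

# Order described in the paper: visual as query, text as key/value.
paper_order = cross_attention(visual_tokens, text_tokens)
# Order I see in the code: text as query, visual as key/value.
code_order = cross_attention(text_tokens, visual_tokens)

print(len(paper_order), len(code_order))  # 3 2
```

With the paper's order the result is one text-informed vector per visual token; with the order in the code it is one vision-informed vector per text token, so the two are not interchangeable.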
