Grounding Calculation #23

Open
krbuettner opened this issue Sep 14, 2022 · 1 comment

@krbuettner

❓ Questions and Help

Hi, I am looking at the grounding head and was wondering if you could clarify a few things in grounding_head.py. What is the difference between local_similarity and local_distance, and why is local_distance, rather than local_similarity, used to calculate global_dist_r2w? Also, does the sign of the grounding score matter? Are high scores large positive numbers and low scores small negative numbers?

Thank you for your time and help.

@alirezazareian
Owner

Similarity and distance are opposite concepts here. The reason we have both is to support various metrics. E.g. cosine measures similarity (high means similar) while Euclidean is a distance (high means not similar). For generality, we define both similarity and distance for each metric. Similarity is used to compute attention weights, since higher similarity means more attention. Then attention is used to get a weighted sum of the local distance matrix and determine global distances between images and captions in the batch. The global distance is used to compute the loss, since loss is higher if the distance of the image to its corresponding caption is high.
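
The flow described above can be sketched for a single image-caption pair in plain Python. This is only an illustration under stated assumptions, not the code in grounding_head.py: the function names (`local_matrices`, `global_distance`) are hypothetical, the distance is assumed to be the negated dot product for the dot metric and one minus the cosine for the cosine metric, and attention is assumed to be a softmax over regions for each word, averaged over words.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def local_matrices(regions, words, metric="dot"):
    """Return (similarity, distance), each indexed as [region][word].

    Assumed definitions (not confirmed by the repository):
      dot:       similarity = u.v,       distance = -u.v
      cosine:    similarity = cos(u, v), distance = 1 - cos(u, v)
      euclidean: distance = ||u - v||,   similarity = -||u - v||
    """
    sim, dist = [], []
    for r in regions:
        sim_row, dist_row = [], []
        for w in words:
            if metric == "dot":
                s = dot(r, w)
                d = -s
            elif metric == "cosine":
                s = dot(r, w) / (math.sqrt(dot(r, r)) * math.sqrt(dot(w, w)))
                d = 1.0 - s
            elif metric == "euclidean":
                d = math.sqrt(sum((a - b) ** 2 for a, b in zip(r, w)))
                s = -d
            else:
                raise ValueError(f"unknown metric: {metric}")
            sim_row.append(s)
            dist_row.append(d)
        sim.append(sim_row)
        dist.append(dist_row)
    return sim, dist


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def global_distance(sim, dist):
    """Attention-weighted sum of local distances, averaged over words.

    Attention comes from similarity (higher similarity, more weight);
    the quantity being averaged is the distance.
    """
    num_regions, num_words = len(sim), len(sim[0])
    per_word = []
    for j in range(num_words):
        attn = softmax([sim[i][j] for i in range(num_regions)])
        per_word.append(sum(attn[i] * dist[i][j] for i in range(num_regions)))
    return sum(per_word) / num_words


# Two image regions and a one-word caption, all 2-D embeddings.
regions = [[1.0, 0.0], [0.0, 1.0]]
matching_caption = [[1.0, 0.0]]      # aligned with the first region
mismatched_caption = [[-1.0, 0.0]]   # points away from the first region

for metric in ("dot", "cosine", "euclidean"):
    d_match = global_distance(*local_matrices(regions, matching_caption, metric))
    d_mismatch = global_distance(*local_matrices(regions, mismatched_caption, metric))
    print(metric, d_match, d_mismatch)
```

In this toy setup the matching caption gets the lower global distance under all three metrics, which is the ordering a distance-based loss relies on.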

Distance is always non-negative except for dot product, since dot product is unbounded in both directions. For dot product, a very negative number for distance means a very low distance.
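
To make the sign behavior concrete, here is a small sketch that compares the three distances on an aligned and an opposed pair of vectors. The helper `distances` is hypothetical, and the exact definitions (negated dot product, one minus cosine) are assumptions made for illustration rather than the repository's confirmed formulas.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def distances(u, v):
    """Distance between u and v under three metrics (assumed definitions)."""
    d = dot(u, v)
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    return {
        "dot": -d,                              # unbounded in both directions
        "cosine": 1.0 - d / (norm_u * norm_v),  # between 0 and 2
        "euclidean": math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v))),  # >= 0
    }


aligned = distances([3.0, 4.0], [3.0, 4.0])    # identical vectors
opposed = distances([3.0, 4.0], [-3.0, -4.0])  # opposite directions

print("aligned:", aligned)
print("opposed:", opposed)
```

Under these assumptions only the dot-product distance goes negative for the aligned pair, while every metric still ranks the aligned pair as closer than the opposed pair. So what matters is the ordering of the values, not their sign.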
