how's the performance on refcoco? #1

yxchng · 2023-04-07T03:16:36Z

No description provided.

helblazer811 · 2023-04-07T03:18:55Z

It is taking a while to run, so I'll probably check the results sometime over the weekend.

The initial results of this approach are fairly poor. I think the reason is that many of the RefCOCO text prompts involve spatial relations, like "the man to the left of the ...". CLIP cannot contextualize local regions within an image.
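To make the spatial-relation problem concrete, here is a minimal sketch of a crop-and-score pipeline. It is not this repository's code: `mask_to_box`, `crop`, `score_masks`, and the `embed_image` callable are hypothetical names, and `embed_image` only stands in for a CLIP image encoder. The image is a toy nested list so the example runs without any model weights.

```python
import math
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1), upper bounds exclusive
Mask = List[List[bool]]           # binary mask, same height/width as the image
Image = List[List[float]]         # toy grayscale image (assumption for the sketch)


def mask_to_box(mask: Mask) -> Box:
    """Return the tight bounding box around the True pixels of a mask."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not rows:
        raise ValueError("mask is empty")
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


def crop(image: Image, box: Box) -> Image:
    """Cut the box out of the image; the crop no longer knows where it came from."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score_masks(
    image: Image,
    masks: Sequence[Mask],
    text_embedding: Sequence[float],
    embed_image: Callable[[Image], Sequence[float]],
) -> List[float]:
    """Score each mask by embedding its crop alone and comparing it to the text."""
    return [
        cosine(embed_image(crop(image, mask_to_box(mask))), text_embedding)
        for mask in masks
    ]
```

Because each crop is embedded on its own, two similar-looking objects on opposite sides of the image produce the same crop and therefore the same score, so a prompt such as "the man to the left of the ..." cannot pick one over the other.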

PengtaoJiang · 2023-04-10T02:23:21Z

Hello, I also use the CLIP model to classify the masks from SAM, and I find that the performance is poor. Increasing the image size of the CLIP model may improve the recognition accuracy of each mask.
