Reproducing Result on ImageNet-A Dataset #15

Open
zhangletian2 opened this issue Jul 26, 2024 · 11 comments

Comments

@zhangletian2

zhangletian2 commented Jul 26, 2024

Thanks for the wonderful work!
However, when I use test_tpt.sh with the ViT-B/16 backbone to reproduce the results on ImageNet-A, I get strangely low accuracy:

Use GPU: 0 for training
Initializing the contect with given words: [a_photo_of_a]
Initial context: "a photo of a"
Number of context words (tokens): 4
=> Model created: visual backbone ViT-B/16
=> Using native Torch AMP. Training in mixed precision.
evaluating: A
number of test samples: 7500
Test: [ 199/7500]       Time  0.140 ( 0.152)    Acc@1   0.00 (  0.50)   Acc@5   0.00 (  4.00)
Test: [ 399/7500]       Time  0.140 ( 0.146)    Acc@1   0.00 (  0.50)   Acc@5   0.00 (  3.25)
Test: [ 599/7500]       Time  0.138 ( 0.144)    Acc@1   0.00 (  0.67)   Acc@5   0.00 (  3.67)
Test: [ 799/7500]       Time  0.140 ( 0.143)    Acc@1   0.00 (  0.62)   Acc@5   0.00 (  3.75)
Test: [ 999/7500]       Time  0.140 ( 0.142)    Acc@1   0.00 (  0.50)   Acc@5   0.00 (  3.20)
Test: [1199/7500]       Time  0.140 ( 0.142)    Acc@1   0.00 (  0.67)   Acc@5   0.00 (  3.17)
Test: [1399/7500]       Time  0.142 ( 0.142)    Acc@1   0.00 (  0.57)   Acc@5   0.00 (  3.21)
Test: [1599/7500]       Time  0.138 ( 0.142)    Acc@1   0.00 (  0.62)   Acc@5   0.00 (  3.31)
Test: [1799/7500]       Time  0.144 ( 0.141)    Acc@1   0.00 (  0.61)   Acc@5   0.00 (  3.11)
Test: [1999/7500]       Time  0.142 ( 0.141)    Acc@1   0.00 (  0.60)   Acc@5   0.00 (  3.05)
Test: [2199/7500]       Time  0.139 ( 0.141)    Acc@1   0.00 (  0.59)   Acc@5   0.00 (  3.00)
Test: [2399/7500]       Time  0.144 ( 0.141)    Acc@1   0.00 (  0.58)   Acc@5   0.00 (  2.88)
Test: [2599/7500]       Time  0.142 ( 0.141)    Acc@1   0.00 (  0.62)   Acc@5   0.00 (  2.85)
Test: [2799/7500]       Time  0.140 ( 0.141)    Acc@1   0.00 (  0.61)   Acc@5   0.00 (  2.82)
Test: [2999/7500]       Time  0.142 ( 0.141)    Acc@1   0.00 (  0.67)   Acc@5   0.00 (  2.90)
Test: [3199/7500]       Time  0.139 ( 0.141)    Acc@1   0.00 (  0.62)   Acc@5   0.00 (  2.78)
Test: [3399/7500]       Time  0.139 ( 0.141)    Acc@1   0.00 (  0.62)   Acc@5   0.00 (  2.79)
Test: [3599/7500]       Time  0.144 ( 0.141)    Acc@1   0.00 (  0.58)   Acc@5   0.00 (  2.67)
Test: [3799/7500]       Time  0.139 ( 0.141)    Acc@1   0.00 (  0.58)   Acc@5   0.00 (  2.66)
Test: [3999/7500]       Time  0.148 ( 0.141)    Acc@1   0.00 (  0.60)   Acc@5   0.00 (  2.65)
Test: [4199/7500]       Time  0.140 ( 0.141)    Acc@1   0.00 (  0.64)   Acc@5   0.00 (  2.67)
Test: [4399/7500]       Time  0.138 ( 0.141)    Acc@1   0.00 (  0.66)   Acc@5   0.00 (  2.61)
Test: [4599/7500]       Time  0.138 ( 0.141)    Acc@1   0.00 (  0.65)   Acc@5   0.00 (  2.59)
Test: [4799/7500]       Time  0.139 ( 0.141)    Acc@1   0.00 (  0.65)   Acc@5   0.00 (  2.58)
Test: [4999/7500]       Time  0.144 ( 0.141)    Acc@1   0.00 (  0.64)   Acc@5   0.00 (  2.58)
Test: [5199/7500]       Time  0.139 ( 0.141)    Acc@1   0.00 (  0.62)   Acc@5   0.00 (  2.54)
Test: [5399/7500]       Time  0.140 ( 0.141)    Acc@1   0.00 (  0.59)   Acc@5   0.00 (  2.48)
Test: [5599/7500]       Time  0.140 ( 0.141)    Acc@1   0.00 (  0.59)   Acc@5   0.00 (  2.52)
Test: [5799/7500]       Time  0.143 ( 0.141)    Acc@1   0.00 (  0.59)   Acc@5   0.00 (  2.45)
Test: [5999/7500]       Time  0.144 ( 0.141)    Acc@1   0.00 (  0.57)   Acc@5   0.00 (  2.40)
Test: [6199/7500]       Time  0.138 ( 0.141)    Acc@1   0.00 (  0.56)   Acc@5   0.00 (  2.37)
Test: [6399/7500]       Time  0.139 ( 0.141)    Acc@1   0.00 (  0.62)   Acc@5   0.00 (  2.48)
Test: [6599/7500]       Time  0.147 ( 0.141)    Acc@1   0.00 (  0.62)   Acc@5   0.00 (  2.47)
Test: [6799/7500]       Time  0.143 ( 0.141)    Acc@1   0.00 (  0.63)   Acc@5   0.00 (  2.43)
Test: [6999/7500]       Time  0.144 ( 0.141)    Acc@1   0.00 (  0.61)   Acc@5   0.00 (  2.41)
Test: [7199/7500]       Time  0.139 ( 0.141)    Acc@1   0.00 (  0.60)   Acc@5   0.00 (  2.38)
Test: [7399/7500]       Time  0.140 ( 0.141)    Acc@1   0.00 (  0.62)   Acc@5   0.00 (  2.43)
Acc@1 0.613 Acc@5 2.427
=> Acc. on testset [A]: @1 0.6133333444595337/ @5 2.426666736602783
======== Result Summary ========
params: nstep   lr      bs                                                      
params: 1       0.005   64

I'm confused by these results, and I would greatly appreciate any insight you can provide!
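
For reference, the reported numbers can be compared with what random guessing would give. The sketch below assumes uniform guessing over the 200 ImageNet-A classes and uses only the figures from the log above; the helper names are illustrative and do not come from the repository.

```python
# Sanity check: compare the reported accuracy with chance level.
# Assumption: "chance" means uniform random guessing over the 200 ImageNet-A classes.

NUM_SAMPLES = 7500   # "number of test samples" in the log
NUM_CLASSES = 200    # ImageNet-A covers 200 of the 1000 ImageNet classes


def chance_accuracy(k: int, num_classes: int = NUM_CLASSES) -> float:
    """Expected top-k accuracy (in percent) of a uniform random guesser."""
    return 100.0 * k / num_classes


def implied_correct(acc_percent: float, num_samples: int = NUM_SAMPLES) -> int:
    """Number of correct predictions implied by an accuracy given in percent."""
    return round(acc_percent / 100.0 * num_samples)


# Chance-level top-1 and top-5 accuracy, in percent.
print(chance_accuracy(1), chance_accuracy(5))
# Correct predictions implied by the final Acc@1 and Acc@5 lines of the log.
print(implied_correct(0.6133333444595337), implied_correct(2.426666736602783))
```

If the reported accuracy sits this close to chance level, that usually suggests the predictions and the ground-truth labels are not aligned, rather than that the model itself is weak.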

@zhaihaotian
I used the RN50 backbone and also got this strange result, but when evaluating on the cross-dataset benchmarks, the results match the paper.

@zhaihaotian

@zhangletian2 did you solve the issue?

@azshue
Owner

azshue commented Aug 13, 2024

Hi all, thank you all for trying out the code.

Could you provide more details about the command you ran? Also, @zhaihaotian which "cross-dataset" was it?
Sorry, I no longer have access to my previous experiment environment, but I'll do my best to help find the bug.

@zhaihaotian

@azshue Thank you for the reply. I evaluated Caltech101 and UCF101, and both reached the same accuracy as in the paper. But when evaluating ImageNet-A (adversarial) and ImageNet-R (rendition), I get the strange result I showed before. I didn't try the other datasets.

@zhaihaotian

Here is my tpt.sh; I use the default settings.

#!/bin/bash

data_root='/workspace/CaFo/data'
testsets=$1
arch=RN50
# arch=ViT-B/16
bs=64
ctx_init=a_photo_of_a

python ./tpt_classification.py ${data_root} --test_sets ${testsets} \
-a ${arch} -b ${bs} --gpu 0 \
--tpt --ctx_init ${ctx_init}

@azshue
Owner

azshue commented Aug 14, 2024

If this issue only happens when evaluating ImageNet-A and ImageNet-R, there might be something wrong with the label masking. These two datasets only have 200 of the 1000 ImageNet classes, so we need to "reset" the classnames for these benchmarks.

Could you double-check if everything is running as expected under this if statement?
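
As a rough illustration of the "reset" described above, here is a minimal sketch of label masking. It assumes a boolean mask over the full class list; `select_classnames`, `all_classnames`, and `label_mask` are hypothetical toy stand-ins, not the repository's actual names or data.

```python
# Hypothetical sketch of label masking for a subset benchmark.
# `all_classnames` and `label_mask` are toy stand-ins, not the repository's data.

def select_classnames(all_classnames, label_mask):
    """Keep only the class names whose mask entry is True, preserving order."""
    if len(all_classnames) != len(label_mask):
        raise ValueError("mask and class list must have the same length")
    return [name for name, keep in zip(all_classnames, label_mask) if keep]


# Toy example: 5 "full" classes, of which the subset benchmark uses 3.
all_classnames = ["tench", "goldfish", "stingray", "hen", "goldfinch"]
label_mask = [False, True, True, False, True]

subset = select_classnames(all_classnames, label_mask)
print(subset)        # ['goldfish', 'stingray', 'goldfinch']
print(len(subset))   # 3
```

With this kind of masking, index i in the reduced list only matches dataset label i if the data loader enumerates the class folders in the same order, so a correct class list alone does not guarantee correct labels.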

@zhaihaotian

I think it is running as expected. I printed the classnames to check whether they are right, and there are only 200 classes for ImageNet-A and ImageNet-R.

imagenet_a_classname = [
    'stingray', 'goldfinch', 'junco', 'American robin', 'jay', 'bald eagle', 'vulture', 'newt', 
    'American bullfrog', 'box turtle', 'green iguana', 'agama', 'chameleon', 'American alligator', 
    'garter snake', 'harvestman', 'scorpion', 'tarantula', 'centipede', 'sulphur-crested cockatoo', 
    'lorikeet', 'hummingbird', 'toucan', 'duck', 'goose', 'koala', 'jellyfish', 'sea anemone', 
    'flatworm', 'snail', 'crayfish', 'hermit crab', 'flamingo', 'great egret', 'oystercatcher', 
    'pelican', 'sea lion', 'Chihuahua', 'Golden Retriever', 'Rottweiler', 'German Shepherd Dog', 
    'pug', 'red fox', 'Persian cat', 'lynx', 'lion', 'American black bear', 'mongoose', 'ladybug', 
    'rhinoceros beetle', 'weevil', 'fly', 'bee', 'ant', 'grasshopper', 'stick insect', 'cockroach', 
    'praying mantis', 'leafhopper', 'dragonfly', 'monarch butterfly', 'small white butterfly', 
    'gossamer-winged butterfly', 'starfish', 'cottontail rabbit', 'porcupine', 'fox squirrel', 
    'marmot', 'bison', 'skunk', 'armadillo', 'baboon', 'white-headed capuchin', 'African bush elephant', 
    'pufferfish', 'academic gown', 'accordion', 'acoustic guitar', 'airliner', 'ambulance', 'apron', 
    'balance beam', 'balloon', 'banjo', 'barn', 'wheelbarrow', 'basketball', 'lighthouse', 'beaker', 
    'bikini', 'hunting bow', 'bow tie', 'breastplate', 'broom', 'candle', 'canoe', 'castle', 'cello', 
    'chain', 'storage chest', 'Christmas stocking', 'cowboy boot', 'cradle', 'rotary dial telephone', 
    'digital clock', 'doormat', 'drumstick', 'dumbbell', 'envelope', 'feather boa', 'flagpole', 
    'forklift', 'fountain', 'garbage truck', 'goblet', 'go-kart', 'golf cart', 'grand piano', 
    'hair dryer', 'clothes iron', 'carved pumpkin', 'jeep', 'kimono', 'lighter', 'limousine', 
    'manhole cover', 'maraca', 'marimba', 'mask', 'mitten', 'mosque', 'metal nail', 'obelisk', 
    'ocarina', 'pipe organ', 'parachute', 'parking meter', 'piggy bank', 'pool table', 'hockey puck', 
    'quill', 'racket', 'fishing casting reel', 'revolver', 'rocking chair', 'rugby ball', 'salt shaker', 
    'sandal', 'saxophone', 'school bus', 'schooner', 'sewing machine', 'shovel', 'sleeping bag', 
    'snowmobile', 'snowplow', 'soap dispenser', 'spatula', 'spider web', 'steam locomotive', 
    'stethoscope', 'couch', 'submarine', 'sundial', 'suspension bridge', 'syringe', 'tank', 
    'teddy bear', 'toaster', 'torch', 'tricycle', 'umbrella', 'unicycle', 'viaduct', 'volleyball', 
    'washing machine', 'water tower', 'wine bottle', 'shipwreck', 'guacamole', 'pretzel', 
    'cheeseburger', 'hot dog', 'broccoli', 'cucumber', 'bell pepper', 'mushroom', 'lemon', 'banana', 
    'cherimoya (custard apple)', 'pomegranate', 'carbonara', 'bubble', 'cliff', 'volcano', 
    'baseball player', 'rapeseed', "yellow lady's slipper", 'corn', 'acorn'
]
imagenet_r_classname = [
    'goldfish', 'great white shark', 'hammerhead shark', 'stingray', 'hen', 'ostrich', 'goldfinch', 
    'junco', 'bald eagle', 'vulture', 'smooth newt', 'axolotl', 'tree frog', 'green iguana', 
    'chameleon', 'Indian cobra', 'scorpion', 'tarantula', 'centipede', 'peafowl', 'lorikeet', 
    'hummingbird', 'toucan', 'duck', 'goose', 'black swan', 'koala', 'jellyfish', 'snail', 
    'American lobster', 'hermit crab', 'flamingo', 'great egret', 'pelican', 'king penguin', 
    'grey whale', 'killer whale', 'sea lion', 'Chihuahua', 'Shih Tzu', 'Afghan Hound', 'Basset Hound', 
    'Beagle', 'Bloodhound', 'Italian Greyhound', 'Whippet', 'Weimaraner', 'Yorkshire Terrier', 
    'Boston Terrier', 'Scottish Terrier', 'West Highland White Terrier', 'Golden Retriever', 
    'Labrador Retriever', 'Cocker Spaniel', 'collie', 'Border Collie', 'Rottweiler', 'German Shepherd Dog', 
    'Boxer', 'French Bulldog', 'St. Bernard', 'Siberian Husky', 'Dalmatian', 'pug', 'Pomeranian', 
    'Chow Chow', 'Pembroke Welsh Corgi', 'Toy Poodle', 'Standard Poodle', 'grey wolf', 'hyena', 
    'red fox', 'tabby cat', 'leopard', 'snow leopard', 'lion', 'tiger', 'cheetah', 'polar bear', 
    'meerkat', 'ladybug', 'fly', 'bee', 'ant', 'grasshopper', 'cockroach', 'praying mantis', 
    'dragonfly', 'monarch butterfly', 'starfish', 'cottontail rabbit', 'porcupine', 'fox squirrel', 
    'beaver', 'guinea pig', 'zebra', 'pig', 'hippopotamus', 'bison', 'gazelle', 'llama', 'skunk', 
    'badger', 'orangutan', 'gorilla', 'chimpanzee', 'gibbon', 'baboon', 'giant panda', 'eel', 
    'clownfish', 'pufferfish', 'accordion', 'ambulance', 'assault rifle', 'backpack', 'barn', 
    'wheelbarrow', 'basketball', 'bathtub', 'lighthouse', 'beer glass', 'binoculars', 'birdhouse', 
    'bow tie', 'broom', 'bucket', 'cauldron', 'candle', 'cannon', 'canoe', 'carousel', 'castle', 
    'mobile phone', 'cowboy hat', 'electric guitar', 'fire truck', 'flute', 'gas mask or respirator', 
    'grand piano', 'guillotine', 'hammer', 'harmonica', 'harp', 'hatchet', 'jeep', 'joystick', 
    'lab coat', 'lawn mower', 'lipstick', 'mailbox', 'missile', 'mitten', 'parachute', 'pickup truck', 
    'pirate ship', 'revolver', 'rugby ball', 'sandal', 'saxophone', 'school bus', 'schooner', 
    'shield', 'soccer ball', 'space shuttle', 'spider web', 'steam locomotive', 'scarf', 
    'submarine', 'tank', 'tennis ball', 'tractor', 'trombone', 'vase', 'violin', 'military aircraft', 
    'wine bottle', 'ice cream', 'bagel', 'pretzel', 'cheeseburger', 'hot dog', 'cabbage', 'broccoli', 
    'cucumber', 'bell pepper', 'mushroom', 'Granny Smith apple', 'strawberry', 'lemon', 'pineapple', 
    'banana', 'pomegranate', 'pizza', 'burrito', 'espresso', 'volcano', 'baseball player', 
    'scuba diver', 'acorn'
]

@less-and-less-bugs

Has anyone else managed to reproduce the results of the experiments on the ImageNet-series datasets?

@zhangletian2
Author

I managed to solve the problem by setting 'ID_to_DIRNAME' correctly in 'data/datautils.py':
i.e., each value in 'ID_to_DIRNAME' should be the folder that directly contains the 200 class folders.
[Screenshot of the 'ID_to_DIRNAME' setting, 2024-10-21]
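
A minimal sketch of how one might check this before running the evaluation, assuming the dataset uses a folder-per-class layout. The helper names and the example folder names are hypothetical and are not part of the repository.

```python
import os
import tempfile


def count_class_dirs(root: str) -> int:
    """Count the immediate subdirectories of `root` (one per class is expected)."""
    with os.scandir(root) as entries:
        return sum(1 for entry in entries if entry.is_dir())


def looks_like_dataset_root(root: str, expected_classes: int = 200) -> bool:
    """True if `root` directly contains exactly `expected_classes` folders."""
    return count_class_dirs(root) == expected_classes


def demo():
    # Build a throwaway layout: <tmp>/imagenet-a/<200 class folders>.
    # The folder names here are made up for the demonstration.
    with tempfile.TemporaryDirectory() as tmp:
        dataset_dir = os.path.join(tmp, "imagenet-a")
        for i in range(200):
            os.makedirs(os.path.join(dataset_dir, f"class_{i:03d}"))
        too_high = looks_like_dataset_root(tmp)          # one level above
        correct = looks_like_dataset_root(dataset_dir)   # direct parent
    return too_high, correct


print(demo())
```

Running a check like this against the directory that 'ID_to_DIRNAME' resolves to should make a path that is one level too high or too low easy to spot.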

@zhaihaotian

Thanks, I will check it in my code.

@less-and-less-bugs

Thank you for your replies.
