
Dataset issues! #12

Open
tutu-star opened this issue Jun 16, 2023 · 10 comments

Comments

@tutu-star

Hello, author. I want to know how to generate my own dataset in this format (instances_train2017_base.json, instances_train2017_base_RN50relabel.json, instances_train2017_base_RN50x4relabel_pre.json, and instances_val2017_basetarget.json).
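For the base file specifically, instances_train2017_base.json is presumably just the standard COCO train annotations restricted to the base (seen) categories. A minimal sketch of such a filter, assuming plain COCO instances format (the function name and toy data below are illustrative, not from the repo):

```python
import json

def filter_to_base_classes(coco, base_names):
    """Keep only annotations/categories for the given base class names.

    `coco` is a dict in COCO instances format with 'images',
    'annotations', and 'categories'. Category ids are preserved.
    """
    keep_ids = {c["id"] for c in coco["categories"] if c["name"] in base_names}
    return {
        "images": coco["images"],
        "annotations": [a for a in coco["annotations"] if a["category_id"] in keep_ids],
        "categories": [c for c in coco["categories"] if c["id"] in keep_ids],
    }

# Tiny worked example: two categories, one of them treated as "base".
coco = {
    "images": [{"id": 1, "file_name": "000000000001.jpg", "height": 480, "width": 640}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [0, 0, 10, 10], "area": 100, "iscrowd": 0},
        {"id": 11, "image_id": 1, "category_id": 2, "bbox": [5, 5, 10, 10], "area": 100, "iscrowd": 0},
    ],
    "categories": [{"id": 1, "name": "person"}, {"id": 2, "name": "umbrella"}],
}
base = filter_to_base_classes(coco, {"person"})
print(len(base["annotations"]), len(base["categories"]))  # → 1 1
```

The relabel files (*_RN50relabel.json etc.) additionally contain CLIP-generated pseudo-labels, so they cannot be produced by filtering alone.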

@dongwhfdyer

I really want to know too!

@dongwhfdyer

                    if self.args.target_class_factor != 1.0 and not self.training:
                        if outputs_class.size(-1) == 66:
                            # COCO
                            target_index = [4, 5, 11, 12, 15, 16, 21, 23, 27, 29, 32, 34, 45, 47, 54, 58, 63]
                        elif outputs_class.size(-1) == 1204:
                            # LVIS
                            target_index = [12, 13, 16, 19, 20, 29, 30, 37, 38, 39, 41, 48, 50, 51, 62, 68, 70, 77, 81, 84, 92, 104, 105, 112, 116, 118, 122, 125, 129, 130, 135, 139, 141, 143, 146, 150, 154, 158, 160, 163, 166, 171, 178, 181, 195, 201, 208, 209, 213, 214, 221, 222, 230, 232, 233, 235, 236, 237, 239, 243, 244, 246, 249, 250, 256, 257, 261, 264, 265, 268, 269, 274, 280, 281, 286, 290, 291, 293, 294, 299, 300, 301, 303, 306, 309, 312, 315, 316, 320, 322, 325, 330, 332, 347, 348, 351, 352, 353, 354, 356, 361, 363, 364, 365, 367, 373, 375, 380, 381, 387, 388, 396, 397, 399, 404, 406, 409, 412, 413, 415, 419, 425, 426, 427, 430, 431, 434, 438, 445, 448, 455, 457, 466, 477, 478, 479, 480, 481, 485, 487, 490, 491, 502, 505, 507, 508, 512, 515, 517, 526, 531, 534, 537, 540, 541, 542, 544, 550, 556, 559, 560, 566, 567, 570, 571, 573, 574, 576, 579, 581, 582, 584, 593, 596, 598, 601, 602, 605, 609, 615, 617, 618, 619, 624, 631, 633, 634, 637, 639, 645, 647, 650, 656, 661, 662, 663, 664, 670, 671, 673, 677, 685, 687, 689, 690, 692, 701, 709, 711, 713, 721, 726, 728, 729, 732, 742, 751, 753, 754, 757, 758, 763, 768, 771, 777, 778, 782, 783, 784, 786, 787, 791, 795, 802, 804, 807, 808, 809, 811, 814, 819, 821, 822, 823, 828, 830, 848, 849, 850, 851, 852, 854, 855, 857, 858, 861, 863, 868, 872, 882, 885, 886, 889, 890, 891, 893, 901, 904, 907, 912, 913, 916, 917, 919, 924, 930, 936, 937, 938, 940, 941, 943, 944, 951, 955, 957, 968, 971, 973, 974, 982, 984, 986, 989, 990, 991, 993, 997, 1002, 1004, 1009, 1011, 1014, 1015, 1027, 1028, 1029, 1030, 1031, 1046, 1047, 1048, 1052, 1053, 1056, 1057, 1074, 1079, 1083, 1115, 1117, 1118, 1123, 1125, 1128, 1134, 1143, 1144, 1145, 1147, 1149, 1156, 1157, 1158, 1164, 1166, 1192]
                        else:
                            assert False, "the dataset may not be supported"

https://github.com/tgxs002/CORA/blob/c334a6d87bb19b23fa8c3374e7ff59213dc87e49/models/fast_detr.py#L240C13-L240C13
Hello, I have generated the JSON files for my own dataset, but I got stuck at evaluation.
What is the meaning of target_index in this context, and how should I edit target_index for a new dataset?
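From the snippet, target_index appears to list the positions of the novel/target classes inside the classifier's output vector, so that their logits can be scaled by target_class_factor at inference (the 17 indices for the 66-dim COCO head match the 17 novel classes of the 48/17 open-vocabulary split). Under that assumption, for a new dataset it can be derived from the ordered category list; a sketch (the function and variable names below are mine, not the repo's):

```python
def make_target_index(all_class_names, novel_class_names):
    """Indices of the novel (target) classes within the ordered
    classifier vocabulary, in the same order the classifier uses."""
    novel = set(novel_class_names)
    return [i for i, name in enumerate(all_class_names) if name in novel]

# Toy vocabulary: the classifier scores these 6 classes in this order,
# and "bus" and "dog" are held out as novel classes.
classes = ["person", "bicycle", "car", "bus", "cat", "dog"]
novel = ["bus", "dog"]
print(make_target_index(classes, novel))  # → [3, 5]
```

The class order must match whatever order the model's classification head uses (typically the order of the 'categories' list in the annotation JSON), otherwise the scaling is applied to the wrong logits.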

@tutu-star
Author

> if self.args.target_class_factor != 1.0 and not self.training: […] what's the meaning of the target_index in this context? How to edit the target_index for new dataset?

Thank you. I think target_index corresponds to the label indices. Could you provide the code for generating the JSON files? Thank you!

@dongwhfdyer
Copy link

I can only tell you what kind of JSON file would work. As you can see in the pictures below, I have included only the essential information in the JSON file. It is a dict with three fields: 'images', 'annotations', and 'categories'.
For the 'categories' field, there is no need for an 'embedding' entry.
The reason I cannot provide the code is that the dataset I use already came with annotation files in this format. These are typical COCO-format annotations; someone had already done the conversion for me.
As for target_index, I still don't know how to use it.
Maybe we can have a discussion.
[Screenshots of the annotation JSON structure: the 'images', 'annotations', and 'categories' fields]
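The structure described above (a dict with only 'images', 'annotations', and 'categories', and no 'embedding' entry per category) can be sketched as follows; the field contents are illustrative, but the keys follow the standard COCO instances convention:

```python
import json
import os
import tempfile

# Minimal COCO-style annotation file with only the three fields the
# comment above describes; values are made up for illustration.
dataset = {
    "images": [
        {"id": 1, "file_name": "img_0001.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        # bbox is [x, y, width, height] in pixels, per COCO convention
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [48, 240, 195, 180], "area": 35100, "iscrowd": 0},
    ],
    "categories": [
        {"id": 1, "name": "cat"},  # note: no 'embedding' entry
    ],
}

path = os.path.join(tempfile.mkdtemp(), "instances_train_custom.json")
with open(path, "w") as f:
    json.dump(dataset, f)

# Round-trip check that the file has exactly the three expected fields.
with open(path) as f:
    loaded = json.load(f)
print(sorted(loaded))  # → ['annotations', 'categories', 'images']
```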

@tutu-star
Author

> I can only tell you what kind of json file would work. […] About the target_index, I still don't know how to use it. Maybe we can have a discussion.

Thank you very much for your reply, and I look forward to your further progress. Thank you!

@RobertTang0

Hello, I would like to ask how to use my own COCO-style dataset for training. Which parts of the code need to be modified?

@fyang064

fyang064 commented Sep 6, 2023

> I can only tell you what kind of json file would work. […] About the 'categories' field, there's no need for 'embedding'. […]

Hi Kuhn, your reply looks good. I'm just wondering whether the categories in your JSON file were generated by the off-the-shelf language model, since I didn't see anything like them in the JSON files provided by the authors.

@zifuwan

zifuwan commented Sep 28, 2023

> I can only tell you what kind of json file would work. […] About the 'categories' field, there's no need for 'embedding'. […]

Hi, I'm also trying to reproduce the CLIP-Aligned Labeling. Regarding what you said ("About the 'categories' field, there's no need for 'embedding'"), do you think we can use an untrained region classifier to get a refined class score for each class? That is, instead of storing the embeddings as shown in your picture, shall we multiply the region embedding with the text embedding to get a score?
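The scoring described above can be sketched as a cosine similarity between L2-normalized region and text embeddings followed by a softmax, the way CLIP computes image-text scores. A minimal sketch, assuming CLIP-style embeddings (the shapes, function name, and temperature value below are illustrative, not CORA's actual implementation):

```python
import numpy as np

def region_class_scores(region_emb, text_emb, temperature=0.01):
    """region_emb: (num_regions, d) array; text_emb: (num_classes, d) array.
    Returns per-region class probabilities of shape (num_regions, num_classes)."""
    # L2-normalize both sides so the dot product is a cosine similarity.
    r = region_emb / np.linalg.norm(region_emb, axis=-1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=-1, keepdims=True)
    logits = r @ t.T / temperature
    # Numerically stable softmax over the class dimension.
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
regions = rng.normal(size=(4, 512))   # e.g. 4 region embeddings
texts = rng.normal(size=(10, 512))    # e.g. 10 class-name text embeddings
scores = region_class_scores(regions, texts)
print(scores.shape)  # → (4, 10)
```

Whether this matches the relabeling scores in the *_relabel.json files would need to be confirmed against the authors' pipeline.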

@Luhui-Zhao

> Hello, I would like to ask how to use my own COCO type data set for training […]

Hello, I also want to use this code to train my own dataset and have run into the same problem as you. I would like to communicate with you: have you solved this problem? If it is convenient, you can add me on QQ (755476579). Looking forward to your reply, thank you.

@tutu-star
Author

tutu-star commented May 20, 2024 via email
