[RFC] Datasets API #53

hassiahk · 2020-11-24T17:24:09Z

🚀 Feature

Having a Datasets API for commonly used formats would come in handy.

Pitch

A non-exhaustive list of formats that are commonly used:

A CSV file with image_id and target columns (binary or multi-class classification). Two variants are used most often:

image_id        target
100011               1
100015               0
100007               2

The format above is already implemented by CSVSingleLabelDataset. Should we support the variant below in the same class, or create a separate one? I think the same class can handle both.

image_id        target
100011.png           1
100015.png           0
100007.png           2
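One way both variants could share a single dataset class is to normalize image_id before building the file path. This is only a minimal sketch: `resolve_image_path` and its `default_ext` parameter are hypothetical names, not part of CSVSingleLabelDataset, and it assumes ids never contain a dot other than the one before the extension.

```python
from pathlib import Path


def resolve_image_path(image_dir, image_id, default_ext=".png"):
    """Return the image path for an id that may or may not carry an extension."""
    name = str(image_id)
    # Path.suffix is "" for "100011" and ".png" for "100011.png".
    if not Path(name).suffix:
        name += default_ext
    return Path(image_dir) / name
```

With this, `100011` and `100011.png` both resolve to the same file, so the CSV reader itself would not need to know which variant it was given.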

A CSV file with image_id and target columns (multi-label classification). Again, two variants are used most often:

image_id        target
100011             0 1
100015             0 2
100007             1 2

image_id        target
100011.png         0 1
100015.png         0 2
100007.png         1 2
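For the multi-label case, the space-separated target would probably need to become a multi-hot vector at some point. A rough sketch, assuming labels are zero-based integers and the number of classes is known up front (`parse_multilabel_target` is a hypothetical helper, not an existing function in the library):

```python
def parse_multilabel_target(target, num_classes):
    """Turn a space-separated label string such as "0 2" into a multi-hot list."""
    multi_hot = [0] * num_classes
    for token in str(target).split():
        label = int(token)
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} is outside [0, {num_classes})")
        multi_hot[label] = 1
    return multi_hot
```

For the three rows shown above and `num_classes=3`, each row would map to a list with two entries set to 1.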

A folder structure like the one below:

folder
|-- test
`-- train
    |-- class_1
    |   |-- 10001.png
    |   `-- 10002.png
    |-- class_2
    |   |-- 10005.png
    |   `-- 10009.png
    `-- class_3
        |-- 10014.png
        `-- 10027.png

The structure above is already handled by create_folder_dataset, but we don't always need to split train into train_set and valid_set, because valid_set is sometimes pre-defined, as below:

folder
|-- test
|-- train
|   |-- class_1
|   |   |-- 10001.png
|   |   `-- 10002.png
|   |-- class_2
|   |   |-- 10005.png
|   |   `-- 10009.png
|   `-- class_3
|       |-- 10014.png
|       `-- 10027.png
`-- valid
    |-- class_1
    |   |-- 10023.png
    |   `-- 10035.png
    |-- class_2
    |   |-- 1002.png
    |   `-- 10042.png
    `-- class_3
        |-- 10029.png
        `-- 10076.png
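To make the idea concrete, here is a small sketch of how a loader could use a pre-defined valid folder when it exists and otherwise leave the split to the caller. The function names are hypothetical and this is not how create_folder_dataset is written; it only uses the standard library and assumes the folder names train and valid shown above.

```python
from pathlib import Path


def list_samples(split_dir, classes):
    """Return (path, class_index) pairs for the class folders found in split_dir."""
    samples = []
    for index, name in enumerate(classes):
        class_dir = Path(split_dir) / name
        if class_dir.is_dir():  # a split may lack some classes
            samples += [(p, index) for p in sorted(class_dir.iterdir()) if p.is_file()]
    return samples


def load_folder_splits(root):
    """Return (train, valid, classes); valid is None when root/valid is absent."""
    root = Path(root)
    # Class names and indices are taken from train so both splits agree.
    classes = sorted(p.name for p in (root / "train").iterdir() if p.is_dir())
    train = list_samples(root / "train", classes)
    valid_dir = root / "valid"
    valid = list_samples(valid_dir, classes) if valid_dir.is_dir() else None
    return train, valid, classes
```

A `None` for valid would then be the signal to fall back to the existing random split of train.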

A CSV file with image_id and bbox columns (object detection). As with the classification tasks, two variants are possible:

image_id        bbox
100011          [834.0, 222.0, 56.0, 36.0]
100011          [226.0, 548.0, 130.0, 58.0]
100007          [377.0, 504.0, 74.0, 160.0]

Honestly, I have never seen the format below, but we could still support it.

image_id             bbox
100011.jpg           [834.0, 222.0, 56.0, 36.0]
100011.jpg           [226.0, 548.0, 130.0, 58.0]
100007.jpg           [377.0, 504.0, 74.0, 160.0]
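Since one image can appear on several rows, a detection dataset would likely need to group boxes per image first. A minimal sketch using only the standard library, assuming the file is comma-separated with the bbox cell quoted and written as a bracketed list of four numbers; `group_boxes_by_image` is a hypothetical name, and what the four numbers mean is left to the caller:

```python
import csv
import io
import json
from collections import defaultdict


def group_boxes_by_image(csv_file):
    """Read (image_id, bbox) rows and collect every box under its image id."""
    boxes = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Each bbox cell looks like "[834.0, 222.0, 56.0, 36.0]".
        box = json.loads(row["bbox"])
        if len(box) != 4:
            raise ValueError(f"expected 4 values for {row['image_id']}, got {len(box)}")
        boxes[row["image_id"]].append([float(v) for v in box])
    return dict(boxes)


# The three example rows from above, written as a quoted CSV.
sample = io.StringIO(
    "image_id,bbox\n"
    '100011,"[834.0, 222.0, 56.0, 36.0]"\n'
    '100011,"[226.0, 548.0, 130.0, 58.0]"\n'
    '100007,"[377.0, 504.0, 74.0, 160.0]"\n'
)
grouped = group_boxes_by_image(sample)
```

Here `grouped` maps each image_id to the list of its boxes, so 100011 ends up with two boxes and 100007 with one.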

I have only come across the four formats above, so do let me know if I missed any. Also let me know your thoughts on the above.

cc @zhiqwang


zhiqwang · 2020-11-24T18:18:50Z

For object detection, there are two other frequently used formats: Pascal VOC and MS COCO, both of which are supported in torchvision. Did we leave these two out because we plan to simply use torchvision's implementations when we encounter these datasets?

oke-aditya · 2020-11-24T18:28:00Z

I think we should discuss this more. Datasets are really tricky, especially for object detection.
For the torchvision models, we expect VOC format.
And for DETR, a normalized YOLO format.
We haven't enforced these, as they come from the models themselves.
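To illustrate the gap between the two box conventions, here is a small conversion sketch. It assumes "VOC format" means [xmin, ymin, xmax, ymax] in pixels and "normalized YOLO format" means [cx, cy, w, h] divided by the image size; that is the usual reading but it has not been confirmed in this thread, and `voc_to_normalized_cxcywh` is a hypothetical helper.

```python
def voc_to_normalized_cxcywh(box, image_width, image_height):
    """Convert [xmin, ymin, xmax, ymax] in pixels to normalized [cx, cy, w, h]."""
    xmin, ymin, xmax, ymax = box
    return [
        (xmin + xmax) / 2 / image_width,   # box center x, as a fraction of width
        (ymin + ymax) / 2 / image_height,  # box center y, as a fraction of height
        (xmax - xmin) / image_width,       # box width, as a fraction of width
        (ymax - ymin) / image_height,      # box height, as a fraction of height
    ]
```

A Datasets API could keep one canonical format internally and apply a conversion like this per model, rather than requiring users to prepare a different file for each.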

hassiahk added the enhancement New feature or request label Nov 24, 2020

oke-aditya added the datasets Providing Datasets to users label Nov 24, 2020

oke-aditya assigned hassiahk and oke-aditya Nov 24, 2020
