What is the purpose of train and test set? #1

matemato · 2025-01-03T23:45:50Z

Hi,

I want to measure the diversity of my synthetic dataset of generated faces, using the FFHQ dataset as the real dataset.
Could you please explain the purpose of the train and test set when running your script?

Thank you so much!

MischaD · 2025-01-07T07:03:33Z

Hi,

If you only want to compare two different synthetic datasets with each other, there is no purpose for it. The one with the higher unadjusted IRS score is the more diverse.

There are two potential reasons to use it:

1. You want to compare the diversity of real data to your synthetic dataset. Then you should add a test set.
2. You want the diversity to be interpretable in terms of the percentage of the diversity of the original dataset. Then we need a real reference dataset to adjust for the lack of diversity that comes from the feature extractor (e.g., SwAV). Refer to Section 3.4 for more information.
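
To illustrate the second point, here is a minimal sketch. It assumes the adjustment amounts to dividing the synthetic score by the score a real reference set reaches under the same feature extractor. That ratio is only an assumption for illustration; Section 3.4 defines the actual adjustment. The helper `relative_diversity` and the numbers are hypothetical and not part of the script.

```python
def relative_diversity(synthetic_score: float, real_reference_score: float) -> float:
    """Express a synthetic score as a fraction of a real reference score.

    Hypothetical helper: assumes the adjustment is a plain ratio,
    which may differ from the definition in Section 3.4.
    """
    if real_reference_score <= 0:
        raise ValueError("real_reference_score must be positive")
    return synthetic_score / real_reference_score


# Made-up scores, for illustration only.
ratio = relative_diversity(0.30, 0.60)
print(f"{ratio:.0%} of the reference diversity")
```

Under that assumption, a value near 1 would mean the synthetic data is about as diverse as the real reference, as seen through the same feature extractor.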

Hope that helps.

matemato · 2025-01-09T12:39:59Z

Thank you for your answer!

Could you elaborate further on what should be passed as a) train set, b) test set, and c) synthetic data for the 3 examples you mentioned:

1. Comparing the diversity of two different synthetic datasets
2. Comparing the diversity of real data to a synthetic dataset
3. Making the diversity interpretable as a percentage of the diversity of the original dataset

i.e. what to pass down as arguments when running your script:

results = run("path/to/train", "path/to/test", "path/to/synth", "out/path", "results.json", config)
