README_dataset.md
Dataset

Download

Dataset file structure

|--FFHQ-UV  
    |--dataset
        |--ffhq-uv  # FFHQ-UV dataset
            |--uv-maps  # facial texture UV-maps
            |--face-latents  # normalized face images' latent codes
            |--meta-info.json  # meta information: normalized face images' attributes, corresponding indices in FFHQ
        |--ffhq-uv-interpolate  # FFHQ-UV-Interpolate dataset
            |--uv-maps  # facial texture UV-maps
            |--face-latents  # normalized face images' latent codes
            |--meta-info.json  # meta information: normalized face images' attributes, corresponding indices in FFHQ
    |--dataset_project  # FFHQ-UV dataset project details
        |--latents.zip   # inverted faces' latent codes
        |--attributes.zip  # detected attributes (.json file)
        |--attributes_ms_api.zip  # detected attributes (the object returned by MS Face API)
        |--lights.zip  # detected light attribute (SH coefficient)
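After downloading, it can be useful to confirm that the local copy matches the tree above. The following is a minimal sketch, not part of the released code: the relative paths are copied from the structure shown above, while the function name `missing_entries` and the location of the `FFHQ-UV` root on disk are assumptions.

```python
from pathlib import Path

# Relative paths copied from the file structure above.
EXPECTED = [
    "dataset/ffhq-uv/uv-maps",
    "dataset/ffhq-uv/face-latents",
    "dataset/ffhq-uv/meta-info.json",
    "dataset/ffhq-uv-interpolate/uv-maps",
    "dataset/ffhq-uv-interpolate/face-latents",
    "dataset/ffhq-uv-interpolate/meta-info.json",
    "dataset_project/latents.zip",
    "dataset_project/attributes.zip",
    "dataset_project/attributes_ms_api.zip",
    "dataset_project/lights.zip",
]


def missing_entries(root):
    """Return the expected relative paths that do not exist under `root`.

    `root` is the local FFHQ-UV folder (hypothetical location; adjust as needed).
    """
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

Calling `missing_entries("FFHQ-UV")` returns an empty list when every folder and file listed above is present, and otherwise the entries that still need to be downloaded.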

Normalized face images' latent codes

  • In "/FFHQ-UV/dataset/ffhq-uv/face-latents/".
  • We provide the latent codes of the multi-view normalized face images which are used for extracting texture UV-maps.
  • One can generate face images from the downloaded latent codes with the following script.
sh run_gen_face_from_latent.sh  # Please refer to this script for detailed configuration

Normalized face images' meta information

  • In "/FFHQ-UV/dataset/ffhq-uv/meta-info.json".
  • We provide the attributes (gender, age, beard) of each face, as detected by the Microsoft Face API.
  • Along with the attributes, we also provide the corresponding indices in the original FFHQ dataset.
  • We prioritize the quality and illumination properties of the normalized faces over identity fidelity to the faces in the original FFHQ dataset. We therefore discourage researchers from looking for correspondences between samples in FFHQ-UV and the original FFHQ dataset, and to that end we have shuffled the samples in the FFHQ-UV dataset.
  • In addition, we introduce a very small number of extra samples (about 1K) to alleviate the data biases of FFHQ and StyleGAN (e.g., the under-representation of dark-skinned faces). For these samples, we do not provide indices in the original FFHQ dataset.
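As an illustration of the last two points, the sketch below separates samples that carry an FFHQ index from those that do not. The schema is an assumption: this README does not document the layout of `meta-info.json`, so the dictionary-of-records structure, the key name `ffhq_index`, and the example values are all hypothetical and should be adapted after inspecting the real file.

```python
import json


def split_by_ffhq_index(meta, index_key="ffhq_index"):
    """Split sample ids into (with_index, without_index).

    `meta` is assumed to map a sample id to a record of attributes;
    `index_key` is a hypothetical key name, not taken from the real file.
    """
    with_index, without_index = [], []
    for sample_id, record in meta.items():
        if record.get(index_key) is None:
            without_index.append(sample_id)
        else:
            with_index.append(sample_id)
    return with_index, without_index


# Made-up records for illustration only; the real file is loaded with json.load().
example = json.loads(
    '{"000000": {"gender": "female", "age": 31, "beard": 0.0, "ffhq_index": 123},'
    ' "000001": {"gender": "male", "age": 45, "beard": 0.4, "ffhq_index": null}}'
)
with_index, without_index = split_by_ffhq_index(example)
```

With the real data, the second list would be expected to hold roughly the 1K added samples mentioned above, provided missing indices are actually stored as null or omitted.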

FFHQ-UV dataset project details

  • In "/FFHQ-UV/dataset_project".
  • Since the Microsoft Face API is not accessible to new users, we provide project details of the FFHQ-UV dataset creation pipeline to make our work easier to reproduce, including the inverted faces' latent codes (latents.zip) and the detected attributes of the inverted faces (lights.zip, attributes.zip, attributes_ms_api.zip).
  • With these files, one can directly run step3 (DataSet_Step3_Editing) and step4 (DataSet_Step4_UV_Texture) to generate the final texture UV-maps.
  • In "attributes_ms_api.zip", we provide the object returned by MS Face API, where one can find other detected attributes (e.g., age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure, noise, qualityForRecognition).
  • Since 380 of the 70,000 FFHQ samples failed in the encoder4editing inversion task, we only provide information for the remaining 69,620 samples.

Disclaimer

We used the Microsoft Face API to detect some attributes of faces in the FFHQ dataset inverted to the StyleGAN latent space, and made them publicly available to researchers. We emphasize:

  • This is a research-oriented open source project, and the attributes detected by Microsoft Face API are not used commercially.
  • The faces used for detecting attributes are not real faces, but synthetic faces generated by using encoder4editing technology to map the public dataset FFHQ to the StyleGAN latent space.
  • The data related to face attributes provided by this project is limited to research use. Please refer to Microsoft Face API for face data and privacy-related information.

FFHQ-UV-Interpolate dataset

FFHQ-UV-Interpolate is a variant of FFHQ-UV built by latent space interpolation. It trades some diversity for higher quality and a larger scale (100,000 UV-maps).

We adopt the following main steps to obtain FFHQ-UV-Interpolate from FFHQ-UV:

  • Automatic data filtering considering BS Error, valid texture area ratio, expression detection, etc.
  • Sample classification considering attributes such as gender, age, beard, etc.
  • Latent space interpolation within each sample category.
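The third step can be sketched as follows. This is a simplified illustration and not the authors' implementation: it assumes plain linear interpolation between two randomly chosen latent codes of the same category, represents each latent code as a flat list of floats, and uses hypothetical names (`lerp`, `interpolate_within_categories`). The actual interpolation scheme and latent layout are not specified in this README.

```python
import random
from collections import defaultdict


def lerp(a, b, t):
    """Linearly interpolate between two equal-length latent vectors."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def interpolate_within_categories(samples, n_new, seed=0):
    """Create `n_new` latent codes by mixing pairs from the same category.

    `samples` is a list of (category, latent) pairs, where `category` is any
    hashable label (e.g. a tuple of gender, age group, beard) and `latent`
    is a list of floats. Both conventions are assumptions for this sketch.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for category, latent in samples:
        groups[category].append(latent)

    # A category needs at least two samples to interpolate between.
    eligible = [c for c, g in groups.items() if len(g) >= 2]
    if not eligible:
        raise ValueError("no category has two or more samples")

    new_samples = []
    for _ in range(n_new):
        category = rng.choice(eligible)
        a, b = rng.sample(groups[category], 2)
        new_samples.append((category, lerp(a, b, rng.random())))
    return new_samples
```

Because both endpoints always come from the same category, each new latent code keeps that category's label, which is what makes the earlier classification step useful.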

Some quantitative comparisons between FFHQ-UV and FFHQ-UV-Interpolate (ID std. values are reported relative to the value for FFHQ):

| Dataset | ID std. $\uparrow$ | # UV-maps $\uparrow$ | BS Error $\downarrow$ |
| --- | --- | --- | --- |
| FFHQ-UV | 90.06% | 54,165 | 7.293 |
| FFHQ-UV-Interpolate | 80.12% | 100,000 | 4.490 |