This repository contains code demonstrating the UniSSDA method from our CVPR 2024 paper *Universal Semi-Supervised Domain Adaptation by Mitigating Common-Class Bias*. See the arXiv version for the appendix.
Create a conda environment:

```shell
conda env create -f environment.yml
```
We prepared two public datasets:

- Office-Home
- DomainNet

Download them by running:

```shell
python download_data.py
```
In the `data` directory, the `txt` folder contains the text files defining the splits for each dataset, organized under the name of the dataset. The `txt` folder is for covariate shift only, the `txt_labelshift` folder is for covariate + label shift with the same sample size as in `txt`, and the `txt_fullsize` folder is for covariate + label shift with the full dataset size. Generate splits by navigating to the selected folder and running:

```shell
python generate_txt.py
```
To add a new dataset (e.g., NewData), place it in a folder named `NewData` in the datasets directory (the path is provided in the arguments to `main.py`; `./data` by default).
The file structure for the dataset should be:

```
NewData
│
└───domain1
│   │   image1
│   │   image2
│   │   ...
│
└───domain2
│   │   image1
│   │   image2
│   │   ...
│
...
```
The splits for each domain are defined as 50% train, 20% validation, 30% test. Few-shot training and validation sets are sampled from the corresponding splits.
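The split and few-shot sampling described above can be sketched as follows. This is illustrative only: the repository's actual split generation lives in `generate_txt.py`, and the function names here are hypothetical.

```python
import random

def split_domain(image_paths, seed=0):
    """Split one domain's images into 50% train / 20% validation / 30% test."""
    paths = sorted(image_paths)
    rng = random.Random(seed)
    rng.shuffle(paths)
    n = len(paths)
    n_train, n_val = int(0.5 * n), int(0.2 * n)
    return paths[:n_train], paths[n_train:n_train + n_val], paths[n_train + n_val:]

def sample_kshot(labeled, k, seed=0):
    """Sample up to k examples per class from a list of (path, class_id) pairs."""
    rng = random.Random(seed)
    by_class = {}
    for path, cls in labeled:
        by_class.setdefault(cls, []).append(path)
    return {cls: rng.sample(v, min(k, len(v))) for cls, v in by_class.items()}
```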
Each row in the text file is in the format `relative_path_of_image_to_dataset_folder class_id` (e.g., `Clipart/Alarm_Clock/00053.jpg 0`).
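For illustration, rows in this format could be parsed with a small helper like the one below. This is a hypothetical sketch; the repository's dataloader handles this internally.

```python
def parse_split_file(lines):
    """Parse rows of the form 'relative/path/to/image.jpg class_id'."""
    samples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The class id is the last space-separated token; the rest is the path.
        rel_path, class_id = line.rsplit(" ", 1)
        samples.append((rel_path, int(class_id)))
    return samples
```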
To generate the text files for NewData, after ensuring it has the file structure stated above, create a new folder named `NewData` in the `txt` folder and run the provided `generate_txt.py`.
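The core of such generation can be sketched as below, assuming each domain folder contains one subfolder per class, as the example path `Clipart/Alarm_Clock/00053.jpg` suggests. This is a simplified, hypothetical stand-in for `generate_txt.py`.

```python
import os

def write_split_file(dataset_root, domain, out_path, class_to_id):
    """Write 'relative_path class_id' rows for one domain of a dataset."""
    domain_dir = os.path.join(dataset_root, domain)
    with open(out_path, "w") as f:
        for cls in sorted(os.listdir(domain_dir)):
            for img in sorted(os.listdir(os.path.join(domain_dir, cls))):
                # Paths are written relative to the dataset folder.
                f.write(f"{domain}/{cls}/{img} {class_to_id[cls]}\n")
```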
Next, add configs for the dataset in `configs/hparams.py`, `configs/data_model_configs.py`, and `dataloader/dataloader.py` to define training hyperparameters and cross-domain adaptation scenarios.
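As a rough sketch, a dataset config entry might look like the following. All attribute and value names here are illustrative assumptions; match the structure actually used in `configs/data_model_configs.py` and `configs/hparams.py`.

```python
# Hypothetical dataset config: a class holding the cross-domain
# adaptation scenarios and dataset metadata.
class NewData:
    def __init__(self):
        # (source, target) domain pairs to evaluate.
        self.scenarios = [("domain1", "domain2"), ("domain2", "domain1")]
        self.num_classes = 10
        self.input_size = 224

# Hypothetical training hyperparameters for the dataset.
new_data_hparams = {
    "learning_rate": 1e-3,
    "batch_size": 32,
    "num_epochs": 50,
}
```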
The following algorithms are implemented:

- Supervised baseline
- CDAC
- PAC
- AdaMatch
- DST
- Proposed method
To add a new algorithm, place it in `algorithms/algorithms.py`.
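A new algorithm would follow the shape sketched below. The base-class name and method signatures are assumptions for illustration; match the actual interface defined in `algorithms/algorithms.py` when integrating.

```python
# Hypothetical stand-in for the repository's algorithm base class.
class Algorithm:
    def __init__(self, configs, hparams):
        self.configs, self.hparams = configs, hparams

class NewMethod(Algorithm):
    """Skeleton for a new semi-supervised DA method."""
    def update(self, labeled_batch, unlabeled_batch):
        # Compute a supervised loss on the labeled batch and an
        # adaptation loss on the unlabeled batch, then step the optimizer.
        sup_loss = 0.0    # placeholder
        adapt_loss = 0.0  # placeholder
        return {"total_loss": sup_loss + adapt_loss}
```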
The experiments are organised hierarchically:

- Several experiments are collected under one directory assigned by `--experiment_description`.
- Each experiment can have different trials, each specified by `--run_description`.
To train a model:

```shell
python main.py --experiment_description expt_run-txt-Resnet34-office_home-openpartial \
    --run_description expt-Proposed-kshot-3 \
    --da_setting openpartial \
    --da_method Proposed \
    --dataset office_home \
    --backbone Resnet34 \
    --num_seeds 3 \
    --sampling kshot \
    --num_shots 3 \
    --data_path "./data/txt" \
    --data_root "./data"
```
Sample scripts are in the `scripts` folder.
We use WandB for visualizations of model training. Sign up for a WandB account using a GitHub or Google account, then add `--wandb_entity TEAM_NAME` as an argument to `main.py`, where `TEAM_NAME` is an existing WandB team you are in (e.g., `--wandb_entity ssda`). Additional WandB arguments can be specified through `wandb_dir`, `wandb_project`, and `wandb_tag` for organizing WandB runs, logs, and artifacts.
Results for each run are saved in `experiments_logs`. Obtain consolidated results by running:

```shell
python consolidation/consolidate_run.py
```
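Conceptually, consolidation aggregates a metric over the seeds of each run. The sketch below shows only this aggregation step with a hypothetical input shape; the real script reads its inputs from `experiments_logs`.

```python
import statistics

def consolidate(runs):
    """Return mean and (sample) standard deviation of per-seed accuracies.

    `runs` maps a run name to a list of accuracies, one per seed.
    """
    return {
        name: (statistics.mean(accs),
               statistics.stdev(accs) if len(accs) > 1 else 0.0)
        for name, accs in runs.items()
    }
```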
```bibtex
@INPROCEEDINGS{zhang2024unissda,
  author={Zhang, Wenyu and Liu, Qingmu and Wei Cong, Felix Ong and Ragab, Mohamed and Foo, Chuan-Sheng},
  booktitle={2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  title={Universal Semi-Supervised Domain Adaptation by Mitigating Common-Class Bias},
  year={2024},
  volume={},
  number={},
  pages={23912-23921},
  doi={10.1109/CVPR52733.2024.02257}}
```
This repository is adapted from AdaTime: *A Benchmarking Suite for Domain Adaptation on Time Series Data*.