This repository implements the model proposed in the ACCV 2024 paper:
Kin Wai Lau, Yasar Abbas Ur Rehman, Pedro Porto Buarque de Gusmão, Lai-Man Po, Lan Ma, Yuyang Xie, FedRepOpt: Gradient Re-parametrized Optimizers in Federated Learning [arXiv paper]
The implementation is based on the code of Re-parameterizing Your Optimizers rather than Architectures (ICLR 2023); for more information, please refer to the linked repository.
If you use this code, please cite:
```bibtex
@misc{lau2024fedrepoptgradientreparametrizedoptimizers,
      title={FedRepOpt: Gradient Re-parametrized Optimizers in Federated Learning},
      author={Kin Wai Lau and Yasar Abbas Ur Rehman and Pedro Porto Buarque de Gusmão and Lai-Man Po and Lan Ma and Yuyang Xie},
      year={2024},
      eprint={2409.15898},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2409.15898},
}
```
You can download our hyper-parameter-searched models on CIFAR100 as follows:
You can download our pretrained models on Tiny ImageNet as follows:

Pretrained with 1 local epoch and 240 rounds under the cross-silo NIID setting:
- Fed-RepGhost-Tr 0.5x (arch: ghost-rep) link
- Fed-RepGhost-Inf 0.5x (arch: ghost-target-norepopt) link
- Fed-CSLA-Ghost 0.5x (arch: ghost-csla) link
- FedRepOpt-GhostNet 0.5x (arch: ghost-target) link
- Fed-RepVGG-B1-Tr (arch: RepVGG-B1-repvgg) link
- Fed-RepVGG-B1-Inf (arch: RepVGG-B1-target-norepopt) link
- Fed-CSLA-VGG-B1 (arch: RepVGG-B1-csla) link
- FedRepOpt-VGG-B1 (arch: RepVGG-B1-target) link
Pretrained with 5 local epochs and 1000 rounds under the cross-device NIID setting:
- Fed-RepGhost-Tr 0.5x (arch: ghost-rep) link
- Fed-RepGhost-Inf 0.5x (arch: ghost-target-norepopt) link
- Fed-CSLA-Ghost 0.5x (arch: ghost-csla) link
- FedRepOpt-GhostNet 0.5x (arch: ghost-target) link
- Fed-RepVGG-B1-Tr (arch: RepVGG-B1-repvgg) link
- Fed-RepVGG-B1-Inf (arch: RepVGG-B1-target-norepopt) link
- Fed-CSLA-VGG-B1 (arch: RepVGG-B1-csla) link
- FedRepOpt-VGG-B1 (arch: RepVGG-B1-target) link
You can download our NIID Tiny ImageNet annotation files as follows:
- Cross-silo NIID (α=0.1 in the Dirichlet distribution, number of clients = 10) link
- Cross-device NIID (α=0.1 in the Dirichlet distribution, number of clients = 100) link
The `data_splitter/tiny-imagenet_json_splitter_direchlet.py` script provides a tool for generating IID and NIID annotations for Tiny-ImageNet.
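For reference, the NIID partitioning follows the standard label-wise Dirichlet scheme. Below is a minimal, illustrative sketch of that idea; the function name and signature are placeholders, not the actual interface of the splitter script:

```python
import numpy as np

def dirichlet_partition(labels, num_clients=10, alpha=0.1, seed=0):
    """Split sample indices into `num_clients` NIID shards.

    For every class, a Dirichlet(alpha) draw decides which fraction of that
    class each client receives; a small alpha gives highly skewed partitions.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        cls_idx = rng.permutation(np.where(labels == cls)[0])
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Turn the per-client proportions into split points within this class.
        split_points = (np.cumsum(proportions)[:-1] * len(cls_idx)).astype(int)
        for client_id, part in enumerate(np.split(cls_idx, split_points)):
            client_indices[client_id].extend(part.tolist())
    return client_indices

# Example: 10 clients with alpha=0.1, matching the cross-silo annotations above.
if __name__ == "__main__":
    fake_labels = np.random.randint(0, 200, size=100000)  # Tiny-ImageNet has 200 classes
    shards = dirichlet_partition(fake_labels, num_clients=10, alpha=0.1)
    print([len(s) for s in shards])
```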
Requirements:
- Python 3.8.0
- PyTorch 1.7.1
- Flower 1.3.0
Install the required packages:
pip install -r requirements.txt
You can run the following command to conduct a hyper-parameter search on CIFAR100 in a centralized setting.
```bash
python -m torch.distributed.launch --nproc_per_node NUM_GPUS --master_port PORT_NUM main_repopt_centralized.py \
    --data-path /path/to/cifar100 \
    --arch ghost-hs \
    --batch-size 128 \
    --tag search \
    --opts TRAIN.EPOCHS 600 TRAIN.BASE_LR 0.6 TRAIN.WEIGHT_DECAY 1e-5 TRAIN.WARMUP_EPOCHS 10 MODEL.LABEL_SMOOTHING 0.1 DATA.DATASET cf100 TRAIN.CLIP_GRAD 5.0
```
The hs.sh script provides example commands for finding hyper-parameters for RepOpt-VGG-B1 and GhostNet.
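In RepOpt-style training, the constants found by this search are consumed by the optimizer rather than the architecture: the plain (inference-time) model is trained directly, and selected parameters' gradients are multiplied by fixed scales derived from the search. The snippet below is only an illustrative sketch of that idea; `ScaledSGD` and `grad_scales` are placeholder names, not the optimizer class used in this repository:

```python
import torch

class ScaledSGD(torch.optim.SGD):
    """SGD that multiplies selected parameters' gradients by fixed constants.

    `grad_scales` maps a parameter to a tensor of the same shape; this mirrors
    the RepOpt idea of moving structural re-parameterization into the optimizer.
    """
    def __init__(self, params, grad_scales, lr=0.01, momentum=0.9, weight_decay=0.0):
        super().__init__(params, lr=lr, momentum=momentum, weight_decay=weight_decay)
        self.grad_scales = grad_scales  # dict: parameter -> scale tensor

    @torch.no_grad()
    def step(self, closure=None):
        for p, scale in self.grad_scales.items():
            if p.grad is not None:
                p.grad.mul_(scale)  # re-parametrize the gradient, not the model
        return super().step(closure)
```

In this repository, the output of the search is later supplied via --scales-path when launching federated training (see the command below).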
You can run the following command to conduct federated training on Tiny ImageNet.
```bash
python src_fl/main.py \
    --data-path /path/to/tiny-imagenet-200 \
    --arch ghost-target-tinyImageNet \
    --batch-size 32 \
    --tag experiment \
    --num_clients_per_round 10 \
    --pool_size 10 \
    --rounds 240 \
    --scales-path /path/to/hyper-parameter-search/model \
    --opts TRAIN.EPOCHS 1 TRAIN.BASE_LR 0.01 TRAIN.LR_SCHEDULER.NAME step TRAIN.LR_SCHEDULER.DECAY_RATE 0.0 TRAIN.WEIGHT_DECAY 4e-5 TRAIN.WARMUP_EPOCHS 0 MODEL.LABEL_SMOOTHING 0.1 AUG.PRESET raug15 DATA.DATASET tiny_imagenet DATA.IMG_SIZE 64 LOGOUTPUT log TRAIN.OPTIMIZER.MOMENTUM 0.0 \
    DATA.ANNOTATIONS_FED annotations_fed_alpha_0.1_clients_10 SEED 0
```
- `num_clients_per_round` is the number of clients participating in training in each round, and `pool_size` is the number of dataset partitions (i.e., the total number of clients). If `num_clients_per_round` is set to 10 and `pool_size` is 10, all clients participate in every round.
- `rounds` is the total number of FL rounds, and `TRAIN.EPOCHS` is the number of local training epochs for each client.
- `train_repopt_fl.sh` provides training command examples for all the models.
- The evaluation results will be stored in `output/arch/server/log_rank0.txt`.
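For intuition, the sketch below shows, in plain PyTorch and outside of any FL framework, the client-sampling and FedAvg loop that these flags control. It is illustrative only and is not the code in `src_fl/main.py`; `local_train` and `run_federated_training` are assumed helper names:

```python
import copy
import random
import torch
import torch.nn.functional as F

def local_train(model, loader, epochs, lr):
    """Plain local SGD on one client's partition (TRAIN.EPOCHS local epochs)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            F.cross_entropy(model(x), y).backward()
            opt.step()

def run_federated_training(model, client_loaders, rounds=240,
                           num_clients_per_round=10, local_epochs=1, lr=0.01):
    pool_size = len(client_loaders)  # --pool_size = number of dataset partitions
    for _ in range(rounds):          # --rounds
        # Sample --num_clients_per_round clients for this round.
        sampled = random.sample(range(pool_size), num_clients_per_round)
        states = []
        for cid in sampled:
            local_model = copy.deepcopy(model)
            local_train(local_model, client_loaders[cid], local_epochs, lr)
            states.append(local_model.state_dict())
        # FedAvg: uniform average of the sampled clients' weights.
        avg = {k: torch.stack([s[k].float() for s in states]).mean(0) for k in states[0]}
        model.load_state_dict(avg)
```

With `pool_size` equal to `num_clients_per_round` (as in the command above), every client is selected in every round.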