
Add condition param to autotune; Add feature engineering module #426

Merged — 51 commits into alibaba:master, Dec 13, 2022

Conversation

@rayrayraykk (Collaborator) commented Nov 10, 2022

Main changes

condition param

The style is shown below:

   xgb_base.use  train.optimizer.lr  train.optimizer.num_of_trees  performance
0          True                 NaN                           5.0     0.59876
1         False            0.694859                           NaN     0.599826
2         False            0.876475                           NaN     0.600004
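The table above shows a conditional search space: `xgb_base.use` is the parent parameter, and each child hyperparameter is only sampled when its condition on the parent holds, so the inactive column shows NaN. A minimal pure-Python sketch of this sampling behavior (the sampler and the `SPACE` dict are illustrative, not FederatedScope's implementation; only the parameter names come from the table):

```python
import math
import random

# Parent parameter first, then children with conditions on the parent.
SPACE = {
    'xgb_base.use': {'choices': [True, False]},
    'train.optimizer.lr': {            # active only for the non-XGB branch
        'range': (1e-3, 1.0),
        'condition': ('xgb_base.use', False),
    },
    'train.optimizer.num_of_trees': {  # active only for the XGB branch
        'range': (1, 10),
        'condition': ('xgb_base.use', True),
    },
}

def sample(space, rng):
    """Sample one configuration, honoring parent conditions."""
    cfg = {}
    for name, spec in space.items():
        cond = spec.get('condition')
        if cond is not None and cfg.get(cond[0]) != cond[1]:
            cfg[name] = math.nan          # inactive -> NaN, as in the table
        elif 'choices' in spec:
            cfg[name] = rng.choice(spec['choices'])
        else:
            lo, hi = spec['range']
            cfg[name] = rng.uniform(lo, hi)
    return cfg

rng = random.Random(0)
for _ in range(3):
    print(sample(SPACE, rng))
```

Each printed configuration has exactly one of the two child parameters active, matching the rows of the table.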

wandb

How to enable:
Install wandb and set up its init configs.
Run with wandb.use True:

python federatedscope/hpo.py --cfg federatedscope/autotune/baseline/fedex_vfl.yaml wandb.use True wandb.name_project vfl_demo wandb.name_user vfl_demo
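The CLI overrides in the command above can equivalently live in the YAML config. A minimal fragment, assuming the dotted keys shown in the command (`wandb.use`, `wandb.name_project`, `wandb.name_user`) map to nested YAML entries in the usual yacs style:

```yaml
wandb:
  use: True
  name_project: vfl_demo
  name_user: vfl_demo
```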

Feature engineering

I implemented feature engineering as a wrapper for workers.

FederatedScope
├── federatedscope
│   ├── core           
│   |   ├── ...
│   |   ├── feature
│   |   |   ├── hFL
│   |   |   |   ├── ...
│   |   |   ├── vFL
│   |   |   |   ├── preprocess
│   |   |   |   |   ├── log_transform.py
│   |   |   |   |   ├── quantile_binning.py
│   |   |   |   |   ├── ...
│   |   |   |   ├── selection
│   |   |   |   |   ├── correlation_filter.py
│   |   |   |   |   ├── iv_filter.py
│   |   |   |   |   ├── ...

How to use:

python federatedscope/main.py --cfg federatedscope/vertical_fl/xgb_base/baseline/xgb_base_on_adult.yaml feat_engr.type vfl_instance_norm
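The wrapper idea can be sketched as follows: a `feat_engr` wrapper takes an already-built worker and applies preprocessing to its data before training starts. `DummyWorker`, `instance_norm`, and the attribute names here are hypothetical stand-ins; FederatedScope's real wrappers (e.g. the `wrap_instance_norm_server` discussed below) differ in detail:

```python
class DummyWorker:
    """Stand-in for an FL worker that holds tabular data as rows of floats."""
    def __init__(self, data):
        self.data = data

def instance_norm(rows):
    """Normalize each row (instance) to zero mean and unit std."""
    normed = []
    for row in rows:
        mean = sum(row) / len(row)
        var = sum((x - mean) ** 2 for x in row) / len(row)
        std = var ** 0.5 or 1.0       # guard against constant rows
        normed.append([(x - mean) / std for x in row])
    return normed

def wrap_instance_norm(worker):
    """Return the same worker with its data preprocessed in place."""
    worker.data = instance_norm(worker.data)
    return worker

worker = wrap_instance_norm(DummyWorker([[1.0, 2.0, 3.0], [10.0, 10.0, 10.0]]))
print(worker.data)
```

The worker object itself is unchanged apart from its data, which is what makes the wrapper approach lightweight (and also what a reviewer questions further down).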

@rayrayraykk added the "enhancement" (New feature or request) label Nov 10, 2022
@@ -57,7 +57,7 @@ def run(self):
server_class=get_server_cls(self._trial_cfg),
client_class=get_client_cls(self._trial_cfg),
config=self._trial_cfg.clone(),
client_config=self._client_cfgs)
client_configs=self._client_cfgs)
Collaborator:

To avoid mistakes like this, we need to improve the coverage of our unit tests.

Collaborator (Author):

Added in GitHub actions.

joneswong previously approved these changes Nov 10, 2022

@joneswong (Collaborator) left a comment:

Good job! Conditional search spaces are prevalent, and we really need them.

@joneswong added the "FedHPO" (FedHPO related) label Nov 10, 2022
@rayrayraykk changed the title from "Add condition param to autotune" to "[WIP]Add condition param to autotune; Add feature engineering module" Nov 21, 2022
@rayrayraykk changed the title from "[WIP]Add condition param to autotune; Add feature engineering module" to "Add condition param to autotune; Add feature engineering module" Nov 22, 2022
@rayrayraykk requested a review from qbc2016 December 5, 2022 08:10
@rayrayraykk changed the title from "[WIP] Add condition param to autotune; Add feature engineering module" to "Add condition param to autotune; Add feature engineering module" Dec 5, 2022
@xieyxclack (Collaborator) left a comment:

The feature engineering module looks good to me!

@@ -64,6 +64,23 @@ def extend_data_cfg(cfg):
cfg.data.quadratic.min_curv = 0.02
cfg.data.quadratic.max_curv = 12.5

# feature engineer
Collaborator:

Should it be "feature engineering"?

@@ -67,10 +67,10 @@ def extend_fl_setting_cfg(cfg):
# ---------------------------------------------------------------------- #
# Vertical FL related options (for demo)
# ---------------------------------------------------------------------- #
cfg.vertical_dims = [5, 10] # Avoid to be removed when `use` is False
Collaborator:

Why do we still need vertical_dims if vertical.use=False?

Collaborator (Author):

xgb.dim and vertical.dims have been merged into one arg, vertical_dims.
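Assuming (as the default `vertical_dims = [5, 10]` at the top of this diff suggests) the list gives cumulative feature boundaries per participant, a vertical split of a 10-feature table would look like this. The `split_features` helper is illustrative, not FederatedScope code:

```python
vertical_dims = [5, 10]   # cumulative boundaries: party 0 ends at 5, party 1 at 10

def split_features(row, dims):
    """Split one sample's features among parties at the cumulative boundaries."""
    parts, start = [], 0
    for end in dims:
        parts.append(row[start:end])
        start = end
    return parts

row = list(range(10))     # features x0..x9
print(split_features(row, vertical_dims))
# under this assumption: party 0 holds x0..x4, party 1 holds x5..x9
```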

Collaborator:

So what is this? I guess it is the input dim. If that is the case, why do we need it?

Collaborator (Author):

> so what is this? I guess it is the input dim. If it is the case, why do we need it?

@qbc2016 Could you explain it briefly for me?

self._init_data_related_var()
self.trigger_train_func(**self.kwargs_for_trigger_train_func)

# Bind method to instance
Collaborator:

It looks good! @qbc2016 Maybe we can improve some modules in FL-Tree like this.

@@ -0,0 +1,14 @@
def run_scheduler(scheduler, cfg, client_cfgs=None):
Collaborator:

Please provide a docstring.
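A possible docstring for `run_scheduler`, as requested above. The argument descriptions are inferred from the signature shown in the diff, not taken from the final code:

```python
def run_scheduler(scheduler, cfg, client_cfgs=None):
    """
    Run the given HPO scheduler to optimize hyperparameters.

    Arguments:
        scheduler: an instantiated scheduler object that drives the HPO
            trials (e.g. an SMAC- or FedEx-based scheduler).
        cfg: the global configuration for the HPO run.
        client_cfgs: optional client-specific configurations; defaults
            to None.
    """
```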

@@ -16,21 +16,26 @@ def run_smac(cfg, scheduler, client_cfgs=None):

def optimization_function_wrapper(config):
Collaborator:

Please supply a docstring for this method. Understanding its input would largely help in figuring out what the method does.



def log2wandb(trial, config, results, trial_cfg):
import wandb
Collaborator:

As I have no idea about the overhead of repeatedly importing a module, I wonder whether it is appropriate to do this. Have you benchmarked such behavior? Here is a related thread: https://stackoverflow.com/questions/128478/should-import-statements-always-be-at-the-top-of-a-module
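On the overhead question: after the first import, Python caches the module in `sys.modules`, so re-executing `import` inside a function is a dict lookup plus a local binding, not a reload. A quick way to check, using `json` as a stand-in for `wandb`:

```python
import sys
import timeit

def lazy():
    import json   # already cached in sys.modules after the first call
    return json

first = lazy()
assert lazy() is first              # same cached module object every time
assert sys.modules['json'] is first

# The repeated-import path is cheap; the remaining concern is readability,
# as the linked StackOverflow thread discusses.
print(timeit.timeit(lazy, number=10000))
```

So the lazy import keeps wandb an optional dependency at a negligible runtime cost.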


def wrap_instance_norm_server(worker):
"""
This function is to perform instance norm vfl tabular data for server.
Collaborator:

What do you mean by "perform instance norm vfl tabular data"?

@@ -160,7 +164,7 @@ def _setup_server(self, resource_info=None, client_resource_info=None):
wrap_nbafl_server
wrap_nbafl_server(server)
logger.info('Server has been set up ... ')
return server
return self.feat_engr_wrapper_server(server)
Collaborator:

It is OK to inject those feature engineering procedures into an FL course this way, but we should change the instantiation of workers to a better (more general and unified) approach. Actually, it is quite confusing to wrap a worker with a feature-engineering wrapper: feature engineering is just a tiny step in an FL course and doesn't change a worker significantly. One usual way to instantiate a worker from a collection of such pluggable behaviors is, I guess, the factory pattern.

joneswong previously approved these changes Dec 13, 2022

@joneswong (Collaborator) left a comment:

approved.

@joneswong (Collaborator) left a comment:

LGTM

@joneswong merged commit a32a8f9 into alibaba:master Dec 13, 2022
Labels: enhancement (New feature or request), FedHPO (FedHPO related)

Successfully merging this pull request may close these issues:

Missing docs for configurations

3 participants