Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Notebook] Update notebook tutorials #322

Merged
merged 4 commits into from
Aug 15, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@

FederatedScope is a comprehensive federated learning platform that provides convenient usage and flexible customization for various federated learning tasks in both academia and industry. Based on an event-driven architecture, FederatedScope integrates rich collections of functionalities to satisfy the burgeoning demands from federated learning, and aims to build up an easy-to-use platform for promoting learning safely and effectively.

A detailed tutorial is provided on our [website](https://federatedscope.io/).
A detailed tutorial is provided on our [website](https://federatedscope.io/), and you can try FederatedScope in [FederatedScope Playground](https://try.federatedscope.io/) or [Google Colab](https://colab.research.google.com/github/alibaba/FederatedScope).

## News
- [07-30-2022] We release FederatedScope v0.2.0!
Expand Down
227 changes: 227 additions & 0 deletions materials/notebook/(CN)CIKMCUP2022.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,227 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Step-by-step Guidance for CIKM 2022 AnalytiCup Competition"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 1. 安装FederatedScope"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2022-07-26T09:23:22.036523Z",
"start_time": "2022-07-26T09:23:21.642646Z"
}
},
"outputs": [],
"source": [
"# 拷贝FederatedScope并切换到比赛所使用的分支cikm22competition\n",
"\n",
"!cp -r fs_latest FederatedScope\n",
"\n",
"# 如果不在playground中请输入命令`cd FederatedScope`\n",
"import os\n",
"os.chdir('FederatedScope')\n",
"\n",
"!git checkout cikm22competition"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 2. 搭建运行环境"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2022-07-26T09:23:28.796456Z",
"start_time": "2022-07-26T09:23:22.038843Z"
}
},
"outputs": [],
"source": [
"# 安装 FS\n",
"!pip install -e . --user"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 3. 下载比赛数据集"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2022-07-26T09:23:29.189922Z",
"start_time": "2022-07-26T09:23:28.798365Z"
}
},
"outputs": [],
"source": [
"# 比赛数据集已经提前下载到`data`目录下,通过如下的命令进行解压:\n",
"!mkdir -p data\n",
"!cp -r ../data/CIKM22Competition ./data/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"现在比赛数据集已经被放置在`FederatedScope/data/CIKM22Competition`目录下,遵循`CIKM22Competition/${client_id}`的方式进行组织,其中`${client_id}`是client的序号并从1开始计数。每个client的目录下包含训练(train.pt),测试(test.pt)和验证(val.pt)三个部分的数据。可以通过`torch.load`的方式来查看数据:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2022-07-26T09:23:29.823058Z",
"start_time": "2022-07-26T09:23:29.192273Z"
}
},
"outputs": [],
"source": [
"import torch\n",
"# 加载序号为1的client的训练数据\n",
"train_data_client1 = torch.load('./data/CIKM22Competition/1/train.pt')\n",
"# 查看训练数据中的第一个训练样例\n",
"print(train_data_client1[0])\n",
"# 查看训练样例的label\n",
"print(train_data_client1[0].y)\n",
"# 查看训练样例的序号,即${sample_id}\n",
"print(train_data_client1[0].data_index)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 4. 在比赛数据上运行baseline算法"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"FederatedScope为比赛数据内置了两个baseline算法:\"isolated training\"和\"FedAvg\"。假设你已经成功的搭建好FederatedScope的运行环境,并下载了比赛数据,那么可以通过以下的命令运行两个baseline算法"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2022-07-26T09:23:29.824268Z",
"start_time": "2022-07-26T09:23:29.824254Z"
}
},
"outputs": [],
"source": [
"# 作为示例,我们在这里仅运行3轮,用户可以按照自己的需求更改运行轮数\n",
"# 按照如下命令运行isolated training\n",
"!python federatedscope/main.py --cfg federatedscope/gfl/baseline/isolated_gin_minibatch_on_cikmcup.yaml --client_cfg federatedscope/gfl/baseline/isolated_gin_minibatch_on_cikmcup_per_client.yaml federate.total_round_num 3\n",
"\n",
"# 按照如下命令运行FedAvg\n",
"!python federatedscope/main.py --cfg federatedscope/gfl/baseline/fedavg_gin_minibatch_on_cikmcup.yaml --client_cfg federatedscope/gfl/baseline/fedavg_gin_minibatch_on_cikmcup_per_client.yaml federate.total_round_num 3"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"其中参数--cfg xxxx.yaml指定了全局的参数,--client_cfg xxx.yaml为每一个client单独指定了参数"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 5. 保存并提交预测结果"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 提交格式\n",
"如CIKM 2022 AnalytiCup Competition介绍中所述,参赛者需要将所有client的预测结果保存到一个csv文件中进行提交。在这个csv文件中,每一行代表一个测试样例的预测结果,并且由`${client_id}`和`${sample_id}`标识所预测的测试样例。其中,`${client_id}`从1开始计数,`${sample_id}`需要与比赛数据中的测试样例的序号保持一致(可以通过测试样例的`data_index`属性获得测试样例的序号)。\n",
"\n",
"需要注意的是,分类任务和多维度回归任务分别遵循以下不同的格式:\n",
"* 对分类任务,每一行应当遵循如下的格式(`${category_id}`从0开始计数):\n",
"\n",
"```\n",
"${client_id},${sample_id},${category_id}\n",
"```\n",
"\n",
"* 对于N维的回归任务,每一行应当遵循如下的格式:\n",
"```\n",
"${client_id},${sample_id},${prediction_1st_dimension},…,${prediction_N-th_dimension}\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 通过FederatedScope保存预测结果\n",
"FederatedScope中的\"cikm22competition\"分支目前支持在训练结束时保存预测结果(相关代码见`federatedscope/gfl/trainer/graphtrainer.py`和`federatedscope/core/trainers/torch_trainer.py`)。\n",
"预测结果将会被保存在一个名为prediction.csv的文件中,该文件位于`outdir`参数所指定的目录中。每次运行时,如果`outdir`所指定的目录已经存在,则会使用一个时间戳后缀用以区分。\n",
"在训练结束时,FederatedScope的训练记录也将标识出预测结果所在的目录,如下所示\n",
"![Finish saving prediction results](https://img.alicdn.com/imgextra/i1/O1CN01L68gve1nGu3BNp5vL_!!6000000005063-2-tps-4766-404.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 提交预测结果\n",
"最后,参赛选手可以将预测结果下载到本地,并上传到[天池](https://tianchi.aliyun.com/competition/entrance/532008/introduction)。"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.8.2 64-bit",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.2"
},
"vscode": {
"interpreter": {
"hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}

Loading