This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

[High-Level-API] Update MNIST to use optimizer_func #535

Merged
Changes from 3 commits
15 changes: 11 additions & 4 deletions 02.recognize_digits/README.cn.md
@@ -139,7 +139,7 @@ PaddlePaddle's API provides automatic loading of the [MNIST](http://yann.lecun.com/exdb/mni
1. `train_program`: A function that specifies how to obtain `loss` from the `inference_program` and the label values.
This is where the loss calculation is specified.

1. `optimizer`: Configures how to minimize the loss. PaddlePaddle supports the most common optimization methods.
1. `optimizer_func`: Configures how to minimize the loss. PaddlePaddle supports the most common optimization methods.

1. `Trainer`: The PaddlePaddle Trainer manages the training process specified by `train_program` and `optimizer`.
Users can monitor the training progress through the `event_handler` callback function.
@@ -238,6 +238,15 @@ def train_program():
# The model runs on a single CPU
```

#### Optimizer Function Configuration

In the `Adam` optimizer below, `learning_rate` is the training speed, which is related to how quickly the network training converges.

```python
def optimizer_program():
return fluid.optimizer.Adam(learning_rate=0.001)
```

### Data Feeders Configuration

Next, we start the training process. `paddle.dataset.mnist.train()` and `paddle.dataset.mnist.test()` provide the training and test datasets, respectively. Each of these functions returns a reader: in PaddlePaddle, a reader is a Python function that returns a Python yield generator each time it is called.
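A minimal sketch of that reader setup is shown below; the `batch_size` of 64 is an assumption (the actual value sits in the collapsed portion of this diff), while `buf_size=500` matches this chapter's `train.py`:

```python
import paddle

# The reader creators return readers; paddle.reader.shuffle buffers and shuffles
# samples, and paddle.batch groups them into mini-batches.
train_reader = paddle.batch(
    paddle.reader.shuffle(paddle.dataset.mnist.train(), buf_size=500),
    batch_size=64)
test_reader = paddle.batch(paddle.dataset.mnist.test(), batch_size=64)

# Calling the wrapped reader returns a generator that yields mini-batches
# of (image, label) pairs.
for mini_batch in train_reader():
    print(len(mini_batch))  # == batch_size (assumed 64)
    break
```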
@@ -259,16 +268,14 @@ test_reader = paddle.batch(
### Trainer Configuration

Now we need to configure the `Trainer`. The `Trainer` takes the training program `train_program`, `place`, and the optimizer `optimizer`.
In the `Adam` optimizer below, `learning_rate` is the training speed, which is related to how quickly the network training converges.

```python
# The model runs on a single CPU
use_cuda = False # set to True if training with GPU
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
optimizer = fluid.optimizer.Adam(learning_rate=0.001)

trainer = fluid.Trainer(
train_func=train_program, place=place, optimizer=optimizer)
train_func=train_program, place=place, optimizer_func=optimizer_program)
```

#### Event Handler Configuration
15 changes: 11 additions & 4 deletions 02.recognize_digits/README.md
@@ -146,7 +146,7 @@ Here is a quick overview of the major Fluid API components.
This is where you specify the network flow.
1. `train_program`: A function that specifies how to get avg_cost from `inference_program` and labels.
This is where you specify the loss calculations.
1. `optimizer`: Configure how to minimize the loss. Paddle supports most major optimization methods.
1. `optimizer_func`: Configure how to minimize the loss. Paddle supports most major optimization methods.
Contributor:
Can we elaborate a bit? I like the description of the train_program. Can we write something like "A function that specifies the configuration of the optimizer. The optimizer is responsible for minimizing the loss and driving the training. Paddle supports many different optimizers."

1. `Trainer`: Fluid trainer manages the training process specified by the `train_program` and `optimizer`. Users can monitor the training
progress through the `event_handler` callback function.
1. `Inferencer`: Fluid inferencer loads the `inference_program` and the parameters trained by the Trainer.
@@ -245,6 +245,15 @@ def train_program():
return [avg_cost, acc]
```

#### Optimizer Function Configuration

In the following `Adam` optimizer, `learning_rate` means the speed at which the network training converges.
Contributor:
Can we rephrase: learning_rate specifies the learning rate in the optimization procedure.

Contributor Author:
sure, sounds good.


```python
def optimizer_program():
return fluid.optimizer.Adam(learning_rate=0.001)
```

### Data Feeders Configuration

Then we specify the training data `paddle.dataset.mnist.train()` and testing data `paddle.dataset.mnist.test()`. These two methods are *reader creators*. Once called, a reader creator returns a *reader*. A reader is a Python method, which, once called, returns a Python generator, which yields instances of data.
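For illustration, here is a minimal sketch of how these reader creators are typically wrapped for training; the `batch_size` of 64 is an assumption, while `buf_size=500` follows this chapter's `train.py`:

```python
import paddle

# paddle.reader.shuffle buffers and shuffles samples from the reader creator;
# paddle.batch groups the shuffled samples into mini-batches.
train_reader = paddle.batch(
    paddle.reader.shuffle(paddle.dataset.mnist.train(), buf_size=500),
    batch_size=64)
test_reader = paddle.batch(paddle.dataset.mnist.test(), batch_size=64)

# Calling the wrapped reader returns a generator that yields mini-batches
# of (image, label) pairs.
for mini_batch in train_reader():
    print(len(mini_batch))  # == batch_size (assumed 64)
    break
```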
@@ -266,15 +275,13 @@ test_reader = paddle.batch(
### Trainer Configuration

Now, we need to set up the trainer. The trainer needs to take in `train_program`, `place`, and `optimizer`.
In the following `Adam` optimizer, `learning_rate` means the speed at which the network training converges.

```python
use_cuda = False # set to True if training with GPU
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
optimizer = fluid.optimizer.Adam(learning_rate=0.001)

trainer = fluid.Trainer(
train_func=train_program, place=place, optimizer=optimizer)
train_func=train_program, place=place, optimizer_func=optimizer_program)
```

#### Event Handler
15 changes: 11 additions & 4 deletions 02.recognize_digits/index.cn.html
@@ -181,7 +181,7 @@
1. `train_program`: A function that specifies how to obtain `loss` from the `inference_program` and the label values.
This is where the loss calculation is specified.

1. `optimizer`: Configures how to minimize the loss. PaddlePaddle supports the most common optimization methods.
1. `optimizer_func`: Configures how to minimize the loss. PaddlePaddle supports the most common optimization methods.

1. `Trainer`: The PaddlePaddle Trainer manages the training process specified by `train_program` and `optimizer`.
Users can monitor the training progress through the `event_handler` callback function.
@@ -280,6 +280,15 @@
# The model runs on a single CPU
```

#### Optimizer Function Configuration

In the `Adam` optimizer below, `learning_rate` is the training speed, which is related to how quickly the network training converges.

```python
def optimizer_program():
return fluid.optimizer.Adam(learning_rate=0.001)
```

### Data Feeders Configuration

Next, we start the training process. `paddle.dataset.mnist.train()` and `paddle.dataset.mnist.test()` provide the training and test datasets, respectively. Each of these functions returns a reader: in PaddlePaddle, a reader is a Python function that returns a Python yield generator each time it is called.
@@ -301,16 +310,14 @@
### Trainer Configuration

Now we need to configure the `Trainer`. The `Trainer` takes the training program `train_program`, `place`, and the optimizer `optimizer`.
In the `Adam` optimizer below, `learning_rate` is the training speed, which is related to how quickly the network training converges.

```python
# The model runs on a single CPU
use_cuda = False # set to True if training with GPU
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
optimizer = fluid.optimizer.Adam(learning_rate=0.001)

trainer = fluid.Trainer(
train_func=train_program, place=place, optimizer=optimizer)
train_func=train_program, place=place, optimizer_func=optimizer_program)
```

#### Event Handler Configuration
15 changes: 11 additions & 4 deletions 02.recognize_digits/index.html
@@ -188,7 +188,7 @@
This is where you specify the network flow.
1. `train_program`: A function that specifies how to get avg_cost from `inference_program` and labels.
This is where you specify the loss calculations.
1. `optimizer`: Configure how to minimize the loss. Paddle supports most major optimization methods.
1. `optimizer_func`: Configure how to minimize the loss. Paddle supports most major optimization methods.
1. `Trainer`: Fluid trainer manages the training process specified by the `train_program` and `optimizer`. Users can monitor the training
progress through the `event_handler` callback function.
1. `Inferencer`: Fluid inferencer loads the `inference_program` and the parameters trained by the Trainer.
@@ -287,6 +287,15 @@
return [avg_cost, acc]
```

#### Optimizer Function Configuration

In the following `Adam` optimizer, `learning_rate` means the speed at which the network training converges.

```python
def optimizer_program():
return fluid.optimizer.Adam(learning_rate=0.001)
```

### Data Feeders Configuration

Then we specify the training data `paddle.dataset.mnist.train()` and testing data `paddle.dataset.mnist.test()`. These two methods are *reader creators*. Once called, a reader creator returns a *reader*. A reader is a Python method, which, once called, returns a Python generator, which yields instances of data.
@@ -308,15 +317,13 @@
### Trainer Configuration

Now, we need to set up the trainer. The trainer needs to take in `train_program`, `place`, and `optimizer`.
In the following `Adam` optimizer, `learning_rate` means the speed at which the network training converges.

```python
use_cuda = False # set to True if training with GPU
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
optimizer = fluid.optimizer.Adam(learning_rate=0.001)

trainer = fluid.Trainer(
train_func=train_program, place=place, optimizer=optimizer)
train_func=train_program, place=place, optimizer_func=optimizer_program)
```

#### Event Handler
7 changes: 5 additions & 2 deletions 02.recognize_digits/train.py
@@ -62,6 +62,10 @@ def train_program():
return [avg_cost, acc]


def optimizer_program():
return fluid.optimizer.Adam(learning_rate=0.001)


def main():
train_reader = paddle.batch(
paddle.reader.shuffle(paddle.dataset.mnist.train(), buf_size=500),
@@ -71,10 +75,9 @@ def main():

use_cuda = os.getenv('WITH_GPU', '0') != '0'
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
optimizer = fluid.optimizer.Adam(learning_rate=0.001)

trainer = fluid.Trainer(
train_func=train_program, place=place, optimizer=optimizer)
train_func=train_program, place=place, optimizer_func=optimizer_program)

# Save the parameters into a directory. The Inferencer can load the parameters from it to do inference
params_dirname = "recognize_digits_network.inference.model"