Add OCR CTC model #596

wanghaoshuang · 2018-01-24T09:32:29Z

fix #591
A test result on random dummy data is shown below:

-----------  Configuration Arguments -----------
batch_size: 16
device: -1
l2: 0.0005
learning_rate: 0.001
max_clip: 10.0
min_clip: -10.0
momentum: 0.9
pass_num: 16
------------------------------------------------
Pass[0], batch[0]; loss: 2614.78; edit distance: 185.0.
End pass[0]; train data edit_distance: 11.5625.
End pass[0]; test data edit_distance: 5.25.
Pass[1], batch[0]; loss: 1669.92; edit distance: 752.0.
End pass[1]; train data edit_distance: 47.0.
End pass[1]; test data edit_distance: 5.0625.

qingqing01

Please add a README.md like https://github.com/PaddlePaddle/models/tree/develop/fluid/image_classification
~~2. Need to add test part in next PR.~~

qingqing01 · 2018-01-24T09:57:32Z

fluid/ocr_ctc/train.py

+add_arg('device',         int,   -1,     "Device id.'-1' means running on CPU"
+                                         "while '0' means GPU-0.")
+# yapf: disable
+def _to_lodtensor(data, place):


python/paddle/v2/fluid/executor.py can already process sequence data, so this function can be removed.

The code that processes sequences in the executor is currently commented out:
https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/fluid/executor.py#L108
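For readers unfamiliar with what `_to_lodtensor` has to do, here is a minimal NumPy-only sketch of the flattening step. It assumes offset-based LoD, which the `res.set_lod([lod])` call in the diff suggests; `flatten_with_lod` is a hypothetical helper name, not a function from this PR or from Paddle.

```python
import numpy as np


def flatten_with_lod(sequences):
    """Concatenate variable-length sequences into one column array.

    Returns the flat array and the offsets at which each sequence starts
    and ends (hypothetical helper, assuming offset-based LoD).
    """
    offsets = [0]
    for seq in sequences:
        offsets.append(offsets[-1] + len(seq))
    flat = np.concatenate([np.asarray(seq) for seq in sequences])
    return flat.reshape(-1, 1), offsets


# Three label sequences of lengths 3, 2 and 4.
labels = [[3, 1, 4], [1, 5], [9, 2, 6, 5]]
flat, lod = flatten_with_lod(labels)
print(flat.shape, lod)
```

The offsets are what would be handed to `set_lod`; the flat array is the tensor data.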

qingqing01 · 2018-01-24T09:59:14Z

fluid/ocr_ctc/train.py

+    res.set_lod([lod])
+    return res
+
+def _get_feeder_data(data, place):


It seems the _ prefix should only be added to functions that are not exposed.

I regard _get_feeder_data as an internal function of the training module, so I added the _ prefix according to the Google Python style guide.

qingqing01 · 2018-01-24T10:01:41Z

fluid/ocr_ctc/train.py

+    label_tensor = _to_lodtensor(map(lambda x: x[1], data), place)
+    return {"pixel": pixel_tensor, "label": label_tensor}
+
+def _ocr_conv(input, num, with_bn, param_attrs):


_ocr_conv -> conv_group ?

It is an internal function, so I added the _ prefix.

qingqing01 · 2018-01-24T10:01:56Z

fluid/ocr_ctc/train.py

+    return conv4
+
+
+def _ocr_ctc_net(images, num_classes, param_attrs):


_ocr_ctc_net -> ctc_net ?

It is an internal function, so I added the _ prefix.

qingqing01 · 2018-01-24T10:03:56Z

fluid/ocr_ctc/train.py

+        label=label,
+        size=num_classes + 1,
+        blank=num_classes,
+        norm_by_times=True)


norm_by_times=True -> norm_by_times=False?

norm_by_times controls whether gradients are divided by the sequence length.
With the mean op, gradients are divided by batch_size.
If we want to avoid the effect of the mean op, it is more reasonable to remove the mean_grad op than to set norm_by_times=False.

Actually, in the current code, the target minimized by the optimizer is cost, not avg_cost.

# define cost and optimizer
cost = fluid.layers.warpctc(
    input=fc_out,
    label=label,
    size=num_classes + 1,
    blank=num_classes,
    norm_by_times=True)
avg_cost = fluid.layers.mean(x=cost)
optimizer = fluid.optimizer.Momentum(
    learning_rate=args.learning_rate, momentum=args.momentum)
opts = optimizer.minimize(cost)
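To make the scaling argument concrete, here is a small NumPy sketch. It uses a squared-error loss as a stand-in for the CTC loss (an assumption made only to keep the example short) and shows that the gradient of the batch mean is the gradient of the batch sum divided by the batch size. The function names are illustrative.

```python
import numpy as np


def grad_of_sum(w, x, y):
    # d/dw of sum_i 0.5 * (w * x_i - y_i)^2
    return np.sum((w * x - y) * x)


def grad_of_mean(w, x, y):
    # d/dw of mean_i 0.5 * (w * x_i - y_i)^2
    return np.mean((w * x - y) * x)


x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])
w = 0.5

g_sum = grad_of_sum(w, x, y)
g_mean = grad_of_mean(w, x, y)
print(g_sum, g_mean, g_sum / len(x))
```

This is why the choice between minimizing `cost` and `avg_cost` interacts with how the learning rate and L2 coefficient are scaled.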

qingqing01 · 2018-01-24T10:07:10Z

Pass[0], batch[0]; loss: 2614.78; edit distance: 185.0.
End pass[0]; train data edit_distance: 11.5625.
End pass[0]; test data edit_distance: 5.25.
Pass[1], batch[0]; loss: 1669.92; edit distance: 752.0.
End pass[1]; train data edit_distance: 47.0.
End pass[1]; test data edit_distance: 5.0625.

Please tidy up the log later so it is clearer. Change edit_distance to word error?
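As a reference for what such a metric could compute, below is a plain-Python sketch of Levenshtein edit distance and a length-normalized error rate. It is not the evaluator used in this PR; `edit_distance` and `error_rate` are hypothetical names. For what it is worth, the logged values are consistent with dividing a batch total by batch_size 16 (185.0 / 16 = 11.5625 and 752.0 / 16 = 47.0), though the log alone does not confirm how they were computed.

```python
def edit_distance(hyp, ref):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete from hyp
                            curr[j - 1] + 1,      # insert into hyp
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def error_rate(hyps, refs):
    """Total edit distance divided by total reference length.

    Assumes at least one non-empty reference.
    """
    total_dist = sum(edit_distance(h, r) for h, r in zip(hyps, refs))
    total_len = sum(len(r) for r in refs)
    return total_dist / total_len


print(edit_distance("kitten", "sitting"))
print(error_rate(["abc", "xy"], ["abd", "xy"]))
```

Normalizing by reference length makes the number comparable across datasets with different label lengths, which a raw edit distance is not.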

qingqing01 · 2018-01-24T10:08:07Z

fluid/ocr_ctc/train.py

+def main():
+    args = parser.parse_args()
+    print_arguments(args)
+    train(l2=args.l2,


It would be simpler for train() to take args directly as its parameter.
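A minimal sketch of that shape, using plain `argparse` rather than the `add_arg` helper from the diff. The defaults are copied from the configuration dump at the top of this PR; the body of `train` is a placeholder, not the real training loop.

```python
import argparse


def build_parser():
    # Defaults mirror the "Configuration Arguments" dump in the PR description.
    parser = argparse.ArgumentParser(description="OCR CTC training (sketch)")
    parser.add_argument("--batch_size", type=int, default=16)
    parser.add_argument("--pass_num", type=int, default=16)
    parser.add_argument("--learning_rate", type=float, default=0.001)
    parser.add_argument("--momentum", type=float, default=0.9)
    parser.add_argument("--l2", type=float, default=0.0005)
    return parser


def train(args):
    """Placeholder: read settings from args instead of many keyword arguments."""
    return {
        "learning_rate": args.learning_rate,
        "momentum": args.momentum,
        "l2": args.l2,
        "batch_size": args.batch_size,
        "pass_num": args.pass_num,
    }


# Parse an empty argument list so the defaults are used.
args = build_parser().parse_args([])
config = train(args)
print(config)
```

Adding a new option then only touches the parser and the place where it is read, not the signature of `train` or its call site.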

qingqing01 · 2018-01-24T10:08:59Z

fluid/ocr_ctc/train.py

+        norm_by_times=True)
+    avg_cost = fluid.layers.mean(x=cost)
+    optimizer = fluid.optimizer.Momentum(
+        learning_rate=learning_rate / batch_size, momentum=momentum)


learning_rate / batch_size -> learning_rate

There is no need to account for batch_size in the hyperparameters.

qingqing01 · 2018-01-24T10:09:12Z

fluid/ocr_ctc/train.py

+    num_classes = data_reader.num_classes()
+    # define network
+    param_attrs = fluid.ParamAttr(
+        regularizer=fluid.regularizer.L2Decay(l2 * batch_size),


l2 * batch_size -> l2

qingqing01 · 2018-01-24T10:16:36Z

fluid/ocr_ctc/train.py

+def _ocr_ctc_net(images, num_classes, param_attrs):
+    conv_features = _ocr_conv(images, 8, True, param_attrs)
+    sliced_feature = fluid.layers.im2sequence(
+        input=conv_features, stride=[1, 1], filter_size=[1, 3])


The output layout of sliced_feature is NCHW.

filter_size=[1, 3] -> filter_size=[1, sliced_feature.shape[2]] is more general; if the height of the input image changes, this line does not need to be modified.

Thx. Fixed.
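To illustrate the idea of reading the window height from the feature map instead of hard-coding it, here is a NumPy sketch. It is not `im2sequence`: it assumes the intent is a window that spans the full feature-map height and slides one column at a time, and the element ordering inside each step may differ from Paddle's. `columns_to_sequence` is a hypothetical name.

```python
import numpy as np


def columns_to_sequence(features):
    """Turn an NCHW feature map into one feature vector per image column.

    The height is taken from features.shape[2], so changing the input
    height needs no code change.
    """
    n, c, h, w = features.shape
    # Move width to the time axis, then flatten channels and height.
    return features.transpose(0, 3, 1, 2).reshape(n, w, c * h)


features = np.arange(2 * 3 * 4 * 5, dtype=np.float32).reshape(2, 3, 4, 5)
seq = columns_to_sequence(features)
print(seq.shape)
```

Each time step holds every channel and row of one column, which is the kind of sequence a recurrent layer after the conv stack would consume.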

qingqing01

fluid/ocr -> fluid/ocr_recognition
To verify the forward network, please add an inference.py.

qingqing01 · 2018-02-02T08:08:03Z

fluid/ocr/ctc_train.py

+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.


The other models have no copyright header, so should this one be removed?

qingqing01 · 2018-02-02T08:09:59Z

fluid/ocr/ctc_train.py

+    conv4 = _conv_block(conv3, 128, (num / 4), with_bn)
+    return conv4
+
+def _ocr_ctc_net(images, num_classes, param_attrs, rnn_hidden_size=200):


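Regarding naming, I think it can stay consistent with the other model configurations; I see that the other configurations do not use the _ prefix.

A configuration such as ocr_conv can also be used when it is imported by other files.

Names like _ocr_conv and _ocr_ctc_net are both poor choices:

This group of conv layers is not specific to OCR.

There is actually no CTC in ocr_ctc_net.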

qingqing01 · 2018-02-02T08:24:16Z

fluid/ocr/ctc_train.py

+        size=num_classes + 1,
+        blank=num_classes,
+        norm_by_times=True)
+    avg_cost = fluid.layers.mean(x=cost)


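The model definition has already been separated out above, but def train() here still contains part of the network, so the separation is not clean!

The network can be defined in another file, for example crnn_ctc_model.py.

That way, a later attention model can be added as another file and reuse train.py.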
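One way to picture that split is a training entry point that receives the model builder as an argument. The sketch below is plain Python with placeholder builders; `crnn_ctc_model`, `attention_model`, and the dictionaries they return are hypothetical stand-ins for what the separate model files would define, and only the `num_classes + 1` output size is taken from the diff.

```python
def crnn_ctc_model(num_classes):
    """Stand-in for a builder that would live in crnn_ctc_model.py."""
    # One extra output for the CTC blank, as in size=num_classes + 1.
    return {"name": "crnn_ctc", "output_size": num_classes + 1}


def attention_model(num_classes):
    """Stand-in for a later attention model in its own file."""
    return {"name": "attention", "output_size": num_classes}


def train(build_model, num_classes):
    """Shared training entry point: no network code lives here."""
    model = build_model(num_classes)
    # The real script would build the optimizer and run the passes here.
    return model


print(train(crnn_ctc_model, 10))
print(train(attention_model, 10))
```

With this shape, adding a model means adding a file and passing a different builder, while the training script stays unchanged.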

wanghaoshuang added 6 commits January 22, 2018 10:56

Init OCR_CTC

fbbf6c0

Fix some issues

a7d6b1a

Fix issues

4e37ccc

Restructure code.

bff7fbe

1. Split data reader and train script. 2. Wrap some functions

Add arguments parser.

a87e056

Add function comments.

c43a107

wanghaoshuang requested review from qingqing01, lcy-seso and pkuyym January 24, 2018 09:33

qingqing01 reviewed Jan 24, 2018

wanghaoshuang added 2 commits January 24, 2018 21:39

Refine code according to comments:

192ef9c

1. Rename 'ocr_ctc' directory to 'ocr'. 2. Init README.md 3. Fix learning rate and l2 4. Refine training log format 5. Reduce arguments of train function 6. Set filter_size of im2sequence dynamically 7. Add fc op before GRU op

Minimized "avg_cost" instead of "cost".

fb8ae40

qingqing01 reviewed Feb 2, 2018

wanghaoshuang added 5 commits February 7, 2018 22:28

Merge branch 'develop' of https://github.com/PaddlePaddle/models into…

744908f

… ocr_ctc

Refine ctc model

3deecd9

1. Move all network definitions to 'crnn_ctc_model.py' 2. Add initializer for some layers 3. Rename 'fluid/ocr' to 'fluid/ocr_recognition' 4. Remove copyright 5. Rename some functions

Fix batch normal error

f1fe166

1. Add eval script

b5c1766

2. Add inference script 3. Add load model script 4. Add some functions into ctc_reader

Use reduce_sum op instead of mean_op

c9ce72a

qingqing01 approved these changes Mar 7, 2018

wanghaoshuang merged commit 75d242f into PaddlePaddle:develop Mar 7, 2018
