Add doc to the usage of warp-ctc. #2376

Merged · 2 commits · Jun 5, 2017
35 changes: 25 additions & 10 deletions python/paddle/trainer_config_helpers/layers.py
@@ -2915,11 +2915,11 @@ def memory(name,
to specify the layer needs to be remembered as the following:

.. code-block:: python

mem = memory(size=256)
state = fc_layer(input=mem, size=256)
       mem.set_input(state)


:param name: the name of the layer which this memory remembers.
If name is None, user should call set_input() to specify the
name of the layer which this memory remembers.
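The feedback pattern in the snippet above can be sketched in plain Python, without PaddlePaddle; `fc_step` below is a hypothetical scalar stand-in for `fc_layer`, used only to show how each step's output becomes the remembered input of the next:

```python
# Plain-Python sketch (no PaddlePaddle) of the recurrence that
# memory()/set_input() wires up: the state computed at step t is
# "remembered" and fed back in at step t + 1.
def fc_step(mem, x, w=0.5):
    # hypothetical stand-in for fc_layer(input=mem, size=256)
    return w * (mem + x)

mem = 0.0  # initial memory value, before any step has run
for x in [1.0, 2.0, 3.0]:
    state = fc_step(mem, x)
    mem = state  # plays the role of mem.set_input(state)
print(mem)  # 2.125
```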
@@ -3403,7 +3403,7 @@ def step(input):
else, for training or testing, one of the input type must
be LayerOutput.

-    : type is_generating: bool
+    :type is_generating: bool

:return: LayerOutput object.
:rtype: LayerOutput
@@ -3810,7 +3810,7 @@ def mse_cost(input, label, weight=None, name=None, coeff=1.0, layer_attr=None):

.. math::

-        \frac{1}{N}\sum_{i=1}^N(t_i-y_i)^2
+        \\frac{1}{N}\sum_{i=1}^N(t_i-y_i)^2

:param name: layer name.
:type name: basestring
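The formula in this hunk is the ordinary mean squared error, (1/N) Σ (t_i − y_i)². A minimal pure-Python sketch of that computation (the function name is illustrative, not part of the Paddle API):

```python
# Minimal sketch of the mse_cost formula: (1/N) * sum((t_i - y_i)^2)
# over N target/output pairs.
def mse(targets, outputs):
    assert len(targets) == len(outputs) and targets
    n = len(targets)
    return sum((t - y) ** 2 for t, y in zip(targets, outputs)) / n

print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))  # (0 + 0 + 4) / 3
```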
@@ -4765,21 +4765,36 @@ def warp_ctc_layer(input,
layer_attr=None):
"""
     A layer integrating the open-source `warp-ctc
-    <https://github.com/baidu-research/warp-ctc>` library, which is used in
+    <https://github.com/baidu-research/warp-ctc>`_ library, which is used in
     `Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
-    <https://arxiv.org/pdf/1512.02595v1.pdf>`, to compute Connectionist Temporal
-    Classification (CTC) loss.
+    <https://arxiv.org/pdf/1512.02595v1.pdf>`_, to compute Connectionist Temporal
+    Classification (CTC) loss. Besides, another `warp-ctc
+    <https://github.com/gangliao/warp-ctc>`_ repository, which is forked from
+    the official one, is maintained to enable more compiling options. During the
+    building process, PaddlePaddle will clone the source code, build and
+    install it to the :code:`third_party/install/warpctc` directory.

    To use the warp_ctc layer, you need to specify the path of :code:`libwarpctc.so`
    using one of the following methods:

1. Set it in :code:`paddle.init` (python api) or :code:`paddle_init` (c api),
such as :code:`paddle.init(use_gpu=True,
warpctc_dir=your_paddle_source_dir/third_party/install/warpctc/lib)`.

    2. Set the environment variable LD_LIBRARY_PATH on Linux, or DYLD_LIBRARY_PATH
       on Mac OS. For instance, :code:`export
       LD_LIBRARY_PATH=your_paddle_source_dir/third_party/install/warpctc/lib:$LD_LIBRARY_PATH`.
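The path manipulation in the two options above can be sketched in Python; the source directory below is a placeholder you must substitute for your own checkout:

```python
import os

# Hypothetical checkout location; replace with your actual Paddle source dir.
paddle_src = "/path/to/Paddle"
warpctc_lib = os.path.join(
    paddle_src, "third_party", "install", "warpctc", "lib")

# In-process equivalent of `export LD_LIBRARY_PATH=<lib>:$LD_LIBRARY_PATH`.
old = os.environ.get("LD_LIBRARY_PATH", "")
os.environ["LD_LIBRARY_PATH"] = warpctc_lib + (":" + old if old else "")
print(os.environ["LD_LIBRARY_PATH"])
```

Note that on Linux the dynamic loader reads LD_LIBRARY_PATH at process startup, so exporting it in the shell before launching Python (as the docstring suggests) is the reliable route; an in-process assignment like this mainly affects subprocesses launched afterwards.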

More details of CTC can be found by referring to `Connectionist Temporal
Classification: Labelling Unsegmented Sequence Data with Recurrent
Neural Networks <http://machinelearning.wustl.edu/mlpapers/paper_files/
-    icml2006_GravesFGS06.pdf>`_
+    icml2006_GravesFGS06.pdf>`_.

Note:
     - Let num_classes represent the category number. Considering the 'blank'
-      label needed by CTC, you need to use (num_classes + 1) as the input
-      size. Thus, the size of both warp_ctc_layer and 'input' layer should
-      be set to num_classes + 1.
+      label needed by CTC, you need to use (num_classes + 1) as the input size.
+      Thus, the size of both warp_ctc layer and 'input' layer should be set to
+      num_classes + 1.
     - You can set 'blank' to any value in the range [0, num_classes], which
       should be consistent with that used in your labels.
     - As a native 'softmax' activation is integrated into the warp-ctc library,
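The sizing and blank-label rules from the note above can be made concrete with a tiny sketch (the numbers are illustrative only):

```python
# CTC sizing rule: with num_classes real categories, one extra 'blank'
# label is needed, so both the warp_ctc layer and its input layer get
# size num_classes + 1.
num_classes = 26      # e.g. lowercase English letters (illustrative)
layer_size = num_classes + 1

# 'blank' may be any value in [0, num_classes], but it must match the
# convention used when the labels were encoded.
blank = 0
assert 0 <= blank <= num_classes

print(layer_size)  # 27
```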