fix bug in dotmul_operator's api and annotation #99

Merged — 7 commits merged on Sep 22, 2016
Changes from 2 commits
23 changes: 9 additions & 14 deletions doc/algorithm/rnn/rnn.rst
@@ -142,12 +142,15 @@ We also project the encoder vector to :code:`decoder_size` dimensional space, ge…
The decoder uses :code:`recurrent_group` to define the recurrent neural network. The step and output functions are defined in :code:`gru_decoder_with_attention`:

.. code-block:: python

+    group_inputs=[StaticInput(input=encoded_vector,is_seq=True),
+                  StaticInput(input=encoded_proj,is_seq=True)]
    trg_embedding = embedding_layer(
        input=data_layer(name='target_language_word',
                         size=target_dict_dim),
        size=word_vector_dim,
        param_attr=ParamAttr(name='_target_language_embedding'))
+    group_inputs.append(trg_embedding)

    # For decoder equipped with attention mechanism, in training,
    # target embedding (the groudtruth) is the data input,
    # while encoded source sequence is accessed to as an unbounded memory.
@@ -156,13 +159,7 @@ The decoder uses :code:`recurrent_group` to define the recurrent neural network.
    # All sequence inputs should have the same length.
    decoder = recurrent_group(name=decoder_group_name,
                              step=gru_decoder_with_attention,
-                             input=[
-                                 StaticInput(input=encoded_vector,
-                                             is_seq=True),
-                                 StaticInput(input=encoded_proj,
-                                             is_seq=True),
-                                 trg_embedding
-                             ])
+                             input=group_inputs)


The implementation of the step function is listed as below. First, it defines the **memory** of the decoder network. Then it defines attention, gated recurrent unit step function, and the output function:
@@ -217,10 +214,8 @@ The code is listed below:

.. code-block:: python

-    gen_inputs = [StaticInput(input=encoded_vector,
-                              is_seq=True),
-                  StaticInput(input=encoded_proj,
-                              is_seq=True), ]
+    group_inputs=[StaticInput(input=encoded_vector,is_seq=True),
+                  StaticInput(input=encoded_proj,is_seq=True)]
    # In generation, decoder predicts a next target word based on
    # the encoded source sequence and the last generated target word.
    # The encoded source sequence (encoder's output) must be specified by
@@ -231,10 +226,10 @@ The code is listed below:
        size=target_dict_dim,
        embedding_name='_target_language_embedding',
        embedding_size=word_vector_dim)
-    gen_inputs.append(trg_embedding)
+    group_inputs.append(trg_embedding)
    beam_gen = beam_search(name=decoder_group_name,
                           step=gru_decoder_with_attention,
-                          input=gen_inputs,
+                          input=group_inputs,
                           id_input=data_layer(name="sent_id",
                                               size=1),
                           dict_file=trg_dict_path,
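
The step function :code:`gru_decoder_with_attention` referred to above is collapsed in this diff. For orientation only, here is a condensed sketch of what that step function looks like in the seqToseq demo that this document describes; names such as decoder_size, decoder_boot and target_dict_dim come from the surrounding demo config and are assumed here, they are not changed by this PR:

    # Hedged sketch of the decoder step function (seqToseq demo style),
    # shown only to give context for the group_inputs change above.
    def gru_decoder_with_attention(enc_vec, enc_proj, current_word):
        # Read the decoder state produced at the previous time step.
        decoder_mem = memory(name='gru_decoder',
                             size=decoder_size,
                             boot_layer=decoder_boot)
        # Attention over the encoded source sequence (the two StaticInputs).
        context = simple_attention(encoded_sequence=enc_vec,
                                   encoded_proj=enc_proj,
                                   decoder_state=decoder_mem)
        # Mix the context vector and the current target-word embedding
        # into the GRU step input.
        with mixed_layer(size=decoder_size * 3) as decoder_inputs:
            decoder_inputs += full_matrix_projection(input=context)
            decoder_inputs += full_matrix_projection(input=current_word)
        gru_step = gru_step_layer(name='gru_decoder',
                                  input=decoder_inputs,
                                  output_mem=decoder_mem,
                                  size=decoder_size)
        # Output distribution over the target vocabulary.
        with mixed_layer(size=target_dict_dim,
                         bias_attr=True,
                         act=SoftmaxActivation()) as out:
            out += full_matrix_projection(input=gru_step)
        return out
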
6 changes: 6 additions & 0 deletions doc/ui/api/trainer_config_helpers/layers.rst
@@ -169,6 +169,12 @@ dotmul_projection
   :members: dotmul_projection
   :noindex:

+dotmul_operator
+---------------
+.. automodule:: paddle.trainer_config_helpers.layers
+   :members: dotmul_operator
+   :noindex:
+
full_matrix_projection
----------------------
.. automodule:: paddle.trainer_config_helpers.layers
10 changes: 4 additions & 6 deletions python/paddle/trainer/config_parser.py
@@ -2457,13 +2457,11 @@ def __init__(
                input_layer = self.get_input_layer(input_index)
                operator_conf.input_sizes.append(input_layer.size)
                operator_input_index.append(input_index)
-            if self.config.size == 0:
-                size = operator.calc_output_size(operator_conf.input_sizes)
-                if size != 0:
+            size = operator.calc_output_size(operator_conf.input_sizes)

Collaborator:

This is not right. It can change size to 0 and make line 2485 fail.

Contributor Author:

In the current version, line 2461 is identical to line 2465 (size = operator.calc_output_size(operator_conf.input_sizes)), so line 2461 is executed no matter whether the condition on line 2460 (if self.config.size == 0) is true. That is why I moved this line up front.

Collaborator:

I see. That means the previous version was buggy. In the previous version, line 2465 should have been changed to
sz = operator.calc_output_size()
if sz != 0:

Contributor Author:

I have already fixed it.

+            if size != 0:
+                if self.config.size == 0:
                    self.set_layer_size(size)
-            else:
-                size = operator.calc_output_size(operator_conf.input_sizes)
-                if size != 0:
+                else:
                    config_assert(size == self.config.size,
                                  "different inputs have different size: %s vs. %s" %
                                  (size, self.config.size))
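
To make the inline discussion above easier to follow, here is the size-check logic before and after the change in this hunk, written out as plain Python (a paraphrase of the diff for illustration, not the literal file contents):

    # Before: calc_output_size() is called in both branches, so the call
    # itself never depended on self.config.size.
    if self.config.size == 0:
        size = operator.calc_output_size(operator_conf.input_sizes)
        if size != 0:
            self.set_layer_size(size)
    else:
        size = operator.calc_output_size(operator_conf.input_sizes)
        if size != 0:
            config_assert(size == self.config.size,
                          "different inputs have different size: %s vs. %s" %
                          (size, self.config.size))

    # After (the version under review here): hoist the call out and branch
    # on self.config.size only when the operator reports a non-zero size.
    size = operator.calc_output_size(operator_conf.input_sizes)
    if size != 0:
        if self.config.size == 0:
            self.set_layer_size(size)
        else:
            config_assert(size == self.config.size,
                          "different inputs have different size: %s vs. %s" %
                          (size, self.config.size))
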
28 changes: 17 additions & 11 deletions python/paddle/trainer_config_helpers/layers.py
@@ -387,7 +387,7 @@ def identity_projection(input, offset=None):


@wrap_param_attr_default()
-def dotmul_projection(input, param_attr=None, scale=1):
+def dotmul_projection(input, param_attr=None):
    """
    DotMulProjection with a layer as input.
    It performs element-wise multiplication with weight.
@@ -407,30 +407,36 @@ def dotmul_projection(input, param_attr=None, scale=1):
    :type input: LayerOutput
    :param param_attr: Parameter config, None if use default.
    :type param_attr: ParameterAttribute
-    :param scale: config scalar, default value is one.
-    :type scale: float
    :return: A DotMulProjection Object.
    :rtype: DotMulProjection
    """
    proj = DotMulProjection(input_layer_name=input.name,
                            size=input.size,
                            **param_attr.attr)
    proj.origin = input
    proj.origin.projection = 'dot_mul'
    return proj

def dotmul_operator(x, y, scale=1):
    """
    DotMulOperator takes two inputs and performs element-wise multiplication:

    .. math::
-       out.row[i] += scale * (in1.row[i] .* in2.row[i])
+       out.row[i] += scale * (x.row[i] .* y.row[i])

    where :math:`.*` means element-wise multiplication, and
    scale is a config scalar, its default value is one.

    The example usage is:

    .. code-block:: python
-       op = dotmul_operator(x, y,
-                            scale=1)
-    :param input: Input layer
-    :type input: LayerOutput
+
+       op = dotmul_operator(x=layer1, y=layer2, scale=0.5)
+
+    :param x: Input layer1
+    :type x: LayerOutput
+    :param y: Input layer2
+    :type y: LayerOutput
    :param scale: config scalar, default value is one.
    :type scale: float
    :return: A DotMulOperator Object.
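
The docstring above only shows the operator call itself. For context, a minimal sketch of how such an operator is typically consumed follows; the layer names and sizes are made up, and mixed_layer/data_layer are the standard trainer_config_helpers calls rather than part of this PR:

    from paddle.trainer_config_helpers import *

    # Two hypothetical input layers of the same width.
    a = data_layer(name='vec_a', size=128)
    b = data_layer(name='vec_b', size=128)

    # dotmul_operator returns an Operator object; operators are consumed
    # by mixed_layer, which combines the contributions of its inputs.
    elementwise = mixed_layer(size=128,
                              input=[dotmul_operator(x=a, y=b, scale=0.5)])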