Add conv_seq_to_seq #431
Conversation
Some comments first, thank you for the work.
conv_seq_to_seq/infer.py (outdated)
:type drop_rate: float
:param max_len: The maximum length of the sentence to be generated.
:type max_len: int
:param beam_size: The width of beam search.
The width of beam expansion.
Done
conv_seq_to_seq/model.py (outdated)
:type input: LayerOutput
:param size: The dimension of the block's output.
:type size: int
:param context_len: The context width of the convolution.
width --> length
Done
conv_seq_to_seq/model.py (outdated)
:type token_emb: LayerOutput
:param pos_emb: The embedding vector of the input token's position.
:type pos_emb: LayerOutput
:param conv_blocks: The scale list of the convolution blocks. And each element of the
And each element of the --> Each element of the
Done
act=paddle.activation.Linear(),
param_attr=paddle.attr.Param(learning_rate=1.0 /
                             (2.0 * num_attention)),
bias_attr=True)
Personally, I prefer to change `res` into `residual` or `shortcut`. Am I right? I thought for quite a while before realizing this is a residual connection. `res` is not a very common abbreviation, and a meaningful name will make the topology easier to understand.
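For illustration, a minimal sketch of the suggested rename, assuming the Paddle v2 layer API used elsewhere in this PR (`block_input` and `size` are placeholder names, not code from the PR):

import paddle.v2 as paddle

def conv_block_output(block_input, size):
    hidden = paddle.layer.fc(
        input=block_input,
        size=size,
        act=paddle.activation.Linear(),
        bias_attr=True)
    # Naming the skip path "residual" (or "shortcut") makes the
    # residual connection explicit, where a bare "res" does not.
    residual = block_input
    return paddle.layer.addto(input=[hidden, residual])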
Done
conv_seq_to_seq/model.py (outdated)
act=paddle.activation.Linear(),
bias_attr=True)

# TODO scaled by length
- Could you please make this TODO comment more meaningful? I cannot understand what the current problem is and what is to be done next.
- It is suggested that the TODO comment follow Google's style (an example follows this list): https://google.github.io/styleguide/pyguide.html?showone=TODO_Comments#TODO_Comments
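For reference, a TODO in the linked style names an owner and states the concrete action; the owner below is a placeholder and the wording only illustrates the format:

# TODO(owner): Scale the residual sum by the sequence length once
# Paddle provides the op needed for it.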
Oh, sorry, I just forgot to delete it. It can't be implemented with the current version of Paddle.
trg_next_word = trg_next_word + [trg_dict['<e>']] * padding_num
yield src_word, src_word_pos, trg_word, trg_word_pos, trg_next_word

return reader
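As a side note, a hypothetical usage sketch of how a reader like this is typically consumed with the Paddle v2 batching helpers (`reader_creator` stands in for whichever factory returns `reader` above; it is not a name from this PR):

import paddle.v2 as paddle

train_reader = paddle.batch(
    paddle.reader.shuffle(reader_creator(), buf_size=10240),
    batch_size=32)
for batch in train_reader():
    # Each sample is the 5-tuple yielded above:
    # (src_word, src_word_pos, trg_word, trg_word_pos, trg_next_word)
    pass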
Could you please fix the non-printable character at the end of the line.
Done
convolution block.
:type enc_conv_blocks: list of tuple
:param dec_conv_blocks: The scale list of the decoder's convolution blocks. And each element of
the list contains output dimension and context length of the corresponding
This line is too long.
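One possible re-wrap under the usual 80-character limit (a sketch that also folds in the "Each element of the" wording fix requested above):

:param dec_conv_blocks: The scale list of the decoder's convolution
    blocks. Each element of the list contains the output dimension and
    context length of the corresponding convolution block.
:type dec_conv_blocks: list of tuple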
Done
conv_seq_to_seq/infer.py (outdated)
"--beam_size", | ||
default=1, | ||
type=int, | ||
help="Beam search width. (default: %(default)s)") |
The width of beam expansion.
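Applied to the flag, the suggested wording would read (a sketch):

parser.add_argument(
    "--beam_size",
    default=1,
    type=int,
    help="The width of beam expansion. (default: %(default)s)")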
Done
conv_seq_to_seq/infer.py (outdated)
"--model_path", | ||
type=str, | ||
required=True, | ||
help="Model path. (default: %(default)s)") |
The path of the trained model.
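A sketch of the revised flag. Since the argument is required, argparse gives it no default, so the `(default: %(default)s)` suffix would render as `None` and can be dropped:

parser.add_argument(
    "--model_path",
    type=str,
    required=True,
    help="The path of the trained model.")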
Done
if __name__ == '__main__':
    main()
Could you please fix the non-printable character at the end of the line.
Done @lcy-seso
conv_seq_to_seq/README.md (outdated)
# Notes

Currently, the beam search will forward the whole network when predicting every word, which is a waste of time. And we will fix it later.
beam search will forward the encoder multiple times when predicting each target word, which requires extra computations.
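To make the inefficiency concrete, a language-agnostic sketch in plain Python (`encoder` and `decoder_step` are hypothetical callables, not Paddle APIs): the fix would be to run the encoder once per source sentence and reuse its output at every decoding step, instead of forwarding the whole network again.

def beam_search(encoder, decoder_step, src, beam_size, max_len, eos_id):
    enc_out = encoder(src)   # computed once, reused at every step
    beams = [([], 0.0)]      # (partial hypothesis, log probability)
    for _ in range(max_len):
        candidates = []
        for hyp, score in beams:
            if hyp and hyp[-1] == eos_id:
                candidates.append((hyp, score))  # finished hypothesis
                continue
            # decoder_step yields the k best (word, log_prob) extensions.
            for word, logp in decoder_step(enc_out, hyp, k=beam_size):
                candidates.append((hyp + [word], score + logp))
        beams = sorted(candidates, key=lambda c: -c[1])[:beam_size]
    return beams[0][0]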
Done @lcy-seso
LGTM, thank you.
LGTM, thank you for this work, and please keep optimizing it.
@lcy-seso