Simplifying dynamic rnn #7509
For while_op, we should make it behave like if_else_op, meaning that the condition is a vector with size = batch_size. Each element of cond is responsible for one instance in the batch.
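For illustration only (plain NumPy, not actual Paddle API): with batch_size = 4, such a per-instance condition might look like:

```python
import numpy as np

# Hypothetical per-instance condition for a batch of 4 sequences:
# instances 0, 1 and 3 take another step; instance 2 has already finished.
cond = np.array([True, True, False, True])  # shape: [batch_size]
```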
We should have a DynamicRNN with the same usage as StaticRNN.
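For reference, StaticRNN usage looks roughly like this (a minimal sketch following the pattern in fluid's documentation; `x_emb` and the size 100 are placeholders):

```python
import paddle.v2.fluid as fluid  # fluid module path at the time of this issue

rnn = fluid.layers.StaticRNN()
with rnn.step():
    word = rnn.step_input(x_emb)                         # one time step of the input
    prev = rnn.memory(shape=[-1, 100], batch_ref=word)   # recurrent state
    hidden = fluid.layers.fc(input=[word, prev], size=100, act='tanh')
    rnn.update_memory(prev, hidden)                      # carry state to the next step
    rnn.step_output(hidden)
out = rnn()
```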
@emailweixu The usage is shown in the unit test below:
Paddle/python/paddle/v2/fluid/tests/test_dyn_rnn.py (Lines 94 to 101 in 9deb175)
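In rough outline (not a verbatim copy of the linked lines; `sentence` and the sizes are placeholders), the pattern there is:

```python
import paddle.v2.fluid as fluid

rnn = fluid.layers.DynamicRNN()
with rnn.block():
    in_ = rnn.step_input(sentence)                   # current step of a LoD sequence input
    mem = rnn.memory(shape=[100], dtype='float32')   # zero-initialized recurrent state
    out_ = fluid.layers.fc(input=[in_, mem], size=100, act='tanh')
    rnn.update_memory(mem, out_)
    rnn.output(out_)
out = rnn()
```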
The complex unit test is just a low-level test for the syntactic sugar.
I see. But I think we still need a simpler while_op that can handle each sample in a batch independently.
@emailweixu Do you mean syntax like the one below, where each condition can have an independent branch to handle it?

```python
cond = less_than(...)
ie = pd.if_else(cond)
with ie.true_block():
    d = pd.layer.add(x, y)
    ie.output(d, pd.layer.softmax(d))
with ie.false_block():
    d = pd.layer.fc(z)
    ie.output(d, d + 1)
o1, o2 = ie(cond)
```

Like:

```python
cond = less_than(sequence, condition)
rnn = pd.DynamicRNN(cond)
with rnn.true_block():
    in_ = rnn.step_input(sent_emb)
    mem = rnn.memory(shape=[100], dtype='float32')
    out_ = fluid.layers.fc(input=[in_, mem], size=100, act='tanh')
    rnn.update_memory(mem, out_)
    rnn.output(out_)
with rnn.false_block():
    in_ = rnn.step_input(sent_emb)
    mem = rnn.memory(shape=[100], dtype='float32')
    out_ = fluid.layers.fc(input=[in_, mem], size=200, act='relu')
    rnn.update_memory(mem, out_)
    rnn.output(out_)
o = rnn()
```

Or:

```python
rnn = pd.DynamicRNN()
with rnn.block():
    in_ = rnn.step_input(sent_emb)
    cond = less_than(in_, condition)
    ie = pd.if_else(cond)
    with ie.true_block():
        mem = rnn.memory(shape=[100], dtype='float32')
        out_ = fluid.layers.fc(input=[in_, mem], size=100, act='tanh')
        rnn.update_memory(mem, out_)
        rnn.output(out_)
    with ie.false_block():
        mem = rnn.memory(shape=[100], dtype='float32')
        out_ = fluid.layers.fc(input=[in_, mem], size=200, act='relu')
        rnn.update_memory(mem, out_)
        rnn.output(out_)
o = rnn()
```
I have a design for a scalar switch-case op; maybe the interface for this could be like:

```python
rnn = pd.DynamicRNN()
with rnn.block():
    in_ = rnn.step_input(sent_emb)
    cond1 = logic_op(in_, condition1)
    cond2 = logic_op(in_, condition2)
    switch = SwitchOp()
    with switch.case(cond1):
        mem = rnn.memory(shape=[100], dtype='float32')
        out_ = fluid.layers.fc(input=[in_, mem], size=100, act='tanh')
        rnn.update_memory(mem, out_)
        rnn.output(out_)
    with switch.case(cond2):
        mem = rnn.memory(shape=[100], dtype='float32')
        out_ = fluid.layers.fc(input=[in_, mem], size=200, act='relu')
        rnn.update_memory(mem, out_)
        rnn.output(out_)
o = rnn()
```
What I am thinking of is something like the following:

```python
cond = calc_initial_condition()
loop = fluid.layers.While(cond=cond)
with loop.block():
    mem = memory(boot=x)
    out = layers.fc(input=mem, size=100)
    p = layers.fc(input=out, size=1, act='sigmoid')
    layers.less_than(p, 0.5, cond=cond)
    loop.update_memory(mem, out)
    loop.output(out)
o = loop()
```
Sorry, I am a little confused by the above code; I wrote my questions as comments below.

```python
cond = calc_initial_condition()
loop = fluid.layers.While(cond=cond)
with loop.block():
    mem = memory(boot=x)
    out = layers.fc(input=mem, size=100)
    p = layers.fc(input=out, size=1, act='sigmoid')
    # 1. p is not used in the following code; is y ==> p?
    # 2. What is the less_than used for?
    layers.less_than(y, 0.5, cond=cond)
    loop.update_memory(mem, out)
    loop.output(out)
o = loop()
```
Sorry, y should be p. less_than is for calculating the termination condition.
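For concreteness, this in-place condition update is the same pattern as fluid's documented While example (the integer counter here is only illustrative, not part of the proposal):

```python
import paddle.v2.fluid as fluid

i = fluid.layers.fill_constant(shape=[1], dtype='int64', value=0)
limit = fluid.layers.fill_constant(shape=[1], dtype='int64', value=10)
cond = fluid.layers.less_than(x=i, y=limit)  # initial condition

loop = fluid.layers.While(cond=cond)
with loop.block():
    i = fluid.layers.increment(x=i, value=1, in_place=True)
    # Recompute the condition in place; the loop stops once it becomes False.
    fluid.layers.less_than(x=i, y=limit, cond=cond)
```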
The design should be such that the user does not need to worry about the batch at all. The code in test_dyn_rnn.py is too complex. The user should not need to write code such as:

Paddle/python/paddle/v2/fluid/tests/test_dyn_rnn.py (Lines 27 to 30 in df9c13a)
Paddle/python/paddle/v2/fluid/tests/test_dyn_rnn.py (Line 55 in df9c13a)
Paddle/python/paddle/v2/fluid/tests/test_dyn_rnn.py (Line 60 in df9c13a)
Paddle/python/paddle/v2/fluid/tests/test_dyn_rnn.py (Line 62 in df9c13a)

Why can't we make it as easy as v2 recurrent_group or fluid StaticRNN?
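For comparison, the v2 recurrent_group style referenced here looks roughly like this (a sketch of the pattern from the v2 docs; `emb`, the layer name, and the sizes are placeholders):

```python
import paddle.v2 as paddle

def step(word):
    # memory() binds by name to the layer that produces the next state.
    prev = paddle.layer.memory(name='rnn_state', size=100)
    return paddle.layer.fc(input=[word, prev], size=100,
                           act=paddle.activation.Tanh(), name='rnn_state')

out = paddle.layer.recurrent_group(step=step, input=[emb])
```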