
Simplifying dynamic rnn #7509

Closed
emailweixu opened this issue Jan 14, 2018 · 9 comments


emailweixu commented Jan 14, 2018

The design should be such that the user does not need to worry about batching at all. The code in test_dyn_rnn.py is too complex; a user should not need to write code such as:

rank_table = fluid.layers.lod_rank_table(x=sent_emb)
sent_emb_array = fluid.layers.lod_tensor_to_array(x=sent_emb, table=rank_table)

mem = fluid.layers.shrink_memory(x=mem, i=i, table=rank_table)

fluid.layers.increment(x=i, in_place=True)

fluid.layers.less_than(x=i, y=seq_len, cond=cond)

Why can't we make it as easy as v2 recurrent_group or fluid StaticRNN?
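For context: the rank_table / shrink_memory boilerplate above implements the standard trick of sorting sequences by descending length so the effective batch can shrink as shorter sequences finish. A minimal pure-Python sketch of that idea (all names here are illustrative, not the fluid API):

```python
# Sketch of what lod_rank_table + shrink_memory achieve: sort sequences
# by descending length, then at each time step only the prefix of the
# batch whose sequences are still active is updated.

def rank_table(seq_lens):
    """Return batch indices sorted by descending sequence length."""
    return sorted(range(len(seq_lens)), key=lambda i: -seq_lens[i])

def run_steps(sequences):
    order = rank_table([len(s) for s in sequences])
    ordered = [sequences[i] for i in order]
    max_len = len(ordered[0]) if ordered else 0
    outputs = [[] for _ in ordered]
    mem = [0.0] * len(ordered)            # one memory slot per instance
    for t in range(max_len):
        # "shrink_memory": number of instances still active at step t
        active = sum(1 for s in ordered if len(s) > t)
        for b in range(active):
            mem[b] = mem[b] + ordered[b][t]   # stand-in for the RNN cell
            outputs[b].append(mem[b])
    # undo the length sort so outputs line up with the original batch
    result = [None] * len(sequences)
    for pos, i in enumerate(order):
        result[i] = outputs[pos]
    return result
```

Because the batch is length-sorted, the active instances at every step are always a contiguous prefix, which is what makes the shrinking cheap.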

@emailweixu emailweixu changed the title Making dynamic rnn more transparent to batch Simplifying dynamic rnn Jan 14, 2018
@emailweixu
Collaborator Author

For while_op, we should make it behave like if_else_op, meaning that the condition is a vector with size=batch_size. Each dimension of cond is responsible for one instance in the batch.
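One way to read this proposal: the loop runs while any entry of the batch-sized condition vector is true, and an instance whose entry has become false is no longer updated. A minimal pure-Python sketch of that semantics (illustrative only, not the fluid API):

```python
def batched_while(state, step_fn, cond_fn, max_steps=100):
    """Advance each batch instance while its own condition holds.

    state:   list with one entry per batch instance
    step_fn: state_i -> new state_i
    cond_fn: state_i -> bool, the per-instance continuation condition
    """
    for _ in range(max_steps):
        cond = [cond_fn(s) for s in state]   # vector of size batch_size
        if not any(cond):
            break
        # only instances whose condition is still true are advanced
        state = [step_fn(s) if c else s for s, c in zip(state, cond)]
    return state
```

For example, `batched_while([0, 5, 9], lambda s: s + 2, lambda s: s < 8)` keeps stepping each instance independently until its value reaches 8 or more.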

@emailweixu
Collaborator Author

We should have a DynamicRNN with the same usage as StaticRNN.

@reyoung
Collaborator

reyoung commented Jan 15, 2018

@emailweixu
We have the Python syntax sugar DynamicRNN to wrap them all. It is just like the earlier recurrent_group and uses the same API, as shown in https://github.com/PaddlePaddle/talks/blob/develop/paddle-gtc-china.pdf

The usage is shown in the unit test below.

rnn = fluid.layers.DynamicRNN()
with rnn.block():
    in_ = rnn.step_input(sent_emb)
    mem = rnn.memory(shape=[100], dtype='float32')
    out_ = fluid.layers.fc(input=[in_, mem], size=100, act='tanh')
    rnn.update_memory(mem, out_)
    rnn.output(out_)

The complex unit test is just a low-level test for the DynamicRNN syntax sugar. This is because we developed the low-level APIs first and made sure they work correctly; then we developed the syntax sugar and wrote the complete test using it.

@emailweixu
Collaborator Author

I see. But I think we still need a simpler while_op which can handle each sample in a batch independently.

@jacquesqiao
Member

jacquesqiao commented Jan 31, 2018

@emailweixu do you mean syntax like the one below, where each condition can have an independent branch to handle it?

cond = less_than(...)
ie = pd.if_else(cond)
with ie.true_block():
    d = pd.layer.add(x, y)
    ie.output(d, pd.layer.softmax(d))
with ie.false_block():
    d = pd.layer.fc(z)
    ie.output(d, d + 1)
o1, o2 = ie(cond)

like:

cond = less_than(sequence, condition)
rnn = pd.DynamicRNN(cond)
with rnn.true_block():
    in_ = rnn.step_input(sent_emb)
    mem = rnn.memory(shape=[100], dtype='float32')
    out_ = fluid.layers.fc(input=[in_, mem], size=100, act='tanh')
    rnn.update_memory(mem, out_)
    rnn.output(out_)
with rnn.false_block():
    in_ = rnn.step_input(sent_emb)
    mem = rnn.memory(shape=[100], dtype='float32')
    out_ = fluid.layers.fc(input=[in_, mem], size=200, act='relu')
    rnn.update_memory(mem, out_)
    rnn.output(out_)
o = rnn()

or

rnn = pd.DynamicRNN()
with rnn.block():
    in_ = rnn.step_input(sent_emb)
    cond = less_than(in_, condition)
    ie = pd.ifelse(cond)
    with ie.true_block():
        mem = rnn.memory(shape=[100], dtype='float32')
        out_ = fluid.layers.fc(input=[in_, mem], size=100, act='tanh')
        rnn.update_memory(mem, out_)
        rnn.output(out_)
    with ie.false_block():
        mem = rnn.memory(shape=[100], dtype='float32')
        out_ = fluid.layers.fc(input=[in_, mem], size=200, act='relu')
        rnn.update_memory(mem, out_)
        rnn.output(out_)
o = rnn()
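Both variants above amount to choosing one of two branches at each step, based on a condition computed from the step input. A pure-Python sketch of that control flow (names are illustrative, not the proposed API):

```python
def rnn_with_per_step_branch(seq, cond_fn, true_fn, false_fn, init=0.0):
    """At each step, update the memory with one of two branch functions.

    cond_fn:  step input -> bool, selects the branch
    true_fn:  (input, mem) -> new mem, the true branch
    false_fn: (input, mem) -> new mem, the false branch
    """
    mem = init
    outputs = []
    for x in seq:
        # per-step branch selection, like ie.true_block()/ie.false_block()
        mem = true_fn(x, mem) if cond_fn(x) else false_fn(x, mem)
        outputs.append(mem)
    return outputs
```

For example, with `cond_fn=lambda x: x > 0`, positive inputs take one update rule and non-positive inputs take the other, within a single pass over the sequence.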

@jacquesqiao
Member

I have a design for a scalar switch-case op (#8031); the interface for it could look like:

rnn = pd.DynamicRNN()
with rnn.block():
    in_ = rnn.step_input(sent_emb)
    cond1 = logic_op(in_, condition1)
    cond2 = logic_op(in_, condition2)
    switch = SwitchOp()
    with switch.case(cond1):
        mem = rnn.memory(shape=[100], dtype='float32')
        out_ = fluid.layers.fc(input=[in_, mem], size=100, act='tanh')
        rnn.update_memory(mem, out_)
        rnn.output(out_)
    with switch.case(cond2):
        mem = rnn.memory(shape=[100], dtype='float32')
        out_ = fluid.layers.fc(input=[in_, mem], size=200, act='relu')
        rnn.update_memory(mem, out_)
        rnn.output(out_)
o = rnn()
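Assuming the usual first-match-wins semantics for a switch (the exact semantics of #8031 may differ), the case selection above can be sketched in pure Python as:

```python
def switch(cases, default=None):
    """Evaluate the branch of the first case whose condition is true.

    cases:   list of (condition, branch_fn) pairs, checked in order
    default: fallback branch_fn run when no condition holds
    """
    for cond, branch in cases:
        if cond:
            return branch()        # only the selected branch executes
    return default() if default is not None else None
```

For example, `switch([(False, lambda: "tanh"), (True, lambda: "relu")])` evaluates only the second branch; wrapping branches in functions keeps the unselected ones from running at all.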

@emailweixu
Collaborator Author

emailweixu commented Feb 1, 2018

What I am thinking of is something like the following:

cond = calc_initial_condition()
loop = fluid.layers.While(cond=cond)
with loop.block():
    mem = memory(boot=x)
    out = layers.fc(input=mem, size=100)
    p = layers.fc(input=out, size=1, act='sigmoid')
    layers.less_than(p, 0.5, cond=cond)
    loop.update_memory(mem, out)
    loop.output(out)
o = loop()
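The key point of the proposal above is that the continuation condition is recomputed inside the loop body from the network's own output p. A pure-Python sketch of that control flow (the cell and score functions are stand-ins for the fc/sigmoid layers, not the fluid API):

```python
def while_with_learned_stop(mem, cell_fn, stop_score_fn, threshold=0.5,
                            max_steps=100):
    """Loop whose termination condition is recomputed every step.

    cell_fn:       mem -> out   (stand-in for layers.fc)
    stop_score_fn: out -> p     (stand-in for the sigmoid 'p' head)
    The loop continues while p < threshold, mirroring
    layers.less_than(p, 0.5, cond=cond) in the proposal.
    """
    outputs = []
    cond = stop_score_fn(mem) < threshold   # initial condition
    steps = 0
    while cond and steps < max_steps:
        out = cell_fn(mem)
        outputs.append(out)
        mem = out                           # update_memory(mem, out)
        cond = stop_score_fn(out) < threshold
        steps += 1
    return outputs
```

Unlike a fixed-length unroll, the number of iterations here is data-dependent: the loop stops as soon as the score crosses the threshold.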

@jacquesqiao
Member

Sorry, I am a little confused by the code above; I have written my questions as comments below.

cond = calc_initial_condition()
loop = fluid.layers.While(cond=cond)
with loop.block():
    mem = memory(boot=x)
    out = layers.fc(input=mem, size=100)
    p = layers.fc(input=out, size=1, act='sigmoid')
    # 1. p is not used in the following code; should y be p?
    # 2. what is less_than used for?
    layers.less_than(y, 0.5, cond=cond)
    loop.update_memory(mem, out)
    loop.output(out)
o = loop()

@emailweixu
Collaborator Author

Sorry, y should be p. less_than is for computing the termination condition.
