
Reshape of input array when the shape is available only at runtime is not possible #10789

Closed
anirudhacharya opened this issue May 3, 2018 · 12 comments

Comments

@anirudhacharya
Member

Currently the reshape operator in MXNet requires the target shape beforehand, passed as a static parameter of the operator.

The reshape_like operator takes two inputs and reshapes the first input to match the inferred shape of the second input.

But if I have two inputs where the second input is a shape tuple, and the first input needs to be reshaped according to that tuple, this is not currently supported.

It would be good if MXNet supported this operation, because if the shape attribute of the reshape operator is generated at runtime, we will need such an operator.
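A minimal sketch of the gap, using the imperative NDArray API (the commented-out line is the unsupported case):

import mxnet as mx

data = mx.nd.arange(6)

# Works today: the target shape is a static attribute fixed when the call is written.
out = mx.nd.reshape(data, shape=(2, 3))

# Desired: the target shape arrives as a tensor whose values are known only at runtime.
shape_arr = mx.nd.array([3, 2])
# out = mx.nd.reshape(data, shape=shape_arr)  # not supported; `shape` must be a static tuple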

Here is a mock instance of what we would need; even after we implement the Shape and ConstantFill operators, it would not be possible to build this graph.
[Screenshot: mock graph in which a Shape operator and a ConstantFill operator feed into a Reshape node]

Here is a more real-world example of a model that performs OCR, where the Shape operator is used to generate the shape of an array at runtime. If the output of this Shape operator were fed into a reshape operator, the graph could not be built in MXNet.

[Screenshot: OCR model graph in which a Shape operator's output feeds downstream reshaping]

@anirudhacharya
Member Author

Proposed solution:

A potential solution is to modify the existing reshape_like operator to accept another boolean parameter, such as infer_shape. Based on this parameter we would either infer the shape of the second input (as is done now) or take the second input as the shape array itself.

Caffe2 does something similar in its ResizeLike operator.
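A rough sketch of what the proposed front end might look like (the infer_shape flag is hypothetical and does not exist in MXNet today):

import mxnet as mx

data = mx.sym.var('data')            # tensor to be reshaped
shape_arr = mx.sym.var('shape_arr')  # 1-D array holding the target shape values

# infer_shape=True  (current behaviour): reshape `data` to the shape of `shape_arr`
# infer_shape=False (proposed): treat the *values* of `shape_arr` as the target shape
out = mx.sym.reshape_like(data, shape_arr, infer_shape=False)  # hypothetical parameter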

@haojin2 @nswamy @piiswrong @zheng-da @eric-haibin-lin

@roywei
Member

roywei commented May 4, 2018

Can we make this a feature request and add the label? Thanks!

@larroy
Contributor

larroy commented May 8, 2018

What did you use to draw the nice graphs?

@anirudhacharya
Member Author

@larroy Netron - https://github.com/lutzroeder/Netron

@zheng-da
Contributor

Can you show a small example where a reshape like that is absolutely required and reshape_like can't work?
My concern is that the shape operator doesn't have a backward function, so code using the shape operator will fail to compute gradients.

@anirudhacharya
Member Author

@zheng-da The example is in the issue description. If we want the reshape to happen based on the shape generated by the shape operator at runtime, the existing reshape_like operator will not work.

Also, since Caffe2 and other frameworks support reshaping with the shape available only at runtime, users who build and train their models in Caffe2 or PyTorch and then try to move to MXNet for inference will be blocked.

In fact, the OCR model I mentioned in the issue description was built with PyTorch and exported to ONNX. A customer wanted to import this model into MXNet, and currently we do not support it. Here is the ONNX definition of Reshape, and this is the reshape operator in Caffe2.
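For reference, ONNX's Reshape (opset 5 and later) takes the target shape as a second input tensor rather than as a node attribute; a minimal node-construction sketch with the onnx helper API:

from onnx import helper

node = helper.make_node(
    'Reshape',
    inputs=['data', 'shape'],   # `shape` is a runtime tensor input, not a static attribute
    outputs=['reshaped'],
)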

@zheng-da
Contributor

My understanding is that you want to do something like this:

var = mx.sym.var('data')
var2 = mx.sym.var('data2')
s = var.shape()                 # hypothetical: obtain the shape of var at runtime
var2 = mx.sym.reshape(var2, s)

My question is why the snippet below is insufficient:

var = mx.sym.var('data')
var2 = mx.sym.var('data2')
var2 = mx.sym.reshape_like(var2, var)
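For reference, reshape_like already works today when the template input carries the needed shape; a minimal runnable example in the imperative NDArray API:

import mxnet as mx

a = mx.nd.arange(6)         # shape (6,)
b = mx.nd.zeros((2, 3))     # template whose shape we want to copy
c = mx.nd.reshape_like(a, b)
print(c.shape)              # (2, 3)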

My main concern is: after you add the shape operator, how does the first snippet work in the backward pass?
The shape operator doesn't have a backward function; in other words, how are we going to train the model?
Also, how do we do shape inference for the reshape operator?

I agree we should provide the shape for symbols, but adding this feature is very complex. It requires a fundamental change in the backward pass of MXNet. We need to add a special shape symbol so that we can still do shape inference and train a model.

@anirudhacharya
Member Author

This is a mock scenario where we would need the shape operator:

var1 = mx.sym.var('data1')
var2 = mx.sym.var('data2')
var3 = mx.sym.var('data3')             # another input symbol
s = mx.sym.shape(var1)                 # hypothetical shape operator
unsq = mx.sym.unsqueeze(s)             # hypothetical unsqueeze operator
conc = mx.sym.concat(unsq, var3)
var2 = mx.sym.reshape(var2, conc)      # reshape driven by a shape known only at runtime
constFill = mx.sym.ConstantFill(conc)  # we currently do not have a ConstantFill operator

The example you provided more or less has the shape information upfront. But what if the shape is computed and available only at runtime?

Yes, I understand that modifying the reshape operator will require fundamental changes. In fact, I was discussing exactly this with @haojin2 yesterday: how would we do shape inference in the reshape operator?

Do you have any suggestions?

@zheng-da
Contributor

Adding this feature requires at least creating a new shape symbol, changing NNVM to handle shape symbols differently from normal symbols, and rewriting the shape inference pass to propagate shape information. We need a good design for this, one that is compatible with the current shape inference scheme (I don't have a workable design). Even with a design, we would need someone familiar with all of the components above to spend at least a month (likely much more) implementing it and making it work correctly.

@szha
Member

szha commented May 13, 2018

The shape operator itself doesn't have to be special. What needs special handling is using a symbol in arguments that are normally kept as node attributes. We will also need the zeroth-order tensor support that @tqchen has been requesting, to avoid a mess.
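For context, a zeroth-order tensor is a scalar with an empty shape, which NumPy supports but MXNet's NDArray did not at the time; a quick illustration in NumPy:

import numpy as np

s = np.array(5)          # 0-d tensor
print(s.shape, s.ndim)   # () 0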

@chinakook
Contributor

Getting the shape of a symbol is very important in some conditions. It can be done as follows:

x = mx.sym.var('data', shape=(1, 3, 224, 224))
y = resnet50(x)                          # resnet50 builds a symbolic network on top of x
_, yshape, _ = y.infer_shape_partial()   # the middle return value holds the output shapes

However, it's difficult to get the shape of a tensor, or to define the dimensions of weights according to the shape of a tensor, inside a Gluon HybridBlock.
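A minimal sketch of why this is hard in a HybridBlock: after hybridize(), hybrid_forward receives Symbols, which expose no concrete shape to read (the class name here is illustrative):

import mxnet as mx
from mxnet.gluon import HybridBlock

class ShowShape(HybridBlock):
    def hybrid_forward(self, F, x):
        # Only an NDArray has a concrete .shape; after hybridize(), x is a Symbol.
        if isinstance(x, mx.nd.NDArray):
            print('runtime shape:', x.shape)
        return F.flatten(x)

net = ShowShape()
net(mx.nd.zeros((1, 3, 4, 4)))   # imperative pass: prints the shape
net.hybridize()
net(mx.nd.zeros((1, 3, 4, 4)))   # symbolic pass: shape not accessible inside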

@Roshrini
Member

This can be solved after adding support for dynamic shapes in MXNet, which is tracked in #12732.

Related issues for using ONNX models in MXNet: #13774, #13395.
