Fully convolutional neural network in Keras #2087

Closed
ypxie opened this issue Mar 26, 2016 · 12 comments
Comments

@ypxie

ypxie commented Mar 26, 2016

The problem is that the argument of Resize2D, model.get_node('conv_1').output_shape, is not a constant, so it cannot be dumped into a JSON string.

I could use an integer index to trace back to the specific layer, but that becomes error-prone when the number of layers grows.

Any suggestions for solving this?

from keras.models import Graph
from keras.layers.convolutional import MaxPooling2D, UpSampling2D
from keras.optimizers import RMSprop
# Resize2D and Convolution2D_shape are custom layers: Resize2D crops/pads to a
# target shape, Convolution2D_shape is a Convolution2D variant that accepts
# variable-sized inputs.

def buildmodel(img_rows, img_cols, img_channels, outputshape, weight_decay=1e-7):
    model = Graph()
    model.add_input(name='input', input_shape=(img_channels, None, None), dtype='float')
    model.add_node(Convolution2D_shape(4, 3, 3, border_mode='same', activation='relu'),
                   name='conv_1', input='input')
    model.add_node(MaxPooling2D(pool_size=(2, 2)), name='max_1', input='conv_1')

    model.add_node(Convolution2D_shape(6, 3, 3, border_mode='same', activation='relu'),
                   name='conv_2', input='max_1')
    model.add_node(MaxPooling2D(pool_size=(2, 2)), name='max_2', input='conv_2')

    model.add_node(UpSampling2D((2, 2)), name='upsamp_1', input='max_2')
    model.add_node(Resize2D(model.get_node('conv_2').output_shape),
                   name='resize_1', input='upsamp_1')
    model.add_node(Convolution2D_shape(6, 3, 3, border_mode='same', activation='relu'),
                   name='conv_3', input='resize_1')

    model.add_node(UpSampling2D((2, 2)), name='upsamp_2', input='conv_3')
    model.add_node(Resize2D(model.get_node('conv_1').output_shape),
                   name='resize_2', input='upsamp_2')
    model.add_node(Convolution2D_shape(4, 3, 3, border_mode='same', activation='relu'),
                   name='conv_4', input='resize_2')

    model.add_node(Convolution2D_shape(1, 3, 3, border_mode='same', activation='sigmoid'),
                   name='conv_5', input='conv_4')
    model.add_output(name='output_mask', input='conv_5')

    opt = RMSprop(lr=0.001, rho=0.9, epsilon=1e-6)
    model.compile(loss={'output_mask': 'mse'}, optimizer=opt)
    return model
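A minimal sketch in plain Python (no Keras) of where the serialization problem comes from: with a free-sized input, every traced output shape contains None, so there is no constant target size that Resize2D could store in a JSON config. The `pool2d_shape` helper is hypothetical, just mimicking how shapes propagate:

```python
def pool2d_shape(shape, pool=2):
    """Shape after a (pool, pool) max-pooling; None stays None."""
    c, h, w = shape
    down = lambda d: d // pool if d is not None else None
    return (c, down(h), down(w))

input_shape = (3, None, None)   # (channels, rows, cols), rows/cols unknown
conv_1 = input_shape            # a border_mode='same' conv keeps spatial dims
max_1 = pool2d_shape(conv_1)

print(conv_1)  # (3, None, None)
print(max_1)   # (3, None, None) -- the crop target is only known at run time
```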
@NasenSpray

What are Resize2D and Convolution2D_shape doing? Side note: fully convolutional networks don't have max-pooling.

@ypxie
Author

ypxie commented Mar 26, 2016

Resize2D crops or pads the input to a target size; the size is not a predefined value but is determined at run time, since a fully convolutional network can work with inputs of any size. Convolution2D_shape is a modified version of the convolutional layer that does not require a fixed input size. I think fully convolutional networks do have max-pooling layers.

@NasenSpray

Okay, you have 2x max-pool and 2x upsample, so the sizes already match even without Resize2D, as long as you make sure that the inputs are a multiple of 4 in both dimensions.
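A quick sketch of this suggestion: pad each image up to the next multiple of 4 before feeding it, so two rounds of 2x pooling followed by two rounds of 2x upsampling return exactly the padded size. Helper names are made up for illustration:

```python
import math

def pad_to_multiple(h, w, m=4):
    """Smallest (H, W) >= (h, w) with both divisible by m."""
    return (math.ceil(h / m) * m, math.ceil(w / m) * m)

def round_trip(d, steps=2):
    """Size after `steps` 2x poolings then `steps` 2x upsamplings."""
    for _ in range(steps):
        d //= 2
    return d * 2 ** steps

H, W = pad_to_multiple(5, 7)          # (8, 8)
print(round_trip(H), round_trip(W))   # 8 8 -- sizes survive the round trip
print(round_trip(5))                  # 4 -- an unpadded odd size does not
```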

I think fully convolutional networks do have max-pooling layers.

It's called fully convolutional for a reason... :D

@EderSantana
Contributor

fully convolutional networks can work with any size.

Yes, they can work with inputs of any size. But if one of your layers is doing some size-specific work, it might crash, right?

This is a good question, though. Max-pooling is not totally incompatible with that approach. On the other hand, try to limit yourself to border_mode="same" and reversible downsampling (I mean, something that can get you back to the original size with a simple UpSampling2D). With border_mode="same" you always know where you are and don't need Resize2D. That worked for me before...

@ypxie
Author

ypxie commented Mar 26, 2016

Yes, it will crash for a size-specific layer, but the core idea of an FCN is to avoid such operations.

I think Resize2D is necessary because the sizes of the test images are totally out of our control.

I'm also wondering why the JSON string is necessary. Reconstructing a model from a JSON string is not faster than recompiling it at all.

What do you think?

@NasenSpray

That reminds me... I could contribute my implementation of fractionally strided/transposed convolutions (aka "deconvolution").

I think Resize2D is necessary because the sizes of the test images are totally out of our control.

But they are in your control.

I'm also wondering why the JSON string is necessary. Reconstructing a model from a JSON string is not faster than recompiling it at all.

It's only necessary if you want to use the model without needing the code that builds it. Otherwise it's sufficient to load/save just the weights.
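For reference, the transposed ("fractionally strided") convolutions mentioned above follow standard output-size arithmetic. This is not NasenSpray's implementation, just the shape formulas, sketched in plain Python:

```python
def conv_out(n, k, s=1, p=0):
    """Output size of a regular convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

def deconv_out(n, k, s=1, p=0):
    """Output size of the matching transposed convolution: (n - 1)s - 2p + k."""
    return (n - 1) * s - 2 * p + k

# The transposed conv inverts the forward size arithmetic (when the forward
# pass divides evenly):
n = 11
m = conv_out(n, k=3, s=2, p=1)        # 6
print(deconv_out(m, k=3, s=2, p=1))   # 11
```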

@ypxie
Author

ypxie commented Mar 26, 2016

@NasenSpray pooling and upsampling don't necessarily give you back the original size. Consider a 5×5 input: after max-pooling it becomes 2×2, and upsampling that gives you 4×4.
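The arithmetic being described, sketched in plain Python: max-pooling floors odd dimensions, so pool-then-upsample does not restore them.

```python
def pool(n, k=2):
    return n // k    # max-pooling floors odd dimensions

def upsample(n, k=2):
    return n * k

print(upsample(pool(5)))  # 4, not 5 -- hence the need to crop or pad somewhere
print(upsample(pool(8)))  # 8 -- even multiples survive
```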

@NasenSpray

@shampool: I know, that's why I said

Okay, you have 2x max-pool and 2x upsample, so the sizes already match even without Resize2D, as long as you make sure that the inputs are a multiple of 4 in both dimensions.

You can crop/pad the images to that size before you feed them to the network.

@ypxie
Author

ypxie commented Mar 26, 2016

That's not a good idea, because it requires you to carefully design the input sizes so that every successive max-pooling layer is reversible, which is inflexible and error-prone.

@perone

perone commented Mar 26, 2016

As far as I know, the FCNs have pooling; if you look at the Caffe model you'll see it: https://gist.github.com/longjon/1bf3aa1e0b8e788d7e1d#file-readme-md
From a technical point of view, I don't think it is wrong to call it fully convolutional, because the pooling can be seen as a convolution with a different function.

@lukovkin
Contributor

@NasenSpray Regarding your implementation of deconvolution: it would be nice to see it.
We are still fighting with the implementation of UFCNN in Keras and are a bit stuck. In the original paper (http://arxiv.org/abs/1508.00317v1) it seems that the author upsamples the output of each successive convolution layer by a factor of 2, and the next convolutional layers are required to produce output that is half the size of their input.
And he doesn't use max-pooling for it.
From the point of view of the layers' shapes it could be done with the filter size being a product of the original sequence length, but I doubt that's the correct way.
