Fully convolutional neural network in Keras #2087

Closed
ypxie opened this issue Mar 26, 2016 · 12 comments
Comments

@ypxie

ypxie commented Mar 26, 2016

The problem is that the argument of Resize2D, model.get_node('conv_1').output_shape, is not a constant, so it cannot be dumped into a JSON string.

I could use an integer index to trace back to the specific layer, but that becomes error-prone when the number of layers grows.

Any suggestions for solving this?

from keras.models import Graph
from keras.layers.convolutional import MaxPooling2D, UpSampling2D
from keras.optimizers import RMSprop
# Resize2D and Convolution2D_shape are custom layers: Resize2D crops/pads to a
# target shape, Convolution2D_shape is a Convolution2D variant that accepts
# variable-sized inputs.

def buildmodel(img_rows, img_cols, img_channels, outputshape, weight_decay=1e-7):
    model = Graph()
    model.add_input(name='input', input_shape=(img_channels, None, None), dtype='float')
    model.add_node(Convolution2D_shape(4, 3, 3, border_mode='same', activation='relu'),
                   name='conv_1', input='input')
    model.add_node(MaxPooling2D(pool_size=(2, 2)), name='max_1', input='conv_1')

    model.add_node(Convolution2D_shape(6, 3, 3, border_mode='same', activation='relu'),
                   name='conv_2', input='max_1')
    model.add_node(MaxPooling2D(pool_size=(2, 2)), name='max_2', input='conv_2')

    model.add_node(UpSampling2D((2, 2)), name='upsamp_1', input='max_2')
    model.add_node(Resize2D(model.get_node('conv_2').output_shape),
                   name='resize_1', input='upsamp_1')
    model.add_node(Convolution2D_shape(6, 3, 3, border_mode='same', activation='relu'),
                   name='conv_3', input='resize_1')

    model.add_node(UpSampling2D((2, 2)), name='upsamp_2', input='conv_3')
    model.add_node(Resize2D(model.get_node('conv_1').output_shape),
                   name='resize_2', input='upsamp_2')
    model.add_node(Convolution2D_shape(4, 3, 3, border_mode='same', activation='relu'),
                   name='conv_4', input='resize_2')

    model.add_node(Convolution2D_shape(1, 3, 3, border_mode='same', activation='sigmoid'),
                   name='conv_5', input='conv_4')
    model.add_output(name='output_mask', input='conv_5')

    opt = RMSprop(lr=0.001, rho=0.9, epsilon=1e-6)
    model.compile(loss={'output_mask': 'mse'}, optimizer=opt)
    return model
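A minimal sketch in plain Python (no Keras) of where the serialization problem comes from: with a free-sized input, every traced output shape contains None, so there is no constant target size that Resize2D could store in a JSON config. The `pool2d_shape` helper is hypothetical, just mimicking how shapes propagate:

```python
def pool2d_shape(shape, pool=2):
    """Shape after a (pool, pool) max-pooling; None stays None."""
    c, h, w = shape
    down = lambda d: d // pool if d is not None else None
    return (c, down(h), down(w))

input_shape = (3, None, None)   # (channels, rows, cols), rows/cols unknown
conv_1 = input_shape            # a border_mode='same' conv keeps spatial dims
max_1 = pool2d_shape(conv_1)

print(conv_1)  # (3, None, None)
print(max_1)   # (3, None, None) -- the crop target is only known at run time
```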
@NasenSpray

What are Resize2D and Convolution2D_shape doing? Side note: fully convolutional networks don't have max-pooling.

@ypxie
Author

ypxie commented Mar 26, 2016

Resize2D crops or pads the input to a target size; the size is not a predefined value but is determined at run time, since a fully convolutional network can work with inputs of any size. Convolution2D_shape is a modified version of the convolutional layer that does not require a fixed input size. I think fully convolutional networks do have max-pooling layers.

@NasenSpray

Okay, you have 2x max-pool and 2x upsample, so the sizes already match even without Resize2D, as long as you make sure that the inputs are a multiple of 4 in both dimensions.
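A quick sketch of this suggestion: pad each image up to the next multiple of 4 before feeding it, so two rounds of 2x pooling followed by two rounds of 2x upsampling return exactly the padded size. Helper names are made up for illustration:

```python
import math

def pad_to_multiple(h, w, m=4):
    """Smallest (H, W) >= (h, w) with both divisible by m."""
    return (math.ceil(h / m) * m, math.ceil(w / m) * m)

def round_trip(d, steps=2):
    """Size after `steps` 2x poolings then `steps` 2x upsamplings."""
    for _ in range(steps):
        d //= 2
    return d * 2 ** steps

H, W = pad_to_multiple(5, 7)          # (8, 8)
print(round_trip(H), round_trip(W))   # 8 8 -- sizes survive the round trip
print(round_trip(5))                  # 4 -- an unpadded odd size does not
```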

I think fully convolutional networks do have max-pooling layers.

It's called fully convolutional for a reason... :D

@EderSantana
Contributor

fully convolutional networks can work with any size.

Yes, they can work with inputs of any size. But if one of your layers is doing some size-specific work, it might crash, right?

This is a good question, though. Max-pooling is not totally incompatible with that approach. On the other hand, try to limit yourself to border_mode="same" and reversible downsampling (I mean, something that can get you back to the original size with a simple UpSampling2D). With border_mode="same" you always know where you are and don't need Resize2D. That worked for me before...

@ypxie
Author

ypxie commented Mar 26, 2016

Yes, it will crash for a size-specific layer, but the core idea of an FCN is to avoid such operations.

I think Resize2D is necessary because the sizes of the test images are totally out of our control.

I'm also wondering why the JSON string is necessary. Reconstructing a model from a JSON string is not faster than recompiling it at all.

What do you think?

@NasenSpray

That reminds me... I could contribute my implementation of fractionally strided/transposed convolutions (aka "deconvolution").

I think Resize2D is necessary because the sizes of the test images are totally out of our control.

But they are in your control.

I'm also wondering why the JSON string is necessary. Reconstructing a model from a JSON string is not faster than recompiling it at all.

It's only necessary if you want to use the model without needing the code that builds it. Otherwise it's sufficient to load/save just the weights.
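For reference, the transposed ("fractionally strided") convolutions mentioned above follow standard output-size arithmetic. This is not NasenSpray's implementation, just the shape formulas, sketched in plain Python:

```python
def conv_out(n, k, s=1, p=0):
    """Output size of a regular convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

def deconv_out(n, k, s=1, p=0):
    """Output size of the matching transposed convolution: (n - 1)s - 2p + k."""
    return (n - 1) * s - 2 * p + k

# The transposed conv inverts the forward size arithmetic (when the forward
# pass divides evenly):
n = 11
m = conv_out(n, k=3, s=2, p=1)        # 6
print(deconv_out(m, k=3, s=2, p=1))   # 11
```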

@ypxie
Author

ypxie commented Mar 26, 2016

@NasenSpray pooling and upsampling don't necessarily give you back the original size. Consider a 5×5 input: after max-pooling it becomes 2×2, and upsampling that gives you 4×4.
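The arithmetic being described, sketched in plain Python: max-pooling floors odd dimensions, so pool-then-upsample does not restore them.

```python
def pool(n, k=2):
    return n // k    # max-pooling floors odd dimensions

def upsample(n, k=2):
    return n * k

print(upsample(pool(5)))  # 4, not 5 -- hence the need to crop or pad somewhere
print(upsample(pool(8)))  # 8 -- even multiples survive
```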

@NasenSpray

@shampool: I know, that's why I said

Okay, you have 2x max-pool and 2x upsample, so the sizes already match even without Resize2D, as long as you make sure that the inputs are a multiple of 4 in both dimensions.

You can crop/pad the images to that size before you feed them to the network.

@ypxie
Author

ypxie commented Mar 26, 2016

That's not a good idea, because it requires you to carefully design the input sizes so that every successive max-pooling layer is reversible, which is inflexible and error-prone.

@perone

perone commented Mar 26, 2016

As far as I know, the FCNs have pooling; if you look at the Caffe model you'll see it: https://gist.github.com/longjon/1bf3aa1e0b8e788d7e1d#file-readme-md
From a technical point of view, I don't think it is wrong to call it fully convolutional, because the pooling can be seen as a convolution with a different function.

@lukovkin
Contributor

@NasenSpray Regarding your implementation of deconvolution: it would be nice to see it.
We are still fighting with the implementation of UFCNN in Keras and are a bit stuck. In the original paper (http://arxiv.org/abs/1508.00317v1) it seems that the author upsamples the output of each successive convolution layer by a factor of 2, and the next convolutional layers are required to produce output that is half the size of their input.
And he doesn't use max-pooling for it.
From the point of view of the layers' shapes it could be done with the filter size being a product of the original sequence length, but I doubt that's the correct way.
