2D Transpose Convolutions #54
Conversation
This is great, but I think we can simplify a bit -- we don't actually need the `conv_transpose` alias. How about if the `ConvTranspose` layer just calls the gradient function directly, and we also define the derivative of that function? Aside from not needing the NNlib patch, this has the big bonus that nested AD will then also work through convolutions.
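For context, a minimal sketch of that suggestion. This is not the PR's actual code: the two-argument `∇conv_data(dy, w)` signature is taken from the diff further down and treated as an assumption, and `ConvTranspose` here is a bare struct rather than Flux's layer.

```julia
# Sketch: the forward pass of a transposed convolution is the input-gradient
# of a regular convolution, so the layer can call the gradient kernel directly
# instead of going through a separate conv_transpose alias.
# Assumes `using NNlib` with this PR merged; released versions may differ.
struct ConvTranspose{W<:AbstractArray}
    weight::W
end

# `∇conv_data(dy, w)` is the signature proposed in this PR's diff.
(c::ConvTranspose)(x) = ∇conv_data(x, c.weight)
```

Defining a derivative for `∇conv_data` itself is what makes nested AD work: differentiating the layer's forward pass then just means differentiating that gradient function again.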
That sounds cool. I was working on writing the gradient hookup for it. I felt that we should refactor the conv interface. EDIT: Ahh nvm, fixed it :)
This should fix the failing gradtest.
Force-pushed from b579b91 to 97b095b.
This has now been resolved for v1.0.
It'd be good if we could take the opportunity to make the interface a bit more consistent. If I understand correctly,

I realized that for
src/conv.jl (Outdated)

```
@@ -36,8 +53,14 @@ function crosscor(x::A, w::A; pad = 0, stride = 1, dilation = 1) where A<:AbstractArray
    x, w, pad = pad_, stride = stride_, dilation = dilation)
end

∇conv_data(dy::A, x::A, w::A; pad = 0, stride = 1, dilation = 1, flipkernel = 0) where A<:AbstractArray =
    ∇conv_data!(zero(x), dy, x, w; pad = pad, stride = stride, dilation = dilation, flipkernel=flipkernel)
function ∇conv_data(dy::A, w::A, x_dims=nothing; pad = 0, stride = 1, dilation = 1, flipkernel = 0) where A<:AbstractArray
```
A couple of small things (sketched after this list):

- Can you make this a keyword argument `size`?
- Can you add a simple three-arg wrapper with a deprecation warning?
- You should compare `x_dims === nothing` so that type inference can remove the check.
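A self-contained sketch of all three suggestions, with hypothetical names (`grad_data` stands in for `∇conv_data`, and the bodies are stand-ins, not NNlib's kernels):

```julia
# 1. `size` as a keyword argument, defaulting to `nothing`.
function grad_data(dy::AbstractArray, w::AbstractArray; size = nothing)
    # 3. Compare with `===`: when `size` is `nothing` its type is `Nothing`,
    #    so type inference can prove the branch dead and remove the check.
    if size === nothing
        size = Base.size(dy)  # stand-in for the real shape inference
    end
    return similar(dy, size)  # stand-in for the real gradient computation
end

# 2. A three-arg method kept only as a deprecation shim for the old call style.
function grad_data(dy::AbstractArray, w::AbstractArray, size)
    Base.depwarn("pass `size` as a keyword argument", :grad_data)
    return grad_data(dy, w; size = size)
end
```

Note that inside the method the keyword `size` shadows `Base.size`, which is exactly why the call above has to be qualified as `Base.size(dy)`; the next comments discuss this.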
When the `size` method is being used in the function, it gives a `MethodError` if this argument is named `size`. How about using `dims` instead?
`dims` is a bit weird because it usually refers to the dimensions you act on. How about just calling `Base.size` inside the function?
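A tiny reproduction of that clash, assuming the function keeps a keyword named `size` (names here are hypothetical):

```julia
# Inside the method body, the keyword `size` shadows `Base.size`, so an
# unqualified `size(x)` tries to call the keyword's value (e.g. `nothing`)
# and throws a MethodError.
f(x; size = nothing) = size(x)        # MethodError: Nothing is not callable
g(x; size = nothing) = Base.size(x)   # fine: qualified call reaches Base.size

g(rand(2, 3))  # returns (2, 3)
```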
Ok great, this is all but there now.

Once this is merged we're going to have to upper-bound CuArrays and Flux when we tag NNlib again. We'll have to update both of those for the new API as well (I know you have a PR for Flux already).
```julia
if size === nothing
    size = cdims(Base.size(x), dilation_dims(w, dilation), pad_, stride_)
end
conv!(similar(x, size), x, w, pad = pad_, stride = stride_, dilation = dilation)
```
Does the `size` argument actually make sense for `conv`? I don't know if there's a similar ambiguity in the sizes as compared with the transpose.
`conv` acts as the gradient function of the input for conv transpose during the backward pass. Hence, just like `conv_data`, there exists an ambiguity here as well.
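To make the ambiguity concrete (a worked example, not from the thread): with a strided convolution, several input lengths map to the same output length, so the reverse direction cannot infer the input size uniquely.

```julia
# Output length of a 1D convolution with kernel size k, stride s, no padding.
outlen(n, k, s) = fld(n - k, s) + 1

outlen(7, 3, 2)  # 3
outlen(8, 3, 2)  # 3  -> inputs of length 7 and 8 both yield length-3 outputs,
                 #      so the transpose needs the target size passed explicitly
```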
Ok, that's fine, just checking.
src/conv.jl (Outdated)

```julia
∇conv_filter(dy::A, x::A, w::A; pad = 0, stride = 1, dilation = 1, flipkernel=0) where A<:AbstractArray =
    ∇conv_filter!(zero(w), dy, x, w; pad = pad, stride = stride, dilation = dilation, flipkernel=flipkernel)
∇conv_filter(dy::A, x::A, size::Tuple; pad = 0, stride = 1, dilation = 1, flipkernel=0) where A<:AbstractArray =
    ∇conv_filter!(zeros(eltype(dy),size), dy, x; pad = pad, stride = stride, dilation = dilation, flipkernel=flipkernel)
```
This should use `similar`.
To clarify, you mean to use `zero(similar(dy, size))`, right?
Just `similar` should be fine, if you're about to overwrite it anyway.
Currently we have `similar` in use in this branch, which is failing the NaN test for `∇conv_filter`. However, if `zero` is used instead of `similar` then the tests pass.
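A plausible explanation for that failure (an illustration, assuming the in-place kernel accumulates into its output buffer rather than overwriting every element): `similar` returns uninitialized memory, which can contain NaNs or garbage.

```julia
x = rand(Float32, 4, 4)

buf = similar(x)  # uninitialized: contents are whatever was in memory
buf .+= 1f0       # accumulating into it mixes that garbage into the result

out = zero(x)     # guaranteed all-zero buffer
out .+= 1f0       # deterministic: all ones
```

So `similar` alone is only safe when the kernel fully overwrites the buffer; an accumulating kernel needs `zero` (or `fill!(similar(...), 0)`).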
src/conv.jl (Outdated)

```julia
∇conv_filter(dy::A, x::A, w::A; pad = 0, stride = 1, dilation = 1, flipkernel=0) where A<:AbstractArray =
    ∇conv_filter!(zero(w), dy, x, w; pad = pad, stride = stride, dilation = dilation, flipkernel=flipkernel)
∇conv_filter(dy::A, x::A, size::Tuple; pad = 0, stride = 1, dilation = 1, flipkernel=0) where A<:AbstractArray =
```
Are you intentionally doing data and filter in different ways? Why not use a `size` kwarg for both?
Yes, it is intentional, because for `conv_data` it serves a dual purpose: when `size=nothing` it performs a `conv_transpose`, else it is a `conv_grad`. `conv_filter` has only one purpose, which is to find the gradient of the filter. Hence, we require the value of `size` to be passed.
But they are not actually different operations, right? Conv transpose is just the gradient with a particular inferred size. (Lmk if my understanding is off here.)

So we could in principle just write a size inference for `conv_filter`; but if you don't want to do that for now, it'd be fine to just do `size` as a kwarg without a default (which will error if it's not provided).
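For reference, a keyword argument declared without a default is required in Julia; a minimal sketch with hypothetical names:

```julia
# No `= default` on the keyword, so omitting it throws UndefKeywordError.
grad_filter(dy, x; size) = similar(dy, size)  # body is a stand-in

grad_filter(rand(2, 2), rand(3, 3))                       # UndefKeywordError: `size` not assigned
grad_filter(rand(2, 2), rand(3, 3); size = (2, 2, 1, 1))  # ok
```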
Bump! This would be great to have.
I didn't see any big missing pieces, but I am worried that we might not have sufficient test coverage. I am working on getting Codecov or something hooked up so that we can be sure that what we're merging covers as many corner cases as possible.
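A hedged sketch of the kind of finite-difference gradient check such coverage tests typically rely on (the helper and tolerance here are illustrative, not NNlib's actual `gradtest`):

```julia
# Central-difference estimate of the gradient of a scalar-valued f at x.
function numgrad(f, x; eps = 1e-5)
    g = similar(x)
    for i in eachindex(x)
        x[i] += eps;  fp = f(x)
        x[i] -= 2eps; fm = f(x)
        x[i] += eps               # restore x
        g[i] = (fp - fm) / (2eps)
    end
    return g
end

x = rand(5)
loss(x) = sum(abs2, x)                            # analytic gradient is 2x
maximum(abs.(numgrad(loss, x) .- 2 .* x)) < 1e-6  # should hold
```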
@tejank10 can you rebase this on top of the latest `master`?
Force-pushed from 56fbbb0 to c4ac366.
Codecov Report

```
@@            Coverage Diff             @@
##           master      #54      +/-  ##
==========================================
+ Coverage   70.46%   72.15%   +1.69%
==========================================
  Files           9        9
  Lines         579      607      +28
==========================================
+ Hits          408      438      +30
+ Misses        171      169       -2
```

Continue to review the full report at Codecov.
Bump 🙂 Is any help needed to get this out the door?
@tejank10 Great, thanks. Can you synthesize a few more tests to exercise the codepaths that are missing (as evidenced by the code coverage)? I'm particularly interested that we hit the first couple of branches in
Awesome. I'm calling this good, and will be testing it out with an autoencoder experiment in the near future!
Added 2D transpose convolutions and tests