Pure-Julia convolutions #9
Conversation
need to bump REQUIRE?
Good catch. FWIW there's a bunch of stuff up for grabs here, e.g. implementing gradients and pooling. I'll happily take extremely slow / naive implementations to get things going.
Any example usage for this API?
A couple of notes to keep in mind: …
@iblis17 the API is pretty simple: we define a weight like `w = randn(2, 2, 3, 5)` (a 2×2 kernel mapping 3 input channels to 5 output channels), take an image `im = rand(100, 100, 3)`, and call `conv(im, w)`. A nice property of this is that you can drop trailing dimensions of the image (particularly the batch dimension, which is implicitly 1 in this case). It's also completely generic across the number of dimensions, which seems a lot nicer than having several dimension-specific variants.
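A minimal usage sketch of the API described above (assuming the `conv` entry point proposed in this PR):

```julia
w  = randn(2, 2, 3, 5)       # 2×2 kernel, 3 input channels, 5 output channels
im = rand(100, 100, 3)       # a single 100×100 image; batch dimension dropped
y  = conv(im, w)             # treated as a batch of one

ims = rand(100, 100, 3, 16)  # the same call with an explicit batch of 16
ys  = conv(ims, w)
```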
Not quite, this way you wouldn't be able to distinguish between a single 2D image and a batch of 1D inputs, or a single 3D image and a batch of 2D ones. Maybe we can properly dispatch using the second argument, but it still doesn't sound like a clear API to me.
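"Dispatch using the second argument" could mean inferring the spatial rank from the kernel rather than the image; a rough, hypothetical sketch of that idea (names assumed, not part of this PR):

```julia
# Hypothetical: the kernel always has ndims = spatial + 2 (spatial dims, Cin, Cout),
# so an image with one fewer dimension is a single sample and one with the same
# number of dimensions is a batch.
function conv_auto(x::AbstractArray, w::AbstractArray)
    spatial = ndims(w) - 2
    if ndims(x) == spatial + 1
        x = reshape(x, size(x)..., 1)   # add an implicit batch dimension of 1
    elseif ndims(x) != spatial + 2
        throw(DimensionMismatch("image rank does not match kernel rank"))
    end
    return conv(x, w)                   # `conv` as proposed in this PR
end
```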
I believe we need to preserve the way people use convolutions in other languages/libraries, which is to pass keyword arguments. Also, it's again unclear how to map this onto the cuDNN case. Anyway, I'm more worried about strides. I don't think you can implement them using any kind of view, but even if you can, this would break array memory contiguity and, I suppose, invalidate or slow down some algorithms. All this stuff requires quite a lot of investigation, I should say.
Once again, strides are harder to handle than padding. Also keep in mind pooling, which requires strides ~99% of the time; gradients for it should be thought out separately.
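To make the stride bookkeeping concrete, here is a naive, hypothetical 2D max-pooling sketch over a `W × H × C × N` array (window `k`, stride `s`); nothing in this PR implements this:

```julia
# Naive strided 2D max-pooling: window k = (kw, kh), stride s = (sw, sh).
function maxpool2d_naive(x::AbstractArray{T,4}, k::NTuple{2,Int}, s::NTuple{2,Int}) where T
    W, H, C, N = size(x)
    Wo = div(W - k[1], s[1]) + 1
    Ho = div(H - k[2], s[2]) + 1
    y = similar(x, Wo, Ho, C, N)
    for n in 1:N, c in 1:C, j in 1:Ho, i in 1:Wo
        i0 = (i - 1) * s[1] + 1
        j0 = (j - 1) * s[2] + 1
        y[i, j, c, n] = maximum(@view x[i0:i0 + k[1] - 1, j0:j0 + k[2] - 1, c, n])
    end
    return y
end
```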
Regarding `w = randn(2, 2, 3, 5)`: how come the weight is like that instead of …
Does this accept static array stencils?
But if the channels are independent (the stencil doesn't apply between channels, just on each channel), is this the same operation as if it were reshaped to …
Well, I think it depends on your data; there is a slight difference. Consider that in the case of … In the case where the channels are independent, please check out depthwise convolution.
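For context, a depthwise convolution applies one kernel slice per channel and never mixes channels; a naive, hypothetical 2D sketch (not part of this PR):

```julia
# Naive depthwise 2D convolution: one KW×KH kernel slice per channel,
# channels never mix. Input is W×H×C, kernel is KW×KH×C.
function depthwiseconv2d_naive(x::AbstractArray{T,3}, w::AbstractArray{T,3}) where T
    W, H, C = size(x)
    KW, KH, Cw = size(w)
    @assert C == Cw "one kernel slice per input channel"
    y = zeros(T, W - KW + 1, H - KH + 1, C)
    for c in 1:C, j in axes(y, 2), i in axes(y, 1)
        acc = zero(T)
        for dj in 1:KH, di in 1:KW
            acc += x[i + di - 1, j + dj - 1, c] * w[di, dj, c]
        end
        y[i, j, c] = acc
    end
    return y
end
```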
Superseded by an absolutely mind-bogglingly less efficient amount of code (in terms of SLOC) in #94.
Add group support for convolutions
Accepts an input of size `W+ × C × N` (any number of spatial dimensions, then channels and batch) and a kernel of size `W+ × Cin × Cout` (similar to Knet, although we don't flip the channel dimension of the kernel). This is fairly naive: it's very fast for basic convolutions but struggles a bit more when you have multiple channels.
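As a rough sketch of what that layout implies (not the actual code in this PR), a naive 2D convolution with no padding, unit stride, and no kernel flipping might look like this:

```julia
# Naive 2D convolution over the layout described above:
# input is W×H×Cin×N, kernel is KW×KH×Cin×Cout (cross-correlation, no flipping).
function conv2d_naive(x::AbstractArray{T,4}, w::AbstractArray{T,4}) where T
    W, H, Cin, N = size(x)
    KW, KH, Cin2, Cout = size(w)
    @assert Cin == Cin2 "input and kernel channel counts must match"
    y = zeros(T, W - KW + 1, H - KH + 1, Cout, N)
    for n in 1:N, co in 1:Cout, j in axes(y, 2), i in axes(y, 1)
        acc = zero(T)
        for ci in 1:Cin, dj in 1:KH, di in 1:KW
            acc += x[i + di - 1, j + dj - 1, ci, n] * w[di, dj, ci, co]
        end
        y[i, j, co, n] = acc
    end
    return y
end
```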