Upsampling Layer for Flux #95

avik-pal · 2019-02-24T09:02:59Z

Stuff to do before it is ready to merge

Wait for Major overhaul of NNlib #94 to get merged.
Add tests
Add the corresponding layer in Flux

This only supports 2d arrays.

Corresponding CuArrays PR: JuliaGPU/CuArrays.jl#239

Performance Gains over `Base.repeat`

using NNlib, BenchmarkTools
x = rand(100, 100, 3, 1);

@benchmark upsample($x, (3, 3))
@benchmark repeat($x, inner=(3, 3, 1, 1))

Results

upsample

BenchmarkTools.Trial: 
  memory estimate:  2.06 MiB
  allocs estimate:  3
  --------------
  minimum time:     2.312 ms (0.00% GC)
  median time:      2.350 ms (0.00% GC)
  mean time:        2.504 ms (2.29% GC)
  maximum time:     52.855 ms (95.61% GC)
  --------------
  samples:          1990
  evals/sample:     1

Base.repeat

BenchmarkTools.Trial: 
  memory estimate:  9.38 MiB
  allocs estimate:  210018
  --------------
  minimum time:     13.784 ms (0.00% GC)
  median time:      15.807 ms (0.00% GC)
  mean time:        16.536 ms (6.66% GC)
  maximum time:     67.226 ms (75.78% GC)
  --------------
  samples:          302
  evals/sample:     1

Co-Authored-By: staticfloat <[email protected]>

staticfloat · 2019-02-25T17:27:39Z

I think it would be good to have this mimic the behavior of the rest of the methods in NNlib post-#94. What I mean by that is:

upsample!() should operate on 5d arrays (3 spatial dimensions). You can use wrappers, like in the rest of the package, to automatically reshape 3d and 4d arrays to 5d.
upsample!() should take in an UpsamplingDims object instead of having kwargs passed to it. This may help with performance here as well, as the compiler will then be able to specialize on things like stride and dilation. Just copy PoolDims.jl and make an UpsamplingDims.jl or something, and fill out the method definitions. I think it's likely that after doing this, the number of allocations needed within upsample!() will drop dramatically.
In the rest of the package, we refer to width as the innermost dimension, and we should stay consistent here (i.e. we assume WHCN dimensional ordering, whereas the methods here seem to assume HWCN).
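
For illustration only, the three points above could look roughly like the sketch below. Everything in it is hypothetical: `UpsampleDimsSketch`, `scale_of`, `to_5d`, `upsample5d!` and `upsample_sketch` are made-up names, not NNlib's API and not the code in this PR. It assumes nearest-neighbour upsampling with integer scale factors and WHCN ordering, with the scale carried in a type parameter so the kernel can specialize on it.

```julia
# Hypothetical dims object: the scale tuple lives in the type parameter `S`,
# so methods that take it are compiled for one specific scale.
struct UpsampleDimsSketch{S} end
UpsampleDimsSketch(scale::NTuple{3,Int}) = UpsampleDimsSketch{scale}()
scale_of(::UpsampleDimsSketch{S}) where {S} = S

# Hypothetical wrappers: pad 3d/4d inputs with singleton spatial dims so that
# the kernel only ever sees 5d arrays laid out as (W, H, D, C, N).
to_5d(x::AbstractArray{T,3}) where {T} = reshape(x, size(x, 1), 1, 1, size(x, 2), size(x, 3))
to_5d(x::AbstractArray{T,4}) where {T} = reshape(x, size(x, 1), size(x, 2), 1, size(x, 3), size(x, 4))
to_5d(x::AbstractArray{T,5}) where {T} = x

# Naive nearest-neighbour kernel on 5d arrays; width is the innermost loop,
# which matches Julia's column-major memory layout.
function upsample5d!(y::AbstractArray{T,5}, x::AbstractArray{T,5},
                     dims::UpsampleDimsSketch) where {T}
    sw, sh, sd = scale_of(dims)
    @assert size(y) == size(x) .* (sw, sh, sd, 1, 1)
    @inbounds for n in axes(y, 5), c in axes(y, 4), d in axes(y, 3),
                  h in axes(y, 2), w in axes(y, 1)
        y[w, h, d, c, n] = x[cld(w, sw), cld(h, sh), cld(d, sd), c, n]
    end
    return y
end

# Entry point for 3d, 4d or 5d inputs; `scale` has one entry per spatial dim.
function upsample_sketch(x::AbstractArray{T,N}, scale::NTuple{M,Int}) where {T,N,M}
    @assert M == N - 2
    x5 = to_5d(x)
    scale5 = (scale..., ntuple(_ -> 1, 3 - M)...)   # pad missing spatial dims with 1
    y5 = similar(x5, size(x5) .* (scale5..., 1, 1))
    upsample5d!(y5, x5, UpsampleDimsSketch(scale5))
    # Drop the singleton spatial dims that `to_5d` added.
    return reshape(y5, (size(x)[1:M] .* scale)..., size(x, N - 1), size(x, N))
end
```

With these definitions, `upsample_sketch(x, (3, 3))` on a 4d array should agree with `repeat(x, inner=(3, 3, 1, 1))`. Whether this layout actually removes the allocations discussed here is not something the sketch demonstrates.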

avik-pal · 2019-02-26T07:44:02Z

@staticfloat I did manage to address the issues you mentioned. But I ended up messing up the performance badly. The previous benchmarks now show:

BenchmarkTools.Trial: 
  memory estimate:  306.93 MiB
  allocs estimate:  5940030
  --------------
  minimum time:     1.137 s (2.52% GC)
  median time:      1.159 s (2.47% GC)
  mean time:        1.174 s (4.51% GC)
  maximum time:     1.214 s (7.45% GC)
  --------------
  samples:          5
  evals/sample:     1

staticfloat · 2019-02-26T07:55:44Z

Oh wow, that's pretty disappointing. I'm currently brushing up #94, looking into performance problems there as well, so once I am satisfied with that and we merge it in, I'll swing around to this. I bet it's something simple. Thanks for the update, the structure of this code looks much better. :)

pshashk · 2019-03-30T16:31:19Z

I'm using the following approach to achieve upsampling. Basically, we iterate over every strided index grid in the new array (each grid has the same shape as the original array) and assign the original array to it. The backward pass is symmetrical, so all the index grids can be reused.

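using BenchmarkTools

# Nearest-neighbour upsampling; `ratio` holds one integer scale factor per dimension of `x`.
function upsample(x, ratio)
  y = similar(x, (size(x) .* ratio)...)
  # Each offset `i` selects a strided grid in `y` with the same shape as `x`.
  for i in Iterators.product(Base.OneTo.(ratio)...)
    loc = map((i, r, s) -> range(i, stop = s, step = r), i, ratio, size(y))
    @inbounds y[loc...] = x
  end
  y
end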
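
The backward pass mentioned above is not shown in this comment. A minimal sketch of what it might look like, assuming the same index grids are reused and the upstream gradient `dy` is summed over them, is given below. `upsample_grad_sketch` is a hypothetical name, and the forward function is repeated only so the block runs on its own.

```julia
# Forward pass, as in the comment above: assign `x` to every strided grid of `y`.
function upsample(x, ratio)
    y = similar(x, (size(x) .* ratio)...)
    for i in Iterators.product(Base.OneTo.(ratio)...)
        loc = map((i, r, s) -> range(i, stop = s, step = r), i, ratio, size(y))
        @inbounds y[loc...] = x
    end
    return y
end

# Hypothetical backward pass: every input element was copied into
# prod(ratio) output positions, so its gradient is the sum of `dy`
# over the same strided grids.
function upsample_grad_sketch(dy, ratio)
    dx = zeros(eltype(dy), size(dy) .÷ ratio)
    for i in Iterators.product(Base.OneTo.(ratio)...)
        loc = map((i, r, s) -> range(i, stop = s, step = r), i, ratio, size(dy))
        dx .+= view(dy, loc...)
    end
    return dx
end
```

A quick sanity check is the adjoint identity: for random `x` and `dy` of matching sizes, `sum(upsample(x, r) .* dy)` should be approximately equal to `sum(x .* upsample_grad_sketch(dy, r))`.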

x = rand(100, 100, 3, 1);
@benchmark repeat($x, inner=(3, 3, 1, 1))

Base.repeat

BenchmarkTools.Trial:
  memory estimate:  9.38 MiB
  allocs estimate:  210018
  --------------
  minimum time:     7.749 ms (0.00% GC)
  median time:      9.059 ms (0.00% GC)
  mean time:        9.301 ms (5.33% GC)
  maximum time:     41.234 ms (75.08% GC)
  --------------
  samples:          538
  evals/sample:     1

upsample

@benchmark upsample($x, (3, 3, 1, 1))

BenchmarkTools.Trial: 
  memory estimate:  2.06 MiB
  allocs estimate:  12
  --------------
  minimum time:     328.915 μs (0.00% GC)
  median time:      390.158 μs (0.00% GC)
  mean time:        469.281 μs (6.07% GC)
  maximum time:     29.835 ms (97.17% GC)
  --------------
  samples:          10000
  evals/sample:     1

sdewaele · 2020-02-07T14:12:39Z

+1

andevellicus · 2020-05-28T03:53:30Z

+1

staticfloat and others added 7 commits February 21, 2019 03:39

Major overhaul

13636ad

Update src/conv.jl

ac7c450

Co-Authored-By: staticfloat <[email protected]>

update comment

3924fa1

Co-Authored-By: staticfloat <[email protected]>

Update src/impl/pooling_direct.jl

c99ab34

Co-Authored-By: staticfloat <[email protected]>

update comment

3c0910b

Co-Authored-By: staticfloat <[email protected]>

Update src/conv.jl

2c9e588

Co-Authored-By: staticfloat <[email protected]>

Add Flux Upsampling Code for CPU

074010a

MikeInnes requested a review from staticfloat February 25, 2019 09:23

Modify the upsampling functions to the new API

288244e

dbadrian mentioned this pull request Sep 2, 2019

Upsampling GPU Kernel for Flux JuliaGPU/CuArrays.jl#293

Draft

3 tasks

maxfreu mentioned this pull request Dec 28, 2020

Add bilinear upsample layer FluxML/Flux.jl#1180

Closed

CarloLucibello closed this Dec 31, 2020
