Compiletime for tensor very slow #278

michel2323 opened this issue Oct 17, 2017 · 5 comments

@michel2323

michel2323 commented Oct 17, 2017

We are prototyping a time integration using a handwritten residual function (50 lines) and Jacobian (100 lines):

    while norm_res > eps
        iteration = iteration + 1
        for iter in eachindex(J)
            J[iter]=0.0
        end
        jac_beuler(x, xold, h, e_fd, p_m, J, Ymat)
        x = x - inv(J)*F
        residual_beuler(x, xold, h, e_fd, p_m, F, Ymat)
        norm_res = norm(F)
    end

Computing the Jacobian, Hessian and tensor of one timestep with

    jac_cfg = ForwardDiff.JacobianConfig(integrate_wrapper, x, ForwardDiff.Chunk{1}())
    jac = x -> ForwardDiff.jacobian(integrate_wrapper, x, jac_cfg)
    
    hes_jac = x -> ForwardDiff.jacobian(integrate_wrapper, x)
    hes_cfg = ForwardDiff.JacobianConfig(hes_jac, x, ForwardDiff.Chunk{1}())
    hes = x -> ForwardDiff.jacobian(hes_jac, x, hes_cfg)

    ten_hes = x -> ForwardDiff.jacobian(hes_jac, x)
    ten_cfg = ForwardDiff.JacobianConfig(ten_hes, x, ForwardDiff.Chunk{1}())
    ten = x -> ForwardDiff.jacobian(ten_hes, x, ten_cfg)

results in the following runtime:

  1st Jacobian: 2.629844 seconds (1.13 M allocations: 49.980 MiB, 1.57% gc time)
  2nd Jacobian 0.005359 seconds (14.67 k allocations: 751.859 KiB)
  1st Hessian 26.822772 seconds (20.38 M allocations: 625.502 MiB, 2.31% gc time)
  2nd Hessian 0.151826 seconds (58.35 k allocations: 11.634 MiB, 22.91% gc time)
  1st tensor 6536.218095 seconds (12.29 G allocations: 319.796 GiB, 2.90% gc time)

I have already profiled the code and added type annotations wherever I could. This was run with julia -O0. With these JIT compilation times, I would be happy to sacrifice a bit of runtime for a faster JIT.

Is there any way to reduce the time for the tensor?

@misun6312

I have the same issue and I'm wondering if there is any solution for this.

@jrevels
Member

jrevels commented Feb 6, 2018

ref #266

Playing around with --compile and/or @nospecialize might help. Interpretation in Julia is pretty slow right now, but it might still be faster than the insane compilation times you're hitting here.

It might be worth removing all of ForwardDiff's @inline annotations to see what that does to performance. Those annotations were necessary a long time ago to guarantee performance, but the compiler's inlining heuristic has since become more advanced and might do better nowadays.
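
As an illustration of the @nospecialize suggestion, here is a minimal sketch using only Base Julia. The function newton_step and its finite-difference slope are hypothetical stand-ins, not code from this issue or from ForwardDiff; the only point is the argument annotation, which asks the compiler not to generate a separate specialization for every callable type passed as f.

```julia
# Hypothetical helper, not part of ForwardDiff or the code in this issue.
# @nospecialize asks the compiler not to specialize this method on the
# concrete type of `f`, trading some runtime speed for less compilation.
function newton_step(@nospecialize(f), x::Float64, h::Float64)
    # Central finite-difference slope, used only to keep the sketch self-contained.
    slope = (f(x + h) - f(x - h)) / (2h)
    return x - f(x) / slope
end

# One Newton step towards sqrt(2), starting from 1.5.
root = newton_step(x -> x^2 - 2.0, 1.5, 1e-6)
println(root)
```

The --compile option is a command-line flag, for example julia --compile=min script.jl. Whether either approach actually beats full compilation for the nested Jacobians in this issue is untested here.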

@ChrisRackauckas
Member

Is this still an issue? There's no MWE given to test it. #266 does much better on v1.0 though.

@michel2323
Author

michel2323 commented Nov 14, 2018

    using ForwardDiff

    function speelpenning(x)
      res = [1.0]
      for i in x
        res = res*i
      end
      return res
    end

    dim = parse(Int,ARGS[1])
    println("Speelpenning with dim = ", dim)
    @time fjac = x0 -> ForwardDiff.jacobian(speelpenning, x0)
    @time fhes = x0 -> ForwardDiff.jacobian(fhes_jac, x0)
    @time ften = x0 -> ForwardDiff.jacobian(ften_hes, x0)
    x1 = ones(dim)
    @time fjac(x1)
    @time fhes(x1)
    @time ften(x1)

No, it's not resolved. With dim=10 this code takes forever. That's surprising for Speelpenning, which is a classic example in AD.
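
For reference, the Speelpenning function is the product of its inputs, so its derivatives up to third order have simple closed forms. The sketch below uses only Base Julia; all helper names are hypothetical and none of this is ForwardDiff API. Values like these could serve as a cross-check for AD results. Since ForwardDiff.jacobian returns a two-dimensional matrix, the nested Jacobian outputs would presumably need a reshape before comparison.

```julia
# Closed-form derivatives of f(x) = x[1] * x[2] * ... * x[n].
# All names below are hypothetical helpers, independent of ForwardDiff.

# Product of all entries of x except those whose index appears in `skip`.
function prod_except(x, skip)
    p = one(eltype(x))
    for k in eachindex(x)
        k in skip || (p *= x[k])
    end
    return p
end

# Gradient: df/dx[i] is the product of all other entries.
speelpenning_grad(x) = [prod_except(x, (i,)) for i in 1:length(x)]

# Hessian: zero on the diagonal, product of the remaining entries elsewhere.
function speelpenning_hess(x)
    n = length(x)
    return [i == j ? zero(eltype(x)) : prod_except(x, (i, j)) for i in 1:n, j in 1:n]
end

# Third-order tensor: nonzero only when i, j, k are pairwise distinct.
function speelpenning_third(x)
    n = length(x)
    distinct(i, j, k) = i != j && j != k && i != k
    return [distinct(i, j, k) ? prod_except(x, (i, j, k)) : zero(eltype(x)) for i in 1:n, j in 1:n, k in 1:n]
end

x = [2.0, 3.0, 5.0]
g = speelpenning_grad(x)
H = speelpenning_hess(x)
T = speelpenning_third(x)
println(g)
println(H)
```

At x = ones(dim), as used in this issue, every off-diagonal entry is 1 and every entry with a repeated index is 0.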

@ChrisRackauckas
Member

Here's a corrected version of the code as an MWE:

    using ForwardDiff

    function speelpenning(x)
      res = [one(eltype(x))]
      for i in x
        res .= res.*i
      end
      return res
    end

    dim = 10
    println("Speelpenning with dim = ", dim)
    fjac = x0 -> ForwardDiff.jacobian(speelpenning, x0)
    fhes_jac = x0 -> ForwardDiff.jacobian(speelpenning, x0)
    fhes = x0 -> ForwardDiff.jacobian(fhes_jac, x0)
    ften_hes = x0 -> ForwardDiff.jacobian(fhes_jac, x0)
    ften = x0 -> ForwardDiff.jacobian(ften_hes, x0)
    x1 = ones(dim)
    println("Jac with compile")
    @time fjac(x1)
    println("Jac without compile")
    @time fjac(x1)
    println("Hes with compile")
    @time fhes(x1)
    println("Hes without compile")
    @time fhes(x1)
    println("Ten with compile")
    @time ften(x1)
    println("Ten without compile")
    @time ften(x1)

Which outputs:

    Speelpenning with dim = 10
    Jac with compile
      0.676379 seconds (1.86 M allocations: 96.334 MiB, 3.62% gc time)
    Jac without compile
      0.000071 seconds (8 allocations: 2.344 KiB)
    Hes with compile
      1.022722 seconds (1.72 M allocations: 81.470 MiB, 1.85% gc time)
    Hes without compile
      0.000103 seconds (11 allocations: 23.250 KiB)
    Ten with compile
     22.808301 seconds (9.07 M allocations: 347.316 MiB, 0.78% gc time)
    Ten without compile
      0.000317 seconds (15 allocations: 255.906 KiB)

With -O0:

    Speelpenning with dim = 10
    Jac with compile
      0.553653 seconds (1.86 M allocations: 96.355 MiB, 5.27% gc time)
    Jac without compile
      0.000026 seconds (38 allocations: 4.063 KiB)
    Hes with compile
      0.444710 seconds (1.72 M allocations: 81.481 MiB, 5.49% gc time)
    Hes without compile
      0.000078 seconds (41 allocations: 33.719 KiB)
    Ten with compile
      1.802644 seconds (9.07 M allocations: 347.459 MiB, 8.56% gc time)
    Ten without compile
      0.000361 seconds (45 allocations: 361.531 KiB)

KristofferC changed the title from "Tensor very slow" to "Compiletime for tensor very slow" on Jul 2, 2021