Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested Taylor1s vs TaylorN #123

Open
blas-ko opened this issue Sep 7, 2017 · 6 comments
Open

Nested Taylor1s vs TaylorN #123

blas-ko opened this issue Sep 7, 2017 · 6 comments

Comments

@blas-ko
Copy link
Contributor

blas-ko commented Sep 7, 2017

Based on offline discussions with @lbenet and @PerezHz (Jet Transport performance is amazing compared with TaylorN) I've been sketching some ideas for having a nested Taylor1 object working as a N-variable Taylor expansion (some of this have been worked around in #12).
This would address some of the TaylorN issues ( #24 , #11 , etc ) and would make an N-variable expansion much similar (in construction) to a 1-variable expansion. The only main difference would be about complexity and maybe the way of storing/printing coefficients.

One of the first results I found is that, if a Taylor1 operation has a complexity of, say, M, then the same nested Taylor1 object equivalent to an N-variable expansion would be of complexity M^N (N times the operations of a Taylor1 object), which is nasty. On the other hand, this would allocate really little memory, as there wouldn't be any need to call for special dictionaries or so.

I've not thought yet of a way to reduce this complexity, but I don't know about the actual complexity for the current TaylorN either.

Discussion about this would be appreciated.

cc: @dpsanders

@blas-ko
Copy link
Contributor Author

blas-ko commented Sep 11, 2017

There is a natural way to make the nested Taylors work just by extending the promotion between two nested Taylor1s. Nevertheless, allocation and time just explodes as order goes higher...

using TaylorSeries

#nested taylors
t = Taylor1(5);
x = Taylor1([zero(t),one(t)],5);
y = Taylor1([zero(x),one(x)],5);
z = Taylor1([zero(y),one(y)],5);
w = Taylor1([zero(z),one(z)],5);
q = Taylor1([zero(w),one(w)],5);

#multiplication
@time t*t
@time x*x
@time y*y
@time z*z
@time w*w
@time q*q;
  0.000008 seconds (6 allocations: 320 bytes)
  0.000010 seconds (159 allocations: 10.266 KiB)
  0.000138 seconds (6.99 k allocations: 437.672 KiB)
  0.001761 seconds (99.14 k allocations: 5.841 MiB)
  0.024221 seconds (936.35 k allocations: 54.412 MiB, 27.03% gc time)
  0.258099 seconds (7.50 M allocations: 433.848 MiB, 48.65% gc time)

#sum
@time t+t
@time x+x
@time y+y
@time z+z
@time w+w
@time q+q;
  0.000003 seconds (6 allocations: 320 bytes)
  0.000004 seconds (24 allocations: 1.438 KiB)
  0.000007 seconds (132 allocations: 7.250 KiB)
  0.000012 seconds (330 allocations: 17.750 KiB)
  0.000018 seconds (618 allocations: 32.938 KiB)
  0.000027 seconds (996 allocations: 52.813 KiB)

For the normal TaylorN it doesn't get as bad...

T,X,Y,Z,W,Q = set_variables("x",numvars=6);

#sum
@time T+T
@time X+X
@time Y+Y
@time Z+Z
@time W+W
@time Q+Q;
  0.000013 seconds (77 allocations: 26.625 KiB)
  0.000010 seconds (77 allocations: 26.625 KiB)
  0.000011 seconds (77 allocations: 26.625 KiB)
  0.000009 seconds (77 allocations: 26.625 KiB)
  0.000012 seconds (77 allocations: 26.625 KiB)
  0.000012 seconds (77 allocations: 26.625 KiB)

#multiplication
@time T*T
@time X*X
@time Y*Y
@time Z*Z
@time W*W
@time Q*Q;
  0.000016 seconds (41 allocations: 10.141 KiB)
  0.000014 seconds (41 allocations: 10.141 KiB)
  0.000010 seconds (41 allocations: 10.141 KiB)
  0.000009 seconds (41 allocations: 10.141 KiB)
  0.000010 seconds (41 allocations: 10.141 KiB)
  0.000015 seconds (41 allocations: 10.141 KiB)

Any ideas on how to deal with this? Should it be worth it in any sense to implement this?

@lbenet
Copy link
Member

lbenet commented Sep 11, 2017

@PerezHz (offline) argues that for some cases, and I think this covers few variables, nested Taylor1 behave better with respect to integration. The following, at least for three variables illustrates something different.

julia> using TaylorSeries

julia> # The following reports the second run only

julia> @time begin
       t = Taylor1(5)
       x = Taylor1(eltype(t), 5)
       y = Taylor1(eltype(x), 5)
       t*x+y
       end;
  0.000048 seconds (23 allocations: 1.266 KiB)

julia> # The following reports the second run only

julia> @time begin
       T, X, Y = set_variables("t x y", order=5)
       T*X+Y
       end
  0.000109 seconds (356 allocations: 26.016 KiB)

Maybe your observation is correct, but beyond some number of variables beyond three.

@PerezHz
Copy link
Contributor

PerezHz commented Sep 13, 2017

A simple case which illustrates the difference in performance between Taylor1 and TaylorN for jet transport applications is the following:

using Suppressor #suppress warnings
using BenchmarkTools, TaylorIntegration, TaylorSeries
f(t, x) = x^2
const t0=0.0
const tmax=0.3
const order = 28
const abstol = 1.0E-20
const x0 = 3.0 #"nominal" initial condition
const varorder = 5

p = Taylor1(varorder) # initialize Taylor1 for jet transport
q = set_variables("ξ", numvars=1, order=varorder) # initialize TaylorN for jet transport

const x0T1 = x0 + p #Taylor1 jet transport initial condition
const x0TN = x0 + q[1] #jet transport initial condition

Then, for Taylor1, we get

julia> @suppress_err @btime taylorinteg(f, x0T1, t0, tmax, order, abstol, maxsteps=1);
487.347 μs (15825 allocations: 1.01 MiB)

Whereas for TaylorN we get

julia> @suppress_err @btime taylorinteg(f, x0TN, t0, tmax, order, abstol, maxsteps=1);
9.438 ms (303488 allocations: 17.33 MiB)

The difference is even bigger when going to higher orders in the variations. If we change varorder from 5 to 10, then we get the following:

julia> @suppress_err @btime taylorinteg(f, x0T1, t0, tmax, order, abstol, maxsteps=1);
617.371 μs (15825 allocations: 1.26 MiB)

julia> @suppress_err @btime taylorinteg(f, x0TN, t0, tmax, order, abstol, maxsteps=1);
18.796 ms (508328 allocations: 29.07 MiB)

If we try this same idea for some more complicated problems (i.e., with more dependent variables), then (what I found is that) TaylorN is practically impossible to use, whereas Taylor1 makes things feasible. Again, these results I found apply only for jet transport with one variation.

@lbenet
Copy link
Member

lbenet commented Sep 13, 2017

As a side remark, while I agree that recursive Taylor1 may be better in the examples above, it may also be true that for more dependent variables eventually TaylorN becomes better.

@blas-ko
Copy link
Contributor Author

blas-ko commented Sep 14, 2017

In your example @lbenet , notice you're defining x = Taylor1(eltype(t), 5), where eltype(t) = Float64, so x is just a Taylor1{Float64} and not a Taylor1{Taylor1{Float64}}. (Notice that taking x = Taylor1{typeof(t),5} won't work either as the innermost Taylor would be of order 1.)
Changing things a little, this becomes

using TaylorSeries

@time begin
    t = Taylor1(5);
    x = Taylor1([zero(t),one(t)],5);
    y = Taylor1([zero(x),one(x)],5);
    t*x+y
end;
  0.000062 seconds (738 allocations: 41.094 KiB)

@time begin
    T, X, Y = set_variables("t x y", order=5)
    T*X+Y
end
  0.000071 seconds (356 allocations: 26.016 KiB)

Notice how nested Taylor1s now have more (double) allocations than TaylorN.

Nevertheless, @PerezHz benchmarks are promising and I think it's worth it to try it out for 2 or 3 nested variables and see what outcome do we get.
Again, complexity of the nesting depends heavily on the order of the polynomials as it goes (I believe) as M^N where M is the order and N the number of variables, so I think benchmarks have to be done controlling both: number of variables and order of the polynomials.

@lbenet
Copy link
Member

lbenet commented Sep 22, 2017

@blas-ko Thanks for pointing out my mistake. Yet, the following is what I had in mind:

julia> using TaylorSeries, BenchmarkTools

julia> @btime begin
           t = Taylor1(5);
           x = Taylor1(typeof(t), 5);
           y = Taylor1(typeof(x), 5);
           t*x+y
       end;
  5.847 μs (259 allocations: 13.89 KiB)

julia> @btime begin
           T, X, Y = set_variables("t x y", order=5)
           T*X+Y
       end;
  18.994 μs (349 allocations: 25.77 KiB)

The difference is not too big, but seems a bit more efficient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants