Getting type stability with EinCode #97
I think the work here is ultimately done by TensorOperations, which infers the result type:
julia> function f4(M, V; dim)
IA = (-1,0)
IB = ntuple(d -> d==dim ? 0 : d, ndims(V))
# IC = (-1, filter(!=(dim), ntuple(+, ndims(V)))...)
IC = ntuple(d -> d==dim ? -1 : d, ndims(V))
TensorOperations.tensorcontract(M, IA, V, IB, IC)
end
f4 (generic function with 1 method)
julia> f4(M, V; dim=2) ≈ f(M, V, dim=2)
true
julia> @code_warntype f4(M, V; dim=2)
...
Body::Array{Float64,3} |
I'm surprised to find that even when the dimensions are known, the result is still type-unstable:
It returns Any. Why would this be? |
Thanks for the issue. Type stability is not a consideration in OMEinsum, since high-order tensors appear in many applications:
|
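When stability matters at a call site, one common workaround is a type assertion on the result: the output's element type and rank are known to the caller even when dim is not. A minimal sketch along the lines of f4 above (stable_apply is a hypothetical name, and the assertion placement is my suggestion, not OMEinsum API):

```julia
using TensorOperations

# Hypothetical sketch: assert the result type at the call site.
# The rank and element type of the output are determined by the inputs
# even though `dim` is a runtime value, so the assertion restores
# inference for downstream code.
function stable_apply(M::AbstractMatrix{T}, V::AbstractArray{T,3}; dim) where {T}
    IA = (-1, 0)                                  # matrix: (output index, contracted index)
    IB = ntuple(d -> d == dim ? 0 : d, ndims(V))  # contract V's dim-th index
    IC = ntuple(d -> d == dim ? -1 : d, ndims(V)) # keep the other indices in place
    return TensorOperations.tensorcontract(M, IA, V, IB, IC)::Array{T,3}
end
```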
Will this instability decrease the speed of the function when it is called?
function _FL2(T, L)
L_reshaped = reshape(L, (size(T,1), size(T,1), :))
result = ein"(abx,aim),bin->mnx"(L_reshaped, conj(T), T) :: Array{eltype(T),3}
return reshape(result, size(L))
end
julia> size(L),size(T)
((256, 256), (256, 2, 256))
Performance:
julia> @benchmark ein"(abx,aim),bin->mnx"(L_reshaped, conj(T), T)
BenchmarkTools.Trial: 1446 samples with 1 evaluation.
Range (min … max): 1.403 ms … 204.071 ms ┊ GC (min … max): 0.00% … 6.36%
Time (median): 2.242 ms ┊ GC (median): 0.00%
Time (mean ± σ): 3.443 ms ± 6.709 ms ┊ GC (mean ± σ): 27.07% ± 25.08%
▆█▆▆▆▄▄▁▁
█████████▇▆█▇██▇██▇▇█▇▇▇▇██▆▇▆▆▆▅▇▆▁▆▄▅▅▅▁▅▅▄▁▄▄▁▁▁▄▁▄▅▄▁▁▄ █
1.4 ms Histogram: log(frequency) by time 17.1 ms <
Memory estimate: 3.51 MiB, allocs estimate: 317.
julia> @benchmark _FL2(T, L)
BenchmarkTools.Trial: 1 sample with 1 evaluation.
Single result which took 16.662 s (0.00% GC) to evaluate,
with a memory estimate of 513.52 KiB, over 30 allocations.
Meanwhile, if I use TensorOperations:
function _FL2(T, L)
L_reshaped = reshape(L, (size(T,1), size(T,1), :))
@tensor result[m,n,x] := L_reshaped[a,b,x] * T[a,i,m] * T[b,i,n]
return reshape(result, size(L))
end
the performance becomes correct:
@benchmark _FL2(T, L)
BenchmarkTools.Trial: 1796 samples with 1 evaluation.
Range (min … max): 1.086 ms … 205.694 ms ┊ GC (min … max): 0.00% … 31.31%
Time (median): 1.769 ms ┊ GC (median): 0.00%
Time (mean ± σ): 2.772 ms ± 5.724 ms ┊ GC (mean ± σ): 28.95% ± 23.49%
▇█▇▇▅▂▂▁ ▁
█████████▆▆▅▇▅▇▆▆▇▆▇▇▇▇▇▇▆▇▇▆▆▆▅▅▆█▅▆▅▇▅▆▄▅▅▁▁▅▄▄▅▄▁▁▄▄▁▄▅▄ █
1.09 ms Histogram: log(frequency) by time 16.2 ms <
Memory estimate: 2.50 MiB, allocs estimate: 11. |
I do not think the type instability matters here. Maybe check your Julia version? @qiyang-ustc
julia> @benchmark _FL2(T, L)
BenchmarkTools.Trial: 842 samples with 1 evaluation.
Range (min … max): 4.439 ms … 71.841 ms ┊ GC (min … max): 0.00% … 2.75%
Time (median): 5.041 ms ┊ GC (median): 4.05%
Time (mean ± σ): 5.925 ms ± 4.291 ms ┊ GC (mean ± σ): 7.74% ± 6.34%
▄█▇▅▂▂▂
██████████▆█▁▇▆▆▄▅▄▅▅▅▄▅▅▄▄▅▄▁▅▁▁▅▄▁▁▄▁▁▄▁▁▁▁▁▄▁▁▁▄▁▁▄▄▁▁▄ ▇
4.44 ms Histogram: log(frequency) by time 20.8 ms <
Memory estimate: 9.01 MiB, allocs estimate: 323. |
Interesting; I am on the current version (macOS).
Let me try to figure out what happened here |
I think it is more interesting than that, and indeed I made a mistake: I thought that for such a simple problem the order specification would not matter much.
julia> function _FL2(T, L)
L_reshaped = reshape(L, (size(T,1), size(T,1), :))
result = ein"abx,(aim,bin)->mnx"(L_reshaped, conj(T), T) :: Array{eltype(T),3}
return reshape(result, size(L))
end
julia> @benchmark _FL2(T, C * C)
BenchmarkTools.Trial: 1547 samples with 1 evaluation.
Range (min … max): 1.546 ms … 131.868 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 2.051 ms ┊ GC (median): 0.00%
Time (mean ± σ): 3.225 ms ± 4.121 ms ┊ GC (mean ± σ): 23.83% ± 24.31%
▂▆█▃▃▃▂▂▂▂▁▁▂▂▂▂▂▁▂
██████████████████████▇▆█▆▆▅▅▇▄▆▄▄▆▆▅▆▄▅▄▅▅▅▅▄▄▄▄▁▁▁▁▄▄▄▄▁▄ █
1.55 ms Histogram: log(frequency) by time 12.3 ms <
Memory estimate: 4.01 MiB, allocs estimate: 323.
julia> function _FL2(T, L)
L_reshaped = reshape(L, (size(T,1), size(T,1), :))
result = ein"(abx,aim),bin->mnx"(L_reshaped, conj(T), T) :: Array{eltype(T),3}
return reshape(result, size(L))
end
_FL2 (generic function with 1 method)
julia> @benchmark _FL2(T, C * C)
BenchmarkTools.Trial: 1636 samples with 1 evaluation.
Range (min … max): 1.562 ms … 19.747 ms ┊ GC (min … max): 0.00% … 19.90%
Time (median): 2.054 ms ┊ GC (median): 0.00%
Time (mean ± σ): 3.048 ms ± 1.968 ms ┊ GC (mean ± σ): 27.26% ± 25.50%
▅█▅▂▂▂▂▂▂▂▁ ▁▁▁▁▁ ▁▁
████████████▇███████████▇█▇▇▇▇▆▇▇▆▄▆▄▇▆▅▄▅▄▄▄▄▄▄▄▁▄▅▄▅▅▄▁▄ █
1.56 ms Histogram: log(frequency) by time 10.8 ms <
Memory estimate: 4.01 MiB, allocs estimate: 323.
julia> function _FL2(T, L)
L_reshaped = reshape(L, (size(T,1), size(T,1), :))
result = ein"abx,aim,bin->mnx"(L_reshaped, conj(T), T) :: Array{eltype(T),3}
return reshape(result, size(L))
end
_FL2 (generic function with 1 method)
julia> @benchmark _FL2(T, C * C)
BenchmarkTools.Trial: 1 sample with 1 evaluation.
Single result which took 16.971 s (0.00% GC) to evaluate,
with a memory estimate of 1.00 MiB, over 32 allocations.
Thus it seems related to the order specification of the ein string. I also noticed, when checking the generated code, that the faster and the slower versions call different underlying routines. Is this helpful for diagnosing the problem? |
To present this more precisely, please check the following MWE:
julia> using BenchmarkTools, OMEinsum
julia> @benchmark ein"abx,aim,bin->mnx"(rand(64,64,1), rand(64,2,64), rand(64,2,64))
BenchmarkTools.Trial: 92 samples with 1 evaluation.
Range (min … max): 48.270 ms … 395.540 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 49.050 ms ┊ GC (median): 0.00%
Time (mean ± σ): 54.857 ms ± 36.580 ms ┊ GC (mean ± σ): 0.00% ± 0.00%
█▅▁
███▇▅▅▅▁▁▁▁▅▁▅▁▅▅▁▁▅▁▁▁▁▁▁▅▁▁▁▅▁▁▁▁▁▁▁▁▁▁▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅ ▁
48.3 ms Histogram: log(frequency) by time 95.1 ms <
Memory estimate: 193.62 KiB, allocs estimate: 36.
julia> @benchmark ein"abx,(aim,bin)->mnx"(rand(64,64,1), rand(64,2,64), rand(64,2,64))
BenchmarkTools.Trial: 12 samples with 1 evaluation.
Range (min … max): 331.549 ms … 577.723 ms ┊ GC (min … max): 2.51% … 33.04%
Time (median): 446.449 ms ┊ GC (median): 10.68%
Time (mean ± σ): 447.993 ms ± 87.542 ms ┊ GC (mean ± σ): 20.32% ± 15.10%
██ █ █ █ █ █ █ █ █ █ █
██▁▁▁▁█▁▁▁▁▁█▁▁█▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁█▁▁▁▁▁█▁▁▁▁▁▁▁█▁▁▁█▁█▁▁▁▁▁▁█ ▁
332 ms Histogram: frequency by time 578 ms <
Memory estimate: 384.39 MiB, allocs estimate: 489.
julia> @benchmark ein"(abx,aim),bin->mnx"(rand(64,64,1), rand(64,2,64), rand(64,2,64))
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 99.975 μs … 10.920 ms ┊ GC (min … max): 0.00% … 98.15%
Time (median): 122.029 μs ┊ GC (median): 0.00%
Time (mean ± σ): 169.924 μs ± 348.072 μs ┊ GC (mean ± σ): 21.70% ± 12.25%
█▆▄▃▃▂▁ ▁
███████▇▆▅▄▄▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▄▅▆▆▄▄▅▅▅▅▅▄▄▅▄▅▃▅▄▅ █
100 μs Histogram: log(frequency) by time 1.63 ms <
Memory estimate: 397.56 KiB, allocs estimate: 325. |
I think the results here make sense:
julia> using BenchmarkTools, OMEinsum
julia> ein_1 = ein"abx,aim,bin->mnx"
abx, aim, bin -> mnx
julia> ein_2 = ein"abx,(aim,bin)->mnx"
abx, mnab -> mnx
├─ abx
└─ aim, bin -> mnab
├─ aim
└─ bin
julia> ein_3 = ein"(abx,aim),bin->mnx"
xmbi, bin -> mnx
├─ abx, aim -> xmbi
│ ├─ abx
│ └─ aim
└─ bin
julia> A, B, C = (rand(64,64,1), rand(64,2,64), rand(64,2,64));
julia> @benchmark $(ein_1)($A, $B, $C)
BenchmarkTools.Trial: 174 samples with 1 evaluation.
Range (min … max): 28.648 ms … 30.992 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 28.708 ms ┊ GC (median): 0.00%
Time (mean ± σ): 28.735 ms ± 189.006 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
▄▆▅▅▃█▂▃ ▁
▄██████████▇█▆▇▅▁▄▄▅▃▃▁▃▁▁▁▃▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃ ▃
28.6 ms Histogram: frequency by time 29.1 ms <
Memory estimate: 33.39 KiB, allocs estimate: 27.
julia> @benchmark $(ein_2)($A, $B, $C)
BenchmarkTools.Trial: 34 samples with 1 evaluation.
Range (min … max): 112.747 ms … 202.833 ms ┊ GC (min … max): 1.40% … 44.79%
Time (median): 122.213 ms ┊ GC (median): 8.70%
Time (mean ± σ): 148.203 ms ± 40.752 ms ┊ GC (mean ± σ): 24.70% ± 19.76%
▆█ ▃▁
██▄▄▁▁▁▁▄▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▆▁▁▁▁▄██ ▁
113 ms Histogram: frequency by time 203 ms <
Memory estimate: 384.24 MiB, allocs estimate: 480.
julia> @benchmark $(ein_3)($A, $B, $C)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 96.167 μs … 6.366 ms ┊ GC (min … max): 0.00% … 97.64%
Time (median): 105.083 μs ┊ GC (median): 0.00%
Time (mean ± σ): 114.678 μs ± 108.491 μs ┊ GC (mean ± σ): 5.63% ± 7.94%
██▇▅▂▁ ▂
█████████▇█▇▇▇▅▄▅▄▁▃▃▄▄▄▄▄▁▄▄▁▃▃▁▃▃▁▁▃▁▁▃▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▃▁▄ █
96.2 μs Histogram: log(frequency) by time 493 μs <
Memory estimate: 237.33 KiB, allocs estimate: 316.
julia> size_dict = Dict(['a' => 64, 'b' => 64, 'x' => 1, 'i' => 2, 'm' => 64, 'n' =>64]);
julia> contraction_complexity(ein_1, size_dict)
Time complexity: 2^25.0
Space complexity: 2^12.0
Read-write complexity: 2^14.584962500721156
julia> contraction_complexity(ein_2, size_dict)
Time complexity: 2^25.584962500721158
Space complexity: 2^24.0
Read-write complexity: 2^25.00105627463478
julia> contraction_complexity(ein_3, size_dict)
Time complexity: 2^20.0
Space complexity: 2^13.0
Read-write complexity: 2^15.321928094887362
It is clear that the second contraction order has larger space and time complexity. |
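As a cross-check, the reported time complexities can be reproduced by hand: each pairwise contraction costs the product of the sizes of all indices that appear in it. A sketch counting only multiply-adds (the variable names are mine):

```julia
# Reproduce the reported time complexities by counting, for each pairwise
# contraction, the product of the sizes of all indices involved.
sizes = Dict('a'=>64, 'b'=>64, 'x'=>1, 'i'=>2, 'm'=>64, 'n'=>64)
flops(ix) = prod(sizes[c] for c in ix)

t1 = flops("abximn")                  # ein_1: one loop nest over all six indices
t2 = flops("abimn") + flops("abxmn")  # ein_2: aim,bin->mnab then abx,mnab->mnx
t3 = flops("abxim") + flops("xmbin")  # ein_3: abx,aim->xmbi then xmbi,bin->mnx

log2.((t1, t2, t3))  # ≈ (25.0, 25.58, 20.0), matching contraction_complexity
```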
The reason here is not type instability; the difference in speed is due to the contraction order. If you want a good contraction order automatically, please try:
julia> @benchmark $(optein"abx,aim,bin->mnx")($(A), $(B), $(C))
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 115.000 μs … 6.201 ms ┊ GC (min … max): 0.00% … 97.14%
Time (median): 127.875 μs ┊ GC (median): 0.00%
Time (mean ± σ): 141.044 μs ± 135.872 μs ┊ GC (mean ± σ): 7.40% ± 9.37%
▆█▇▅▃▂▁▁ ▂
███████████▇▇▆▅▅▄▄▄▃▃▃▁▁▃▃▃▄▃▁▃▄▁▁▁▁▃▃▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▃▄▆█ █
115 μs Histogram: log(frequency) by time 558 μs <
Memory estimate: 334.91 KiB, allocs estimate: 352.
julia> optein"abx,aim,bin->mnx"
bxim, bin -> mnx
├─ abx, aim -> bxim
│ ├─ abx
│ └─ aim
└─ bin |
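For larger networks the same optimization can also be done programmatically rather than through the string macro. A sketch; I believe OMEinsum exposes optimize_code and the TreeSA optimizer (via OMEinsumContractionOrders), but treat the exact names as an assumption:

```julia
using OMEinsum

code = ein"abx,aim,bin->mnx"
size_dict = Dict('a'=>64, 'b'=>64, 'x'=>1, 'i'=>2, 'm'=>64, 'n'=>64)

# Search for a cheap pairwise contraction order for these index sizes;
# the optimized code is then called exactly like the original one.
opt = optimize_code(code, size_dict, TreeSA())

A, B, C = rand(64, 64, 1), rand(64, 2, 64), rand(64, 2, 64)
result = opt(A, B, C)  # same values as code(A, B, C), cheaper order
```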
Thanks! It is very clear! |
I have a function that computes the product of a square matrix along one dimension of an n-dimensional array, so the returned array has the same size as the passed array. Because the dimension over which to multiply is only known at runtime, I use EinCode. However, the result is not type-stable. Is there a good way to give OMEinsum more information so the compiler can figure out the return type? Or, more generally, what is the best way to contract over a single index shared by two arrays, where the index is only known at runtime?
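Since the matrix is square, the output has the same size (and hence the same type) as the input array, so this particular contraction can also be written type-stably without einsum at all: reshape the array so the target dimension sits in the middle and use plain matrix multiplication. A hand-rolled sketch (multiply_dim is a hypothetical name, not OMEinsum API):

```julia
# Hypothetical helper: contract square matrix M into dimension `dim` of V,
# i.e. out[..., i, ...] = sum_j M[i, j] * V[..., j, ...].
# Type-stable: the result has the same type and size as V, which the
# compiler can see from the input types alone.
function multiply_dim(M::AbstractMatrix, V::AbstractArray, dim::Integer)
    sz = size(V)
    left  = prod(sz[1:dim-1]; init=1)    # sizes before `dim`
    right = prod(sz[dim+1:end]; init=1)  # sizes after `dim`
    Vr  = reshape(V, left, sz[dim], right)
    out = similar(Vr)
    for k in axes(Vr, 3)
        # (left × n) * (n × n)ᵀ contracts the chosen dimension
        out[:, :, k] = Vr[:, :, k] * transpose(M)
    end
    return reshape(out, sz)
end
```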