
fixed a sync problem #15

Merged (5 commits into main on Sep 28, 2023)

Conversation

@ArrogantGao (Collaborator)

The main change is forcing a synchronization in the C code; with that, the problem mentioned in #10 is fixed (a sketch of the idea is shown after the benchmark output below). The results are:

julia> using GenericTensorNetworks, GenericTensorNetworks.Graphs

julia> using CUDA

julia> g = Graphs.random_regular_graph(200, 3)
{200, 300} undirected simple Int64 graph

julia> optimizer = TreeSA(ntrials=3)
TreeSA{Int64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, GreedyMethod{OMEinsumContractionOrders.MinSpaceOut}, Any}(20, 0.01:0.05:14.96, 3, 50, 1.0, 0.2, :greedy, 0, Any[], GreedyMethod{OMEinsumContractionOrders.MinSpaceOut}(OMEinsumContractionOrders.MinSpaceOut(), 1))

julia> gp = IndependentSet(g; optimizer=optimizer)
┌ Warning: target space complexity not found, got: 23.0, with time complexity 30.29749478594516, read-write complexity 25.905654474039626.
└ @ OMEinsumContractionOrders ~/.julia/packages/OMEinsumContractionOrders/WpwIz/src/treesa.jl:229
IndependentSet{OMEinsum.SlicedEinsum{Int64, OMEinsum.DynamicNestedEinsum{Int64}}, NoWeight}(OMEinsum.SlicedEinsum{Int64, OMEinsum.DynamicNestedEinsum{Int64}}(Int64[], 5, 5 -> 
├─ 147, 147∘5 -> 5
│  ├─ 147
│  └─ 147∘5, 5∘147 -> 147∘5
│     ├─ 182∘143∘147, 182∘143∘5 -> 147∘5
│     │  ├─ 147∘182, 143∘147 -> 182∘143∘147
│     │  │  ├─ 147∘182
│     │  │  └─ 143∘147
│     │  └─ 182∘136∘192∘143, 192∘136∘5 -> 182∘143∘5
│     │     ├─ 151∘178∘198∘33∘78∘146∘143∘196∘182, 33∘136∘78∘192∘146∘143∘198∘196∘178∘151 -> 182∘136∘192∘143
│     │     │  ⋮
│     │     │  
│     │     └─ 5∘192, 5∘136 -> 192∘136∘5
│     │        ⋮
│     │        
│     └─ 5∘147
└─ 5
), SimpleGraph{Int64}(300, [[40, 52, 126], [43, 113, 170], [10, 17, 97], [17, 96, 117], [136, 147, 192], [70, 75, 172], [26, 50, 144], [56, 139, 179], [177, 179, 192], [3, 56, 93]  …  [110, 122, 179], [5, 9, 142], [36, 45, 69], [75, 80, 127], [98, 102, 137], [54, 57, 119], [90, 125, 161], [15, 79, 163], [65, 77, 174], [18, 35, 47]]), NoWeight(), Dict{Int64, Int64}())

julia> contraction_complexity(gp)
Time complexity: 2^30.29749478594516
Space complexity: 2^23.0
Read-write complexity: 2^25.905654474039626

julia> @time CUDA.@sync solve(gp, SizeMax(); usecuda=true, T=Float32)
  4.446523 seconds (7.53 M allocations: 502.525 MiB, 3.03% gc time)
0-dimensional CuArray{Tropical{Float32}, 0, CUDA.Mem.DeviceBuffer}:
88.0ₜ

julia> using CuTropicalGEMM

julia> @time CUDA.@sync solve(gp, SizeMax(); usecuda=true, T=Float32)
  0.048345 seconds (121.74 k allocations: 6.030 MiB)
0-dimensional CuArray{Tropical{Float32}, 0, CUDA.Mem.DeviceBuffer}:
88.0ₜ

julia> @time CUDA.@sync solve(gp, SizeMax(); usecuda=true, T=Float32)
  0.053084 seconds (121.74 k allocations: 6.030 MiB)
0-dimensional CuArray{Tropical{Float32}, 0, CUDA.Mem.DeviceBuffer}:
88.0ₜ

julia> @time CUDA.@sync solve(gp, SizeMax(); usecuda=true, T=Float32)
  0.052982 seconds (121.74 k allocations: 6.030 MiB)
0-dimensional CuArray{Tropical{Float32}, 0, CUDA.Mem.DeviceBuffer}:
88.0ₜ

The result is now stable, and the @time macro works properly.
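
For context, the synchronization pattern behind the fix looks roughly like the following from the Julia side. This is only a minimal sketch: the actual change is in the C code of CuTropicalGEMM, and the function double! here is a made-up example.

using CUDA

function double!(y::CuVector{Float32})
    y .*= 2f0            # the broadcast launches a GPU kernel asynchronously
    CUDA.synchronize()   # force sync: block until the device has finished
    return y
end

y = CUDA.fill(1f0, 1024)
@time CUDA.@sync double!(y)   # timing is meaningful because the GPU work has completed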

I also removed the unused files .travis.yml and Artifacts.toml, as mentioned in #12.

@GiggleLiu (Member)

Please try this test case; I am afraid it is not fully fixed:

using GenericTensorNetworks, GenericTensorNetworks.Graphs
using CUDA
using Random; Random.seed!(6)
g = Graphs.random_regular_graph(200, 3)
item(x::AbstractArray) = Array(x)[]
optimizer = TreeSA(ntrials=1)
gp = IndependentSet(g; optimizer=optimizer)
contraction_complexity(gp)
@time CUDA.@sync solve(gp, SizeMax(); usecuda=true, T=Float32)
using CuTropicalGEMM
@time CUDA.@sync solve(gp, SizeMax(); usecuda=true, T=Float32)
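
For reference, the item helper above can be used to compare the GPU result against a CPU run; a minimal sketch, assuming solve also accepts T=Float32 without usecuda:

res_gpu = item(solve(gp, SizeMax(); usecuda=true, T=Float32))   # scalar Tropical{Float32} from the GPU
res_cpu = item(solve(gp, SizeMax(); T=Float32))                 # assumed CPU call with the same element type
@assert res_gpu == res_cpu                                      # both runs should report the same maximum size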

@ArrogantGao (Collaborator, Author)

> Please try this test case; I am afraid it is not fully fixed:
>
> using GenericTensorNetworks, GenericTensorNetworks.Graphs
> using CUDA
> using Random; Random.seed!(6)
> g = Graphs.random_regular_graph(200, 3)
> item(x::AbstractArray) = Array(x)[]
> optimizer = TreeSA(ntrials=1)
> gp = IndependentSet(g; optimizer=optimizer)
> contraction_complexity(gp)
> @time CUDA.@sync solve(gp, SizeMax(); usecuda=true, T=Float32)
> using CuTropicalGEMM
> @time CUDA.@sync solve(gp, SizeMax(); usecuda=true, T=Float32)
julia> using GenericTensorNetworks, GenericTensorNetworks.Graphs

julia> using CUDA

julia> using Random; Random.seed!(6)
TaskLocalRNG()

julia> g = Graphs.random_regular_graph(200, 3)
{200, 300} undirected simple Int64 graph

julia> item(x::AbstractArray) = Array(x)[]
item (generic function with 1 method)

julia> optimizer = TreeSA(ntrials=1)
TreeSA{Int64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, GreedyMethod{OMEinsumContractionOrders.MinSpaceOut}, Any}(20, 0.01:0.05:14.96, 1, 50, 1.0, 0.2, :greedy, 0, Any[], GreedyMethod{OMEinsumContractionOrders.MinSpaceOut}(OMEinsumContractionOrders.MinSpaceOut(), 1))

julia> gp = IndependentSet(g; optimizer=optimizer)
┌ Warning: target space complexity not found, got: 24.0, with time complexity 31.121747025566243, read-write complexity 26.325278911340753.
└ @ OMEinsumContractionOrders ~/.julia/packages/OMEinsumContractionOrders/WpwIz/src/treesa.jl:229
IndependentSet{OMEinsum.SlicedEinsum{Int64, OMEinsum.DynamicNestedEinsum{Int64}}, NoWeight}(OMEinsum.SlicedEinsum{Int64, OMEinsum.DynamicNestedEinsum{Int64}}(Int64[], 86, 86 -> 
├─ 86∘192, 192∘86 -> 86
│  ├─ 86∘192
│  └─ 158∘49∘192, 86∘158∘49 -> 192∘86
│     ├─ 192∘158, 192∘49 -> 158∘49∘192
│     │  ├─ 158∘192, 158 -> 192∘158
│     │  │  ├─ 158∘192, 192 -> 158∘192
│     │  │  │  ⋮
│     │  │  │  
│     │  │  └─ 158
│     │  └─ 49∘192, 49 -> 192∘49
│     │     ├─ 49∘192
│     │     └─ 49
│     └─ 157∘86∘158∘106, 157∘106∘49 -> 86∘158∘49
│        ├─ 158∘106∘151∘157∘199∘125∘86∘21∘74∘138∘68∘30∘60∘58∘57, 138∘60∘125∘30∘58∘68∘74∘151∘158∘57∘199∘21∘106 -> 157∘86∘158∘106
│        │  ├─ 102∘57∘44∘158∘47, 106∘47∘151∘157∘102∘199∘125∘86∘57∘21∘74∘138∘68∘30∘60∘58∘44 -> 158∘106∘151∘157∘199∘125∘86∘21∘74∘138∘68∘30∘60∘58∘57
│        │  │  ⋮
│        │  │  
│        │  └─ 138∘60∘125∘30∘58∘68∘74∘151∘158∘57∘35∘199, 21∘106∘35 -> 138∘60∘125∘30∘58∘68∘74∘151∘158∘57∘199∘21∘106
│        │     ⋮
│        │     
│        └─ 49∘157, 49∘106 -> 157∘106∘49
│           ├─ 49∘157
│           └─ 49∘106
└─ 86
), SimpleGraph{Int64}(300, [[44, 158, 182], [66, 126, 167], [67, 85, 113], [10, 76, 105], [145, 148, 171], [56, 104, 180], [22, 41, 72], [149, 166, 173], [72, 90, 174], [4, 69, 129]  …  [55, 63, 137], [49, 86, 158], [15, 127, 176], [36, 102, 176], [120, 134, 178], [32, 94, 118], [30, 96, 113], [28, 81, 165], [45, 98, 189], [71, 75, 108]]), NoWeight(), Dict{Int64, Int64}())

julia> contraction_complexity(gp)
Time complexity: 2^31.121747025566243
Space complexity: 2^24.0
Read-write complexity: 2^26.325278911340753

julia> @time CUDA.@sync solve(gp, SizeMax(); usecuda=true, T=Float32)
 36.767120 seconds (65.42 M allocations: 4.263 GiB, 4.67% gc time, 0.13% compilation time)
0-dimensional CuArray{Tropical{Float32}, 0, CUDA.Mem.DeviceBuffer}:
89.0ₜ

julia> using CuTropicalGEMM
[ Info: Precompiling CuTropicalGEMM [c2b282c3-c9c2-431d-80f7-a1a0561ebe55]

julia> @time CUDA.@sync solve(gp, SizeMax(); usecuda=true, T=Float32)
  0.470510 seconds (526.07 k allocations: 33.198 MiB, 6.24% gc time)
0-dimensional CuArray{Tropical{Float32}, 0, CUDA.Mem.DeviceBuffer}:
89.0ₜ

julia> @time CUDA.@sync solve(gp, SizeMax(); usecuda=true, T=Float32)
  0.042993 seconds (115.21 k allocations: 5.698 MiB)
0-dimensional CuArray{Tropical{Float32}, 0, CUDA.Mem.DeviceBuffer}:
89.0ₜ

julia> @time CUDA.@sync solve(gp, SizeMax(); usecuda=true, T=Float32)
  0.065605 seconds (115.21 k allocations: 5.698 MiB)
0-dimensional CuArray{Tropical{Float32}, 0, CUDA.Mem.DeviceBuffer}:
89.0ₜ

The test passed, and I also double-checked the result on another server.

Did you rebuild the binary after pulling?
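
For reference, a minimal sketch of rebuilding after a pull, assuming the package is dev'ed locally and ships a build step:

using Pkg
Pkg.build("CuTropicalGEMM")   # re-run the build step so the compiled binary matches the pulled C sources
Pkg.precompile()              # optionally precompile dependents against the rebuilt binary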

@GiggleLiu (Member) left a review:

I just verified the correctness. Good job! After fixing the Project.toml and CI, we can release v0.1.

Project.toml Outdated
@@ -6,6 +6,7 @@ version = "1.0.0-DEV"
[deps]
ArtifactUtils = "8b73e784-e7d8-4ea5-973d-377fed4e3bce"
Artifacts = "56f22d72-fd6d-98f1-02f0-08ddc0907c33"
BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"

@GiggleLiu (Member) commented:

Please remove all dependencies that are not directly used in src from the Project.toml.
If you want a test environment, please add a Project.toml file to the test folder; see this example: https://github.com/TensorBFS/TensorInference.jl/tree/main/test
TestEnv.jl can help you easily start a test environment for debugging.

Packages like BenchmarkTools should not be included in the package's main environment.
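
A minimal sketch of this workflow, assuming the test-only dependencies (such as BenchmarkTools) live in test/Project.toml rather than in the top-level Project.toml:

using TestEnv
TestEnv.activate("CuTropicalGEMM")   # activates an environment that also includes the deps from test/Project.toml
using BenchmarkTools                 # available here without being listed in the package's [deps]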

@GiggleLiu (Member) commented on Sep 28, 2023:

We need to get this PR merged before closing issue #10

Approval means you have permission to merge this PR directly.

@ArrogantGao (Collaborator, Author)

> We need to get this PR merged before closing issue #10

Oops, sorry about that, I have reopened issue #10.

@ArrogantGao merged commit fe574dd into main on Sep 28, 2023
@GiggleLiu mentioned this pull request on Sep 28, 2023