example of slow model generation #49
Comments
This is an interesting test case. It's a perfect example illustrating the cost of building the model one scalar constraint at a time.
Convex.jl parses this in just 0.3 seconds (with the same 0.11 second solve time) when the problem is written in vectorized form. The reason you'd want to use Convex.jl over JuMP is not for speed, but for the readability of the formulation. |
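A vectorized formulation along the lines discussed might look like the following sketch (an assumption on my part, not necessarily the exact code from the comment; the hinge losses are expressed with `Convex.pos`, and `pos`/`neg` are the 2×n data matrices from the benchmark):

```julia
using Convex

# Sketch of a vectorized SVM in Convex.jl: one expression per class
# instead of one constraint per data point. Hypothetical helper name.
function svm_convex_vectorized(pos, neg, C)
    w = Variable(size(pos, 1))
    b = Variable()
    # Hinge losses built as single vector operations over all points:
    loss_pos = sum(Convex.pos(1 - (pos' * w - b)))
    loss_neg = sum(Convex.pos(1 + (neg' * w - b)))
    return minimize(sumsquares(w) + C * loss_pos + C * loss_neg)
end
```

Because the whole problem is a handful of vector expressions, the parser only builds a few large sparse coefficient matrices rather than thousands of tiny ones.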
Hah, that's pretty neat. |
I agree that's a nicer formulation. I'm hoping there's a way to get the best of both worlds, where you'd choose a vectorized formulation because it's clearer and not because it's needed for speed (which is very un-Julian). Anyway, even without major architectural changes, I'd bet there are a few bottlenecks in the scalar case that could be improved without too much effort. |
The most terrible thing that we're doing in the scalar case is building a sparse selector matrix for every indexing operation. For a = Variable(10), indexing constructs a sparse matrix which, when multiplied by a, selects out the requested entry; similarly, for a = Variable(10,20), we form the sparse matrix which, when multiplied by vec(a), would produce the requested entries. I'm sure that we could use some insights from JuMP here...
Madeleine Udell |
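The selector-matrix approach described above can be sketched as follows (an illustrative sketch using only the SparseArrays standard library, not Convex.jl's actual internals):

```julia
using SparseArrays

# Vector case: to pick out entry i of a length-n variable, build a 1×n
# sparse matrix with a single 1 in column i, then multiply.
n, i = 10, 3
x = collect(1.0:n)                  # stand-in for the variable's value
E = sparse([1], [i], [1.0], 1, n)   # 1×n selector matrix
y = E * x                           # one-element vector holding x[i]

# Matrix case: to pick out entry (i, j) of a 10×20 variable, the selector
# acts on vec(a), so the column index is the linear (column-major) index.
A = reshape(collect(1.0:200), 10, 20)
i, j = 4, 7
Evec = sparse([1], [(j - 1) * 10 + i], [1.0], 1, 200)
z = Evec * vec(A)                   # one-element vector holding A[i, j]
```

Doing this once per scalar index means allocating and multiplying thousands of tiny sparse matrices, which is where the model-generation time goes.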
Most users won't know the "best way to do things", and indexing along with stacking are probably our slowest operations, so we definitely need to work on it. @davidlizeng did some clever things to make indexing faster for LLS, and I believe he's looking into adding that to Convex.jl. We'll see how much faster that makes this example. |
Are you using |
Expressions in Convex.jl are maps from variable objects to coefficients, which are SparseMatrixCSC. To index into the expression, we do one of the following
If we are selecting lots of rows scattered all over the sparse matrix, method 2 is usually better, while selecting a few rows, or a chunk of consecutive rows, usually leads to method 1 being better. @mlubin @IainNZ Are there other ways to select rows out of SparseMatrixCSC objects that you think might be faster? |
SparseMatrixCSC is efficient for selecting columns but not rows. If this is the bottleneck operation then I'd recommend using CSR format (compressed sparse row), where row slices are a linear-time lookup. There's an open PR for this (JuliaLang/julia#7029), but if you just want to experiment, CSR is equivalent to CSC with the dimensions flipped. |
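The flipped-dimensions trick is easy to try directly (a minimal sketch using only the SparseArrays standard library; the variable names are illustrative):

```julia
using SparseArrays

# CSC stores columns contiguously, so slicing a row must scan every
# column. A poor man's CSR: materialize the transpose once, after which
# row i of A is column i of At, which CSC slices cheaply.
A  = sprand(1000, 1000, 0.01)
At = copy(A')             # one-time transpose into CSC storage
row_slow = A[5, :]        # row slice on CSC: touches all columns
row_fast = At[:, 5]       # column slice on the transpose: cheap
```

The one-time transpose pays off when many row slices are taken from the same matrix, which is exactly the indexing pattern described above.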
For the original Convex.jl formulation, I now get:

julia> using Distributions
julia> using LinearAlgebra
julia> using Gurobi
julia> import Convex # namespace conflicts with JuMP.Variable()
julia> function gen_data(n)
pos = rand(MvNormal([1.0,2.0],1.0),n)
neg = rand(MvNormal([-1.0,1.0],1.0),n)
return pos,neg
end
gen_data (generic function with 1 method)
julia> const N = 2
2
julia> const C = 10
10
julia> function svm_convex(pos,neg,solver)
w = Convex.Variable(N)
b = Convex.Variable()
ξpos = Convex.Variable(size(pos,2), Convex.Positive())
ξneg = Convex.Variable(size(neg,2), Convex.Positive())
problem = Convex.minimize(Convex.sumsquares(w) + C*sum(ξpos) + C*sum(ξneg))
for j in 1:size(pos,2)
push!(problem.constraints, dot(w,pos[:,j]) - b >= 1 - ξpos[j])
#problem.constraints += dot(w,pos[:,j]) - b >= 1 - ξpos[j]
end
for j in 1:size(neg,2)
push!(problem.constraints, -1*(dot(w,neg[:,j]) - b) >= 1-ξneg[j])
#problem.constraints += -1*(dot(w,neg[:,j]) - b) >= 1-ξneg[j]
end
Convex.solve!(problem, solver)
return Convex.evaluate(w), Convex.evaluate(b)
end
svm_convex (generic function with 1 method)
julia> pos,neg = gen_data(2000)
([1.9549836788159067 1.7171091461123682 … 0.0857034301415448 2.217107664293599; 0.9142782414570454 3.391383891929138 … 2.150346162128055 3.0846956783486137], [-2.3747735768967484 -0.9632429519263247 … -1.9632219657972412 -1.1641815583798856; 0.5275107809723425 1.4908871156989485 … 0.5244168621220339 1.5478250107179292])
julia> @time w, b = svm_convex(pos,neg, Gurobi.Optimizer)
Academic license - for non-commercial use only - expires 2022-04-15
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (mac64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 8008 rows, 4012 columns and 24014 nonzeros
Model fingerprint: 0x7e0bd539
Model has 2 quadratic constraints
Coefficient statistics:
Matrix range [4e-05, 1e+01]
QMatrix range [1e+00, 1e+00]
Objective range [1e+00, 1e+00]
Bounds range [0e+00, 0e+00]
RHS range [1e+00, 1e+00]
Presolve removed 4002 rows and 1 columns
Presolve time: 0.01s
Presolved: 4006 rows, 4011 columns, 16012 nonzeros
Presolved model has 2 second-order cone constraints
Ordering time: 0.00s
Barrier statistics:
Dense cols : 3
Free vars : 3
AA' NZ : 1.201e+04
Factor NZ : 1.604e+04 (roughly 3 MBytes of memory)
Factor Ops : 6.421e+04 (less than 1 second per iteration)
Threads : 1
Objective Residual
Iter Primal Dual Primal Dual Compl Time
0 2.63936156e+05 1.99810454e+04 2.60e+00 2.01e+04 4.58e+01 0s
1 9.88964802e+04 1.50954405e+04 8.39e-01 1.18e+04 1.92e+01 0s
2 7.05989167e+04 1.31270445e+04 5.37e-01 8.15e+03 1.32e+01 0s
3 1.87214697e+04 9.95205410e+03 1.60e-02 1.85e+03 2.36e+00 0s
4 1.42635612e+04 1.04746351e+04 3.75e-03 6.88e+02 8.44e-01 0s
5 1.35207428e+04 1.08936853e+04 2.28e-03 4.30e+02 5.35e-01 0s
6 1.31176886e+04 1.11765792e+04 1.52e-03 3.04e+02 3.79e-01 0s
7 1.28434221e+04 1.13534443e+04 1.05e-03 2.41e+02 2.89e-01 0s
8 1.26922566e+04 1.14491159e+04 6.67e-04 2.10e+02 2.42e-01 0s
9 1.24852995e+04 1.15827954e+04 1.85e-04 1.67e+02 1.78e-01 0s
10 1.24222926e+04 1.16773917e+04 2.93e-06 1.39e+02 1.48e-01 0s
11 1.23650014e+04 1.18138275e+04 2.06e-06 1.01e+02 1.07e-01 0s
12 1.23479331e+04 1.18425480e+04 1.61e-06 9.31e+01 9.81e-02 0s
13 1.23213208e+04 1.19401568e+04 6.73e-07 6.91e+01 7.31e-02 0s
14 1.23100141e+04 1.19706912e+04 7.35e-07 6.31e+01 6.55e-02 0s
15 1.22990185e+04 1.19938315e+04 6.03e-07 5.74e+01 5.89e-02 0s
16 1.22883280e+04 1.20136702e+04 5.36e-07 5.24e+01 5.31e-02 0s
17 1.22875077e+04 1.20302735e+04 2.69e-07 4.83e+01 4.96e-02 0s
18 1.22798852e+04 1.20369287e+04 2.35e-07 4.63e+01 4.69e-02 0s
19 1.22707587e+04 1.20705517e+04 2.19e-07 3.82e+01 3.84e-02 0s
20 1.22663177e+04 1.21011822e+04 1.31e-07 3.10e+01 3.15e-02 0s
21 1.22610991e+04 1.21091698e+04 2.17e-07 2.91e+01 2.92e-02 0s
22 1.22532029e+04 1.21262368e+04 6.86e-08 2.49e+01 2.45e-02 0s
23 1.22488727e+04 1.21461177e+04 2.03e-07 2.07e+01 1.99e-02 0s
24 1.22462311e+04 1.21653762e+04 1.19e-07 1.67e+01 1.58e-02 0s
25 1.22442693e+04 1.21733717e+04 1.31e-07 1.50e+01 1.40e-02 0s
26 1.22431147e+04 1.21791383e+04 7.82e-08 1.36e+01 1.26e-02 0s
27 1.22420458e+04 1.21867057e+04 9.59e-08 1.19e+01 1.09e-02 0s
28 1.22403342e+04 1.21968617e+04 7.17e-08 9.47e+00 8.63e-03 0s
29 1.22399947e+04 1.22074753e+04 6.60e-08 6.93e+00 6.40e-03 0s
30 1.22393923e+04 1.22135922e+04 4.19e-08 5.65e+00 5.12e-03 0s
31 1.22390998e+04 1.22204954e+04 2.83e-08 4.25e+00 3.75e-03 0s
32 1.22388921e+04 1.22239764e+04 1.39e-07 3.39e+00 3.00e-03 0s
33 1.22383519e+04 1.22266122e+04 6.80e-08 2.71e+00 2.37e-03 0s
34 1.22380912e+04 1.22290447e+04 4.16e-08 2.07e+00 1.82e-03 0s
35 1.22379909e+04 1.22332853e+04 4.39e-08 1.11e+00 9.59e-04 0s
36 1.22377215e+04 1.22364936e+04 8.75e-09 2.98e-01 2.53e-04 0s
37 1.22376906e+04 1.22376527e+04 2.66e-08 8.89e-03 7.71e-06 0s
38 1.22376903e+04 1.22376637e+04 1.59e-09 6.10e-03 5.36e-06 0s
39 1.22376896e+04 1.22376663e+04 3.99e-09 5.46e-03 4.73e-06 0s
40 1.22376892e+04 1.22376881e+04 9.96e-09 4.44e-09 1.44e-07 0s
Barrier solved model in 40 iterations and 0.10 seconds
Optimal objective 1.22376892e+04
User-callback calls 123, time in user-callback 0.00 sec
1.047763 seconds (3.16 M allocations: 677.923 MiB, 22.29% gc time, 6.96% compilation time)
([1.3915131521712125, 0.7011441389283813], 1.075629207371045)

So better than 6.5 seconds, but still much slower than the solve time of Gurobi. |
Closing because this has been improved. It could probably still be better, but that's a much larger job. |
I was playing around with implementing a basic linear SVM in both Convex.jl and JuMP and noticed a pretty significant penalty for model generation in Convex.jl:
Both give approximately the same answer. For JuMP on my machine, Gurobi solves in 0.09 seconds and the whole thing is done in 0.10 seconds. For Convex.jl, Gurobi solves in 0.11 seconds but the total time is 6.5 seconds.
I know model generation time hasn't been as much of a priority here as it has been for JuMP, and Convex.jl has to do more work in principle since it performs a lot of automatic transformations. Anyway, I wanted to share this as a benchmark that could be used to speed up Convex.jl.
CC @IainNZ @joehuchette
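For reference, the JuMP side of the benchmark might look like the following sketch (written in modern JuMP syntax rather than the 2015 macros; `svm_jump` is a hypothetical name, and `N`, `C`, `pos`, `neg` are as in the transcript above):

```julia
using JuMP, LinearAlgebra
import Gurobi

const N = 2
const C = 10

# Scalar-constraint SVM in JuMP, mirroring the Convex.jl loop version.
function svm_jump(pos, neg)
    model = Model(Gurobi.Optimizer)
    @variable(model, w[1:N])
    @variable(model, b)
    @variable(model, ξpos[1:size(pos, 2)] >= 0)
    @variable(model, ξneg[1:size(neg, 2)] >= 0)
    @objective(model, Min, sum(w .^ 2) + C * sum(ξpos) + C * sum(ξneg))
    @constraint(model, [j in 1:size(pos, 2)],
        dot(w, pos[:, j]) - b >= 1 - ξpos[j])
    @constraint(model, [j in 1:size(neg, 2)],
        -(dot(w, neg[:, j]) - b) >= 1 - ξneg[j])
    optimize!(model)
    return value.(w), value(b)
end
```

JuMP's macros build each affine constraint directly, which is why its model-generation time stays close to the raw Gurobi solve time in this benchmark.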