Poor performance in complementarity models #3729
I've made two small changes in #3730 and #3731. The biggest issue is terms like `RK[r,s] * (1 + tk[r,s]) / (1 + tk0[r])`. This ends up making a bunch of intermediate quadratic expressions:
a::VariableRef = RK[r,s]
b::AffExpr = 1 + tk[r,s] # Allocates
c::QuadExpr = a * b # Allocates
d::Float64 = 1 + tk0[r]
e::QuadExpr = c / d # Allocates
We do this by default because the usual call is some large summation, where we will keep building in-place. But in nonlinear expressions, we're probably building up some expression graphs that are relatively simple. I wonder if we need a way of opting out of affine and quadratic simplifications. |
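A pure-Julia analogy of this allocation pattern (a hypothetical sketch with invented names, not JuMP's actual types): every operation on an immutable result allocates a fresh object, whereas a long summation can reuse one mutable accumulator, which is why the in-place default pays off for large sums but not for small nonlinear subexpressions:

```julia
# Hypothetical sketch: a tiny "expression" type where every `+` allocates,
# mirroring how `a * b` above returns a fresh QuadExpr.
struct Expr1                 # immutable: + must allocate a new Expr1
    coef::Vector{Float64}
end
Base.:+(x::Expr1, y::Expr1) = Expr1(x.coef .+ y.coef)

# Allocating pattern: N - 1 temporaries for an N-term sum.
terms = [Expr1([1.0, 2.0]) for _ in 1:100]
s1 = reduce(+, terms)

# In-place pattern: one accumulator, mutated as we go (roughly what JuMP's
# add_to_expression! does for large affine/quadratic summations).
acc = zeros(2)
for t in terms
    acc .+= t.coef
end
@assert s1.coef == acc
```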
There are also possible changes like: This |
That brings it down to:
julia> include("/tmp/mitch/household_model_mcp.jl");
0.936928 seconds (1.37 M allocations: 138.243 MiB, 6.48% gc time, 72.37% compilation time)
0.229130 seconds (1.24 M allocations: 129.766 MiB, 16.21% gc time) |
@mitchphillipson, your model contains lines like these (why divide, only to multiply back later?). I might write your loop as:
for r in R, s in S
if sum(ys0[r, s, g] for g in G) ≈ 0
continue
end
theta_PL_Y_VA = ifelse(
ld0[r,s] != 0,
ld0[r,s] / (ld0[r,s] + kd0[r,s] * (1 + tk0[r])),
0.0,
)
PI_Y_VA = PL[r]^theta_PL_Y_VA * (RK[r,s] * (1 + tk[r,s]) / (1 + tk0[r]))^(1 - theta_PL_Y_VA)
@constraint(
household,
sum(PA[r,g] * id0[r,g,s] for g in G) +
ld0[r,s] * PI_Y_VA +
kd0[r,s] * (1 + tk0[r]) * PI_Y_VA -
sum(PY[r,g] * ys0[r,g,s] for g in G) * (1 - ty[r,s]) ⟂ Y[r,s]
)
end |
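As an aside on the `ifelse` guard in `theta_PL_Y_VA`: it matters when both `ld0[r,s]` and `kd0[r,s]` are zero, where the ratio would be `0/0 = NaN`. A standalone check with hypothetical numbers (not data from the model):

```julia
# Labor's share of value added, guarded against ld0 == 0, as in the loop above.
share(ld0, kd0, tk0) = ld0 != 0 ? ld0 / (ld0 + kd0 * (1 + tk0)) : 0.0

@assert share(0.6, 0.4, 0.32) ≈ 0.6 / (0.6 + 0.4 * 1.32)
@assert share(0.0, 0.0, 0.32) == 0.0      # guarded: would be 0/0 otherwise
@assert isnan(0.0 / (0.0 + 0.0 * 1.32))   # the unguarded expression is NaN
```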
using JuMP
import PATHSolver
import Random  # used by Random.seed! below
function main()
R = [Symbol("R_$i") for i in 1:51]
S = G = [Symbol("S_$i") for i in 1:71]
ys0 = Containers.DenseAxisArray(rand(length(R), length(S), length(G)), R, S, G)
ys0_n = length(ys0)
ys0.data[rand(1:ys0_n, floor(Int, 0.8 * ys0_n))] .= 0.0
id0 = Containers.DenseAxisArray(rand(length(R), length(S), length(G)), R, S, G)
ld0 = Containers.DenseAxisArray(rand(length(R), length(S)), R, S)
ld0.data[rand(1:length(ld0), 20)] .= 0.0
kd0 = Containers.DenseAxisArray(rand(length(R), length(S)), R, S)
tk0 = Containers.DenseAxisArray(fill(0.32, length(R)), R)
household = Model(PATHSolver.Optimizer)
@variables(household, begin
ty[r in R, s in S], (start = 0.5)
tk[R, S]
Y[R, S] >= 0, (start = 1)
PA[R, G] >= 0, (start = 1)
PY[R, G] >= 0, (start = 1)
RK[R, S] >= 0, (start = 1)
PL[R] >= 0, (start = 1)
end)
@expressions(household, begin
theta_PL_Y_VA[r in R, s in S],
ifelse(
ld0[r,s] != 0,
ld0[r,s] / (ld0[r,s] + kd0[r,s] * (1 + tk0[r])),
0.0,
)
PI_Y_VA[r in R, s in S],
PL[r]^theta_PL_Y_VA[r,s] * (RK[r,s] * (1 + tk[r,s]) / (1 + tk0[r]))^(1 - theta_PL_Y_VA[r,s])
I_PL_Y[r in R, s in S],
ld0[r,s] * PI_Y_VA[r,s] / PL[r]
I_RK_Y[r in R, s in S],
kd0[r,s] * (1 + tk0[r]) * PI_Y_VA[r,s] / (RK[r,s] * (1 + tk[r,s]))
end)
@constraint(
household,
[r in R, s in S; sum(ys0[r, s, g] for g in G) > 0],
sum(PA[r,g] * id0[r,g,s] for g in G) +
I_PL_Y[r,s] * PL[r] +
I_RK_Y[r,s] * RK[r,s] * (1 + tk[r,s]) -
sum(PY[r,g] * ys0[r,g,s] * (1 - ty[r,s]) for g in G) ⟂ Y[r,s]
)
return household
end
function main2()
op_div = NonlinearOperator(/, :/)
R = [Symbol("R_$i") for i in 1:51]
S = G = [Symbol("S_$i") for i in 1:71]
ys0 = Containers.DenseAxisArray(rand(length(R), length(S), length(G)), R, S, G)
ys0_n = length(ys0)
ys0.data[rand(1:ys0_n, floor(Int, 0.8 * ys0_n))] .= 0.0
id0 = Containers.DenseAxisArray(rand(length(R), length(S), length(G)), R, S, G)
ld0 = Containers.DenseAxisArray(rand(length(R), length(S)), R, S)
ld0.data[rand(1:length(ld0), 20)] .= 0.0
kd0 = Containers.DenseAxisArray(rand(length(R), length(S)), R, S)
tk0 = Containers.DenseAxisArray(fill(0.32, length(R)), R)
household = Model(PATHSolver.Optimizer)
@variables(household, begin
ty[r in R, s in S], (start = 0.5)
tk[R, S]
Y[R, S] >= 0, (start = 1)
PA[R, G] >= 0, (start = 1)
PY[R, G] >= 0, (start = 1)
RK[R, S] >= 0, (start = 1)
PL[R] >= 0, (start = 1)
end)
@expressions(household, begin
theta_PL_Y_VA[r in R, s in S],
ifelse(
ld0[r,s] != 0,
ld0[r,s] / (ld0[r,s] + kd0[r,s] * (1 + tk0[r])),
0.0,
)
PI_Y_VA[r in R, s in S],
PL[r]^theta_PL_Y_VA[r,s] * op_div(RK[r,s] * (1 + tk[r,s]), 1 + tk0[r])^(1 - theta_PL_Y_VA[r,s])
I_PL_Y[r in R, s in S],
ld0[r,s] * PI_Y_VA[r,s]
I_RK_Y[r in R, s in S],
kd0[r,s] * (1 + tk0[r]) * PI_Y_VA[r,s]
end)
@constraint(
household,
[r in R, s in S; sum(ys0[r, s, g] for g in G) > 0],
sum(PA[r,g] * id0[r,g,s] for g in G) +
I_PL_Y[r,s] +
I_RK_Y[r,s] -
sum(PY[r,g] * ys0[r,g,s] for g in G) * (1 - ty[r,s]) ⟂ Y[r,s]
)
return household
end
GC.gc(); @time main();
GC.gc(); @time main();
GC.gc(); @time main();
GC.gc(); @time main2();
GC.gc(); @time main2();
GC.gc(); @time main2();
which is:
julia> include("household_model_mcp.jl");
1.072836 seconds (5.07 M allocations: 432.596 MiB, 9.38% gc time, 58.08% compilation time)
0.408449 seconds (4.94 M allocations: 424.035 MiB, 16.76% gc time)
0.454981 seconds (4.94 M allocations: 424.209 MiB, 15.82% gc time)
0.777681 seconds (1.12 M allocations: 123.002 MiB, 3.35% gc time, 74.64% compilation time)
0.181090 seconds (992.88 k allocations: 115.049 MiB, 8.62% gc time)
0.205074 seconds (992.92 k allocations: 115.103 MiB, 9.65% gc time)
So I think some of this is just writing a "better" model. |
There are a couple of reasons the model is written the way it is.
This is pretty great though, and I'll implement these changes on the larger model. A couple of question-comments:
I'll work to implement these suggestions in both my MCP code and the backend of MPSGE. In general, writing explicit MCP code is tedious and error-prone, but it can lead to faster code through these optimizations. We plan to have people interact with MPSGE for its modelling simplicity. Thanks for the suggestions! Mitch |
I would factor out some common helper functions that let you write the simplifications once:
utility(x, y, a) = x^a * y^(1 - a)
function utility(x1, x2, x3, s1, s2)
if s2 isa Real && isone(s2)
return utility(x1, x2, s1)
end
return (x1 / x2)^s1 * (x2 / x3)^s2 * x3
end
Questions:
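As a quick numeric sanity check of the helper (hypothetical values, not from the thread): when `s2 == 1`, the `x3` factors cancel and the three-input method collapses to the two-input Cobb-Douglas form, which is what the `isone(s2)` fast path exploits:

```julia
utility(x, y, a) = x^a * y^(1 - a)
function utility(x1, x2, x3, s1, s2)
    if s2 isa Real && isone(s2)
        return utility(x1, x2, s1)
    end
    return (x1 / x2)^s1 * (x2 / x3)^s2 * x3
end

# With s2 == 1: (x1/x2)^s1 * (x2/x3) * x3 == x1^s1 * x2^(1 - s1).
x1, x2, x3 = 2.0, 3.0, 5.0
@assert utility(x1, x2, x3, 0.4, 1.0) ≈ (x1 / x2)^0.4 * (x2 / x3) * x3
@assert utility(x1, x2, x3, 0.4, 1.0) ≈ utility(x1, x2, 0.4)
```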
|
Using #3732:
function main3()
Random.seed!(1234)
R = [Symbol("R_$i") for i in 1:51]
S = G = [Symbol("S_$i") for i in 1:71]
ys0 = Containers.DenseAxisArray(rand(length(R), length(S), length(G)), R, S, G)
ys0_n = length(ys0)
ys0.data[rand(1:ys0_n, floor(Int, 0.8 * ys0_n))] .= 0.0
id0 = Containers.DenseAxisArray(rand(length(R), length(S), length(G)), R, S, G)
ld0 = Containers.DenseAxisArray(rand(length(R), length(S)), R, S)
ld0.data[rand(1:length(ld0), 20)] .= 0.0
kd0 = Containers.DenseAxisArray(rand(length(R), length(S)), R, S)
tk0 = Containers.DenseAxisArray(fill(0.32, length(R)), R)
household = Model(PATHSolver.Optimizer)
@variables(household, begin
ty[r in R, s in S], (start = 0.5)
tk[R, S]
Y[R, S] >= 0, (start = 1)
PA[R, G] >= 0, (start = 1)
PY[R, G] >= 0, (start = 1)
RK[R, S] >= 0, (start = 1)
PL[R] >= 0, (start = 1)
end)
pow_utility(x, y, a) = x^a * y^(1 - a)
for r in R, s in S
if sum(ys0[r, s, g] for g in G) == 0
continue
end
theta_PL_Y_VA = ifelse(
ld0[r,s] != 0,
ld0[r,s] / (ld0[r,s] + kd0[r,s] * (1 + tk0[r])),
0.0,
)
PI_Y_VA = pow_utility(
PL[r],
RK[r,s] * @nl(1 + tk[r,s]) / (1 + tk0[r]),
theta_PL_Y_VA,
)
I_PL_Y = ld0[r,s] * PI_Y_VA / PL[r]
I_RK_Y = kd0[r,s] * (1 + tk0[r]) * PI_Y_VA / (RK[r,s] * @nl(1 + tk[r,s]))
@constraint(
household,
sum(PA[r,g] * id0[r,g,s] for g in G) +
I_PL_Y * PL[r] +
I_RK_Y * RK[r,s] * @nl(1 + tk[r,s]) -
sum(PY[r,g] * ys0[r,g,s] for g in G) * @nl(1 - ty[r,s]) ⟂ Y[r,s]
)
end
return household
end
yields
|
So it's a bit annoying that small changes to the model can have a big impact on runtime. But I don't really see a way to avoid this. Nonlinear is pretty tightly coupled to the expression graph you generate, so there is (and perhaps should be) a big difference between the formulations. If profiling throws up missing methods, we should obviously fix that. Otherwise, one option is #3732. This is kinda like the GAMS summation issue again. JuMP asks more of the developer to get performance, but it also offers more ways of writing the same thing. Here's the pprof from the original, and here's the revised version (profile images omitted). I think a lot of the time is accesses of a DenseAxisArray. |
We should try profiling with a call to |
The culprit is https://github.com/jump-dev/MathOptInterface.jl/blob/cab197cc755904d8e7704a67df8280e458c486e8/src/Nonlinear/parse.jl#L265-L274
If there are lots of small AffExpr, then the cost of converting them to a ScalarNonlinearFunction and then parsing using a stack is high. We could write them out directly. |
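A rough pure-Julia illustration of the trade-off (hypothetical code, not MOI's actual implementation): a generic tree walk handles arbitrary nesting but does per-node work, while the common two-term affine shape can be written out directly:

```julia
# A tiny nested-expression stand-in: (op, left, right) tuples or plain numbers.
expr = (:+, 1.0, (:*, 2.0, 3.0))

# Generic walk, as a parser for arbitrary expression trees must do:
# destructure each node, recurse, and dispatch on the operator.
function eval_tree(e)
    e isa Number && return e
    op, l, r = e
    lv, rv = eval_tree(l), eval_tree(r)
    return op === :+ ? lv + rv : lv * rv
end

# Direct special case for the common "a + b * c" shape: no tree, no dispatch.
eval_affine(a, b, c) = a + b * c

@assert eval_tree(expr) == eval_affine(1.0, 2.0, 3.0) == 7.0
```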
A big improvement here: jump-dev/MathOptInterface.jl#2487. Now the profile is much more reasonable (image omitted). |
I think the big block of |
Part of the problem is this horrific block:
|
@mitchphillipson if I do:
@variable(household, __PI_KS__)
@constraint(household, sum(betaks[r,s]*RK[r,s]^(1+etaK) for r∈R,s∈S)^(1/(1+etaK)) - __PI_KS__ ⟂ __PI_KS__)
@expressions(household, begin
# PI_KS, sum(betaks[r,s]*RK[r,s]^(1+etaK) for r∈R,s∈S)^(1/(1+etaK))
O_KS_RK[r=R,s=S], kd0[r,s]*(RK[r,s]/__PI_KS__)^etaK
I_RKS_KS, sum(kd0[r,s] for r∈R,s∈S)
end)
Then your original model runs in 50 seconds, which seems more in line with GAMS. The wider issue is that we don't have a common subexpression elimination phase yet, so |
It's an interesting solution, introducing a dummy variable/constraint. I haven't been able to test your changes, but by "runs" do you mean "builds the constraints" or "optimizes"? |
I wrapped your entire script in a function. Including
set_attribute(household, "cumulative_iteration_limit", 0)
optimize!(household)
results in:
julia> include("household_model_mcp.jl")
Reading options file /var/folders/bg/dzq_hhvx1dxgy6gb5510pxj80000gn/T/jl_mTrTb0
> cumulative_iteration_limit 0
Read of options file complete.
Path 5.0.03 (Fri Jun 26 09:58:07 2020)
Written by Todd Munson, Steven Dirkse, Youngdae Kim, and Michael Ferris
Preprocessed size : 23502
Major Iteration Log
major minor func grad residual step type prox inorm (label)
0 0 1 1 nan I 0.0e+00 1.0e+05 (mkt_PF)
Major Iterations. . . . 0
Minor Iterations. . . . 0
Restarts. . . . . . . . 0
Crash Iterations. . . . 0
Gradient Steps. . . . . 0
Function Evaluations. . 1
Gradient Evaluations. . 1
Basis Time. . . . . . . 0.001942
Total Time. . . . . . . 1.436610
Residual. . . . . . . . 1.000000e+20
Postsolved residual: nan
48.493215 seconds (88.03 M allocations: 12.604 GiB, 45.03% gc time, 16.20% compilation time)
I'll email it to you. |
Wow. That running in less than a minute is wild; I don't think GAMS does it that fast. I'll be digging into this tomorrow. Truly impressive work. |
GAMS will probably similarly benefit from the __PI_KS__ trick. That function is just truly nasty. I think we need to pay more attention to the computational form of the functions, rather than their mathematical form. A good example, even if it doesn't really make a runtime difference (because it's just Float64), is:
@expression(household, betaks[r=R,s=S],
    kd0[r,s]/sum(kd0[rr,ss] for rr∈R,ss∈S)
)
This sums the denominator sum(kd0[rr,ss] for rr∈R,ss∈S) separately for every r in R and s in S. That's the same quadratic runtime in $(|R|\times|S|)$. You could instead do:
betaks_denominator = sum(kd0)
@expression(household, betaks[r=R,s=S], kd0[r,s] / betaks_denominator)
or even:
betaks_denominator = sum(kd0)
betaks = kd0 ./ betaks_denominator |
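A standalone sketch of the difference (hypothetical data, sized like the model's 51 regions and 71 sectors): the comprehension recomputes the sum once per entry, quadratic total work, while hoisting computes it once:

```julia
kd0 = rand(51, 71)

# Recomputes sum(kd0) for every (r, s): O((51 * 71)^2) additions in total.
betaks_slow = [kd0[r, s] / sum(kd0[rr, ss] for rr in 1:51, ss in 1:71)
               for r in 1:51, s in 1:71]

# Hoisted: the denominator is a plain Float64 computed once.
betaks_denominator = sum(kd0)
betaks_fast = kd0 ./ betaks_denominator

@assert betaks_slow ≈ betaks_fast
```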
Yes, that's absolutely true. And now that I know what to look for, it's very obvious. Part of the issue is that on a smaller model it's irrelevant, and every new model gets bigger.
Seriously can't show enough appreciation. This is spectacular.
…On Wed, Apr 17, 2024, 4:49 PM Oscar Dowson ***@***.***> wrote:
GAMS will probably similarly benefit from the __PI_KS__ trick. That
function is just truly nasty.
I think we need to pay more attention to the computational form of the
functions, rather than their mathematical form.
A good example, even if it doesn't really make a runtime difference
(because it's just Float64), is:
@expression(household, betaks[r=R,s=S],
kd0[r,s]/sum(kd0[rr,ss] for rr∈R,ss∈S)
)
This sums the denominator sum(kd0[rr,ss] for rr∈R,ss∈S) separately for
every r in R and s in S. That's the same quadratic runtime in
$(|R|\times|S|)$.
You could instead do:
betaks_denominator = sum(kd0)
@expression(household, betaks[r=R,s=S], kd0[r,s] / betaks_denominator)
or even
betaks_denominator = sum(kd0)
betaks = kd0 ./ betaks_denominator
|
It might help if I explain my workflow for diagnosing this:
The |
Closing because I don't know if there's much else to do here. @jd-foster added a note to #2348 (comment). @mitchphillipson can comment below if he finds anything else, or we can keep chatting via the email thread that has been running in parallel. |
@mitchphillipson sent me a variation of this example, which runs much slower than I would expect:
ProfileView shows a lot of time being spent checking mutability (profile image omitted). Same with PProf (image omitted).