
Sparsity detection breaks the norm function on 0.8 #247

Closed
DaniGlez opened this issue Jun 14, 2024 · 20 comments
Labels
bug Something isn't working

Comments

@DaniGlez

DaniGlez commented Jun 14, 2024

Hello, I have a couple of optimization problems that no longer work after the recent AD upgrades because of (I believe) some interaction between the sparsity detection system and the norm function. Here's an MWE that works correctly on v0.7.2 but errors out on v0.8.1:

using ADNLPModels, LinearAlgebra

F = x -> norm(log.([1 2; 3 4] * x)) - 5
nls = ADNLSModel(F, [1.0, 2.0], 1)

I apologize if this is not the correct repository; I'm not familiar enough with the internals of AD to be 100% sure that the issue is in this library rather than in a more specific AD library.

@tmigot
Member

tmigot commented Jun 14, 2024

Thanks @DaniGlez for the issue. We changed the sparsity detection technique in 0.8.

@gdalle an idea?

The error is as follows:

julia> nls = ADNLSModel(F, [1.0, 2.0], 1)
ERROR: Function ismissing requires primal value(s).
A dual-number tracer for local sparsity detection can be used via `local_hessian_pattern`.
Stacktrace:
  [1] ismissing(t::SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}})
    @ SparseConnectivityTracer .julia\packages\SparseConnectivityTracer\QlV0S\src\overload_dual.jl:18
  [2] generic_norm2(x::Vector{SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}}})  
    @ LinearAlgebra AppData\Local\Programs\julia-1.10.2\share\julia\stdlib\v1.10\LinearAlgebra\src\generic.jl:464
  [3] norm2(x::Vector{SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}}})
    @ LinearAlgebra AppData\Local\Programs\julia-1.10.2\share\julia\stdlib\v1.10\LinearAlgebra\src\generic.jl:529
  [4] norm(itr::Vector{SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}}}, p::Int64)
    @ LinearAlgebra AppData\Local\Programs\julia-1.10.2\share\julia\stdlib\v1.10\LinearAlgebra\src\generic.jl:598
  [5] norm
    @ AppData\Local\Programs\julia-1.10.2\share\julia\stdlib\v1.10\LinearAlgebra\src\generic.jl:596 [inlined]
  [6] (::var"#3#4")(x::Vector{SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}}})
    @ Main .\REPL[8]:1
  [7] F!
    @ .julia\dev\ADNLPModels.jl\src\nls.jl:141 [inlined]

@tmigot tmigot added the bug Something isn't working label Jun 14, 2024
@gdalle
Collaborator

gdalle commented Jun 14, 2024

TL;DR: yes, it's on us (@adrhill and me), but it should be fixable in a matter of days.

What is happening?

Your diagnosis is correct: this bug is linked to sparsity detection. While v0.7 of ADNLPModels.jl relied on Symbolics.jl, v0.8 relies on SparseConnectivityTracer.jl instead. It is a much simpler (and faster) library, which leverages operator overloading and offers two modes:

  • global sparsity detection (the default) tries to output a pattern that doesn't depend on the value of the point x at which your Jacobian / Hessian are evaluated
  • local sparsity detection adds further optimizations to find more zeros by exploiting the value of x, which makes the resulting pattern invalid at another point x'

The error tells you that a certain operator called by the LinearAlgebra.norm function, namely ismissing, is not supported in global mode, and that you would need local mode for it to work. The reason is that global mode forgets the values, and that includes whether or not they are missing.
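
To illustrate, here is a rough sketch of the two modes through the detector interface (jacobian_sparsity and the detector names follow the ADTypes-style API that SparseConnectivityTracer exposes; the exact local pattern below is an assumption about how value-based pruning behaves for the product):

using SparseConnectivityTracer

f(x) = [x[1]^2, x[1] * x[2]]

# Global mode: a single pattern, valid at every point x
jacobian_sparsity(f, rand(2), TracerSparsityDetector())          # expected: [1 0; 1 1]

# Local mode: exploits the value of x, so the (2, 1) entry can be pruned
# here because d(x[1] * x[2]) / dx[1] = x[2] = 0 at this point; the
# resulting pattern is invalid at other points
jacobian_sparsity(f, [1.0, 0.0], TracerLocalSparsityDetector())  # expected: [1 0; 0 1]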

What can you do about it now?

Since v0.8, the sparsity detector is not just modified: it is also configurable. Unfortunately that option is not yet documented, despite my annoying requests to @amontoison (see #242). @tmigot if you want to take a stab at it, that would allow @DaniGlez to switch to either one of these detectors:

  • Symbolics.SymbolicSparsityDetector(): the old world, exactly like in v0.7
  • SparseConnectivityTracer.TracerLocalSparsityDetector(): a local alternative which will support linear algebra, but beware that the pattern is only valid for the x at which you detected it. In most use cases that's not very helpful.

Alternatively, you can freeze ADNLPModels.jl to v0.7 for the time being.
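
For instance, switching the MWE above to the symbolic detector might look like this (untested sketch; the detector name is copied verbatim from the list above and may differ across Symbolics.jl versions):

using ADNLPModels, LinearAlgebra
import Symbolics

F = x -> norm(log.([1 2; 3 4] * x)) - 5

# Symbolic detection, as in ADNLPModels v0.7
nls = ADNLSModel(F, [1.0, 2.0], 1, detector = Symbolics.SymbolicSparsityDetector())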

What can we do about it later?

Adrian and I are working hard on SparseConnectivityTracer.jl, and one of our next goals is implementing linear algebra overloads, precisely to avoid this kind of behavior (and also to speed things up). You can follow the corresponding issue in SparseConnectivityTracer.jl to keep track of our progress.

Note that ADNLPModels.jl only switched its sparsity detection after we successfully handled the entire library of OptimizationProblems.jl, while checking that the correct patterns were detected every time. This check is now part of the SparseConnectivityTracer.jl test suite.

We never encountered the norm function in OptimizationProblems.jl; otherwise we would have caught the present bug sooner. So every time something like this happens (and it will happen again), perhaps we could add a new problem to the library that exemplifies it?

@tmigot
Member

tmigot commented Jun 14, 2024

Okay, so to sum up @DaniGlez you have two alternatives:

  • Stay on ADNLPModels 0.7 for now, until we stabilize version 0.8.
  • Use a new "detector" as follows:
using SparseConnectivityTracer
nls = ADNLSModel(F, [1.0, 2.0], 1, detector = SparseConnectivityTracer.TracerLocalSparsityDetector())

Could you please verify that the second option gives the correct Jacobian of the residual?

@DaniGlez If you have an academic or real-world example with a norm function that you are willing to share, would you consider adding it to OptimizationProblems.jl?

@DaniGlez
Author

Massive thanks to both of you for the detailed explanations. Indeed, this is not a critical issue for me at the moment, since I can just pin ADNLPModels to 0.7, though it would naturally be nice to upgrade. I'll check whether manually setting the local sparsity detector allows me to use 0.8 until the overload is implemented.

As for contributing an example to the problems catalog, I'm definitely happy to do it. It won't be immediate, since I have to extract a specific, non-trivial example from my Rube Goldberg homotopy machine, which implements many problem variations with a single implementation (the reason I'm using this library :)), but it should be fairly straightforward to formulate one nevertheless.

@gdalle
Collaborator

gdalle commented Jun 14, 2024

Again, be careful with local sparsity patterns: you get what you asked for, namely a pattern that depends on the current point.

@abelsiqueira
Member

I have another example here:

using Random, LinearAlgebra, ADNLPModels, SparseConnectivityTracer

Random.seed!(0)
n = 100
X = randn(n, 2)
y = [X[i,1]^2 + X[i,2]^2 + randn() * 0.1 > 1 ? 1.0 : 0.0 for i = 1:n]
σ(t) = t ≥ 0 ? 1 / (1 + exp(-t)) : exp(t) / (1 + exp(t))
ℓ(ŷ, y) = y * log(ŷ + 1e-12) + (1 - y) * log(1 - ŷ + 1e-12)
h(β, x) = σ(dot(β, [1.0; x; x .^ 2; x[1] * x[2]]))
L(β) = sum(ℓ(h(β, X[i,:]), y[i]) for i = 1:n)
nlp = ADNLPModel(L, randn(6), detector = SparseConnectivityTracer.TracerLocalSparsityDetector())
output = minimize(nlp, verbose = 0)

The sigmoid function σ is defined that way for numerical stability. Is there a long-term solution for this?

@gdalle
Collaborator

gdalle commented Jul 3, 2024

I can't run your example; where does the minimize function come from?

At first glance, this is a different problem from the one that triggered the present issue: it is due to your use of a conditional if/else block. This is something our operator overloading cannot handle, so the solution in such cases is to use the ifelse function instead. If you replace σ with the following code, I think it should work:

σ(t) = ifelse(t ≥ 0, 1 / (1 + exp(-t)), exp(t) / (1 + exp(t)))
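
To see why the branch itself is the problem: a global tracer carries no value, so an if / ? : condition on it cannot be evaluated, whereas ifelse is an ordinary function that can be overloaded to trace both branches and union their patterns. A minimal sketch, using the same detector interface as above:

using SparseConnectivityTracer

f_bad(x)  = x[1] ≥ 0 ? x[1] : x[2]        # errors in global mode: no value to branch on
f_good(x) = ifelse(x[1] ≥ 0, x[1], x[2])  # traces both branches

jacobian_sparsity(x -> [f_good(x)], rand(2), TracerSparsityDetector())  # expected: [1 1]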

@abelsiqueira
Member

The minimize function comes from JSOSuite.jl, but the error occurs when creating the ADNLPModel (without the detector). ifelse is indeed the solution, though; I had forgotten about it.

@gdalle
Collaborator

gdalle commented Jul 4, 2024

This should probably be documented better; ping @adrhill. The SparseConnectivityTracer package is still experimental, so it's very useful when users report unexpected behavior like this!

@abelsiqueira
Member

We should probably add some of these examples here and on JSOSuite as well.

@amontoison
Member

@DaniGlez The issue seems to be fixed with ADNLPModels.jl v0.8.4.

@gdalle
Collaborator

gdalle commented Jul 31, 2024

Indeed, SparseConnectivityTracer v0.6 was released yesterday and includes a bunch of array-specific overloads for functions like norm.
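
With that release, the original MWE at the top of this thread should run without any detector workaround (a quick re-check, assuming ADNLPModels ≥ 0.8.4 with SparseConnectivityTracer ≥ 0.6):

using ADNLPModels, LinearAlgebra

F = x -> norm(log.([1 2; 3 4] * x)) - 5
nls = ADNLSModel(F, [1.0, 2.0], 1)  # previously: ERROR: Function ismissing requires primal value(s).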

@gdalle
Collaborator

gdalle commented Jul 31, 2024

It doesn't fix @abelsiqueira's problem, which is more fundamental, so maybe we can either rename this issue or close it?

@tmigot
Member

tmigot commented Jul 31, 2024

@gdalle Can you open a new issue for @abelsiqueira 's problem? Thanks!

@gdalle
Collaborator

gdalle commented Jul 31, 2024

Here you go: #283

@DaniGlez
Author

Thank you folks, much appreciated! Just one final thing: would it be possible to make it work with StaticArrays as well? It seems the StaticArrays-specific norm is not being overloaded, despite the generic AbstractArray{T} signature. Here's an MWE (ADNLPModels 0.8.4 with SparseConnectivityTracer 0.6.0) with the stack trace at the bottom of the comment; it works without the SVector conversion.

using ADNLPModels, LinearAlgebra, StaticArrays

function F(x)
    y = SVector{2}([1 2; 3 4] * x)
    [norm(log.(y), 2) - 5]
end
nls = ADNLSModel(F, [1.0, 2.0], 1)

I suppose this should go directly into the SparseConnectivityTracer issues instead of lingering here, but I'm not sure how to make the MWE without ADNLPModels.


Stacktrace:

  [1] macro expansion
    @ ~/.julia/packages/StaticArrays/MSJcA/src/linalg.jl:278 [inlined]
  [2] _norm
    @ ~/.julia/packages/StaticArrays/MSJcA/src/linalg.jl:266 [inlined]
  [3] norm
    @ ~/.julia/packages/StaticArrays/MSJcA/src/linalg.jl:265 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/StaticArrays/MSJcA/src/linalg.jl:327 [inlined]
  [5] _norm
    @ ~/.julia/packages/StaticArrays/MSJcA/src/linalg.jl:311 [inlined]
  [6] norm
    @ ~/.julia/packages/StaticArrays/MSJcA/src/linalg.jl:310 [inlined]
  [7] F(x::Vector{SparseConnectivityTracer.HessianTracer{SparseConnectivityTracer.IndexSetHessianPattern{…}}})
    @ Main ./REPL[13]:3
  [8] (::ADNLPModels.var"#F!#381"{})(output::Vector{…}, x::Vector{…})
    @ ADNLPModels ~/.julia/packages/ADNLPModels/vcIV9/src/nls.jl:141
  [9] (::ADNLPModels.var"#F#15"{})(x::Vector{…}; nequ::Int64)
    @ ADNLPModels ~/.julia/packages/ADNLPModels/vcIV9/src/ad.jl:233
 [10] (::ADNLPModels.var"#F#15"{Int64, ADNLPModels.var"#F#6#16"{}})(x::Vector{SparseConnectivityTracer.HessianTracer{…}})
    @ ADNLPModels ~/.julia/packages/ADNLPModels/vcIV9/src/ad.jl:231
... omitted ...

@gdalle
Collaborator

gdalle commented Jul 31, 2024

Unfortunately we can't make this work automatically with specific array types. Our overloads dispatch on the tracer numbers we put in the array to detect sparsity patterns, like

LinearAlgebra.norm(::AbstractArray{<:TracerNumber})

while the overload in StaticArrays will be

LinearAlgebra.norm(::StaticArray{<:Number})

The latter will always take priority over the former, so our overloads are only useful for plain Arrays, or other AbstractArrays that don't redefine norm. Does that make sense?

The only solution would be to extend our overloads to every array type that redefines norm and other LinearAlgebra functions. That seems a bit painful, but I guess StaticArrays are sufficiently important.
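
In the meantime, a possible workaround on your side might be to convert back to a plain Vector before calling norm, so that dispatch lands on the generic AbstractArray method we overload (untested sketch of your MWE):

using ADNLPModels, LinearAlgebra, StaticArrays

function F(x)
    y = SVector{2}([1 2; 3 4] * x)
    # Vector(y) falls back to a plain Array, bypassing the StaticArrays norm method
    [norm(log.(Vector(y)), 2) - 5]
end
nls = ADNLSModel(F, [1.0, 2.0], 1)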

@gdalle
Collaborator

gdalle commented Jul 31, 2024

I have opened an issue in SparseConnectivityTracer: adrhill/SparseConnectivityTracer.jl#144

@DaniGlez
Author

Yeah, that makes sense indeed. I was not completely sure of the dispatch precedence rules for this case. Thanks @gdalle, much appreciated.

@adrhill

adrhill commented Aug 1, 2024

That seems a bit painful, but I guess StaticArrays are sufficiently important.

I agree, we'll get that fixed for you @DaniGlez!
