Lazy ReLU operation #1176
CC: @frehseg, @tomerarnon |
I'm not sure I understand the issue you are pointing out here.
I.e. |
The support vector |
Ah, now I see. I lack the intuition around support vectors to see these things easily 😅 I imagine this has to do with the non-convexity of the ReLU? |
Actually, my guess here was also not correct. Although maybe in that case the non-convexity had something to do with it, that isn't the only issue. For example, if you take a Ball2 (that has a negative part but remains convex after ReLU-ing), the resulting support vectors are incorrect, e.g. for a ball lying slightly above the x-axis, |
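For what it's worth, the Ball2 observation is easy to confirm numerically. Below is a small Python sketch (the thread's code is Julia; the center, radius, and query direction are illustrative choices of mine) comparing the naive "reset negative entries of the support vector" idea against the true support value of the rectified ball:

```python
import math

def relu(v):
    """Componentwise max(x, 0)."""
    return [max(c, 0.0) for c in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# A Ball2 lying slightly above the x-axis (illustrative parameters).
center, radius = [0.0, 0.1], 1.0
d = [1.0, -1.0]  # query direction

# Naive idea: take the ball's support vector in direction d
# and reset its negative entries to zero.
norm = math.sqrt(dot(d, d))
sigma = [c + radius * di / norm for c, di in zip(center, d)]
naive_val = dot(d, relu(sigma))

# True support value of ReLU(ball): d . relu(x) is convex in x, so the
# maximum over the ball is attained on the boundary; sample it densely.
true_val = max(
    dot(d, relu([center[0] + radius * math.cos(t),
                 center[1] + radius * math.sin(t)]))
    for t in (2 * math.pi * k / 100000 for k in range(100000))
)

print(naive_val, true_val)  # the naive value falls short of the true one
```

Here the naive value is about 0.707 while the true support value is close to 0.995, so resetting negative entries of the support vector underestimates the rectified set.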
Here are some new ideas/observations.
Below is an example.
julia> using LazySets, Plots
julia> using LazySets.Approximations: box_approximation
julia> P = VPolygon([[-1., 1.], [-1.5, 0.5], [1.5, 0.5], [1., -0.5]]);
julia> A1 = LineSegment([0., 0.], [2., 0.]);
julia> A2 = LineSegment([0., 0.], [0., 1.3]);
julia> Q1 = LineSegment([0., 0.], [1.25, 0.]);
julia> Q2 = LineSegment([0., 0.], [0., 1.]);
julia> Q3 = convert(HPolygon, intersection(P, Hyperrectangle(low=[0., 0.], high=[2., 2.])));
julia> Bi = box_approximation(P)
julia> Bo = box_approximation(ConvexHullArray([Q1, Q2, Q3]))
julia> C = ConvexHullArray([Q1, Q2, Q3])
julia> plot([A1, A2], linecolor=:black, markercolor=:black)
julia> plot!(Bi, color=:green)
julia> plot!(Bo, color=:yellow)
julia> plot!(C, color=:cyan)
julia> plot!(P, color=:lightblue)
julia> plot!(Q3, color=:red)
julia> plot!([Q1, Q2], linecolor=:red, markercolor=:red) |
Very interesting, thanks!
On 8 May 2019, at 10:23, Christian Schilling wrote:
Here are some new ideas/observations.
ReLU is conceptually simple for boxes: the result is again a box. Let Bi be the input box. We construct the output box Bo:
Check in each dimension whether the corresponding lower bound of Bi is negative (one support-function query per dimension).
For each negative case, bound Bo from below by 0. For each non-negative case, bound Bo from below by the bound of Bi.
For each negative dimension, check whether the corresponding upper bound of Bi is positive.
For each non-positive case, bound Bo from above by 0.
In short: Bo = Hyperrectangle(max.(low(Bi), zeros(n)), min.(high(Bi), zeros(n)))
box_approximation(ReLU(X)) == ReLU(box_approximation(X)).
In the positive orthant the box approximation is very coarse. A better approximation can be obtained by taking the convex hull of all points in the positive orthant, the origin, and every intersection of the box approximation with the axes. I suggest that we cache these intersection points for efficiency. (Note that they may not exist, which is a desirable case because then the approximation is exact.)
Below is an example.
light blue: polygon X (to be ReLU'ed)
green: box approximation of X
red: true ReLU(X) set (union of three sets)
yellow: box approximation of ReLU(X) (= ReLU(box(X)))
light green: best convex approximation of ReLU(X)
julia> using LazySets, LazySets.Approximations, Plots
julia> P = VPolygon([[-1., 1.], [-1.5, 0.5], [1.5, 0.5], [1., -0.5]]);
julia> A1 = LineSegment([0., 0.], [2., 0.]);
julia> A2 = LineSegment([0., 0.], [0., 1.3]);
julia> Q1 = LineSegment([0., 0.], [1.5, 0.]);
julia> Q2 = LineSegment([0., 0.], [0., 1.]);
julia> Q3 = intersection(P, Hyperrectangle(low=[0., 0.], high=[2., 2.]));
julia> Bi = box_approximation(P)
julia> Bo = box_approximation(ConvexHullArray([Q1, Q2, Q3]))
julia> C = ConvexHullArray([Q1, Q2, Q3])
julia> plot([A1, A2], linecolor=:black, markercolor=:black)
julia> plot!(Bi, color=:green)
julia> plot!(Bo, color=:yellow)
julia> plot!(C, color=:cyan)
julia> plot!(P, color=:lightblue)
julia> plot!(Q3, color=:red)
julia> plot!([Q1, Q2], linecolor=:red, markercolor=:red)
|
Shouldn't it be Bo = Hyperrectangle(low = relu.(low(Bi)), high = relu.(high(Bi)))?
Potential correction aside, this is a great observation! We do something similar to this in a much messier and less clean way in our implementation of the MaxSens algorithm (I might steal this actually to clean up that code 😄). Ultimately though, I'd say it is the cyan set that is the "holy grail" for ReLU. Or an even tighter approximation that doesn't project the set downwards onto the x-axis at all (since it isn't necessary in that region of the graph). I see why this is a very tough problem though; I'm really glad to see that you are still working on it! Maybe it has been obvious, but I've had to redirect my attention elsewhere recently. I am still hoping to return to this towards the end of the academic quarter, but I clearly haven't been able to devote enough time to it... |
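The corrected rule (clip both bounds with max(·, 0)) and the commutation claim box_approximation(ReLU(X)) == ReLU(box_approximation(X)) can be sanity-checked numerically. A Python sketch (the thread's code is Julia; the random point cloud standing in for X and all helper names are mine):

```python
import random

def relu(v):
    """Componentwise max(x, 0)."""
    return [max(c, 0.0) for c in v]

def box(pts):
    """Box approximation of a finite point cloud: per-dimension min/max."""
    n = len(pts[0])
    low = [min(p[j] for p in pts) for j in range(n)]
    high = [max(p[j] for p in pts) for j in range(n)]
    return low, high

random.seed(0)
# A random point cloud standing in for the set X.
points = [[random.uniform(-2.0, 2.0) for _ in range(3)] for _ in range(200)]

# ReLU of the box approximation (note: BOTH bounds clipped with max) ...
low, high = box(points)
relu_of_box = (relu(low), relu(high))

# ... coincides with the box approximation of the ReLU'd points.
box_of_relu = box([relu(p) for p in points])

print(relu_of_box == box_of_relu)
```

The equality is exact, not just approximate: max(min_i x, 0) == min_i max(x, 0) and likewise for the maximum, dimension by dimension.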
True, thanks! And I like the definition of a helper function
Oh, I also wanted to use that. Fixed!
Yes, true, another error in my comment. I fixed it. And I also made the code runnable.
Don't worry, the same goes for me 😊 |
To generalize the observations above:
This explains the good behavior for boxes, since the box is a Cartesian product of intervals.
This leads to an algorithm to compute
Another observation:
This gives a simple algorithm to compute the vertex representation of the convex hull of a ReLU for polytopes. |
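For intuition, the vertex-based algorithm can be sketched generically in 2D (Python rather than Julia; clip, hull, and relu_hull are helper names of mine, not LazySets API): clip the polygon to each orthant, apply ReLU to each piece's vertices (on a fixed orthant, ReLU acts as a linear 0/1 diagonal map, so it sends vertices to vertices), and take the convex hull of all images.

```python
from itertools import product

def clip(poly, a, b):
    """Sutherland-Hodgman: clip a convex polygon (vertex list) to the
    half-plane a.x <= b; returns the (possibly empty) clipped polygon."""
    out = []
    for i in range(len(poly)):
        p, q = poly[i], poly[(i + 1) % len(poly)]
        p_in = a[0] * p[0] + a[1] * p[1] <= b
        q_in = a[0] * q[0] + a[1] * q[1] <= b
        if p_in:
            out.append(p)
        if p_in != q_in:  # the edge crosses the boundary line
            t = (b - a[0] * p[0] - a[1] * p[1]) / (
                a[0] * (q[0] - p[0]) + a[1] * (q[1] - p[1]))
            out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out

def hull(points):
    """Andrew's monotone chain convex hull, counterclockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    def chain(seq):
        h = []
        for p in seq:
            while len(h) >= 2 and cross(h[-2], h[-1], p) <= 0:
                h.pop()
            h.append(p)
        return h
    lower, upper = chain(pts), chain(reversed(pts))
    return lower[:-1] + upper[:-1]

def relu_hull(poly):
    """conv(ReLU(P)) for a convex 2D polygon P given as a vertex list."""
    images = []
    for sx, sy in product([1, -1], repeat=2):
        # restrict P to the orthant {sx*x >= 0, sy*y >= 0}
        piece = clip(clip(poly, (-sx, 0.0), 0.0), (0.0, -sy), 0.0)
        # on a fixed orthant, ReLU is linear (a 0/1 diagonal matrix),
        # so it maps the piece's vertices onto the image's vertices
        images += [(max(x, 0.0), max(y, 0.0)) for x, y in piece]
    return hull(images)

# the polygon from the thread's example, in counterclockwise order
P = [(-1.0, 1.0), (-1.5, 0.5), (1.0, -0.5), (1.5, 0.5)]
hull_P = relu_hull(P)
print(hull_P)
```

For this polygon the resulting hull has vertices (0,0), (1.25,0), (1.5,0.5), and (0,1), matching the cyan set in the thread's plots. Note the same exponential orthant blow-up discussed below applies here: in n dimensions there are up to 2^n pieces.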
Here is a (manual) prototype implementation. The imprecision in the plots is due to the overapproximation used for plotting.
julia> using LazySets, LazySets.Approximations, Plots, Optim
julia> xlim = [-1.5, 1.5]; ylim = [-0.5, 1.0];
# original set
julia> P = VPolygon([[-1., 1.], [-1.5, 0.5], [1.5, 0.5], [1., -0.5]]);
julia> plot(P, xlim=xlim, ylim=ylim)
# subdivision into orthants
julia> Q1 = P ∩ HPolygon([HalfSpace([-1., 0.], 0.), HalfSpace([0., -1.], 0.)]);
julia> Q2 = P ∩ HPolygon([HalfSpace([-1., 0.], 0.), HalfSpace([0., 1.], 0.)]);
julia> Q3 = P ∩ HPolygon([HalfSpace([1., 0.], 0.), HalfSpace([0., 1.], 0.)]);
julia> Q4 = P ∩ HPolygon([HalfSpace([1., 0.], 0.), HalfSpace([0., -1.], 0.)]);
julia> plot!(Q1, Nφ=60)
julia> plot!(Q2, Nφ=60)
julia> plot!(Q3, Nφ=60)
julia> plot!(Q4, Nφ=60)
# projection
julia> R2 = [1. 0.; 0. 0.] * Q2;
julia> R3 = [0. 0.; 0. 0.] * Q3;
julia> R4 = [0. 0.; 0. 1.] * Q4;
# identify redundant sets (not needed, but simplifies things later)
julia> [R2 ⊆ P, R3 ⊆ P, R4 ⊆ P]
3-element Array{Bool,1}:
true
true
false
julia> plot(P, xlim=xlim, ylim=ylim)
julia> plot!(Q1, Nφ=60, color=:red)
julia> plot!(R4, Inf, linecolor=:red, width=5)
# union (not convex)
julia> U = UnionSetArray([Q1, R4]);
# convex hull
julia> UC = ConvexHullArray([Q1, R4]);
# need to overapproximate for plotting lazy convex hull of lazy intersections
julia> Upolar = overapproximate(UC, PolarDirections(20));
julia> plot(P, xlim=xlim, ylim=ylim)
julia> plot!(Upolar)
I was actually surprised that the inclusion checks worked out. In general I suspect there will be precision problems. The simplest sufficient case is if the set on the l.h.s. is empty (not seen in this example, but generally this should happen), and this can be detected without an inclusion check. Note that this "algorithm" is exponential in the number of dimensions. It will be vital to first detect dimensions where the corresponding 0-hyperplane is not crossed, to cut down the number of sets. |
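The pruning idea in the last sentence can be sketched as follows (Python; orthant_sign_patterns is a hypothetical helper of mine): dimensions whose bounds do not straddle 0 need no split, so only 2^k orthant pieces remain for k crossed dimensions.

```python
from itertools import product

def orthant_sign_patterns(low, high):
    """Enumerate only the orthant sign patterns that can occur inside
    the box [low, high]: a dimension whose bounds do not straddle 0
    contributes a single sign, so the count is 2^k for k crossed
    dimensions instead of 2^n."""
    choices = []
    for lo, hi in zip(low, high):
        if lo >= 0:
            choices.append([1])        # never negative: no split needed
        elif hi <= 0:
            choices.append([-1])       # never positive: ReLU clips to 0
        else:
            choices.append([1, -1])    # crosses 0: must split
    return list(product(*choices))

# Only the second dimension crosses 0 here, so 2 patterns instead of 8.
patterns = orthant_sign_patterns([0.5, -1.0, -3.0], [2.0, 1.0, -0.5])
print(len(patterns))
```

The bounds can come from one support-function query per dimension, as in the box algorithm quoted earlier in the thread.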
This sounds promising because it is cheap, but I am somehow missing a recursion of
Give me a few minutes. |
Not a counterexample, but a pathological case: What if
Here is a real counterexample:
julia> ρ([0.1, 1.0], P)
0.9
julia> ρ([0.1, 1.0], C)
1.0
It is not a counterexample to an overestimation, though. If your result is an overestimation (I am not sure), it may still be useful, and maybe in practice that bound is good enough. |
Thank you Christian for bringing me back down to earth. Upon closer inspection I only get the following: In general, supp(RL(x),d) >= 0 if all d_i>=0, and <= 0 otherwise. |
Do you mean that all results so far do not hold? The example from above is at least also a counterexample to the last result here:
Consider direction [-1, 1]:
julia> ρ([-1., 1.], C)
1.0
julia> ρ([-1., 0.], P)
1.5
julia> ρ([-1., 1.], P)
2.0
julia> 1.0 == max(1.5, 2.0)
false |
No, only the conjecture is false. The rest should be ok. Your counterexample doesn't apply, because you need to apply the formula nested, i.e., componentwise (wherever there is an s, it means that ReLU is applied to a single variable):
ρ([-1, 1], C) = max(ρ([-1, 0], ReLU_1(P)), ρ([-1, 1], ReLU_1(P))) = max(min_{t ∈ [-1, 0]} ρ([t, 0], P), min_{t ∈ [-1, 0]} ρ([t, 1], P)). |
#1176 - Box approximation for Rectification
#1176 - Support function for Rectification
Rekindling an old fire here! But I thought this might be the place to comment this thought: I've just taken a dive into the implementation of
# bad math probably, and X ∩ O⁺ should be computed
# only once of course, but I hope the idea is clear.
# for d ∈ mixed_dimensions
if ρ(X ∩ O⁺, d) < ρ(X, d)
    # d is a relevant direction.
    # Should proceed to calculate.
else
    # Projection(X, d) is a subset of X ∩ O⁺, and so should be skipped.
end
I'm not sure if this poses problems with different set types, or anything like that, but it was a thought that occurred to me, so I went digging through the source to see if it was already in place (and couldn't find it, or just failed to identify it). Most efficiently, I think this step would be incorporated into the |
Given a set X, the set Relu(X) represents the set {max.(x, 0) | x ∈ X}.
For a convex set X, Relu(X) is not necessarily convex (take X as the 2D line segment from (0, -1) to (2, 1)).
For a start, we only need the support vector. The first idea is to just reset all negative entries in the support vector of X to zero. We would need to check if that is actually correct.