faster and simpler generic_norm2 #43256

Open · wants to merge 11 commits into master

Conversation

oscardssmith
Member

I'm not 100% sure if this works. Probably needs a PkgEval.

@oscardssmith oscardssmith added the performance (Must go faster) and linear algebra (Linear algebra) labels Nov 29, 2021
@@ -462,27 +462,12 @@ norm_sqr(x::Union{T,Complex{T},Rational{T}}) where {T<:Integer} = abs2(float(x))

 function generic_norm2(x)
     maxabs = normInf(x)
-    (maxabs == 0 || isinf(maxabs)) && return maxabs
-    (v, s) = iterate(x)::Tuple
+    (ismissing(maxabs) || maxabs == 0 || isinf(maxabs)) && return maxabs
Member

I'm sad if we need to add a bunch of special-cased missing code to LinearAlgebra.

Member

I know. We can leave out that test. It's just that the transition to mapreduce made norm handle missing for certain ps, without pulling missing handling to the surface.

Member
@martinholters martinholters Dec 1, 2021

Would

Suggested change
-(ismissing(maxabs) || maxabs == 0 || isinf(maxabs)) && return maxabs
+(isinf(maxabs) !== false || maxabs == 0) && return maxabs

be preferable? (Which I think should be equivalent, but please do double-check.)

EDIT by @dkarrasch: one needs to avoid evaluating maxabs == 0 first, because with maxabs = missing it yields missing, which then throws a TypeError: non-boolean (Missing) used in boolean context when fed into ||. With isinf(maxabs) !== false checked first, short-circuiting avoids that, so the suggestion works.
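
A minimal illustration of the three-valued logic at play (a sketch, assuming maxabs = missing); with the reordered check, the early return fires before maxabs == 0 is ever evaluated:

maxabs = missing
isinf(maxabs)            # missing
isinf(maxabs) !== false  # true, so the check short-circuits and returns maxabs
maxabs == 0              # missing; feeding this result into || would throw
                         # TypeError: non-boolean (Missing) used in boolean context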

Member

So that you get a notification: I modified @martinholters's suggestion, which I tested for maxabs = missing and it works.

Member

Oh, and that same trick should make it possible to apply the same steps to generic_normp!

Member
@dkarrasch dkarrasch left a comment

If tests pass without the init keyword, this LGTM, up to the controversial ismissing(maxabs), which we could as well remove here and keep that for discussion. I acknowledge the sentiment against explicit missing handling, it's just that normInf(x) succeeds and returns missing whenever there's a missing entry, so it's the logic in (maxabs == 0 || isinf(maxabs)) that fails. The mapreduce step handles everything correctly again.
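
For reference, a sketch of how missing propagates through a max-based reduction (using maximum as a stand-in for the internal normInf):

x = [1.0, missing, 2.0]
maximum(abs, x)  # missing, since max(1.0, missing) === missing;
                 # the subsequent maxabs == 0 || isinf(maxabs) check is what then fails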

@oscardssmith oscardssmith added the needs pkgeval (Tests for all registered packages should be run with this change) label Nov 29, 2021
@oscardssmith
Member Author

@nanosoldier runtests(ALL, vs = ":master")

@oscardssmith
Member Author

Why isn't nanosoldier running on this?

@maleadt
Member

maleadt commented Nov 30, 2021

Not sure, let's try again:

@nanosoldier runtests(ALL, vs = ":master")

@nanosoldier
Collaborator

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

@oscardssmith
Member Author

@maleadt any idea why the report is broken?

@maleadt
Member

maleadt commented Dec 1, 2021

@maleadt any idea why the report is broken?

(Sorry about that, but we can’t show files that are this big right now.)

Just click raw. I guess we now exceed the maximal log size.

@KristofferC
Member

KristofferC commented Dec 1, 2021

The reason the report is so big is because so many packages failed. But the results look quite strange. Perhaps rerun it?

@oscardssmith
Member Author

@nanosoldier runtests(ALL, vs = ":master")

@maleadt
Member

maleadt commented Dec 2, 2021

@nanosoldier runtests(ALL, vs = ":master")

@nanosoldier
Collaborator

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

@oscardssmith
Member Author

I've decided that while I have this PR, I might as well also fix generic_normp, so this needs review again. Also I fixed a bug where we weren't accumulating results in enough precision.

@oscardssmith oscardssmith reopened this Dec 3, 2021
@dkarrasch
Member

The problem seems to be that mapreduce with the required anonymous functions fails to predict the return type and falls back to dynamic dispatch, and is hence slow. My proposal would be to change to the following. It doesn't speed up the computations, but it simplifies the code a bit and allows missing values to be handled silently:

function mygeneric_norm2(x)
    maxabs = normInf(x)
    (isinf(maxabs) !== false || maxabs == 0) && return maxabs # early return for missing, Inf, or all-zero input
    T = typeof(maxabs)
    sum = zero(promote_type(Float64, T)) # accumulate in at least Float64 precision
    if isfinite(length(x)*maxabs*maxabs) && maxabs*maxabs != 0 # Scaling not necessary
        for v in x
            sum += norm_sqr(v)
        end
        return convert(T, sqrt(sum))
    else
        invmaxabs = inv(maxabs) # rescale by 1/maxabs to avoid overflow/underflow
        if isfinite(invmaxabs)
            for v in x
                sum += (norm(v) * invmaxabs)^2
            end
        else
            for v in x
                sum += (norm(v) / maxabs)^2
            end
        end
        return convert(T, maxabs*sqrt(sum))
    end
end

It uses ideas that have come up in the discussion of this PR.
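
Assuming the internal helpers LinearAlgebra.normInf and LinearAlgebra.norm_sqr are brought into scope (as they would be inside the stdlib), a quick sanity check of both branches might look like:

using LinearAlgebra
const normInf = LinearAlgebra.normInf
const norm_sqr = LinearAlgebra.norm_sqr

x = rand(1000)
mygeneric_norm2(x) ≈ norm(x)   # fast path: no scaling needed

y = fill(1e300, 4)             # naive sum of squares would overflow to Inf
mygeneric_norm2(y) ≈ 2e300     # scaling branch keeps the result finite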

@KristofferC
Member

Is this PR still "simpler"? It doesn't really feel like that.

@dkarrasch
Member

I agree, it seems hard/impossible to simplify things via mapreduce due to the type inference issue, so we could as well leave things as they are.

@oscardssmith
Member Author

I think I'll let this sit until the handling of captured variables in closures improves.

@oscardssmith
Member Author

@mcabbott can you review this? I've updated it based on inspiration from #43459

stdlib/LinearAlgebra/src/generic.jl (outdated)
T = typeof(float(norm(first(x))))
sT = promote_type(T, Float64)
ans = mapreduce(norm_sqr, +, x)
ans in (0, Inf) || return convert(T, sqrt(ans))
Contributor
@mcabbott mcabbott Dec 28, 2021

I'd have to check whether in does ==. But it does, perhaps that's OK.

I'd also rather not use ans as a variable name. And would prefer to use a different name for the second path's output, especially as it has a different type.
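
For what it's worth, in on a tuple does fall back to ==, so the check catches an underflowed-to-zero or overflowed-to-Inf sum (small sketch):

0.0 in (0, Inf)  # true: in compares with ==, and 0.0 == 0
Inf in (0, Inf)  # true, so both cases continue to the rescaled path
1.5 in (0, Inf)  # false, so a finite nonzero sum returns sqrt(ans) right away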

Comment on lines +483 to +484
for v in x
ans += sT(norm(v))^p
Contributor

Is it worth special-casing p==3, p==0.5, for which we can replace ^p with faster functions?
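
A hypothetical sketch of such special-casing (the helper name powp is illustrative, not part of the PR); the value passed in is nonnegative, so sqrt is safe for p == 0.5:

# hypothetical helper: p-th power of a nonnegative value, with fast paths
# for the exponents mentioned above; the loop would call ans += powp(sT(norm(v)), p)
powp(y, p) = p == 3   ? y * y * y :
             p == 0.5 ? sqrt(y)   :
             y^p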

Comment on lines +470 to +471
for v in x
ans += (norm(v)/maxabs)^2
Contributor
@mcabbott mcabbott Dec 28, 2021

I can't measure any change from pulling out the division and multiplying by invmaxabs instead.

But adding @simd for v in x seems to help quite a bit. Is it safe to do so? (Or maybe this method won't get called on the sort of arrays for which it is beneficial anyway.)
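
A sketch of the @simd variant being benchmarked, written as a stand-alone helper (the name scaled_sumsq is hypothetical); @simd lets the compiler reorder the floating-point accumulation, which is where the safety question comes from:

using LinearAlgebra: norm

function scaled_sumsq(x, maxabs, sT)
    acc = zero(sT)
    @simd for i in eachindex(x)
        acc += (norm(x[i]) / maxabs)^2  # same body as the diff, indexed for @simd
    end
    return acc
end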

Member Author

The second. I don't want to deal with the invmaxabs here since we're already in a slow path.

Contributor

FWIW, times for me with rand(1000) are

  • for v in x; out += (norm(v)/maxabs)^2 takes 2.352 μs
  • with @simd 1.733 μs,
  • both after normInf(x) which takes 1.404 μs, same as maximum

vs:

  • BLAS.nrm2 takes 1.196 μs
  • mapreduce(norm_sqr, +, x) takes 229.763 ns

So I guess LinearAlgebra.NRM2_CUTOFF should be higher, currently 32.

But also, why is maximum so slow? Can this be done less carefully here since we don't care about -0.0 and NaN?
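
One hypothetical way to do the less careful reduction (sketch only; unlike maximum it makes no guarantees about NaN entries or -0.0):

# hypothetical: max-abs via a plain reduction, skipping the NaN/-0.0 handling
fastmaxabs(x) = mapreduce(abs, (a, b) -> ifelse(b > a, b, a), x)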

sT = promote_type(T, Float64)
ans = mapreduce(norm_sqr, +, x)
ans in (0, Inf) || return convert(T, sqrt(ans))
maxabs = sT(normInf(x))
Contributor

The old code has one more short-circuit here, returning 0/Inf if maxabs is either. Might be worthwhile to have that here? All-zeros might be the most common case after finite norm.

Member Author

Why would all zero be common? I'd think that would be pretty rare.

Contributor

I meant all zero might be more common than truly tiny values. Hopefully both much less common than values about 1.

@mcabbott
Contributor

mcabbott commented Nov 3, 2022

What's the status of this? After #40790 it will at least need to be rebased.

@oscardssmith
Member Author

The status is that I haven't thought about this for a year, because I was running into really annoying issues with closures capturing variables that were ruining the performance.

@mcabbott
Contributor

mcabbott commented Nov 3, 2022

Ok! Was reminded by this thread, and what's here seems pretty quick (but has 1 mystery allocation).

Labels
linear algebra (Linear algebra) · needs pkgeval (Tests for all registered packages should be run with this change) · performance (Must go faster)

7 participants