Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SqPeriodicEuclidean? #151

Closed
SteffenPL opened this issue Oct 9, 2019 · 10 comments
Closed

SqPeriodicEuclidean? #151

SteffenPL opened this issue Oct 9, 2019 · 10 comments

Comments

@SteffenPL
Copy link

For my simulation, the squared periodic euclidean distances has to be computed for many pairs.

Would it be possible to add SqPeriodicEuclidean to the collection of metrics or is it too special?
If wanted, I could work on an implementation and hand in a pull request.

@dkarrasch
Copy link
Member

You can use pairwise(SqEuclidean(), A, B; dims=2), and that should be pretty fast. Or what exactly do you mean by "adding to the collection of metrics"?

@SteffenPL
Copy link
Author

Thanks for the fast reply.
I'm using SqEuclidean and pairwise already, and they work great!

However, to make computations on a torus, I need the periodic metric.
I don't know how to get the squared periodic distance out of the result pairwise(SqEuclidean(), A, B; dims=2).

Maybe I don't see something obvious here, but it seems to be important first to do the modulo stuff and then compute the sum of squares...

@dkarrasch
Copy link
Member

Oooh, sorry, I overlooked the "Periodic". That's interesting! I didn't add it at the time, because I wasn't sure if anyone would ever use it. As a quick hack, you could call pairwise(PeriodicEuclidean(periods), A, B); dims=2).^2, or, indeed, make a PR. It would be nice if #150 got merged first, though. But otherwise, only few additions would be needed, because the modulo stuff is already there.

@SteffenPL
Copy link
Author

SteffenPL commented Oct 9, 2019

Cool.

I'll wait for #150. Until then, I might figure out how to implement pairwise efficiently for a PR.

For now, I will use a user-defined metric as below in my project, since performance is not critical yet.

Thanks for your replies!

struct SqPeriodicEuclidean{W <: AbstractArray{<: Real}} <: SemiMetric
    periods::W
end

# SqPeriodicEuclidean
Base.eltype(d::SqPeriodicEuclidean) = eltype(d.periods)
@inline parameters(d::SqPeriodicEuclidean) = d.periods
@inline function eval_op(d::SqPeriodicEuclidean, ai, bi, p)
    s1 = abs(ai - bi)
    s2 = mod(s1, p)
    s3 = min(s2, p - s2)
    abs2(s3)
end
@inline function eval_op(d::SqPeriodicEuclidean, ai, bi)
    periods = d.periods
    p = isempty(periods) ? oneunit(eltype(periods)) : first(periods)
    eval_op(d, ai, bi, p)
end
@inline eval_reduce(::SqPeriodicEuclidean, s1, s2) = s1 + s2
@inline eval_end(::SqPeriodicEuclidean, s) = s

sqpeuclidean(a::AbstractArray, b::AbstractArray, p::AbstractArray{<: Real}) =
    SqPeriodicEuclidean(p)(a, b)
sqpeuclidean(a::Number, b::Number, p::Real) = SqPeriodicEuclidean([p])(a, b)


function Distances._evaluate(d::SqPeriodicEuclidean, a::AbstractArray, b::AbstractArray)
    @boundscheck if length(a) != length(b)
        throw(DimensionMismatch("first array has length $(length(a)) which does not match the length of the second, $(length(b))."))
    end
    p = parameters(d)
    @boundscheck if p !== nothing
        length(a) != length(p) && throw(DimensionMismatch("arrays have length $(length(a)) but parameters have length $(length(p))."))
    end
    if length(a) == 0
        return zero(result_type(d, a, b))
    end
    @inbounds begin
        s = zero(result_type(d, a, b))
        if size(a) == size(b)
            @simd for I in eachindex(a, b, p)
                aI = a[I]
                bI = b[I]
                pI = p[I]
                s = eval_reduce(d, s, eval_op(d, aI, bI, pI))
            end
        else
            for (Ia, Ib, Ip) in zip(eachindex(a), eachindex(b), eachindex(p))
                aI = a[Ia]
                bI = b[Ib]
                pI = p[Ip]
                s = eval_reduce(d, s, eval_op(d, aI, bI, pI))
            end
        end
    end
    return eval_end(d, s)
end


@eval @inline (dist::SqPeriodicEuclidean)(a::AbstractArray, b::AbstractArray) = Distances._evaluate(dist, a, b)

@dkarrasch
Copy link
Member

It might be better to eventually have some functions dispatch on Union{PeriodicEuclidean,SqPeriodicEuclidean} rather than have new copies of existing code, or even have

const PeriodicUnionMetrics =  Union{PeriodicEuclidean,SqPeriodicEuclidean}

treated analogously to WeightedUnionMetrics. This applies to parameters, eval_op(..., ai, bi, p), eval_reduce. There will be no need for Base.eltype, eval_op(..., ai, bi), eval_end and obviously _evaluate once SqPeriodicEuclidean is added to UnionMetrics, any way. Just as a quick comment for a future PR.

@dkarrasch
Copy link
Member

Alright, now that #150 is merged, it is actually really easy to define your desired metric within a few lines:

import Distances: UnionSemiMetric, parameters, eval_op, _evaluate

struct SqPeriodicEuclidean{W<:AbstractArray{<:Real}} <: UnionSemiMetric
    periods::W
end

parameters(d::SqPeriodicEuclidean) = d.periods

@inline Distances.eval_op(d::SqPeriodicEuclidean, ai, bi, p) = eval_op(PeriodicEuclidean(d.periods), ai, bi, p)
@inline (dist::SqPeriodicEuclidean)(a::AbstractArray, b::AbstractArray) = _evaluate(dist, a, b, parameters(dist))

Let's test it:

using Distances, Test, BenchmarkTools
x, y, L = [0.0, 0.0], [0.75, 0.0], [0.5, Inf];
@test evaluate(SqPeriodicEuclidean(L), x, y)  evaluate(PeriodicEuclidean(L), x, y)^2
@btime evaluate($(SqPeriodicEuclidean(L)), $x, $y); # 22.872 ns (0 allocations: 0 bytes)
@btime evaluate($(PeriodicEuclidean(L)), $x, $y); # 30.411 ns (0 allocations: 0 bytes)

The difference really is in the final sqrt. Enjoy checking how many predefined generic fallbacks this uses from inside the package!

@SteffenPL
Copy link
Author

That's an elegant solution. I like the new freedom which #150 gives. Thanks!

I tried the code and it just works.

@zahachtah
Copy link

Has something changed since this because import Distances: UnionSemiMetric, parameters, eval_op, _evaluate gives error that UnionSemiMetric cannot be imported.

@zahachtah
Copy link

yes, its now part of the package :-)

@SteffenPL
Copy link
Author

@zahachtah It seem you have to call all of this before using Distances. Only then it works, otherwise it cannot import UnionSemiMetric

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants