large broadcasting overhead on small subarray #45510

Closed
johnnychen94 opened this issue May 30, 2022 · 1 comment
Labels: arrays, broadcast (Applying a function over a collection), performance (Must go faster)

Comments

@johnnychen94
Member

using BenchmarkTools  # provides @btime

function mapf!(f, out, A, B)
    @inbounds @simd for i in eachindex(out, A, B)
        out[i] = f(A[i], B[i])
    end
    return out
end

function mapf_broadcast!(f, out, A, B)
    @. out = f(A, B)
end

A, B = rand(64, 64), rand(64, 64)
out = similar(A)

@btime mapf!(max, view($out, :, 1), view($A, :, 1), view($B, :, 1)); # 16.011 ns (0 allocations: 0 bytes)

@btime mapf_broadcast!(max, view($out, :, 1), view($A, :, 1), view($B, :, 1));
# 1.6.5: 202.272 ns (9 allocations: 528 bytes)
# 1.7.3: 187.501 ns (9 allocations: 528 bytes)
# 1.8.0-beta3: 204.926 ns (5 allocations: 336 bytes)
# 1.9.0-DEV.647: 78.541 ns (3 allocations: 240 bytes)

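For larger columns, the overhead still exists but is much smaller relative to the total runtime (about 12% on 1.9.0-DEV.647, versus roughly 5x for the 64-element column above).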

A, B = rand(4096, 4), rand(4096, 4)
out = similar(A)

@btime mapf!(max, view($out, :, 1), view($A, :, 1), view($B, :, 1)); # 1.037 μs (0 allocations: 0 bytes)
@btime mapf_broadcast!(max, view($out, :, 1), view($A, :, 1), view($B, :, 1));
# 1.8.0-beta3: 1.279 μs (5 allocations: 336 bytes)
# 1.9.0-DEV.647: 1.158 μs (3 allocations: 240 bytes)
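
As a sanity check that is not part of the original report, the sketch below confirms that the loop and broadcast versions write the same values, and measures the bytes allocated by a single call of each. The helper `alloc_bytes` is a hypothetical name introduced here for illustration. The allocation counts depend on the Julia version, so no expected values are given.

```julia
function mapf!(f, out, A, B)
    @inbounds @simd for i in eachindex(out, A, B)
        out[i] = f(A[i], B[i])
    end
    return out
end

function mapf_broadcast!(f, out, A, B)
    @. out = f(A, B)
end

# Hypothetical helper (not from the issue): bytes allocated by one call of `g`
# on first-column views. The type parameters force specialization on `g` and `f`
# so that dynamic dispatch does not distort the measurement.
function alloc_bytes(g::G, f::F, out, A, B) where {G,F}
    g(f, view(out, :, 1), view(A, :, 1), view(B, :, 1))  # warm-up call (compilation)
    return @allocated g(f, view(out, :, 1), view(A, :, 1), view(B, :, 1))
end

A, B = rand(64, 64), rand(64, 64)
out_loop = similar(A)
out_bc = similar(A)

loop_bytes = alloc_bytes(mapf!, max, out_loop, A, B)
bc_bytes = alloc_bytes(mapf_broadcast!, max, out_bc, A, B)

# Both versions should fill the first column with identical values.
same_result = out_loop[:, 1] == out_bc[:, 1]

println("same result: ", same_result)
println("loop bytes: ", loop_bytes, ", broadcast bytes: ", bc_bytes)
```

Wrapping the calls in a function keeps the non-constant globals `A`, `B`, and the output arrays out of the measured call, which is the same role the `$` interpolation plays in the `@btime` lines above.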
johnnychen94 added the performance, arrays, and broadcast labels on May 30, 2022

@giordano
Contributor

Duplicate of #28126?
