Slow weighted sum/dot etc. #775

Closed
DNF2 opened this issue Mar 22, 2022 · 1 comment · Fixed by #778
DNF2 commented Mar 22, 2022

dot(w, x), sum(x, w) etc. seem to use a slow fallback implementation of dot:

1.7.2> using StatsBase, LinearAlgebra, BenchmarkTools

1.7.2> x = rand(1000); w = weights(rand(length(x)));

1.7.2> @btime dot($w, $x)
  975.000 ns (0 allocations: 0 bytes)
244.6881117358608

1.7.2> @btime dot($w.values, $x)
  85.151 ns (0 allocations: 0 bytes)
244.68811173586084

The same goes for calling sum(x, w) or wsum(x, w).

This is independent of the number of BLAS threads.

This is a pretty dramatic performance difference (roughly 11x here). Should StatsBase call BLAS.dot, or is this something that should be improved in the fallback dot routine in Base?

@nalimilan
Member

Good catch. We need to use the values field instead of passing AbstractWeights objects directly. I'm not sure we should implement dot for AbstractWeights objects, as the number of methods to implement would be very large if we start trying to use the fastest methods everywhere. A more reasonable goal is to ensure good performance only for methods which are specifically designed to take weights.

See #778.
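
For readers following along, the idea of unwrapping to the values field can be sketched roughly as below. This is only an illustration under stated assumptions, not the change made in #778: `MyWeights` and `my_wsum` are hypothetical names standing in for a weights wrapper and a weighted-sum method, and the real StatsBase types carry more than a single field.

```julia
using LinearAlgebra

# Hypothetical stand-in for a weights wrapper; NOT StatsBase's actual type.
struct MyWeights{T<:Real} <: AbstractVector{T}
    values::Vector{T}
end
Base.size(w::MyWeights) = size(w.values)
Base.IndexStyle(::Type{<:MyWeights}) = IndexLinear()
Base.getindex(w::MyWeights, i::Int) = w.values[i]

# Hypothetical weighted sum: pass the wrapped Vector to `dot` rather than
# the wrapper itself, so dispatch can reach the method specialized for
# dense Float64 vectors instead of the generic AbstractArray fallback.
my_wsum(x::AbstractVector, w::MyWeights) = dot(x, w.values)

x = rand(1000)
w = MyWeights(rand(1000))

# Both paths compute the same quantity, up to floating-point rounding.
my_wsum(x, w) ≈ dot(w, x)
```

Calling `dot(w, x)` on the wrapper still works here, but it goes through the generic method, which is the slow path the benchmark above measures.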
