-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rand vs. sample #6003
Comments
Looking at the docs, I don't think I see a case where the functions couldn't be combined. |
Among others, there's this: julia> using StatsBase
julia> rand(1:3)
1
julia> rand([1:3])
ERROR: no method rand(Array{Int64,1})
julia> sample(1:3)
3
julia> sample([1:3])
3
|
Ah, I misread your comment as "could" rather than "couldn't". So you're in agreement :-) |
I wonder if the idea of a weighted collection shouldn't be generalized. There's a lot of overlap with multisets and dicts where the values are weights. The features that don't appear in Base's rand function are weights, replacement and ordering. It would be nice if we could arrange things so that there doesn't need to be a separate function just to add those features. We can either backports support for those things into Base or figure out a way to layer support onto the base |
cc: @lindahua |
I am all for migrating the cc: @johnmyleswhite |
I wholeheartedly agree. I would love to see |
+1000000 for sample in Base. |
Well, it's more the other direction we're proposing: using |
I thought the idea was to merge the names |
Yes, that's the idea. It remains to be seen which name will be left at the end. |
Presumably you're going to keep |
I was thinking about that. Samplen is pretty odd. |
Agree to use |
Ok, so the naming issue is decided. It remains to determine what needs to be done to merge them. |
If you may wait a little bit, I can make a PR for this after the deadline in this weekend. I am also fine if someone else can take a lead to do that. |
No hurry on this. It's not going into 0.3. |
I have now get to the point to tackle this issue. I am proposing the API below: The main idea is to reuse # randomly draw one sample from a
rand(a::AbstractArray)
# randomly draw multiple samples from a, with or without replacement
# this subsumes the rand(a::Range, n) methods
rand(a::AbstractArray, n::Int; replace::Bool=true)
rand(a::AbstractArray, dim::Dims; replace::Bool=true) The With this implemented, the |
Sounds good. There's a slight potential for confusion with |
This implements the generalization suggested by @ivarne (JuliaLang#8257 (comment)) or by @lindahua (JuliaLang#6003 (comment)). This change is very simple thanks to commit 48f27bc.
This implements the generalization suggested by @ivarne (JuliaLang#8257 (comment)) or by @lindahua (JuliaLang#6003 (comment)). This change is very simple thanks to commit 48f27bc.
This implements the generalization suggested by @ivarne (JuliaLang#8257 (comment)) or by @lindahua (JuliaLang#6003 (comment)). This change is very simple thanks to commit 48f27bc.
This implements the generalization suggested by @ivarne (JuliaLang#8257 (comment)) or by @lindahua (JuliaLang#6003 (comment)). This change is very simple thanks to commit 48f27bc.
This implements the generalization suggested by @ivarne (JuliaLang#8257 (comment)) or by @lindahua (JuliaLang#6003 (comment)). This change is very simple thanks to commit 48f27bc.
@lindahua Is this worth revisiting now in the 0.4 cycle? |
I may look into this again next week. |
Did not make it into 0.4, removing from milestone. |
Someone needs to own this to make it happen or it's going to slip again. |
It's also a bit unclear what still needs to be done at this point. |
The main difference now is that
The only reason we can't combine them is that we can't add more keyword arguments to existing method signatures without getting a method overwritten warning. |
and the problem @JeffBezanson mentioned above:
(sample throws a |
After some discussion, the decision was made to leave it as is.
|
Base Julia provides the
rand
andrand!
functions, which do random sampling over various domains. The StatsBase package providessample
andsample!
with partially overlapping functionality. There have been various threads (e.g. this one) where people are discussion additional functionality for one or the other. I'm opening this issue to discuss whether these interfaces should be separate or if they should be merged and if they are to remain separate, to clarify the distinction betweenrand
andsample
.The text was updated successfully, but these errors were encountered: