Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

selectperm? #10767

Closed
sglyon opened this issue Apr 8, 2015 · 17 comments
Closed

selectperm? #10767

sglyon opened this issue Apr 8, 2015 · 17 comments

Comments

@sglyon
Copy link
Contributor

sglyon commented Apr 8, 2015

I was hoping to find a routine in base that would give me the first k elements of sortperm(v).

This would be much like select(v, 1:k), but instead of returning the actual elements, it would return the index where those elements can be found.

Does such a function exist?

@sglyon
Copy link
Contributor Author

sglyon commented Apr 8, 2015

Digging through sort.jl, I came up with this (note I prefix everything with absolute path because I wrote this function outside of Base):

function selectperm(v::AbstractVector,
                    k::Union(Int,OrdinalRange);
                    lt::Function=isless,
                    by::Function=identity,
                    rev::Bool=false,
                    order::Base.Order.Ordering=Base.Order.Forward)
    select!(collect(1:length(v)), k, Base.Order.Perm(Base.Order.ord(lt,by,rev,order),v))
end

It seems to do the job for me. Is this the best way to get this functionality? If so, I'm happy to package this up inside sort.jl and make the interface more like the current sortperm, sortperm! functions and submit a PR.

@StefanKarpinski
Copy link
Member

That does the trick. It seems unfortunate to allocate length(v) indices only to return k of them, but it's not obvious to me how to do better without duplicating most of the logic of select!.

@sglyon
Copy link
Contributor Author

sglyon commented Apr 8, 2015

It would be great if we could solve that problem. The application I'm currently using this for typically has v length 10,000 or greater and k between 1-30. That code sits inside a loop, so the wasted allocation really adds up.

@carlobaldassi
Copy link
Member

I also needed this functionality in a hot loop, so I had to avoid allocation. I ended up implementing a partial quick sort algorithm and using that.
I put everything in this gist (the partsort implementation and a small example).

@StefanKarpinski
Copy link
Member

I wonder if this would be worth including in the standard library or SortingAlgorithms?

@nalimilan
Copy link
Member

Since the standard library already has sort, sortperm and select, sounds logical to also have selectperm.

@StefanKarpinski
Copy link
Member

It does seem quite logical, I must say.

@sglyon
Copy link
Contributor Author

sglyon commented Apr 9, 2015

I agree with @nalimilan here. This seems like it naturally belongs with the rest of its family.

@StefanKarpinski
Copy link
Member

I'm ok with the API being allocation by default and with the option of pre-allocation when performance is critical. That marries default convenience with the option of being fast without too much hassle.

@sglyon
Copy link
Contributor Author

sglyon commented Apr 9, 2015

I haven't tested yet, but I suppose that the code @carlobaldassi shared will be more performant than what I have here.

Maybe we make selectperm use the code I wrote and selectperm! use what @carlobaldassi has?

@ViralBShah
Copy link
Member

+1 for having this in base.

@carlobaldassi
Copy link
Member

I wrote that code quite some time ago, and when I benchmarked it it was indeed the fastest way I could find to achieve the goal. I only had one use case though, and it was before the new GC. It would be good to benchmark it again.

But at least I remember I thoroughly tested the PartialQuickSort implementation, so there should be no surprises on that side hopefully.

@sglyon
Copy link
Contributor Author

sglyon commented Apr 10, 2015

I'll try to package this up into a PR in the next few days

@sglyon
Copy link
Contributor Author

sglyon commented Apr 13, 2015

See #10800

@sglyon sglyon closed this as completed Apr 13, 2015
@pao
Copy link
Member

pao commented Apr 13, 2015

@spencerlyon2 If you add closes #10767 to the commit message, it will close this issue automatically when the pull is accepted into master.

@sglyon
Copy link
Contributor Author

sglyon commented Apr 13, 2015

Ahh, that's how it works. I added that line and expected it to close this issue immediately.

I guess I should have waited for the PR to be accepted.

@pao
Copy link
Member

pao commented Apr 13, 2015

No problem! This should be fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants