Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Add a fallback method for sort, plus one for strings #16853

Closed
wants to merge 2 commits into from

Conversation

ararslan
Copy link
Member

@ararslan ararslan commented Jun 9, 2016

Sorting a string returns a string, sorting a number is a MethodError, and sorting other iterables collects them first and returns an array. I separated numbers into a separate call because otherwise the MethodError text isn't so useful.

@ararslan
Copy link
Member Author

ararslan commented Jun 9, 2016

@StefanKarpinski You seem to be the supreme overlord of strings, what do you think of being able to sort them?

@StefanKarpinski
Copy link
Member

I'm a little bit iffy on it, although I suppose that's the logical conclusion of treating them as containers of characters. Since with the other change here, you can just write String(sort(s)); but on the other hand, then you might as well get a String back when you do sort(s) instead of an array of characters.

@ararslan
Copy link
Member Author

ararslan commented Jun 9, 2016

Yeah, I had started this PR idea as just a general fallback method but then I realized that when given a string it returned an array of characters. I figured string in -> string out made more sense. If you're too iffy about it, strings could be a MethodError along with numbers.

@StefanKarpinski
Copy link
Member

I dunno. It seems somewhat silly but harmless.

@ararslan
Copy link
Member Author

ararslan commented Jun 9, 2016

Mostly I just wanted to be able to sort things like Base.Take, strings are just a consequence.

## other iterables and fallback ##

sort(s::String; kws...) = String(sort(collect(s); kws...))
sort(n::Number) = throw(MethodError("no method matching sort(::Number)"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MethodError() takes the function and tuple of arguments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JeffBezanson Oh neat, I didn't realize that. I'll do it that way. Thanks!

sort(n::Number) = throw(MethodError(sort, n))
sort(itr; kws...) = sort(collect(itr); kws...)
sort(itr; kws...) = sort!(collect(itr); kws...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope collect never aliases. x-ref discussion in #15546

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tkelman Probably safer to use not-in-place sort then, yeah?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured collect always made a new object so it would be safe to modify that in place and avoid an extra copy. I didn't realize it was possible for collect to alias.

Copy link
Contributor

@tkelman tkelman Jun 10, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It hopefully isn't possible for collect to alias, but that's a difficult assumption to verify since people can write their own collect methods - if they write one that aliases, who is at fault there, the author of the aliasing method or downstream callers who assumed it wouldn't?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I have made sort no longer yell (i.e. sort! -> sort).

@rfourquet
Copy link
Member

Comment by Scott Jones from https://groups.google.com/d/topic/julia-dev/ietfsSAIwnI/discussion

I know it isn't very much code, but I was wondering what the use case of this would be?
In all my time doing lots of string processing, I don't think I ever needed this.
Since a String isn't normally treated as an array, and it is simple enough to get a vector of characters from a string which can then be sorted, if you really needed to.

I'd rather see this fall back to a error, like Number.

(What I did, many times, was take the characters from some (generally large) text, and see with a sparse bit map all the characters present, or count the frequencies of the characters,
but sorting is not needed for that. Just sorting the Unicode characters also doesn't make too much sense, you'd really want to find all of the graphemes and sort those, I'd imagine.)

-Scott

@ararslan
Copy link
Member Author

Yeah, I'd be fine with sorting a string being an error.

@tkelman
Copy link
Contributor

tkelman commented Jun 11, 2016

Is sort(collect(itr)) all that onerous to type for other iterables? Having the iterator-in, array-out part be performed explicitly with a collect seems like it may be preferable to doing it implicitly as part of sort? Otherwise do we start doing this for all other collection functions like findfirst?

@nalimilan
Copy link
Member

sort(collect(itr)) isn't that long, but doing something useful by default when calling sort sounds like a good idea. I guess we could also do a = collect(itr); itr !=== a ? sort!(a) ; sort(a) to avoid a copy, which could trap users when done directly. findfirst is a very different case AFAIK since you don't need to collect the result to return the matching index.

I agree that special-casing strings doesn't seem worth it.

@tkelman
Copy link
Contributor

tkelman commented Jun 11, 2016

Ah right the specific case of findfirst (and findnext) is what #15755 is about. I think you could substitute other collection functions in its place in what I said above though.

@ararslan
Copy link
Member Author

@tkelman It's not that it's onerous to type sort(collect(itr)), I just thought it would be nice to be able to sort an iterable without having to worry about whether it has a sort method defined (e.g. for things like Base.Take). But I agree, kind of iffy that sort would be iterator in, array out. That might be more surprising for a user than would not having a sort method.

@ararslan
Copy link
Member Author

Unless there's someone who's really into this idea, I'll go ahead and close this PR

@ararslan ararslan closed this Jun 11, 2016
@ararslan ararslan deleted the sort_fallback branch June 11, 2016 23:43
@StefanKarpinski
Copy link
Member

Doing sort(collect(keys(d))) is pretty annoying and common. This would at least get it down to sort(keys(d)) instead of that. Of course, we could just have a sort method for KeyIterators but that seems a bit ad hoc.

@ararslan
Copy link
Member Author

@StefanKarpinski Yeah, I agree; I do think it would be nice to have a catchall for sorting iterables of various types, like KeyIterator as you said. But would it not be strange to get an array from sort when passed a general iterable?

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Jun 13, 2016

What else could you get?

@ararslan
Copy link
Member Author

ararslan commented Jun 13, 2016

Uhhhhhhhhh ¯\_(ツ)_/¯

Good point. So should I reopen this? Does my implementation seem like a reasonable way to do this? (The string thing aside for the moment.)

@ararslan
Copy link
Member Author

ararslan commented Jun 13, 2016

Well, I suppose it depends on what you're sorting. For example, if I wanted to sort a tuple, say (3,2,1), I would expect to get a tuple back rather than an array.

Edit: I guess that could be another thing that's special cased, e.g.

sort(t::Tuple) = (sort(collect(t))...)

@rfourquet
Copy link
Member

I would favor this possibility to sort general iterables, but without special cases: this would return an array even for tuples and strings as input. This is more predicable, and more useful sometimes, when the user wants to mutate the result. In the end, it seems that modifying the sort signature by removing AbstractArray, and using iterator traits (HasLength/HasShape) instead would be enough (and copymutable does the job, instead of calling collect). Slightly related is #22489 which allows to sort! iterables with a "random-access" trait.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants