Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename Unicode package to Strings. #25416

Closed
wants to merge 1 commit into from
Closed

Rename Unicode package to Strings. #25416

wants to merge 1 commit into from

Conversation

malmaud
Copy link
Contributor

@malmaud malmaud commented Jan 5, 2018

Closes #25394

@malmaud
Copy link
Contributor Author

malmaud commented Jan 5, 2018

There was also talk of moving additional functions from Base into Strings, but this initial renaming is the easiest and least contentious, so why don't we start there.

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Jan 5, 2018

This change reads to me as if without doing using Strings I would be unable to use strings in Julia, which is absurd. The most compelling argument made when we discussed this was that there are character sets which do not map into Unicode and for which the notion of uppercase, lowercase, etc. still make sense. So even if there is a Unicode.uppercase function that only handles Unicode code points, it would make sense to have a more general uppercase function which dispatches to Unicode.uppercase for character types which represent Unicode code points. However, I'm not 100% sold on the general importance of supporting character sets which cannot be fully mapped into Unicode.

@JeffBezanson
Copy link
Member

That doesn't bother me so much. Standard library names are often fairly generic, e.g. using Iterators comes to mind (by the way, should we rename Base.Iterators to Iterators via a thin stdlib wrapper, like we've been doing for other things?). I think being able to move more string functions out of Base is also a pretty compelling reason.

@StefanKarpinski
Copy link
Member

That doesn't bother me so much. Standard library names are often fairly generic, e.g. using Iterators comes to mind

And you don't have access to iterator types or functions until you do using Iterators. Whereas the String type is available without doing using Strings and so are almost all functions that operate on strings.

(by the way, should we rename Base.Iterators to Iterators via a thin stdlib wrapper, like we've been doing for other things?).

Yes.

I think being able to move more string functions out of Base is also a pretty compelling reason.

This PR doesn't do that, and I suspect if people are annoyed by moving lowercase out of Base they're going to be 100x more annoyed by moving all the string-related functions out.

I'm fine with some solution that doesn't require people who are afraid of the word "Unicode" to type using Unicode, but I just don't think this is it.

@malmaud
Copy link
Contributor Author

malmaud commented Jan 7, 2018

I guess we should re-add the triage label, since a decision still hasn't been reached about this?

@malmaud malmaud added the triage This should be discussed on a triage call label Jan 7, 2018
@JeffBezanson
Copy link
Member

And you don't have access to iterator types or functions until you do using Iterators

Kind of, but there are certainly iterators and iteration available without it.

Calling this Strings has even been suggested before: #22241 (comment)

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Jan 8, 2018

It would be helpful to address my analysis in the original issue of the apparent objections. If you have other objections that I've missed or counterpoints to any of them, that would move the conversation forward.

@JeffBezanson
Copy link
Member

JeffBezanson commented Jan 8, 2018

On the 3-point analysis, I'm a bit in each camp.

  1. While lowercase, isspace etc. seem pretty specific to be part of Base, there's not much else they could mean. isvalid is a pretty generic name and it's still exported from Base. isspace, isdigit etc. are some of the most useful functions on characters.
  2. I don't mind the name Unicode so much, but it would be nice to be able to move non-unicode-specific string functions out of Base too. But if we're not really going to do that, I see much less need to rename the package.
  3. Supporting non-unicode character sets is indeed pretty fanciful, but the argument is just that putting generic functions that could make sense for other character sets in Unicode is gratuitous.

There are more Unicode-specific things in Base.Unicode that could be exported from or moved to the Unicode package, e.g. category code functions, instead of the more generic isspace etc. Overall the best way to resolve this is probably to move lowercase, uppercase, and character predicates back to Base, so we can keep Unicode for truly Unicode-specific stuff.

@JeffBezanson
Copy link
Member

linking to #25479

@malmaud
Copy link
Contributor Author

malmaud commented Jan 11, 2018

Ah, so the thinking now is to move the generic string functions back into Base, leaving a Unicode module that's really Unicode-specific? Makes sense.

@JeffBezanson
Copy link
Member

Triage is ok with #25479

@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Jan 11, 2018
@StefanKarpinski
Copy link
Member

Triage: we're going to go with #25479 instead.

@DilumAluthge DilumAluthge deleted the jmm/string_rename branch March 25, 2021 22:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants