Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: std::ascii reform #486

Merged
merged 1 commit into from
Dec 16, 2014
Merged

Conversation

SimonSapin
Copy link
Contributor

Move the std::ascii::Ascii type and related traits to a new Cargo package on crates.io, and instead expose its functionality for u8, [u8], char, and str types.

Rendered

@SimonSapin
Copy link
Contributor Author

@SimonSapin
Copy link
Contributor Author

By the way, I created starting point of the rust-ascii repository by running this command on a clone of the Rust repository:

git filter-branch --prune-empty --index-filter '
    git ls-tree -z -r --name-only --full-tree $GIT_COMMIT |
    grep -z -v "ascii.rs$" |
    xargs -0 -r git rm --cached -r -q
' -- master
git rebase --root HEAD

(The second command gets rid of the empty merge commits.)

@SimonSapin
Copy link
Contributor Author

CC @Kimundi (original author of the module)

@Kimundi
Copy link
Member

Kimundi commented Nov 29, 2014

I don't mind in particular if this moves to a cargo package, but I'm a bit weary about cluttering u8 and char with those methods. But then again that would still be preferable to not providing them at all.

@SimonSapin
Copy link
Contributor Author

@Kimundi regarding cluttering u8 and char: the trait doesn’t have to be in the prelude. (This is one of the unresolved questions.) Would you prefer free functions?

@Kimundi
Copy link
Member

Kimundi commented Dec 1, 2014

No, please not free functions :)

I just think that the current situation where those functions are defined on the Ascii type directly seems to be the cleanest one in regard to not having to worry about the behavior of every ascii-only function on non-ascii inputs.

@SimonSapin
Copy link
Contributor Author

The obvious problem with the Ascii type is that you can’t use it for not-entirely-ASCII data, where to_ascii_lowercase and other methods are still useful. (Well, you can break the non-ASCII invariant with unsafe { foo.to_ascii_nocheck() } and the methods will work fine, but that doesn’t seem like a good idea.)

@drewm1980
Copy link

+1 for simplifying Rust's string story! Having one less char/string related type in std would have saved me significant time while learning rust. Could you get rid of char next? Coming from C I assumed it was 8 bits. If it were upper-cased, I might suspect it was some newfangled variable-length character type, but I find a lowercase 32 bit char type very surprising.

@steveklabnik
Copy link
Member

People are often confused about 'char,' it's true. Very common IRC question.

@SimonSapin
Copy link
Contributor Author

What the char type should be or how it should be named is out of the scope of this RFC. If you want to keep discussing it, maybe start a Discourse thread?

@aturon
Copy link
Member

aturon commented Dec 5, 2014

I am wholly in favor of this proposal; the Ascii type is pretty clearly not carrying its weight, but the case conversion/testing functionality is important to provide in std.

@SimonSapin
Copy link
Contributor Author

@aturon, any opinion on the unresolved questions?

@aturon
Copy link
Member

aturon commented Dec 5, 2014

@SimonSapin

What to do with std::ascii::escape_default?

Leave #[unstable] for now.

Rename the AsciiExt and OwnedAsciiExt traits?

Actually, it seems plausible to merge these traits.

Should they be in the prelude? The Ascii type and the related traits currently are.

I would lean toward "no", but we are planning to do a separate prelude stabilization so it doesn't matter for now.

Are associated type stable enough yet?
If not, AsciiExt should temporarily keep its type parameter.

Yes, as of today that's correct.

Which of all the Ascii::is_* methods should AsciiExt include? Those included should have ascii added in their name.

Depends somewhat on the prelude question; if it's not in the prelude, then probably all of the ones that have been marked #[unstable] already. Alternatively, we could keep these as #[unstable] and drive stabilization by demand.

@SimonSapin
Copy link
Contributor Author

Rename the AsciiExt and OwnedAsciiExt traits?

Actually, it seems plausible to merge these traits.

OwnedAsciiExt methods take self by value, and so can not be for Sized?. Merging it into AsciiExt would prevent it from being implemented on [u8] and str, it would have to be on &[u8] and &str instead.

@SimonSapin
Copy link
Contributor Author

@SimonSapin
Copy link
Contributor Author

In #503 (Stabilize std::prelude)

// The two traits will be collapsed into one `CharExt` trait in the `std::char`
// module, however instead of reexporting two traits.
pub use char::{Char, UnicodeChar};

So the naming convention here is that a FooExt trait adds methods for the type/trait Foo. This is consistent with the existing IteratorExt.

Given the above, AsciiExt sounds like it’s implemented only on the Ascii type, which is not accurate. What should be the naming convention for "XYZ-related methods for various types"?

@blaenk
Copy link
Contributor

blaenk commented Dec 10, 2014

Good one @SimonSapin.

@aturon
Copy link
Member

aturon commented Dec 16, 2014

@SimonSapin

OwnedAsciiExt methods take self by value, and so can not be for Sized?. Merging it into AsciiExt would prevent it from being implemented on [u8] and str, it would have to be on &[u8] and &str instead.

Good point; we can keep these separate.

So the naming convention here is that a FooExt trait adds methods for the type/trait Foo. This is consistent with the existing IteratorExt.

Given the above, AsciiExt sounds like it’s implemented only on the Ascii type, which is not accurate. What should be the naming convention for "XYZ-related methods for various types"?

I think AsciiExt is fine -- extension traits cover a few different scenarios, including ones like this.

@aturon
Copy link
Member

aturon commented Dec 16, 2014

Generally speaking, it seems like the stakeholders on this topic are satisfied with the design, as is the core team; this will simplify our string story, and follow the same pattern we're moving toward with unicode (where more advanced functionality lives in the Caroverse). Thanks for proposing this, @SimonSapin!

I'm going to merge this RFC.

@SimonSapin, you want to turn your branch into a PR?

bors added a commit to rust-lang/rust that referenced this pull request Dec 27, 2014
Implements [RFC 486](rust-lang/rfcs#486). Fixes #19908.

* Rename `to_ascii_{lower,upper}` to `to_ascii_{lower,upper}case`, per #14401
* Remove the `Ascii` type and associated traits: `AsciiCast`, `OwnedAsciiCast`, `AsciiStr`, `IntoBytes`, and `IntoString`.
* As a replacement, add `.is_ascii()` to `AsciiExt`, and implement `AsciiExt` for `u8` and `char`.

[breaking-change]
bors added a commit to rust-lang/rust that referenced this pull request Dec 27, 2014
Implements [RFC 486](rust-lang/rfcs#486). Fixes #19908.

* Rename `to_ascii_{lower,upper}` to `to_ascii_{lower,upper}case`, per #14401
* Remove the `Ascii` type and associated traits: `AsciiCast`, `OwnedAsciiCast`, `AsciiStr`, `IntoBytes`, and `IntoString`.
* As a replacement, add `.is_ascii()` to `AsciiExt`, and implement `AsciiExt` for `u8` and `char`.

[breaking-change]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants