Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark str::is_char_boundary and str::split_at* unstably const. #131520

Merged
merged 2 commits into from
Oct 29, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions library/core/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -196,7 +196,9 @@
#![feature(cfg_target_has_atomic_equal_alignment)]
#![feature(cfg_ub_checks)]
#![feature(const_for)]
#![feature(const_is_char_boundary)]
#![feature(const_precise_live_drops)]
#![feature(const_str_split_at)]
#![feature(decl_macro)]
#![feature(deprecated_suggestion)]
#![feature(doc_cfg)]
Expand Down
48 changes: 36 additions & 12 deletions library/core/src/str/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -185,8 +185,9 @@ impl str {
/// ```
#[must_use]
#[stable(feature = "is_char_boundary", since = "1.9.0")]
#[rustc_const_unstable(feature = "const_is_char_boundary", issue = "131516")]
#[inline]
pub fn is_char_boundary(&self, index: usize) -> bool {
pub const fn is_char_boundary(&self, index: usize) -> bool {
// 0 is always ok.
// Test for 0 explicitly so that it can optimize out the check
// easily and skip reading string data for that case.
Expand All @@ -195,8 +196,8 @@ impl str {
return true;
}

match self.as_bytes().get(index) {
// For `None` we have two options:
if index >= self.len() {
// For `true` we have two options:
//
// - index == self.len()
// Empty strings are valid, so return true
Expand All @@ -205,9 +206,9 @@ impl str {
//
// The check is placed exactly here, because it improves generated
// code on higher opt-levels. See PR #84751 for more details.
None => index == self.len(),

Some(&b) => b.is_utf8_char_boundary(),
index == self.len()
} else {
self.as_bytes()[index].is_utf8_char_boundary()
}
}

Expand Down Expand Up @@ -639,7 +640,8 @@ impl str {
#[inline]
#[must_use]
#[stable(feature = "str_split_at", since = "1.4.0")]
pub fn split_at(&self, mid: usize) -> (&str, &str) {
#[rustc_const_unstable(feature = "const_str_split_at", issue = "131518")]
pub const fn split_at(&self, mid: usize) -> (&str, &str) {
match self.split_at_checked(mid) {
None => slice_error_fail(self, 0, mid),
Some(pair) => pair,
Expand Down Expand Up @@ -679,7 +681,8 @@ impl str {
#[inline]
#[must_use]
#[stable(feature = "str_split_at", since = "1.4.0")]
pub fn split_at_mut(&mut self, mid: usize) -> (&mut str, &mut str) {
#[rustc_const_unstable(feature = "const_str_split_at", issue = "131518")]
pub const fn split_at_mut(&mut self, mid: usize) -> (&mut str, &mut str) {
// is_char_boundary checks that the index is in [0, .len()]
if self.is_char_boundary(mid) {
// SAFETY: just checked that `mid` is on a char boundary.
Expand Down Expand Up @@ -718,11 +721,12 @@ impl str {
#[inline]
#[must_use]
#[stable(feature = "split_at_checked", since = "1.80.0")]
pub fn split_at_checked(&self, mid: usize) -> Option<(&str, &str)> {
#[rustc_const_unstable(feature = "const_str_split_at", issue = "131518")]
pub const fn split_at_checked(&self, mid: usize) -> Option<(&str, &str)> {
// is_char_boundary checks that the index is in [0, .len()]
if self.is_char_boundary(mid) {
// SAFETY: just checked that `mid` is on a char boundary.
Some(unsafe { (self.get_unchecked(0..mid), self.get_unchecked(mid..self.len())) })
Some(unsafe { self.split_at_unchecked(mid) })
} else {
None
}
Expand Down Expand Up @@ -758,7 +762,9 @@ impl str {
#[inline]
#[must_use]
#[stable(feature = "split_at_checked", since = "1.80.0")]
pub fn split_at_mut_checked(&mut self, mid: usize) -> Option<(&mut str, &mut str)> {
#[rustc_const_unstable(feature = "const_str_split_at", issue = "131518")]
#[rustc_allow_const_fn_unstable(const_is_char_boundary)]
Copy link
Member

@RalfJung RalfJung Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you add rustc_allow_const_fn_unstable here? That makes no sense on an unstable const fn. (I'm cleaning this up in #132451.)

Also @Noratrieb please make sure to get wg-const-eval approval before adding any rustc_allow_const_fn_unstable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we'd have a bot helping with this attribute policy... rust-lang/triagebot#1851

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, I didn't see this, yeah. I was a bit unsure whether I need to ping const eval here but I deemed that there wasn't anything weird enough going on to be problematic (which I guess is sort of true here, the attribute is just misplaced). will be more careful next time

Copy link
Contributor Author

@zachs18 zachs18 Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you add rustc_allow_const_fn_unstable here? That makes no sense on an unstable const fn. (I'm cleaning this up in #132451.)

I added it because I got an error otherwise, though that was on an earlier version of the PR.

pub const fn split_at_mut_checked(&mut self, mid: usize) -> Option<(&mut str, &mut str)> {
// is_char_boundary checks that the index is in [0, .len()]
if self.is_char_boundary(mid) {
// SAFETY: just checked that `mid` is on a char boundary.
Expand All @@ -774,7 +780,25 @@ impl str {
///
/// The caller must ensure that `mid` is a valid byte offset from the start
/// of the string and falls on the boundary of a UTF-8 code point.
unsafe fn split_at_mut_unchecked(&mut self, mid: usize) -> (&mut str, &mut str) {
const unsafe fn split_at_unchecked(&self, mid: usize) -> (&str, &str) {
let len = self.len();
let ptr = self.as_ptr();
// SAFETY: caller guarantees `mid` is on a char boundary.
unsafe {
(
from_utf8_unchecked(slice::from_raw_parts(ptr, mid)),
from_utf8_unchecked(slice::from_raw_parts(ptr.add(mid), len - mid)),
)
}
}

/// Divides one string slice into two at an index.
///
/// # Safety
///
/// The caller must ensure that `mid` is a valid byte offset from the start
/// of the string and falls on the boundary of a UTF-8 code point.
const unsafe fn split_at_mut_unchecked(&mut self, mid: usize) -> (&mut str, &mut str) {
let len = self.len();
let ptr = self.as_mut_ptr();
// SAFETY: caller guarantees `mid` is on a char boundary.
Expand Down
Loading