Support for non-valid UTF-8 strings #6

ismaelgv · 2018-07-12T16:52:38Z

Extend code to support non-valid UTF-8 strings in filenames, paths and arguments:

Use OsStr and OsString.
Follow OsStr pattern API extension in Rust repository.
Check issues with current crates: clap, regex, walkdir, and ansi_term.
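A minimal sketch of what the first item could look like, using only the standard library. The helper names (`collect_args`, `file_name_of`, `display_name`) are hypothetical and are not taken from this project's code. The idea is to keep arguments and file names as `OsString`/`OsStr` and convert to text only for display.

```rust
use std::ffi::{OsStr, OsString};
use std::path::Path;

/// Hypothetical helper: collect command-line arguments without requiring
/// valid Unicode. `std::env::args()` panics on invalid Unicode, while
/// `std::env::args_os()` does not.
fn collect_args() -> Vec<OsString> {
    std::env::args_os().skip(1).collect()
}

/// Hypothetical helper: get the final path component as an `OsStr`,
/// so no UTF-8 validation is forced on the file name.
fn file_name_of(path: &Path) -> Option<&OsStr> {
    path.file_name()
}

/// Hypothetical helper: produce a printable name. This is lossy by design
/// and should be used for display only, never for renaming.
fn display_name(name: &OsStr) -> String {
    name.to_string_lossy().into_owned()
}
```

With these, a caller could iterate over `collect_args()`, wrap each argument in `Path::new(&arg)`, and pass the result of `file_name_of` around unchanged until it has to be printed.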


ismaelgv · 2018-08-01T23:53:51Z

Right now it is not possible to convert an OsStr or OsString to &[u8] on Windows, for use in regex::bytes::Regex::replace, without losing information. For example, ripgrep uses a to_string_lossy conversion to obtain a &[u8] on Windows.
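To illustrate the kind of loss meant here, the following platform-independent sketch uses `String::from_utf16_lossy` as a stand-in for what a lossy conversion does to ill-formed UTF-16. The helper name `lossy_bytes` is hypothetical. Two different inputs, each containing a different unpaired surrogate, end up as the same bytes, so the original name can no longer be recovered.

```rust
/// Hypothetical helper: lossy conversion of UTF-16 code units to UTF-8 bytes.
/// Every unpaired surrogate is replaced by U+FFFD (the replacement character).
fn lossy_bytes(wide: &[u16]) -> Vec<u8> {
    String::from_utf16_lossy(wide).into_bytes()
}

/// Returns true when two distinct UTF-16 inputs collapse to the same bytes.
fn collides(a: &[u16], b: &[u16]) -> bool {
    a != b && lossy_bytes(a) == lossy_bytes(b)
}
```

For instance, `collides(&[0x61, 0xD800], &[0x61, 0xDC00])` holds: both inputs become the bytes of `"a\u{FFFD}"`.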

BurntSushi · 2018-09-05T00:07:54Z

Yeah, this is something I've always wondered about. So far, I haven't had anyone complain about cases where information is lost, i.e., when there's an invalid UTF-16 file path on Windows. One presumes that this might be so infrequent that it may not be a blocking problem in practice.

Getting a real fix for this is tricky. One possibility is to use the underlying representation of an OsStr (which is WTF-8 on Windows), but this is not part of the public API. Another possibility is to re-create WTF-8 decoding outside of std using the Windows version of the OsStrExt trait. This incurs a second WTF-8 decoding step, but it's no worse than the lossy UTF-8 decoding that I'm already doing.
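The second possibility might look roughly like the sketch below. It is an assumption about how such a conversion could be written, not code from std, ripgrep, or this project, and the names `wide_to_wtf8` and `push_code_point` are hypothetical. On Windows the input would come from `std::os::windows::ffi::OsStrExt::encode_wide`; the function itself takes any iterator of `u16`, so it compiles on every platform. Unpaired surrogates are kept as three-byte sequences instead of being replaced.

```rust
use std::char::decode_utf16;

/// Hypothetical helper: encode potentially ill-formed UTF-16 as WTF-8 bytes.
/// Valid surrogate pairs become one four-byte sequence; unpaired surrogates
/// are encoded as themselves (three bytes) rather than replaced by U+FFFD.
fn wide_to_wtf8<I: IntoIterator<Item = u16>>(wide: I) -> Vec<u8> {
    let mut out = Vec::new();
    for unit in decode_utf16(wide) {
        let code_point = match unit {
            Ok(c) => c as u32,
            // Keep the lone surrogate instead of discarding it.
            Err(e) => e.unpaired_surrogate() as u32,
        };
        push_code_point(&mut out, code_point);
    }
    out
}

/// Generalized UTF-8 encoding of one code point (surrogates allowed).
fn push_code_point(out: &mut Vec<u8>, cp: u32) {
    match cp {
        0..=0x7F => out.push(cp as u8),
        0x80..=0x7FF => {
            out.push(0xC0 | (cp >> 6) as u8);
            out.push(0x80 | (cp & 0x3F) as u8);
        }
        0x800..=0xFFFF => {
            out.push(0xE0 | (cp >> 12) as u8);
            out.push(0x80 | ((cp >> 6) & 0x3F) as u8);
            out.push(0x80 | (cp & 0x3F) as u8);
        }
        _ => {
            out.push(0xF0 | (cp >> 18) as u8);
            out.push(0x80 | ((cp >> 12) & 0x3F) as u8);
            out.push(0x80 | ((cp >> 6) & 0x3F) as u8);
            out.push(0x80 | (cp & 0x3F) as u8);
        }
    }
}
```

The resulting bytes are not guaranteed to be valid UTF-8, which is why they would have to go through a byte-oriented API such as regex::bytes and then be converted back to UTF-16 before being handed to the OS. That reverse step is not shown here.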

ismaelgv added the enhancement New feature or request label Jul 12, 2018

ismaelgv added this to the long-term milestone Jul 12, 2018

ismaelgv modified the milestones: long-term, 0.2 Aug 1, 2018

ismaelgv mentioned this issue Nov 27, 2021

ascii restriction mode #24

Closed
