Should \R match \u001bE? #4

Open
fstirlitz opened this issue Jun 21, 2022 · 0 comments

One of the code points that are supposed to be matched by \R is <NL>, that is U+0085, which is the C1 control code NEXT LINE (NEL). The definition of <NL> is missing from the specification text, but is implied by the contents of the README.

However, C1 control codes have an alternative representation using ASCII code points: U+0085 can also be written as U+001B U+0045 (ESC E). Terminal emulators that support the former as a line-ending character tend to also support the latter (e.g. VTE):

$ printf 'qwe\x1bErty\nabc\xc2\x85def\n'
qwe
rty
abc
def

Some, in fact, only support the latter (e.g. xterm, the native Linux console subsystem):

$ printf 'qwe\x1bErty\nabc\xc2\x85def\n'
qwe
rty
abcdef
$ printf 'qwe\x1bErty\nabc\xc2\x85def\n'
qwe
rty
abc◈def
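The correspondence between the two forms is the general 7-bit encoding of C1 controls from ECMA-48: ESC followed by the byte that is 0x40 lower than the C1 code. A minimal sketch of that mapping follows; `c1ToEscapeSequence` is a hypothetical helper name used only for illustration, not something defined by the proposal.

```javascript
// Sketch: 7-bit equivalent of a C1 control code (ECMA-48 "ESC Fe" form).
// c1ToEscapeSequence is a hypothetical name, used here for illustration only.
function c1ToEscapeSequence(codePoint) {
  if (codePoint < 0x80 || codePoint > 0x9f) {
    throw new RangeError("not a C1 control code");
  }
  // ESC followed by the character 0x40 below the C1 code point.
  return "\x1b" + String.fromCharCode(codePoint - 0x40);
}

// NEL (U+0085) maps to ESC followed by 0x45, which is "E".
console.log(JSON.stringify(c1ToEscapeSequence(0x85))); // "\u001bE"
```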

As such, U+0085 can be considered equivalent to (or at least no better than) U+001B U+0045, and it is inconsistent to recognise the former but not the latter. U+001B U+0045 should therefore be included as a recognised line-ending sequence.
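To make the inconsistency concrete, here is a rough emulation of the two behaviours with ordinary regular expressions. Both patterns are assumptions: `R_CURRENT` approximates `\R` using the line-boundary set from UTS #18 (the proposal's actual definition may differ), and `R_WITH_ESC_E` is a hypothetical variant that also accepts ESC E.

```javascript
// Assumed approximation of \R (UTS #18 line-boundary set); the proposal's
// real definition may differ.
const R_CURRENT = /\r\n|[\n\v\f\r\u0085\u2028\u2029]/gu;
// Hypothetical variant that also treats the 7-bit form of NEL (ESC E)
// as a line ending.
const R_WITH_ESC_E = /\r\n|\x1bE|[\n\v\f\r\u0085\u2028\u2029]/gu;

// Same input as the printf examples above.
const text = "qwe\x1bErty\nabc\u0085def\n";

// ESC E is not a separator here, so "qwe<ESC>Erty" stays in one piece.
console.log(text.split(R_CURRENT).length);    // 4
// With ESC E recognised, "qwe" and "rty" become separate lines.
console.log(text.split(R_WITH_ESC_E).length); // 5
```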

On the other hand, the inclusion of NEL (in either form) makes the escape not align with ^ and $ in mu mode, despite the claim in the README. So perhaps removing NEL altogether is also an option.
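The mismatch with `^` and `$` can be checked in any current engine. The sketch below reads "mu mode" as the `m` and `u` flags together, which is an assumption; `countLineStarts` is a hypothetical helper. In ECMAScript, multiline `^` and `$` only recognise LF, CR, U+2028 and U+2029, so neither form of NEL starts a new line.

```javascript
// With the m and u flags, ^ matches at the start of input and after any
// ECMAScript LineTerminator: LF, CR, U+2028, U+2029.
// countLineStarts is a hypothetical helper for illustration.
const countLineStarts = (s) => [...s.matchAll(/^/gmu)].length;

console.log(countLineStarts("abc\ndef"));     // 2
console.log(countLineStarts("abc\u2028def")); // 2
// NEL is not a LineTerminator, in either representation.
console.log(countLineStarts("abc\u0085def")); // 1
console.log(countLineStarts("abc\x1bEdef"));  // 1
```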

Which is it going to be?
