LF and CRLF incompatibilities in string/symbol/heredoc/char literals #13903

Open
BlobCodes opened this issue Oct 24, 2023 · 1 comment

@BlobCodes
Contributor

Related to #5831
Related to #13772

Discussion

Status Quo

Currently, when parsing string/symbol/heredoc literals, the Crystal compiler copies most of the text between the delimiters verbatim.
This includes line breaks, which is what allows multi-line strings:

a = "hello
world"
p a # => "hello\nworld"

Notably, the compiler does not normalize line endings here: it copies whichever one the source file uses.
If the editor I used to write this code had CRLF as its default line ending,
the code would have output "hello\r\nworld".
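
To make the difference concrete, here is a minimal sketch that writes out both values with explicit escapes, so it behaves the same whichever line ending the source file itself uses. The variable names are only illustrative:

```crystal
# The value of the two-line literal above when the source file uses LF...
lf_value = "hello\nworld"
# ...and when the same file has been saved with CRLF line endings.
crlf_value = "hello\r\nworld"

p lf_value.bytesize      # => 11
p crlf_value.bytesize    # => 12
p lf_value == crlf_value # => false
```

Any comparison, hash or length check on such a literal can therefore change when the file's line endings are converted.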

Problem

When developing on Windows, many tools choose CRLF as their default line ending, and some even convert between CRLF and LF automatically. For example, Windows developers often have the core.autocrlf Git setting enabled globally, which converts LF to CRLF on checkout and would cause exactly this behaviour.

I think it is important that there is no implicit change of behaviour when switching between these two line endings.

While this issue has already been mentioned in #5831, the referenced issue is closed and more concerned with backslash line endings:
#5831 (comment)

Proposal

I think the Crystal compiler should always treat all CRLFs as LFs, removing this implicit behavioural difference between platforms.

To back up my point, Ruby already behaves this way:

⬢[blobcodes@toolbox ~]$ crystal eval 'File.write("./temp.cr", "a=\"hello\r\nworld\";p a")'
⬢[blobcodes@toolbox ~]$ ruby ./temp.cr
"hello\nworld"
⬢[blobcodes@toolbox ~]$ crystal ./temp.cr
"hello\r\nworld"
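
Until the compiler does this itself, the effect of the proposal can be approximated in user code. The following is only a sketch: normalize_newlines is a hypothetical helper, not part of the standard library, and it works on the resulting string at runtime rather than in the lexer:

```crystal
# Hypothetical helper (not part of the Crystal standard library):
# replaces every CRLF pair with a single LF.
def normalize_newlines(text : String) : String
  text.gsub("\r\n", "\n")
end

# Value the literal currently has when the source file uses CRLF.
from_crlf_file = "hello\r\nworld"
# Value the literal has when the source file uses LF.
from_lf_file = "hello\nworld"

p normalize_newlines(from_crlf_file) == from_lf_file # => true
p normalize_newlines(from_lf_file) == from_lf_file   # => true
```

Unlike the proposed compiler change, this also rewrites CRLFs that were written deliberately as "\r\n" escapes, so it is not an exact equivalent.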

Appendix A: Char literals

As an esoteric example, the following code does not compile when converted to CRLF:

'
'

Just these two single quotes separated by a line break could keep Windows users from compiling a program:

In temp.cr:1:1

 1 | '
     ^
Error: unterminated char literal, use double quotes for strings

Appendix B: Example program

Here's a more realistic scenario where this behaviour leads to a frustrating user experience:

A student has the task of communicating with an FTP server via TCP to understand the protocol, for which he uses Crystal.
He then writes the following program:

require "socket"

request = <<-EOF
USER demo
PASS password
HELP
QUIT

EOF

client = TCPSocket.new("test.rebex.net", 21)
client << request
while package = client.gets
  puts package
end
client.close

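While it works for him (a Windows user), sharing this program with another student on a Unix system results in the FTP server just not responding: FTP commands must be terminated by CRLF, and with LF line endings in the source file the heredoc only contains bare LFs.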
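
For comparison, a minimal sketch of how the request could be built so that it does not depend on the source file's line endings at all. It assumes the same four commands as above, and the commands variable is only illustrative:

```crystal
# Terminate every command with an explicit CRLF escape instead of relying
# on the line endings of the source file.
commands = ["USER demo", "PASS password", "HELP", "QUIT"]
request = commands.map { |command| command + "\r\n" }.join

p request # => "USER demo\r\nPASS password\r\nHELP\r\nQUIT\r\n"
```

This produces the same bytes on every platform, but it is exactly the kind of detail a beginner would not expect to have to think about.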

@straight-shoota
Member

This sounds very reasonable
