Add chomp option to gets, lines, each_line #3704
if bytesize > 0 && buffer[bytesize - 1] == byte
  back(1)

  if byte === '\n' && bytesize > 0 && buffer[bytesize - 1] === '\r'
This feels like a hack to me. Why not two methods, with the argless one handling \r\n?
At least Ruby always removes \r when chomping \n, and in the argless version too:
$ irb
irb(main):001:0> "foo\r\n".chomp("\n")
=> "foo"
irb(main):002:0> "foo\r\n".chomp
=> "foo"
So I think it's fine.
@asterite I think that's confusing behaviour personally, and I can imagine cases where you would want to chomp \n and not \r\n. If you want to chomp both \n and \r\n, why not just use chomp with no args? Copying Ruby here seems ugly.
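To make the distinction concrete in Ruby terms: chomp("\n") also strips a preceding \r, so removing only the final \n needs a different call. A minimal sketch, assuming Ruby 2.5 or later for String#delete_suffix (the variable names are illustrative only):

```ruby
line = "foo\r\n"

# chomp("\n") also removes the "\r" that precedes the newline.
chomped = line.chomp("\n")

# delete_suffix removes exactly the given suffix and nothing else,
# so the "\r" survives.
only_lf_removed = line.delete_suffix("\n")

p chomped          # "foo"
p only_lf_removed  # "foo\r"
```

This is only the Ruby side of the comparison; it says nothing about how Crystal should behave.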
Agree with @RX14. Ruby's behaviour in that first example is a bug IMO.
For me it's OK. I'll merge this; please send a PR to fix this if you find a way.
The problem is that gets is equivalent to gets('\n'), so passing no delimiter is equivalent to passing \n as a delimiter. If \n is the delimiter and the line ends with \r\n, then \r\n is also removed. There's currently no way to distinguish between "no delimiter is specified" and "\n was specified as a delimiter". To do this, we'd have to change the general gets(delimiter, limit, chomp) method to be something like gets(delimiter, limit, chomp, argless), or do a separate implementation for the argless case in several IO types.
I personally don't see a use case where you'd want to remove \n but keep the \r, but if you do find such a case, send a PR with all the necessary changes.
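One way to picture the "argless" distinction discussed here is a sentinel default value. The sketch below is in Ruby rather than Crystal, and both NOT_GIVEN and read_chomped are hypothetical names invented for illustration; it is not how the PR or either standard library implements gets.

```ruby
require "stringio"

# Hypothetical sentinel meaning "the caller passed no delimiter at all".
NOT_GIVEN = Object.new.freeze

# Hypothetical helper, not part of Ruby or Crystal.
# - no delimiter: read a line and chomp it (String#chomp strips "\n",
#   "\r\n", or a lone trailing "\r")
# - explicit delimiter: strip exactly that delimiter and nothing more
def read_chomped(io, delimiter = NOT_GIVEN)
  if delimiter.equal?(NOT_GIVEN)
    line = io.gets("\n")
    line && line.chomp
  else
    line = io.gets(delimiter)
    line && line.delete_suffix(delimiter)
  end
end

argless  = read_chomped(StringIO.new("foo\r\nbar"))
explicit = read_chomped(StringIO.new("foo\r\nbar"), "\n")

p argless   # "foo"
p explicit  # "foo\r"
```

The cost is the one described above: every reader has to carry the extra "was a delimiter given?" state through its implementation.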
# Moves the write pointer, and the resulting string bytesize,
# by the given amount
def back(amount : Int)
  unless 0 <= amount < @bytesize
Shouldn't amount == @bytesize be allowed, to reset the builder to the beginning (@bytesize == 0)?
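The inclusive bound suggested in this comment can be sketched as follows. TinyBuilder is a hypothetical Ruby stand-in for a string builder, written only to show the bounds check; it is not the Crystal implementation under review.

```ruby
# Hypothetical stand-in for a string builder; names are illustrative only.
class TinyBuilder
  attr_reader :bytesize

  def initialize
    @buffer = +""
    @bytesize = 0
  end

  def <<(str)
    @buffer << str
    @bytesize = @buffer.bytesize
    self
  end

  # Moves the write pointer back by `amount` bytes.
  # The upper bound is inclusive: amount == bytesize empties the builder.
  def back(amount)
    unless amount.between?(0, @bytesize)
      raise ArgumentError, "invalid back amount"
    end
    @buffer = @buffer.byteslice(0, @bytesize - amount)
    @bytesize -= amount
    self
  end

  def to_s
    @buffer.dup
  end
end

builder = TinyBuilder.new
builder << "foo\n"
builder.back(1)                 # drop the trailing newline
after_one = builder.to_s
builder.back(builder.bytesize)  # allowed: resets to the beginning
after_reset = builder.to_s
```

With a strict `<` check, the last call would raise even though rewinding to an empty builder is a sensible request.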
Great to see this! I'd implemented a version of this myself (as Re: setting
I'll merge this. Before 1.0 we can tweak the default
@asterite this PR broke docs rendering :/ I've found it out by doing
@Sija I'm doing
@Sija Fixed! (I think ^_^)
@asterite yup, works for me :)
This adds an optional chomp argument to IO#gets, IO#each_line, String#lines, String#each_line and generally to every method that reads lines, or by line.
This feature was added recently to Ruby 2.4, and honestly I always wanted to be able to do this, both in Ruby and Crystal, so keeping the way Ruby does it is good (I also imagined the same API before Ruby added it).
There's a slight difference with Ruby, though. The default chomp value in Ruby is always false. This is to preserve backwards compatibility, I guess. In our case breaking backwards compatibility is fine before 1.0. So, the rules are:
- gets (with no arguments), each_line and lines assume chomp = true
- gets overloads that take an argument (a delimiter or a limit) assume chomp = false
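For comparison, this is roughly what the Ruby 2.4 side looks like, where the keyword has to be passed explicitly because it defaults to false. A small sketch using only standard-library calls; the sample text and variable names are made up for illustration.

```ruby
require "stringio"

text = "foo\nbar\n"

# Ruby's default is chomp: false, so line endings are kept.
kept = text.lines
# Opting in strips the trailing newline from every line.
stripped = text.lines(chomp: true)

# The same keyword works when reading from an IO-like object.
io = StringIO.new(text)
first = io.gets(chomp: true)
second = io.gets

p kept      # ["foo\n", "bar\n"]
p stripped  # ["foo", "bar"]
p first     # "foo"
p second    # "bar\n"
```

Under the rules proposed in this PR, Crystal's argless gets, each_line and lines would behave like the chomp: true calls without the caller having to ask.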
The idea is that argless gets is semantically equivalent to reading a line. So in every case where we read a line, we chomp it if it ends with a newline (or "\r\n"). If we pass an argument to gets, for example gets('a'), we usually want to read up to, and including, that delimiter. Same goes with gets(3): we want to read a String of 3 bytes, please don't chomp it by default.
Note: I wouldn't mind having chomp = true in every case. I actually don't know what use cases exist for gets with a delimiter. If we make it always true by default it's maybe more consistent, and one can always pass chomp = false to disable this behaviour, so please share your thoughts and use cases about this.
In summary:
Of course one can always do gets.try &.chomp. The difference is that this will allocate two strings: one that is read, and one chomped. So chomp: true can also improve performance a bit if we don't care about the ending newline (and I'd say in most cases we don't).
As a side story, I was curious about what happens in these cases in Ruby 2.4:
The first one returns an empty array. The second and third ones return an array with a single empty string. Well, almost, the second case actually triggers a segmentation fault, which I just reported. I just wanted to share this little story because it's a good change of air, from receiving segmentation faults and bug reports to Crystal, to be able to actually report one in another project (and help it become better, of course) 😸
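The gets.try &.chomp comparison made earlier in this description has a close Ruby analogue, sketched here with the safe-navigation operator (Ruby 2.3+) and the chomp: keyword (Ruby 2.4+). It only shows that both spellings give the same result; the allocation claim is the PR author's about Crystal, and nothing is measured here.

```ruby
require "stringio"

# Two-step version: gets returns "foo\n", then chomp returns a second string "foo".
two_step = StringIO.new("foo\nbar").gets&.chomp

# One-step version: the line comes back already chomped.
one_step = StringIO.new("foo\nbar").gets(chomp: true)

# At end of input both spellings return nil instead of raising.
at_eof_two_step = StringIO.new("").gets&.chomp
at_eof_one_step = StringIO.new("").gets(chomp: true)

p two_step  # "foo"
p one_step  # "foo"
```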