Parse integer division operator #6470

bcardiff · 2018-07-30T19:19:52Z

In the path to implement #2968 first the operators needs to be allowed.

This PR allows // ~~and &//~~ operators to be defined.
The behavior will need to wait to the next release.
// will raise on overflow ~~and &// won't. Similar to + vs &+.~~

// will continue to mean empty regex when possible:

foo // bar is equivalent to foo.//(bar)
foo // is a syntax error (from now on)
foo(//) is a call passing empty regex

The ASTNode#to_s will use %r() as empty regex always since // will not always work.

There are some refactors on specs and code for allowing operators to be used as methods and macro names introducing a state ivar @wants_def_or_macro_name that works similar to @slash_is_regex configuring the behavior of the lexer depending on the context.

straight-shoota · 2018-07-30T20:23:19Z

// will be allowed to overflow and &// won't.

Isn't it the other way around?

bcardiff · 2018-07-30T21:00:30Z

Yes @straight-shoota I mean to overflow as "raise overflow". &// won't raise. I've just edited for clarity.

RX14 · 2018-07-31T00:07:48Z

&//=

This reminds me of perl. I don't like it.

RX14 · 2018-07-31T00:13:30Z

I really don't like all these new operators being proposed. Wrapped(T) will work fine, and integer division works fine as-is. Why change?

I really don't think we should introduce all these operators, they're just more things for a newcomer to learn to save a few characters typing (.to_f) in a fairly rare situation. I think it's a bad tradeoff.

bcardiff · 2018-07-31T13:49:59Z

I would expect that from all the operators added in this PR // will be at far the most used. The rest are for consistency. Implementing only // would leave things unfinished in this topic.

The division operator / can be improved, this has been discussed already in #2968 and #6223. The main issue is semantic one. / performs different operators depending on the numeric type. The fact that we are used to an overload behaviour of / does not mean it can't be improved.

Adding .to_i/.to_f to force one or another will make the code use more (or less) precision that it should because .to_i/.to_f has fixed output type.

The wrapper api, can be implemented on top of this. But not the other way around. So operators are more powerful and expressive.

For wrapper to work well with overloads you will need Wrapper(T) < T which is something that is not supported. Otherwise the type is used only locally without passing them around.

The discussion left (hence this PR) is not about if the language needs to add this, but if the implementation is good enough.

RX14 · 2018-07-31T14:53:03Z

Wrapping(T) is a discussion for another issue, but i'm yet to see a place where numbers with wrapping arithmetic performed on them are actually passed around significantly outside of a single class. I don't see a reason why ease of use of unchecked arithmetic is a big deal because it's just so rare. Adding a whole set of new operators just for something so rarely used seems to be like a gross premature optimization and a bad idea.

Back to this issue, I'm not sure what your concern is about .to_i/.to_f, because the current method clearly works, because it's the current method. I haven't seen any friction in that area and I've been on IRC helping newbies for a long time. I can't see the motivation.

I'm really against adding these operators because what we have now just isn't broken, so lets not fix it. Change for the sake of change is a bad thing.

RX14 · 2018-07-31T14:59:13Z

I just want data that this is actually a problem in the real world, and not a solution to a problem that we've invented in our heads.

bcardiff · 2018-07-31T16:56:15Z

Back to this issue, I'm not sure what your concern is about .to_i/.to_f, because the current method clearly works, because it's the current method.

Let's use an example, if avg method wants to be coded that will work with any kind of numbers you will end up adding a to_f to force non-integer division, because / currently is both (int and non-int). I think that that already needs to be improved because it can be more clear.

The to_f will make the result of the operation Float64, but if the input numbers are Float32 then why the operation needs to be carried on 64 bits? Only because the lack / to express the floating point division in the most appropriate type given the operands.

Again, the fact that we are used to C like math operators does not make it the best option.

RX14 · 2018-07-31T19:41:37Z

@bcardiff I think issues like that are fairly rare, and val = val.to_f unless val.is_a? Float is a very short snippet for the small amount of methods that operate on generic Numbers.

bcardiff · 2018-07-31T22:54:02Z

That snippet is a workaround and harder to come up when learning the language. Having different operators is cleaner. Something new to learn today, yes. But is a more consistent set of operators.

Try to explain that line when coding a method that manipulate numbers vs explain that / is floating point division and // is floor division always.

Splitting / and // was already agreed and pros/cons were exposed.
Leaving // to raise upon overflow and avoid adding &// seems incomplete.
Excluding //= and &//= when /= is already supported seems incomplete also.

I could understand removing 3 from the PR, if and only if /= is also removed. But it seems arbitrary since +=, &+=, etc are allowed. So they are implemented.

RX14 · 2018-08-01T09:44:08Z

@bcardiff it's not just something new to learn today, it's something new that every newcomer from now until forever will have to learn.

At the end of the day though, I'm more concerned about the overflow stuff adding a bazillion more operators than this PR only adding two.

asterite · 2018-08-01T10:04:27Z

I think having / and // operators is more intuitive than just / (many people find it strange that 1/2 == 0)
The & operators will be used in very specific circumstances, so this doesn't add a lot of complexity. Much less with division.
I think the decision about introducing this // operator is already taken. Now it's time to review the code.

wooster0 · 2018-08-01T10:15:18Z

src/compiler/crystal/syntax/to_s.cr

-      when StringInterpolation
-        visit_interpolation(exp) { |s| Regex.append_source s, @str }
+      if (exp = node.value).is_a?(StringLiteral) && exp.value.empty?
+        # // is not allways an empty regex, sometimes is an operator


Typo: allways -> always

j8r

Generally there is no point of using case statements for only one single condition vs if.

There is strictly no performance difference between the two. However using a case adds one more line to the code because the condition is splitted in two lines, which is IMO less clear than on one line with if

j8r · 2018-08-01T10:57:53Z

spec/compiler/lexer/lexer_spec.cr

  it "lexes #{string.inspect}" do
    lexer = Lexer.new string
+    unless (v = slash_is_regex).nil?


Replace simply by if v = slash_is_regex

This is not the same. slash_is_regex is Bool? so you need to test for .nil?

Ok 👍 - the language evaluate if the expression is true and not nil, and we just want to know if it isn't nil.

That would be abusing that the default value of slash_is_regex is false.

j8r · 2018-08-01T11:24:37Z

src/compiler/crystal/syntax/lexer.cr

@@ -287,8 +288,17 @@ module Crystal
        line = @line_number
        column = @column_number
        char = next_char
-        if !@slash_is_regex && char == '='
+        if (@wants_def_or_macro_name || !@slash_is_regex) && char == '/'
+          case next_char


Can be replaced by a smaller if statement

Using the case is more about following the pattern of the rest of the lexer.
But I don't follow which is the proposed smaller if.

This is fine - this just makes the lexer looks a bit more complex that it is, that's it :)
Here it would be

if next_char == '/' next_char :"//=" else @token.type = :"//" end

I thought you ment the if (@wants_def_or_macro_name ....
There are some if in the lexer, bust mostly case even single when-ed (?) case. I found them easier to read the lexer if all the structures are the same and not case by case based that we will end up having if your suggestions are applied.

At there is no real change the the actual code.

So case are kept. But thanks.

j8r · 2018-08-01T11:24:59Z

src/compiler/crystal/syntax/lexer.cr

+        when '/'
+          case next_char
+          when '/'
+            case next_char


With ifs it would be (start at line 648, replacing the 2 cases):

raise "&/ is not a valid operator. Did you mean &// ?" if next_char != '/' if next_char == '=' next_char :"&//=" else @token.type = :"&//" end

At a note, the previous code was:

case next_char when '/' case next_char when '=' next_char :"&//=" else @token.type = :"&//" end else raise "&/ is not a valid operator. Did you mean &// ?" end

j8r · 2018-08-01T11:25:28Z

src/compiler/crystal/syntax/lexer.cr

@@ -470,6 +489,13 @@ module Crystal
            else
              symbol "&*"
            end
+          when '/'
+            case next_char


And here it would be:

if next_char == '/' next_char_and_symbol "&//" else symbol "&/" end

j8r · 2018-08-01T11:25:37Z

src/compiler/crystal/syntax/lexer.cr

@@ -412,7 +426,12 @@ module Crystal
            symbol "*"
          end
        when '/'
-          next_char_and_symbol "/"
+          case next_char


ditto ditto :-)

It would be here

if next_char == '/' next_char_and_symbol "//" else symbol "/" end

ysbaddaden

Let's merge!

I regularly fumble on this in Ruby or put to_f in too many places because I can't trust / to rrturn a float. This is inconsistent and I'm happy we finally fix this in Crystal, with a minimal change for floor division (just add a /).

@RX14 points were already made/refuted in related issues.

RX14 · 2018-08-01T14:46:42Z

I forgot we'd already merged all the &+ overflow operators. Since we've already - in my opinion - gone overboard with the number of operators in the language you might as well go and merge this PR.

oprypin · 2018-08-01T18:06:28Z

I would prefer &//= to be removed, along with &+= &-= &*=. If someone wants to be this specific, they can write a = a &+ 1 rather than a &+= 1.

And if anyone else is thinking: how can integer division overflow? 🤔
@RX14 had to explain to me that it happens (only?) with Int32::MIN / -1

bcardiff · 2018-08-01T19:33:15Z

I fixed the typo and rebased. !

@oprypin I will still raise the card of completeness regarding the &+= and others. And they will be actually used in Hasher and others. All the op= are syntax sugar and not a core part of the language.

asterite · 2018-08-01T19:37:59Z

By the way: https://stackoverflow.com/questions/30378591/overflow-division-operator-not-recognized

I would recommend not having the &/ operator in Crystal, just like in Swift. It's an extra check for something that can rarely happen.

RX14 · 2018-08-01T19:56:13Z

completeness for completeness's sake seems like a very bad idea to me. It just expands the set of operators.

I'd be much happier if the only operators we were adding were &+, &-, &*, // and //=.

bcardiff · 2018-08-01T20:09:52Z

Since division by zero would still be raised in &// is not that that operator is free of exception.
Ok, I will remove &// and &//=.

Refactor specs to assert operators all together

Allow // to be used as empty regex in some context only. Use %r() for empty regex when in to_s of AST.

the position of :"[]?" was off, new line added to compare them easily

DRY logic of allowed operators names for def and macro

sdogruyol

Great work 💯 Thank you @bcardiff 👍

bcardiff added topic:compiler:parser breaking-change labels Jul 30, 2018

bcardiff added this to the 0.26.0 milestone Jul 30, 2018

wooster0 reviewed Aug 1, 2018

View reviewed changes

j8r reviewed Aug 1, 2018

View reviewed changes

ysbaddaden approved these changes Aug 1, 2018

View reviewed changes

bcardiff force-pushed the feature/parse-int-div-ops branch from def342d to a85334f Compare August 1, 2018 19:29

bcardiff added 4 commits August 2, 2018 01:13

Add lexer of // //= operators

5dbd6ea

Refactor specs to assert operators all together

Add // //= operators parsing

0ebf161

Allow // to be used as empty regex in some context only. Use %r() for empty regex when in to_s of AST.

Reorganize DefOrMacroCheck constants

d39b1a4

the position of :"[]?" was off, new line added to compare them easily

Refactor: simplify parsing of def and macro name

5cb2886

DRY logic of allowed operators names for def and macro

bcardiff force-pushed the feature/parse-int-div-ops branch from a85334f to 5cb2886 Compare August 2, 2018 04:44

bcardiff changed the title ~~Parse integer division operators~~ Parse integer division operator Aug 2, 2018

sdogruyol approved these changes Aug 2, 2018

View reviewed changes

AlexWayfer mentioned this pull request Aug 3, 2018

Make / be float division #2968

Closed

bcardiff merged commit db838c7 into crystal-lang:master Aug 4, 2018

RX14 added the topic:compiler label Aug 7, 2018

Sija mentioned this pull request Sep 21, 2018

0.26 macro fails to accept keyword arguments with empty regex #6761

Closed

bcardiff mentioned this pull request Sep 28, 2018

Add floor divisions operator #6891

Merged

Parse integer division operator #6470

Parse integer division operator #6470

Conversation

bcardiff commented Jul 30, 2018 • edited Loading

straight-shoota commented Jul 30, 2018

bcardiff commented Jul 30, 2018

RX14 commented Jul 31, 2018 • edited Loading

RX14 commented Jul 31, 2018

bcardiff commented Jul 31, 2018

RX14 commented Jul 31, 2018 • edited Loading

RX14 commented Jul 31, 2018 • edited Loading

bcardiff commented Jul 31, 2018

RX14 commented Jul 31, 2018 • edited Loading

bcardiff commented Jul 31, 2018

RX14 commented Aug 1, 2018

asterite commented Aug 1, 2018

Choose a reason for hiding this comment

j8r left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ysbaddaden left a comment

Choose a reason for hiding this comment

RX14 commented Aug 1, 2018 • edited Loading

oprypin commented Aug 1, 2018

bcardiff commented Aug 1, 2018

asterite commented Aug 1, 2018

RX14 commented Aug 1, 2018

bcardiff commented Aug 1, 2018

sdogruyol left a comment

Choose a reason for hiding this comment

bcardiff commented Jul 30, 2018 •

edited

Loading

RX14 commented Jul 31, 2018 •

edited

Loading

RX14 commented Jul 31, 2018 •

edited

Loading

RX14 commented Jul 31, 2018 •

edited

Loading

RX14 commented Jul 31, 2018 •

edited

Loading

RX14 commented Aug 1, 2018 •

edited

Loading