Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse integer division operator #6470

Merged
merged 4 commits into from
Aug 4, 2018

Conversation

bcardiff
Copy link
Member

@bcardiff bcardiff commented Jul 30, 2018

In the path to implement #2968 first the operators needs to be allowed.

This PR allows // and &// operators to be defined.
The behavior will need to wait to the next release.
// will raise on overflow and &// won't. Similar to + vs &+.

// will continue to mean empty regex when possible:

  • foo // bar is equivalent to foo.//(bar)
  • foo // is a syntax error (from now on)
  • foo(//) is a call passing empty regex

The ASTNode#to_s will use %r() as empty regex always since // will not always work.

There are some refactors on specs and code for allowing operators to be used as methods and macro names introducing a state ivar @wants_def_or_macro_name that works similar to @slash_is_regex configuring the behavior of the lexer depending on the context.

@straight-shoota
Copy link
Member

// will be allowed to overflow and &// won't.

Isn't it the other way around?

@bcardiff
Copy link
Member Author

Yes @straight-shoota I mean to overflow as "raise overflow". &// won't raise. I've just edited for clarity.

@RX14
Copy link
Contributor

RX14 commented Jul 31, 2018

&//=

This reminds me of perl. I don't like it.

@RX14
Copy link
Contributor

RX14 commented Jul 31, 2018

I really don't like all these new operators being proposed. Wrapped(T) will work fine, and integer division works fine as-is. Why change?

I really don't think we should introduce all these operators, they're just more things for a newcomer to learn to save a few characters typing (.to_f) in a fairly rare situation. I think it's a bad tradeoff.

@bcardiff
Copy link
Member Author

I would expect that from all the operators added in this PR // will be at far the most used. The rest are for consistency. Implementing only // would leave things unfinished in this topic.

The division operator / can be improved, this has been discussed already in #2968 and #6223. The main issue is semantic one. / performs different operators depending on the numeric type. The fact that we are used to an overload behaviour of / does not mean it can't be improved.

Adding .to_i/.to_f to force one or another will make the code use more (or less) precision that it should because .to_i/.to_f has fixed output type.

The wrapper api, can be implemented on top of this. But not the other way around. So operators are more powerful and expressive.

For wrapper to work well with overloads you will need Wrapper(T) < T which is something that is not supported. Otherwise the type is used only locally without passing them around.

The discussion left (hence this PR) is not about if the language needs to add this, but if the implementation is good enough.

@RX14
Copy link
Contributor

RX14 commented Jul 31, 2018

Wrapping(T) is a discussion for another issue, but i'm yet to see a place where numbers with wrapping arithmetic performed on them are actually passed around significantly outside of a single class. I don't see a reason why ease of use of unchecked arithmetic is a big deal because it's just so rare. Adding a whole set of new operators just for something so rarely used seems to be like a gross premature optimization and a bad idea.

Back to this issue, I'm not sure what your concern is about .to_i/.to_f, because the current method clearly works, because it's the current method. I haven't seen any friction in that area and I've been on IRC helping newbies for a long time. I can't see the motivation.

I'm really against adding these operators because what we have now just isn't broken, so lets not fix it. Change for the sake of change is a bad thing.

@RX14
Copy link
Contributor

RX14 commented Jul 31, 2018

I just want data that this is actually a problem in the real world, and not a solution to a problem that we've invented in our heads.

@bcardiff
Copy link
Member Author

Back to this issue, I'm not sure what your concern is about .to_i/.to_f, because the current method clearly works, because it's the current method.

Let's use an example, if avg method wants to be coded that will work with any kind of numbers you will end up adding a to_f to force non-integer division, because / currently is both (int and non-int). I think that that already needs to be improved because it can be more clear.

The to_f will make the result of the operation Float64, but if the input numbers are Float32 then why the operation needs to be carried on 64 bits? Only because the lack / to express the floating point division in the most appropriate type given the operands.

Again, the fact that we are used to C like math operators does not make it the best option.

@RX14
Copy link
Contributor

RX14 commented Jul 31, 2018

@bcardiff I think issues like that are fairly rare, and val = val.to_f unless val.is_a? Float is a very short snippet for the small amount of methods that operate on generic Numbers.

@bcardiff
Copy link
Member Author

That snippet is a workaround and harder to come up when learning the language. Having different operators is cleaner. Something new to learn today, yes. But is a more consistent set of operators.

Try to explain that line when coding a method that manipulate numbers vs explain that / is floating point division and // is floor division always.

  1. Splitting / and // was already agreed and pros/cons were exposed.
  2. Leaving // to raise upon overflow and avoid adding &// seems incomplete.
  3. Excluding //= and &//= when /= is already supported seems incomplete also.

I could understand removing 3 from the PR, if and only if /= is also removed. But it seems arbitrary since +=, &+=, etc are allowed. So they are implemented.

@RX14
Copy link
Contributor

RX14 commented Aug 1, 2018

@bcardiff it's not just something new to learn today, it's something new that every newcomer from now until forever will have to learn.

At the end of the day though, I'm more concerned about the overflow stuff adding a bazillion more operators than this PR only adding two.

@asterite
Copy link
Member

asterite commented Aug 1, 2018

  • I think having / and // operators is more intuitive than just / (many people find it strange that 1/2 == 0)
  • The & operators will be used in very specific circumstances, so this doesn't add a lot of complexity. Much less with division.
  • I think the decision about introducing this // operator is already taken. Now it's time to review the code.

when StringInterpolation
visit_interpolation(exp) { |s| Regex.append_source s, @str }
if (exp = node.value).is_a?(StringLiteral) && exp.value.empty?
# // is not allways an empty regex, sometimes is an operator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: allways -> always

Copy link
Contributor

@j8r j8r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally there is no point of using case statements for only one single condition vs if.

There is strictly no performance difference between the two. However using a case adds one more line to the code because the condition is splitted in two lines, which is IMO less clear than on one line with if

it "lexes #{string.inspect}" do
lexer = Lexer.new string
unless (v = slash_is_regex).nil?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace simply by if v = slash_is_regex

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the same. slash_is_regex is Bool? so you need to test for .nil?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok 👍 - the language evaluate if the expression is true and not nil, and we just want to know if it isn't nil.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be abusing that the default value of slash_is_regex is false.

@@ -287,8 +288,17 @@ module Crystal
line = @line_number
column = @column_number
char = next_char
if !@slash_is_regex && char == '='
if (@wants_def_or_macro_name || !@slash_is_regex) && char == '/'
case next_char
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be replaced by a smaller if statement

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the case is more about following the pattern of the rest of the lexer.
But I don't follow which is the proposed smaller if.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine - this just makes the lexer looks a bit more complex that it is, that's it :)
Here it would be

if next_char == '/'
  next_char :"//="
else
  @token.type = :"//"
end

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought you ment the if (@wants_def_or_macro_name ....
There are some if in the lexer, bust mostly case even single when-ed (?) case. I found them easier to read the lexer if all the structures are the same and not case by case based that we will end up having if your suggestions are applied.

At there is no real change the the actual code.

So case are kept. But thanks.

when '/'
case next_char
when '/'
case next_char
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With ifs it would be (start at line 648, replacing the 2 cases):

raise "&/ is not a valid operator. Did you mean &// ?" if next_char != '/'
if next_char == '='
  next_char :"&//="
else
  @token.type = :"&//"
end

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a note, the previous code was:

case next_char
when '/'
  case next_char
  when '='
    next_char :"&//="
  else
    @token.type = :"&//"
  end
else
  raise "&/ is not a valid operator. Did you mean &// ?"
end

@@ -470,6 +489,13 @@ module Crystal
else
symbol "&*"
end
when '/'
case next_char
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here it would be:

if next_char == '/'
  next_char_and_symbol "&//"
else
  symbol "&/"
end

@@ -412,7 +426,12 @@ module Crystal
symbol "*"
end
when '/'
next_char_and_symbol "/"
case next_char
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto ditto :-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be here

if next_char == '/'
  next_char_and_symbol "//"
else
  symbol "/"
end

Copy link
Contributor

@ysbaddaden ysbaddaden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge!

I regularly fumble on this in Ruby or put to_f in too many places because I can't trust / to rrturn a float. This is inconsistent and I'm happy we finally fix this in Crystal, with a minimal change for floor division (just add a /).

@RX14 points were already made/refuted in related issues.

@RX14
Copy link
Contributor

RX14 commented Aug 1, 2018

I forgot we'd already merged all the &+ overflow operators. Since we've already - in my opinion - gone overboard with the number of operators in the language you might as well go and merge this PR.

@oprypin
Copy link
Member

oprypin commented Aug 1, 2018

I would prefer &//= to be removed, along with &+= &-= &*=. If someone wants to be this specific, they can write a = a &+ 1 rather than a &+= 1.

And if anyone else is thinking: how can integer division overflow? 🤔
@RX14 had to explain to me that it happens (only?) with Int32::MIN / -1

@bcardiff bcardiff force-pushed the feature/parse-int-div-ops branch from def342d to a85334f Compare August 1, 2018 19:29
@bcardiff
Copy link
Member Author

bcardiff commented Aug 1, 2018

I fixed the typo and rebased. :shipit: !

@oprypin I will still raise the card of completeness regarding the &+= and others. And they will be actually used in Hasher and others. All the op= are syntax sugar and not a core part of the language.

@asterite
Copy link
Member

asterite commented Aug 1, 2018

By the way: https://stackoverflow.com/questions/30378591/overflow-division-operator-not-recognized

I would recommend not having the &/ operator in Crystal, just like in Swift. It's an extra check for something that can rarely happen.

@RX14
Copy link
Contributor

RX14 commented Aug 1, 2018

completeness for completeness's sake seems like a very bad idea to me. It just expands the set of operators.

I'd be much happier if the only operators we were adding were &+, &-, &*, // and //=.

@bcardiff
Copy link
Member Author

bcardiff commented Aug 1, 2018

Since division by zero would still be raised in &// is not that that operator is free of exception.
Ok, I will remove &// and &//=.

Refactor specs to assert operators all together
Allow // to be used as empty regex in some context only.
Use %r() for empty regex when in to_s of AST.
the position of :"[]?" was off, new line added to compare them easily
DRY logic of allowed operators names for def and macro
@bcardiff bcardiff force-pushed the feature/parse-int-div-ops branch from a85334f to 5cb2886 Compare August 2, 2018 04:44
@bcardiff bcardiff changed the title Parse integer division operators Parse integer division operator Aug 2, 2018
Copy link
Member

@sdogruyol sdogruyol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work 💯 Thank you @bcardiff 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants