Scientific notation for integer literals #10154

Open
straight-shoota opened this issue Dec 29, 2020 · 17 comments
Labels
good first issue (This is an issue suited for newcomers to become acquainted with working on the codebase.) · help wanted (This issue is generally accepted and needs someone to pick it up.) · kind:feature · topic:lang · topic:stdlib:numeric

Comments

@straight-shoota
Member

Currently scientific notation is only supported for float literals: 1e0 == 1.0.
For some use cases with large numbers, it would be great if you could use scientific notation for integer literals. That's more concise than a literal with many digits or casting a float literal to int.
The default scientific notation literals should probably continue to be floats. A type suffix can already designate a different type, but currently that only works for float types: 1e0f32, 1e0f64.

For integers it would just be 1e0i32 (etc.). Obviously only non-negative exponents would be valid for integer types.

Inspiration from https://forum.crystal-lang.org/t/constants-and-compiler/2814/3

straight-shoota added the "help wanted" and "good first issue" labels on Feb 14, 2021
@marcotc

marcotc commented Jun 18, 2021

I'm taking a stab at this.

My plan is to implement support for u8, u16, u32, u64, u128, i8, i16, i32, i64, i128.
The limitations are: only non-negative, integer exponent; only integer coefficient.
No decimal separators are allowed anywhere. A negative coefficient is allowed.
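
The rules in this plan can be sketched as a small parser. This is only an illustration in Python, not the Crystal lexer: the regular expression and the function name `parse_sci_int` are hypothetical, and range checking against the suffix type is deliberately left out.

```python
import re

# Hypothetical grammar for the plan above: optional minus sign, integer
# coefficient, 'e', non-negative integer exponent, mandatory integer suffix.
LITERAL = re.compile(r"(-?)(\d+)e(\d+)([iu](?:8|16|32|64|128))")


def parse_sci_int(text):
    """Return (exact_value, suffix), or raise ValueError for rejected forms."""
    match = LITERAL.fullmatch(text)
    if match is None:
        raise ValueError(f"not an integer scientific literal: {text!r}")
    sign, coefficient, exponent, suffix = match.groups()
    # Pure integer arithmetic, so no rounding can occur.
    value = int(coefficient) * 10 ** int(exponent)
    return (-value if sign else value), suffix


print(parse_sci_int("123e4i64"))
print(parse_sci_int("-5e2i32"))
```

Forms such as 1.5e3i32 (decimal separator), 1e-3i32 (negative exponent) and 1e3 (no integer suffix, so it stays a float literal) are rejected by this sketch.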

@straight-shoota
Member Author

You should be careful about 128-bit integers. We currently support those literals only in the 64-bit value range. See #8373

So essentially, for now they should use the same parsing logic as their 64-bit equivalent, just cast to the 128-bit type.
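
One way to picture that restriction is a range check in which 128-bit suffixes borrow the limits of their 64-bit counterparts. The following Python sketch is an assumption about how such a check could look; `literal_range` and `check_literal` are hypothetical names, not compiler functions.

```python
def literal_range(suffix):
    """Inclusive (min, max) accepted for a literal with the given suffix."""
    signed = suffix[0] == "i"
    bits = int(suffix[1:])
    # Assumption taken from the comment above: 128-bit literals are limited
    # to the value range of the 64-bit type for now.
    bits = min(bits, 64)
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def check_literal(value, suffix):
    """Return value unchanged if it fits, otherwise raise OverflowError."""
    low, high = literal_range(suffix)
    if not low <= value <= high:
        raise OverflowError(f"{value} does not fit the literal range of {suffix}")
    return value


print(literal_range("i128") == literal_range("i64"))  # True
print(check_literal(10 ** 18, "i128"))                # 1000000000000000000
```

Under this assumption a value such as 10 ** 20 would be rejected even with a u128 suffix, because it exceeds the UInt64 range.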

@marcotc

marcotc commented Jun 18, 2021

@straight-shoota Thank you so much for the heads up on 128-bit integers. I'll review #8373 and try to implement it as you suggested here.

@asterite
Member

I don't think this should be done. When someone reads an e in the number, in every other language it means a float. This is very confusing, and the use cases should be sparse. This is also the only case where a literal without a suffix means a float, but it can also have an integer suffix.

@konovod
Contributor

konovod commented Dec 11, 2021

Just my two cents: 123e4i64 isn't very different from 123e4.to_i64, which is already possible. I would even say that the latter is less confusing.
On the other hand, the benefit (apart from saving 4 characters) is that the former is guaranteed to be precise, while the latter could involve a loss of precision. That would be especially visible with 128-bit integers, as there is no Float128.
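
The 128-bit point can be illustrated in Python, whose floats are IEEE 754 doubles and should therefore round the same way as a 64-bit float literal. The literal 1e30i128 in the comment is the hypothetical syntax from this proposal, not something that exists today.

```python
exact = 10 ** 30        # what a literal such as 1e30i128 would denote
via_float = int(1e30)   # float literal first, integer conversion second

# 10**30 fits comfortably in a signed 128-bit integer ...
print(exact < 2 ** 127)    # True
# ... but a double has only 53 bits of mantissa, so the float route
# cannot hold it exactly and the difference is nonzero.
print(via_float - exact)
```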

@straight-shoota
Member Author

@konovod Both expressions are very different regarding precision. Using integer literals with scientific notation is the whole point of this proposal, because converting float literals is imprecise.

@asterite
Member

> because converting float literals is imprecise

You can always write the full integer, right?

Any other language that has this feature?

@BlobCodes
Contributor

BlobCodes commented Dec 11, 2021

> Any other language that has this feature?

  • Wren-lang supports them (even with the decimal point) (they only have one number type)
  • J-lang supports it

source: https://rosettacode.org/wiki/Literals/Integer (I have never heard of these languages before)

@asterite
Member

Can you link to the part that shows that Jlang supports this?

@BlobCodes
Contributor

https://rosettacode.org/wiki/Literals/Integer#J

J also allows integers to be entered using other notations, such as scientific or rational.

   1e2 100r5   
100 20

Internally, J freely converts fixed precision integers to floating point numbers when they overflow, and numbers (including integers) of any type may be combined using any operation where they would individually be valid arguments.

Internally, J represents numeric constants in their simplest type, regardless of how they were specified. In other words 9r1, although it is "specified as a rational" is represented as an extended precision integer. Similarly, 2.0, although it is "specified as a floating point value" is represented as an integer, and 1.0 is represented as a boolean.

@asterite
Member

I'm also very curious why we are doing this when I personally have never needed it, yet I have needed byte literals for chars like a thousand times.

@asterite
Member

Is J-Lang dynamic? Do you have type declarations there?

@BlobCodes
Contributor

There are no type declarations, but it internally still uses ints (and falls back to floats on overflow)
Generally I couldn't find any language that exactly implements this proposal, but you can do very similar things in dynamic languages.

@konovod
Contributor

konovod commented Dec 13, 2021

@straight-shoota

> Both expressions are very different regarding precision. Using integer literals with scientific notation is the whole point of this proposal, because converting float literals is imprecise.

I thought so, but couldn't actually find an example with precision loss.

p 92e17.to_i64, Int64::MAX
p 18e18.to_u64, UInt64::MAX

Outputs

9200000000000000000
9223372036854775807
18000000000000000000
18446744073709551615
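
A possible explanation for why these two values survive the round trip: an integer is exactly representable as a 64-bit float when, after dropping its trailing zero bits, what remains fits in the 53-bit mantissa. Powers of ten contribute many factors of two, so short coefficients tend to pass. The Python sketch below uses a hypothetical helper name and assumes magnitudes far below the float exponent limit.

```python
def exactly_representable(n):
    """True if the non-negative integer n is exactly an IEEE 754 double."""
    if n == 0:
        return True
    trailing_zero_bits = (n & -n).bit_length() - 1
    odd_part = n >> trailing_zero_bits
    return odd_part.bit_length() <= 53


print(exactly_representable(92 * 10 ** 17))  # True
print(exactly_representable(18 * 10 ** 18))  # True
print(exactly_representable(2 ** 53 + 1))    # False
```

Coefficients with more significant digits leave an odd part that no longer fits in 53 bits, and that is where the float route starts to round.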

@oprypin
Member

oprypin commented Dec 13, 2021

@konovod

9020650833e9.to_i64 == 9020650832999999488
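Python code to find these:

import random

# Randomly search for literals m * 10**e whose float value, converted back
# to an integer, no longer ends in zeros (i.e. precision was lost).
# Runs until interrupted.
while True:
    m = random.randint(1, 10_000_000_000)
    for e in range(4, 18):
        expr = f'{m}e{e}'
        x = float(expr)
        if x < 2**63 and not str(int(x)).endswith('000'):
            print(expr, int(x))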
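
The specific value quoted above can also be checked deterministically. This Python sketch compares the exact integer product with the float route; it assumes only that Python floats are IEEE 754 doubles.

```python
exact = 9020650833 * 10 ** 9             # pure integer arithmetic
via_float = int(float("9020650833e9"))   # float literal, then conversion

print(exact)              # 9020650833000000000
print(via_float)          # 9020650832999999488
print(exact - via_float)  # 512
```

At this magnitude adjacent doubles are 1024 apart, so the exact value falls halfway between two representable numbers and is rounded to one of them.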

@asterite
Member

Would you not write that particular integer by hand?

@oprypin
Member

oprypin commented Dec 13, 2021

Yes, you definitely would.

All I'm saying is [anything]e[anything].to_i must never be recommended.

6 participants