Are leading zeros allowed in the exponent part of a float? #356
Comments
I am the initial reporter of the bug avakar/pytoml#9. I would advocate allowing leading zeroes in exponents, as this is common practice in a number of languages. (It allows nicely aligned numbers.) Here are a few examples:
Python 2.7/3.4: print(4.86e-6) # Prints "4.86e-06"
Ruby 2.1: puts(4.86e-6) # Prints 4.86e-06 |
Can someone provide a concrete example where allowing leading zeros is useful? |
Here is my case. I found out that @avakar's TOML parser rejects leading zeros when it failed to parse some of the TOML files that a Python script of mine was producing automatically. It turned out that these were the files in which a parameter had become so small that scientific notation was used for it. (As I said above, Python's |
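For anyone generating TOML from Python in the meantime, here is a minimal sketch of a workaround. It assumes the consuming parser rejects leading zeros in exponents; `strip_exponent_zeros` is a hypothetical helper written for this thread, not part of pytoml or any other library.

```python
import re

# Hypothetical helper: drop leading zeros from the exponent of a float
# literal, keeping the sign and at least one digit.
_EXP_RE = re.compile(r"e([+-]?)0+(\d)")

def strip_exponent_zeros(text: str) -> str:
    return _EXP_RE.sub(r"e\1\2", text)

value = 4.86e-6
default_repr = repr(value)                         # exponent padded to two digits
strict_repr = strip_exponent_zeros(default_repr)   # exponent without padding

print(default_repr, "->", strict_repr)
```

The rewritten literal still parses back to the same float, so only the textual form changes.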
@ziotom78 Oh, I see, I didn't realize that's what your example was pointing out. That's quite curious that Ruby and Python do that. Do you know what the rationale is behind that behavior? |
I'd call that bad design... Reminds me of that old thing, the C strcmp function and the implications it had for sorting... |
I imagine that this way of formatting numbers might allow nicer alignment when printing numbers in columns, though I am not really sure. Python says nothing about this: https://docs.python.org/2/library/string.html#formatspec. (See this question on StackOverflow for some more interesting information: http://stackoverflow.com/questions/9910972/python-number-of-digits-in-exponent.) Interestingly enough, there are cases where this behaviour is intentional and documented; see how .NET does floating-point formatting (https://msdn.microsoft.com/en-us/library/dwhawy9k.aspx#EFormatString):
|
It's a mess... https://en.wikipedia.org/wiki/Leading_zero#0_as_a_prefix Decimal vs. octal, etc. I think C#/VB.net etc. use three digits in the exponent, forcing the user to use custom string formats for 'correct' output.. AFAIK, strictly mathematically, using leading zeros is generally discouraged. Programming languages have different conventions, obviously, so I don't know what an easy and elegant answer to the question at hand would be.. |
I agree that C's way of indicating octal numbers by prefixing them with a zero is really confusing. However, in this case we are discussing whether to allow leading zeroes in exponents. AFAIK, in this case there is no ambiguity at all, as every language I know always interprets exponents as decimal numbers, regardless of any leading zeros. |
In this case I think internal consistency between "integer parts" of the spec outweighs the benefits of accepting the generated output of certain languages. You could argue that integer values should allow leading zeros, which would then propagate to all integer parts, but I just don't agree with allowing such cruft. TOML is designed for readability as a primary goal, and leading zeros are counter to that. I'll submit a clarification to the float language. |
I would argue that C99 is a pretty strong standard and going against it will produce a lot of headaches for a lot of people. One of the reasons is that there is no easy way to control the number of digits in the exponent for output functions like cout / printf, which makes TOML unsuitable as an output format for many languages based on C. "The exponent always contains at least two digits, and only as many more digits as necessary to represent the exponent." A reconsideration would be welcome. |
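To illustrate the quoted rule, here is a small sketch using Python's `%e` formatting, which pads the exponent the same way. The helper name `exponent_digits` is made up for this example.

```python
def exponent_digits(value: float) -> str:
    """Return only the digits of the exponent as printed by '%e'."""
    formatted = "%.2e" % value
    exponent = formatted.split("e")[1]   # e.g. "-06" or "+100"
    return exponent.lstrip("+-")

# At least two digits, and more only when the exponent needs them.
for sample in (4.86e-6, 1.0, 1e100):
    print("%.2e" % sample, exponent_digits(sample))
```

There is no format flag here that asks for a one-digit exponent, which is the practical problem described above.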
"outweighs the benefits of accepting the generated output of certain languages." This is misleading, as in fact we're talking about most common languages. Exponents with leading zeroes have always been a standard in the computing world, and restricting them here reduces usability to no purpose. TOML is a terrible place to create new notation standards. |
I'm sure this comes up with naive computer-generated TOML configurations, and not as much with human-written configurations. My knee-jerk reaction, valid or not, is that the programs generating TOML ought to write human-readable exponents. But leading zeros in exponents, valid or not, are human-readable, especially the ones that
This is a case in which consistency for the sake of the spec isn't worth the cost to users writing configs with the tools they've got on hand. Integer values shouldn't have leading zeros, but exponent values could.
The spec could be changed to read "An exponent part is an E (upper or lower case) followed by an integer part (which follows the same rules as integer values but may include up to two leading zeros)." I chose the "up to two" part because
The ABNF could use the following in place of the current definition of
exp = "e" float-exp-part
float-exp-part = [ minus / plus ] zero-prefixable-int
And if we really don't want any more than two leading zeros, we could instead do this:
exp = "e" float-exp-part
float-exp-part = [ minus / plus ] float-exp-int
float-exp-int = [ %x30 [ underscore ] [ %x30 [ underscore ] ] ] unsigned-dec-int
There's probably a better way to write this, but it works. The underscores are ugly but consistent with
And so I also ask @mojombo to reconsider. |
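The two proposals above can be compared with a rough regex translation. This is only a sketch: the patterns assume `unsigned-dec-int` and `zero-prefixable-int` have their usual TOML ABNF meanings (digits with optional single underscores between them), and the constant names are made up for illustration.

```python
import re

# Assumed translation of unsigned-dec-int: a single digit, or a non-zero
# digit followed by more digits, with optional underscores between digits.
UNSIGNED = r"(?:[0-9]|[1-9](?:_?[0-9])+)"

# Strict reading: the exponent follows the integer rules (no leading zeros).
STRICT_EXP = re.compile(r"[eE][+-]?" + UNSIGNED)

# First proposal: zero-prefixable-int, any number of leading zeros.
ANY_ZEROS_EXP = re.compile(r"[eE][+-]?[0-9](?:_?[0-9])*")

# Second proposal: at most two leading zeros, each optionally followed
# by an underscore, then an unsigned-dec-int.
TWO_ZEROS_EXP = re.compile(r"[eE][+-]?(?:0_?(?:0_?)?)?" + UNSIGNED)

for exp in ("e-6", "e-06", "e-006", "e-0006", "e+0_5"):
    print(
        exp,
        bool(STRICT_EXP.fullmatch(exp)),
        bool(ANY_ZEROS_EXP.fullmatch(exp)),
        bool(TWO_ZEROS_EXP.fullmatch(exp)),
    )
```

Only the second proposal draws a line at three or more leading zeros; the first accepts everything the strict form accepts plus any amount of zero padding.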
Actually, Microsoft had a function called _set_output_format to print the exponent as two digits which is now obsolete because they also follow C99 starting with Visual Studio 2015. https://msdn.microsoft.com/en-us/library/bb531344(v=vs.140).aspx
|
Two observations:
Anyhow, my point is, there is effectively zero ambiguity over what 1e-05 means; and for that matter, what 1e+05 means. They've been written that way since time immemorial, programming languages typically print them that way by default and read them that way as well, and even most non-graphing calculators display 2-digit zero-padded exponents. I'd be surprised if TOML didn't accept them that way. If I were using a TOML file for a scientific program with floating point config values that I was getting from elsewhere (maybe output from another program), it would be a nuisance to have to edit them and remove the zeros and plus signs from the exponents. |
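As a quick check of the "read them that way as well" point, here is how Python's built-in `float()` treats the two literals mentioned above. The variable names are just for this sketch.

```python
# Zero-padded and signed exponents parse to the same values as the bare forms.
padded_negative = float("1e-05")
padded_positive = float("1e+05")

print(padded_negative == float("1e-5"))   # same value as the unpadded literal
print(padded_positive == 100000.0)        # the leading zero is not read as octal
```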
This is indeed the main reason why I stopped using TOML in my scientific codes. |
@uvtc is right, and considering their observations and @ziotom78's corroboration I would plead to re-open this issue and allow leading zeros. Even those who don't find them useful will supposedly find them harmless, and considering that others find them useful – or even essential – that's a clear case in favor of allowing them. |
@mojombo ^ |
Ok, you've all made a compelling argument. I would love to see a PR adding this capability. |
Here goes nothing! I stuck to what I spelled out in my previous comment for any number of leading zeroes, and updated the changelog. (And trimmed an unnecessary trailing whitespace elsewhere, whoops. Blame my trusty editor.) |
Closed by #656. |
An issue has arisen in my parser: avakar/pytoml#9, and I find the spec ambiguous about this. In particular, it says
Does the phrase integer part mean it's an integer as specified in the Integer section of the specs and thus disallows leading zeros? In other words, is the following a valid TOML file?
Note the leading zero in the exponent.