Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected output #1089

Closed
LeventErkok opened this issue Feb 25, 2021 · 23 comments · Fixed by #1108
Closed

Unexpected output #1089

LeventErkok opened this issue Feb 25, 2021 · 23 comments · Fixed by #1108
Assignees
Labels
bug Something not working correctly UX Issues related to the user experience (e.g., improved error messages)
Milestone

Comments

@LeventErkok
Copy link
Contributor

$ cryptol
┏━╸┏━┓╻ ╻┏━┓╺┳╸┏━┓╻
┃  ┣┳┛┗┳┛┣━┛ ┃ ┃ ┃┃
┗━╸╹┗╸ ╹ ╹   ╹ ┗━┛┗━╸
version 2.9.0
https://cryptol.net  :? for help

Loading module Cryptol
Cryptol> :m Float
Loading module Cryptol
Loading module Float
Float> :set fp-base=10
Float> 65504 : Float 5 11
65500

Wondering why 65504 prints as 65500. (It's the maximum representable half-float.) Is this a printing issue, or perhaps something else is going on?

@LeventErkok
Copy link
Contributor Author

I suspect this might have to do directly with libBF, as opposed to Cryptol.

There seems to be an online calculator at http://numcalc.com. I tried:

   > \p 11 5
   > \p
   BigFloat precision=11 bits (~3 digits), exponent size=5 bits
   > 65504.0
   65500.0

which matches Cryptol output, but it is still the wrong output.

I've mailed the author of LibBF to see if he has any feedback; but he hasn't responded yet.

@yav
Copy link
Member

yav commented Feb 26, 2021

Interesting, this seems to be due to fact that we are using the BF_FTOA_FORMAT_FREE_MIN way of showing things, seems to do the right thing with BF_FTOA_FORMAT_FREE. The docs says:

same as BF_FTOA_FORMAT_FREE but uses the minimum number of digits
(takes more computation time). Identical to BF_FTOA_FORMAT_FREE for
binary radices with bf_ftoa() and for bfdec_ftoa()

so I thought this version might be nicer... I wonder if it is doing that because it is considering 65500 to have "fewer" digits than 65504 but when you parse it back you'd get the same thing. If this is the intended behavior, I find it rather confusing, and maybe we should use the FREE format instead (especially since at the moment we don't even accept 65500 as a literal, although that might change).

@LeventErkok
Copy link
Contributor Author

I got a response from the author:

Hi,

Thank you for the report. It may be a bug in the floating point printing code. With "\x" you can see the value in hexa and it seems correct.

Best regards,

Fabrice.

So yeah, it seems the number itself is correctly represented, but decimal conversion is busted. FORMAT_FREE seems to be the way to go till libBF gets a patch itself.

@robdockins robdockins added bug Something not working correctly upstream Tracking bugs in external tools/libraries we depend on labels Mar 3, 2021
@LeventErkok
Copy link
Contributor Author

LeventErkok commented Mar 4, 2021

To be clear, one can argue it is doing the right thing, just not the expected thing. More strongly, perhaps we're expecting a bit too much.

As a half-float:

  • The number 65472 is represented by 0 11110 1111111110
  • The number 65504 is represented by 0 11110 1111111111

Notice that the above two representations are next-to-each other, i.e., there are no other numbers between them that is exactly representable. In this binade, representable numbers are separated by 32, as evidenced by 65504 - 65472 = 32. There are 33 numbers between 65472 and 65504 inclusively. Assuming nearest-ties-to-even, the first 17 will map to the first bit pattern above, and the last 16 will map to the second. Since 65472+16+1 = 65489; then one can reasonably argue that it is free to print anything from 65489 to 65504 as the output, which makes 65500 a valid choice. Though definitely one that'll confuse most everyone interacting with Crytpol/LibBF. (It definitely confused me!)

Given any output from 65489 to 65504 is acceptable, Why does it pick 65500 with FREE_MIN? I think it's because it has more zero's than any other. If you're using scientific notation for instance, you'd prefer 65500, because you can write it as 655E2. That seems to be the most reasonable explanation, though it sounds more aesthetically justified than any other measure. I'd consider 655.04E2 to be just as "minimal," though it does take more "characters" to write down. I'd claim if you're using FREE_MIN, you should also use the exponential notation. If you are not using exponential notation, then FREE_MIN will really be confusing, as we found here.

Perhaps there's really nothing to do here, except to note this in the documentation as floats are strangely beasts and full of these sorts of surprises. Calling this a bug seems a bit too harsh.

@robdockins
Copy link
Contributor

I expect the printing functions in Cryptol to be precise; so to the extent this behavior represents a mismatch between expectations and behavior, I think that's a bug. The FREE_MIN code may indeed be doing exactly what you suggest on purpose; rereading the documentation string in libbf.h suggests this may be the case. If we accept that view, then the bug is in our bindings to libBF using that mode for our printing function.

@yav
Copy link
Member

yav commented Mar 4, 2021

I agree it's a bug, except it is probably our bug rather than upstream, so maybe we should remove that label, and we should just change Cryptol to use FREE instead of FREE_MIN

@LeventErkok
Copy link
Contributor Author

FREE is definitely the way to go. Though I'm curious if you can run into similar scenarios with that setting as well. I wouldn't rule it out without doing some further investigation.

@LeventErkok
Copy link
Contributor Author

We can actually confirm the 17-16 split in the mapping.

Float> [65472 .. 65504] : [33](Float 5 11)
[65470, 65470, 65470, 65470, 65470, 65470, 65470, 65470, 65470,
 65470, 65470, 65470, 65470, 65470, 65470, 65470, 65470, 65500,
 65500, 65500, 65500, 65500, 65500, 65500, 65500, 65500, 65500,
 65500, 65500, 65500, 65500, 65500, 65500]

Funny how the first 17 is represented as 65470, which is also totally confusing. (But is explained by the same argument as before; it has more 0's at the end than 65472.)

I'd like to know what the above prints if you get around to changing the printing to the FREE mode. I'm hoping the first 17 will be 65472 and the last 16 will be 65504 which would match expectations.

@yav
Copy link
Member

yav commented Mar 4, 2021

@LeventErkok I can give it a go a bit later today. Btw, we are using some Haskell bindings to libBF I wrote, so you can play around with this stuff in ghci too (with -fobject-code): http://hackage.haskell.org/package/libBF

@robdockins
Copy link
Contributor

Changing to the FREE mode results in:

Float> [65472 .. 65504] : [33](Float 5 11)
[0xffc0.0, 0xffc0.0, 0xffc0.0, 0xffc0.0, 0xffc0.0, 0xffc0.0,
 0xffc0.0, 0xffc0.0, 0xffc0.0, 0xffc0.0, 0xffc0.0, 0xffc0.0,
 0xffc0.0, 0xffc0.0, 0xffc0.0, 0xffc0.0, 0xffc0.0, 0xffe0.0,
 0xffe0.0, 0xffe0.0, 0xffe0.0, 0xffe0.0, 0xffe0.0, 0xffe0.0,
 0xffe0.0, 0xffe0.0, 0xffe0.0, 0xffe0.0, 0xffe0.0, 0xffe0.0,
 0xffe0.0, 0xffe0.0, 0xffe0.0]

I'm not sure this is exactly what we want either. Maybe we can get precise decimal output somehow.

@LeventErkok
Copy link
Contributor Author

That looks correct though. Try running it after :set fp-base=10; I think you'll get precisely the predicted output.

@robdockins
Copy link
Contributor

Oh, right, I forgot it was a separate setting.

Float> :set fp-base=10
Float> [65472 .. 65504] : [33](Float 5 11)
[65472, 65472, 65472, 65472, 65472, 65472, 65472, 65472, 65472,
 65472, 65472, 65472, 65472, 65472, 65472, 65472, 65472, 65504,
 65504, 65504, 65504, 65504, 65504, 65504, 65504, 65504, 65504,
 65504, 65504, 65504, 65504, 65504, 65504]

This seems like what we'd want.

@LeventErkok
Copy link
Contributor Author

Bingo! Looks like FREE is the correct way to go so far as Cryptol is concerned.

BTW, I ran into the following:

Float> [(65472 : Float 5 11) .. 65504]

Parse error at <interactive>:1:2--1:22
  The boundaries of .. sequences should be valid numeric types.
The expression `ELocated (ELocated (EParens (ELocated (ETyped (ELocated (ELocated (ELit (ECNum 65472 DecLit)) (Range {from = Position {line = 1, col = 3}, to = Position {line = 1, col = 8}, source = "<interactive>"})) (Range {from = Position {line = 1, col = 3}, to = Position {line = 1, col = 8}, source = "<interactive>"})) (TLocated (TUser (UnQual (Ident False "Float")) [TLocated (TNum 5) (Range {from = Position {line = 1, col = 17}, to = Position {line = 1, col = 18}, source = "<interactive>"}),TLocated (TNum 11) (Range {from = Position {line = 1, col = 19}, to = Position {line = 1, col = 21}, source = "<interactive>"})]) (Range {from = Position {line = 1, col = 11}, to = Position {line = 1, col = 21}, source = "<interactive>"}))) (Range {from = Position {line = 1, col = 3}, to = Position {line = 1, col = 21}, source = "<interactive>"}))) (Range {from = Position {line = 1, col = 2}, to = Position {line = 1, col = 22}, source = "<interactive>"})) (Range {from = Position {line = 1, col = 2}, to = Position {line = 1, col = 22}, source = "<interactive>"})` is not.

I totally understand the error, but surely that's not what you intended to print to the user!

@robdockins
Copy link
Contributor

I think that's #975

@LeventErkok
Copy link
Contributor Author

Ah, my version of Cryptol is 2.9.0 from hackage, that explains it.

Thinking a bit more about this though, there seems to be an oddity. Why is the following well typed?

Float> :t [65472 .. 65504] : [33](Float 5 11)
([65472 .. 65504] : [33](Float 5 11)) : [33]Float 5 11

What does Cryptol think the type of those literals are? Clearly, they cannot be Float 5 11 as you couldn't do enumeration on them. So, there must be some "lifting" after the enumeration is resolved? But that'd have required mapping some conversion function over the list. That looks odd.

@yav
Copy link
Member

yav commented Mar 4, 2021

Yeah, I think that's #1062. I thought it'd be nice to check that literals are exactly representable, but I forgot that we also assume the literals are downward closed so that we don't infer super polymorphic types for long sequences of numbers.

If you have ideas, let's comment on that ticket. I think, currently we are thinking of maybe just allowing you to write imprcise literals (in base 10), and have them be rounded in some way.

@robdockins robdockins added UX Issues related to the user experience (e.g., improved error messages) and removed upstream Tracking bugs in external tools/libraries we depend on labels Mar 4, 2021
@LeventErkok
Copy link
Contributor Author

Playing a bit more about this, I'm convinced showFree is the way to go for consistent results. There's one gotcha, however. You probably want to trim any suffix 0's after the floating point itself, except the very first one in case all are 0's. (Otherwise it prints a lot of unnecessary 0's with large significant sizes.) I've settled on the following function for SBV, which mimics how Haskell prints Inf/NaN etc, though I believe Cryptol uses fpNaN/fpPosInf etc.

import LibBF (BigFloat)
import qualified LibBF as BF

data FP = FP { fpExponentSize    :: Int
             , fpSignificandSize :: Int
             , fpValue           :: BigFloat
             }

-- | Show a big float in the base given.
-- NB. Do not be tempted to use BF.showFreeMin below; it produces arguably correct
-- but very confusing results. See <https://github.com/GaloisInc/cryptol/issues/1089>
-- for a discussion of the issues.
bfToString :: Int -> Bool -> FP -> String
bfToString b withPrefix (FP _ sb a)
  | BF.bfIsNaN  a = "NaN"
  | BF.bfIsInf  a = if BF.bfIsPos a then "Infinity" else "-Infinity"
  | BF.bfIsZero a = if BF.bfIsPos a then "0.0"      else "-0.0"
  | True          = trimZeros $ BF.bfToString b withP a
  where opts = BF.showRnd BF.NearEven <> BF.showFree (Just (fromIntegral sb))
        withP
          | withPrefix = BF.addPrefix <> opts
          | True       = opts

        trimZeros s
          | '.' `elem` s = reverse $ case dropWhile (== '0') $ reverse s of
                                       res@('.':_) -> '0' : res
                                       res         -> res
          | True         = s

@LeventErkok
Copy link
Contributor Author

LeventErkok commented Mar 10, 2021

I've released a new version of SBV (v8.12), that now supports arbitrary-sized floats. I understand that Cryptol doesn't use SBV for floats currently, and there's probably no real ROI to add support for it; but just in case you want to compare the results to What4 gave you, it might come in handy. It could also be useful in other projects.

In particular, there's a new crackNum parameter displays numbers in great detail. This is particularly useful for floats:

Prelude Data.SBV> satWith z3{crackNum=True} $ \(x :: SFloatingPoint 5 11) -> x .== 65504
Satisfiable. Model:
  s0 = 65504 :: FloatingPoint 5 11
                  1       0
                  5 43210 9876543210
                  S -E5-- ---S10----
   Binary layout: 0 11110 1111111111
      Hex layout: 7BFF
       Precision: Half (5 exponent bits, 10 significand bits.)
            Sign: Positive
        Exponent: 15 (Stored: 30, Bias: 15)
  Classification: FP_NORMAL
    Binary Value: 0b1.1111111111p15
     Octal Value: 0o177740
   Decimal Value: 65504
       Hex Value: 0xffe0

The implementation uses Iavor's LibBF under the hood. Cheers!

@atomb atomb added this to the 2.11.0 milestone Mar 12, 2021
@robdockins
Copy link
Contributor

I don't think this zero-trimming function quite what you want, either. If there is an exponent suffix, I think the function above won't trim the zeros because they appear in the middle of the string. E.g., if I just use the showFree format in Cryptol, I get the following, which I don't think would be handled by the above.

Cryptol> :m Float
Loading module Cryptol
Loading module Float
Float> :set fpBase=10
Float> 1.0e64 : Float64
1.0000000000000000e64

@LeventErkok
Copy link
Contributor Author

I don't think showFree adds e at all. (At least not in my tests. It just prints with all required digits, without any exponentiation.)

Did you see otherwise?

@robdockins
Copy link
Contributor

Via GHCi

Prelude LibBF> let x = bfFromInteger (10 ^ (64::Int))
Prelude LibBF> bfToString 10 (showFree (Just 53)) x
"1.0000000000000000e64"

@LeventErkok
Copy link
Contributor Author

Ah, my bad. Apparently I didn't test it enough. SBV shows the same:

Prelude> :m + Data.SBV

Prelude Data.SBV> :set -XDataKinds
Prelude Data.SBV> 1e64 :: SFloatingPoint 11 53
1.0000000000000000e64 :: SFloatingPoint 11 53

Surprisingly, it doesn't do that with larger precision:

Prelude Data.SBV> 1e64 :: SFloatingPoint 11 200
10000000000000000000000000000000000000000000000000000000000000000 :: SFloatingPoint 11 200

Same in GHCi:

Prelude LibBF> let x = bfFromInteger (10 ^ (64::Int))
Prelude LibBF> bfToString 10 (showFree (Just 200)) x
"10000000000000000000000000000000000000000000000000000000000000000"

I'm not sure what sort of logic it's following here.:

Prelude LibBF> bfToString 10 (showFree (Just 200)) x
"10000000000000000000000000000000000000000000000000000000000000000"
Prelude LibBF> bfToString 10 (showFree (Just 128)) x
"1.000000000000000000000000000000000000000e64"
Prelude LibBF> bfToString 10 (showFree (Just 192)) x
"1.0000000000000000000000000000000000000000000000000000000000e64"
Prelude LibBF> bfToString 10 (showFree (Just 256)) x
"10000000000000000000000000000000000000000000000000000000000000000.00000000000000"
Prelude LibBF> bfToString 10 (showFree (Just 384)) x
"10000000000000000000000000000000000000000000000000000000000000000.0000000000000000000000000000000000000000000000000000"
Prelude LibBF> bfToString 10 (showFree (Just 512)) x
"10000000000000000000000000000000000000000000000000000000000000000.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"

I guess it gives up on the e notation after some precision?

In either case you're right; need to do a bit more string-manipulation there to get it cleaner.

@robdockins robdockins self-assigned this Mar 15, 2021
@robdockins
Copy link
Contributor

That is very odd.

robdockins added a commit that referenced this issue Mar 15, 2021
has highly counterintuitive results.  Instad, use the `showFree`
mode, which produces results more accurate to the underlying
bit representation.  The `showFree` mode often produces unfortunate
trailing zeros in decimal representations, so we take some care
to trim those out.

Fixes #1089
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something not working correctly UX Issues related to the user experience (e.g., improved error messages)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants