Runtime checks for `nan` and `inf` values? #12

liautaud · 2020-04-10T10:45:06Z

While fuzzing the JSON encoder of Irmin, which uses jsonm as a backend, I came across some counterintuitive bugs while trying to encode non-standard floating-points values.

Although the documentation specifies that "Float lexemes must not be, Pervasives.nan, Pervasives.infinity or Pervasives.neg_infinity", there are no runtime errors when trying to encode such values. Instead, the encoder will comply and encode them using %.16g, which could generate the invalid tokens nan, inf or -inf, and the crash will only happen on decoding.

Is this absence of runtime checks motivated by performance? Otherwise, would it be possible to add checks to w_value to prevent such tokens from being produced in the first place?

The text was updated successfully, but these errors were encountered:

dbuenzli · 2020-04-10T13:46:05Z

There are quite a few other things, that your fuzzer missed like trying to encode invalid UTF-8.

Yes the encoder does not check these things for performance reasons and expects these checks to be performed by the client. See the details of the contract here.

liautaud · 2020-04-10T13:49:17Z

@dbuenzli Alright, thanks! (The fuzzer did find bugs with invalid UTF-8, which we prevented by adding checks in the client, so we'll just do it for floats too then.)

edwintorok · 2023-06-01T13:09:01Z

FWIW this makes 'jsonm' produce non-standard JSON output that cannot be parsed by regular JSON clients.
This resulted in a bug where a client of a server using jsonm had to patch its JSON parsers to recognize the invalid output.

Yojson does produce valid output on NaN (although it is still up to the client to know how to interpret it, but it can be done at the application layer without having to patch the JSON parser grammar). However AFAICT Yojson doesn't validate that its input is UTF-8, so I would still prefer to use 'jsonm' due to its better input validation.

Due to 'jsonm' producing invalid JSON on output we had to switch our code to use yojson though (which AFAICT always produces valid JSON on output), it doesn't make sense to use 2 JSON libraries in the same application.

dbuenzli · 2023-06-01T13:27:44Z

Due to 'jsonm' producing invalid JSON on output we had to switch our code to use yojson though

jsonm does not produce invalid output. It's the duty of the client to check these things as there is nothing standard that can be done. What you want to do on nans and infinity that json cannot represent is for the client to decide. This is properly documented here.

liautaud changed the title ~~Runtime checks for nan and inf floating-point values?~~ Runtime checks for nan and inf values? Apr 10, 2020

dbuenzli closed this as completed Apr 10, 2020

liautaud mentioned this issue Apr 10, 2020

Fixed "illegal literal" error when JSON-decoding NaN. mirage/irmin#979

Merged

craigfe mentioned this issue Apr 16, 2021

Non-finite floats are problematic in JSON codec mirage/repr#54

Closed

edwintorok mentioned this issue Jun 1, 2023

Add "Are we X yet?" pages ocaml/ocaml.org#1226

Merged

edwintorok mentioned this issue Jun 1, 2023

Floating point roundtrip and loss of precision #22

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Runtime checks for `nan` and `inf` values? #12

Runtime checks for `nan` and `inf` values? #12

liautaud commented Apr 10, 2020

dbuenzli commented Apr 10, 2020

liautaud commented Apr 10, 2020

edwintorok commented Jun 1, 2023

dbuenzli commented Jun 1, 2023

Runtime checks for nan and inf values? #12

Runtime checks for nan and inf values? #12

Comments

liautaud commented Apr 10, 2020

dbuenzli commented Apr 10, 2020

liautaud commented Apr 10, 2020

edwintorok commented Jun 1, 2023

dbuenzli commented Jun 1, 2023

Runtime checks for `nan` and `inf` values? #12

Runtime checks for `nan` and `inf` values? #12