Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runtime checks for nan and inf values? #12

Closed
liautaud opened this issue Apr 10, 2020 · 4 comments
Closed

Runtime checks for nan and inf values? #12

liautaud opened this issue Apr 10, 2020 · 4 comments

Comments

@liautaud
Copy link

While fuzzing the JSON encoder of Irmin, which uses jsonm as a backend, I came across some counterintuitive bugs while trying to encode non-standard floating-points values.

Although the documentation specifies that "Float lexemes must not be, Pervasives.nan, Pervasives.infinity or Pervasives.neg_infinity", there are no runtime errors when trying to encode such values. Instead, the encoder will comply and encode them using %.16g, which could generate the invalid tokens nan, inf or -inf, and the crash will only happen on decoding.

Is this absence of runtime checks motivated by performance? Otherwise, would it be possible to add checks to w_value to prevent such tokens from being produced in the first place?

@liautaud liautaud changed the title Runtime checks for nan and inf floating-point values? Runtime checks for nan and inf values? Apr 10, 2020
@dbuenzli
Copy link
Owner

There are quite a few other things, that your fuzzer missed like trying to encode invalid UTF-8.

Yes the encoder does not check these things for performance reasons and expects these checks to be performed by the client. See the details of the contract here.

@liautaud
Copy link
Author

@dbuenzli Alright, thanks! (The fuzzer did find bugs with invalid UTF-8, which we prevented by adding checks in the client, so we'll just do it for floats too then.)

@edwintorok
Copy link

FWIW this makes 'jsonm' produce non-standard JSON output that cannot be parsed by regular JSON clients.
This resulted in a bug where a client of a server using jsonm had to patch its JSON parsers to recognize the invalid output.

Yojson does produce valid output on NaN (although it is still up to the client to know how to interpret it, but it can be done at the application layer without having to patch the JSON parser grammar). However AFAICT Yojson doesn't validate that its input is UTF-8, so I would still prefer to use 'jsonm' due to its better input validation.

Due to 'jsonm' producing invalid JSON on output we had to switch our code to use yojson though (which AFAICT always produces valid JSON on output), it doesn't make sense to use 2 JSON libraries in the same application.

@dbuenzli
Copy link
Owner

dbuenzli commented Jun 1, 2023

Due to 'jsonm' producing invalid JSON on output we had to switch our code to use yojson though

jsonm does not produce invalid output. It's the duty of the client to check these things as there is nothing standard that can be done. What you want to do on nans and infinity that json cannot represent is for the client to decide. This is properly documented here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants