Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve cJSON number handling. #132

Merged
merged 1 commit into from
Jan 22, 2024

Conversation

haydenroche5
Copy link
Collaborator

  • Get rid of CJSON_NO_CLIB and its associated code paths, per Ray's request.
  • Change the type of the J object's valueint member. It should be int32_t if NOTE_LOWMEM and int64_t otherwise. Capture this in a new #define for the type, JINTEGER.
  • When parsing a JSON number from a JSON string, if the number is within the bounds of the JINTEGER type, convert it to an integer using JAtoI rather than simply casting valuenumber to JINTEGER, which can lose precision (e.g. for a Unix timestamp). If the number is outside the bounds of a JINTEGER, saturate, which is the same behavior as before this commit.
  • When printing a JSON number, print it as an integer if possible. Otherwise, print it as floating point.
  • Fix an issue where JItoA would fail to convert the minimum long int value (i.e. LONG_MIN) to a string.

@haydenroche5 haydenroche5 self-assigned this Jan 18, 2024
n_cjson.c Show resolved Hide resolved
@haydenroche5 haydenroche5 force-pushed the json_number_rework branch 2 times, most recently from cb180ae to fa54b5d Compare January 21, 2024 22:59
- Get rid of CJSON_NO_CLIB and its associated code paths, per Ray's request.
- Change the type of the J object's valueint member. It should be int32_t if
NOTE_LOWMEM and int64_t otherwise. Capture this in a new #define for the type,
JINTEGER.
- When parsing a JSON number from a JSON string, if the number is within the
bounds of the JINTEGER type, convert it to an integer using JAtoI rather than
simply casting valuenumber to JINTEGER, which can lose precision (e.g. for a
Unix timestamp). If the number is outside the bounds of a JINTEGER, saturate,
which is the same behavior as before this commit.
- When printing a JSON number, print it as an integer if possible. Otherwise,
print it as floating point.
- Fix an issue where JItoA would fail to convert the minimum long int value
(i.e. LONG_MIN) to a string.
- Add a slew of new unit tests to make sure number handling behaves as expected.
@haydenroche5
Copy link
Collaborator Author

Added unit tests.

n_cjson.c Show resolved Hide resolved
@rayozzie rayozzie merged commit 86aab4a into blues:master Jan 22, 2024
11 checks passed
n_cjson.c Show resolved Hide resolved
@@ -257,7 +257,7 @@ char **endPtr; /* If non-NULL, store terminating character's
case 5:
p10 = 1.0e32;
break;
#ifndef NOTE_FLOAT
#ifndef NOTE_LOWMEM
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change NOTE_LOWMEM to NOTE_C_LOWMEM?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to yeah. In theory there could be users depending on this macro. Should we change the define to NOTE_C_LOWMEM, but also define NOTE_LOWMEM if NOTE_C_LOWMEM is defined for backwards compatibility?

The same goes for NOTE_FLOAT. I opted to eliminate it entirely, but there could be someone out there using it in their code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I think we should change it to NOTE_C_LOWMEM, and also declare NOTE_LOWMEM. There should probably be a deprecation message before it is #define'd, in case it was declared by the build system.

n_cjson.c Show resolved Hide resolved
Comment on lines +434 to +435
// Conversion to unsigned is required to handle the case where n is
// LONG_MIN.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is not obvious. Why is this the case?

Copy link
Collaborator Author

@haydenroche5 haydenroche5 Jan 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If n is JINTEGER_MIN, when we negate it with the unary minus operator, it overflows and the result is undefined behavior (I'm pretty sure). If I keep the original code that was here and pass in LONG_MIN, this

n = -n;

just keeps n as LONG_MIN, at least on my system, with my compiler, etc. Then you end up doing n % 10 on a negative number and the whole algorithm breaks down. If we cast n to unsigned, then the rules for negation are different. See: https://stackoverflow.com/questions/8026694/c-unary-minus-operator-behavior-with-unsigned-operands

The negative of an unsigned quantity is computed by subtracting its value from 2^n, where n is the number of bits in the promoted operand.

So unsignedN = -unsignedN when JINTEGER is 32 bits and n is JINTEGER_MIN yields 2^32-2147483648, which is 2147483648. Then the algorithm works as intended.

I'll explain all this better in an improved code comment.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That explanation makes more sense when the data type is signed. I don't see how this works with an unsigned data type.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JINTEGER_MIN should be 0x80000000, right?

If you make that unsigned, then you have a valid positive number.
unsigned long int unsignedN = n;

Then if you make it negative with -. Now, you have a temporary 64-bit negative number.
0xFFFFFFFF80000000, which you then assign to a 32-bit unsigned number, so you are right back to a valid 32-bit unsigned number 0x80000000.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that modulo will work better on unsigned values, but I don't quite get the point of setting unsignedN to -unsignedN.

unsigned long int unsignedN = n;
long int i, j;
if (n < 0) {
unsignedN = -unsignedN;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The data type of unisignedN is unsigned long int, so what does making it negative do?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

note.h Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants