BJData optimized binary array type #4513

Draft: nebkat wants to merge 1 commit into nlohmann:develop from nebkat:develop

Conversation


@nebkat nebkat commented Nov 25, 2024

See NeuroJSON/bjdata#6 for further information.

Introduces a dedicated B marker for bytes.

This is used as the strong type marker in optimized array format to encode binary data such that it can also be decoded back to binary data (instead of wrongly decoding as an integer array).
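
To make the layout concrete, here is a minimal sketch of the byte sequence this produces for a short payload. `encode_typed_bytes` is a hypothetical helper written for illustration and is not part of nlohmann/json; it assumes the payload holds at most 127 bytes so that the count fits in a single int8 (`i`).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not library API): builds an optimized-array container
// "[$<type>#i<count><payload>" for a short byte payload.
// Assumption: payload.size() <= 127, so the count is stored as one int8.
std::vector<std::uint8_t> encode_typed_bytes(char type_marker,
                                             const std::vector<std::uint8_t>& payload)
{
    assert(payload.size() <= 127);

    std::vector<std::uint8_t> out;
    out.push_back('[');                                     // array start
    out.push_back('$');                                     // strong type follows
    out.push_back(static_cast<std::uint8_t>(type_marker));  // 'B' for bytes, 'U' for uint8
    out.push_back('#');                                     // count follows
    out.push_back('i');                                     // count encoded as int8
    out.push_back(static_cast<std::uint8_t>(payload.size()));
    out.insert(out.end(), payload.begin(), payload.end());  // raw payload, no per-item markers
    return out;
}
```

Passing `'U'` instead of `'B'` yields the older header, which a decoder cannot tell apart from a genuine uint8 array.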

Draft while awaiting the release of BJData draft 3.

Would a legacy_binary flag be desirable to continue supporting draft 2 expectations of uint8 typed arrays for binary?


Pull request checklist

Read the Contribution Guidelines for detailed information.

  • Changes are described in the pull request, or an existing issue is referenced.
  • The test suite compiles and runs without error.
  • Code coverage is 100%. Test cases can be added by editing the test suite.
  • The source code is amalgamated; that is, after making changes to the sources in the include/nlohmann directory, run make amalgamate to create the single-header files single_include/nlohmann/json.hpp and single_include/nlohmann/json_fwd.hpp. The whole process is described here.

@coveralls

coveralls commented Nov 25, 2024

Coverage Status

coverage: 99.648% (+0.001%) from 99.647%
when pulling f465f6e on nebkat:develop
into e41905f on nlohmann:develop.


🔴 Amalgamation check failed! 🔴

The source code has not been amalgamated. @nebkat
Please read and follow the Contribution Guidelines.

@nlohmann
Owner

Please run make amalgamate with AStyle 3.1.

Introduces a dedicated `B` marker for bytes. This is used as the strong
type marker in optimized array format to encode binary data such that
it can also be decoded back to binary data (instead of decoding as an
integer array).

See NeuroJSON/bjdata#6 for further information.
@@ -1514,11 +1542,8 @@ TEST_CASE("BJData")
// create expected byte vector
std::vector<std::uint8_t> expected;
expected.push_back(static_cast<std::uint8_t>('['));
if (N != 0)
Owner

Seeing existing test cases change - is this a breaking change for existing clients?

Author

Yes - this is a breaking change for encoding, but fixes roundtrip encoding/decoding.

Before:

  • json::binary gets encoded as a typed array of U
  • typed array of U gets decoded as json::array ❗does not match encoded data

After:

  • json::binary gets encoded as a typed array of B
  • typed array of B gets decoded as json::binary ✅ matches encoded data
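
The "after" behavior amounts to dispatching on the strong type marker. A rough sketch follows; `classify_typed_array` and `decoded_kind` are hypothetical names used only for illustration, and the library's actual parser handles many more markers and error cases than this.

```cpp
#include <cstdint>
#include <vector>

// How a decoder would interpret the payload of a typed array (illustrative only).
enum class decoded_kind { array_of_integers, binary, unsupported };

// Hypothetical classifier: looks at the header "[$<type>#" and reports how the
// payload would be decoded. Only the two markers discussed here are modeled.
decoded_kind classify_typed_array(const std::vector<std::uint8_t>& in)
{
    if (in.size() < 4 || in[0] != '[' || in[1] != '$' || in[3] != '#')
    {
        return decoded_kind::unsupported;  // not an optimized typed array
    }

    switch (in[2])
    {
        case 'U':
            return decoded_kind::array_of_integers;  // uint8 values stay an integer array
        case 'B':
            return decoded_kind::binary;             // bytes decode back to binary
        default:
            return decoded_kind::unsupported;        // other types are out of scope here
    }
}
```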

I could introduce a new flag to continue encoding binary as uint8 if the breaking change is undesirable?

Owner

We must not break existing code, so yes, a parameter would make sense.

@@ -847,11 +847,11 @@ class binary_writer
oa->write_character(to_char_type('['));
}

-if (use_type && !j.m_data.m_value.binary->empty())
+if (use_type && (use_bjdata || !j.m_data.m_value.binary->empty()))
Author

@nlohmann
The specific test case you commented on changed because of this line. To ensure that an empty json::binary still decodes to json::binary, I force the type marker to be written here regardless of whether there are contents.

That said, this problem will still exist when other libraries are used (e.g. a Python encoder that writes an empty bytearray without the type marker).
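
The effect of the condition change on the empty case can be modeled in a few lines. This is only a sketch: `binary_header` is a hypothetical function, it assumes both the type and count options are enabled, and it stops at the count marker rather than writing the count itself.

```cpp
#include <string>

// Hypothetical model of the header written for a binary value, up to and
// including the '#' count marker. force_type stands in for the BJData branch
// of the new condition; the old condition corresponds to force_type == false.
std::string binary_header(bool is_empty, bool force_type)
{
    const bool write_type = force_type || !is_empty;

    std::string out = "[";
    if (write_type)
    {
        out += "$B";  // strong type marker for bytes
    }
    out += '#';       // count marker (the count itself is omitted in this sketch)
    return out;
}
```

With `force_type` off, an empty value produces `[#`, which carries no hint that it was binary; with it on, the header is `[$B#` whether or not there is a payload.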
