Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Favor ByteBuf Duplicates, NettyJsonContentAuthSigner and StreamChannelConnectionCaptureSerializer Refactoring #615

Merged
merged 18 commits into from
May 3, 2024

Conversation

AndreKurait
Copy link
Member

@AndreKurait AndreKurait commented Apr 26, 2024

Description

This change seeks to simplify the complexity ByteBuf reading and duplicating indexes through use of .duplicate over .slice and being more intentional about side effects to the buffer after reading.

This change also seeks to harden NettyJsonContentAuthSigner by flushing the buffer in error cases and removing unnecessary release/retains.

Furthermore, moved to the Netty Base64 Encoder in ParsedHttpMessagesAsDicts to directly decode from ByteBufs.

Also, fixed up some trace logging that previously wasn't using a supplier to reduce unnecessary cycles.

This change also refactors StreamChannelConnectionCaptureSerializer to clean up the interfaces, improve performance, and remove bugs around chunking and serializing data.

  • Category: Refactoring
  • Why these changes are required? Simplify logic and make it easier to understand
  • What is the old behavior before changes and new behavior after changes? NettyJsonContentAuthSigner was less defensible around exceptions or handler being removed or channel closed scenarios and now will flush out buffer in those cases. ByteBufs are less likely to have their reader index modified unintentionally due to use of slicing.

Issues Resolved

MIGRATIONS-1696

Is this a backport? If so, please add backport PR # and/or commits #

Testing

Unit testing

Check List

  • New functionality includes testing
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@AndreKurait AndreKurait force-pushed the ByteBufImprovements branch from ecde713 to 82bd98a Compare April 26, 2024 01:05
@AndreKurait AndreKurait marked this pull request as ready for review April 26, 2024 01:15
@AndreKurait AndreKurait force-pushed the ByteBufImprovements branch from 82bd98a to be72c5b Compare April 26, 2024 01:19
Copy link

codecov bot commented Apr 26, 2024

Codecov Report

Attention: Patch coverage is 81.48148% with 20 lines in your changes are missing coverage. Please review.

Project coverage is 75.79%. Comparing base (495b27c) to head (7b270fa).
Report is 9 commits behind head on main.

Files Patch % Lines
...ture/StreamChannelConnectionCaptureSerializer.java 83.60% 9 Missing and 1 partial ⚠️
.../datahandlers/http/NettyJsonContentAuthSigner.java 70.83% 4 Missing and 3 partials ⚠️
...atahandlers/http/HttpJsonTransformingConsumer.java 60.00% 2 Missing ⚠️
...h/migrations/replay/ParsedHttpMessagesAsDicts.java 83.33% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main     #615      +/-   ##
============================================
+ Coverage     75.76%   75.79%   +0.03%     
- Complexity     1533     1543      +10     
============================================
  Files           168      168              
  Lines          6395     6437      +42     
  Branches        570      575       +5     
============================================
+ Hits           4845     4879      +34     
- Misses         1173     1179       +6     
- Partials        377      379       +2     
Flag Coverage Δ
unittests 75.79% <81.48%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@AndreKurait AndreKurait force-pushed the ByteBufImprovements branch from be72c5b to 32a2485 Compare April 26, 2024 01:26
Copy link
Collaborator

@gregschohn gregschohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple comments for my own understanding. Provided that you're comfortable with the answers, this LGTM.

@AndreKurait AndreKurait force-pushed the ByteBufImprovements branch from 32a2485 to 0ee07e3 Compare April 26, 2024 21:20
@AndreKurait AndreKurait force-pushed the ByteBufImprovements branch 5 times, most recently from 6f62907 to 507dd36 Compare April 30, 2024 16:01
Copy link
Collaborator

@gregschohn gregschohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see the ByteBuffer conversion in the serializer be reversed. I think that it will be a big performance hit and won't always work as you are hoping that it would (making it less maintainable). The other comments should be minor/easy to address

@AndreKurait AndreKurait force-pushed the ByteBufImprovements branch from 507dd36 to 86a6b3a Compare May 1, 2024 15:02
@gregschohn gregschohn self-requested a review May 1, 2024 15:59
@AndreKurait AndreKurait force-pushed the ByteBufImprovements branch from c23ecef to 305c5dc Compare May 2, 2024 20:59
@AndreKurait AndreKurait changed the title Favor ByteBuf Slicing and NettyJsonContentAuthSigner Refactoring Favor ByteBuf Duplicates, NettyJsonContentAuthSigner and StreamChannelConnectionCaptureSerializer Refactoring May 2, 2024
@AndreKurait AndreKurait force-pushed the ByteBufImprovements branch from b8054eb to 5632748 Compare May 2, 2024 21:52
@AndreKurait AndreKurait force-pushed the ByteBufImprovements branch from 5632748 to 9967581 Compare May 2, 2024 22:00
@AndreKurait
Copy link
Member Author

I'd like to see the ByteBuffer conversion in the serializer be reversed. I think that it will be a big performance hit and won't always work as you are hoping that it would (making it less maintainable). The other comments should be minor/easy to address

I've updated the serializer to be more deterministic around edge cases and require less data copying. Please take a look at the latest revision

@AndreKurait AndreKurait force-pushed the ByteBufImprovements branch from 9967581 to 1b66165 Compare May 2, 2024 22:21
@AndreKurait AndreKurait requested a review from peternied as a code owner May 2, 2024 22:21

// Similar to calculateMaxWritableSpace but perform pessimistic calculation with fewer operations. In some cases returns
// up to 1 byte fewer than what could be written out of the available space.
public static int pessimisticallyCalculateMaxWritableSpace(int totalAvailableSpace, int requestedWriteableSpace) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gregschohn, added, depended on, and tested pessimisticallyCalculateMaxWritableSpace instead of calculateMaxWritableSpace

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function calculates the max bytes that I could write to a protobuf stream, including the overhead to write the array, right? I'm thinking something like "calculateMaxArrayLengthForSpace" - something that indicates that exactly one piece of overhead is taken into account here, not the tag, no omission of the length. A clearer name (as hard as that is) will make this definition (which is now really easy to understand, thanks) and its applications more obvious.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking computeMaxLengthDelimitedFieldSizeForSpace which is analogous to (internal method) computeLengthDelimitedFieldSize in CodedOutputStream

Copy link
Collaborator

@gregschohn gregschohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still a really hard PR to go through because the CodedOutputStream stuff has always been so challenging. I think that 80% of this is great. We just need to cleanup some names to get the rest of it to the point where it can be maintained.
Thanks for persevering.

Signed-off-by: Andre Kurait <[email protected]>
@AndreKurait AndreKurait force-pushed the ByteBufImprovements branch from a5a9861 to 61dbc8f Compare May 3, 2024 16:27
@AndreKurait AndreKurait merged commit d1771cc into opensearch-project:main May 3, 2024
7 checks passed
@AndreKurait AndreKurait deleted the ByteBufImprovements branch May 3, 2024 20:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants