Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimise TokenUtils parsing #611

Merged
merged 7 commits into from
Oct 15, 2022
Merged

Optimise TokenUtils parsing #611

merged 7 commits into from
Oct 15, 2022

Conversation

noetro
Copy link
Contributor

@noetro noetro commented Sep 30, 2022

Changes

Please describe both what is changing and why this is important. Include:

_We've been using the project for a while and I'm a fan. I was in for our 4.0 upgrade and thought I would see if I could find some small thing to start giving back. This optimises the TokenUtils parsing by relying on the JWT format is very specific. In particular:

  1. The format is checked to be good before any array is allocated.
  2. In the case of alg='none', two array allocations are not required._

References

Please include relevant links supporting this change such as a:

No references to note

Testing

Existing TokenUtils test had good coverage. Added a couple of test cases for corner case odd string inputs.

  • This change adds test coverage
  • This change has been tested on the latest version of Java or why not

Checklist

@noetro noetro requested a review from a team as a code owner September 30, 2022 07:01
@jimmyjames
Copy link
Contributor

👋 hi @noetro, thanks for the contribution! Appreciate your willingness to contribute 🙇. Also really like the additional tests (those would be good to add regardless of this PR).

Can you expand a bit on the optimizations benefit in this change? As you noted, I see that we aren't allocating an array prior to ensuring the format is correct, and also not re-allocating in case the alg is none. Any thought to how those optimizations compare to the repeated calling of indexOf and substring? Just want to ensure the optimizations are enough to justify the additional code and handling introduced here. Thanks!

@noetro
Copy link
Contributor Author

noetro commented Oct 4, 2022

Hi @jimmyjames - Sure, happy to discuss. You don't get around the indexOf and substring calls in any scenario here. Have a look at the implementation of String.split(regex) that the current code uses (at least on my OpenJDK). It has a fast path to try to detect a single character regex and uses a similar but more generic implementation (it needs to grab N parts) that is basically a while -> indexOf -> substring pattern.

So, it's a similar implementation in terms of dicing the string. We can just be more specific as we know exactly a pattern we would accept. We can get out early if we don't have two-and-only-two delimiters, and we can avoid allocating a large array if we try to parse a large string with many delimiters.

@noetro
Copy link
Contributor Author

noetro commented Oct 4, 2022

I also took a quick look at a few other open source libraries. It seems like a mixed bag of people doing split(regex) (Vert.x JWT auth) vs. something like what I have here with a few indexOf (Jose JWT)

@jimmyjames
Copy link
Contributor

@noetro thanks for the info! Change looks good, just failing on a couple checkstyle issues. You can see the issue locally by running ./gradle clean check. If you can push another commit to resolve those issues, we can merge this PR. Thanks!

@noetro
Copy link
Contributor Author

noetro commented Oct 6, 2022

Oops, missed there was checkstyle here. I pushed a commit correcting them @jimmyjames

@jimmyjames
Copy link
Contributor

Thanks @noetro! Looks like there are some additional testing paths that need to be covered: https://github.com/auth0/java-jwt/pull/611/checks?check_run_id=8736927827

Once we get tests for those paths we should be good to go. Thanks again!

@noetro
Copy link
Contributor Author

noetro commented Oct 7, 2022

Added a test case for "no parts" @jimmyjames . Intellij at least reports 100% test coverage. I haven't used codecov before - I'm not sure I agree that all 3 added lines lacked coverage. Those thrown exceptions were asserted in the TokenUtilsTest.

We were missing a test case for no delimiters in the input string though, so I added shouldThrowOnSplitTokenWithNoParts and that moved Intellij reporting to 100% on TokenUtils when running TokenUtilsTest

@jimmyjames
Copy link
Contributor

@noetro I think I know what the issue is. It might be an oddity with the jacoco test coverage, but we should update the TokenUtils#wrongNumberOfParts method to just return a new JWTDecodeException, instead of throwing it. Because in use, currently the change calls throw wrongNumberOfParts(2). If you change wrongNumberOfParts to simply return a new exception, that will be correct and also I bet it fixes the coverage issue. Thanks!

Return a new JWTDecodeException from private utility method `wrongNumberOfParts`, instead of throwing, since we throw from `splitToken()`.
@jimmyjames
Copy link
Contributor

I added a new commit that changes wrongNumberOfParts to return a new JWTDecodeException, as discussed.

We're going to release 4.1.0 early next week, then will merge this change. Thanks again!

@noetro I think I know what the issue is. It might be an oddity with the jacoco test coverage, but we should update the TokenUtils#wrongNumberOfParts method to just return a new JWTDecodeException, instead of throwing it. Because in use, currently the change calls throw wrongNumberOfParts(2). If you change wrongNumberOfParts to simply return a new exception, that will be correct and also I bet it fixes the coverage issue. Thanks!

@noetro
Copy link
Contributor Author

noetro commented Oct 8, 2022

Interesting find @jimmyjames - thanks for pushing.

@noetro
Copy link
Contributor Author

noetro commented Oct 8, 2022

Sounds good @jimmyjames - I'll have another one for you after we merge this that's a slightly bigger change but makes a noticeable performance improvement on deserialisation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants