Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Multipart delimiter boundary parsing #502

Merged
merged 9 commits into from
Oct 14, 2024

Conversation

calvincestari
Copy link
Member

@calvincestari calvincestari commented Oct 4, 2024

Fixes apollographql/apollo-ios#3453.

The multipart message parsing code was not splitting message chunks at the correct boundary. It would only look for the dash boundary (--graphql) instead of the full delimiter (\r\n--graphql). This meant that if the dash boundary was present in the message body the chunk would be split causing a parsing error.

  • Multipart delimiter changed to match the RFC specifications.
  • Refactored property and function names to match multipart RFC terminology.
  • Test data updated to represent RFC specification-matching message data.
  • URLSessionClientTests restyled and new tests added to ensure correct delimiter parsing.
  • Tests added to defer and subscription protocol parsers to ensure correct delimiter parsing.
  • I've also done manual integration testing with the client-router-e2e-tests repo test services and everything works as expected.

@svc-apollo-docs
Copy link
Collaborator

svc-apollo-docs commented Oct 4, 2024

✅ Docs Preview Ready

No new or changed pages found.

Copy link

netlify bot commented Oct 4, 2024

Deploy Preview for eclectic-pie-88a2ba canceled.

Name Link
🔨 Latest commit 010f5f4
🔍 Latest deploy log https://app.netlify.com/sites/eclectic-pie-88a2ba/deploys/67004a3ef2e6e900085ff50f

Copy link

netlify bot commented Oct 4, 2024

Deploy Preview for apollo-ios-docc canceled.

Name Link
🔨 Latest commit 010f5f4
🔍 Latest deploy log https://app.netlify.com/sites/apollo-ios-docc/deploys/67004a3edabae100087061da

@calvincestari calvincestari marked this pull request as ready for review October 4, 2024 20:07
self.wait(for: [expectation], timeout: 1)
}

func test__multipart__givenChunkContainingBoundaryStringWithoutClosingBoundary_shouldNotSplitChunk() throws {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't appear to actually test the critical path.

I was a bit confused about how this managed to work, but when I put a breakpoint on L581, we're actually getting the callback from urlSession dataTask didCompleteWithError and not urlSession dataTask didReceive. Which means we're bypassing the problematic parsing code.

From what I could see earlier, this behavior is actually time sensitive, making it hard to control which parts of URLSessionClient actually get exercised. Which is why I ended up calling its data handler directly.

Copy link
Member Author

@calvincestari calvincestari Oct 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit confused about how this managed to work, but when I put a breakpoint on L581, we're actually getting the callback from urlSession dataTask didCompleteWithError and not urlSession dataTask didReceive. Which means we're bypassing the problematic parsing code.

This is an implementation detail; there is no guarantee the different callbacks nor the logic within them will remain the same in the future. The test message data is what's important and there are multiple tests to cover different combinations of the test data.

  • test__parsing__givenSingleChunk_withDashBoundaryInMessageBody_shouldReturnSuccess
  • test__multipart__givenChunkContainingBoundaryString_shouldNotSplitChunk
  • test__multipart__givenChunkContainingBoundaryStringWithoutClosingBoundary_shouldNotSplitChunk

If you put a breakpoint on both callbacks you'll see that they're both still called. The important detail in this test is that the urlSession(session:dataTask:didReceive:) does not split the chunk incorrectly and also that it doesn't send .failure simply because there is no closing boundary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I could see earlier, this behavior is actually time sensitive, making it hard to control which parts of URLSessionClient actually get exercised. Which is why I ended up calling its data handler directly.

It's not time sensitive but dependent on the type of response (multipart header) and number of chunks. As data is sent in it's checked for complete chunks so they can be processed immediately; this is the callback that needs to check chunk boundaries. Then once the request completes any left over response data is sent on for further parsing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D'oh, you're right. I was skimming and skipped some of the crucial parts. Looks good!

Copy link
Contributor

@AnthonyMDev AnthonyMDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM! Is there any need to test when the body content inadvertently has something that looks like a multipart boundary somewhere in the text of the JSON?

@calvincestari
Copy link
Member Author

calvincestari commented Oct 8, 2024

This LGTM! Is there any need to test when the body content inadvertently has something that looks like a multipart boundary somewhere in the text of the JSON?

Yes, that's what the tests I mentioned above do. Note that there are no tests for body data that contains the dash-boundary preceded, nor followed, by \r\n because that is explicitly not allowed in RFC2046. If that combination did appear in a message body it would be a malformed mulitpart chunk.

@calvincestari calvincestari merged commit 5ace9ce into main Oct 14, 2024
38 checks passed
@calvincestari calvincestari deleted the fix/multipart-delimiter-boundary-parsing branch October 14, 2024 17:50
BobaFetters pushed a commit to apollographql/apollo-ios that referenced this pull request Oct 14, 2024
BobaFetters pushed a commit that referenced this pull request Oct 14, 2024
bc5a0cdf fix: Multipart delimiter boundary parsing (#502)

git-subtree-dir: apollo-ios
git-subtree-split: bc5a0cdfd11388824aace092701cbdd19ac1c575
BobaFetters pushed a commit that referenced this pull request Oct 14, 2024
git-subtree-dir: apollo-ios
git-subtree-mainline: 012913f
git-subtree-split: bc5a0cdfd11388824aace092701cbdd19ac1c575
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

A multipart message with the boundary string inside the body fails parsing
4 participants