Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[🐛 BUG]: RoadRunner alters form body #1696

Closed
1 task done
revenkroz opened this issue Aug 28, 2023 · 25 comments · Fixed by roadrunner-server/http#200
Closed
1 task done

[🐛 BUG]: RoadRunner alters form body #1696

revenkroz opened this issue Aug 28, 2023 · 25 comments · Fixed by roadrunner-server/http#200
Assignees
Labels
B-bug Bug: bug, exception
Milestone

Comments

@revenkroz
Copy link

revenkroz commented Aug 28, 2023

No duplicates 🥲.

  • I have searched for a similar issue in our bug tracker and didn't find any solutions.

What happened?

There is one somewhat old service and a client to it. Files are sent there wrapped in application/x-www-form-urlencoded. When this service was migrated to RoadRunner, it was discovered that some of the files were corrupted.

We tried to fix it with raw_body: true option and parse the request ourselves, but the problem persists. Something inside RoadRunner is changing the contents of the request.

Version (rr --version)

2023.2.2

How to reproduce the issue?

I did a quick test with RR and default PHP server. The last one saves files properly, RR doesn't.

Repo has a test.png file for test - this file is sent to the server and saved under a new name so you can check what's going on.

https://github.com/revenkroz/rr-urlencoded-bug

Relevant log output

No response

@rustatian
Copy link
Member

Hey @revenkroz 👋🏻

Something inside RoadRunner is changing the contents of the request.

I'm not quite sure that I understand the problem. What kind of changes do you mean? Order of the args?

@revenkroz
Copy link
Author

I mean the contents of some form fields in the request are different from what ends up in the application. For example field attachments[0][data] has one content and the app gets slightly different.

@rustatian
Copy link
Member

@revenkroz I mean, what is different? How did you check this? Have you tried saving and comparing files?

@revenkroz
Copy link
Author

@rustatian Yeah, I've tried sending and saving files. As you can see in repo I mentioned above, when you send file to an app running with RR and save it, the file becomes corrupted. If you run a default PHP server (e.g. php -S localhost:58080) and send file there it save a valid file.

@rustatian
Copy link
Member

@revenkroz I'm a little suspicious of that line:

parse_str((string)$request->getBody(), $decodedBody);

RR sends byte-encoded data and casting it to string type might be wrong. Are you getting the same result with $decodedBody = $request->getParsedBody(); ?

P.S.: I'll check this tomorrow, I'm working on the other feature, and can't check your code right now 😭

@revenkroz
Copy link
Author

@rustatian yeah, the same thing with getParsedBody().

@rustatian
Copy link
Member

@revenkroz Oh, I missed that. Are you trying to send a file with application/x-www-form-urlencoded ? That is incorrect. If you wanted to send a file, you should use multipart/form-data Content-Type. You may read the answer here: https://stackoverflow.com/questions/4007969/application-x-www-form-urlencoded-or-multipart-form-data

@revenkroz
Copy link
Author

revenkroz commented Aug 30, 2023

@rustatian True. It should be another Content-Type or should be encoded in base64 at least. But we have an old client package for this service that sends in application/x-www-form-urlencoded and this client is used in many other services, so I'd like to keep the default PHP behavior.

@rustatian
Copy link
Member

@revenkroz You actually getting the raw body 😃 (if you use the raw_body flag). The body contains all the data you sent to the RR: https://github.com/revenkroz/rr-urlencoded-bug/blob/main/test.php#L11-#L13, it's not just a png file. So while you're trying to save that, you're also saving all the urlencoded args that aren't part of the sent png file. In case of RAW body, you have to parse it yourself.

The second part is parsed body. Your attached PNG contains a lot of symbols that should be escaped. Since you are forcibly casting the data field to a string (I'm not a PHP dev, but I think this is direct casting), you have all the escaped characters unescaped. So your file looks corrupted.

Sample of the escaped output:

{"attachments":{"0":{"name":"document (2).pdf","type":"application/pdf","data":"\ufffdPNG\r\n\u001a\n\u0000\u0000\u0000\rIHDR\u0000\u0000\u0000\u0004\u0000\u000

@revenkroz
Copy link
Author

@rustatian my bad, mistyped :) I updated the example.

In case of raw_body=true I parse with parse_str function that decode as it should be — the result is corrupted file.
In case of raw_body=false and getParsedBody() — same result.
In case of default PHP (without RR) when I parse with parse_str function like in first example — the result is ok.

I tried to fix all of this by writing my own body parser for RR app, but it is not working in all my tests.

@rustatian
Copy link
Member

Could you please save the data you receive inside the RR (with and without raw) and attach these files to your repository?

@revenkroz
Copy link
Author

@rustatian here: https://github.com/revenkroz/rr-urlencoded-bug/tree/main/results

I updated the code to save data from request. Now when testing, you only need to change raw_body param and corresponding .png will be updated in results dir.

@rustatian
Copy link
Member

Uh, yeah, I think that's because we're sending the JSON from RR to the PHP worker. We're moving to the PROTO protocol, so we won't have these weird content issues like PNG files.
I think you need to encode this file with the base64 to get the needed output. Are there any problems doing this in the legacy system you mentioned?

@rustatian
Copy link
Member

@revenkroz Or you can try multipart Content-Type 😃

@github-project-automation github-project-automation bot moved this to Backlog in General Aug 30, 2023
@rustatian rustatian added this to the v2024 milestone Aug 30, 2023
@rustatian rustatian moved this from Backlog to Todo in General Aug 30, 2023
@revenkroz
Copy link
Author

@rustatian Sounds great! I will test it after the move from json.

Yeah, we have a problem with our legacy codebase - I think we should try to use another methods (multipart or base64), but update will be hard :) Thanks for help anyway!

@rustatian
Copy link
Member

@revenkroz I'll leave this bug open, when I implement RAW proto payloads, I'll ping you here 😃

@Starfox64
Copy link

@rustatian I'm unsure if this is the direct cause of Author's problem but in raw_body mode, the body is decoded instead of being sent as is.
https://github.com/roadrunner-server/http/blob/0f8a315780502eb234760919d99e43048c4542eb/handler/request.go#L114

This is incorrect as it breaks value escaping done by the browser (when the user inserts & [ ] = in a field) and will prevent accurate usage of parse_str. This might also be a security issue as it allows injecting query parameters from a form field.

This is hitting us particularly hard as we are trying to work around #1634 by parsing urlencoded bodies manually but are unable to do so since the raw body is also corrupted.

@rustatian
Copy link
Member

Hey @Starfox64 👋
I'm working on the fix. Basically, we have to do that because of JSON communication. I'm working on a protobuf-based payloads between the PHP workers and RR to completely resolve this problem. I'm sorry that this makes your usage of RR uncomfortable, but very soon that problem will disappear 😃

@rustatian
Copy link
Member

Hey everyone, we tested this scenario with the new proto encoder, everything works good. You'll need to use raw_body: true option to receive untouched by RR payload.

@revenkroz
Copy link
Author

@rustatian Hi! I compiled RR with github.com/roadrunner-server/http/v4@feat/new-algo mod, but but I am now getting a different error. When I run code from my demo repo this message appears:

write response (chunk) error	{"start": "2024-03-29T09:42:16+0000", "elapsed": 2, "error": "unknown payload type: 0"}

@rustatian
Copy link
Member

Hey @revenkroz 👋
Did you update PHP worker (composer.json) dependencies?

@revenkroz
Copy link
Author

@rustatian can confirm, it works! With installed spiral/roadrunner-http:dev-feature-proto-payloads and raw_body=true everything works as expected.

@rustatian
Copy link
Member

Cool 👍
The PR is still in progress and will be merged when the PHP deps would be released.

@rustatian
Copy link
Member

BTW: do not forget to install the protobuf dependency via pecl install protobuf. This allows PHP to parse protobuf payloads almost 2 times faster.

@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in Jira 😄 Apr 1, 2024
@msmakouz
Copy link
Member

msmakouz commented Apr 2, 2024

@rustatian can confirm, it works! With installed spiral/roadrunner-http:dev-feature-proto-payloads and raw_body=true everything works as expected.

Hi, the package roadrunner-php/http has already been released, version 3.5.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-bug Bug: bug, exception
Projects
Status: ✅ Done
Development

Successfully merging a pull request may close this issue.

4 participants