thread 'main' panicked at 'lambda runtime failed: Error("failed to convert header to a str", line: 0, column: 0)' #786

Closed
andreas-venturini opened this issue Jan 20, 2024 · 22 comments · Fixed by #800

Comments

@andreas-venturini

andreas-venturini commented Jan 20, 2024

We use the AWS Lambda Web Adapter in combination with a Docker container image to resize images on the fly. This works great, but occasionally (though very rarely) we observe the following error in CloudWatch:

thread 'main' panicked at 'lambda runtime failed: Error("failed to convert header to a str", line: 0, column: 0)', src/main.rs:25:25


Any pointers on how we might debug this would be appreciated!

@calavera
Contributor

calavera commented Jan 20, 2024

If you set RUST_LOG=trace in the environment, you'll be able to see the payload that the function receives in the log. These kinds of errors occur because the runtime cannot deserialize the payload into the struct that your function receives.

@calavera
Contributor

Actually, now that I look at it, it might be something caused by Lambda sending the runtime an unexpected internal value 🤔

I might take a look at surfacing all that information better.

@calavera
Contributor

I cannot find anything in the runtime that could have thrown that panic.

@andreas-venturini
Author

Thanks for your comments.

So your suggestion would be to set RUST_LOG=trace for our Lambda function and check the payload the function receives when these errors occur?
Is there anything performance related we should be aware of when enabling the tracing log level? Or does this just increase the log volume?

@calavera
Contributor

It just increases the log volume; there is no difference performance-wise. We don't print payloads by default because there might be sensitive information in them, and we don't want to be responsible for what your CloudWatch logs include.

@andreas-venturini
Author

andreas-venturini commented Jan 23, 2024

Had to wait for almost 12 hours until another event occurred. I exported the CloudWatch logs to a gist https://gist.github.com/andreas-venturini/d68aa1266795b17f2ec623b16757e622

I anonymized the user IP and our AWS account id and replaced the Lambda domain name w/ XXX, other than that the logs are unchanged.

We have a CloudFront origin group w/ automatic failover to GCP Cloud Run (as we're still trialing Lambda for our use case) and Cloud Run was able to process the request without error (only difference is that Cloud Run uses x86_64 architecture).

So it seems to me the problem is either caused by the ARM Docker container service or the Lambda Rust runtime/AWS Lambda web adapter?

@andreas-venturini
Author

I can get this to fail consistently w/ Lambda.

Maybe the UTF char in the following header trips up the Rust runtime @calavera

Schillers sch\\xc3\\xb6nste Szenenanweisungen -Kabale und Liebe.mp4.avif

@bnusunny
Contributor

@andreas-venturini I will take a look. Could you share the version of Lambda Web Adapter you are using?

@andreas-venturini
Author

@bnusunny we are using v0.7.2

@calavera
Contributor

calavera commented Jan 23, 2024

It might be related to this: #509

See this specific comment: #509 (comment)

@bnusunny
Contributor

@andreas-venturini I couldn't reproduce the error with that filename.

.header('content-disposition', "inline; filename=\"Schillers sch\\xc3\\xb6nste Szenenanweisungen -Kabale und Liebe.mp4.avif\"")

Could you share the original filename? It would be best to share the original file if possible.

@andreas-venturini
Author

@bnusunny here is a link to the source file (12 hours valid)

@bnusunny
Contributor

Got it. Thanks!

@bnusunny
Contributor

I reproduced the issue with BUFFERED invoke mode. But it works with RESPONSE_STREAM invoke mode. I will dig deeper.

@bnusunny
Contributor

bnusunny commented Jan 26, 2024

The issue happened at this line.

let body = serde_json::to_vec(&body)?;

Runtime uses serde_json::to_vec() to serialize the response object to a JSON byte vector. The filename in the content-disposition header contains non-UTF-8 characters, so the operation failed.

@calavera any suggestions on how to handle non-UTF-8 characters in the headers?

@calavera
Contributor

Nope. Serde just doesn't support it, as explained in #509 (comment)

@bnusunny
Contributor

@andreas-venturini Can you switch on response streaming mode? It should work. I need to figure out why.

@bnusunny
Contributor

Wait, Schillers schönste Szenenanweisungen -Kabale und Liebe.mp4.avif is actually valid UTF-8. I need to dive deeper.
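
As a quick check of that observation, here is a minimal std-only sketch. It assumes the escaped `\xc3\xb6` in the logged header stands for the raw bytes 0xC3 0xB6; the helper name `decode_utf8` is hypothetical and not part of the runtime.

```rust
/// Hypothetical helper: tries to read raw bytes as UTF-8 text.
fn decode_utf8(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

fn main() {
    // Assumed to be the two bytes logged as \xc3\xb6 in the header value.
    let escaped = [0xc3u8, 0xb6u8];

    // The bytes decode cleanly, so the filename is valid UTF-8.
    println!("decoded: {:?}", decode_utf8(&escaped));

    // The decoded character is outside the ASCII range, which matters
    // for any check that only accepts ASCII.
    println!("is ASCII: {}", escaped.is_ascii());
}
```

The bytes are valid UTF-8 but not ASCII, which is consistent with a failure that depends on an ASCII-only check rather than on invalid encoding.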

@bnusunny
Contributor

bnusunny commented Jan 28, 2024

I think I found the root cause. This line here uses http::HeaderValue.to_str() to convert the header value to a string.

let map_value = headers[key].to_str().map_err(S::Error::custom)?;

http::HeaderValue.to_str() only accepts visible ASCII characters, so it rejects the ö in the filename even though the header is valid UTF-8. Instead, we can use http::HeaderValue.as_bytes() to retrieve the raw bytes and convert them to a String.

let map_value = String::from_utf8(headers[key].as_bytes().to_vec()).map_err(S::Error::custom)?;

I will send a PR soon.
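
The difference between the two conversions can be sketched without the http crate. This is a std-only approximation, not the runtime's code: `to_str_ascii_only` is a hypothetical stand-in that assumes `to_str()` accepts only tab and visible ASCII bytes, and `to_string_utf8` mirrors the proposed `String::from_utf8` approach. The shortened filename is also just for illustration.

```rust
/// Hypothetical stand-in for the ASCII-only check assumed to sit behind
/// `http::HeaderValue::to_str()`: tab and bytes 0x20..=0x7E pass.
fn to_str_ascii_only(bytes: &[u8]) -> Result<&str, String> {
    if bytes.iter().all(|&b| b == b'\t' || (0x20..0x7f).contains(&b)) {
        // Every byte is ASCII, so the UTF-8 conversion cannot fail here.
        Ok(std::str::from_utf8(bytes).expect("ASCII is valid UTF-8"))
    } else {
        // Message copied from the panic reported in this issue.
        Err("failed to convert header to a str".to_string())
    }
}

/// Mirrors the proposed fix: accept any value whose raw bytes are valid UTF-8.
fn to_string_utf8(bytes: &[u8]) -> Result<String, String> {
    String::from_utf8(bytes.to_vec()).map_err(|e| e.to_string())
}

fn main() {
    // Shortened, illustrative header value containing a non-ASCII character.
    let raw = "inline; filename=\"schönste.avif\"".as_bytes();

    // The ASCII-only path rejects the value.
    println!("ascii-only: {:?}", to_str_ascii_only(raw));

    // The UTF-8 path accepts it unchanged.
    println!("utf-8:      {:?}", to_string_utf8(raw));
}
```

Note that the UTF-8 path still returns an error for header bytes that are not valid UTF-8, so a response carrying such a header would fail in the same place.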

@andreas-venturini
Author

andreas-venturini commented Jan 28, 2024

@bnusunny thanks for figuring this out 👍

> Can you switch on response streaming mode? It should work. I need to figure out why.

I briefly tried that, however, in response streaming mode Lambda metrics report numerous errors during function executions. What's strange: request processing does not seem to be affected in any way, nor are there any errors reported in CloudWatch or X-Ray. Are these internal Lambda errors? If so, is there a way to gain visibility into this?

On the chart one can clearly see when buffered mode was changed to response streaming and back (for buffered mode function execution errors are 0):

[image: chart of Lambda function execution errors in buffered vs. response streaming mode]


This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.

@bnusunny
Contributor

@andreas-venturini I didn't see such an error rate with response streaming. The error means either the Lambda function threw an exception or the Lambda service got errors (such as a function timeout or a wrong response data format).

Could you open an issue in the Lambda Web Adapter repo for this response streaming issue? If you can provide the original files that cause this issue, it will be very helpful. Search through the CloudWatch Logs for ERROR; you should be able to find those requests.
