Main thread panic with fetch() #7208
Comments
Thanks for the very detailed reproduction.
@lucacasonato np, wish I could do more, but I don't even know where to start debugging the Rust side of Deno =/ (happy to learn if you point me to some useful resources). So far I have run through about 1500 images, of which 11 caused a panic, all with the same kind of pattern I described above: the content-disposition header always has some encoded characters in it when stuff breaks... so I'd definitely go down that rabbit hole first if I knew how to debug this. Updated the initial comment with the new data. This sounds kinda like what seems to happen with
from https://stackoverflow.com/questions/36362020/what-is-unwrap-in-rust-and-what-is-it-used-for |
From the docs of the reqwest:
I think the 'filename' field could be encoded wrong. IMO we could sidestep this problem by matching against the error and ignoring it. At least Deno won't crash, right? For a real solution we would need to split the "content-disposition" value around ";" and ignore the "filename" field when possible. But that seems a painful task, at least for me. If someone confirms my analysis I might help out.
@mustafapc19 I wouldn't be surprised at all if the filename is encoded wrong, at least based on what I know about the platform that does the "encoding" =)...but it's a proprietary black box, so nothing I could change on how that is done, therefore I agree it would probably be best to simply ignore the error to_str throws... |
I might be reproducing the same issue. It's not exactly on the same line, but I'm using 1.3.2.
Here is the script I ran, along with the backtrace resulting from I'm running deno (https://github.com/denoland/deno/releases/tag/v1.3.2) on Manjaro Linux. |
I think this can be mitigated if we convert the "val" there into a byte array, split it on ";", and ignore the filename in case of an error. But like I said, I don't know if this would be worth the pain. Or we could just ignore the encoding error. Either way, I don't think we should let the server crash. Could someone greenlight this? @lucacasonato
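The byte-level split suggested here could look roughly like the sketch below. It is only an illustration using the Rust standard library, not Deno's code: `decodable_params` is a hypothetical helper, and the naive split on `;` does not handle semicolons inside quoted strings.

```rust
/// Hypothetical helper (not part of Deno): split a raw `content-disposition`
/// value on `;` at the byte level and keep only the parameters that decode
/// as UTF-8. Undecodable ones (for example a badly encoded `filename`) are
/// dropped instead of causing a panic.
/// Assumption: no `;` appears inside a quoted string.
fn decodable_params(raw: &[u8]) -> Vec<String> {
    raw.split(|&b| b == b';')
        // `from_utf8` returns Err for invalid bytes; `.ok()` turns that into a skip.
        .filter_map(|part| std::str::from_utf8(part).ok())
        .map(|s| s.trim().to_owned())
        .filter(|s| !s.is_empty())
        .collect()
}
```

With a value whose `filename` parameter contains a raw 0xFC byte, only the disposition type and the `filename*` parameter would survive.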
A very odd property of HTTP is that header values do not need to be strings: they can be raw binary data.
It appears the "content-disposition" header has some non-UTF-8 values in it. That is valid HTTP but triggers a bug in Deno. What does browser
@ry I might be wrong, but "content-disposition" only makes sense in a browser, right? At least that is what I gathered from MDN. So I don't really get why Deno should even be processing this header. But yeah, a badly encoded binary value will make Deno crash on other headers too and could be used to crash a server. So I do think we should ignore the header, and maybe warn, when "to_str()" returns an error, instead of unwrapping.
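A rough sketch of the "ignore and maybe warn" idea, using only the standard library. Both function names are hypothetical, and `header_to_str` is a stand-in that assumes the real `to_str()` accepts only visible ASCII (plus tab); that assumption should be checked against the `http` crate documentation.

```rust
/// Std-only stand-in for the header-to-string conversion (assumption: the
/// real `to_str()` rejects anything outside visible ASCII and tab).
fn header_to_str(value: &[u8]) -> Option<&str> {
    if value.iter().all(|&b| b == b'\t' || (0x20..=0x7E).contains(&b)) {
        std::str::from_utf8(value).ok()
    } else {
        None
    }
}

/// Hypothetical helper: convert raw headers to strings, skipping (and warning
/// about) values that cannot be represented, instead of unwrapping.
fn readable_headers(raw: &[(&str, &[u8])]) -> Vec<(String, String)> {
    let mut out = Vec::new();
    for (name, value) in raw {
        match header_to_str(value) {
            Some(text) => out.push((name.to_string(), text.to_owned())),
            None => eprintln!("warning: skipping header `{}` with non-ASCII value", name),
        }
    }
    out
}
```

Dropping the header loses information, but a single odd response can no longer take down the whole process.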
I think technically that non-UTF-8 char should be a 'ü' character (check the UTF-8 encoded second version and decode it), which is definitely part of UTF-8; the source somehow messes that up... but the same could happen with other sources too, or even be used maliciously... Browser
I seem to have a talent for causing Deno-panics^^, looking for a workaround until the fetch issue is resolved, I implemented a simple download replacement with curl:
it doesn't have problems with those strange characters in the header, but it crashes kinda randomly with always the same error:
When I restart the script, it seems to download the file without problems (need to investigate further) until it panics again at some point, so this one is harder to reproduce (I can't do it reliably so far)... but it could also be an implementation error on my side with
EDIT: I actually didn't have that part with
EDIT 2: After trying everything under the sun (including killing the child process with a timeout etc.) it turned out that the *******
First off, if anything I say is wrong, don't hesitate to correct me.
The problem with variable-byte encodings is that you can't just ignore the header since you don't even know where it ends. If any part of it isn't UTF-8, you can't proceed with the iteration, therefore you can't even determine where the header ends.
The weird part about this is that German umlauts aren't ASCII control characters, and are perfectly visible. Unrelated to the thread, but, since you seemed interested, if you really wanted to learn Rust, you should just check out their site: https://www.rust-lang.org/learn. Regarding the problem, can the value be read as a byte array, like |
I tried looking up how the code should behave, and it gets a bit complicated.
Main reference: https://www.cuba-platform.com/blog/utf-8-in-http-headers/ So I think the key to solving this issue is to first not assume that the headers are UTF-8, and instead try to go by the spec (using ISO-8859-1), then add handling of UTF-8 headers using guesstimation - in order to improve support for out-of-spec real world scenarios. |
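One possible reading of this proposal, as a std-only sketch: try UTF-8 first (the "guesstimation" step) and fall back to ISO-8859-1, where each byte maps to the Unicode code point of the same value. `decode_header_value` is a hypothetical name, and the order of the two steps is an assumption; Latin-1 text that happens to form valid UTF-8 would be misread by it.

```rust
/// Hypothetical helper: decode header bytes without ever failing.
/// 1. If the bytes are valid UTF-8, assume UTF-8 (out of spec, but common).
/// 2. Otherwise treat them as ISO-8859-1: every byte 0x00..=0xFF is the
///    Unicode code point with the same number, so the cast cannot fail.
fn decode_header_value(raw: &[u8]) -> String {
    match std::str::from_utf8(raw) {
        Ok(text) => text.to_owned(),
        Err(_) => raw.iter().map(|&b| b as char).collect(),
    }
}
```

Under this scheme both a raw 0xFC byte and the two-byte UTF-8 sequence 0xC3 0xBC decode to 'ü'.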
We've had a major overhaul of |
#11070 should fix this. |
Confirmed #11070 fixes this. The original reproduction does not panic anymore.
I'm running into some strange issues when fetching image files. I pulled my hair out trying to catch the problematic portion... turns out the problematic portion is simply fetch() itself =), and there is no catching or working around the issue; it simply crashes each time it encounters whatever the problem is. I reduced what I'm doing to a (hopefully) reproducible snippet:
The error I'm getting is:
thread 'main' panicked at 'called Result::unwrap() on an Err value: ToStrError { _priv: () }', cli/ops/fetch.rs:96:42
I thought at first it might be some strange thing with one specific image/file, but then I skipped that one and soon thereafter ran into the same crash again, so it isn't only one specific file. I think it is something within the headers of the response that fails, not the actual file data, because the Rust stack trace points to this:
from here
deno/cli/ops/fetch.rs
Line 97 in 6d964fc
I'm assuming it is specifically this part: val.to_str().unwrap().to_owned(), as it calls .unwrap(), which is mentioned in the stack trace... I had a closer look at the headers (via browser inspector > network tab) and the only potentially problematic thing I'm seeing for ALL "panic" files is the content-disposition header: it contains the actual filename the image was uploaded as, and when that name has (seemingly?) encoded German umlauts in it, things break. As far as I can take from MDN:
so technically that would mean the filename* one is taken, which seems properly encoded? But yeah, just guessing here, the pattern seems quite constant though... Might be somehow related to #6649 and #6965.
PS: Just upgraded to Deno 1.3.1, same issue, but had the same on the version before that, don't remember what it was, but for sure 1.x!
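For illustration, a non-panicking alternative to the val.to_str().unwrap().to_owned() pattern quoted above could be a lossy conversion. This is a std-only sketch on raw bytes, not the actual Deno code or the eventual fix; `header_value_lossy` is a hypothetical name.

```rust
/// Hypothetical helper (not Deno's code): turn raw header bytes into a String
/// without panicking. Invalid UTF-8 sequences are replaced with U+FFFD
/// (the replacement character), so the result is always a valid String.
fn header_value_lossy(raw: &[u8]) -> String {
    String::from_utf8_lossy(raw).into_owned()
}
```

The trade-off is that the original bytes cannot be recovered from the result, which may or may not matter for a header like content-disposition.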