-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UTF-8 encoding detection fails on PHP 8.1 #181
Comments
thx for the report
did you check whether its a intended change in php-src or maybe even a php 8.1 only bug? |
Opened php/php-src#8279 |
Note that there is the same problem in https://github.com/sabre-io/dav/blob/master/lib/DAV/StringUtil.php#L78 and https://github.com/sabre-io/vobject/blob/master/lib/StringUtil.php#L41 |
@staabm See the PHP dev answer, this change is on purpose. Can you clarify why encoding detection is needed in this case? Is the webdav protocol accepting any encoding or is it because some clients are misbehaving? If we really have to detect the encoding then current behavior is fine, if (as I hope) the protocol is enforcing utf-8 we should only detect encoding if the input is not valid utf-8. |
I can't say anything about the protocol, but technically the detection is needed because |
But how come we do not know the encoding of what we are receiving in the first place? |
I don't know.
Yep, after the discussion it seems the way forward. |
I can only guess here .... sabre dav has a long history and a lot of bug fixing went into buggy clients sending essentially 💩 . |
btw: I tracked back the history when this was added and found the commit 885ada5#diff-c8569a19ab0202cf69472c1bc9c1945632a552d660283e3c8354408803a12972R79 |
@come-nc thanks for investigating this issue. I reported this to Nextcloud repository since I noticed the problem, not only with file names but also with contact names. Since (probably) more then 90% names in Slavic languages contain one of šđčćž it's quite a problem. After reading the discussion at php/php-src#8279 I'm even more worried. It seems that sometimes PHP will guess UTF-8 and sometimes it will guess ISO-8859-1 for the same application. It seems very awkward. Shouldn't the encoding be detected from the HTTP request header or something? Unless there's is a way to specify the encoding within WebDAV protocol, I strongly suggest assuming UTF-8 by default. |
Also note that |
Use mb_check_encoding to detect encoding as mb_detect_encoding is misbehaving under PHP 8.1. Also use mb_convert_encoding instead of utf8_encode as it’s getting deprecated in PHP 8.2. Fixes sabre-io#181
Hello
Someone openned an issue at Nextcloud signaling that creating a folder with special character broke when using PHP 8.1.
It appears that this comes from mb_detect_encoding changes in PHP 8.1 that end up detecting ISO-8859-1 instead of UTF-8
The code handling this is the same as the one in you decodePathSegment at https://github.com/sabre-io/http/blob/master/lib/functions.php#L404
You can see the behavior here: https://3v4l.org/IT9dJ
It feels we should not be trying to detect encoding here but maybe assume UTF-8, or only check with mb_check_encoding (which does return true for UTF-8).
What is the reason for this test, are some dav client sending ISO-8859-1? Is that allowed by webdav protocol?
Would this be ok instead:
The text was updated successfully, but these errors were encountered: