Saving entry failed if slug includes 4 byte characters, such as Japanese #4628
Comments
We also encounter problems with slug generation in Craft 3.2. On sites that ran on Craft 3.1, there are entries that contain special characters in their slugs; Craft did not remove those characters in 3.1 (I have never been a fan of this, but that's how Craft used to work). With Craft 3.2, those entries become unsaveable, with a very weird encoding showing up after hitting the save button. While on a single-site setup the slug can be corrected by hand, on a multisite setup it is completely impossible to save those entries (as the other sites, which cannot be corrected by hand, will throw an error when saving).
Sidenote: If this turns out to be another ICU problem, I would strongly recommend investigating alternatives. We had problems with Craft and its dependency on ICU for downcoding before (in our case, we could not upload assets because the ICU version on a shared host was too old). There are solid PHP libraries out there that handle character downcoding very well, without the hassle of relying on the ICU tables.
|
Okay, the issue seems to be this regular expression: cms/src/helpers/ElementHelper.php Line 79 in a6ee904
The regular expression splits multibyte characters in half; after the strings are joined back together, an illegal string is created, e.g.: |
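The failure mode described above can be reproduced outside of Craft. The following is a minimal Python sketch, not the code from `ElementHelper.php`: it assumes a regex engine that treats every byte as a single Latin-1 character (emulated here by decoding the UTF-8 bytes as Latin-1), and the function names are hypothetical. It shows how splitting on non-word "characters" byte by byte and rejoining the pieces yields invalid UTF-8, while a Unicode-aware split leaves the slug intact.

```python
import re


def rejoin_bytewise(slug: str, sep: str = "-") -> bytes:
    """Split and rejoin a slug as if every UTF-8 byte were one Latin-1 character."""
    raw = slug.encode("utf-8")
    # Assumption: emulate a byte-oriented regex by viewing each byte as Latin-1.
    as_single_bytes = raw.decode("latin-1")
    words = [w for w in re.split(r"\W+", as_single_bytes) if w]
    return sep.join(words).encode("latin-1")


def rejoin_unicode(slug: str, sep: str = "-") -> str:
    """Split and rejoin a slug on whole Unicode code points."""
    words = [w for w in re.split(r"\W+", slug) if w]
    return sep.join(words)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


broken = rejoin_bytewise("こんにちは")
print(broken, is_valid_utf8(broken))  # lead bytes survive, continuation bytes are lost
print(rejoin_unicode("こんにちは"))
```

In the byte-wise variant, only the lead byte of each three-byte hiragana character counts as a "letter", so the continuation bytes are discarded as separators and the rejoined result can no longer be decoded.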
Yeah, sorry about that; that should have been flagged as a Unicode regex. Just fixed this for the next release. To get the fix early, change the craftcms/cms entry in your "require" block to:
"require": {
  "craftcms/cms": "dev-develop#ccd3182d187fd12627da706b6acccc98df0a0f92 as 3.2.5.1",
  "...": "..."
}
Then run |
Thanks for the quick fix! I actually just found out that Craft has a config option to downcode slugs, it's called cms/src/validators/SlugValidator.php Line 75 in c2c33cd
So non-ASCII characters are only removed if the slug is empty, which is never the case, as the slug will be set by JavaScript in the frontend. I've just played around with a breakpoint in there, and the only way I got it to trigger was by creating an entry in code, not setting the slug, and saving it. So users are free to throw in any fancy multibyte characters they like. It would be great if we could force Craft to always remove non-ASCII characters from slugs; I generally don't want characters like "ä" or "ß" in my slugs, and they should be replaced by "ae" or "ss". The same is true for uploaded assets, another area where I've seen files with strange characters in their filenames uploaded. So, could we have an option like |
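To make the requested behaviour concrete, here is a minimal Python sketch of always-on downcoding. It is not Craft's implementation: the mapping table and the `ascii_slug` name are hypothetical, and only the "ä" to "ae" and "ß" to "ss" replacements come from the comment above (the other umlauts are added by analogy).

```python
import re
import unicodedata

# Hypothetical mapping; only "ä" -> "ae" and "ß" -> "ss" are taken from the thread.
GERMAN_MAP = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}


def ascii_slug(text: str, sep: str = "-") -> str:
    """Downcode text to a lowercase, ASCII-only slug."""
    # Apply language-specific replacements first.
    text = "".join(GERMAN_MAP.get(ch, ch) for ch in text)
    # Decompose accented letters, then drop whatever is still non-ASCII.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Keep runs of ASCII letters and digits as the slug's words.
    words = re.findall(r"[A-Za-z0-9]+", text)
    return sep.join(words).lower()


print(ascii_slug("Straße über Äpfel"))
print(ascii_slug("Crème brûlée"))
```

Note that a purely ASCII approach like this one returns an empty string for input such as こんにちは, so a site with Japanese or Chinese content would still need a transliteration step or a fallback slug.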
@sebastian-lenz The |
@brandonkelly |
@watarutmnh can you send your |
@brandonkelly I sent the data. Thank you! |
@watarutmnh Thanks! I was able to reproduce and just got it fixed for today’s 3.2.6 release. |
@brandonkelly I've just tried out the prerelease you gave here, and when I use it I get an error because of the new version of Imagine. It looks like there is a bug in Imagine. Should I comment here, open a new issue for Craft, open a new issue for Imagine, or are you already aware of the problem with the new Imagine version? |
@sebastian-lenz that’s already fixed. |
We just released Craft 3.2.6 with the fix for this. |
Description
I cannot save entries if the slug includes 4-byte characters, such as Japanese and Chinese.
Steps to reproduce
1. Enter こんにちは into the slug field
2. Click the Save Entry button