Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent string length limits on frontend and backend #1012

Closed
lukasIO opened this issue Jul 28, 2022 · 6 comments
Closed

Inconsistent string length limits on frontend and backend #1012

lukasIO opened this issue Jul 28, 2022 · 6 comments

Comments

@lukasIO
Copy link

lukasIO commented Jul 28, 2022

Describe the bug

Similar to the recently fixed file size calculation mismatch for File Upload fields and server-side validation. this one is about the charcter count mismatch.
The server implementation uses strlength which counts bytes, the frontend uses innerHtml.length which counts characters.
For multi-byte UTF8 characters there's a mismatch between both.

Steps to reproduce

  1. use a single line text field with limit of 255 chars
  2. paste in a bunch of emojis
  3. submit
  4. see client validation pass and server validation fail

Form settings

  • Multi-page form: No
  • Submission Method: Ajax
  • Client-side Validation: Yes
  • Custom Form Templates: No

Craft CMS version

3.7.50

Plugin version

1.6.4

Multi-site?

No response

Additional context

https://github.com/verbb/formie/blob/craft-4/src/web/assets/frontend/src/js/fields/text-limit.js#L40-L42
Personally, I'd rather see a negative value here if something went wrong, instead of forcing it to display 0. It might lead to more confusion on the user side, if it incorrectly displays 0

@engram-design
Copy link
Member

Great point on emoji's hadn't even considered that would be a difference in behaviour.

As for showing negative numbers, it shouldn't ever be the case as we use the maxlength attribute for input and textarea elements so it shouldn't be possible to enter characters more than the maximum length. That check really was there to ensure that the t() function didn't show {num} when the value of the characters/words left was 0. As such, you shouldn't ever see a negative value.

But interestingly, maxlength doesn't work very well with emoji's, and there's nothing we can do about that at all. As such, we've actually gone ahead and removed this attribute, so now you do have negative numbers showing. I'm happy with this though, as it might be useful to over-type content and cull extra text (like Twitter).

Fixed for the next release. To get this early, change your verbb/formie requirement in composer.json to:

"require": {
  "verbb/formie": "dev-craft-3 as 1.6.5",
  "...": "..."
}

Then run composer update.

@engram-design
Copy link
Member

engram-design commented Jul 29, 2022

Actually, upon further reflection, this might be tricky. The reason being is that the string length actually differs from input (and JS, client-side checks) to saving it. We need to translate:

test 😅

to

test :sweat_smile:

In order for it to save correctly in some database engines. As you can see, the length is different once we get to server-side validation to actually save it to the database.

Respectively, the byte length above is 7 and 18. We're using byte not character length, because that's important when saving the value, with database column restrictions. I've also noted other platforms out there will do a similar thing (Twitter for instance). This is particularly important for fields with an inherent 255 character limit due to VARCHAR limits.

Instead of using Litemoji, we could use StringHelper::encodeMb4() to convert it to be saved, which would be a similar result:

test 😅 = byte length 14

Which would make the length of this emoji 9 instead of the native 2.

I'll have to give this some further thought.

@engram-design
Copy link
Member

Alright, I think I've figured out a way to make this work. The only downside is an emoji now "weighs" 9 characters long. There's nothing we can do about this I'm afraid, as we store the emoji as a HTML entity (😅) in the database, so that's the actual length of the use of emojis.

This should now be at least using the same values client and server side.

@lukasIO
Copy link
Author

lukasIO commented Jul 29, 2022

thanks for looking into this!

not sure what the solution you came up with entails, just wanted to mention that I don't think the problem is restricted to emojis only, as there's a bunch of utf8 characters that use more than one byte:
https://design215.com/toolbox/ascii-utf8.php
https://design215.com/toolbox/utf8-3byte-characters.php
https://design215.com/toolbox/utf8-4byte-characters.php

edit: ah, I see you replaced strlen with the string helper count! great :)

@engram-design
Copy link
Member

Yep, I believe that should be factored in. I've actually just made one more final fix.

On the server-side we now use strlen(LitEmoji::encodeHtml('😅')). This will convert multibyte characters to their HTML entity equivalent, which is what we also save to the database. Now that it's no longer a multibyte string, a strlen will suffice to check.

And on the front-end, we do the same thing, thanks to a nifty function.

So - very long answer, but we should be sorted now. 😅

@engram-design
Copy link
Member

Fixed in 1.6.6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants