[I18N]: File name with non-ASCII character displays incorrectly in URL hints. #5357
Comments
Reviewed. Medium priority to @redmunds |
@redmunds, the most obvious and simple solution for this problem is to not encode the strings, but there is a comment in the code: |
The comment I made about encoding had space characters in mind, which should be encoded in urls. I don't think that non-ASCII chars should be encoded in urls -- am I wrong about that? |
@redmunds, no you are not wrong. Just wanted to clarify the intention. I will create a fix that only encodes space characters. |
I just used space char as an example -- there are many other chars that need to be encoded in urls. I think using Maybe one of our i18n watch dogs ( @RaymondLim, @SAplayer ) can comment on encoding non-ASCII chars in URLs. |
@redmunds Non-ASCII chars need to be encoded in URLs as well, but showing encoded characters in hint list is definitely a usability issue that we need to solve somehow. |
I wanted to make sure that the user saw what was going to be put in the page, but I can see it's not easy to use. I think the solution is to display both:
How does that sound? |
Oops, yeah, my error. Didn't realize URLs had such a limited character set. Chrome just allows any characters (including spaces) so I didn't see the problem testing with Live Development. Seems like the intuitive solution is to show the decoded characters (non-ASCII, space, etc) in the code hint list and encode them when they are inserted into the editor. That makes it easy to select the correct file name while still providing valid URLs in the editor. I imagine anyone familiar enough with HTML isn't going to be too confused when the encoded characters show up in the editor, and it is worth that temporary confusion to make file selection easier. What do you all think about this solution? |
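The proposal above (readable names in the hint list, encoded names on insertion) could look roughly like the following. This is only a sketch: `makeUrlHint` and its `display`/`insert` fields are hypothetical names for illustration and are not taken from the Brackets source. Only the standard `encodeURI` function is assumed.

```javascript
// Hypothetical helper (not from the Brackets code base): builds one hint entry.
// `display` is what the hint list would show; `insert` is what would be
// written into the editor when the hint is selected.
function makeUrlHint(fileName) {
    return {
        display: fileName,           // human-readable, e.g. "foo bar.html"
        insert: encodeURI(fileName)  // percent-encoded, e.g. "foo%20bar.html"
    };
}

const hint = makeUrlHint("foo bar.html");
console.log(hint.display); // foo bar.html
console.log(hint.insert);  // foo%20bar.html
```

Note that `encodeURI` leaves reserved characters such as "/", "?" and "#" untouched. That keeps path separators intact, but a file name that itself contains "#" or "?" would need extra handling.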
@lkcampbell @SAplayer |
@RaymondLim, if I understand what you are saying, I have a file called "foo bar.html", I should get a match if I partially type "foo%2" but not if I partially type "foo b"? It is a little strange to match this in the other direction, typing in the editor and updating the code hint list. It is going to feel weird if you see "foo bar.html" as a selection in the code hint list, but the moment you type "foo ", the list disappears. It will seem like you are typing what the Code Hint is telling you to type and then it is failing. @redmunds, maybe your idea is better, to include both the encoded and normal versions in the hint list. |
@lkcampbell No, I'm not talking about typing url-encoded path. I'm talking about setting the cursor in an existing url-encoded path. Taking your example, if the cursor is after "foo%20" and before "bar.html", then you need to bold "foo ", but not "bar.html" in the code hint list. And as the user moves the cursor to the left, you may need to update the highlighted text in the hint list, but if the cursor is inside a specific % value, then we probably should exclude that particular character from bolded characters in the hint list. %20 in your example may not be a good one for highlighting since we can't highlight white space in the list. And regarding the user typing characters that need to be url encoded, we still need to match as is without the conversion. That is, we should always show the partial match regardless of whether the existing url is the one we inserted or the one user types in. And I can see some edge cases where the user is typing characters using % format OR the user is fixing an existing %-encoded url by typing raw high-ascii characters. Ideally, we should be able to handle all these edge cases in filtering and highlighting the match in the hint list. |
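One way to read the cursor rule described above is sketched below: decode the text to the left of the cursor and, if the cursor sits inside a %-escape, leave that character out of the bolded part. The function name `highlightedPrefix` is hypothetical, and this is not the behavior of any existing Brackets code, only an illustration built on the standard `decodeURI`.

```javascript
// Hypothetical sketch: given an already-encoded path and a cursor offset,
// return the decoded text that would be bolded in the hint list.
function highlightedPrefix(encodedPath, cursorPos) {
    let prefix = encodedPath.slice(0, cursorPos);
    // decodeURI throws URIError on a partial escape ("%", "%2") or an
    // incomplete multi-byte sequence ("%E6%97"). When that happens, drop
    // the trailing escape and retry, so the character under the cursor
    // is excluded from the highlighted text.
    while (prefix.length > 0) {
        try {
            return decodeURI(prefix);
        } catch (e) {
            prefix = prefix.slice(0, prefix.lastIndexOf("%"));
        }
    }
    return "";
}

console.log(JSON.stringify(highlightedPrefix("foo%20bar.html", 6))); // "foo "
console.log(JSON.stringify(highlightedPrefix("foo%20bar.html", 5))); // "foo"
```

With the cursor after "foo%20" the bolded text is "foo " (including the space); with the cursor inside "%20" only "foo" is bolded.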
@RaymondLim yes, I understand what you are saying. Based on that idea, @redmunds has the most reasonable solution. We definitely need to have both versions in the Code Hint List. What if we make it work like the HTML Entity Code Hints works? If there is at least one encoded character, we display two columns. The left column contains the encoded version and it highlights as you type in the path or file name. The right column contains the normal, readable version and it doesn't highlight anything at all. How does this sound? |
Honestly, I don't like the idea of having two strings -- one encoded and the other not encoded -- since it will require the hint list to be a lot wider. And if we want to show both strings, then I would say show the human-readable one on the left, not on the right. |
Okay, that's not the design used by HTML Entity Hints, though. I agree, however, that the hint list could get pretty wide. The only other idea I have is just to ignore the outdated constraints of RFC 3986's view of the URI, and let people put in any characters they want. Chrome doesn't seem to mind it. I imagine most modern browsers know how to deal with Unicode characters and white space in their URLs. We present normal names in the list and inject normal names into the editor and leave it up to the HTML developers to solve the file name and encoding problems themselves, if they even occur. |
I agree that modern browsers can deal with non-ascii characters in urls without the url encoding required in RFC 3986. But special characters ( |
I haven't followed the entire thread, but I think the right tradeoff is to have the strings be unencoded in the hint list and encoded on insertion (i.e., in the code hint list you would show "my file" and when inserting it you insert "my%20file"). I don't think developers will be confused by that since they know that URLs need to be encoded. It's a little different from the HTML entities case, I think, because in that case you're specifically looking for a special character to type. It would be a nice bonus to make it so that if they manually type "my%20folder" we match "my folder", but that seems like an extra feature. |
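The "bonus" matching could be approximated as below. This is a minimal sketch under stated assumptions: `safeDecode` and `matchesHint` are hypothetical names, prefix matching is assumed for simplicity, and nothing here is taken from the actual fix. The query is tried both as typed and in decoded form, so "my f" and "my%20f" would both match "my folder".

```javascript
// decodeURI throws URIError on malformed input such as "my%2",
// so fall back to the raw text in that case.
function safeDecode(text) {
    try {
        return decodeURI(text);
    } catch (e) {
        return text;
    }
}

// Hypothetical matcher: true if the typed query is a prefix of the
// readable file name, either as typed or after percent-decoding.
function matchesHint(query, fileName) {
    return fileName.startsWith(query) ||
           fileName.startsWith(safeDecode(query));
}

console.log(matchesHint("my f", "my folder"));    // true
console.log(matchesHint("my%20f", "my folder"));  // true
console.log(matchesHint("my%2", "my folder"));    // false
```

The last case shows the hard part of this feature: while the user is midway through typing an escape, the query matches nothing, so the hint list would close unless partial escapes get special treatment.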
@njx, that sounds fine to me. I can provide a fix for the first scenario pretty easily. The bonus scenario, not sure, don't know Code Hints source well enough to estimate the difficulty. Will look at it this weekend. One question on the bonus scenario, if the user types "my f" instead of "my%20f", would you expect the same entry in the Code Hints list to match that string as well, or should the list dismiss itself with no matches? |
@lkcampbell |
Fix in progress. Assigning to myself. |
@RaymondLim and @njx, I submitted a fix for this issue. The first issue was easy to fix, the second issue was not as easy because you can't predict the intention of a user who enters a percentage character until the user enters the rest of the string. I used a fairly simple heuristic. It's not perfect but it performs adequately, and, considering how rare manually typed percent-encoded characters will be, it is probably good enough. |
@RaymondLim and @njx, @redmunds did his first review of my PR and, based on our discussions, I am rolling back the percent-character encoding matching code for now. I will play with the problem a bit more today, but if I can't figure out a fairly straightforward solution soon, I am going to file a separate low priority bug on the problem. The PR, as it stands right now, addresses the issue as originally presented by @julieyuan. |
New, clean PR submitted. |
FBNC back to @julieyuan. |
Thanks for your efforts. The PR has not been merged into build 0.34.0-10270 (On branch master,master 2f15f5c) yet. I will check it with the next build. |
Steps:
<a href="">, src=""
Result:
File names are encoded and do not display well.
Expected:
File names should not be encoded and should display well.
Notes:
If I live preview an HTML file whose name contains a non-ASCII character, its name displays correctly in the browser. So I think file names in URL hints in Brackets should not be encoded.
ENV: MAC10.7.5 and Win8.1 English OS
Build: 0.32.0-9660
Snapshot:
Please refer to snapshot for details:
file name in browser for reference: