Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CLOSED] [I18N]: File name with non-ASCII character displays incorrectly in URL hints. #4927

Open
core-ai-bot opened this issue Aug 29, 2021 · 26 comments

Comments

@core-ai-bot
Copy link
Member

Issue by julieyuan
Thursday Sep 26, 2013 at 10:14 GMT
Originally opened as adobe/brackets#5357


Steps:

  1. Launch Brackets and new several files with non-ASCII character, like French character: éçàù.
  2. Put the cursor into a tag where should input URL, for example, <a herf="">, src="".
  3. Press Ctrl+Space to bring up the hints menu.

Result:
File names are decoded and cannot display well.

Expected:
File name should not be decoded and display well.

Notes:
If I live preview a html file with name contains non-ASCII character, its name will display correctly in browser. So I think the file name for URL in Brackets should not be decoded.

ENV: MAC10.7.5 and Win8.1 English OS
Build: 0.32.0-9660

Snapshot:
Please refer to snapshot for details:
3333
file name in browser for reference:
44444

@core-ai-bot
Copy link
Member Author

Comment by njx
Monday Sep 30, 2013 at 18:28 GMT


Reviewed. Medium priority to@redmunds

@core-ai-bot
Copy link
Member Author

Comment by lkcampbell
Friday Oct 18, 2013 at 17:30 GMT


@redmunds, the most obvious and simple solution for this problem is to not encode the strings, but there is a comment in the code:
// code hints show the same strings that are inserted into text,
// so strings in list will be encoded. wysiwyg, baby!
which implies that this behavior is intentional. Is it intentional?

@core-ai-bot
Copy link
Member Author

Comment by redmunds
Friday Oct 18, 2013 at 17:45 GMT


The comment I made about encoding had space characters in mind, which should be encoded in urls. I don't think that non-ASCII chars should be encoded in urls -- am I wrong about that?

@core-ai-bot
Copy link
Member Author

Comment by lkcampbell
Friday Oct 18, 2013 at 17:55 GMT


@redmunds, no you are not wrong. Just wanted to clarify the intention.

I will create a fix that only encodes space characters.

@core-ai-bot
Copy link
Member Author

Comment by redmunds
Friday Oct 18, 2013 at 18:23 GMT


I just used space char as an example -- there are many other chars that need to be encoded in urls. I think using encodeURI() is correct.

Maybe one of our i18n watch dogs (@RaymondLim,@SAPlayer ) can comment on encoding non-ASCII chars in URLs.

@core-ai-bot
Copy link
Member Author

Comment by RaymondLim
Friday Oct 18, 2013 at 18:48 GMT


@redmunds Non-ASCII chars need to be encoded in URLs as well, but showing encoded characters in hint list is definitely a usability issue that we need to solve somehow.

@core-ai-bot
Copy link
Member Author

Comment by redmunds
Friday Oct 18, 2013 at 18:53 GMT


I wanted to make sure that user saw what was going to be put in the page, but I can see it's not easy to use.

I think the solution is to display both:

  • readable string (with encoded string in parens)
  • Encoded string should only be displayed if there is 1 or more encoded chars.

How does that sound?

@core-ai-bot
Copy link
Member Author

Comment by MarcelGerber
Friday Oct 18, 2013 at 18:55 GMT


For me, this works just fine:
unfiltered.push(entryStr); (src/extensions/default/UrlCodeHints/main.js#L144)

See these screens:
image
image
image
image

PS: Thanks for being an official i18n watchdog 👍

@core-ai-bot
Copy link
Member Author

Comment by lkcampbell
Friday Oct 18, 2013 at 19:51 GMT


Oops, yeah, my error. Didn't realize URLs had such a limited character set. Chrome just allows any characters (including spaces) so I didn't see the problem testing with Live Development.

Seems like the intuitive solution is to show the decoded characters (non-ASCII, space, etc) in the code hint list and encode them when they are inserted into the editor. That makes it easy to select the correct file name while still providing valid URLs in the editor.

I imagine anyone familiar enough with HTML isn't going to be too confused when the encoded characters show up in the editor, and it is worth that temporary confusion to make file selection easier.

What do you all think about this solution?

@core-ai-bot
Copy link
Member Author

Comment by RaymondLim
Friday Oct 18, 2013 at 19:58 GMT


@lkcampbell@SAPlayer
You also need to highlight the partial match of decoded url in hint list as the user moves the cursor between an existing url in the editor using arrow keys.

@core-ai-bot
Copy link
Member Author

Comment by lkcampbell
Friday Oct 18, 2013 at 21:26 GMT


@RaymondLim, if I understand what you are saying, I have a file called "foo bar.html", I should get a match if I partially type "foo%2" but not if I partially type "foo b"?

It is a little strange to match this in the other direction, typing in the editor and updating the code hint list. It is going to feel weird if you see "foo bar.html" as a selection in the code hint list, but the moment you type "foo ", the list disappears. It will seem like you are typing what the Code Hint is telling you to type and then it is failing.

@redmunds, maybe your idea is better, to include both the encoded and normal versions in the hint list.

@core-ai-bot
Copy link
Member Author

Comment by RaymondLim
Friday Oct 18, 2013 at 21:41 GMT


@lkcampbell No, I'm not talking about typing url-encoded path. I'm talking about setting the cursor in an existing url-encoded path. Taking your example, if the cursor is after "foo%20" and before "bar.html", then you need to bold "foo ", but not "bar.html" in the code hint list. And as the user moves the cursor to the left, you may need to update the highlighted text in the hint list, but if the cursor is inside a specific % value, then we probably should exclude that particular character from bolded characters in the hint list. %20 in your example may not be a good one for highlighting since we can't highlight white space in the list.

And regarding the user typing characters that need to be url encoded, we still need to match as is without the conversion. That is, we should always show the partial match regardless of whether the existing url is the one we inserted or the one user types in.

And I can see some edge cases where the user is typing characters using % format OR the user is fixing an existing %-encoded url by typing raw high-ascii characters. Ideally, we should be able to handle all these edge cases in filtering and highlighting the match in the hint list.

@core-ai-bot
Copy link
Member Author

Comment by lkcampbell
Friday Oct 18, 2013 at 22:02 GMT


@RaymondLim yes, I understand what you are saying.

Based on that idea,@redmunds has the most reasonable solution. We definitely need to have both versions in the Code Hint List.

What if we make it work like the HTML Entity Code Hints works? If there is at least one encoded character, we display two columns. The left column contains the encoded version and it highlights as you type in the path or file name. The right column contains the normal, readable version and it doesn't highlight anything at all.

How does this sound?

@core-ai-bot
Copy link
Member Author

Comment by RaymondLim
Friday Oct 18, 2013 at 22:07 GMT


Honestly, I don't like the idea of having two strings -- one encoded and the other not encoded -- since it will require the hint list a lot wider. And if we want to show both strings, then I would say show human readable one on the left, not on the right.

@core-ai-bot
Copy link
Member Author

Comment by lkcampbell
Friday Oct 18, 2013 at 22:19 GMT


Okay, that's not the design used by HTML Entity Hints, though.

I agree, however, that the hint list could get pretty wide.

The only other idea I have is just to ignore the outdated constraints of RFC 3986's view of the URI, and let people put in any characters they want. Chrome doesn't seem to mind it. I imagine most modern browsers know how to deal with Unicode characters and white space in their URLs.

We present normal names in the list and inject normal names into the editor and leave it up to the HTML developers to solve the file name and encoding problems themselves, if they even occur.

@core-ai-bot
Copy link
Member Author

Comment by RaymondLim
Friday Oct 18, 2013 at 22:27 GMT


I agree that modern browsers can deal with non-ascii characters in urls without the url encoding required in RFC 3986. But special characters (&, white space and quotes) in ascii range still need to be url-encoded. So even if we decide not to encode high ascii and double bytes, we still have the same issue in showing up percent-encoded characters in the hint list for those special characters in the ascii range.

@core-ai-bot
Copy link
Member Author

Comment by njx
Friday Oct 18, 2013 at 22:37 GMT


I haven't followed the entire thread, but I think the right tradeoff is to have the strings be unencoded in the hint list and encoded on insertion (i.e., in the code hint list you would show "my file" and when inserting it you insert "my%20file". I don't think developers will be confused by that since they know that URLs need to be encoded. It's a little different from the HTML entities case, I think, because in that case you're specifically looking for a special character to type.

It would be a nice bonus to make it so that if they manually type "my%20folder" we match "my folder", but that seems like an extra feature.

@core-ai-bot
Copy link
Member Author

Comment by lkcampbell
Friday Oct 18, 2013 at 22:49 GMT


@njx, that sounds fine to me. I can provide a fix for the first scenario pretty easily.

The bonus scenario, not sure, don't know Code Hints source well enough to estimate the difficulty. Will look at it this weekend.

One question on the bonus scenario, if the user types "my f" instead of "my%20f", would you expect the same entry in the Code Hints list to match that string as well, or should the list dismiss itself with no matches?

@core-ai-bot
Copy link
Member Author

Comment by RaymondLim
Friday Oct 18, 2013 at 23:41 GMT


@lkcampbell
In general, we should not expect the user to type in something like "my%20f" even though some may try to do that. So my answer to your question is yes. We need to match "my f" in filtering url hint list and bolding all the matched entries up to "f".

@core-ai-bot
Copy link
Member Author

Comment by lkcampbell
Wednesday Oct 23, 2013 at 18:03 GMT


Fix in progress. Assigning to myself.

@core-ai-bot
Copy link
Member Author

Comment by lkcampbell
Thursday Oct 24, 2013 at 22:46 GMT


@RaymondLim and@njx, I submitted a fix for this issue. The first issue was easy to fix, the second issue was not as easy because you can't predict the intention of a user who enters a percentage character until the user enters the rest of the string. I used a fairly simple heuristic. It's not perfect but it performs adequately, and, considering how rare manually typed percent-encoded characters will be, it is probably good enough.

@core-ai-bot
Copy link
Member Author

Comment by lkcampbell
Sunday Nov 03, 2013 at 22:24 GMT


@RaymondLim and@njx,@redmunds did his first review of my PR and, based on our discussions, I am rolling back the percent-character encoding matching code for now. I will play with the problem a bit more today, but if I can't figure out a fairly straight forward solution soon, I am going to file a separate low priority bug on the problem.

The PR, as it stands right now, addresses the issue as originally presented by@julieyuan.

@core-ai-bot
Copy link
Member Author

Comment by lkcampbell
Tuesday Nov 05, 2013 at 14:21 GMT


New, clean PR submitted.

@core-ai-bot
Copy link
Member Author

Comment by redmunds
Tuesday Nov 05, 2013 at 16:01 GMT


FBNC back to@julieyuan.

@core-ai-bot
Copy link
Member Author

Comment by julieyuan
Wednesday Nov 06, 2013 at 03:12 GMT


Thanks for your efforts. The PR has not merged into build 0.34.0-10270 (On branch master,master 2f15f5c) yet. Will check it with next build.

@core-ai-bot
Copy link
Member Author

Comment by julieyuan
Thursday Nov 07, 2013 at 05:34 GMT


Fixed in build 0.34.0-10292,thanks. So closing it and here is the snapshot for reference:
2013-11-06 9 30 44 pm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant