Skip to content
This repository has been archived by the owner on Sep 6, 2021. It is now read-only.

Replace percent-encoded characters in URL Code Hint List with regular characters #5677

Closed
wants to merge 10 commits into from
Closed

Conversation

lkcampbell
Copy link
Contributor

This PR fixes issue #5357.

* of the URI are stripped off. Any encoded characters in the string that
* would make decodeURI() fail are interpreted as literal characters instead.
*/
UrlCodeHints.prototype._cleanAndDecodeURI = function (URI) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might need a better function name. Couldn't think of a good one that encompassed exactly what this function does, which is that it makes sure that any partial, or complete, percent-encoded URIs an end user manually enters into the editor matches the hint list as best it can, without making decodeURI() throw an exception.

A percent sign could be literal ("100% cool.html"), or it could be part of a percent-encoded character ("%20" space character), or it could be a Unicode character encoding ("%C3%A7" ç character). Users will rarely be manually typing percent-encoded characters into the editor, so I used a fairly simple heuristic that performs adequately enough.

@ghost ghost assigned redmunds Oct 24, 2013
Narciso Jaramillo and others added 4 commits November 1, 2013 13:38
When autoindenting, only insert a tab if the cursor doesn't move due to CodeMirror's indentation
@redmunds
Copy link
Contributor

redmunds commented Nov 3, 2013

This look pretty good, but I noticed a couple quirks.

But, I found a bug. If I type a % that matches a char that should be encoded, then that entry stays in the list (as intended). But, if I hit Enter to insert entry into page, it inserts an extra (duplicated) char at the beginning. I'm guessing that the % is not ignored when determining what to insert in the page.

Another thing I noticed is that not all encoded chars seem to be matching. When typing %20 it matches with a space char in the file name, but when typing %A3 it does not match with ã.

Done with review.

redmunds and others added 2 commits November 3, 2013 08:46
@lkcampbell
Copy link
Contributor Author

@redmunds, for the first problem where the extra character is being inserted, can you provide me a repro scenario with specific steps I can follow?

For the second problem the encoding of ã should be %C3%A3. That's what I get using this online tool:

http://meyerweb.com/eric/tools/dencoder/

Can you test %C3%A3 and make sure it is working?

@redmunds
Copy link
Contributor

redmunds commented Nov 3, 2013

The name of the file I'm using is "high ascii ãx.css".

For the first case:

  1. Start with: @import url("");
  2. Put cursor between quotes and press Ctrl-Space
  3. Type "high%" so that "high ascii ãx.css" becomes selected
  4. Press Enter

Result:
@import url("hhigh%20ascii%20%C3%A3x.js"); Notice that it starts with "hh"

For the second case: Ooops, I didn't notice the double %-entries being inserted.

It's kind of weird how ã is matched in the list as I'm typing the first part (i.e. "%C3"), then goes away after typing the second % (i.e. "%C3%"), then shows up again after typing the complete encoding (i.e. "%C3%A3"). But this is probably pretty rare that anyone's going to do this, so if it's not an easy fix I can live with it.

@lkcampbell
Copy link
Contributor Author

@redmunds, I know, it is strange behavior. If I can't improve on it today, let's just file a separate low priority bug on it. That way, it is documented without hindering the progress on this PR.

@lkcampbell
Copy link
Contributor Author

@redmunds, I rolled back this PR to the minimum needed to fix issue #5357. Sorry about the messy diff. I think the extra file changes appeared when I switched computers this weekend. The only real change is in src/extensions/default/UrlCodeHints/main.js.

If I can get the percent-encoded matching working this weekend, I will push another commit. If not, I will file a low priority bug so we can document the problem.

@RaymondLim
Copy link
Contributor

@lkcampbell I updated your code as below and is working for me.

    UrlCodeHints.prototype._cleanAndDecodeURI = function (URI) {
        var matchResults,
            finalURI = URI;

        // If there is a partially encoded character on the end, strip it off
        matchResults = finalURI.match(/%[\dA-Fa-f]?$/); // e.g. "%", "%5", "%A", "%d"
        if (matchResults) {
            finalURI = finalURI.substring(0, matchResults.index);
        }
        var i = 0;
        do {

            // Decode any encoded characters, checking for incorrect formatting
            try {
                finalURI = decodeURI(finalURI);
                break;
            } catch (err) {
                i++;
                // do nothing, finalURI does not change...
                // treat all characters as literal characters
                finalURI = finalURI.substring(0, finalURI.lastIndexOf("%"));
            }
        } while (i < 4);

        return finalURI;
    };

@lkcampbell
Copy link
Contributor Author

@RaymondLim, so your solution mostly works, but it doesn't bold text match correctly for a file called "100% awesome.html", for example.

This is the kind of problem I have been dealing with trying to get this fix to work. Every time I fix it in one test case, a different problem pops up. I'm sure there is a reasonable solution out there given a bit more time and a clearer picture of what scenarios we want to match correctly versus scenarios we don't care about, for example, should we care about users mixing encoded characters and regular characters?

@redmunds, what are your thoughts on how we should address this "almost functioning" percent-encoded match problem?

@lkcampbell
Copy link
Contributor Author

And just a point of clarification, I do believe there is a good solution to the tricky bug, we just haven't found it yet, partially because we haven't clearly defined what the behavior of the fix should be. My concerns have more to do with having a medium priority, easy to fix bug mixed in with a low priority, more difficult bug. I think the best solution is to separate them out so we can get this PR merged and the bug closed ASAP. I can document the second bug and assign it to myself as a longer term solution.

@redmunds
Copy link
Contributor

redmunds commented Nov 4, 2013

@lkcampbell I agree that the issues should be separated. Create a new branch and pull request (to squash commits and remove unrelated changes) for the minimum fix.

I think the typing percent-encoded case is rare, so I'm ok with doing nothing and listening to the Community for feedback. Of course, you are welcome to create an issue for it, and even continuing to work on it.

@lkcampbell
Copy link
Contributor Author

@redmunds, closing this PR and submitting a new, clean PR.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants