Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Routing is incorrect with some special characters (e.g. dot) #1525

Closed
andreymal opened this issue Mar 21, 2019 · 16 comments · Fixed by #2031
Closed

Routing is incorrect with some special characters (e.g. dot) #1525

andreymal opened this issue Mar 21, 2019 · 16 comments · Fixed by #2031
Milestone

Comments

@andreymal
Copy link
Contributor

andreymal commented Mar 21, 2019

I want to use something like this:

@app.route('/path/to/file.<ext>')
async def generate_some_file(request, ext):
    ...

It works, but the dot is interpreted as a special regex character that represents ANY character, so there urls are working:

  • /path/to/file.txt
  • /path/to/fileQtxt
  • /path/to/file txt
  • /path/to/file%3Ftxt etc.

When I try to escape the dot:

@app.route(r'/path/to/file\.<ext>')
async def generate_some_file(request, ext):
    ...

then this route works as expected, but app.url_for('generate_some_file', ext=ext) returns broken url:

/path/to/file\.txt

Is also affects other regex characters e.g. ^, $, [, ] etc. For example,

@app.route('/path/to/fi[abcl]e.<ext>')
async def generate_some_file(request, ext):
    return response.text(app.url_for('generate_some_file', ext=ext))

When I open GET /path/to/file.txt it returns /path/to/fi[abcl]e.txt that is totally incorrect.

Note that Flask does not interpret these characters as special and it works as expected (werkzeug/routing.py uses re.escape everywhere).

I use Sanic 18.12 (because 19.3 is not yet uploaded to PyPI)

@vltr
Copy link
Member

vltr commented Mar 21, 2019

Thanks, @andreymal ! That is a known bug on the Router - it does not escapes special characters given by the developer prior to compiling into a regular expression. For now, it's good to have this pointed out in an issue - we are kind of discussing the future of our Router, as it started in another issue (#1318), became even a test parameter by @harshanarayana for us and we are open to discussion about the topic, in case you want to join us in our community forums 😉

@harshanarayana
Copy link
Contributor

harshanarayana commented Mar 21, 2019

@andreymal Did you try using a custom regex instead of the default string format for the path?
@app.route(uri="/path/to/<file:file[abc]\.(txt|jpg|gif)>")?

The above will match /path/to/filea.txt /path/to/fileb.txt /path/to/filea.gif but not /path/to/filea.jfg or /path/to/files.txt or /path/to/fileabtxt

from sanic import Sanic
from sanic.response import json

app = Sanic(__name__)

@app.route(uri="/path/to/<file:file[abc]\.(txt|jpg|gif)>")
def test(request, file):
    return json({
        "test": "message",
        "file": file
    })
app.run(port="5032")

@andreymal
Copy link
Contributor Author

@harshanarayana it works but is too complicated for me :) /path/to/file.<ext> looks nicer

@harshanarayana
Copy link
Contributor

@harshanarayana it works but is too complicated for me :) /path/to/file.<ext> looks nicer

Well, we can make it much simpler by turning it into @app.route(uri="/path/to/<file:file(.*)\.(.*)>")
😆

@stale
Copy link

stale bot commented Jun 19, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is incorrect, please respond with an update. Thank you for your contributions.

@stale stale bot added the stale label Jun 19, 2019
@andreymal
Copy link
Contributor Author

@‍stale nope, <file:file[abc]\.(txt|jpg|gif)> is still too complicated :)

@stale stale bot removed the stale label Jun 19, 2019
@vltr
Copy link
Member

vltr commented Jun 19, 2019

Yeah, well, the Router per se doesn't escape special RegEx chars (like .) when building the routes, you need to do that by yourself (or else the dot will match all). We do have ongoing conversations regarding the Router (see #1318 and comments), but no decision made so far.

@ahopkins
Copy link
Member

Once #1475 is merged, and v19.6 is out, I am turning back my attention to the Router. But, even with my hyperscan implementation, it still would likely have a problem with that since it will also be regex based. Perhaps we can think about turning off matching for a route, but I am hesitant to say this would be viable, or desirable. Now, Sanic does in theory allow you to swap out its router. I have never tried it, but it should be possible. For the 99% of use cases, I do not want to break or slow it down for this. But, I think maybe someway to opt out should be the direction.

@stale
Copy link

stale bot commented Sep 17, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is incorrect, please respond with an update. Thank you for your contributions.

@stale stale bot added the stale label Sep 17, 2019
@andreymal
Copy link
Contributor Author

@‍stale nope

@stale stale bot removed the stale label Sep 17, 2019
@Tronic
Copy link
Member

Tronic commented Sep 20, 2019

Why not simply change Sanic router so that paths are re.escaped prior to injecting dynamic elements' regex? Like @andreymal pointed out, this is how other frameworks handle it. Escaping is done during application startup and thus has no runtime performance cost. This way also the unescaped path fragments may be retained for efficient url_for implementation.

@stale
Copy link

stale bot commented Dec 19, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is incorrect, please respond with an update. Thank you for your contributions.

@stale stale bot added the stale label Dec 19, 2019
@andreymal
Copy link
Contributor Author

@‍stale I think nope (although I haven't tested it in newer versions yet)

@stale stale bot removed the stale label Dec 19, 2019
@ahopkins
Copy link
Member

That should get rid of @Stale.

@ahopkins ahopkins added this to the Routing milestone Jan 11, 2021
@ahopkins
Copy link
Member

The new version of the router continues support for regex matching:

@app.route("/path/to/<file:file\.txt>")

To handle a case like this example where you only want to capture part of the match, it will allow you to declare a single group in your regex.

@app.route("/path/to/<ext:file\.(txt)>")

That group can be named, as long as the name of the matching group is the same as the name of the parameter

# OK
@app.route("/path/to/<ext:file\.(?P<ext>txt)>")

# NOT OK
@app.route("/path/to/<ext:file\.(?P<bad>txt)>")

@ahopkins
Copy link
Member

See also #2031.

@ahopkins ahopkins modified the milestones: Routing, v21.3 Feb 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants