URL-escaping semantics of path components in urlpath are unclear #13

Open
egnor opened this issue Jul 20, 2020 · 0 comments

urlpath appears to percent-encode characters that are not URL-safe when it constructs a URL:

>>> urlpath.URL('/with space')
URL('/with%20space')
>>> urlpath.URL('/with%percent')
URL('/with%25percent')

However, existing percent-escapes are preserved, so this is apparently not a true escaping step?

>>> urlpath.URL('/with%20code')
URL('/with%20code')
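
To make the distinction concrete, here is a rough sketch using only the standard library. `strict_quote` is a true escaping step (every `%` is escaped), while `lenient_quote` is a hypothetical helper that reproduces the three constructor results shown above. It is not taken from urlpath's source; it only models the observed behavior.

```python
import re
from urllib.parse import quote

# A '%' that is NOT followed by two hex digits, i.e. not a valid escape.
_STRAY_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def strict_quote(path):
    """True escaping: every unsafe character, including '%', is encoded."""
    return quote(path, safe="/")


def lenient_quote(path):
    """Hypothetical model of the observed behavior: keep sequences that
    already look like valid escapes and encode everything else."""
    fixed = _STRAY_PERCENT.sub("%25", path)
    # '%' is marked safe here so the escapes kept above are not re-encoded.
    return quote(fixed, safe="/%")


for raw in ("/with space", "/with%percent", "/with%20code"):
    print(raw, strict_quote(raw), lenient_quote(raw))
```

The two functions agree on the first two inputs and differ only on `/with%20code`, which is the ambiguous case: the lenient version cannot tell whether the caller meant a space or the literal characters `%20`.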

Retrieving the path parts afterwards decodes the escapes:

>>> urlpath.URL('with%20code').parts
('with code',)

That means taking a part from .parts and re-appending it with '/' doesn't round-trip, because the already-decoded part gets decoded a second time:

>>> (urlpath.URL('') / urlpath.URL('with%2520code').parts[0]).parts[1]
'with code'
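
The double decoding can be illustrated without urlpath at all. The sketch below uses only `urllib.parse` and assumes, as the examples above suggest, that reading a part decodes once and that appending treats its input as already encoded.

```python
from urllib.parse import quote, unquote

encoded = "with%2520code"

# Reading a part decodes one level of escaping.
part = unquote(encoded)

# Treating that decoded part as if it were still encoded decodes it again,
# which loses the literal '%20' the caller started with.
decoded_twice = unquote(part)

# Re-encoding the part before it is reused preserves the original value.
reencoded = quote(part, safe="")
restored = unquote(reencoded)

print(part, decoded_twice, reencoded, restored)
```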

Using .with_name() does seem to escape fully:

>>> urlpath.URL('foo').with_name('with%20code')
URL('with%2520code')

I see no direct way to add, edit, or replace path components in a way that guarantees escaping, which seems unfortunate. At a minimum, the quoting/escaping behavior needs to be very well documented, since it can matter for security (and certainly for functionality!).

The idea seems to be that paths passed to the constructor or to the '/' operator are expected to be already URL-encoded (although anything unsafe that does show up gets escaped; IMHO this should raise an exception instead?). Path parts as exposed in .parts and modified by .with_name() and .with_suffix(), on the other hand, are the "underlying" unescaped values. So is there no way to set an unescaped path other than calling urllib.parse.quote() yourself?
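
For reference, the manual workaround would look something like the sketch below. `encode_segment`, `build_path`, and `split_path` are hypothetical helper names, not part of urlpath; they rely only on `urllib.parse.quote()` and `unquote()` from the standard library.

```python
from urllib.parse import quote, unquote


def encode_segment(raw):
    """Percent-encode one raw path segment. safe='' means '/' and '%' are
    encoded too, so a segment can never introduce an extra separator."""
    return quote(raw, safe="")


def build_path(*raw_segments):
    """Join raw (unescaped) segments into an encoded absolute path."""
    return "/" + "/".join(encode_segment(s) for s in raw_segments)


def split_path(path):
    """Inverse of build_path: split on '/' first, then decode each segment."""
    return [unquote(s) for s in path.lstrip("/").split("/")]


path = build_path("with%20code", "a/b", "with space")
print(path)
print(split_path(path))
```

Assuming the constructor keeps existing escapes as in the examples above, the string from `build_path()` could then be handed to `urlpath.URL()`, but that still leaves the escaping entirely up to the caller.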
