-
-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SourceRange.prettyRange: escape sourceName #92
base: master
Are you sure you want to change the base?
Conversation
commonmark/src/Commonmark/Types.hs
Outdated
-- if the source name contains special characters it can lead to ambiguity when | ||
-- a filename exactly matches a fragment of syntax of the range | ||
escapeSourceName :: String -> String | ||
escapeSourceName = concatMap escapeChar | ||
where | ||
escapeChar '-' = "%-" | ||
escapeChar '%' = "%%" | ||
escapeChar ':' = "%:" | ||
escapeChar ';' = "%;" | ||
escapeChar x = [x] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was actally thinking of percent encoding like in URIs.
But it could be just for the characters you mention. I think that's cleaner, because it's a recognizable format and people will know what's going on.
https://en.wikipedia.org/wiki/Percent-encoding
9ec55b0
to
50672f5
Compare
Ah, I didn't think of that because at least |
Hand-coding is the way to go; I want dependencies to be light. I'm still not quite sure whether this sort of ad hoc URI escaping is the right approach. Alternatively we could do backslash escapes. Maybe that would be more intuitive. |
deb5672
to
09b9ebb
Compare
I wrote some benchmarks for measuring the impact of the source name on parsing time: Before (on commit fa3e4e5 with the last commit applied over it)
On the most recent commit 09b9ebb
So it looks like performance doesn't degrade very heavily unless there are special characters in the filename. For reference, these tests are using source names of length 50 times the number for the test, while parsing I haven't tested variants to see if the overhead can be made lower (especially for filenames involving the special characters), but can maybe do that at somepoint in the future if you still think that is necessary.
Yes, I believe this would work. Only escaping a few of the characters seems a little bit adhoc to me, as it means that the meaning of Backslash escapes seem fine too. |
One more relevant comparison is parsing
|
Thanks for checking this. I'm not too worried about it if it doesn't significantly affect the most common case, where filenames don't have these special characters. I guess it's just a matter of figuring out what to make escapable and what sort of escaping to use:
|
Maybe backslash-escapes are the most straightforward. However, this would mean that backslashes in the filename would always be escaped (doubled); I suppose this could be an issue for Windows users? |
Fixes #91 by escaping filenames. I think technically we would only have to escape
-
and%
, but I escaped all of the special syntax to avoid inconsistency.