Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add | to the list of valid symbol chars #39

Merged
merged 1 commit into from
Mar 28, 2022

Conversation

ikappaki
Copy link
Contributor

Hi,

would you please consider path to add | to the list of valid symbol chars.

Fixes #38.

Also added a test entry.

Thanks.

@plexus
Copy link
Collaborator

plexus commented Mar 14, 2022

Not so sure about this, neither the clojure docs nor the EDN spec mention pipes in symbols

@ikappaki
Copy link
Contributor Author

Not so sure about this, neither the clojure docs nor the EDN spec mention pipes in symbols

Good point, question posted https://ask.clojure.org/index.php/11627/the-pipe-char-considered-valid-symbol-constituent-character

@ikappaki
Copy link
Contributor Author

The official reply from https://ask.clojure.org/index.php/11627/the-pipe-char-considered-valid-symbol-constituent-character is

You should not use |, it is being reserved for possible delimited names.

Specifically, something like :one|two 'three|four might in the future actually read as a single
 keyword with embedded space and quote.

and

See https://clojure.org/guides/faq#unreadable_keywords for why we don't validate this. 
By using characters not on the reader list, you are in undefined behavior territory.

Given that you can still have characters in symbols and keywords not from the official list but still resulting in working programs, is there an appetite to create an option (say passing an optional :symbol-parse-mode 'loose while the default being 'strict) to accommodate this behavior?

My use case that results to | in keywords is from keywordizing maps constructed from external data and I'd rather live with the low risk of failing on a future clojure release because of that than using strings for keys or replacing non-conforming chars to something else.

Thanks

@plexus
Copy link
Collaborator

plexus commented Mar 20, 2022

I don't think that's a good idea. It's going to encourage people to write code that will eventually break, and puts us in a spot where we are likely forced to do breaking changes in the future.

I would suggest a much simpler solution. Stick that list of special characters into a separate variable, so people can rebind it if they want to.

(setq parseclj-lex-symbol-special-chars '(...)); or defvar?

;; User code
(let (( parseclj-lex-symbol-special-chars '(...)))
  ...parse...)

@ikappaki ikappaki force-pushed the issue/symbol-pipe-char branch from 2ea03fa to bd11cf4 Compare March 26, 2022 21:08
@ikappaki
Copy link
Contributor Author

ikappaki commented Mar 26, 2022

Hi @plexus, sorry for the delay in replying.

I took the liberty to move the list of valid characters into a custom option with a comment to indicate that the value can be changed but at the user's own risk. I though it valuable for discoverability to have this as a custom option. Let me know if you think a custom option is too much, and I can move into a variable as originally suggested.

Thanks

@plexus plexus merged commit b04eae6 into clojure-emacs:main Mar 28, 2022
@plexus
Copy link
Collaborator

plexus commented Mar 28, 2022

thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can't parse symbols with pipe chars
2 participants