Rethink mangling #1173

gilch · 2016-12-12T22:48:58Z

Python since version 3.0 allows much of Unicode in its identifiers. Hy's punycode mangling therefore serves no purpose but to make Unicode identifiers harder to read and Hy-Python interop more difficult. Maybe we should remove this "feature" altogether.

Some Hy features are already restricted to Python 3 or later. Unicode identifiers could be another such feature. It's probably a minority of users that need Python 2 support at this point anyway (and this will only become more true over time). They'll be able to make do with ASCII.

On the other hand, mangling of ASCII characters can be improved. Hy already converts - to _, to allow more Lispy names. Hy code would look very different without this, but it does cause some problems.

The other rules are even worse. The earmuff conversion to all caps is of dubious value. Earmuffs usually indicate a dynamic variable in Lisp, but Hy doesn't have those. (Maybe we could add them; see hylang/hyrule#51.) I'd like to remove the conversion.

Hy also converts a trailing ! to a trailing _bang and a trailing ? to a leading is_. But if these characters appear anywhere else, they don't get converted. The AST mostly doesn't care when the results are not valid Python identifiers, but we have no guarantee this will continue. It's already been an issue for getargspec (#1172). There are other ASCII characters that are allowed in Hy symbols (like +) but never get converted at all.
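To make the rules above concrete, here is a rough Python sketch of the ASCII conversions as described in this comment. The function name sketch_mangle and the order in which the rules are applied are assumptions made for illustration; this is not Hy's actual implementation.

```python
def sketch_mangle(symbol: str) -> str:
    """Hypothetical re-creation of the ASCII rules described in this issue."""
    # Earmuffs: *foo* becomes FOO.
    if len(symbol) > 2 and symbol.startswith("*") and symbol.endswith("*"):
        symbol = symbol[1:-1].upper()
    # A trailing ? becomes a leading is_.
    if len(symbol) > 1 and symbol.endswith("?"):
        symbol = "is_" + symbol[:-1]
    # A trailing ! becomes a trailing _bang.
    if len(symbol) > 1 and symbol.endswith("!"):
        symbol = symbol[:-1] + "_bang"
    # Dashes become underscores.
    return symbol.replace("-", "_")


for name in ("foo-bar", "*test*", "empty?", "reset!", "a?b", "a+b"):
    mangled = sketch_mangle(name)
    # isidentifier() reports whether Python would accept the result as a name.
    print(name, "->", mangled, mangled.isidentifier())
```

Under these assumed rules, a ? in the middle of a symbol and a + anywhere both pass through unchanged, so the result is not a valid Python identifier.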

Clojure also has to use Java identifiers, but it has a more consistent approach that we might consider emulating. Java's identifier rules are almost as strict as Python 2's.


Kodiologist · 2016-12-13T16:36:15Z

> Python since version 3.0 allows much of Unicode in its identifiers.

By Lisp standards, it's pretty idiosyncratic. For example, λ is legal but ⚘ isn't.

Punycode mangling seems to be broken or inactive at the moment, since '⚘ returns ⚘, not hy_w7h as documented.
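As a quick check of the documented result, Python's built-in punycode codec can reproduce the encoded form. This is only a sketch: the hy_ prefix is taken from the documented output quoted above, and how Hy itself invoked the codec is not shown here.

```python
symbol = "⚘"

# Encode with the standard-library punycode codec; the output is pure ASCII.
encoded = symbol.encode("punycode").decode("ascii")

# "hy_" is the prefix from the documented result, prepended here by hand.
mangled = "hy_" + encoded
print(mangled)

# The encoding is reversible, so the original symbol can be recovered.
decoded = encoded.encode("ascii").decode("punycode")
print(decoded == symbol)
```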

I never use earmuffs, so I would support that removal. Lisps are traditionally case-insensitive, but Python, and hence Hy, is case-sensitive, so names in all caps are just fine.

The conversions of trailing ? and ! seem fine to me except for the annoyance you mentioned in #1115, and to be honest, I feel as if we ought to send a bug to the Python people pointing out the inconsistency. It seems pretty obvious that Python should either have is_integer and is_lower, or isinteger and islower, but not one of each.

gilch · 2016-12-14T03:53:34Z

> By Lisp standards, it's pretty idiosyncratic. For example, λ is legal but ⚘ isn't.

Um, which Lisp are we talking about? I don't code in Unicode much. Do we want emojis and such in Hy identifiers? Or just the written word for other languages? Does Python allow any mathematics symbols? We might want those too.

Are the uncommon extra symbols worth making every mangled symbol written in a non-Latin alphabet impossible for a human to read?

If we really want both human-readable mangled symbols and emojis, then punycode is out, since its output all has to be ASCII alphanumeric. We'd have to come up with some other encoding scheme.

> I feel as if we ought to send a bug to the Python people pointing out the inconsistency. It seems pretty obvious that Python should either have is_integer and is_lower, or isinteger and islower, but not one of each.

Feel free to send that bug, but it's not going to get us anywhere. Backwards compatibility would be more important to them at this point.

Kodiologist · 2016-12-14T05:25:13Z

My understanding is that in most Lisps, any character other than the handful that have special meaning (like parentheses and whitespace) is a legal character in a symbol. By contrast, Python 3 permits only characters with certain Unicode character properties. See https://docs.python.org/3/reference/lexical_analysis.html#identifiers
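The difference between the two characters mentioned earlier can be checked from Python itself. A small sketch using only the standard library: str.isidentifier() applies Python's identifier rules, and unicodedata.category() shows the general category that those rules depend on.

```python
import unicodedata

# Map each character to (Unicode general category, valid as a Python identifier?).
results = {
    ch: (unicodedata.category(ch), ch.isidentifier())
    for ch in ("λ", "⚘")
}

for ch, (category, legal) in results.items():
    # λ is a lowercase letter (Ll); ⚘ is an "other symbol" (So).
    print(ch, category, legal)
```

Letters such as λ fall in a category Python accepts at the start of an identifier, while symbols such as ⚘ do not.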

Kodiologist · 2016-12-27T20:25:02Z

I created http://bugs.python.org/issue29088.

Kodiologist · 2016-12-27T20:44:21Z

It was closed in record time. I would've thought they could create temporary aliases to the old names if backwards compatibility was a concern, but hysterical raisins strike again.

zackmdavis · 2016-12-27T20:57:05Z

It's very sad; the right time to change the standard library to uniformly use is_ would have been in Python 3.0, but now our one and only one chance to backwards-incompatibly break the world has been spent (as the Python community has learned the hard way that the world doesn't necessarily re-form afterwards).

ghost · 2017-02-09T23:55:44Z

I like earmuffs.
There is, however, a problem: symbols with earmuffs are transformed into their upper-cased versions, whereas upper-cased symbols from Python code stay the same when imported into Hy.

(setv *test* 42) gives TEST = 42
And if I import a Python module containing TEST = 42, the translated Hy symbol is still TEST.

I think this is inconsistent.

Kodiologist · 2017-02-10T00:36:52Z

That's how mangling is supposed to work. In Hy, you can write TEST as *test* whether you defined it originally in Hy or in Python. Would you rather have a name TEST imported from a Python module renamed to *test*, so that (import [foo [TEST]]) would be compiled to something like from foo import TEST; globals()["*test*"] = TEST; del TEST? Then it would be very hard to access the imported name in Hy, because the symbol *test*, wherever it appears in Hy code, is translated to TEST: Hy's mangling gets in the way of accessing a variable actually named *test*.
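The hypothetical renaming can be simulated in plain Python to show why it would be awkward. In this sketch a dict stands in for the importing module's globals, and the value 42 is the one from the example earlier in the thread; nothing here is Hy's real import machinery.

```python
# Stand-in for the importing module's globals after `from foo import TEST`.
namespace = {"TEST": 42}

# The hypothetical renaming step: store the value under "*test*" and drop TEST.
namespace["*test*"] = namespace.pop("TEST")

# "*test*" is not a valid Python identifier, so only dict access can reach it.
key_is_identifier = "*test*".isidentifier()
value_via_dict = namespace["*test*"]

# Ordinary name lookup for TEST, which is what mangled Hy code would emit,
# now fails because the name no longer exists in the namespace.
try:
    eval("TEST", namespace)
    lookup_succeeded = True
except NameError:
    lookup_succeeded = False

print(key_is_identifier, value_via_dict, lookup_succeeded)
```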

ghost · 2017-02-10T09:00:09Z

Oh, my bad.
I didn’t realise that we could in fact access Python’s SYMBOL as either SYMBOL or *symbol*, since, well, it isn’t really documented yet ;)
And the same goes for underscores and dashes. Oh. Well, that’s nice, and I am going to write it down.

Thanks!

gilch mentioned this issue Dec 13, 2016

Should keywords be mangled? #1168

Closed

Kodiologist mentioned this issue Feb 10, 2017

Docs: Hy <-> Python interop fix #1061 #1218

Merged

Kodiologist added the complaint / disgust label May 30, 2017

gilch mentioned this issue Jul 20, 2017

add destructure to contrib #1328

Closed

Kodiologist mentioned this issue Nov 16, 2017

Mangling makeover #1458

Closed

9 tasks

Kodiologist mentioned this issue Feb 26, 2018

Mangling makeover #1517

Merged

9 tasks

Kodiologist closed this as completed in #1517 Mar 13, 2018
