Mangling makeover #1517

Merged 14 commits on Mar 13, 2018
10 changes: 10 additions & 0 deletions NEWS.rst
@@ -5,10 +5,20 @@ Unreleased

Other Breaking Changes
------------------------------
* Mangling rules have been overhauled, such that mangled names
  are always legal Python identifiers
* `_` and `-` are now equivalent even as single-character names

* The REPL history variable `_` is now `*1`

* Non-shadow unary `=`, `is`, `<`, etc. now evaluate their argument
  instead of ignoring it. This change increases consistency a bit
  and makes accidental unary uses easier to notice.

New Features
------------------------------
* Added `mangle` and `unmangle` as core functions

Bug Fixes
------------------------------
* Fix `(return)` so it works correctly to exit a Python 2 generator
138 changes: 1 addition & 137 deletions docs/language/api.rst
@@ -1,142 +1,6 @@
=================
Hy (the language)
=================

.. warning::
   This is incomplete; please consider contributing to the documentation
   effort.


Theory of Hy
============

Above all else, Hy maintains 100% compatibility in both directions
with Python. All Hy code follows a few simple rules. Memorize
them, as they will come in handy.

These rules help ensure that Hy code is idiomatic and interoperable in both
languages.


* Symbols in earmuffs will be translated to the upper-cased version of that
  string. For example, ``*foo*`` will become ``FOO``.

* Symbols containing non-ASCII characters will be encoded using
  `punycode <https://en.wikipedia.org/wiki/Punycode>`_ and prefixed with
  ``hy_``. For instance, ``⚘`` will become ``hy_w7h``, ``♥`` will become
  ``hy_g6h``, and ``i♥u`` will become ``hy_iu_t0x``.

* Symbols that contain dashes will have them replaced with underscores. For
  example, ``render-template`` will become ``render_template``. This means
  that symbols with dashes will shadow their underscore equivalents, and vice
  versa.
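
The three rules in the list above can be sketched in Python. This is a minimal illustration built only from the examples given in that list, not Hy's actual implementation: the function name ``legacy_mangle_sketch`` is hypothetical, and the order in which the rules are applied is an assumption.

```python
def legacy_mangle_sketch(name: str) -> str:
    """Illustrative only: mirrors the three documented rules, not Hy's source."""
    # Earmuffs rule: *foo* becomes FOO (asterisks dropped, text upper-cased).
    if len(name) > 2 and name.startswith("*") and name.endswith("*"):
        name = name[1:-1].upper()
    # Dash rule: dashes become underscores.
    name = name.replace("-", "_")
    # Punycode rule: names with non-ASCII characters are punycode-encoded
    # and prefixed with hy_. Punycode's own "-" delimiter is also turned
    # into "_" (assumed from the documented i♥u example).
    if any(ord(ch) > 127 for ch in name):
        encoded = name.encode("punycode").decode("ascii")
        name = "hy_" + encoded.replace("-", "_")
    return name


print(legacy_mangle_sketch("*foo*"))            # FOO
print(legacy_mangle_sketch("render-template"))  # render_template
print(legacy_mangle_sketch("i♥u"))              # hy_iu_t0x
```

Because ``render-template`` and ``render_template`` map to the same output, the sketch also shows why the two spellings shadow each other.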

Notes on Syntax
===============

numeric literals
----------------

In addition to regular numbers, Hy uses Python 3's standard notation for
non-decimal integers: ``0x`` for hexadecimal, ``0o`` for octal, and ``0b`` for
binary.

.. code-block:: clj

    (print 0x80 0b11101 0o102 30)
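
Since these prefixes are the same ones Python 3 uses, the values can be checked with the equivalent Python. This is only a sketch of the correspondence; the variable name ``values`` is illustrative.

```python
# Python equivalent of the Hy form (print 0x80 0b11101 0o102 30).
# The prefixes select the base: 0x = 16, 0b = 2, 0o = 8.
values = [0x80, 0b11101, 0o102, 30]
print(*values)  # 128 29 66 30
```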

Underscores and commas can appear anywhere in a numeric literal except the very
beginning. They have no effect on the value of the literal, but they're useful
for visually separating digits.

.. code-block:: clj

    (print 10,000,000,000 10_000_000_000)

Unlike Python, Hy provides literal forms for NaN and infinity: ``NaN``,
``Inf``, and ``-Inf``.

string literals
---------------

Hy allows double-quoted strings (e.g., ``"hello"``) but, unlike Python, not
single-quoted strings. The single-quote character ``'`` is reserved for
preventing the evaluation of a form (e.g., ``'(+ 1 1)``), as in most Lisps.

Python's so-called triple-quoted strings (e.g., ``'''hello'''`` and
``"""hello"""``) aren't supported. However, in Hy, unlike Python, any string
literal can contain newlines. Furthermore, Hy supports an alternative form of
string literal called a "bracket string" similar to Lua's long brackets.
Bracket strings have customizable delimiters, like the here-documents of other
languages. A bracket string begins with ``#[FOO[`` and ends with ``]FOO]``,
where ``FOO`` is any string not containing ``[`` or ``]``, including the empty
string. For example::

    => (print #[["That's very kind of yuo [sic]" Tom wrote back.]])
    "That's very kind of yuo [sic]" Tom wrote back.
    => (print #[==[1 + 1 = 2]==])
    1 + 1 = 2

A bracket string can contain newlines, but if it begins with one, the newline
is removed, so you can begin the content of a bracket string on the line
following the opening delimiter with no effect on the content. Any leading
newlines past the first are preserved.

Plain string literals support :ref:`a variety of backslash escapes
<py:strings>`. To create a "raw string" that interprets all backslashes
literally, prefix the string with ``r``, as in ``r"slash\not"``. Bracket
strings are always raw strings and don't allow the ``r`` prefix.

Whether running under Python 2 or Python 3, Hy treats all string literals as
sequences of Unicode characters by default, and allows you to prefix a plain
string literal (but not a bracket string) with ``b`` to treat it as a sequence
of bytes. So when running under Python 3, Hy translates ``"foo"`` and
``b"foo"`` to the identical Python code, but when running under Python 2,
``"foo"`` is translated to ``u"foo"`` and ``b"foo"`` is translated to
``"foo"``.

.. _syntax-keywords:

keywords
--------

An identifier headed by a colon, such as ``:foo``, is a keyword. Keywords
evaluate to a string preceded by the Unicode non-character code point U+FDD0,
like ``"\ufdd0:foo"``, so ``:foo`` and ``":foo"`` aren't equal. However, if a
literal keyword appears in a function call, it's used to indicate a keyword
argument rather than passed in as a value. For example, ``(f :foo 3)`` calls
the function ``f`` with the keyword argument named ``foo`` set to ``3``. Hence,
trying to call a function on a literal keyword may fail: ``(f :foo)`` yields
the error ``Keyword argument :foo needs a value``. To avoid this, you can quote
the keyword, as in ``(f ':foo)``, or use it as the value of another keyword
argument, as in ``(f :arg :foo)``.

discard prefix
--------------

Hy supports the Extensible Data Notation discard prefix, like Clojure.
Any form prefixed with ``#_`` is discarded instead of compiled.
This completely removes the form so it doesn't evaluate to anything,
not even None.
It's often more useful than linewise comments for commenting out a
form, because it respects code structure even when part of another
form is on the same line. For example:

.. code-block:: clj

    => (print "Hy" "cruel" "World!")
    Hy cruel World!
    => (print "Hy" #_"cruel" "World!")
    Hy World!
    => (+ 1 1 (print "Math is hard!"))
    Math is hard!
    Traceback (most recent call last):
       ...
    TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'
    => (+ 1 1 #_(print "Math is hard!"))
    2

Built-Ins
=========
=================

Hy features a number of special forms that are used to help generate
correct Python AST. The following are "special" forms, which may have
30 changes: 30 additions & 0 deletions docs/language/core.rst
@@ -699,6 +699,20 @@ Returns the single step macro expansion of *form*.
HySymbol('e'),
HySymbol('f')])])

.. _mangle-fn:

mangle
------

Usage: ``(mangle x)``

Stringify the input and translate it according to :ref:`Hy's mangling rules
<mangling>`.

.. code-block:: hylang

    => (mangle "foo-bar")
    'foo_bar'
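
As a rough illustration of the kind of output ``mangle`` produces, here is a Python sketch that models only two rewrites mentioned in this changeset: hyphens become underscores, and a trailing ``?`` becomes a leading ``is_``. The name ``mangle_sketch`` is hypothetical, and Hy's real mangling rules cover more cases than this.

```python
def mangle_sketch(name: str) -> str:
    """Simplified stand-in for Hy's mangle; models two rules only."""
    # Assumption: a trailing "?" turns into a leading "is_",
    # as in valid? -> is_valid.
    if len(name) > 1 and name.endswith("?"):
        name = "is_" + name[:-1]
    # Hyphens become underscores, including a lone "-".
    return name.replace("-", "_")


for hy_name in ("foo-bar", "valid?", "-"):
    py_name = mangle_sketch(hy_name)
    # For these inputs the result is a legal Python identifier.
    print(hy_name, "->", py_name, py_name.isidentifier())
```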

.. _merge-with-fn:

@@ -1431,6 +1445,22 @@ Returns an iterator from *coll* as long as *pred* returns ``True``.
=> (list (take-while neg? [ 1 2 3 -4 5]))
[]

.. _unmangle-fn:

unmangle
--------

Usage: ``(unmangle x)``

Stringify the input and return a string that would :ref:`mangle <mangling>` to
it. Note that neither this operation nor ``mangle`` is one-to-one, so
``mangle`` and ``unmangle`` don't always round-trip.

.. code-block:: hylang

    => (unmangle "foo_bar")
    'foo-bar'
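
The lack of round-tripping can be seen with a small Python sketch. It assumes a deliberately reduced model in which only the hyphen and underscore rule exists; ``mangle_sketch`` and ``unmangle_sketch`` are hypothetical names, not Hy's functions.

```python
def mangle_sketch(name: str) -> str:
    # Assumption: hyphen -> underscore is the only rule modeled here.
    return name.replace("-", "_")


def unmangle_sketch(name: str) -> str:
    # Assumption: prefer the hyphenated spelling, as in the example above.
    return name.replace("_", "-")


for original in ("foo-bar", "foo_bar"):
    restored = unmangle_sketch(mangle_sketch(original))
    status = "round-trips" if restored == original else "does not round-trip"
    print(original, "->", restored, "(" + status + ")")
```

Both spellings mangle to the same string, so the reverse step has to pick one of them, and the underscore spelling is lost.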

Included itertools
==================

1 change: 1 addition & 0 deletions docs/language/index.rst
@@ -9,6 +9,7 @@ Contents:

cli
interop
syntax
api
core
internals
21 changes: 6 additions & 15 deletions docs/language/internals.rst
@@ -157,17 +157,8 @@ HySymbol
``hy.models.HySymbol`` is the model used to represent symbols
in the Hy language. It inherits from :ref:`HyString`.

``HySymbol`` objects are mangled in the parsing phase, to help Python
interoperability:

- Symbols surrounded by asterisks (``*``) are turned into uppercase;
- Dashes (``-``) are turned into underscores (``_``);
- One trailing question mark (``?``) is turned into a leading ``is_``.

Caveat: as the mangling is done during the parsing phase, it is possible
to programmatically generate HySymbols that can't be generated with Hy
source code. Such a mechanism is used by :ref:`gensym` to generate
"uninterned" symbols.
Symbols are :ref:`mangled <mangling>` when they are compiled
to Python variable names.

.. _hykeyword:

@@ -340,7 +331,7 @@ Since they have no "value" to Python, this makes working in Hy hard, since
doing something like ``(print (if True True False))`` is not just common, it's
expected.

As a result, we auto-mangle things using a ``Result`` object, where we offer
As a result, we reconfigure things using a ``Result`` object, where we offer
up any ``ast.stmt`` nodes that need to be run, and a single ``ast.expr`` that
can be used to get the value of whatever was just run. Hy does this by forcing
assignment to temporary names while running.
@@ -352,11 +343,11 @@ As example, the Hy::
Will turn into::

if True:
_mangled_name_here = True
_temp_name_here = True
else:
_mangled_name_here = False
_temp_name_here = False

print _mangled_name_here
print _temp_name_here


OK, that was a bit of a lie, since we actually turn that statement
41 changes: 6 additions & 35 deletions docs/language/interop.rst
@@ -8,6 +8,12 @@ Hy <-> Python interop
Despite being a Lisp, Hy aims to be fully compatible with Python. That means
every Python module or package can be imported in Hy code, and vice versa.

:ref:`Mangling <mangling>` allows variable names to be spelled differently in
Hy and Python. For example, Python's ``str.format_map`` can be written
``str.format-map`` in Hy, and a Hy function named ``valid?`` would be called
``is_valid`` in Python. In Python, you can import Hy's core functions
``mangle`` and ``unmangle`` directly from the ``hy`` package.

Using Python from Hy
====================

@@ -27,41 +33,6 @@ You can use it in Hy:

You can also import ``.pyc`` bytecode files, of course.

A quick note about mangling
---------------------------

In Python, snake_case is used by convention. Lisp dialects tend to use dashes
instead of underscores, so Hy does some magic to give you more pleasant names.

In the same way, ``UPPERCASE_NAMES`` from Python can be written
``*with-earmuffs*`` instead.

You can use either the original names or the new ones.

Imagine ``example.py``::

    def function_with_a_long_name():
        print(42)

    FOO = "bar"

Then, in Hy:

.. code-block:: clj

    (import example)
    (.function-with-a-long-name example) ; prints "42"
    (.function_with_a_long_name example) ; also prints "42"

    (print (. example *foo*)) ; prints "bar"
    (print (. example FOO))   ; also prints "bar"

.. warning::
   Mangling isn’t that simple; there is more to discuss about it, yet it doesn’t
   belong in this section.
.. TODO: link to mangling section, when it is done


Using Hy from Python
====================
