Merge pull request #1517 from Kodiologist/mangling-makeover
Mangling makeover
Kodiologist authored Mar 13, 2018
2 parents d947a27 + 4c5dea0 commit b023ebd
Showing 26 changed files with 612 additions and 481 deletions.
10 changes: 10 additions & 0 deletions NEWS.rst
@@ -5,10 +5,20 @@ Unreleased

Other Breaking Changes
------------------------------
* Mangling rules have been overhauled, such that mangled names
are always legal Python identifiers
* `_` and `-` are now equivalent even as single-character names

* The REPL history variable `_` is now `*1`

* Non-shadow unary `=`, `is`, `<`, etc. now evaluate their argument
instead of ignoring it. This change increases consistency a bit
and makes accidental unary uses easier to notice.

New Features
------------------------------
* Added `mangle` and `unmangle` as core functions

Bug Fixes
------------------------------
* Fix `(return)` so it works correctly to exit a Python 2 generator
138 changes: 1 addition & 137 deletions docs/language/api.rst
@@ -1,142 +1,6 @@
=================
Hy (the language)
=================

.. warning::
   This is incomplete; please consider contributing to the documentation
   effort.


Theory of Hy
============

Hy maintains, over everything else, 100% compatibility in both directions
with Python itself. All Hy code follows a few simple rules. Memorize
this, as it's going to come in handy.

These rules help ensure that Hy code is idiomatic and interfaceable in both
languages.


* Symbols in earmuffs will be translated to the upper-cased version of that
string. For example, ``*foo*`` will become ``FOO``.

* UTF-8 entities will be encoded using
`punycode <https://en.wikipedia.org/wiki/Punycode>`_ and prefixed with
``hy_``. For instance, ``⚘`` will become ``hy_w7h``, ``♥`` will become
``hy_g6h``, and ``i♥u`` will become ``hy_iu_t0x``.

* Symbols that contain dashes will have them replaced with underscores. For
example, ``render-template`` will become ``render_template``. This means
that symbols with dashes will shadow their underscore equivalents, and vice
versa.
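
As a rough illustration, the three rules listed above can be sketched in plain Python. This is only a sketch, not Hy's actual implementation: the function name ``old_style_mangle`` is hypothetical, and the order in which the rules are applied is an assumption chosen so that the documented examples come out as stated. It relies on the ``punycode`` codec from Python's standard library.

```python
def old_style_mangle(symbol):
    """Hypothetical sketch of the pre-overhaul rules; not Hy's real code."""
    # Rule 1: symbols in earmuffs are upper-cased, e.g. *foo* -> FOO.
    if len(symbol) > 2 and symbol.startswith("*") and symbol.endswith("*"):
        symbol = symbol[1:-1].upper()
    # Rule 2: symbols with non-ASCII characters are punycode-encoded
    # and prefixed with hy_.
    if any(ord(ch) > 127 for ch in symbol):
        symbol = "hy_" + symbol.encode("punycode").decode("ascii")
    # Rule 3: dashes become underscores. This also covers the dash that
    # punycode places between the ASCII part and the encoded part.
    return symbol.replace("-", "_")


print(old_style_mangle("*foo*"))
print(old_style_mangle("render-template"))
print(old_style_mangle("i♥u"))
```

Because ``render-template`` and ``render_template`` both come out as ``render_template``, the two spellings name the same variable, which is the shadowing that the last rule describes.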

Notes on Syntax
===============

numeric literals
----------------

In addition to regular numbers, standard notation from Python 3 for non-base 10
integers is used. ``0x`` for Hex, ``0o`` for Octal, ``0b`` for Binary.

.. code-block:: clj

   (print 0x80 0b11101 0o102 30)

Underscores and commas can appear anywhere in a numeric literal except the very
beginning. They have no effect on the value of the literal, but they're useful
for visually separating digits.

.. code-block:: clj

   (print 10,000,000,000 10_000_000_000)

Unlike Python, Hy provides literal forms for NaN and infinity: ``NaN``,
``Inf``, and ``-Inf``.
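
For comparison, here is a small Python 3 sketch of the same values. It is only meant to show what the Hy literals above denote. It assumes Python 3.6 or later for the underscore separators, and ``float(...)`` is just one way to obtain NaN and infinity in Python.

```python
import math

# The same integers written with Python 3 prefixes.
values = [0x80, 0b11101, 0o102, 30]
print(values)  # [128, 29, 66, 30]

# Python (3.6+) accepts underscores as digit separators, but not commas.
big = 10_000_000_000
print(big == 10 ** 10)  # True

# Python has no NaN or infinity literals; float() is one way to get them.
nan, inf, neg_inf = float("nan"), float("inf"), float("-inf")
print(math.isnan(nan), math.isinf(inf), neg_inf < 0)  # True True True
```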

string literals
---------------

Hy allows double-quoted strings (e.g., ``"hello"``), but not single-quoted
strings like Python. The single-quote character ``'`` is reserved for
preventing the evaluation of a form (e.g., ``'(+ 1 1)``), as in most Lisps.

Python's so-called triple-quoted strings (e.g., ``'''hello'''`` and
``"""hello"""``) aren't supported. However, in Hy, unlike Python, any string
literal can contain newlines. Furthermore, Hy supports an alternative form of
string literal called a "bracket string" similar to Lua's long brackets.
Bracket strings have customizable delimiters, like the here-documents of other
languages. A bracket string begins with ``#[FOO[`` and ends with ``]FOO]``,
where ``FOO`` is any string not containing ``[`` or ``]``, including the empty
string. For example::

=> (print #[["That's very kind of yuo [sic]" Tom wrote back.]])
"That's very kind of yuo [sic]" Tom wrote back.
=> (print #[==[1 + 1 = 2]==])
1 + 1 = 2

A bracket string can contain newlines, but if it begins with one, the newline
is removed, so you can begin the content of a bracket string on the line
following the opening delimiter with no effect on the content. Any leading
newlines past the first are preserved.

Plain string literals support :ref:`a variety of backslash escapes
<py:strings>`. To create a "raw string" that interprets all backslashes
literally, prefix the string with ``r``, as in ``r"slash\not"``. Bracket
strings are always raw strings and don't allow the ``r`` prefix.

Whether running under Python 2 or Python 3, Hy treats all string literals as
sequences of Unicode characters by default, and allows you to prefix a plain
string literal (but not a bracket string) with ``b`` to treat it as a sequence
of bytes. So when running under Python 3, Hy translates ``"foo"`` and
``b"foo"`` to the identical Python code, but when running under Python 2,
``"foo"`` is translated to ``u"foo"`` and ``b"foo"`` is translated to
``"foo"``.

.. _syntax-keywords:

keywords
--------

An identifier headed by a colon, such as ``:foo``, is a keyword. Keywords
evaluate to a string preceded by the Unicode non-character code point U+FDD0,
like ``"\ufdd0:foo"``, so ``:foo`` and ``":foo"`` aren't equal. However, if a
literal keyword appears in a function call, it's used to indicate a keyword
argument rather than passed in as a value. For example, ``(f :foo 3)`` calls
the function ``f`` with the keyword argument named ``foo`` set to ``3``. Hence,
trying to call a function on a literal keyword may fail: ``(f :foo)`` yields
the error ``Keyword argument :foo needs a value``. To avoid this, you can quote
the keyword, as in ``(f ':foo)``, or use it as the value of another keyword
argument, as in ``(f :arg :foo)``.
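
The two roles of a keyword can be mimicked in plain Python. The sketch below assumes the representation described in this section (a string prefixed with U+FDD0), and the function ``f`` is a hypothetical stand-in for any callable that accepts keyword arguments.

```python
# The value a keyword such as :foo evaluates to, per the text above.
keyword_value = "\ufdd0:foo"

# It differs from the plain string ":foo" only by the leading code point.
print(keyword_value == ":foo")      # False
print(keyword_value[1:] == ":foo")  # True


def f(**kwargs):
    """Hypothetical function that simply returns its keyword arguments."""
    return kwargs


# Roughly what (f :foo 3) means: a keyword argument named foo.
print(f(foo=3))  # {'foo': 3}

# Roughly what (f :arg :foo) means: the keyword is passed as a value.
print(f(arg=keyword_value) == {"arg": "\ufdd0:foo"})  # True
```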

discard prefix
--------------

Hy supports the Extensible Data Notation discard prefix, like Clojure.
Any form prefixed with ``#_`` is discarded instead of compiled.
This completely removes the form so it doesn't evaluate to anything,
not even None.
It's often more useful than linewise comments for commenting out a
form, because it respects code structure even when part of another
form is on the same line. For example:

.. code-block:: clj

   => (print "Hy" "cruel" "World!")
   Hy cruel World!
   => (print "Hy" #_"cruel" "World!")
   Hy World!
   => (+ 1 1 (print "Math is hard!"))
   Math is hard!
   Traceback (most recent call last):
   ...
   TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'
   => (+ 1 1 #_(print "Math is hard!"))
   2

Built-Ins
=========
=================

Hy features a number of special forms that are used to help generate
correct Python AST. The following are "special" forms, which may have
30 changes: 30 additions & 0 deletions docs/language/core.rst
@@ -699,6 +699,20 @@ Returns the single step macro expansion of *form*.
HySymbol('e'),
HySymbol('f')])])
.. _mangle-fn:

mangle
------

Usage: ``(mangle x)``

Stringify the input and translate it according to :ref:`Hy's mangling rules
<mangling>`.

.. code-block:: hylang

   => (mangle "foo-bar")
   'foo_bar'

.. _merge-with-fn:

@@ -1431,6 +1445,22 @@ Returns an iterator from *coll* as long as *pred* returns ``True``.
=> (list (take-while neg? [ 1 2 3 -4 5]))
[]
.. _unmangle-fn:

unmangle
--------

Usage: ``(unmangle x)``

Stringify the input and return a string that would :ref:`mangle <mangling>` to
it. Note that this isn't a one-to-one operation, and nor is ``mangle``, so
``mangle`` and ``unmangle`` don't always round-trip.

.. code-block:: hylang

   => (unmangle "foo_bar")
   'foo-bar'

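
To see why the two functions cannot always round-trip, consider a toy model in Python that keeps only the dash/underscore rule. The names ``toy_mangle`` and ``toy_unmangle`` are hypothetical stand-ins, not Hy's real functions, which handle more cases.

```python
def toy_mangle(name):
    # Hypothetical stand-in: only the dash -> underscore rule.
    return name.replace("-", "_")


def toy_unmangle(name):
    # Hypothetical stand-in: always prefers the dashed spelling.
    return name.replace("_", "-")


# Two different inputs mangle to the same output, so mangling loses
# information about which spelling was used.
print(toy_mangle("foo-bar") == toy_mangle("foo_bar"))  # True

# Unmangling therefore has to pick one spelling, and the underscore
# spelling does not survive the round trip.
print(toy_unmangle(toy_mangle("foo_bar")))  # foo-bar
```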
Included itertools
==================

1 change: 1 addition & 0 deletions docs/language/index.rst
@@ -9,6 +9,7 @@ Contents:

cli
interop
syntax
api
core
internals
21 changes: 6 additions & 15 deletions docs/language/internals.rst
@@ -157,17 +157,8 @@ HySymbol
``hy.models.HySymbol`` is the model used to represent symbols
in the Hy language. It inherits :ref:`HyString`.

``HySymbol`` objects are mangled in the parsing phase, to help Python
interoperability:

- Symbols surrounded by asterisks (``*``) are turned into uppercase;
- Dashes (``-``) are turned into underscores (``_``);
- One trailing question mark (``?``) is turned into a leading ``is_``.

Caveat: as the mangling is done during the parsing phase, it is possible
to programmatically generate HySymbols that can't be generated with Hy
source code. Such a mechanism is used by :ref:`gensym` to generate
"uninterned" symbols.
Symbols are :ref:`mangled <mangling>` when they are compiled
to Python variable names.

.. _hykeyword:

@@ -340,7 +331,7 @@ Since they have no "value" to Python, this makes working in Hy hard, since
doing something like ``(print (if True True False))`` is not just common, it's
expected.

As a result, we auto-mangle things using a ``Result`` object, where we offer
As a result, we reconfigure things using a ``Result`` object, where we offer
up any ``ast.stmt`` that need to get run, and a single ``ast.expr`` that can
be used to get the value of whatever was just run. Hy does this by forcing
assignment to things while running.
@@ -352,11 +343,11 @@ As example, the Hy::
Will turn into::

if True:
_mangled_name_here = True
_temp_name_here = True
else:
_mangled_name_here = False
_temp_name_here = False

print _mangled_name_here
print _temp_name_here


OK, that was a bit of a lie, since we actually turn that statement
41 changes: 6 additions & 35 deletions docs/language/interop.rst
@@ -8,6 +8,12 @@ Hy <-> Python interop
Despite being a Lisp, Hy aims to be fully compatible with Python. That means
every Python module or package can be imported in Hy code, and vice versa.

:ref:`Mangling <mangling>` allows variable names to be spelled differently in
Hy and Python. For example, Python's ``str.format_map`` can be written
``str.format-map`` in Hy, and a Hy function named ``valid?`` would be called
``is_valid`` in Python. In Python, you can import Hy's core functions
``mangle`` and ``unmangle`` directly from the ``hy`` package.

Using Python from Hy
====================

@@ -27,41 +33,6 @@ You can use it in Hy:
You can also import ``.pyc`` bytecode files, of course.

A quick note about mangling
--------

In Python, snake_case is used by convention. Lisp dialects tend to use dashes
instead of underscores, so Hy does some magic to give you more pleasant names.

In the same way, ``UPPERCASE_NAMES`` from Python can be used ``*with-earmuffs*``
instead.

You can use either the original names or the new ones.

Imagine ``example.py``::

def function_with_a_long_name():
print(42)

FOO = "bar"

Then, in Hy:

.. code-block:: clj

   (import example)
   (.function-with-a-long-name example) ; prints "42"
   (.function_with_a_long_name example) ; also prints "42"
   (print (. example *foo*)) ; prints "bar"
   (print (. example FOO)) ; also prints "bar"

.. warning::
   Mangling isn’t that simple; there is more to discuss about it, yet it doesn’t
   belong in this section.

.. TODO: link to mangling section, when it is done

Using Hy from Python
====================
