Start macro expansion chapter #26

Merged
merged 6 commits into from
Jan 31, 2018
Changes from 3 commits
65 changes: 65 additions & 0 deletions src/macro-expansion.md
@@ -1 +1,66 @@
# Macro expansion

Macro expansion happens during parsing. `rustc` has two parsers, in fact: the
normal Rust parser, and the macro parser. During the parsing phase, the normal
Rust parser will call into the macro parser when it encounters a macro. The
Contributor

can you be more precise about what a reference to a macro is? e.g., do you mean a macro invocation, like foo!(...)?

Contributor

Also, is it really called from the parser? I thought there was a second phase that came after parsing, but maybe I'm going to learn something here =)

Member Author

Let me verify that :)

Member Author

@nikomatsakis

Ok, so it looks like

macro parser, in turn, may call back out to the Rust parser when it needs to
bind a metavariable (e.g. `$my_expr`). There are a few aspects of this system to
Contributor

here, you mean when the macro is trying to parse the contents of the macro invocation against one of the macro arms?

be explained. The code for macro expansion is in `src/libsyntax/ext/tt/`.
Contributor

can we make links into GH here (master branch)? this at least allows us to detect if those links rot


### The macro parser
Contributor

as a meta-comment, I think it's a good idea to start out with some kind of concrete example and walk it through. For example:

Imagine we have a macro

macro_rules! foo {
    ($metavariable:tt) => { ... }
}

now you can reference this example from the text below


Basically, the macro parser is like an NFA-based regex parser. It uses an
algorithm similar in spirit to the [Earley parsing
algorithm](https://en.wikipedia.org/wiki/Earley_parser). The macro parser is
defined in `src/libsyntax/ext/tt/macro_parser.rs`.
Contributor

can we make links into GH here (master branch)? this at least allows us to detect if those links rot


In a traditional NFA-based parser, one common approach is to have some pattern
which we are trying to match an input against. Moreover, we may try to capture
some portion of the input and bind it to a variable in the pattern. For example:
suppose we have a pattern (borrowing Rust macro syntax) such as `a $b:ident a`
Contributor

e.g., this example would reference back to your running example.

-- that is, an `a` token followed by an `ident` token followed by another `a`
token. Given an input `a foo a`, the _metavariable_ `$b` would bind to the
`ident` `foo`. On the other hand, an input `a foo b` would be rejected as a
parse failure because the pattern `a <ident> a` cannot match `a foo b` (or as
the compiler would put it, "no rules expected token `b`").
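
The matching behavior described above can be reproduced with an ordinary `macro_rules` macro. This is a minimal sketch; `match_a` and `bound_ident` are hypothetical names chosen for illustration and do not appear in `rustc`:

```rust
// Hypothetical macro for illustration; not part of rustc.
macro_rules! match_a {
    // Matcher: a literal `a`, then any identifier bound to `$b`, then a literal `a`.
    (a $b:ident a) => {
        stringify!($b)
    };
}

fn bound_ident() -> &'static str {
    // `a foo a` matches the pattern, so `$b` binds to the identifier `foo`.
    match_a!(a foo a)
}

// Uncommenting the next function should make compilation fail, because the
// final token `b` does not match the literal `a` in the matcher.
// fn rejected() -> &'static str { match_a!(a foo b) }
```

Calling `bound_ident()` yields the text of the identifier that `$b` captured.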

The macro parser does pretty much exactly that with one exception: in order to
parse different types of metavariables, such as `ident`, `block`, `expr`, etc.,
the macro parser must sometimes call back to the normal Rust parser.
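
As a rough illustration of why that callback is needed, consider a matcher that mixes an `ident` fragment with an `expr` fragment. The macro and function names below are assumptions made up for this sketch:

```rust
// Hypothetical macro for illustration; not part of rustc.
macro_rules! bind {
    // `$name` only needs a single identifier token, but `$value` needs a
    // complete Rust expression, which only the normal Rust parser can recognize.
    ($name:ident = $value:expr) => {
        (stringify!($name), $value)
    };
}

fn bound_pair() -> (&'static str, i32) {
    // The tokens `1 + 2 * 3` are parsed as one expression and bound to `$value`.
    bind!(answer = 1 + 2 * 3)
}
```

The whole of `1 + 2 * 3` is captured as a single expression with the usual operator precedence, which a purely token-by-token matcher could not decide on its own.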

Interestingly, both definitions and invocations of macros are parsed using the
Contributor

Spelling: invocations

macro parser. This is extremely non-intuitive and self-referential. The code to
parse macro _definitions_ is in `src/libsyntax/ext/tt/macro_rules.rs`. It
defines the pattern for matching for a macro definition as `$( $lhs:tt =>
$rhs:tt );+`. In other words, a `macro_rules` definition should have in its body
at least one occurrence of a token tree followed by `=>` followed by another
token tree. When the compiler comes to a `macro_rules` definition, it uses this
pattern to match the two token trees per rule in the definition of the macro
_using the macro parser itself_.
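
To get a feel for that pattern, here is a sketch of a user-level macro that uses the same matcher to take apart something shaped like a `macro_rules` body. It only mirrors the pattern quoted above and is not the compiler's actual code; `count_rules` and `rule_count` are hypothetical names:

```rust
// Hypothetical macro for illustration; not part of rustc.
macro_rules! count_rules {
    // One or more `lhs => rhs` pairs separated by `;`, where each side is a
    // single token tree (for example a parenthesized or braced group).
    ($( $lhs:tt => $rhs:tt );+) => {
        // Build one (lhs, rhs) pair of strings per rule and count them.
        [$( (stringify!($lhs), stringify!($rhs)) ),+].len()
    };
}

fn rule_count() -> usize {
    // Two rules: `(a) => { 1 }` and `(b c) => { 2 }`.
    count_rules!( (a) => { 1 }; (b c) => { 2 } )
}
```

Each `$lhs` here plays the role of a matcher and each `$rhs` the role of the right-hand side of a rule.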

When the compiler comes to a macro invocation, it needs to parse that
invocation. This is also known as _macro expansion_. The same NFA-based macro
Contributor

Spelling: invocation

parser is used that is described above. Notably, the "pattern" (or _matcher_)
used is the first token tree extracted from the rules of the macro _definition_.
Contributor

this is where the running example would be really handy

In other words, given some pattern described by the _definition_ of the macro,
we want to match the contents of the _invocation_ of the macro.
Contributor

Spelling: invocation


The algorithm is exactly the same, but when the macro parser comes to a place in
the current matcher where it needs to match a _non-terminal_ (i.e. a
metavariable), it calls back to the normal Rust parser to get the contents of
that non-terminal. Then, the macro parser proceeds in parsing as normal.
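
A small sketch of what this looks like from the user's side: each rule of a macro contributes one matcher, and the contents of an invocation are matched against them in order. The `shape` macro and `shape_examples` function are hypothetical and exist only for this example:

```rust
// Hypothetical macro for illustration; not part of rustc.
macro_rules! shape {
    // Matcher with no metavariables: only the literal token `point` matches.
    (point) => {
        0
    };
    // Literal `line`, then a non-terminal that must be parsed as an expression.
    (line $len:expr) => {
        $len
    };
    // Literal `rect`, then two expression non-terminals separated by a comma.
    (rect $w:expr, $h:expr) => {
        $w * $h
    };
}

fn shape_examples() -> (i32, i32, i32) {
    // Each invocation is matched against the matchers above, first to last.
    (shape!(point), shape!(line 5), shape!(rect 3, 4))
}
```

In the second and third rules the matcher reaches `$len`, `$w` and `$h`, which are the non-terminals whose contents have to come from the normal Rust parser.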

For more information about the macro parser's implementation, see the comments
in `src/libsyntax/ext/tt/macro_parser.rs`.
Contributor

link to repo


### Hygiene

TODO

### Procedural Macros

TODO

### Custom Derive

TODO