chore(book): adding examples #300

Merged 27 commits on Apr 21, 2023
8ae9430
Initial commit for the Logos Handboook
maciejhirsz Apr 23, 2020
9f6174f
A bit more adds to the book
maciejhirsz Apr 23, 2020
971903a
chore(ci): setup automated CI for book
jeertmans Apr 12, 2023
710e5e9
chore(ci): update branches
jeertmans Apr 12, 2023
c6d8185
fix(ci): remove extra needs
jeertmans Apr 12, 2023
7829706
chore(docs): adding brainfuck example
jeertmans Apr 14, 2023
5d44f5a
Add missing `Debug` error type requirement (#298)
shilangyu Apr 15, 2023
c94d304
chore(docs): create JSON example
jeertmans Apr 18, 2023
824edd8
Merge remote-tracking branch 'origin/book' into examples
jeertmans Apr 18, 2023
6959ba0
Initial commit for the Logos Handboook
maciejhirsz Apr 23, 2020
afb327b
A bit more adds to the book
maciejhirsz Apr 23, 2020
3ae8bb4
chore(ci): setup automated CI for book
jeertmans Apr 12, 2023
50caad4
chore(ci): update branches
jeertmans Apr 12, 2023
6a576e6
fix(ci): remove extra needs
jeertmans Apr 12, 2023
c7c48f9
chore(docs): adding brainfuck example
jeertmans Apr 14, 2023
f4b6a32
chore(docs): create JSON example
jeertmans Apr 18, 2023
8b6e3c1
Merge remote-tracking branch 'origin/book' into book
jeertmans Apr 18, 2023
4f4bca0
chore(ci): test code examples
jeertmans Apr 18, 2023
efc2b89
chore(docs): scrape examples and autodoc features
jeertmans Apr 18, 2023
bebf2bb
chore(docs): adding brainfuck example
jeertmans Apr 14, 2023
9f74861
Add missing `Debug` error type requirement (#298)
shilangyu Apr 15, 2023
c74495b
chore(docs): create JSON example
jeertmans Apr 18, 2023
60d4d61
chore(ci): test code examples
jeertmans Apr 18, 2023
9e49ebb
chore(docs): scrape examples and autodoc features
jeertmans Apr 18, 2023
c65dc52
Merge remote-tracking branch 'origin/book' into book
jeertmans Apr 18, 2023
f4f3900
Auto stash before rebase of "maciejhirsz/book"
jeertmans Apr 18, 2023
f0a2e62
chore(book): typos and styling
jeertmans Apr 18, 2023
3 changes: 1 addition & 2 deletions .github/workflows/pages.yml
@@ -23,7 +23,7 @@ concurrency:
group: pages
cancel-in-progress: true

jobs:
# Build job
build-book:
runs-on: ubuntu-latest
@@ -34,7 +34,6 @@ jobs:
uses: peaceiris/actions-mdbook@v1
with:
mdbook-version: '0.4.28'
-# mdbook-version: 'latest'
- name: Build book
run: mdbook build book
- name: Upload artifact
2 changes: 1 addition & 1 deletion README.md
@@ -143,7 +143,7 @@ By default, **Logos** uses `()` as the error type, which means that it
doesn't store any information about the error.
This can be changed by using `#[logos(error = T)]` attribute on the enum.
The type `T` can be any type that implements `Clone`, `PartialEq`,
-`Default` and `From<E>` for each callback's error type `E`.
+`Default`, `Debug` and `From<E>` for each callback's error type `E`.
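
As a rough illustration of the bounds listed above, the sketch below defines a hypothetical `LexingError` type (the name and its variants are assumptions, not part of Logos) that implements `Clone`, `PartialEq`, `Default`, `Debug`, and `From<ParseIntError>`, as would be needed if a callback returned `Result<_, ParseIntError>`. The `#[logos(error = ...)]` attribute appears only in a comment so that the block compiles without the crate.

```rust
use std::num::ParseIntError;

/// Hypothetical error type; the name and variants are illustrative only.
#[derive(Debug, Clone, PartialEq)]
enum LexingError {
    /// A callback failed to parse an integer.
    InvalidInteger(String),
    /// Fallback value, produced through `Default`.
    Other,
}

impl Default for LexingError {
    fn default() -> Self {
        LexingError::Other
    }
}

impl From<ParseIntError> for LexingError {
    fn from(err: ParseIntError) -> Self {
        LexingError::InvalidInteger(err.to_string())
    }
}

// With Logos, the type would then be attached to the token enum:
//
// #[derive(Logos)]
// #[logos(error = LexingError)]
// enum Token { /* ... */ }

fn main() {
    // Convert a standard parsing error into the custom error type.
    let err: LexingError = "12x".parse::<i32>().unwrap_err().into();
    println!("{:?}", err);
}
```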

## Token disambiguation

6 changes: 5 additions & 1 deletion book/src/SUMMARY.md
@@ -1,11 +1,15 @@
# Summary

+ [Intro](./intro.md)
+ [Getting Started](./getting-started.md)
+ [Examples](./examples.md)
+ [Brainfuck interpreter](./examples/brainfuck.md)
+ [JSON parser](./examples/json.md)
+ [Attributes](./attributes.md)
+ [`#[logos]`](./attributes/logos.md)
+ [`#[error]`](./attributes/error.md)
+ [`#[token]` and `#[regex]`](./attributes/token_and_regex.md)
+ [Token disambiguation](./token-disambiguation.md)
+ [Using `Extras`](./extras.md)
+ [Using callbacks](./callbacks.md)
+ [Common regular expressions](./common-regex.md)
7 changes: 7 additions & 0 deletions book/src/examples.md
@@ -0,0 +1,7 @@
# Examples

The following examples are ordered by increasing level of complexity.

**[Brainfuck interpreter](./examples/brainfuck.md)**: Lexers are very powerful tools for turning program source code into meaningful instructions. We show you how to build an interpreter for the Brainfuck programming language in under 100 lines of code!

**[JSON parser](./examples/json.md)**: We present a JSON parser written with Logos that does nice error reporting when invalid values are encountered.
32 changes: 32 additions & 0 deletions book/src/examples/brainfuck.md
@@ -0,0 +1,32 @@
# Brainfuck interpreter

In most programming languages, commands are made of multiple program tokens, where a token is simply a string slice that has a particular meaning in the language. For example, in Rust, the function signature `pub fn main()` could be split by the **lexer** into the tokens `pub`, `fn`, `main`, `(`, and `)`. Then, the **parser** combines tokens into meaningful program instructions.

However, there exist programming languages, such as Brainfuck, that are so simple that each token can be mapped to a single instruction. There are actually 8 single-character tokens:

```rust,no_run,noplayground
{{#include ../../../logos/examples/brainfuck.rs:tokens}}
```

All other characters must be ignored.
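
To make the mapping concrete without depending on the included example file, here is a minimal hand-written sketch. It is not the Logos-derived lexer: the `Op` enum, its variant names, and the `tokenize` function are hypothetical stand-ins. It maps the eight command characters to operations and drops every other character.

```rust
/// Hypothetical operation enum; the variant names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    IncPointer,       // >
    DecPointer,       // <
    IncData,          // +
    DecData,          // -
    OutData,          // .
    InpData,          // ,
    CondJumpForward,  // [
    CondJumpBackward, // ]
}

/// Map each command character to an operation and skip everything else.
fn tokenize(source: &str) -> Vec<Op> {
    source
        .chars()
        .filter_map(|c| match c {
            '>' => Some(Op::IncPointer),
            '<' => Some(Op::DecPointer),
            '+' => Some(Op::IncData),
            '-' => Some(Op::DecData),
            '.' => Some(Op::OutData),
            ',' => Some(Op::InpData),
            '[' => Some(Op::CondJumpForward),
            ']' => Some(Op::CondJumpBackward),
            _ => None, // any other character is ignored
        })
        .collect()
}

fn main() {
    let ops = tokenize("+[-] ignored text");
    println!("{:?}", ops);
}
```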

Once the tokens are obtained, a Brainfuck interpreter can be easily created using a [finite-state machine](https://en.wikipedia.org/wiki/Finite-state_machine). For the sake of simplicity, we collect all the tokens into one vector called `operations`.

Now, creating an interpreter becomes straightforward[^1]:
```rust,no_run,noplayground
{{#include ../../../logos/examples/brainfuck.rs:fsm}}
```

[^1]: There is a small trick that makes this easy. As can be seen in the full code, we first check that every loop opening (`'['`) has a matching end (`']'`). This way, we can create two maps, `pairs` and `pairs_reverse`, to easily jump back and forth between them.
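
The bracket-matching step described in the footnote can be sketched with a stack. This is an illustration under assumptions, not the code from the full example: the `match_brackets` function and the `JumpMap` alias are hypothetical names, and for simplicity the sketch indexes characters of the source rather than entries of `operations`.

```rust
use std::collections::HashMap;

/// Hypothetical alias: maps one bracket position to its partner.
type JumpMap = HashMap<usize, usize>;

/// Pair every `[` with its `]`.
/// Returns (`pairs`: open -> close, `pairs_reverse`: close -> open),
/// or an error message if the brackets are unbalanced.
fn match_brackets(source: &str) -> Result<(JumpMap, JumpMap), String> {
    let mut stack = Vec::new(); // positions of still-open `[`
    let mut pairs = JumpMap::new();
    let mut pairs_reverse = JumpMap::new();

    for (i, c) in source.chars().enumerate() {
        match c {
            '[' => stack.push(i),
            ']' => {
                let open = stack
                    .pop()
                    .ok_or_else(|| format!("unmatched `]` at position {}", i))?;
                pairs.insert(open, i);
                pairs_reverse.insert(i, open);
            }
            _ => {}
        }
    }

    // Anything left on the stack is a `[` that was never closed.
    if let Some(open) = stack.pop() {
        return Err(format!("unmatched `[` at position {}", open));
    }

    Ok((pairs, pairs_reverse))
}

fn main() {
    match match_brackets("+[-[+]]") {
        Ok((pairs, pairs_reverse)) => println!("{:?} {:?}", pairs, pairs_reverse),
        Err(message) => eprintln!("{}", message),
    }
}
```

With both maps available, an interpreter can jump from a `[` to its `]` (or back) with a single lookup instead of rescanning the program.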

Finally, we provide the full code, which you should be able to run with[^2]:
```bash
cd logos/logos
cargo run --example brainfuck examples/hello_word.bf
```

[^2]: You first need to clone [this repository](https://github.com/maciejhirsz/logos).

```rust,no_run,noplayground
{{#include ../../../logos/examples/brainfuck.rs:all}}
```
55 changes: 55 additions & 0 deletions book/src/examples/json.md
@@ -0,0 +1,55 @@
# JSON parser

JSON is a widely used, human-readable format for exchanging data between programs.

Possible values are defined recursively and can be any of the following:

```rust,no_run,noplayground
{{#include ../../../logos/examples/json.rs:values}}
```

Objects are delimited by braces `{` and `}`, arrays by brackets `[` and `]`, and values are separated by commas `,`. Newlines, tabs, and spaces should be ignored by the lexer.

Knowing that, we can construct a lexer with `Logos` that will identify all those cases:

```rust,no_run,noplayground
{{#include ../../../logos/examples/json.rs:tokens}}
```

> NOTE: the hardest part is defining valid regexes for the `Number` and `String` variants. The present solution was inspired by [this Stack Overflow thread](https://stackoverflow.com/questions/32155133/regex-to-match-a-json-string).

Once we have our tokens, we must parse them into actual JSON values. We will proceed by creating 3 functions:

+ `parse_value` for parsing any JSON value, without prior knowledge of its type;
+ `parse_array` for parsing an array, assuming we matched `[`;
+ and `parse_object` for parsing an object, assuming we matched `{`.

Starting with parsing an arbitrary value, we can easily obtain the four scalar types, `Bool`, `Null`, `Number`, and `String`, while we call the next two functions to parse arrays and objects.

```rust,no_run,noplayground
{{#include ../../../logos/examples/json.rs:value}}
```

To parse an array, we simply loop over the tokens, alternating between parsing values and commas, until a closing bracket is found.

```rust,no_run,noplayground
{{#include ../../../logos/examples/json.rs:array}}
```
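
The value/comma alternation can be sketched independently of the lexer. The block below is a simplified stand-in, not the example's code: the `Token` and `Value` enums are reduced versions defined only for this sketch, the tokens come from a hand-built vector instead of a Logos lexer, and tracking state with two boolean flags is just one possible way to implement the alternation.

```rust
/// Simplified stand-ins for the example's token and value types.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    BracketOpen,
    BracketClose,
    Comma,
    Null,
    Bool(bool),
    Number(f64),
}

#[derive(Debug, PartialEq)]
enum Value {
    Null,
    Bool(bool),
    Number(f64),
    Array(Vec<Value>),
}

/// Parse one value whose first token has already been read.
fn parse_value<I>(first: Token, rest: &mut I) -> Result<Value, String>
where
    I: Iterator<Item = Token>,
{
    match first {
        Token::Null => Ok(Value::Null),
        Token::Bool(b) => Ok(Value::Bool(b)),
        Token::Number(n) => Ok(Value::Number(n)),
        Token::BracketOpen => parse_array(rest),
        other => Err(format!("unexpected token {:?}", other)),
    }
}

/// Parse an array, assuming the opening `[` was already consumed.
fn parse_array<I>(tokens: &mut I) -> Result<Value, String>
where
    I: Iterator<Item = Token>,
{
    let mut items = Vec::new();
    let mut awaits_comma = false; // a value was just read
    let mut awaits_value = false; // a comma was just read

    while let Some(token) = tokens.next() {
        match token {
            Token::BracketClose if !awaits_value => return Ok(Value::Array(items)),
            Token::Comma if awaits_comma => awaits_value = true,
            other if !awaits_comma => {
                items.push(parse_value(other, tokens)?);
                awaits_value = false;
            }
            other => return Err(format!("expected `,` or `]`, found {:?}", other)),
        }
        awaits_comma = !awaits_value;
    }

    Err("unmatched opening bracket".to_string())
}

fn main() {
    // Tokens for `[1, [true], null]`, written by hand as a lexer stand-in.
    let tokens = vec![
        Token::BracketOpen,
        Token::Number(1.0),
        Token::Comma,
        Token::BracketOpen,
        Token::Bool(true),
        Token::BracketClose,
        Token::Comma,
        Token::Null,
        Token::BracketClose,
    ];
    let mut iter = tokens.into_iter();
    let first = iter.next().expect("at least one token");
    println!("{:?}", parse_value(first, &mut iter));
}
```

In this sketch, a trailing comma (`[1,]`) and a missing comma (`[1 2]`) are both rejected, while the empty array `[]` is accepted.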

A similar approach is used for objects, where the only difference is that we expect (key, value) pairs, separated by a colon.

```rust,no_run,noplayground
{{#include ../../../logos/examples/json.rs:object}}
```

Finally, we provide the full code, which you should be able to run with[^1]:
```bash
cd logos/logos
cargo run --example json examples/example.json
```

[^1]: You first need to clone [this repository](https://github.com/maciejhirsz/logos).

```rust,no_run,noplayground
{{#include ../../../logos/examples/json.rs:all}}
```
72 changes: 72 additions & 0 deletions book/src/getting-started.md
@@ -0,0 +1,72 @@
# Getting Started

**Logos** can be included in your Rust project using the `cargo add logos` command, or by directly modifying your `Cargo.toml` file:

```toml
[dependencies]
logos = "0.13.0"
```

Then, you can automatically derive the [`Logos`](https://docs.rs/logos/latest/logos/trait.Logos.html) trait on your `enum` using the `Logos` derive macro:

```rust,no_run,noplayground
use logos::Logos;

#[derive(Logos, Debug, PartialEq)]
#[logos(skip r"[ \t\n\f]+")] // Ignore this regex pattern between tokens
enum Token {
    // Tokens can be literal strings, of any length.
    #[token("fast")]
    Fast,

    #[token(".")]
    Period,

    // Or regular expressions.
    #[regex("[a-zA-Z]+")]
    Text,
}
```

Then, you can use the `Logos::lexer` method to turn any `&str` into an iterator of tokens[^1]:

```rust,no_run,noplayground
let mut lex = Token::lexer("Create ridiculously fast Lexers.");

assert_eq!(lex.next(), Some(Ok(Token::Text)));
assert_eq!(lex.span(), 0..6);
assert_eq!(lex.slice(), "Create");

assert_eq!(lex.next(), Some(Ok(Token::Text)));
assert_eq!(lex.span(), 7..19);
assert_eq!(lex.slice(), "ridiculously");

assert_eq!(lex.next(), Some(Ok(Token::Fast)));
assert_eq!(lex.span(), 20..24);
assert_eq!(lex.slice(), "fast");

assert_eq!(lex.next(), Some(Ok(Token::Text)));
assert_eq!(lex.slice(), "Lexers");
assert_eq!(lex.span(), 25..31);

assert_eq!(lex.next(), Some(Ok(Token::Period)));
assert_eq!(lex.span(), 31..32);
assert_eq!(lex.slice(), ".");

assert_eq!(lex.next(), None);
```

[^1]: Each item is actually a [`Result<Token, _>`](https://docs.rs/logos/latest/logos/struct.Lexer.html#associatedtype.Item), because the lexer returns an error if some part of the string slice does not match any variant of `Token`.

Because [`Lexer`](https://docs.rs/logos/latest/logos/struct.Lexer.html), returned by [`Logos::lexer`](https://docs.rs/logos/latest/logos/trait.Logos.html#method.lexer), implements the `Iterator` trait, you can use a `for .. in` construct:

```rust,no_run,noplayground
for result in Token::lexer("Create ridiculously fast Lexers.") {
    match result {
        Ok(token) => println!("{:#?}", token),
        // The default error type `()` only implements `Debug`, not `Display`.
        Err(e) => panic!("some error occurred: {:?}", e),
    }
}
```
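
Since each item is a `Result`, another option is to collect the whole iterator into a single `Result<Vec<_>, _>`, which stops at the first error. The sketch below shows that standard-library idiom on a hand-built vector of results, used as a stand-in for `Token::lexer(...)` so that it runs without the crate; the local `Token` enum and the `collect_tokens` helper are hypothetical and exist only for this illustration.

```rust
/// Local copy of the token enum, without the Logos derive.
#[derive(Debug, PartialEq)]
enum Token {
    Fast,
    Period,
    Text,
}

/// Collect an iterator of results into one result.
/// The first `Err` short-circuits the collection.
fn collect_tokens<I>(results: I) -> Result<Vec<Token>, ()>
where
    I: Iterator<Item = Result<Token, ()>>,
{
    results.collect()
}

fn main() {
    // Stand-in for a lexer that matched every part of the input.
    let all_ok = vec![Ok(Token::Text), Ok(Token::Fast), Ok(Token::Period)];
    println!("{:?}", collect_tokens(all_ok.into_iter()));

    // Stand-in for a lexer that hit an unrecognized slice.
    let with_error = vec![Ok(Token::Text), Err(()), Ok(Token::Period)];
    println!("{:?}", collect_tokens(with_error.into_iter()));
}
```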


16 changes: 16 additions & 0 deletions logos/Cargo.toml
@@ -11,9 +11,17 @@ categories = ["parsing", "text-processing"]
readme = "../README.md"
edition = "2018"

[package.metadata.docs.rs]
all-features = true
cargo-args = ["-Zunstable-options", "-Zrustdoc-scrape-examples"]
rustdoc-args = ["--cfg", "docsrs"]

[dependencies]
logos-derive = { version = "0.13.0", path = "../logos-derive", optional = true }

[dev-dependencies]
ariadne = { version = "0.2.0", features = ["auto-color"] }

[features]
default = ["export_derive", "std"]

@@ -24,3 +32,11 @@ std = []
# import this crate and `use logos::Logos` to get both the trait and
# derive proc macro.
export_derive = ["logos-derive"]

[[example]]
name = "brainfuck"
path = "examples/brainfuck.rs"

[[example]]
name = "json"
path = "examples/json.rs"