Skip to content

Commit

Permalink
Solved problem of nested comments; refactored parsers
Browse files Browse the repository at this point in the history
  • Loading branch information
laughedelic committed Nov 17, 2013
1 parent e7b8426 commit d620362
Show file tree
Hide file tree
Showing 2 changed files with 34 additions and 33 deletions.
35 changes: 17 additions & 18 deletions docs/src/main/scala/LiteratorParsers.scala.md
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,8 @@ May be there are standard ones like this — I didn't find.
def many (p: => Parser[String]): Parser[String] = p.* ^^ (_.mkString)
def emptyLine: Parser[String] = """^[ \t]*""".r ~> eol
def anythingBut[T](p: => Parser[T]): Parser[String] = guard(not(p)) ~> (char | eol)
def surrounded(left: String, right: String)(inner: Parser[String] = failure("")) =
left ~> many(inner | anythingBut(right)) <~ right ^^ { left+_+right }
```

Here are the parsers for the comment opening and closing braces.
Expand All @@ -41,19 +43,6 @@ They return the offset of the braces, which will be useful later.
def commentEnd = spaces <~ lang.comment.end
```

Using `escapedCode` parser we can ignore escaped closing
comment brace inside of a comment.

_Note:_ the only limitation is that you cannot use an escaped
block of code with a closing comment brace inside of a
comment _with margin_.

```scala
def escapedCodeWith(esc: String) = esc ~> many(anythingBut(esc)) <~ esc ^^ { esc+_+esc }
def escapedCode = escapedCodeWith("```") |
escapedCodeWith("`")
```

When parsing block comments, we care about indentation and this is the
only complex part of this code.

Expand All @@ -73,11 +62,16 @@ _Note:_ you can use space as a delimiter, just put _two_ spaces after the
opening comment brace and remember to indent the following lines to the
same level. See this comment in the source for example.

_Note:_ you cannot use an escaped block of code with a closing comment brace
inside of a comment with margin.

```scala
def comment: Parser[Comment] = {
import java.util.regex.Pattern.quote

def inner = escapedCode | anythingBut(commentEnd | eol)
def innerCode = surrounded("```", "```")() | surrounded("`", "`")()
def innerComment: Parser[String] = surrounded(lang.comment.start, lang.comment.end)()
def inner = innerCode | innerComment | anythingBut(commentEnd | eol)

commentStart >> { offset => (
spaces ~> many(inner) <~ commentEnd // there is only one line
Expand All @@ -98,14 +92,19 @@ same level. See this comment in the source for example.

When parsing code blocks we should remember, that it
can contain a comment-opening brace inside of a string.
So code is just strings or anything but comment.

Also block comments may appear isinde of inline comments
(which we treat as code).

```scala
def str: Parser[String] =
("\"\"\"" | "\"") >> { q => many("\\\"" | anythingBut(q)) <~ q ^^ { q+_+q } }
def str: Parser[String] =
surrounded("\"\"\"", "\"\"\"")() | // three quotes string
surrounded("\"", "\"")("\\\"") // normal string may contain escaped quote

def lineComment: Parser[String] = surrounded(lang.comment.line, "\n")()

def code: Parser[Code] =
many1(str | anythingBut(emptyLine.* ~ commentStart)) ^^ Code
many1(str | lineComment | anythingBut(emptyLine.* ~ commentStart)) ^^ Code
```

Finally, we parse source as a list of chunks and
Expand Down
32 changes: 17 additions & 15 deletions src/main/scala/LiteratorParsers.scala
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,8 @@ case class LiteratorParsers(val lang: Language) extends RegexParsers {
def many (p: => Parser[String]): Parser[String] = p.* ^^ (_.mkString)
def emptyLine: Parser[String] = """^[ \t]*""".r ~> eol
def anythingBut[T](p: => Parser[T]): Parser[String] = guard(not(p)) ~> (char | eol)
def surrounded(left: String, right: String)(inner: Parser[String] = failure("")) =
left ~> many(inner | anythingBut(right)) <~ right ^^ { left+_+right }


/* Here are the parsers for the comment opening and closing braces.
Expand All @@ -32,16 +34,6 @@ case class LiteratorParsers(val lang: Language) extends RegexParsers {
def commentStart = spaces <~ lang.comment.start ^^ { _ + " "}
def commentEnd = spaces <~ lang.comment.end

/*` Using `escapedCode` parser we can ignore escaped closing
` comment brace inside of a comment.
`
` _Note:_ the only limitation is that you cannot use an escaped
` block of code with a closing comment brace inside of a
` comment _with margin_.
*/
def escapedCodeWith(esc: String) = esc ~> many(anythingBut(esc)) <~ esc ^^ { esc+_+esc }
def escapedCode = escapedCodeWith("```") |
escapedCodeWith("`")

/* When parsing block comments, we care about indentation and this is the
only complex part of this code.
Expand All @@ -61,11 +53,16 @@ case class LiteratorParsers(val lang: Language) extends RegexParsers {
_Note:_ you can use space as a delimiter, just put _two_ spaces after the
opening comment brace and remember to indent the following lines to the
same level. See this comment in the source for example.
_Note:_ you cannot use an escaped block of code with a closing comment brace
inside of a comment with margin.
*/
def comment: Parser[Comment] = {
import java.util.regex.Pattern.quote

def inner = escapedCode | anythingBut(commentEnd | eol)
def innerCode = surrounded("```", "```")() | surrounded("`", "`")()
def innerComment: Parser[String] = surrounded(lang.comment.start, lang.comment.end)()
def inner = innerCode | innerComment | anythingBut(commentEnd | eol)

commentStart >> { offset => (
spaces ~> many(inner) <~ commentEnd // there is only one line
Expand All @@ -85,13 +82,18 @@ case class LiteratorParsers(val lang: Language) extends RegexParsers {

/*. When parsing code blocks we should remember, that it
. can contain a comment-opening brace inside of a string.
. So code is just strings or anything but comment.
.
. Also block comments may appear isinde of inline comments
. (which we treat as code).
*/
def str: Parser[String] =
("\"\"\"" | "\"") >> { q => many("\\\"" | anythingBut(q)) <~ q ^^ { q+_+q } }
def str: Parser[String] =
surrounded("\"\"\"", "\"\"\"")() | // three quotes string
surrounded("\"", "\"")("\\\"") // normal string may contain escaped quote

def lineComment: Parser[String] = surrounded(lang.comment.line, "\n")()

def code: Parser[Code] =
many1(str | anythingBut(emptyLine.* ~ commentStart)) ^^ Code
many1(str | lineComment | anythingBut(emptyLine.* ~ commentStart)) ^^ Code


/*- Finally, we parse source as a list of chunks and
Expand Down

0 comments on commit d620362

Please sign in to comment.