Example parsers #14

Open
30 of 59 tasks
Geal opened this issue Feb 26, 2015 · 98 comments

Comments

@Geal
Collaborator

Geal commented Feb 26, 2015

We currently have a few example parsers. In order to test the project and make it useful, other formats can be implemented. Here is a list, if anyone wants to try it:

@thehydroimpulse

I'm writing a Thrift library for Rust that'll use nom for both the IDL and the network protocol, so that can be another example (although in a different repo).

@Geal
Collaborator Author

Geal commented Apr 3, 2015

Nice idea, that will be useful! Please notify me when it is done, I will add a link in this list.

@filipegoncalves
Contributor

This looks interesting. Is anyone actively working on any of these parsers? I'd like to work on a few of these.

@Geal
Collaborator Author

Geal commented Apr 27, 2015

I have some code for a GIF one at https://github.com/Geal/gif.rs but it is hard to test, since the graphical tools in Piston change a lot.

You can pick any of them. Network packets may be the easiest, since they don't require a decompression phase.

I am using the GIF example to see what kind of API can be built over nom. Most of the parsing examples are done as one pass over the data, but often there is some logic on the side, and it is not easy to encode correctly.

@elij

elij commented May 1, 2015

@Geal
Collaborator Author

Geal commented May 5, 2015

@elij this is a great idea! Was it easy to do?

@elij

elij commented May 5, 2015

yup it's a great framework -- though I struggled a bit with eof so I borrowed some code from rust-config (https://github.com/elij/fastq.rs/blob/master/src/parser.rs#L69) -- is there a better solution?

@Geal
Collaborator Author

Geal commented May 5, 2015

yes, eof should be a parser provided by nom, I am just waiting for @filipegoncalves to send a PR 😉
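
For readers following along, an end-of-input check of the kind discussed here can be sketched in plain Rust. This is a std-only illustration, not nom's API and not the code from fastq.rs or rust-config: the names `ParseError`, `ParseResult`, `eof`, `tag`, and `header_only`, as well as the `@SEQ` sample input, are all hypothetical.

```rust
// Hypothetical, std-only sketch; none of these names come from nom.
#[derive(Debug, PartialEq)]
enum ParseError {
    // Input remained where end of input was expected (payload: bytes left).
    TrailingData(usize),
    // The expected literal was not found at the start of the input.
    Tag,
}

// Parsers return the remaining input plus the parsed value.
type ParseResult<'a, T> = Result<(&'a [u8], T), ParseError>;

// Succeeds only when no input is left; consumes nothing.
fn eof(input: &[u8]) -> ParseResult<'_, ()> {
    if input.is_empty() {
        Ok((input, ()))
    } else {
        Err(ParseError::TrailingData(input.len()))
    }
}

// Matches a literal prefix and returns it along with the rest of the input.
fn tag<'a>(expected: &[u8], input: &'a [u8]) -> ParseResult<'a, &'a [u8]> {
    if input.starts_with(expected) {
        let (matched, rest) = input.split_at(expected.len());
        Ok((rest, matched))
    } else {
        Err(ParseError::Tag)
    }
}

// Accepts exactly "@SEQ" and rejects anything that follows it.
fn header_only(input: &[u8]) -> ParseResult<'_, &[u8]> {
    let (rest, matched) = tag(b"@SEQ", input)?;
    let (rest, ()) = eof(rest)?;
    Ok((rest, matched))
}

fn main() {
    println!("{:?}", header_only(b"@SEQ"));
    println!("{:?}", header_only(b"@SEQ extra"));
}
```

Chaining `eof` after the last real parser turns "parsed a valid prefix" into "parsed the whole input", which is the usual reason to want such a combinator.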

@filipegoncalves
Contributor

Hah, sorry for my silence. I've been busy lately. I just sent a PR (#31).

I will be working on one of these example parsers as soon as I get some spare time. There are some great ideas in here!

@Keruspe
Contributor

Keruspe commented May 29, 2015

I might give tar a try

@nelsonjchen
Contributor

Does this check off PCAP?

https://github.com/richo/pcapng-rs

@Geal
Collaborator Author

Geal commented Jun 19, 2015

pcap-ng and pcap are two different formats, right? It seems the consensus now is to move everything to pcap-ng, though.

@TechnoMancer

I will try a FLAC parser, need to add quite a few things for it though.

@badboy
Contributor

badboy commented Jul 17, 2015

ISO8601 is done in https://github.com/badboy/iso8601 (I hope it's mostly correct.)

@Geal
Collaborator Author

Geal commented Jul 17, 2015

ok, it should be up to date. More to come 😄

@sbeckeriv

A WARC file format parser has been released: https://crates.io/crates/warc_parser

@Geal
Collaborator Author

Geal commented Aug 24, 2015

@sbeckeriv great, thanks!

@porglezomp

It might be informative to try parsing the rust grammar with nom, if nobody has yet. In any case, I'd like to see a few programming languages on that list, since that's my use case.

@Geal
Collaborator Author

Geal commented Sep 15, 2015

@porglezomp programming languages examples would definitely be useful, but the Rust grammar might be a bit too much for the first attempt. Which other languages would you like to handle?

@porglezomp

Yeah, I'm aware of the scale problem of Rust. I don't want to write that one, but I think it's a good holy grail for any parser library written in Rust. I'd like to try parsing the Lua grammar first, I think.

I recommend adding to the list:

  • Programming Languages
    • Rust
    • Lua (I'll do this)
    • Python (or some other whitespace significant language)
    • C

@Geal
Collaborator Author

Geal commented Sep 15, 2015

ok, I added them to the list :)

@chriskrycho

You have INI marked as done; do you have a link to it? (I'd love to use this for some tooling I'm hoping to build in 2016; need a good non-trivial example for it, though.)

@badboy
Contributor

badboy commented Nov 16, 2015

@chriskrycho

Thanks very much, @badboy!

@fbernier

I'll try to make the TOML parser very soon.

@Geal
Collaborator Author

Geal commented Nov 16, 2015

Actually, I think I should rewrite that INI parser, now that more convenient combinators are available.
Also, I should really work on that combinator for space separated stuff

@Geal
Collaborator Author

Geal commented Nov 16, 2015

@fbernier great! Please keep me posted!

@l0calh05t

Maybe add a simple example for trailing commas in lists? Python has those, but is quite complex. Can't think of a simple example though.
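
One candidate for such a simple example is a bracketed list of integers with an optional trailing comma. The sketch below is a std-only illustration and does not use nom; the function name `parse_list` and the grammar are assumptions made up for this example.

```rust
// Hypothetical, std-only sketch; it does not use nom.
// Grammar: list = "[" [ number { "," number } [ "," ] ] "]"
// Returns the remaining input and the parsed numbers.
fn parse_list(input: &str) -> Result<(&str, Vec<u32>), String> {
    let mut rest = input
        .strip_prefix('[')
        .ok_or_else(|| "expected '['".to_string())?;
    let mut items = Vec::new();
    loop {
        rest = rest.trim_start();
        // A closing bracket may follow the opening bracket, an item, or a comma.
        if let Some(after) = rest.strip_prefix(']') {
            return Ok((after, items));
        }
        // Take the longest run of ASCII digits as one number.
        let end = rest
            .find(|c: char| !c.is_ascii_digit())
            .unwrap_or(rest.len());
        let number = rest[..end]
            .parse::<u32>()
            .map_err(|e| format!("expected a number: {}", e))?;
        items.push(number);
        rest = rest[end..].trim_start();
        // After an item, only "," or "]" is valid.
        if let Some(after) = rest.strip_prefix(',') {
            rest = after;
        } else if !rest.starts_with(']') {
            return Err("expected ',' or ']'".to_string());
        }
    }
}

fn main() {
    println!("{:?}", parse_list("[1, 2, 3,]"));
    println!("{:?}", parse_list("[1,,2]"));
}
```

The trailing comma is accepted because the loop checks for `]` at the top of every iteration, including right after a comma, while a comma with no preceding item (as in `[,]` or `[1,,2]`) still fails at the number step.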

@johshoff

That IRC example is no longer using nom. The parser was moved into its own repository: https://github.com/Detegr/RBot-parser

@vandenoever
Contributor

A parser for Turtle. It passes the test suite in 15ms.

https://github.com/vandenoever/rome/tree/master/src/io/turtle

@progval
Contributor

progval commented Jun 18, 2018

I wrote a Python parser: https://docs.rs/python-parser/

@idursun

idursun commented Jan 4, 2019

I think the Redis database file format parser is not using nom at all. I couldn't find any reference to nom anywhere.

@nelsonjchen
Contributor

@idursun Maybe it refers to this old branch from a year before the last update to master. https://github.com/badboy/rdb-rs/tree/nom-parser

@saggit

saggit commented Mar 12, 2019

is there any SQL parser?

@naturallymitchell

is there any SQL parser?

it'd seem better to me to import it to an sql engine and interact with that data using Diesel. parsing flat sql files seems very limited.

instead of writing a one-off Rust app to do this, you could add diesel bindings to Torchbear, see jazzdotdev/jazz#85 , then make a Speakeasy library for transporting data from your schema using content model in ContentDB.

then, you could develop a lot further beyond.

@ithinuel
Contributor

@naturallymitchell maybe @saggit was simply looking for something to extract some data from a raw sql dump. Like a one-off log analysis tool. :D

@MarkMcCaskey
Contributor

MarkMcCaskey commented Jun 19, 2019

I made a GameBoy ROM parser with nom5!
https://github.com/MarkMcCaskey/gameboy-rom-parser
https://crates.io/crates/gameboy-rom

It's extremely simple and doesn't do much, but the crate provides a useful abstraction over the metadata of GameBoy ROMs.

I'll add more optional validation functions to it and refactor my emulator's ROM code to use it soon.

edit:
this post is what inspired me to make this

@naturallymitchell

It's extremely simple and doesn't do much, but the crate provides a useful abstraction over the metadata of GameBoy ROMs.

@MarkMcCaskey It could even make sense to refactor it then into a generalized library with config files (like, TOML and YAML, and now SANE). Do you think that'd be too much more work?

@dwerner

dwerner commented Jun 26, 2019

@Geal - I wanted to post my public suffix domain list parser that I wrote a few months back. I couldn't find a performant library that did what I needed, so I grabbed nom and went to work. https://github.com/dwerner/nom-psl

@MarkMcCaskey
Contributor

@naturallymitchell

Do you mean specifying the layout of the bytes as data and creating a dynamic data structure from it? That's an interesting idea, but I don't think it'd be too helpful for my use case -- as I see it, the primary value-add of the gameboy rom parser is the data layer that it exposes, which lets the user get things like the game's title as a string or the exact cartridge type and how much ROM and RAM it has as well-named, plain Rust values.

The parser may be implementable with serde deserialize on a repr(C) struct though, which is kind of the reverse of what you're saying, I think... I'm not familiar enough with how serde-derive handles errors though.

@o0Ignition0o

Just got a 0.0.1 version of an NMEA-0183 parser working using nom 5: https://github.com/YellowInnovation/nmea-0183 . I need to have a look at the docs and guidelines (the code is ugly for now) and refactor it :) I hope to submit a pull request adding a clean version of it to the parsers list soon! :)

@kurotych

kurotych commented Oct 17, 2020

@Geal
Collaborator Author

Geal commented Oct 24, 2020

@armatusmiles thanks, i added it to the list in 2e58a2c

@bionicles

bionicles commented Nov 26, 2020

Please add OpenCypher to the list... a nice way to parse Graph DB queries could enable a wave of innovation in databases. There are zero legit serverless / autoscaling or decentralized graph databases (like you'd get with a CRDT/ORDT backend for an OpenCypher parser). GunJS is fairly close but JavaScript is not ideal for storage IMHO

@OtaK

OtaK commented Feb 15, 2021

Wrote a UBJSON parser w/ nom
Pretty early version with just parsing, but it does the job.

https://github.com/OtaK/ubjson

https://crates.io/crates/ubjson

@NilsIrl
Contributor

NilsIrl commented Aug 14, 2021

There is a PDF parser here: https://github.com/J-F-Liu/lopdf (it requires using the nom_parser feature).

FWIW, with lopdf, the nom parser is much faster than the default parser

@atmnk

atmnk commented Dec 23, 2021

I wrote a tool with its own programming language using nom. here is source repo.

@erihsu
Contributor

erihsu commented Feb 18, 2022

The gds2-parser has been released at https://crates.io/crates/gds2_io. BTW, my pull request is #1497

@manuschillerdev

Would it be feasible to write an ECMAScript/TypeScript parser with nom as well? Or would the scope be too big for that?

@alexrsagen

alexrsagen commented Apr 2, 2022

I have written 2 (public) parsers using nom which may be used as examples:

@edg-l

edg-l commented Jul 5, 2022

I made a bencode parser (the format used by .torrent files), https://github.com/edg-l/nom-bencode/

@LikeLakers2

Hey there! I was wondering why the Rust parser on this list is syn? From what I can tell, syn does not use nom (although it might have in the past).

Since this is a list of examples of parsers built with nom, I don't see why we should be linking to syn here.

@OtaK

OtaK commented Jul 13, 2022

@LikeLakers2

Hey there! I was wondering why the Rust parser on this list is syn? From what I can tell, syn does not use nom (although it might have in the past).

Since this is a list of examples of parsers built with nom, I don't see why we should be linking to syn here.

dtolnay/syn#476

syn was using nom until v0.15, and this issue was created 3 years before syn dropped its usage of nom. That's why it's still linked here.

You're absolutely correct that it should be removed though.

@eatgrass

mdict-parser is a parser library for the .mdx dictionary file format
https://github.com/eatgrass/mdict-parser

@wjwei-handsome

crussmap is a parser library and tool for the .chain file format

https://github.com/wjwei-handsome/crussmap

@ifsheldon

I am a bit surprised that no one has mentioned HTML! I saw nom_html_parser, but it has long been unmaintained.
