Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support custom containers for many functions #1409

Closed
epage opened this issue Sep 27, 2021 · 5 comments · Fixed by #1621
Closed

Support custom containers for many functions #1409

epage opened this issue Sep 27, 2021 · 5 comments · Fixed by #1621
Milestone

Comments

@epage
Copy link
Contributor

epage commented Sep 27, 2021

Prerequisites

Here are a few things you should provide to help me understand the issue:

  • Rust version : 1.55.0
  • nom version : 7.0.0
  • nom compilation features used: default

Test case

I wanted to experiment with porting toml_edit to nom, one thing I especially found missing was custom collection support. If I use many0, it only supports Vec. The naive approach in code is to do

fn ml_body(input: &str) -> IResult<&str, String> {
    many(substr).map(|s| s.join(""))
}

Leading me to allocate a Vec<String> and then merge them. combine defaults to Vec but allows any Extend type, allowing

fn ml_body(input: &str) -> IResult<&str, String> {
    many(substr)
}

which will collect directly into the String, saving on a lot of allocations

I'm also looking at optimizing a section of my parser where collect into a Vec and then turn it into a HashMap with errors by creating a custom type that wraps HashMap with my error type and stores off errors on Extend, allowing me to bypass the Vec allocation.

I could do all of this with fold_ variants but it will be a bit messier.

@Stargateur
Copy link
Contributor

I guess this can be link to #1393.

I think we can use https://doc.rust-lang.org/std/iter/trait.Extend.html to allow this for many.

I could do all of this with fold_ variants but it will be a bit messier.

You can add your own parser for reuse in the waiting, something like that (I'm more used to nom 6 so probably mistake):

fn many0_string(f: impl Parser) -> impl Parser {
  |input| {
    fold_many0(
      f,
      String::new,
      |mut acc: String, item| {
        string.push_str(item);
        acc
      }
    )(input)
  }
}

combine defaults to Vec but allows any Extend type, allowing

What is combine ? I don't follow here.

I'm also looking at optimizing a section of my parser where collect into a Vec and then turn it into a HashMap with errors by creating a custom type that wraps HashMap with my error type and stores off errors on Extend, allowing me to bypass the Vec allocation.

Some sort of partition_result() ?

@epage
Copy link
Contributor Author

epage commented Sep 27, 2021

What is combine ? I don't follow here.

Sorry, combine is another parser combinator.

e.g. https://docs.rs/combine/4.6.1/combine/fn.many.html has the collection set to F: Extend<P::Output> + Default,

Some sort of partition_result() ?

More like try_fold

So speaking of, maybe we should have try_fold_many* :)

@Stargateur
Copy link
Contributor

Stargateur commented Sep 27, 2021

I think we can clearly add a try_fold and I will try update my PR to include extend.

@epage
Copy link
Contributor Author

epage commented Sep 27, 2021

Oh, thanks for looking to include Extend!

I've split out try_fold into #1411.

@epage epage mentioned this issue Sep 28, 2021
9 tasks
@Geal
Copy link
Collaborator

Geal commented Oct 10, 2021

oh, using Extend is a great suggestion! I was wondering about doing that too

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants