
string parser (and possibly others internally using consume_while) force unnecessary stream reads #67

Open
dario23 opened this issue Jul 16, 2017 · 1 comment


dario23 commented Jul 16, 2017

problem

the chomp::parsers::string parser (and possibly others internally using consume_while) might force unnecessary stream reads. example code:

#[macro_use]
extern crate chomp;

use chomp::prelude::*;
use chomp::buffer::{Source, Stream};

use std::net::TcpStream;


fn main() {
    let tcp = TcpStream::connect("faumail.fau.de:143").unwrap();
    let mut src = Source::new(tcp);

    // IMAP lines end in b"\r\n", so the real text is everything up to b'\r',
    // but we have to read the line ending nonetheless before reading any future stuff
    let p = src.parse(parser!{take_till(|c| c == b'\r') <* string(b"\r\n")});
    println!("{:?}", p);
}

expected output: Ok(<some bytes from the imap server welcome line>)

actual output: Err(Retry)

cause

the string parser (src/parsers.rs:378) uses consume_while(f), which first reads the next token from the input stream and only afterwards inspects it (using f) to decide whether to consume it. note this is not a bug in consume_while; that is its perfectly fine, expected behaviour. the problem with using it the way string(s) currently does: after len(s) tokens have been consumed, the parser could already return successfully, but consume_while first waits for one more token to call its decider function on (which then determines that len(s) tokens have already been matched and tells consume_while to quit). in some cases that extra token forces a read on the underlying stream even though the answer is already clear.
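to make the off-by-one concrete, here is a hypothetical simulation of the token-demand pattern described above (not chomp's actual internals): because the token is fetched before the decider runs, matching an n-byte string demands n+1 tokens from the stream.

```rust
// hypothetical model, not chomp's real code: consume_while fetches the
// next token first and only then asks the decider whether to keep it
fn tokens_demanded_by_consume_while(pattern: &[u8]) -> usize {
    let mut matched = 0;
    let mut demanded = 0;
    loop {
        demanded += 1; // a token is requested from the stream...
        if matched >= pattern.len() {
            // ...before the decider can say "stop, we already have len(s)"
            break;
        }
        matched += 1;
    }
    demanded
}

fn main() {
    // matching the 2-byte b"\r\n" demands 3 tokens, hence Err(Retry)
    // when the server has sent exactly one line and nothing more
    assert_eq!(tokens_demanded_by_consume_while(b"\r\n"), 3);
}
```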

solution

i wrote a (very hackish) fix for the string parser at https://github.com/dario23/chomp/tree/fix_string but (without having checked in depth) i expect more parsers to be affected. a more exhaustive fix would probably include adding consume_while_max_n(f, usize).
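for illustration, here is a sketch of the proposed consume_while_max_n(f, n) semantics on a plain byte slice (the name and slice-based signature are my suggestion, not chomp's Input API): consume tokens matching f, but never look at more than n tokens, so hitting the cap cannot force another read from the underlying stream.

```rust
// consume at most n tokens matching f; stops at the cap without
// inspecting (or demanding) the token that follows
fn consume_while_max_n<F>(input: &[u8], f: F, n: usize) -> (&[u8], &[u8])
where
    F: Fn(u8) -> bool,
{
    let mut i = 0;
    while i < n && i < input.len() && f(input[i]) {
        i += 1;
    }
    (&input[..i], &input[i..]) // (consumed, remainder)
}

fn main() {
    // consuming b"\r\n" with a cap of 2 never touches the b'N' that follows
    let (consumed, rest) =
        consume_while_max_n(b"\r\nNEXT", |c| c == b'\r' || c == b'\n', 2);
    assert_eq!(consumed, b"\r\n");
    assert_eq!(rest, b"NEXT");
}
```

with this, string(s) could cap the decider at s.len() matched tokens and return as soon as the last byte of s has been consumed.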

i'd be happy to propose changes and submit a PR, but only after hearing your opinion on the matter :-)

dario23 pushed a commit to dario23/chomp that referenced this issue Jul 16, 2017

dario23 commented Jul 16, 2017

afterthought: with some (minor?) api change and test breakage, i think this specific instance could also be solved by making string(s) do something roughly like this (not actual code, just illustration):

fn string(i, s) {
    let bytes = i.take(len(s))
    if s == bytes {
        i.ret(bytes)
    } else {
        i.err(expected(something))
    }
}

where the open questions would be what the something should be, and whether the different number of tokens consumed from the source on a mismatch (currently everything up to the mismatching token; here, nothing at all) makes any difference. with some luck the current behaviour is mostly preserved anyway, so (even though it's still an api-breaking change) it would not need a lot of actual code changes.
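a runnable slice-based sketch of that take-then-compare idea (chomp's real string works against its Input trait; the names and error payloads here are illustrative): it requests exactly s.len() tokens and no more, and on a mismatch consumes nothing.

```rust
// take len(s) bytes, then compare against s in one shot
fn string_via_take<'a>(
    input: &'a [u8],
    s: &[u8],
) -> Result<(&'a [u8], &'a [u8]), &'static str> {
    if input.len() < s.len() {
        return Err("incomplete"); // a buffered stream would map this to Retry
    }
    let (bytes, rest) = input.split_at(s.len());
    if bytes == s {
        Ok((bytes, rest)) // (matched bytes, remaining input)
    } else {
        Err("expected") // the open question: what should the error carry?
    }
}

fn main() {
    assert!(string_via_take(b"\r\nrest", b"\r\n").is_ok());
    assert!(string_via_take(b"xx", b"\r\n").is_err());
}
```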

dario23 pushed a commit to dario23/chomp that referenced this issue Jul 23, 2017
dario23 pushed a commit to dario23/chomp that referenced this issue Oct 31, 2017
dario23 pushed a commit to dario23/chomp that referenced this issue Oct 31, 2017