datetime_from_str is too lenient, doesn't roundtrip #548

koivunej · 2021-03-22T15:00:22Z

For another crate cargo-fuzz found me the input "4\t3\t\u{c}\t\t3\t\u{c}\t5\t3\t60Z" which is parsed successfully with Utc.datetime_from_str(s, "%Y%m%d%H%M%S%.3fZ"). Minimal test case for version 0.4.19:

#[test]
fn fuzzed_roundtrip() {
    let fmt = "%Y%m%d%H%M%S%.3fZ";
    let input = "4\t3\t\u{c}\t\t3\t\u{c}\t5\t3\t60Z";
    let ts = Utc.datetime_from_str(input, fmt).unwrap();
    let fmted = ts.format(fmt).to_string(); // 00040303050360.000Z
    assert_eq!(input, fmted);
}

There are certainly more of these with fuzzing target:

#![no_main]
use libfuzzer_sys::fuzz_target;

use chrono::{TimeZone, Utc};

static FMT: &str = "%Y%m%d%H%M%S%.3fZ";

fuzz_target!(|data: &[u8]| {
    if let Ok(s) = std::str::from_utf8(data) {
        if let Ok(ts) = Utc.datetime_from_str(s, FMT) {
            let rt = ts.format(FMT).to_string();

            assert_eq!(s, rt);
        }
    }
});

As far as I understand by looking at the documentation, "4\t3" should not get through "%Y%m" as "%Y" should be zero padded 4 positive digits. Overall I think the format string in question should only allow input which is formatted exactly same, though I read that the %.xf might be lenient for the number of zeroes. I'll try to see if I can offer any help with this. Could be that what I am after is out of scope of chrono.

The text was updated successfully, but these errors were encountered:

koivunej · 2021-03-22T15:57:03Z

It looks like the \t and \u{c} are gotten rid of by str::trim_left in

chrono/src/format/parse.rs

Line 325 in 3467172

s = s.trim_left();

~~Not sure why for just %Y the "4" is not accepted but with the larger format string it works ok.~~ Nevermind the NotEnough comes from the to_naive_{date,time} meaning the fmt string is insufficient.

koivunej · 2021-03-23T08:48:08Z

Looking with git blame the trim_left has been with us for 6 years so I guess it aims to give a more lenient than strict parsing, same goes with accepting 1..=4 digits for %Y instead. Still given the age of the s = s.trim_left() being the first step of Item::Numeric it does feel a bit strange as Item::Space(_) seems to exist only for this reason. I wonder if use cases where it is wanted to accept more lenient whitespace, shouldn't this Item::Space be used for that reason?

Just tested with against my local glibc (2.31-0ubuntu9.2) how does strptime react to "leading whitespace": it just skips over, and doesn't mind about trailing characters and overall is surprisingly fuzzy. I did not know this, which must be the origin for the whitespace behaviour.

For my case, I will probably match it to a regex before passing it over to Utc.datetime_from_str with some added machinery to same length string output under #[cfg(fuzzing)] or similar.

I still think the docs would probably need to changed at https://docs.rs/chrono/0.4.19/chrono/format/strftime/index.html:

%Y is specified as The full proleptic Gregorian year, zero-padded to 4 digits. whereas it will be parsed with or without leading zeros, and can be less than 4 digits
%Y also refers to a footnote %Y: Negative years are allowed in formatting but not in parsing. whereas in code sign prefixed values are accepted, and having a sign prefix makes the accepted digit count to be usize::MAX (I can't yet understand why) -- could it be that some later part in parsing denies the negative years?

Apologies if these are already collected somewhere.

koivunej · 2021-03-25T07:32:21Z

I was looking at the failures if I were to comment the s = s.trim_left() out and found this

chrono/src/naive/datetime.rs

Line 2417 in 3467172

// `s` and `s_` may differ, but `s.parse()` and `s_.parse()` must be same

Sadly it does not make for a good property to fuzz given the negative %Y year parsing to up to unlimited digits or %.3f accepting more than 3 digits, which would be truncated using the format string "%Y%m%d%H%M%S%.3fZ" and the two parsed values wouldn't be equal.

jtmoon79 mentioned this issue Mar 21, 2022

overly relaxed whitespace parsing leads to unexpected parsing success (was Local.datetime_from_str errantly ignoring format whitespace) #660

Open

pitdicker added the parsing label Jul 2, 2023

pitdicker mentioned this issue Aug 31, 2023

Breaking change in RFC 3339 parser #1239

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

datetime_from_str is too lenient, doesn't roundtrip #548

datetime_from_str is too lenient, doesn't roundtrip #548

koivunej commented Mar 22, 2021 •

edited

Loading

koivunej commented Mar 22, 2021 •

edited

Loading

koivunej commented Mar 23, 2021 •

edited

Loading

koivunej commented Mar 25, 2021

datetime_from_str is too lenient, doesn't roundtrip #548

datetime_from_str is too lenient, doesn't roundtrip #548

Comments

koivunej commented Mar 22, 2021 • edited Loading

koivunej commented Mar 22, 2021 • edited Loading

koivunej commented Mar 23, 2021 • edited Loading

koivunej commented Mar 25, 2021

koivunej commented Mar 22, 2021 •

edited

Loading

koivunej commented Mar 22, 2021 •

edited

Loading

koivunej commented Mar 23, 2021 •

edited

Loading