-
Notifications
You must be signed in to change notification settings - Fork 841
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanup CSV WriterBuilder, Default to AutoSI Second Precision (#4735) #4909
Conversation
@@ -265,48 +248,91 @@ impl WriterBuilder { | |||
} | |||
|
|||
/// Set whether to write headers | |||
#[deprecated(note = "Use Self::with_header")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes it consistent with https://docs.rs/arrow-csv/latest/arrow_csv/reader/struct.Format.html#method.with_header
const DEFAULT_DATE_FORMAT: &str = "%F"; | ||
const DEFAULT_TIME_FORMAT: &str = "%T"; | ||
const DEFAULT_TIMESTAMP_FORMAT: &str = "%FT%H:%M:%S.%9f"; | ||
const DEFAULT_TIMESTAMP_TZ_FORMAT: &str = "%FT%H:%M:%S.%9f%:z"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See https://docs.rs/chrono/latest/chrono/format/strftime/index.html
These are just the definitions for an RFC3339 representation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice cleanup. Thank you @tustvold
arrow-csv/src/writer.rs
Outdated
@@ -97,26 +92,14 @@ pub struct Writer<W: Write> { | |||
/// Is the beginning-of-writer | |||
beginning: bool, | |||
/// The value to represent null entries |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// The value to represent null entries | |
/// The value to represent null entries. If not specified, [`DEFAULT_NULL_VALUE`] is used |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might help to add information about the default values for the formats above as well
@@ -207,9 +190,9 @@ impl<W: Write> RecordBatchWriter for Writer<W> { | |||
#[derive(Clone, Debug)] | |||
pub struct WriterBuilder { | |||
/// Optional column delimiter. Defaults to `b','` | |||
delimiter: Option<u8>, | |||
delimiter: u8, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the idea that a u8
is more space efficient than Option<u8>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more just unnecessary 😅
@@ -53,48 +53,6 @@ fn test_export_csv_timestamps() { | |||
} | |||
drop(writer); | |||
|
|||
let left = "c1,c2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this test removed?
I see the output is identical in the two cases -- is there still coverage showing that `
arrow_csv::WriterBuilder::new()
.with_rfc3339()
Produces the default values?
Perhaps given .with_rfc3339()
is the default, maybe it should should be deprecated too 🤔 ?
Which issue does this PR close?
Closes #4735
Closes #4906
Rationale for this change
Cleans up
WriterBuilder
to not default the format strings, this not only saves allocations, but also allows using the fast formatting path for RFC3339 formatting.What changes are included in this PR?
Are there any user-facing changes?
This changes timestamps to default to AutoSI. I'm not sure this really counts as a breaking change, any well-behaved parser should handle this