Add appending support for Delimited files #3573
c980a79 to 4f8aca5
3736803 to 5d3ae84
00242dd to 9abbb56
nits
## PRIVATE
   Reads the beginning of the file to detect the existing headers and column
   count.
detect_headers : File -> File_Format.Delimited -> Detected_Headers
It kinda feels like this should use a `Maybe` and be empty instead of using a separate new marker type `No_Data`.
Hmm, I thought it made sense to create a domain-specific ADT for all the cases here. But your solution also sounds good to me, so I can switch to it. I think I'd use `Nothing`, as we know the result would be either `Nothing` or one of the two other ADT cases. Using `Maybe` seems like overkill to me because I want to represent three states, not the presence or absence of two states.
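To make the "three states" point concrete, here is a rough Java sketch, not the PR's Enso code. The case names `Existing` and `NoHeaders`, their fields, and the `describe` helper are all hypothetical; the thread only says there are two ADT cases plus an "empty" state. Java's `null` stands in for Enso's `Nothing` purely for illustration.

```java
import java.util.List;

// Hypothetical illustration of a three-state detection result.
final class DetectedHeadersSketch {
    /** Marker interface for the two non-empty cases (names are assumptions). */
    interface DetectedHeaders {}

    /** The file starts with a header row. */
    record Existing(List<String> columnNames) implements DetectedHeaders {}

    /** The file has data but no header row; only the column count is known. */
    record NoHeaders(int columnCount) implements DetectedHeaders {}

    /** Handles all three states; null plays the role of Enso's Nothing (empty file). */
    static String describe(DetectedHeaders detected) {
        if (detected == null) {
            return "empty file";
        }
        if (detected instanceof Existing e) {
            return "headers: " + String.join(", ", e.columnNames());
        }
        if (detected instanceof NoHeaders n) {
            return "no headers, " + n.columnCount() + " columns";
        }
        throw new IllegalArgumentException("Unknown case: " + detected);
    }
}
```

Wrapping the two cases in an optional type would express the same three states, but with an extra layer to unwrap at every use site.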
should_write_headers headers = case headers of
    True -> True
    Infer -> True
    False -> False
should_write_headers headers = case headers of
    Infer -> True
    _ -> headers
?
Yeah, well it can also be:
case headers of
    False -> False
    _ -> True
but I wanted to make it explicit which of the three options maps to which result. IMO the versions where the other two options are collapsed are slightly less readable: it is less clear which options map to which.
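For readers more at home in Java, the trade-off discussed here can be sketched as follows. This is only an analogue of the Enso helper: the `HeaderSetting` enum and both method names are hypothetical and do not appear in the PR.

```java
// Hypothetical Java analogue of the Enso `should_write_headers` helper.
final class HeaderPolicy {
    /** Three-state header setting, mirroring True / Infer / False in the Enso code. */
    enum HeaderSetting { TRUE, INFER, FALSE }

    /** Explicit mapping: each of the three options is listed next to its result. */
    static boolean shouldWriteHeaders(HeaderSetting headers) {
        switch (headers) {
            case TRUE:
                return true;
            case INFER:
                return true;
            case FALSE:
                return false;
            default:
                throw new IllegalArgumentException("Unknown setting: " + headers);
        }
    }

    /** Collapsed mapping: shorter, but the reader has to work out what "everything else" covers. */
    static boolean shouldWriteHeadersCollapsed(HeaderSetting headers) {
        return headers != HeaderSetting.FALSE;
    }
}
```

Both variants return the same result for every setting; the difference is purely in how visible the mapping is to the reader.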
std-bits/table/src/main/java/org/enso/table/read/DelimitedReader.java
if (effectiveColumnNames == null) {
  detectHeaders();
}
Looks like `ensureHeadersDetected` could be called here instead for DRY?
skipFirstRows();
Row firstRow = internalReadNextRow();
if (firstRow == null) {
  effectiveColumnNames = new String[0];
Shouldn't this return `null` according to the comment on `getDefinedColumnNames`?
`effectiveColumnNames` and `definedColumnNames` are two distinct concepts.
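A simplified sketch of how the two concepts might differ, and how a single `ensureHeadersDetected` guard could back both getters. This is not the actual `DelimitedReader`. It assumes that defined names are the headers actually present in the file (`null` when there are none) and that effective names are what the resulting columns end up being called. The in-memory `rows` array, the `firstRowIsHeader` flag, and the `Column_N` naming are hypothetical.

```java
// Hypothetical, in-memory stand-in for the reader's header detection.
final class HeaderDetectionSketch {
    private final String[][] rows;          // stand-in for rows parsed from the file
    private final boolean firstRowIsHeader; // stand-in for the headers setting
    private String[] definedColumnNames;    // stays null when the file defines no headers
    private String[] effectiveColumnNames;  // null only until detection has run

    HeaderDetectionSketch(String[][] rows, boolean firstRowIsHeader) {
        this.rows = rows;
        this.firstRowIsHeader = firstRowIsHeader;
    }

    /** Single guard shared by all callers, so the null check is not repeated. */
    private void ensureHeadersDetected() {
        if (effectiveColumnNames == null) {
            detectHeaders();
        }
    }

    private void detectHeaders() {
        if (rows.length == 0) {
            // Empty input: zero effective columns, and no defined headers.
            effectiveColumnNames = new String[0];
            return;
        }
        String[] first = rows[0];
        if (firstRowIsHeader) {
            definedColumnNames = first.clone();
            effectiveColumnNames = first.clone();
        } else {
            // Generated placeholder names; the naming scheme is an assumption.
            effectiveColumnNames = new String[first.length];
            for (int i = 0; i < first.length; i++) {
                effectiveColumnNames[i] = "Column_" + (i + 1);
            }
        }
    }

    /** Returns the headers found in the file, or null if none were defined. */
    String[] getDefinedColumnNames() {
        ensureHeadersDetected();
        return definedColumnNames;
    }

    /** Returns the names the columns will carry; never null after detection. */
    String[] getEffectiveColumnNames() {
        ensureHeadersDetected();
        return effectiveColumnNames;
    }
}
```

Under these assumptions an empty file yields an empty effective array while the defined names remain `null`, which is how both the code and the comment on `getDefinedColumnNames` can hold at once.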
Minor comments but looks good to me.
private record Row(long lineNumber, String[] cells) {}

private final Deque<Row> pendingRows = new ArrayDeque<>(2);
This can be a Stack
How so? I need it to be FIFO, but stack is FILO, no? Or am I missing something?
Sorry, I meant a queue. We don't need the double-ended version.
Ah, but I can make the type `Queue` while still instantiating the `ArrayDeque`. That sounds good.
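A minimal sketch of that resolution: the field is declared as `Queue` so only FIFO operations are visible, while `ArrayDeque` remains the backing implementation. The wrapper class and the `enqueue` and `next` method names are hypothetical; only `Row` and `pendingRows` come from the diff.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical wrapper showing a Queue-typed field backed by an ArrayDeque.
final class PendingRowsSketch {
    record Row(long lineNumber, String[] cells) {}

    // Declared as Queue: callers can only add at the tail and remove from the head.
    private final Queue<Row> pendingRows = new ArrayDeque<>(2);

    /** Appends a row at the tail of the queue. */
    void enqueue(Row row) {
        pendingRows.add(row);
    }

    /** Removes and returns the oldest pending row, or null if none is pending. */
    Row next() {
        return pendingRows.poll();
    }
}
```

Rows come back in the order they were enqueued, which is the FIFO behaviour the reader needs.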
distribution/lib/Standard/Table/0.0.0-dev/src/Internal/Delimited_Writer.enso
to run it without reading the whole file - that will be used by Delimited Append.
9abbb56 to 12b0e42
Pull Request Description
Implements https://www.pivotaltracker.com/story/show/182309839
Important Notes
Checklist
Please include the following checklist in your PR:
Scala, Java, and Rust style guides.
`./run ide dist` and `./run ide watch`.