Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better CSV support #53

Merged
merged 15 commits into from
Oct 29, 2014
Merged

Better CSV support #53

merged 15 commits into from
Oct 29, 2014

Conversation

tixxit
Copy link
Owner

@tixxit tixxit commented Oct 27, 2014

This adds a new CSV loader/exporter for Framian. This supports a feature rich set of formatting options - including row-delimiters, empty/invalid markers, and quote escapes. It also allows auto-detection of all or some of the formatting options (excluding header detection, which was flakey). The parser itself parses in chunks and can read the data from a stream of bytes, rather than all-at-once.

tixxit added 14 commits July 22, 2014 17:09
This was added to support converting data loaded from a CSV as Strings
to numeric data. However, the new CSV parser just creates 2 columns, one
for numeric data and one for textual data, then `orElse`s them together.
Conflicts:
	framian/src/main/scala/framian/Frame.scala
	framian/src/main/scala/framian/csv/CsvRowExtractor.scala
Conflicts:
	framian/src/main/scala/framian/FrameUtils.scala
	framian/src/main/scala/framian/UntypedColumn.scala
	framian/src/main/scala/framian/csv/CsvRowExtractor.scala
/**
* A single row in a CSV file.
*/
final class CsvRow(val cells: Vector[CsvCell]) extends AnyVal {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly used in a Vector of Either, so I guess it will end up being boxed most of the time. Do you think we could make it a type alias instead?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I'd still prefer a value class over a type alias. Makes implicits and whatnot easier and is a much more real. Is there any reason to make it a type alias?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't matter much to me, I just thought that you wanted to avoid boxing as much as possible, and this AnyVal will be end up being boxed most of the time, based on what I see in the rest of the code.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yes, I see :) Argh... yeah, I'm not sure the best solution. I'm not terribly worried about boxing here, since so much boxing already goes into making a row that I don't think it would make much of a difference. I'll ponder, but perhaps as a separate issue.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanted to check the intent :-)

@ghost
Copy link

ghost commented Oct 29, 2014

Looks very good to me. Feel free to merge when you have figured the merge out :-)

tixxit added a commit that referenced this pull request Oct 29, 2014
@tixxit tixxit merged commit efb5d77 into master Oct 29, 2014
@tixxit tixxit deleted the topic/csv branch October 30, 2014 13:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant