Fixed CSV read failure involving blank lines in RFC4180 mode. (#2434)
* Ensure we can read blank lines in CSV files.

There was a problem handling blank lines inside a quoted
CSV column when reading in RFC4180 mode.
After fetching the next line, the reader assumed it held
at least 1 character instead of going back to read more
lines. Now, if the line we read contains 0 characters, it
immediately goes back to read more lines.

I did not notice this before because the test did not
use RFC4180 CSV files for input or output.

* Added a test for quotes in quoted CSV.

This adds a test to make sure we can process
quotes within a quoted column. A quote inside a quoted
column must be doubled (written as two double-quotes).
strRM authored Oct 24, 2023
1 parent ab6c00d commit 7353751
Showing 15 changed files with 31 additions and 5 deletions.
6 changes: 6 additions & 0 deletions src/include/souffle/io/ReadStreamCSV.h
@@ -201,6 +201,12 @@ class ReadStreamCSV : public ReadStream {
                 pos = 0;
                 end = line.length();
             }
+            if (pos == end) {
+                // this means we've got a blank line and we need to read
+                // more
+                continue;
+            }
+
             char c = line[pos++];
             if (c == '"' && (pos < end) && line[pos] == '"') {
                 // two double-quote => one double-quote
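The hunk above shows only a few lines of the reader loop. As a rough, self-contained sketch of how such a guard behaves, the code below reads one quoted field that spans several physical lines. `readQuotedField` and `checkBlankLineExample` are hypothetical names introduced for illustration, the end-of-input handling is an assumption, and the loop is not Souffle's actual implementation; only `line`, `pos`, and `end` mirror the hunk.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper, not Souffle's actual code. Reads the rest of a quoted
// field; `pos` must point just past the opening quote in `line`.
std::string readQuotedField(std::istream& in, std::string& line, std::size_t& pos) {
    std::string field;
    std::size_t end = line.length();
    while (true) {
        if (pos == end) {
            // The field continues on the next physical line.
            if (!std::getline(in, line)) {
                break;  // assumption: stop if input ends before the closing quote
            }
            field.push_back('\n');
            pos = 0;
            end = line.length();
        }
        if (pos == end) {
            // Blank line inside the quotes: nothing to consume, read more.
            continue;
        }
        char c = line[pos++];
        if (c == '"' && pos < end && line[pos] == '"') {
            field.push_back('"');  // two double-quotes => one double-quote
            ++pos;
        } else if (c == '"') {
            break;  // closing quote
        } else {
            field.push_back(c);
        }
    }
    return field;
}

// Replays the second column of tests/semantic/load12/A.csv.
void checkBlankLineExample() {
    std::string line = "\"FOO\",\"Line1";
    std::size_t pos = 7;  // just past the quote that opens the second column
    std::istringstream rest("\nLine3.\n\"\n");
    assert(readQuotedField(rest, line, pos) == "Line1\n\nLine3.\n");
}
```

Without the second `pos == end` check, the sketch would index `line[pos]` on an empty string right after fetching a blank line, which is the situation the commit message describes.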
2 changes: 2 additions & 0 deletions tests/semantic/CMakeLists.txt
@@ -118,6 +118,8 @@ positive_test(load8)
 positive_test(load9)
 negative_test(load10)
 positive_test(load11)
+positive_test(load12)
+positive_test(load13)
 positive_test(load_adt)
 positive_test(load_adt2)
 positive_test(load_adt3)
2 changes: 1 addition & 1 deletion tests/semantic/load11/A.csv
@@ -1,3 +1,3 @@
 "foo
-bar", nothing , "one
+bar","nothing","one
 two"
2 changes: 1 addition & 1 deletion tests/semantic/load11/facts/A.facts
@@ -1,3 +1,3 @@
 "foo
-bar", nothing , "one
+bar",nothing,"one
 two"
6 changes: 3 additions & 3 deletions tests/semantic/load11/load11.dl
@@ -1,4 +1,4 @@
-.decl A(x:symbol)
-.input A()
-.output A()
+.decl A(x:symbol,y:symbol,z:symbol)
+.input A(rfc4180=true)
+.output A(rfc4180=true)

4 changes: 4 additions & 0 deletions tests/semantic/load12/A.csv
@@ -0,0 +1,4 @@
+"FOO","Line1
+
+Line3.
+"
4 changes: 4 additions & 0 deletions tests/semantic/load12/facts/A.facts
@@ -0,0 +1,4 @@
+FOO,"Line1
+
+Line3.
+"
4 changes: 4 additions & 0 deletions tests/semantic/load12/load12.dl
@@ -0,0 +1,4 @@
+.decl A(x:symbol, y:symbol)
+.input A(rfc4180=true)
+.output A(rfc4180=true)
+
Empty file.
Empty file.
1 change: 1 addition & 0 deletions tests/semantic/load13/A.csv
@@ -0,0 +1 @@
+foo"bar"
1 change: 1 addition & 0 deletions tests/semantic/load13/facts/A.facts
@@ -0,0 +1 @@
+"foo""bar"""
4 changes: 4 additions & 0 deletions tests/semantic/load13/load13.dl
@@ -0,0 +1,4 @@
+.decl A(x:symbol)
+.input A(rfc4180=true)
+.output A()
+
Empty file.
Empty file.
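The load13 files pair the plain value `foo"bar"` (A.csv) with its RFC4180-quoted form `"foo""bar"""` (facts/A.facts). A minimal sketch of that quoting rule follows, assuming the field is always wrapped in double quotes; `quoteField` and `checkQuoteExample` are hypothetical names for illustration, not functions from Souffle.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of Souffle: wrap a field in double quotes
// and double every double-quote character found inside it.
std::string quoteField(const std::string& field) {
    std::string out = "\"";
    for (char c : field) {
        if (c == '"') {
            out += "\"\"";  // one double-quote => two double-quotes
        } else {
            out += c;
        }
    }
    out += '"';
    return out;
}

// Reproduces the pairing used by tests/semantic/load13.
void checkQuoteExample() {
    assert(quoteField("foo\"bar\"") == "\"foo\"\"bar\"\"\"");
}
```

Reading reverses the mapping: inside a quoted column, each pair of double-quotes collapses to one, as the `two double-quote => one double-quote` comment in ReadStreamCSV.h notes.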
