whitespace causes parse failure with "Illegal quoting in line" error #44

Open
PhaedrusTheGreek opened this issue Nov 29, 2016 · 6 comments

PhaedrusTheGreek commented Nov 29, 2016

Parsing fails when whitespace appears between quoted fields, or before or after the CSV data.

This seems to happen in all versions of the plugin.

input {
  stdin {}
}

filter {
  csv {
    columns => [ "screentype", "devicetype" ]
  }
}

output {
  stdout {
    codec => rubydebug {}
  }
}

Note the space at the start of the line in the 2nd record, and the spaces between fields in the 3rd record:

[2016-11-29T11:54:39,123][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
"test1","test2"
{
    "@timestamp" => 2016-11-29T16:54:45.725Z,
      "@version" => "1",
    "screentype" => "test1",
       "message" => "\"test1\",\"test2\"",
    "devicetype" => "test2"
}
 "test1","test2"
[2016-11-29T11:54:57,222][WARN ][logstash.filters.csv     ] Error parsing csv {:field=>"message", :source=>" \"test1\",\"test2\"", :exception=>#<CSV::MalformedCSVError: Illegal quoting in line 1.>}
{
    "@timestamp" => 2016-11-29T16:54:56.479Z,
      "@version" => "1",
       "message" => " \"test1\",\"test2\"",
          "tags" => [
        [0] "_csvparsefailure"
    ]
}
"asdf3",   "234"
[2016-11-29T12:00:18,634][WARN ][logstash.filters.csv     ] Error parsing csv {:field=>"message", :source=>"\"asdf3\", \"234\"", :exception=>#<CSV::MalformedCSVError: Illegal quoting in line 1.>}
{
    "@timestamp" => 2016-11-29T17:00:17.933Z,
      "@version" => "1",
       "message" => "\"asdf3\", \"234\"",
          "tags" => [
        [0] "_csvparsefailure"
    ]
}
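
For anyone who wants to reproduce this outside Logstash, here is a minimal sketch. It assumes the filter hands the `message` string to Ruby's standard-library CSV parser more or less unchanged. `try_parse` is a hypothetical helper written for this example, not something from the plugin, and the exact error text may differ between Ruby versions.

```ruby
require "csv"

# Hypothetical helper for this sketch: returns the parsed fields,
# or the exception object if the line is rejected.
def try_parse(line)
  CSV.parse_line(line)
rescue CSV::MalformedCSVError => e
  e
end

# The three inputs from the report above.
samples = [
  '"test1","test2"',     # clean record
  ' "test1","test2"',    # leading space before the first quote
  '"asdf3",   "234"',    # spaces between the delimiter and the next quote
]

samples.each do |line|
  puts "#{line.inspect} -> #{try_parse(line).inspect}"
end
```

The first sample should come back as two fields, while the other two should come back as `CSV::MalformedCSVError` objects, matching the `_csvparsefailure` tags in the log above.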

@PhaedrusTheGreek changed the title from 'Trailing or leading spaces causes parse failure with "Illegal quoting in line" error' to 'whitespace causes parse failure with "Illegal quoting in line" error' on Nov 29, 2016
jordansissel (Contributor) commented Nov 30, 2016

I don't think there's any practical standard for what constitutes "CSV", but my rough understanding was that fields are comma-delimited and quotes are sometimes used around values. Spaces outside of values seem invalid to me.

If I look outward a bit, what should the expected parsing result be for the following probably-not-valid-csv:

  • "hello" "world","foo bar"
  • "hello", "world"

@jordansissel (Contributor)

I tested loading CSV into LibreCalc with a few variants of "one","two","three" with spaces in various places, and they all seem to load successfully.

@jordansissel (Contributor)

After the research above, I agree this is a bug. We use the Ruby standard library CSV parser for this filter, and I don't see any mechanism in the CSV library to make it work with the whitespace-filled data you provided. This means we'll probably have to find (or write) a replacement library. I have no ETA on that effort.

@pavelnikolov

I have the same issue and I have no idea how to fix it.

@SHSauler

I can third this issue. Is there a workaround?

kikaragyozov commented Apr 16, 2021

The original spec (RFC 4180) from 2005 doesn't mention what to do in this case, but a draft from 2016 or so suggests simply trimming any leading or trailing whitespace outside a quoted segment.

Source

@jordansissel can this get a fix now in that direction?
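
To illustrate that trimming rule, here is a rough sketch of a pre-processing step that could run before the line reaches Ruby's CSV parser. `trim_outside_quotes` is a hypothetical helper, not part of logstash-filter-csv or the CSV library. It assumes single-line records, `"` as the quote character and a single-character separator, and it only touches fields that are fully quoted once the surrounding whitespace is removed.

```ruby
require "csv"

# Hypothetical helper (an assumption for this sketch, not plugin code):
# removes whitespace that sits outside a quoted field, leaves the rest alone.
def trim_outside_quotes(line, col_sep = ",")
  fields = []
  current = +""
  in_quotes = false

  # Split on separators that are not inside a quoted segment.
  line.each_char do |ch|
    if ch == '"'
      in_quotes = !in_quotes # an escaped "" toggles twice, so the state is kept
      current << ch
    elsif ch == col_sep && !in_quotes
      fields << current
      current = +""
    else
      current << ch
    end
  end
  fields << current

  trimmed = fields.map do |raw|
    stripped = raw.strip
    if stripped.length >= 2 && stripped.start_with?('"') && stripped.end_with?('"')
      stripped # quoted field: drop the whitespace around the quotes
    else
      raw      # unquoted field: keep it exactly as written
    end
  end

  trimmed.join(col_sep)
end

p CSV.parse_line(trim_outside_quotes(' "test1","test2"'))  # => ["test1", "test2"]
p CSV.parse_line(trim_outside_quotes('"asdf3",   "234"'))  # => ["asdf3", "234"]
```

Ambiguous input such as `"hello" "world","foo bar"` passes through unchanged, so it would still be rejected by the parser rather than silently reinterpreted.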
