Logstash CSV Filter - Quote character parse failure #64
Comments
I too have the same issue!
Could you please show how you work around this issue with gsub? I tried removing quote characters and whitespace (as per issue #44), but it still leads to the same parse failure.
Hi, my issue was that I had extra double quotes inside the field. The issue you referenced has spaces between two fields. I had a similar issue once with a CSV; I ended up using the pandas Python library to parse and index that data. You could do the following and check if it works:
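Roughly along these lines (a minimal sketch using the sample line from this issue rather than your actual file; escapechar is the pandas.read_csv option that handles backslash-escaped quotes):

import io
import pandas as pd

sample = '"102","60","Open","I hope this works out for \\"[email protected]\\""\n'

# escapechar makes the parser treat \" inside a quoted field as a literal quote
# instead of the end of the field; header=None because the sample has no header row.
df = pd.read_csv(io.StringIO(sample), header=None, escapechar="\\")
print(df)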
Open a post on the forum (discuss.elastic.co) and ping me. We can continue over there if needed.
WRT the initial issue, the example doesn't seem to be well-formed CSV: https://csvlint.io/validation/5ae2c74704a9ea0004000048
Also, a CSV linter written in Go only accepts the file with a flag to "try to parse improperly escaped quotes".
The RFC for CSV (RFC 4180) says "a double-quote appearing inside a field must be escaped by preceding it with another double quote" (it's rule 7 in section 2). A non-standard alternative is of course to escape quotes using some other character, usually a backslash.

Other CSV parsers have also faced the dilemma of whether to support these non-standard CSV formats. For example, SuperCSV agonised over it for a while in super-csv/super-csv#14 before eventually adding an option to support it in super-csv/super-csv#103.

I've added this comment to (a) make sure there's a clear statement of the dilemma and (b) subscribe to the issue, because other parts of the Elastic Stack also use CSV now, and it would be nice if there were consistency about which escaping options are supported. Currently the find_file_structure endpoint added in ML 6.5.0 doesn't support non-standard escaping of quotes, but this could be added to find_file_structure in a future release if Logstash and/or Filebeat ever support it.
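For a concrete picture of the two conventions, here is a small sketch using Python's standard csv module (picked purely for illustration; it is not the Ruby parser the Logstash csv filter uses), parsing the sample field in both its RFC-escaped and backslash-escaped forms:

import csv
import io

# RFC 4180 rule 7: a double quote inside a quoted field is escaped by doubling it.
rfc_line = '"102","60","Open","I hope this works out for ""[email protected]"""\n'
print(next(csv.reader(io.StringIO(rfc_line))))

# Non-standard backslash escaping, as in the sample from this issue:
# doublequote=False plus escapechar makes the reader treat \" as a literal quote.
bs_line = '"102","60","Open","I hope this works out for \\"[email protected]\\""\n'
print(next(csv.reader(io.StringIO(bs_line), doublequote=False, escapechar="\\")))

# Both print: ['102', '60', 'Open', 'I hope this works out for "[email protected]"']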
Hi
I have a CSV file and the format is something like this:
"102","60","Open","I hope this works out for \"[email protected]\""
When I parse this using the CSV filter I get the following error:
[2018-01-23T13:11:58,523][WARN ][logstash.filters.csv ] Error parsing csv {:field=>"message", :source=>"\"102\",\"60\",\"Open\",\"I hope this works out for \\\"[email protected]\\\"\"", :exception=>#<CSV::MalformedCSVError: Missing or stray quote in line 1>}
The quote characters seem to be malformed in the error message. I have currently worked around this issue by using gsub before passing the data to the csv filter. Is this a known bug with the csv filter?
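A small sketch (in Python, for illustration) of the kind of substitution such a gsub would do: rewrite \" as "" so the line becomes RFC-compliant before a strict CSV parser sees it. In Logstash this would correspond to a mutate/gsub on the message field ahead of the csv filter.

import csv
import io
import re

raw = '"102","60","Open","I hope this works out for \\"[email protected]\\""'

# Turn backslash-escaped quotes into RFC-style doubled quotes so a strict
# CSV parser no longer reports a missing or stray quote.
normalized = re.sub(r'\\"', '""', raw)
print(next(csv.reader(io.StringIO(normalized))))
# ['102', '60', 'Open', 'I hope this works out for "[email protected]"']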
https://discuss.elastic.co/t/csv-filter-quote-character-parse-failure/116611