CSV to JSON is too fragile #127
Comments
Hey there! Thanks for the report. I'll try to either fix the current code or find a more solid library for conversion. I have to admit this isn't a script I use a lot personally, so it's hard to see its shortcomings. I appreciate the details and will look into the conventions for converting between the two formats.
[Off-Topic, but context: I'm attempting to make a JSON->YAML converter script and noticed this Issue; @IvanMathy's comment about "more solid library" for conversions hits home with the problem I'm facing; external libraries.] The best library for converting CSV to JSON I've found is https://www.papaparse.com/ . I've tried creating a script like this:
But I get the error
Which makes sense: it's not one of the included libraries laid out in your Modules docs here... So, what's the suggested process here? Feel free to "RTFM" with a link :D
For what it's worth: I grabbed the minified version of Papa Parse and pasted it straight into my
If the pattern Boop wants to follow is to have all dependencies in the lib folder, then creating a new file in there for Papa (with its LICENSE, of course) is probably the cleanest way, but before I throw up a PR I wanted to double-check that this is the approach you want.
Original implementation couldn't handle many common "Gotchas" in CSV, including quoted commas IvanMathy#127
Put up a PR that (I THINK) is aligned with the Boop docs:
https://github.com/IvanMathy/Boop/blob/main/Boop/Documentation/CustomScripts.md#performance
The entire PapaParse library was 20k. Not sure how this compares to other "helpers", so I opted to start "Fast and easy". If this is too large, there are components of the library (mostly around file read/write) that we don't need, but removing them piecemeal isn't my idea of a great time (and probably won't save much space considering the maturity of PapaParse), so I left it for now.
Sorry for the double-post, but I just realized that there's a combination of an assumption and a "Limitation" at play around the "header" stuff in this script. First, we're assuming that this data has a header (both in the original and in my fix). There's a "Limitation" (and I use that term VERY loosely because I actually see this as a feature/simplicity) in Boop that you can't pass parameters to the scripts. So... the solutions I see are:
If we went with #1, I'd argue that the 2nd script would just be an optional script, and the default would be
Original implementation couldn't handle many common "Gotchas" in CSV, including quoted commas IvanMathy/Boop#127
The CSV to JSON script is too fragile and breaks on real-world data, such as columns that are double-quoted so they can contain commas, and double-double-quotes used for literal double-quotes.
I had a \r in some of my data, which means the script isn't standardizing to \n line endings.
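To make the conventions above concrete, here is a minimal sketch of a quote-aware CSV parser in plain JavaScript. It is not Boop's implementation and not Papa Parse; `parseCsv` and `rowsToObjects` are hypothetical names used only for illustration. It assumes a comma delimiter, treats \r\n, \r, and \n outside quotes as row endings, keeps line breaks inside quoted fields, and assumes the first row is a header.

```javascript
// Hypothetical helper: split CSV text into rows of string fields.
// Assumes "," as the delimiter and '"' as the quote character.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;
  let i = 0;

  while (i < text.length) {
    const ch = text[i];

    if (inQuotes) {
      if (ch === '"') {
        if (text[i + 1] === '"') {
          // Doubled quote inside a quoted field is one literal quote.
          field += '"';
          i += 2;
          continue;
        }
        // A single quote closes the quoted section.
        inQuotes = false;
        i += 1;
        continue;
      }
      // Commas and line breaks inside quotes are kept as data.
      field += ch;
      i += 1;
      continue;
    }

    if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\r" || ch === "\n") {
      // Treat \r\n as a single row ending.
      if (ch === "\r" && text[i + 1] === "\n") i += 1;
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else {
      field += ch;
    }
    i += 1;
  }

  // Flush the last row when the text does not end with a line break.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

// Hypothetical helper: use the first row as keys (assumes a header row).
function rowsToObjects(rows) {
  const [header, ...records] = rows;
  return records.map((record) =>
    Object.fromEntries(header.map((name, idx) => [name, record[idx] ?? ""]))
  );
}

const sample = 'name,quote\r\n"Smith, J","He said ""hi"""';
console.log(JSON.stringify(rowsToObjects(parseCsv(sample))));
```

A blank line in the input comes back as a row with one empty field, so a real script would probably want to filter those out before building objects.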
Test:
Result:
Expected:
The JSON to CSV script gets similarly confused in that it uses \" for literal double-quotes instead of the CSV convention of double-double-quotes within a quoted column.
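For the JSON to CSV direction, the quoting convention described above can be sketched as follows. This is only an illustration, not the script's actual code: `escapeCsvField` and `toCsv` are hypothetical names, and the sketch assumes the input is an array of flat objects whose keys all match those of the first object.

```javascript
// Hypothetical helper: quote a field only when it needs it, and double
// any embedded double-quotes instead of backslash-escaping them.
function escapeCsvField(value) {
  const text = value === null || value === undefined ? "" : String(value);
  if (/[",\r\n]/.test(text)) {
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

// Hypothetical helper: build CSV text with a header row taken from the
// keys of the first record (assumes every record has the same keys).
function toCsv(records) {
  if (records.length === 0) return "";
  const header = Object.keys(records[0]);
  const lines = [header, ...records.map((r) => header.map((k) => r[k]))];
  return lines.map((line) => line.map(escapeCsvField).join(",")).join("\n");
}

console.log(toCsv([{ name: "Smith, J", quote: 'He said "hi"' }]));
```

Quoting only the fields that contain a comma, quote, or line break keeps simple data readable while still round-tripping through a quote-aware parser.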