Question: Accented characters are replaced with ? ("tofu") when I view my CSV.
Answer:
This is usually a file character encoding issue. The question mark character is displayed when a character isn't encoded correctly (or is being interpreted in the wrong encoding), so the editor doesn't know what character to display. The corrupted characters shouldn't break anything; they just won't display correctly.
However, if you get an ugly error from Jekyll / Ruby when you run "jekyll s", the most likely issue is an incorrectly encoded CSV created by Excel (the encoding “UTF-8 with BOM” will cause errors in Ruby). Excel does not reliably create UTF-8 CSVs!
The metadata CSV needs to be in UTF-8 encoding (without BOM). There are some tips on: https://collectionbuilder.github.io/cb-docs/docs/metadata/uploading/
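If you would rather check for the BOM with a script than in an editor, here is a minimal Python sketch. It assumes the file is otherwise valid UTF-8; the function name strip_utf8_bom and the path in the usage note are just placeholders I made up, not part of CollectionBuilder.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from the file at `path`, in place.

    Returns True if a BOM was found and removed, False otherwise.
    The rest of the file's bytes are left untouched.
    """
    p = Path(path)
    data = p.read_bytes()
    if data.startswith(codecs.BOM_UTF8):  # the three bytes EF BB BF
        p.write_bytes(data[len(codecs.BOM_UTF8):])
        return True
    return False
```

Calling something like strip_utf8_bom("_data/metadata.csv") (a hypothetical path) would then leave a plain UTF-8 file. It only removes the BOM; it won't repair characters that are already corrupted.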
UTF-8 can encode any character, so if the file is encoded correctly, you can display any kind of accents, etc. Usually corrupted characters happen when opening a UTF-8 file with Excel.
Sometimes you can fix the CSV by re-encoding it in VS Code: click the encoding name (such as "UTF-8") in the status bar at the bottom right of the editor window, choose "Save with Encoding", and pick UTF-8.
However, sometimes the characters are fully corrupted in the CSV (because the file was interpreted in the wrong encoding to start with and then saved that way), so you might need to go back to the source spreadsheet and export it again, or figure out what the encoding really is and reopen the file with the correct encoding (then re-encode as UTF-8).
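To make the "wrong encoding" idea concrete, here is a small Python illustration. It assumes the other encoding involved is Windows-1252 (cp1252), which is common for files that have passed through Excel on Windows, but your file may well be something else.

```python
text = "café"

# The same text stored as UTF-8 bytes: é becomes two bytes.
utf8_bytes = text.encode("utf-8")  # b'caf\xc3\xa9'

# Reading UTF-8 bytes as Windows-1252 turns é into two wrong characters.
misread = utf8_bytes.decode("cp1252")  # 'cafÃ©'

# The reverse mistake: Windows-1252 bytes read as UTF-8 are invalid,
# so a lenient reader substitutes U+FFFD, often drawn as ? or a box.
legacy_bytes = text.encode("cp1252")  # b'caf\xe9'
replaced = legacy_bytes.decode("utf-8", errors="replace")  # 'caf\ufffd'
```

In both cases the bytes in the file are fine until something saves the misread text, which is why reopening with the right encoding often fixes things.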
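If the file is not too far gone, the "reopen with the correct encoding, then re-encode as UTF-8" step can also be scripted. This is only a sketch: the candidate list is my assumption (UTF-8 with or without BOM first, then Windows-1252), and the function name reencode_to_utf8 is made up for illustration. A successful decode does not prove the guess was right, so check the accented characters in the output by eye.

```python
from pathlib import Path

# Assumed candidates, tried in order. "utf-8-sig" reads UTF-8 with or
# without a BOM and drops the BOM; "cp1252" is a guess at a legacy encoding.
CANDIDATES = ("utf-8-sig", "cp1252")


def reencode_to_utf8(src, dst, candidates=CANDIDATES):
    """Decode `src` with the first candidate that works and write `dst`
    as UTF-8 without BOM. Returns the name of the encoding that was used."""
    data = Path(src).read_bytes()
    for enc in candidates:
        try:
            text = data.decode(enc)
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next one
        # Write bytes directly so line endings are preserved as-is.
        Path(dst).write_bytes(text.encode("utf-8"))
        return enc
    raise ValueError(f"none of {list(candidates)} could decode {src}")
```

Writing to a separate output file keeps the original around in case the guess was wrong. If the characters were already saved in corrupted form, this won't bring them back, and re-exporting from the source spreadsheet is the safer route.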
If you create metadata in Excel, don't try to save the CSV from Excel. It doesn't work!
Import the xlsx file into Google Sheets, OpenRefine, or LibreOffice, then create the CSV from one of those tools instead.
Excel won't properly parse a UTF-8 CSV (any special characters in it will be corrupted) and can't properly export a CSV. You can create metadata directly in Excel using the xlsx file format and then export using something else, but going back and forth between CSV and Excel doesn't work.
If you want a desktop app for editing spreadsheets, I would use LibreOffice (it's free and open). There are also some extensions for VS Code for editing spreadsheets, but I am not familiar with them, so I can't suggest anything! I do a lot of CSV editing directly in VS Code; it's not so hard once you get used to looking at it. I use the Rainbow CSV extension to make visualizing the columns easier.