File.toString only works on UTF-8 files #29

albertdahlin · 2021-10-26T20:02:37Z

Problem

I use File.toString on text files that are not encoded in UTF-8 (in my case, ISO-8859-1). This turns all my Swedish letters into � (U+FFFD), the Unicode replacement character.

This happens when the file is converted to a string. Because every non-ASCII character (åäöÅÄÖ in my case) gets turned into the same Unicode character, there is no easy way to fix it after the file has been read. I guess one could read the file as Bytes and convert it to UTF-8 manually.
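To illustrate the manual route, here is a minimal sketch in plain JavaScript rather than Elm. It relies on the fact that each ISO-8859-1 byte equals the Unicode code point of the character it encodes. The helper name decodeLatin1 is hypothetical and not part of any library; the sample bytes are the ISO-8859-1 encoding of "åäö".

```javascript
// Hypothetical helper: decode ISO-8859-1 bytes by mapping each byte
// directly to the Unicode code point with the same value.
function decodeLatin1(bytes) {
  let text = '';
  for (const byte of bytes) {
    text += String.fromCharCode(byte);
  }
  return text;
}

// "åäö" as ISO-8859-1 bytes.
const bytes = new Uint8Array([0xE5, 0xE4, 0xF6]);

// Decoding the same bytes as UTF-8 yields replacement characters (U+FFFD),
// because these bytes do not form valid UTF-8 sequences.
const asUtf8 = new TextDecoder('utf-8').decode(bytes);

// Decoding them as ISO-8859-1 recovers the original letters.
const asLatin1 = decodeLatin1(bytes);

console.log(asUtf8.includes('\uFFFD'), asLatin1);
```

This only covers ISO-8859-1; other legacy encodings would need their own mapping table or a decoder such as TextDecoder.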

Possible solution

FileReader.readAsText() accepts an optional second parameter that specifies the encoding.

I tested changing the code in the Elm.Kernel.File module to pass my encoding, and it worked. Maybe the encoding could be added as an argument to File.toString?

reader.readAsText(blob, 'ISO-8859-1');
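
For context, a minimal sketch of how the encoding could be passed through in browser JavaScript. The wrapper name readBlobAsText and its default value are assumptions for illustration; this is not the actual Elm.Kernel.File code. It uses only the standard FileReader API, so it needs a browser environment to run.

```javascript
// Hypothetical wrapper around the browser's FileReader API.
// `encoding` is forwarded as the optional second argument of readAsText.
function readBlobAsText(blob, encoding = 'UTF-8') {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result);
    reader.onerror = () => reject(reader.error);
    reader.readAsText(blob, encoding);
  });
}

// Usage (in a browser), assuming `file` comes from an <input type="file">:
// readBlobAsText(file, 'ISO-8859-1').then((text) => console.log(text));
```

Defaulting to UTF-8 in this sketch keeps the current behavior for callers that do not specify an encoding.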

lue-bird mentioned this issue Aug 8, 2022

File.toString doesn't work on non-UTF-8 files gren-lang/browser#48

Open
