File Encoding
guwirth edited this page Mar 10, 2021 · 2 revisions
The encoding used to read a file is determined as follows:
- source files:
  - The file is first checked for a BOM (byte order mark). If a BOM is present, the encoding it indicates is used.
  - For files without a BOM, the encoding is read from the property `sonar.sourceEncoding`.
  - If the property is not set, the system default encoding is used.
- XML reports:
  - The encoding is read from the prolog (XML declaration) of the XML document.
  - If no encoding is declared, UTF-8 is used.
- text reports:
  - The file is first checked for a BOM. If a BOM is present, the encoding it indicates is used.
  - For files without a BOM, the encoding is read from the corresponding property (e.g. `sonar.cxx.clangtidy.charset=UTF8`).
  - If no property is defined, UTF-8 is used as the default.
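
The lookup order for source files and text reports (BOM first, then the configured property, then a default) can be sketched in Java roughly as follows. This is only an illustration of the rule described above, not the plugin's actual code: the class `EncodingSketch` and its methods `detectBom` and `resolve` are hypothetical names, and the sketch only recognizes the UTF-8 and UTF-16 BOMs.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

/** Hypothetical helper illustrating the lookup order; not the plugin's implementation. */
final class EncodingSketch {

    private EncodingSketch() {}

    /**
     * Returns the charset indicated by a leading BOM, if any.
     * Assumption: only UTF-8 and UTF-16 BOMs are handled; UTF-32 BOMs are ignored.
     */
    static Optional<Charset> detectBom(byte[] head) {
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) {
            return Optional.of(StandardCharsets.UTF_8);
        }
        if (startsWith(head, 0xFE, 0xFF)) {
            return Optional.of(StandardCharsets.UTF_16BE);
        }
        if (startsWith(head, 0xFF, 0xFE)) {
            return Optional.of(StandardCharsets.UTF_16LE);
        }
        return Optional.empty();
    }

    /**
     * BOM first, then the configured charset name (for example the value of
     * sonar.sourceEncoding), then the given fallback.
     */
    static Charset resolve(byte[] head, String configured, Charset fallback) {
        return detectBom(head).orElseGet(
            () -> configured != null && !configured.isBlank()
                ? Charset.forName(configured)
                : fallback);
    }

    /** True if data begins with the given unsigned byte values. */
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

For a source file, the fallback passed to `resolve` would be `Charset.defaultCharset()`; for a text report it would be `StandardCharsets.UTF_8`. Note that `Charset.forName` throws an exception when the configured name is not supported by the JVM.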
The list of available encodings depends on your JVM. Every implementation of the Java platform is required to support the following standard charsets:
Charset | Description |
---|---|
US-ASCII | Seven-bit ASCII, a.k.a. ISO646-US, a.k.a. the Basic Latin block of the Unicode character set |
ISO-8859-1 | ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1 |
UTF-8 | Eight-bit UCS Transformation Format |
UTF-16BE | Sixteen-bit UCS Transformation Format, big-endian byte order |
UTF-16LE | Sixteen-bit UCS Transformation Format, little-endian byte order |
UTF-16 | Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark |
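
To find out which encodings a particular JVM accepts, the standard `java.nio.charset.Charset` API can be queried directly. The following is a minimal sketch; the class name `CharsetCheck` and the helper `isUsable` are hypothetical names chosen for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.util.List;

/** Hypothetical helper that reports charset support on the running JVM. */
final class CharsetCheck {

    /** The six charsets every Java platform implementation must support. */
    static final List<String> STANDARD = List.of(
        "US-ASCII", "ISO-8859-1", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-16");

    private CharsetCheck() {}

    /** Returns true if the name (or alias) is a legal, supported charset name. */
    static boolean isUsable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not allowed in charset names.
            return false;
        }
    }

    public static void main(String[] args) {
        for (String name : STANDARD) {
            System.out.println(name + " supported: " + isUsable(name));
        }
        // Everything beyond the standard set depends on the JVM in use.
        System.out.println(Charset.availableCharsets().size() + " charsets available");
        System.out.println("system default: " + Charset.defaultCharset());
    }
}
```

Checking a value with `isUsable` before putting it into a charset property helps catch typos early, since aliases such as `UTF8` are accepted but misspelled names are not.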