utf-8 Chinese string #42

Open
Sloaix opened this issue Jul 24, 2014 · 2 comments

Sloaix commented Jul 24, 2014

Hi,
I ran into a problem when using Gzip to decompress a string: when I decompress a UTF-8 Chinese string, the result is wrong.
I use Ajax to exchange data, and the xhr.responseText is gzipped by the server,
so I need to decompress the data in the browser.
Can you give me some advice?
Thanks :)

Sloaix changed the title from "utf-8 chinses string" to "utf-8 Chinese string" on Jul 24, 2014
kevinxucs commented

Use https://github.com/inexorabletash/text-encoding to decode the UTF-8 bytes into a string.
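
A minimal sketch of that suggestion, assuming the decompression step yields a Uint8Array of UTF-8 bytes. The text-encoding polyfill provides the standard TextDecoder interface, which current browsers and Node.js also ship natively. The helper name `decodeUtf8` and the sample bytes are hypothetical and only for illustration:

```javascript
// Hypothetical helper: decode UTF-8 bytes (for example, the output of a
// gzip decompressor) into a JavaScript string.
// Assumes a global TextDecoder, either native or from the text-encoding polyfill.
function decodeUtf8(bytes) {
  return new TextDecoder("utf-8").decode(bytes);
}

// "中文" encoded as UTF-8: two characters, three bytes each.
var sample = new Uint8Array([0xE4, 0xB8, 0xAD, 0xE6, 0x96, 0x87]);
console.log(decodeUtf8(sample)); // prints "中文"
```

If the bytes come from XMLHttpRequest, setting `xhr.responseType = "arraybuffer"` makes `xhr.response` an ArrayBuffer of raw bytes rather than an already decoded string, which is probably the safer input for a decompressor.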


Siman8 commented Aug 10, 2020

// Convert a Uint8Array of UTF-8 bytes to a string
function Uint8ArrToChar(array) {
	var out, i, len, c;
	var char2, char3;

	out = "";
	len = array.length;
	i = 0;
	while (i < len) {
		c = array[i++];
		switch (c >> 4) {
			case 0:
			case 1:
			case 2:
			case 3:
			case 4:
			case 5:
			case 6:
			case 7:
				// 0xxxxxxx
				out += String.fromCharCode(c);
				break;
			case 12:
			case 13:
				// 110x xxxx 10xx xxxx
				char2 = array[i++];
				out += String.fromCharCode(((c & 0x1F) << 6) | (char2 & 0x3F));
				break;
			case 14:
				// 1110 xxxx 10xx xxxx 10xx xxxx
				char2 = array[i++];
				char3 = array[i++];
				out += String.fromCharCode(((c & 0x0F) << 12) |
					((char2 & 0x3F) << 6) |
					((char3 & 0x3F) << 0));
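				break;
			case 15:
				// 1111 0xxx 10xx xxxx 10xx xxxx 10xx xxxx (code points above U+FFFF)
				char2 = array[i++];
				char3 = array[i++];
				var char4 = array[i++];
				var cp = ((c & 0x07) << 18) | ((char2 & 0x3F) << 12) |
					((char3 & 0x3F) << 6) | (char4 & 0x3F);
				// Emit the code point as a UTF-16 surrogate pair
				cp -= 0x10000;
				out += String.fromCharCode(0xD800 | (cp >> 10), 0xDC00 | (cp & 0x3FF));
				break;
		}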
	}

	return out;
}

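// Convert a string to a Uint8Array of UTF-8 bytes
function charToUint8Arr (str) {
	var code;
	var utf = [];
	for (var i = 0; i < str.length; i++) {
		code = str.codePointAt(i); // full code point; combines a surrogate pair if one starts here
		if (code < 0x0080) { // ASCII: one byte
			utf.push(code);
		} else if (code < 0x0800) { // two bytes
			utf.push(0xC0 | ((code >> 6) & 0x1F));
			utf.push(0x80 | ((code >> 0) & 0x3F));
		} else if (code < 0x10000) { // three bytes; covers most Chinese characters
			utf.push(0xE0 | ((code >> 12) & 0x0F));
			utf.push(0x80 | ((code >> 6) & 0x3F));
			utf.push(0x80 | ((code >> 0) & 0x3F));
		} else { // four bytes: code points above U+FFFF
			utf.push(0xF0 | ((code >> 18) & 0x07));
			utf.push(0x80 | ((code >> 12) & 0x3F));
			utf.push(0x80 | ((code >> 6) & 0x3F));
			utf.push(0x80 | ((code >> 0) & 0x3F));
			i++; // skip the low surrogate that codePointAt already consumed
		}
	}
	return new Uint8Array(utf);
}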
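
One way to check hand-written converters like the two above is to compare them against the standard TextEncoder and TextDecoder on a few samples, including a character outside the Basic Multilingual Plane. This is only a sketch: `sameBytes` and `agreesWithReference` are hypothetical helper names, and the demo call passes the reference codec itself, so substitute the functions you want to test:

```javascript
var refEncoder = new TextEncoder();
var refDecoder = new TextDecoder("utf-8");

// Compare two byte arrays element by element.
function sameBytes(a, b) {
  if (a.length !== b.length) return false;
  for (var i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Hypothetical helper: true if encode(s) produces the same bytes as the
// reference encoder and decode() restores s from those bytes, for every sample.
function agreesWithReference(encode, decode, samples) {
  return samples.every(function (s) {
    var expected = refEncoder.encode(s);
    return sameBytes(encode(s), expected) && decode(expected) === s;
  });
}

// One-, two-, three-, and four-byte UTF-8 sequences.
var samples = ["hello", "é", "中文", "😀"];
console.log(agreesWithReference(
  function (s) { return refEncoder.encode(s); },
  function (b) { return refDecoder.decode(b); },
  samples
)); // prints true
```

Comparing bytes rather than only round-tripping matters here, because an encoder and decoder that share the same mistake can still round-trip successfully.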
