Paste non-ASCII character causes Mojibake (garbled text) #844

Closed
oneonestar opened this issue Jun 15, 2023 · 1 comment

oneonestar commented Jun 15, 2023

Pasting a relatively long string containing non-ASCII characters into trino-cli causes part of the string to become mojibake (garbled text).

jline reads 64 bytes at a time and decodes each 64-byte chunk with a CharsetDecoder.
When the pasted line contains more than 64 bytes, it repeats this step until the end of the input.

For non-ASCII characters in UTF-8, a code point takes 1-4 bytes. For example, one Japanese character usually takes 3 bytes. If we paste 22 Japanese characters into trino-cli, that string is 22 * 3 bytes = 66 bytes, so the last character is split across two reads and ends up broken.
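The byte arithmetic above can be checked with a small sketch. The class and method names (`Utf8ChunkMath`, `utf8Length`, `firstReadSplitsCharacter`) are made up for illustration and are not part of jline or trino-cli; the 64-byte chunk size is taken from the description in this issue.

```java
import java.nio.charset.StandardCharsets;

class Utf8ChunkMath {
    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if a first read of chunkSize bytes would end in the middle of a
    // multi-byte sequence, i.e. the first byte of the next read is a UTF-8
    // continuation byte (bit pattern 10xxxxxx).
    static boolean firstReadSplitsCharacter(String s, int chunkSize) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return bytes.length > chunkSize && (bytes[chunkSize] & 0xC0) == 0x80;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 22; i++) {
            sb.append('\u4e00'); // U+4E00, a 3-byte character in UTF-8
        }
        String pasted = sb.toString();
        System.out.println(utf8Length(pasted));
        System.out.println(firstReadSplitsCharacter(pasted, 64));
    }
}
```

Only the first chunk boundary is checked here, which is enough for the 66-byte case described in this issue.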

First 64-byte read: 21 Japanese characters + the 1st byte of the last character
=> the 1st byte is then silently dropped

Second 64-byte read: the 2nd and 3rd bytes of the last character
=> these fail to decode and become mojibake
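The two reads described above can be imitated outside jline with plain `java.nio`. This is a minimal sketch and not jline's code: `ChunkedDecodeDemo` and `decodeDroppingLeftovers` are hypothetical names, and the decoder is configured to replace malformed input, which is an assumption about the setup rather than something stated in this issue.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class ChunkedDecodeDemo {
    // Decodes data in fixed-size chunks, wrapping each chunk in a fresh
    // ByteBuffer. Bytes of an incomplete trailing sequence stay in that
    // buffer and are never seen again, mimicking the reported behaviour.
    static String decodeDroppingLeftovers(byte[] data, int chunkSize) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)       // assumption
                .onUnmappableCharacter(CodingErrorAction.REPLACE); // assumption
        StringBuilder out = new StringBuilder();
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            ByteBuffer bytes = ByteBuffer.wrap(data, off, len);
            CharBuffer chars = CharBuffer.allocate(len);
            decoder.decode(bytes, chars, false);
            // bytes.remaining() may be > 0 here; those bytes are dropped.
            chars.flip();
            out.append(chars);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 22; i++) {
            sb.append('\u4e00'); // 22 three-byte characters = 66 bytes
        }
        byte[] data = sb.toString().getBytes(StandardCharsets.UTF_8);
        String decoded = decodeDroppingLeftovers(data, 64);
        System.out.println(decoded.equals(sb.toString()));
    }
}
```

With this setup the first 21 characters survive and the tail of the string no longer matches the input. The exact garbage shown in a real terminal may differ from what this sketch produces.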

How to reproduce:
Paste 一二三四五六七八九〇一二三四五六七八九〇一二三 into trino-cli (without using the slow pasting feature provided by the terminal).

Environment:

  • trino-cli-388 & trino-cli-419
  • jline 3.21.0
  • MacOS Monterey 12.6.1
  • iTerm2 Build 3.4.19
Jun 15, 2023 11:41:59 AM org.jline.utils.Log logr
FINE: Registering shutdown-hook: Thread[JLine Shutdown Hook,5,main]
Jun 15, 2023 11:41:59 AM org.jline.utils.Log logr
FINE: Adding shutdown-hook task: org.jline.terminal.impl.PosixSysTerminal$$Lambda$81/0x0000000801098da8@14a2f921
Jun 15, 2023 11:41:59 AM org.jline.utils.Log logr
FINE: Using terminal PosixSysTerminal
Jun 15, 2023 11:41:59 AM org.jline.utils.Log logr
FINE: Using pty OsXNativePty

It looks like #782 fixed this issue, perhaps unintentionally(?).
The old problematic code is below:

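byte[] buf = new byte[b.length];
int l = input.readBuffered(buf);
if (l < 0) {
    return l;
} else {
    // buf is a fresh local array on every call, so an incomplete trailing
    // sequence left undecoded in `bytes` is not carried into the next read
    ByteBuffer bytes = ByteBuffer.wrap(buf, 0, l);
    CharBuffer chars = CharBuffer.wrap(b);
    decoder.decode(bytes, chars, false);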
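For comparison, one common way to avoid losing a split sequence is to keep a single ByteBuffer alive across reads and call `compact()` after each decode, so undecoded tail bytes are prepended to the next chunk. The sketch below illustrates that general technique only. It is not the change made in #782 or in jline 3.22.0, and `CarryOverDecoder`, `feed` and `decodeAll` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class CarryOverDecoder {
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);
    // Kept in "write mode" between calls; holds carried-over bytes.
    private final ByteBuffer pending;

    CarryOverDecoder(int chunkSize) {
        // One chunk plus room for an incomplete UTF-8 sequence (at most 3 bytes).
        pending = ByteBuffer.allocate(chunkSize + 4);
    }

    // Decodes one chunk (len must not exceed the chunkSize given above).
    String feed(byte[] chunk, int off, int len) {
        pending.put(chunk, off, len);
        pending.flip();
        CharBuffer chars = CharBuffer.allocate(pending.remaining());
        decoder.decode(pending, chars, false);
        pending.compact(); // move undecoded tail bytes to the front for the next call
        chars.flip();
        return chars.toString();
    }

    static String decodeAll(byte[] data, int chunkSize) {
        CarryOverDecoder d = new CarryOverDecoder(chunkSize);
        StringBuilder out = new StringBuilder();
        for (int off = 0; off < data.length; off += chunkSize) {
            out.append(d.feed(data, off, Math.min(chunkSize, data.length - off)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 22; i++) {
            sb.append('\u4e00'); // 22 three-byte characters = 66 bytes
        }
        byte[] data = sb.toString().getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeAll(data, 64).equals(sb.toString()));
    }
}
```

The sketch does not flush the decoder at end of input, so a stream that ends in the middle of a sequence would leave those bytes pending.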

Updating jline to 3.22.0 fixed this issue on my machine.
I am reporting it anyway to help prevent a regression in the future.
Feel free to close this issue if it is no longer a problem.

Ref: trinodb/trino#17916 trinodb/trino#17915


gnodet commented Oct 24, 2023

Closing as solved in 3.22.0

gnodet closed this as completed Oct 24, 2023