built-in PTY / readline support #2664

Closed
thefallentree opened this issue Jan 2, 2020 · 16 comments

Comments

@thefallentree
Contributor

Hi,

I'm developing an application that consumes telnet protocol data from an application over a websocket, and a core piece of missing functionality is the ability to switch between line mode and character mode, and to switch local echo on and off based on server-side requests. What's the best way to achieve this?

Thanks

@thefallentree thefallentree changed the title built-in line-editing support built-in line-editing / local echo support Jan 2, 2020
@thefallentree
Contributor Author

thefallentree commented Jan 2, 2020

The layers involved in a typical Linux terminal stack are fairly complex, so I want to list them here first to frame the discussion.

In a typical Linux setup, a full-featured terminal application involves:

User -> PTY <-> optional terminal library (readline (ncurse) + termcap?) <-> Shell (eg. bash) < - application program 
          | -> Terminal Emulator

Since the terminal emulator, PTY, and program are on the same machine, the link between them is fast and its latency is minimal.

As for local echo and line editing: when connecting to a remote server using a protocol like telnet, they are implemented as follows:

User -> PTY <-> terminal library (readline (ncurse) + termcap?) <-> telnet<-> Remote Server
          | -> Terminal Emulator

telnet implements local echo and line editing by manipulating the PTY through a terminal library (cfmakeraw() or something similar).

SSH, by contrast, does no local echo or line editing; it does remote echo and remote line editing through a remote PTY, which is bound by the latency and speed of the link.

User -> PTY <-> terminal library (readline (ncurse) + termcap?) <-> shell <-> ssh <- ssh protocol -> SSHD <-> PTY(remote) <-> terminal library <-> bash (remote)
          | -> Terminal Emulator

OK, now introduce xterm.js and websockets to the stack.

The official setup seems to be:

user -> XtermJS as Terminal Emulator <- websocket transport -> PTY (through node-pty) <-> terminal library (readline (ncurse) + termcap?) <-> program (like bash) 

This works fine if everything is local; however, my desired setup is:

user -> XtermJS as Terminal Emulator <- websocket transport -> Remote Application

This creates a ton of problems:

  1. Nothing handles terminal resize etc. However, xterm.js does do soft wrapping, which I think is supposed to be part of the PTY.
  2. Nothing does echoing etc. Why not have xterm.js do it directly?
  3. Of course, nothing implements line editing either.

I could implement this on the server side to achieve remote echo and remote line editing, by doing this:
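For the resize problem in item 1, one option (assuming the remote application speaks telnet, as in the original question) is the NAWS option from RFC 1073, which lets the client report its window size in-band. A minimal sketch of the encoder follows; the function names are hypothetical and not part of xterm.js or any telnet library, and the message should only be sent once the server has agreed to NAWS.

```typescript
// Telnet constants from RFC 854 (IAC, SB, SE) and RFC 1073 (NAWS).
const IAC = 255;
const SB = 250;
const SE = 240;
const NAWS = 31;

/** Append a byte; a literal 255 inside a subnegotiation must be doubled. */
function pushEscaped(out: number[], byte: number): void {
  out.push(byte);
  if (byte === IAC) out.push(IAC);
}

/**
 * Build IAC SB NAWS <cols hi> <cols lo> <rows hi> <rows lo> IAC SE.
 * Hypothetical helper for this sketch.
 */
function encodeNaws(cols: number, rows: number): Uint8Array {
  const out: number[] = [IAC, SB, NAWS];
  for (const value of [cols, rows]) {
    pushEscaped(out, (value >> 8) & 0xff); // high byte
    pushEscaped(out, value & 0xff);        // low byte
  }
  out.push(IAC, SE);
  return new Uint8Array(out);
}
```

On the xterm.js side, the `onResize` event reports `cols` and `rows`, so its handler could pass them to `encodeNaws` and send the result over the websocket.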

user -> XtermJS as Terminal Emulator <- websocket transport -> some part of Application that launches Remote PTY <-> readline/ncurse/termcap <-> Real Remote Application 

But launching a remote PTY from within the application is actually pretty hard to implement correctly across platforms, and my attempts to hook readline/ncurses/termcap up to a PTY and then to a websocket seem way too complex for such a simple use case.

So, what's the best way to solve this issue?

Currently I'm thinking of this:

User -> JS PTY <- JS Protocol library like (TELNET) -> <- websocket transport -> Remote Application
       |--> XtermJS as Terminal Emulator 

There should be a JS PTY library that handles local echo and line mode or raw mode, and a JS protocol library that negotiates with the backend to enable or disable local echo and line mode/raw mode.

Furthermore, to implement line editing on the client side, we also need to implement ncurses/readline in JS, as follows:
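As a rough illustration of what the protocol half could look like, here is a sketch of telnet option negotiation for ECHO (RFC 857) and SUPPRESS-GO-AHEAD (RFC 858). The class and field names are hypothetical. It assumes the common convention that the client stops echoing locally when the server says WILL ECHO and leaves line mode when the server says WILL SGA; it does not implement the RFC 1184 LINEMODE option or subnegotiations (SB ... SE).

```typescript
const IAC = 255, DONT = 254, DO = 253, WONT = 252, WILL = 251;
const OPT_ECHO = 1, OPT_SGA = 3;

interface TelnetState {
  localEcho: boolean; // client echoes typed characters itself
  lineMode: boolean;  // client buffers a whole line before sending
}

interface FeedResult {
  data: number[];    // payload bytes with telnet commands stripped
  replies: number[]; // bytes to send back to the server
}

class TelnetNegotiator {
  readonly state: TelnetState = { localEcho: true, lineMode: true };
  private pending: number[] = []; // incomplete command from the previous chunk

  feed(chunk: Uint8Array): FeedResult {
    const bytes = this.pending.slice();
    for (let k = 0; k < chunk.length; k++) bytes.push(chunk[k]);
    const data: number[] = [];
    const replies: number[] = [];
    let i = 0;
    while (i < bytes.length) {
      const b = bytes[i];
      if (b !== IAC) { data.push(b); i++; continue; }
      if (i + 1 >= bytes.length) break;                        // wait for the command byte
      const cmd = bytes[i + 1];
      if (cmd === IAC) { data.push(IAC); i += 2; continue; }   // escaped literal 255
      if (cmd < WILL || cmd > DONT) { i += 2; continue; }      // other commands: ignored here
      if (i + 2 >= bytes.length) break;                        // wait for the option byte
      this.negotiate(cmd, bytes[i + 2], replies);
      i += 3;
    }
    this.pending = bytes.slice(i);
    return { data, replies };
  }

  private negotiate(cmd: number, opt: number, replies: number[]): void {
    if (cmd === WILL || cmd === WONT) {
      const enable = cmd === WILL;
      if (opt === OPT_ECHO) {
        // Server echo on means local echo off. Reply only on a state change.
        if (this.state.localEcho === enable) {
          this.state.localEcho = !enable;
          replies.push(IAC, enable ? DO : DONT, opt);
        }
      } else if (opt === OPT_SGA) {
        if (this.state.lineMode === enable) {
          this.state.lineMode = !enable;
          replies.push(IAC, enable ? DO : DONT, opt);
        }
      } else if (enable) {
        replies.push(IAC, DONT, opt); // refuse server options this sketch does not know
      }
    } else if (cmd === DO) {
      replies.push(IAC, WONT, opt);   // this sketch offers no client-side options
    }
    // DONT for an option that was never enabled needs no reply.
  }
}
```

The `data` bytes would go to the terminal, the `replies` bytes back over the websocket, and `state` is what a JS PTY layer would consult to decide whether to echo and whether to buffer lines.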

User -> JS PTY <- JS readline library -> <- JS Protocol library( eg. TELNET, or socket.io) -> <- websocket transport -> Remote Application
      | -> XtermJS as Terminal Emulator

Right now, I have found wavesoft/local-echo, which does some combination of PTY and readline, but not in a way that is extensible or reusable. Obviously a JS protocol library is outside the scope of xterm.js, but maybe a JS PTY library is doable and should be in scope.

I hope someone with more knowledge can chime in.

Cheers

@PerBothner
Contributor

I believe it is possible to use the readline library without a pty - you can set rl_getc_function to a custom read function.

Originally for DomTerm I used the contentEditable feature of modern browsers to implement local (in-terminal) editing. It worked tolerably well, but there were some problems (that I don't fully remember). I also experimented with using CodeMirror. That can work pretty well if you don't mind a separate input area, and it can be made to work for a more traditional REPL interface (a la Emacs shell mode), though it's a bit fragile in the latter case.

I ended up implementing a small editor, which works pretty well. I translate key-bindings to strings (using the browserKeymap library) which I use to index into a table of named commands; if found the command is executed, modifying the input area as needed. Typing Enter sends the completed line to the remote application. If remote echo is enabled, then the input area is deleted (lazily, when the next output, presumably echo, arrives).
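To make the structure concrete, here is a small sketch of that design: key names index into a keymap, the keymap yields a command name, and the command mutates the input line. This is not DomTerm's code. The key names ("Left", "Ctrl-A", ...) are assumptions for illustration and may not match what browserKeymap produces, and the command names are borrowed from readline conventions.

```typescript
interface LineState { text: string; cursor: number; }
type Command = (s: LineState) => void;

// Table of named commands that edit the input line.
const commands: Record<string, Command> = {
  "backward-char": s => { s.cursor = Math.max(0, s.cursor - 1); },
  "forward-char": s => { s.cursor = Math.min(s.text.length, s.cursor + 1); },
  "beginning-of-line": s => { s.cursor = 0; },
  "end-of-line": s => { s.cursor = s.text.length; },
  "backward-delete-char": s => {
    if (s.cursor > 0) {
      s.text = s.text.slice(0, s.cursor - 1) + s.text.slice(s.cursor);
      s.cursor--;
    }
  },
  "kill-line": s => { s.text = s.text.slice(0, s.cursor); },
};

// Key name -> command name (hypothetical key names).
const keymap: Record<string, string> = {
  "Left": "backward-char",
  "Right": "forward-char",
  "Ctrl-A": "beginning-of-line",
  "Ctrl-E": "end-of-line",
  "Backspace": "backward-delete-char",
  "Ctrl-K": "kill-line",
};

class LineEditor {
  readonly state: LineState = { text: "", cursor: 0 };

  constructor(private onLine: (line: string) => void) {}

  /** key is a key name such as "Ctrl-A" or a single printable character. */
  handleKey(key: string): void {
    if (key === "Enter") {
      // Enter sends the completed line to the remote application.
      const line = this.state.text;
      this.state.text = "";
      this.state.cursor = 0;
      this.onLine(line);
      return;
    }
    if (Object.prototype.hasOwnProperty.call(keymap, key)) {
      commands[keymap[key]](this.state);
      return;
    }
    if (key.length === 1) {
      // Printable character: insert at the cursor.
      const s = this.state;
      s.text = s.text.slice(0, s.cursor) + key + s.text.slice(s.cursor);
      s.cursor++;
    }
    // Unbound multi-character key names are ignored.
  }
}
```

Keeping the commands in a table means new bindings can be added without touching `handleKey`, which is the main appeal of this layout.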

If the input area is "inline" with the main terminal then you have to consider what happens if the remote application sends output while there is an active non-empty input area. (A reasonable model is that when output appears you temporarily remove the input area, then you process the output including escape sequences, then you restore the input area at the new cursor position.) You probably also want to prevent multiple Left-arrow presses from moving the cursor into the prompt area - or previous output lines.

DomTerm allows you to switch on the fly between character-at-a-time (raw) mode, local line editing, or auto mode (depends on the state of the remote pty), with or without local echo. This can be switched by the application using an escape sequence or by the user: For example the user might be typing at a remote readline-enabled shell, but might prefer to "force" local editing due to latency issues.
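The remove/process/restore model can be sketched as follows. This is an illustration under strong assumptions, not DomTerm's implementation: every character is one column wide, the pending input fits on a single row, and `TerminalLike` is a hypothetical interface that only needs a `write` method taking a string (an xterm.js `Terminal` could be passed in its place).

```typescript
interface TerminalLike { write(data: string): void; }

/** CSI sequence with one numeric parameter, e.g. csi(3, "D") moves the cursor 3 columns left. */
function csi(n: number, final: string): string {
  return "\x1b[" + n + final;
}

class InlineInput {
  private text = "";
  private cursor = 0; // index into text; 0 means "right after the prompt"

  constructor(private term: TerminalLike) {}

  /** Insert typed text at the cursor and redraw the tail of the line. */
  type(ch: string): void {
    const rest = this.text.slice(this.cursor);
    this.text = this.text.slice(0, this.cursor) + ch + rest;
    this.cursor += ch.length;
    this.term.write(ch + rest);
    if (rest.length > 0) this.term.write(csi(rest.length, "D"));
  }

  /** Left arrow: never move into the prompt or earlier output. */
  left(): void {
    if (this.cursor === 0) return;
    this.cursor--;
    this.term.write("\x1b[D");
  }

  /** Remote output arrived: remove the input area, print, then restore it. */
  output(data: string): void {
    if (this.text.length > 0) {
      if (this.cursor > 0) this.term.write(csi(this.cursor, "D"));
      this.term.write("\x1b[K"); // erase from cursor to end of line
    }
    this.term.write(data);       // may contain escape sequences
    this.term.write(this.text);  // redraw input at the new cursor position
    const tail = this.text.length - this.cursor;
    if (tail > 0) this.term.write(csi(tail, "D"));
  }
}
```

Wrapped input and wide characters break the column arithmetic here, which is exactly the state-keeping burden discussed later in the thread.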

Something like this could be a very nice extension for xtermjs - but it's a fair amount of work. (Could be appropriate for a Google Summer of Code project, for example, though it needs someone willing and able to help with API design.) If anyone is tempted, feel free to ask me questions about how it works in DomTerm - and feel free to make use of DomTerm code.

@thefallentree thefallentree changed the title built-in line-editing / local echo support built-in PTY / readline support Jan 2, 2020
@thefallentree
Contributor Author

@PerBothner Yes, xterm.js and DomTerm implement things differently: xterm.js implements PTY-style wrapping but no echo or line editing, while DomTerm implements wrapping, echo, and line editing, but with a custom application-level protocol (some custom escape sequences). This creates an interesting problem: users are forced to choose one implementation over the other, because it isn't really possible to switch between them, and someone who wants to implement a native mobile emulator wouldn't know exactly what to implement.

Also, I want to point out that implementing echo and line editing as an external feature directly on the terminal is no small task, especially with CJK characters and IME support (a temporary composition line): you need to keep a lot of state and take care of character widths.

It is so hard that I don't think it actually makes sense. In my current experiment, I chose to use an ordinary external input box, which has 100% echo and line-editing support out of the box. I implemented a limited history feature, and the only thing missing is turning single-character mode and echoing on and off.

Now, what's missing in my experiment is a JS class that behaves like a PTY device, where all data flows in and out, so that it knows to remember the cursor position when input starts and draw the input, to erase the input when output starts, and to redraw it when output stops. So, in other words, it behaves exactly like a PTY :-)
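The mode switching part of such a class (single-character mode and echo on/off) could look roughly like the sketch below, loosely modelled on the termios ECHO and ICANON flags. All names are hypothetical, it assumes ASCII input, and it is not taken from any existing package. It relies on xterm.js reporting Enter as "\r" and Backspace as "\x7f" through `onData`.

```typescript
interface Flags {
  echo: boolean;   // like termios ECHO: show typed characters locally
  icanon: boolean; // like termios ICANON: deliver input line by line
}

class FakeLineDiscipline {
  flags: Flags = { echo: true, icanon: true };
  private line = "";

  constructor(
    private toTerminal: (data: string) => void, // e.g. data => term.write(data)
    private toApp: (data: string) => void       // e.g. data => socket.send(data)
  ) {}

  /** Data typed by the user, as reported by the terminal emulator. */
  input(data: string): void {
    for (let i = 0; i < data.length; i++) this.inputChar(data.charAt(i));
  }

  private inputChar(ch: string): void {
    if (!this.flags.icanon) {
      // Character mode: forward immediately, echo only if requested.
      if (this.flags.echo) this.toTerminal(ch);
      this.toApp(ch);
      return;
    }
    if (ch === "\x7f" || ch === "\b") {
      // Erase the last buffered character and rub it out on screen.
      if (this.line.length > 0) {
        this.line = this.line.slice(0, -1);
        if (this.flags.echo) this.toTerminal("\b \b");
      }
      return;
    }
    if (ch === "\r" || ch === "\n") {
      // End of line: deliver the whole buffer to the application.
      if (this.flags.echo) this.toTerminal("\r\n");
      this.toApp(this.line + "\n");
      this.line = "";
      return;
    }
    this.line += ch;
    if (this.flags.echo) this.toTerminal(ch);
  }
}
```

A protocol layer would flip `flags.echo` and `flags.icanon` when the server requests it; the rest of the stack does not need to know which mode is active.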

@jerch
Member

jerch commented Jan 3, 2020

Now, what's missing in my experiment is a JS class that behaves like a PTY device, where all data flows in and out, so that it knows to remember the cursor position when input starts and draw the input, to erase the input when output starts, and to redraw it when output stops. So, in other words, it behaves exactly like a PTY :-)

I've started a fake pty for xterm.js in https://github.com/jerch/browser-fakepty. It's in a very early state, and I am still struggling to decide where to stop before pulling in half of POSIX, like process management and signalling. Basic echoing already works; erase is still missing from ICANON. There is an early primitive shell with pipe support, though no other operators yet.

@Tyriar
Member

Tyriar commented Jan 3, 2020

I think this is out of scope of the xterm.js project, @jerch's experiment might be useful though.

@theflyingape
Contributor

theflyingape commented Jan 4, 2020

@thefallentree I suggest reviewing the kernel's drivers/tty/tty_io.c for your specific app needs, or man tset.
If it helps, I used Alan Cox's term functions here to switch modes.

@PerBothner
Contributor

@Tyriar "I think this is out of scope of the xterm.js project"

Consider something like Emacs's "comint" mode, which is the base for shell mode and many other modes that display process output. Something similar can be useful for many IDE environments, and it makes sense to build it on top of a terminal emulator widget. (One reason is to consistently handle escape sequences in process output.) This was my motivation when I wrote term.el. (I hoped it would replace comint.el but that never happened.) Some of these modes (including all shell/repl modes) require being able to edit user input - and it is nice to do that inline following a prompt.

So while you might consider an input editor out of scope, I suggest that a documented API to embed an editable line (or sometimes multiple lines) would make a lot of sense. Having an actual editor available as an extension would be even better, but not as important as an API that can be used for embedding an editable input line.

@jerch
Member

jerch commented Jan 4, 2020

While it is fun to play around with the idea to "look behind the pty curtain", imho it should not be a primary goal of xterm.js. We still have enough issues to close the gap to other emulators as a character based terminal, as well as issues with relatively new things (like image handling/Unicode as more general problems of the terminal interface, or new sequences in general). To me xterm.js should stay focused on trying to give ppl a nice (character based) terminal experience as the main goal. "Do one thing and do it well."

Now approaching it from the other side - like having "higher level" functionality within the terminal itself. Like an editor mode. Oh wait, that idea is not new at all - almost all IBM terminals were block mode terminals, where you would enter and edit data locally and transmit pages/blocks of data instead of every single char. So the question is - why don't we have or implement that? Simple answer - the whole terminal stack we aim for is Unix/POSIX like, therefore character driven. Unix does not have proper block semantics in any tty implementation (line discipline) (Edit: AIX has, but not the common termios interfaces in other Unices), thus there is simply no interface/software we could interact with. It's kinda a chicken-and-egg problem.
What about ICANON? Well yes, that's somewhat a block mode on line level, but again not in the terminal itself, it lives in kernel space. There were recently some discussions from the systemd group about moving the tty impl out of the kernel, to make it open to more ppl and fix termios in certain ways.

Now looking at terminal + tty(termios) + app again - it's actually doing what block mode does for those ancient IBM terminals, but with more interfaces involved (and ofc other differences): vim running on app side holds the editable data buffer and gets orchestrated by forwarded sequences, while in a block terminal this all would have happened locally.

From the viewpoint of "separation of concerns" the character driven way seems to be superior, the other side can more easily be changed to something else, while coupling stuff tighter to the terminal as a "rich client" is less flexible. (This was even more true for hardware terminals in the 80s). But the flexibility comes at a price - the orchestration needs tons of sequences, and some things are simply not possible due to a "phone line" (WTH?) in between. I think @PerBothner is the best to be asked for all these limitations, as some of his ideas try to overcome some.

What to make out of all of this?
I think the char based transmission as main interface for xterm.js should be seen as given; still, it's possible to hook any other complex piece of abstraction right onto this through the existing API - just like a pty/tty would do (see it as a different line discipline than we have in kernels). For browser local stuff, short circuits/side channels from/into the terminal would be possible*, but I would not consider this a common use case worth adding complicated APIs beforehand.

Hope this was not too much fundamental talk. (I am still under the pty/tty impression lol)

Edit: Btw I plan to include a better line editor than ICANON in the fake-pty package later on. (But this is unlikely to happen anytime soon, as there are so many low level POSIX things to deal with first.)

Edit2:
[*] Side channeling is always cumbersome in a stream based env - it might collide with assumptions during data processing. We already have such a side channel for the standard pty/tty - resize. This alone is quite hard to synchronize with in-band data (in fact it is impossible and might lead to screwed up terminal pages during resize while data is flowing in, see #1914 and #2564 (comment)).

@thefallentree
Contributor Author

thefallentree commented Jan 7, 2020

@jerch But if xterm.js wants to become a character-based terminal emulator, then it should probably provide a character-based interface (not a byte/column-based one like the current escape sequences). I wholeheartedly support a departure from the old baggage, as it is possible to have a glue layer deal with that instead.

Right now, it is almost impossible to implement echoing and line editing correctly for multi-column characters, because of the rules for calculating width, and there is no facility to generate the correct amount of cursor movement. This holds no matter where you put the logic, whether in the frontend or in the backend.

@PerBothner
Contributor

@thefallentree "if xtermjs wants to become a character based interface, then it should provide a character based interface (not a column based one like the current escape sequences)"

You might be interested in this very preliminary proposal for variable-sized characters. I'm thinking about implementing this for DomTerm (and maybe a fork of the fish shell to take advantage of it) - but for now it's just some vague ideas.

@jerch
Member

jerch commented Jan 7, 2020

But if xterm.js wants to become a character-based terminal emulator, then it should probably provide a character-based interface (not a byte/column-based one like the current escape sequences)

Well it kinda does and does not - the problem arose with Unicode, which is currently broken in the terminal for advanced rules (even worse - the rules are different for different Unicode versions). Back in the day a byte would stand for a single char (either being printable or part of a sequence/control code), but with Unicode/UTF-8 this is not true anymore. The newer Unicode versions created so many advanced rules that wcwidth alone is not sufficient anymore to correctly deal with text on character level (in the meaning of user perceived character, which can be composed of several codepoints). That's why we need a better way to handle Unicode than we currently do. Your extractor might be a starting point for this, but we cannot do that alone, it kinda affects the whole terminal interface. @PerBothner linked above some ideas that would help to overcome some broken aspects.

As for now - it is kinda important, that app side and terminal agree 100% on the used Unicode rules + wcwidth table, otherwise the output will break for certain things. For an advanced in-browser editor on top of the xterm.js interface this is easier to accomplish (just use the same libs on both ends), but for real system interaction that is already a problem, which currently gets worse with every new Unicode version release (libc's wcwidth tends to fall behind here).
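To illustrate why both ends have to share the same table, here is a sketch of a wcwidth-style width lookup and the erase sequence an editor would derive from it. The range tables are deliberately tiny and incomplete (an assumption of this sketch, not a real Unicode width table), and the function names are hypothetical. If the terminal's own table disagrees with `WIDE` for any character, the cursor-back count in `eraseSequence` is off and the display is corrupted.

```typescript
/** Decode a UTF-16 string into code points (joins surrogate pairs). */
function codePoints(s: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < s.length; i++) {
    const hi = s.charCodeAt(i);
    if (hi >= 0xd800 && hi <= 0xdbff && i + 1 < s.length) {
      const lo = s.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        out.push((hi - 0xd800) * 0x400 + (lo - 0xdc00) + 0x10000);
        i++;
        continue;
      }
    }
    out.push(hi);
  }
  return out;
}

// Coarse, incomplete ranges for illustration only.
const ZERO_WIDTH: Array<[number, number]> = [
  [0x0300, 0x036f], // combining diacritical marks
  [0x200b, 0x200f], // zero width space/joiners, direction marks
  [0xfe00, 0xfe0f], // variation selectors
];
const WIDE: Array<[number, number]> = [
  [0x1100, 0x115f],   // Hangul Jamo initial consonants
  [0x2e80, 0xa4cf],   // CJK radicals through Yi (coarse)
  [0xac00, 0xd7a3],   // Hangul syllables
  [0xf900, 0xfaff],   // CJK compatibility ideographs
  [0xff00, 0xff60],   // fullwidth forms
  [0x1f300, 0x1f64f], // some emoji blocks
];

function inRanges(cp: number, ranges: Array<[number, number]>): boolean {
  for (const r of ranges) {
    if (cp >= r[0] && cp <= r[1]) return true;
  }
  return false;
}

/** Columns occupied by one code point under this sketch's table. */
function charWidth(cp: number): number {
  if (inRanges(cp, ZERO_WIDTH)) return 0;
  if (inRanges(cp, WIDE)) return 2;
  return 1; // control characters are not handled in this sketch
}

/** Width of a string as the sum of per-code-point widths. */
function stringWidth(s: string): number {
  let w = 0;
  for (const cp of codePoints(s)) w += charWidth(cp);
  return w;
}

/** Escape sequence that erases `s` when the cursor sits right after it on the same row. */
function eraseSequence(s: string): string {
  const w = stringWidth(s);
  return w > 0 ? "\x1b[" + w + "D\x1b[K" : "";
}
```

For an in-browser editor, shipping this one table to both the editing layer and the renderer avoids the mismatch; with a real libc on the other end there is no such guarantee.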

Edit: Also note that some Unicode rules directly depend on renderer capabilities, like the compound emojis: they shall be rendered as a single emoji if the renderer/font is capable of doing so, but also have a second legal repr of single emojis, if the renderer cannot do that. This leads to different run-widths by design, an issue that cannot be solved at all for the current wcwidth-based terminal interface - the app side simply cannot rely on renderer/output caps at all. @PerBothner's idea of automatic floating text with arbitrary widths would also solve that, but that's a fundamental break with the current interface, thus needs careful design to not break with 95% of the current paradigms. This def. needs some broader attention and a more fundamental solution (thus taking libc wcwidth impls into account and such). Until we have that - don't use any complicated Unicode stuff like compound emojis :(

@jerch
Member

jerch commented Jan 7, 2020

@PerBothner, @Tyriar, @egmontkob: Should we try to address the whole Unicode problem in terminal-wg to get a broader attention? Things that come to my mind:

  • build a solid rule set / framework ppl could use to support a certain Unicode version for terminals, including defaults for Bidi (prolly taken from @egmontkob's work), grapheme handling and wcwidth tables
  • make that framework the default and deprecate libc's wcwidth and other custom libs not based on that framework
  • find a way to announce Unicode version support of the terminal (and backwards - of the app?, some handshaking protocol, no clue yet)
  • identify things that are never gonna work and mark them as "DONT USE"
  • encourage the usage of the framework above - this way things will play nicely together for the same version

Yes, this partly reads like doing the Unicode-Consortium's job for the terminal, but since the Consortium does not care much for the terminal at all, I think we need some "formal regulation" to deal with the uncertainty created by the newer Unicode rules. Not sure who could do that other than those actually affected by it (emulator devs, cmdline app devs).

@egmontkob

This has nothing to do with the original subject of this issue, right? I haven't read that.

As for Unicode: it would be lovely if this happened, although might be overengineering, e.g. we saw some problems during the Unicode 8 -> 9 switch, but not much since. Okay, emoji handling, VS16, and the concept of the width of individual characters not simply adding up but doing more complex maths is still a problem (for which IMO the best approach is to forbid this behavior and always stick to the sum of individual widths).
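The difference between "sum of individual widths" and renderer-dependent widths can be shown with a single ZWJ emoji sequence. The sketch below is only an illustration: the width function covers just the code points used here, and `fusedWidth` is a hypothetical stand-in for a renderer that draws the whole sequence as one glyph.

```typescript
// Family emoji as code points: MAN, ZWJ, WOMAN, ZWJ, GIRL.
const family: number[] = [0x1f468, 0x200d, 0x1f469, 0x200d, 0x1f467];

/** Per-code-point width; only covers what this example needs. */
function cellWidth(cp: number): number {
  if (cp === 0x200d || (cp >= 0xfe00 && cp <= 0xfe0f)) return 0; // ZWJ, variation selectors
  if (cp >= 0x1f300 && cp <= 0x1faff) return 2;                  // coarse emoji range
  return 1;
}

/** Policy A: the run width is always the sum of the individual widths. */
function sumWidth(cps: number[]): number {
  let w = 0;
  for (const cp of cps) w += cellWidth(cp);
  return w;
}

/** Policy B: a renderer that fuses a ZWJ sequence draws it as one 2-cell glyph. */
function fusedWidth(cps: number[]): number {
  return cps.indexOf(0x200d) >= 0 ? 2 : sumWidth(cps);
}

// sumWidth(family) is 6 and fusedWidth(family) is 2 under these assumptions.
const ambiguity = sumWidth(family) - fusedWidth(family);
```

Under policy A the application can always compute the cursor position on its own; under policy B it would need to know what the renderer managed to fuse, which is the undetermined run-width problem raised in this thread.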

Mind you, I think there was already a thread about it on terminal-wg, wasn't there?

I'm not planning to participate in such discussions due to lack of time in the foreseeable future, combined with the IMHO relatively low priority of this issue.

@jerch
Member

jerch commented Jan 8, 2020

This has nothing to do with the original subject of this issue, right? I haven't read that.

Yes, it got somewhat out of hand here due to my fundamental talk. 😄

Okay, emoji handling, VS16, and the concept of the width of individual characters not simply adding up but doing more complex maths is still a problem (for which IMO the best approach is to forbid this behavior and always stick to the sum of individual widths).

We now get issues regarding Unicode, and the width handling in particular, quite regularly (like once in 3 months or so), most being emoji related (which seems rather silly to me, but ppl love them so we cannot just ignore it). Esp. the grapheme rules create tons of problems, and they are not working across the common emulators (tested with xterm, gnome-terminal, iTerm2, Terminal.app). As for now they are still "rare stream content", but getting more common (again mainly through emojis lol). Furthermore there is a major unsolvable issue in the Unicode specs for the current cell based terminal model - some rules allow different output representations based on renderer capabilities; that's a no-go for app side as it basically means that the run-width is undetermined. (How to solve that is still a mystery to me, at least @PerBothner has some ideas that might help here.)

I am currently under the impression that the simple wcwidth based approach will not work reliably anymore (kinda gets worse with every version release). Also version discrepancies with app side are a major issue imho. I admit that xterm.js is a bit exotic here, we always run "remote" and thus have no legal way to derive a default Unicode version from a main system (but that's also true for any emulator during ssh). Thus the idea to address it on a more formal level.

Mind you, I think there was already a thread about it on terminal-wg, wasn't there?

Yes I think so (have not yet looked through the list there). It also will touch some of the other issues there, at least the one about announcing terminal capabilities. Will see where this leads to 😸

@jerch
Member

jerch commented Feb 3, 2020

@thefallentree Can we close this? Imho the offtopic part can be discussed elsewhere; regarding xterm.js itself I don't see much at hand to do. An extended editor mode like those block terminals had is way beyond our current scope.

@Tyriar
Member

Tyriar commented Feb 3, 2020

👍
