A keyboard input model #753

Closed
pyfisch opened this issue Jan 5, 2019 · 165 comments
Labels
C - needs discussion (Direction must be ironed out) · S - api (Design and usability) · S - enhancement (Wouldn't this be the coolest?) · S - platform parity (Unintended platform differences)

Comments

@pyfisch

pyfisch commented Jan 5, 2019

TLDR: I think that Winit needs more expressive keyboard events and should follow a written specification to keep platform inconsistencies to a minimum. I propose to adapt the JS KeyboardEvent for winit and to follow the UI Events specification for keyboard input.

Winit is used for many applications that need
to handle different kinds of keyboard input.

  • Games: Physical location of keys like
    WASD for movement and actions.
    Text input for names and chat.
  • GUI applications: Text input and keyboard shortcuts.
  • the Servo Browser: Wants to support JS KeyboardEvent well.

Currently there are two events for text input in Winit:
KeyboardInput and ReceivedCharacter.

pub struct KeyboardInput {
    pub scancode: ScanCode,
    pub state: ElementState,
    pub virtual_keycode: Option<VirtualKeyCode>,
    pub modifiers: ModifiersState,
}

The KeyboardInput event carries information about keys pressed and released.
scancode is a platform-dependent code identifying the physical key.
virtual_keycode optionally describes the meaning of the key.
It indicates ASCII letters, some punctuation and some function keys.
modifiers tells if the Shift, Control, Alt and Logo keys are currently pressed.

The ReceivedCharacter event sends a single Unicode codepoint. The character can
be pushed to the end of a string and if this is done for all events the user
will see the text they intended to enter.

Shortcomings

This is my personal list in no particular order.

  1. List of VirtualKeyCode is seen as incomplete (Enable Less and Greater Keys on X11 #71, Support keypress events with non-ascii chars #59).
    Without a given list it is hard to decide which keys to include
    and when the list is complete.
    Also it is necessary to define each virtual key code so multiple platforms will
    map keys to the same virtual key codes.
    While it is probably uncontroversial that ASCII keys should be included,
    for non-ASCII single keys found on many keyboards like é, µ, or ü
    it is more difficult to decide and to create an exhaustive list.
  2. While VirtualKeyCode should capture the meaning of the key there
    are different codes for e.g. "0": Key0 and Numpad0 or LControl and RControl.
  3. The ScanCode is platform dependent. Therefore apps wanting to use keys like
    WASD for navigation will assume a QWERTY layout instead of
    using the key locations.
  4. It is unclear if a key is repeated or not. Some applications only want to
    act on the first keypress and ignore all following repeated keys. Right
    now these applications need to do extra tracking and are probably not
    correct if the keyboard focus changes while a key is held down. (A way to disable key repeats #310)
  5. A few useful modifiers like AltGraph and NumLock are missing.
  6. There is no relation between ReceivedCharacter and KeyboardInput
    events. While this is not necessary for every application some
    (like browsers) need it and have to use ugly (and incorrect) work-arounds. (Associate received characters with key inputs #34)
  7. Dead-key handling is unspecified and IMEs (Input Method Editors) are not supported.
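
To make shortcoming 4 concrete, here is a minimal sketch of the extra tracking an application has to do today. It is not winit code: `RepeatFilter` and its methods are hypothetical names, and the scancode is modeled as a plain `u32`.

```rust
use std::collections::HashSet;

/// Hypothetical helper: remembers which scancodes are currently held so that
/// repeated "pressed" events can be told apart from the first press.
#[derive(Default)]
pub struct RepeatFilter {
    held: HashSet<u32>,
}

impl RepeatFilter {
    /// Returns true only for the first press of a key; repeats return false.
    pub fn on_pressed(&mut self, scancode: u32) -> bool {
        // `insert` returns false when the value was already in the set.
        self.held.insert(scancode)
    }

    /// Forget a key once it is released.
    pub fn on_released(&mut self, scancode: u32) {
        self.held.remove(&scancode);
    }

    /// Has to be called when the window loses focus, because the release
    /// event may never arrive. Forgetting this gives the incorrect behavior
    /// mentioned in the list above.
    pub fn on_focus_lost(&mut self) {
        self.held.clear();
    }
}
```

A `repeat` flag on the event itself would make this bookkeeping unnecessary.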

In general there are many issues that are platform-dependent and where it is
unclear what the correct behavior is or it is not documented.
Both alacritty and Servo, just to name two applications, have multiple
issues where people mention that keyboard input does not work as expected.

Proposed Solution

Winit is not the first software that needs to deal with keyboard input on
a variety of platforms. In particular the web platform has a complete
specification of how keyboard events should behave, which is implemented on
all platforms that Winit aims to support.

While the specification talks about JS objects it can be easily ported
to Rust. Some information is duplicated in KeyboardEvent for
backwards compatibility but this can be omitted in Rust so Winit stays simpler.

See the keyboard-types crate for what keyboard events can look like in Rust.

  • (shortcoming 1) VirtualKeyCode is replaced with a Key. This is an enum
    with all the values for functional keys and a variant for Unicode values
    that stores printable characters from the whole Unicode range.
    Specification
  • (shortcoming 2) is also addressed by this. There is just one value for keys
    like "Control" but if necessary one can distinguish left/right or
    keyboard/numpad keys by their location attribute.
  • (shortcoming 3) ScanCode is complemented by Code. Codes describe
    physical key locations in a cross-platform way.
    Specification
  • (shortcoming 4) a repeat attribute is added.
  • (shortcoming 5) All known modifier keys are supported.
    Specification
    Note: W3C decided to include some keys that are usually handled
    in hardware and don't emit keyboard events (like Fn, FnLock)
  • (shortcoming 6) received characters and keyboard events are now one
    (exceptions see below)
  • (shortcoming 7) to handle dead keys and IMEs a composition event
    is introduced. It describes the text that should be added at
    the current cursor position. Specification
    Note: The introduction of composition events makes it a bit harder to
    get "just the text" which is currently emitted by ReceivedCharacter.
    Either ReceivedCharacter is kept around for easier use or a utility
    function is provided that takes keyboard and composition events and
    emits the printable text.
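
The list above can be sketched as Rust types. This is only an illustration under assumptions: the type and variant names below are hypothetical and heavily abbreviated, and are not claimed to match the keyboard-types crate or any winit API.

```rust
/// Where on the keyboard a key sits (mirrors the `location` attribute).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Location { Standard, Left, Right, Numpad }

/// Layout-independent physical key (only a few illustrative variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Code { KeyW, KeyA, KeyS, KeyD, ControlLeft, ControlRight }

/// Meaning of the key: a named functional key or the text it produces.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Key {
    Character(String),
    Control,
    Enter,
    Unidentified,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct KeyboardEvent {
    pub key: Key,
    pub code: Code,
    pub location: Location,
    pub repeat: bool,
    pub pressed: bool,
}

/// Game-style check: movement depends on the physical key, not the layout,
/// and ignores auto-repeated presses.
pub fn is_move_forward(ev: &KeyboardEvent) -> bool {
    ev.pressed && !ev.repeat && ev.code == Code::KeyW
}
```

On an AZERTY layout the key in the `KeyW` position produces "z", so `key` would be `Key::Character("z")` while `code` stays `Code::KeyW`, and `is_move_forward` still matches.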

Implementation

This is obviously a breaking change so there needs to be a new release of winit and release notes.
While the proposed events are very expressive it is possible to convert Winit to the new
events first and then improve each backend to emit the additional information about key-codes,
locations, repeating keys etc.

Thank you for writing and maintaining Winit! I hope this helps to get a discussion about keyboard input handling started and maybe some ideas or even the whole proposal is implemented in Winit.

@Osspial Osspial added the labels S - enhancement (Wouldn't this be the coolest?), S - api (Design and usability), C - needs discussion (Direction must be ironed out), S - platform parity (Unintended platform differences) on Jan 5, 2019
@Osspial
Contributor

Osspial commented Jan 7, 2019

Hi, and thanks for taking the time to put this together! Overall, I like the direction this is going, but I have some specific feedback points.

VirtualKeyCode is replaced with a Key. This is an enum with all the values for functional keys and a variant for Unicode values that stores printable characters both from the whole Unicode range.

Being more general on this would be a good change. I don't like using a full String for this, though - it introduces various issues that I'm not particularly happy with:

  • A full String is more difficult to match on than an enum, str, or char.
  • Unicode characters have multiple cases, while keyboard keys only have one case. This could introduce some tricky bugs into people's applications.

Unfortunately I can't think of a good replacement that's as flexible as a string while accounting for both of those issues, but it's something that rubs me the wrong way.

There is just one value for keys like "Control" but if necessary one can distinguish left/right or keyboard/numpad keys by their location attribute.

I like the idea of having a left/right enum to distinguish between sided keys. However, the Location enum should be exposed through variants in the Key enum (e.g. Ctrl(Location)), rather than on the main KeyboardEvent struct we expose.

ScanCode is complemented by Code. Codes describe physical key locations in a cross-platform way. Specification

Making scan codes platform-independent is certainly something we should do, although the W3C Code specification relies a bit too much on the layout of the US keyboard for my liking. Perhaps we should use some sort of numeric index for this? I feel we should also remove ScanCode support entirely, since it doesn't seem to provide any real use for cross-platform application programming. I'd be open to a counter-example, though.

Whatever mechanism we decide on, there should be some method for translating between Codes and Keys, for display purposes.

(shortcoming 5) All known modifier keys are supported. Note: W3C decided to include some keys that are usually handled in hardware and don't emit keyboard events (like Fn, FnLock)

I'd like to leave the hardware-handled keys out of our "officially supported" keys, but this would be a good change. We may also want to create a separate ModifiersChanged event, but that needs discussion and I'm not entirely sure it's the right move.

(shortcoming 4) a repeat attribute is added.
(shortcoming 6) received characters and keyboard events are now one (exceptions see below)
(shortcoming 7) to handle dead keys and IMEs a composition event is introduced. It describes the text that should be added at the current cursor position.

I'm down with all of these changes.

@pyfisch
Author

pyfisch commented Jan 7, 2019

Hi, thanks for taking the time to review this!

Being more general on this would be a good change. I don't like using a full String for this, though - it introduces various issues that I'm not particularly happy with:

  • A full String is more difficult to match on than an enum, str, or char.

  • Unicode characters have multiple cases, while keyboard keys only have one case. This could introduce some tricky bugs into people's applications.

Unfortunately I can't think of a good replacement that's as flexible as a string while accounting for both of those issues, but it's something that rubs me the wrong way.

I couldn't agree more and would have preferred to use char instead. Matching is easy and the Key enum can implement Copy. The reason to use a String is that a key string has a base character and 0 or more combining characters because certain languages have keys that can't be represented with a single code point. (The problem with &str is that someone needs to own it and an enum is problematic because someone needs to decide which characters exist ahead of time and extend it for each new Unicode version.)

Because matching strings is so painful I wrote the ShortcutMatcher which is used by Servo. It is a quite convenient way to match keys and shortcuts. (Btw it ignores ASCII case and handles some other quirks)

Unicode characters have multiple cases, while keyboard keys only have one case.

One way to think about keyboard keys is that they have multiple levels. For example the "M" key on my keyboard has four levels that are accessed with different modifier keys: "m", "M", "µ", "º". These values should be a different Key. On the other hand, while the current VirtualKeyCodes can be typed without modifiers on a US-ASCII keyboard (I think) some variants like LBracket can only be accessed with modifier keys (in my case AltGr+8) on other keyboards. For these reasons I think it is preferable to have different Unicode cases in key values.

I like the idea of having a left/right enum to distinguish between sided keys. However, the Location enum should be exposed through variants in the Key enum (e.g. Ctrl(Location)), rather than on the main KeyboardEvent struct we expose.

Is there a specific reason to do it this way?
If there is ever a need to add a location to a key that previously did not have one (e.g. Backspace on num pad) this would be a breaking change.

Making scan codes platform-independent is certainly something we should do, although the W3C Code specification relies a bit too much on the layout of the US keyboard for my liking. Perhaps we should use some sort of numeric index for this?

One upside of using names referring to the US keyboard layout is that this layout is already familiar to a lot of people and there are plenty of diagrams and photos of the layout for quick reference. Classic scancodes are too short (8-bit) and vary between keyboards from different manufacturers. One language independent index used by X11 can be seen below. (search for X11 keycode names)

I'd like to leave the hardware-handled keys out of our "officially supported" keys

Yeah, there should be a list of supported modifiers for each platform in the docs.

We may also want to create a separate ModifiersChanged event, but that needs discussion and I'm not entirely sure it's the right move.

I am not sure when I would use ModifiersChanged event as modifier keys already send keydown and keyup events.

@Osspial
Contributor

Osspial commented Jan 8, 2019

The problem with &str is that someone needs to own it and an enum is problematic because someone needs to decide which characters exist ahead of time and extend it for each new Unicode version.

There's a solution for using &str, actually - we could convert unicode Strings that are constructed at runtime into &'static strs as follows, then we can internally store a cache of keypress strings so that we don't consume additional memory for every keypress:

let string: String = "Hello".to_string();
// Construct a 'static string at runtime.
let x: &'static str = Box::leak(string.into_boxed_str());

That would let us pass &strs through the unicode variant and let people use string matching.
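
The cache mentioned above is not spelled out, so here is one way it might look. This is a sketch under assumptions: `KeyStringCache` and `intern` are hypothetical names, and the approach leaks each distinct string exactly once.

```rust
use std::collections::HashSet;

/// Hypothetical cache of leaked key strings. Each distinct string is leaked
/// once, so memory use is bounded by the number of distinct keys seen.
#[derive(Default)]
pub struct KeyStringCache {
    seen: HashSet<&'static str>,
}

impl KeyStringCache {
    /// Returns a `'static` copy of `s`, reusing an earlier leak if possible.
    pub fn intern(&mut self, s: &str) -> &'static str {
        if let Some(&cached) = self.seen.get(s) {
            return cached;
        }
        // First time this string is seen: leak it and remember the pointer.
        let leaked: &'static str = Box::leak(s.to_owned().into_boxed_str());
        self.seen.insert(leaked);
        leaked
    }
}
```

The returned `&'static str` can then be matched against string literals directly, for example `match cache.intern(&text) { "ü" => ..., _ => ... }`.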

For example the "M" key on my keyboard has four levels that are accessed with different modifier keys: "m", "M", "µ", "º". These values should be a different Key. On the other hand, while the current VirtualKeyCodes can be typed without modifiers on a US-ASCII keyboard (I think) some variants like LBracket can only be accessed with modifier keys (in my case AltGr+8) on other keyboards. For these reasons I think it is preferable to have different Unicode cases in key values.

The purpose of having Key-codes is to let the program figure out which keys have been pressed irrespective of any modifier-key presses - we'd want all of those characters to always be exposed under one key, since they're mapped to the same key. If you want to access the character that's outputted, taking into account modifier keys, you check the received character.

Is there a specific reason to do it this way? If there is ever a need to add a location to a key that previously did not have one (e.g. Backspace on num pad) this would be a breaking change.

Mainly, to make matching more ergonomic. If you wanted to match on both location and key with the types being separate, you'd have to do this:

match (key, location) {
    (Key::A, _) => (),
    (Key::B, _) => (),
    (Key::C, _) => (),
    (Key::Alt, _) => (),
    (Key::Ctrl, Location::Left) => (),
    (Key::Ctrl, Location::Right) => (),
    _ => ()
}

With them combined into one type, it looks like this:

match key {
    Key::A => (),
    Key::B => (),
    Key::C => (),
    Key::Alt(_) => (),
    Key::Ctrl(Location::Left) => (),
    Key::Ctrl(Location::Right) => (),
    _ => ()
}

The second version is nicer to read, and it also lets the reader know when a key's specific location is being ignored, versus when a key only has one possible location. The first version doesn't communicate that information.

Regarding adding a location to an existing key being a breaking change - there shouldn't be any reason we ever have to do that! Keyboard layouts are fairly static, and only a limited subset of keys are going to have multiple locations on the keyboard. We should be able to keep track of which ones have multiple locations and structure the enum as necessary.

One upside of using names referring to the US keyboard layout is that this layout is already familiar to a lot of people and there are plenty of diagrams and photos of the layout for quick reference. Classic scancodes are too short (8-bit) and vary between keyboards from different manufacturers. One language independent index used by X11 can be seen below. (search for X11 keycode names)

My main issue is that using the QWERTY keys to specify a layout-independent keymap feels against the spirit of providing such an API. Something in the vein of that X11 index seems like a decent solution, though.

I am not sure when I would use ModifiersChanged event as modifier keys already send keydown and keyup events.

If users could always keep track of which modifier keys have been pressed with keydown and keyup events, we wouldn't need to expose a modifiers parameter at all. The reason we expose them is because if someone presses a modifier key outside of the window then focuses the window, or presses the modifier key inside the window and unfocuses the window, the key-down/key-up events won't be properly delivered.

The reason I was floating a separate ModifiersChanged event was so that we wouldn't have to expose a modifiers variable alongside pretty much every window event, as we do now. However, I realize now that it would be simpler from a user's standpoint to provide stronger guarantees about keypress events so that they can reliably keep track of which keys have been pressed without running into the pitfalls described above (such as, guaranteeing to deliver a KeyUp event for every KeyDown event or automatically sending KeyDown events for all pressed keys when a user focuses the window).
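
One of the guarantees floated above, a KeyUp for every KeyDown, could be sketched as follows. All names here are hypothetical, keys are modeled as plain `u32` values, and this is not how any winit backend is known to work.

```rust
use std::collections::BTreeSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KeyEvent {
    Down(u32),
    Up(u32),
}

/// Hypothetical library-side bookkeeping of keys that are currently down.
#[derive(Default)]
pub struct PressedKeys {
    held: BTreeSet<u32>,
}

impl PressedKeys {
    /// Record a real event coming from the platform.
    pub fn record(&mut self, ev: KeyEvent) {
        match ev {
            KeyEvent::Down(k) => {
                self.held.insert(k);
            }
            KeyEvent::Up(k) => {
                self.held.remove(&k);
            }
        }
    }

    /// On focus loss, produce a synthetic `Up` for every key still held,
    /// so the application always sees a matching release.
    pub fn release_all(&mut self) -> Vec<KeyEvent> {
        std::mem::take(&mut self.held)
            .into_iter()
            .map(KeyEvent::Up)
            .collect()
    }
}
```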

@Osspial
Contributor

Osspial commented Jan 8, 2019

Actually, regarding device-dependent virtual key-codes - what real purpose do they provide that isn't provided by exposing the received character and the device-independent key code? I can't think of a reason for using the virtual key-codes that isn't better-served by one of the other two methods; keyboard mappings should generally be done with the device-independent keys, and character input is best done with received character events.

@pyfisch
Author

pyfisch commented Jan 8, 2019

The UI Event Specification explains how keyboards work. It discusses why each part of the event is useful and how they relate to each other.

@pyfisch
Author

pyfisch commented Jan 8, 2019

The purpose of having Key-codes is to let the program figure out which keys have been pressed irrespective of any modifier-key presses

Looks like we are talking about different things then. You seem to associate the visual markings on the key cap with key codes.
While the UI Events specification and I refer to the functional mapping of the key.

If you want to access the character that's outputted, taking into account modifier keys, you check the received character.

What I propose is that the character that's outputted is the key. Received character is then redundant.

To match with separate key and location you can do this:

match event.key {
    Key::Home => ...
    Key::End => ...
    Key::Control if event.location == Left => ...
    Key::Control if event.location == Right => ...
    _ => ...
}

If a user does not care about key locations they don't have to
know they exist at all. On the other hand if key and location are one type
every user needs to know (or be told by the compiler)
which keys have multiple locations to write Key::Control(_i_dont_care).
(I expect this to be the common case.)
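
For reference, a compilable version of the guard-based match might look like this. The enums are hypothetical and cut down to the few variants used in the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Location { Standard, Left, Right, Numpad }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Key { Home, End, Control, Other }

pub struct KeyboardEvent {
    pub key: Key,
    pub location: Location,
}

pub fn describe(event: &KeyboardEvent) -> &'static str {
    match event.key {
        Key::Home => "home",
        Key::End => "end",
        Key::Control if event.location == Location::Left => "left control",
        Key::Control if event.location == Location::Right => "right control",
        // Reached by Control at any other location and by all other keys.
        _ => "unhandled",
    }
}

/// A caller that does not care about location never has to mention it.
pub fn is_control(event: &KeyboardEvent) -> bool {
    event.key == Key::Control
}
```

One trade-off of this style: the compiler cannot check that the guards cover every location, so the catch-all arm is required.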

@Osspial
Contributor

Osspial commented Jan 13, 2019

Looks like we are talking about different things then. You seem to associate the visual markings on the key cap with key codes. While the UI Events specification and I refer to the functional mapping of the key.

Not quite - if the user has switched their keyboard layout away from what's printed on their keyboard (say, to Dvorak) the key code would correspond to the remapped keybindings. Otherwise that seems fairly accurate.

What I propose is that the character that's outputted is the key. Received character is then redundant.

So, following the UI Events specification would have us mix character input and other keypresses (ctrl, alt, arrow keys, etc.) into a single API, right? I really don't like the idea of doing that. Having that API in addition to the physical key-press and character composition APIs leads to a situation where there's a lot of overlap for what each API does:

  • Functional key-press API (handles unicode characters most of the time and layout-agnostic key-presses for layout-agnostic keys)
  • Physical key-press API (layout-agnostic keypresses)
  • Character composition API (handles unicode characters sometimes, but only when they aren't tied to a single keypress)

The functional key-press API doesn't have its own specific purpose: sometimes it does things the physical keypress API does, and because it handles the majority of unicode input it makes the character composition API easy to ignore.

I'd rather only have two keyboard input APIs:

  • Physical key-press API (layout-agnostic keypresses)
  • Character input API (handles unicode characters and composition events)

Under this design, the purpose of each API is much more clear: the physical keypress API handles mapping each key to a function, and the character input API handles... well, all character input. Skimming through the UI Events spec it seems like it would be possible to map this API onto that, as well.

On the other hand if key and location are one type every user needs to know (or be told by the compiler) which keys have multiple locations to write Key::Control(_i_dont_care).

That's the point of merging those two events - to force users to decide whether they care or not. Whether you like that is up to personal preference, I guess; I like it because it improves the readability of the code (you know when someone's opting out of considering location vs. when there's no location to consider) and the documentation (we don't have to manually specify which keys have locations - if a key has a location, it's inherent to the declaration of the variant).

@pyfisch
Author

pyfisch commented Jan 13, 2019

I'd rather only have two keyboard input APIs:

  • Physical key-press API (layout-agnostic keypresses)
  • Character input API (handles unicode characters and composition events)

Fine. How do you handle keyboard shortcuts like Control+Z (for undo)? Keep in mind that the placement of the Z key varies across common layouts and reasonable people may move the functionality of the Control key to another physical key.

@Osspial
Contributor

Osspial commented Jan 13, 2019

How do you handle keyboard shortcuts like Control+Z (for undo)?

I... hmm.

That's something that crossed my mind briefly when I was first writing that comment, and I'll admit that that design doesn't handle this case well. Ideally, we'd be able to keep the same physical keymap across layouts (which is what you want for things like videogame keymaps), but that also leads to problems when other software developers haven't done that, causing our applications to violate those UX standards!

Something we could do is use the UIEvents-Code keycodes (or an equivalent), and structure keyboard events like this:

struct KeyboardInput {
    /// The pressed key, ignoring keyboard layout.
    ///
    /// Alphanumeric keys always correspond to their location on a QWERTY keyboard,
    /// regardless of whether or not the user is using an alternate keymap. For instance,
    /// pressing the Z key on a QWERTZ keyboard will result in `KeyCode::KeyY` getting
    /// sent. This also ignores any other remappings (e.g. even if the user has bound
    /// Control to Caps Lock, pressing the Caps Lock key will result in `KeyCode::CapsLock`.)
    ///
    /// This is useful for things like videogame keymaps, where the physical location of a
    /// key is more important than the actual key being pressed.
    physical_key: KeyCode,
    /// The pressed key, taking keyboard layout into account.
    ///
    /// If the user is using an alternate keyboard layout or has remapped any of their keys,
    /// their preferred mappings will be sent. Unlike `physical_key`, pressing Z on a QWERTZ
    /// keyboard will output `KeyCode::KeyZ`, and rebound keys as mentioned above will output
    /// the rebound key.
    ///
    /// This is useful for desktop application keymaps, where maintaining keybinding
    /// consistency with other applications is more important than the exact location of the
    /// key pressed.
    logical_key: KeyCode,
    /* other fields intentionally omitted */
}

EDIT: I have physical_key and logical_key using the same type to make it clear that they both have the same underlying variants. We may want to split them into separate types with the same internal layout, as we've done with DPI types, but that decision isn't important for establishing whether or not this general API is a good idea.
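
As a rough illustration of how the two fields would be used, here is a sketch with a cut-down `KeyCode`. The helper functions and the `control_held` parameter are assumptions made for the example, not part of the proposal.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KeyCode { KeyW, KeyY, KeyZ }

#[derive(Debug, Clone, Copy)]
pub struct KeyboardInput {
    pub physical_key: KeyCode,
    pub logical_key: KeyCode,
}

/// Desktop-style shortcut: follows the user's layout via `logical_key`.
pub fn is_undo(input: &KeyboardInput, control_held: bool) -> bool {
    control_held && input.logical_key == KeyCode::KeyZ
}

/// Game-style binding: follows the key position via `physical_key`.
pub fn is_move_forward(input: &KeyboardInput) -> bool {
    input.physical_key == KeyCode::KeyW
}
```

Pressing the key labelled Z on a QWERTZ keyboard gives `physical_key: KeyY` and `logical_key: KeyZ`, so `is_undo` fires on the key the user expects.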

@pyfisch
Author

pyfisch commented Jan 13, 2019

Well this design is a lot better.

What happens if I want to detect the "Page Up" key on my numpad? If "Num Lock" is on I want to receive the character "9" instead.

@Osspial
Contributor

Osspial commented Jan 13, 2019

There are two ways I can think of to do that:

  • Pass a location parameter alongside all keys that appear both on the numpad and elsewhere on the keyboard.
  • Always deliver the same keys regardless of whether or not numlock is pressed, and let the application handle translating them into special keys.

My feeling is that we should take the first approach for logical_key and the second approach for physical_key. That sacrifices some consistency across the two input methods, but it also matches better with their respective stated goals.

@pyfisch
Author

pyfisch commented Jan 14, 2019

Pass a location parameter alongside all keys that appear both on the numpad and elsewhere on the keyboard.

I understand that if I press the "Page Up" key I will get a logical_key of PageUp(Standard) and if I press "Page Up" on the numpad I get PageUp(Numpad). Is this correct? But if "Num Lock" is active I will instead receive "9". So some logical keys now depend on the modifiers present?

  • What is the logical_key value for keys not found on un-shifted US keyboards?
  • What is the correct way to detect that a user pressed ":" (colon) for vim-style controls?

@Osspial
Contributor

Osspial commented Jan 14, 2019

I understand that if I press the "Page Up" key I will get a logical_key of PageUp(Standard) and if I press "Page Up" on the numpad I get PageUp(Numpad). Is this correct? But if "Num Lock" is active I will instead receive "9". So some logical keys now depend on the modifiers present?

That is correct. I realize that this may be inconsistent with my stance on the alphanumeric keys, but it feels like there's a difference here since enabling/disabling numlock fundamentally changes how those keys interact with applications, rather than just outputting a different variation of a character.
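
The behavior agreed on here could be sketched like this. The variant names, and in particular `Digit9(Location::Numpad)` as the Num Lock result, are assumptions made for the example; the thread only says that "9" is received.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Location { Standard, Numpad }

/// Physical keys: the same value regardless of Num Lock.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PhysicalKey { Numpad9, PageUp }

/// Logical keys: carry a location where a key exists in several places.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LogicalKey {
    Digit9(Location),
    PageUp(Location),
}

/// Hypothetical translation from physical to logical key.
pub fn logical_key(physical: PhysicalKey, num_lock: bool) -> LogicalKey {
    match (physical, num_lock) {
        // The dedicated Page Up key is unaffected by Num Lock.
        (PhysicalKey::PageUp, _) => LogicalKey::PageUp(Location::Standard),
        // The numpad key changes meaning depending on Num Lock.
        (PhysicalKey::Numpad9, true) => LogicalKey::Digit9(Location::Numpad),
        (PhysicalKey::Numpad9, false) => LogicalKey::PageUp(Location::Numpad),
    }
}
```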

What is the logical_key value for keys not found on un-shifted US keyboards?

You're talking about these sorts of keys, right?

[image: keyboard diagram with the keys in question marked in red]

For those, I'd use the Intl**** codes from the UIEvents-Code spec. Honestly, I'd lean towards replacing some of the standard US Keyboard values in that block with more international codes, seeing as there's a pretty wide range in what different keyboard locales put on those keys.

What is the correct way to detect that a user pressed ":" (colon) for vim-style controls?

Because vim mainly uses character input for its controls, I'd say to use the character input API.

@pyfisch
Author

pyfisch commented Jan 16, 2019

Yes the keys marked red. But also those found on keyboards for non Latin scripts.

I understand that you want to use codes from the UIEvents-Code spec for the logical_key values. But these codes are almost arbitrary names to describe keys with a shared location but widely varying functions. I don't know when I would want to use those key values.


I don't think we can reach a consensus on keyboard events. You appear to prefer an API with just a physical location value and a separate API for character input. You made some additions to the keyboard API but it feels rather crude now and heavily relies on the assumption that you know every keyboard layout in existence and can predict how it will be used. (fixed number of key values, how does a numpad work, ...) I especially disagree with not providing an API for shifted keyboard symbols. This is available across Windows, Linux, Mac OS, but you prefer to only expose character data.

I would recommend that if winit changes its keyboard API it copies one from an existing system and does not try to have a unique variant.

Something we appear to agree on, is that there should be a code for physical keyboard locations. Maybe we can add this to the existing API?

@Osspial
Contributor

Osspial commented Jan 16, 2019

I don't think we can reach a consensus on keyboard events. You appear to prefer an API with just a physical location value and a separate API for character input.

To be clear: I'd like character input to be delivered alongside the physical_key and logical_key values in the same event, just not expose the key as character input. Ideally, you'd have a keyboard input event structured like this:

struct InputEvent {
    keyboard_event: Option<KeyboardEvent>,
    composition_input: Option<CompositionEvent>,
}

struct KeyboardEvent {
    physical_key: PhysicalKey,
    logical_key: LogicalKey,
    key_state: ElementState,
}

enum CompositionEvent {
    Char(char),
    CompositionStart(String),
    CompositionUpdate(String),
    CompositionEnd(String),
}

That general structure associates character input with keyboard input, but exposes them as two separate things.
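
With that structure, getting "just the text" out of a stream of events might look like the sketch below. It assumes that `CompositionEnd` carries the final committed text and that `CompositionStart` and `CompositionUpdate` only describe in-progress text; `committed_text` is a hypothetical helper name.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CompositionEvent {
    Char(char),
    CompositionStart(String),
    CompositionUpdate(String),
    CompositionEnd(String),
}

/// Collects the text a user committed, ignoring in-progress composition.
pub fn committed_text<'a, I>(events: I) -> String
where
    I: IntoIterator<Item = &'a CompositionEvent>,
{
    let mut text = String::new();
    for ev in events {
        match ev {
            CompositionEvent::Char(c) => text.push(*c),
            // Assumption: the end event holds the final composed string.
            CompositionEvent::CompositionEnd(s) => text.push_str(s),
            // In-progress text is shown by the editor but not committed.
            CompositionEvent::CompositionStart(_)
            | CompositionEvent::CompositionUpdate(_) => {}
        }
    }
    text
}
```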

I'm not comfortable with exposing character input events and keyboard input events through the same enumeration (i.e. having enum Key {UnicodeKey(String), /*everything else*/}) for a couple of reasons: one, it creates an unnecessary stumbling block when creating keyboard shortcuts. Two, it hurts internationalization of keybindings.

About keyboard shortcuts: let's say that we exposed a Key enumeration similar to what's shown in the above paragraph, with UnicodeKey exposing shifted unicode values (as far as I understand, that's the structure you proposed initially). If somebody wanted to have control+z be a shortcut for undo, they might write this code:

match (key, modifiers) {
    (Key::UnicodeKey('Z'), Modifiers { control: true, alt: false, shift: false, logo: false })
        => { /* whatever undo stuff */ }
    _ => (),
}

The issue there is, because they're matching on Z and not z, that whole undo branch becomes dead code. It's not obviously dead code; there's no way for us to make the compiler warn about it, and it doesn't seem immediately unreasonable, but it's the sort of API design that leads to developers banging their head against our library wondering why code that they'd think should work doesn't.
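
The pitfall is easy to reproduce in isolation. In this sketch the key value is modeled as a plain `&str` and both function names are hypothetical; the second function shows the kind of ASCII-case-insensitive comparison mentioned earlier in the thread as one possible mitigation.

```rust
/// Exact comparison: with Control held and Shift not held, the key value
/// is "z", so this check never succeeds for a plain Control+Z.
pub fn is_undo_exact(key: &str, control: bool) -> bool {
    control && key == "Z"
}

/// Ignoring ASCII case accepts both "z" and "Z".
pub fn is_undo_ignore_case(key: &str, control: bool) -> bool {
    control && key.eq_ignore_ascii_case("z")
}
```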

Regarding the second point: if a developer with a Latin-script keyboard creates a layout that associates 'a' with an action, and a Russian user (or some other user with a non-Latin keyboard) has a keyboard that doesn't output 'a' without some form of shifting, the non-Latin keyboard will in the best case have keybindings that require extra shifting to function; worst-case, the keybindings won't work at all. Conversely, non-Latin keybindings won't work on Latin keyboards, and an action bound to 'Б' will only work in a select few locales.

Neither of those are API compromises that I'm willing to accept. That's why I don't want to adopt the UI Events API verbatim - I think it's fundamentally flawed in ways that aren't obvious, but concretely harm both users and developers.


One thing that I haven't said but probably should've mentioned sooner: I'm in favor of having a mechanism for translating between our internal key enumeration and the default character output for the keyboard's layout. The intention would be to have a standardized internal structure for keyboard input and then display to the user whatever key value is associated with each particular key for their keyboard layout. I'm sorry I hadn't communicated that before - it's something that was in my head as a given, but seeing as I never wrote it down there's no way you would know that 😅.
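One possible shape for such a translation mechanism is sketched here. `KeyCode`, `Layout`, `default_char`, and `shortcut_label` are all assumed names, and the hard-coded table stands in for whatever the platform would actually report for the active layout.

```rust
// Hypothetical sketch; a real implementation would query the platform's layout.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum KeyCode {
    KeyZ, // the key where QWERTY has 'z'
    KeyQ, // the key where QWERTY has 'q'
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Layout {
    UsQwerty,
    FrenchAzerty,
    RussianJcuken,
}

/// Default (unshifted) character for an internal key code under a layout.
fn default_char(layout: Layout, key: KeyCode) -> char {
    match (layout, key) {
        (Layout::UsQwerty, KeyCode::KeyZ) => 'z',
        (Layout::UsQwerty, KeyCode::KeyQ) => 'q',
        (Layout::FrenchAzerty, KeyCode::KeyZ) => 'w',
        (Layout::FrenchAzerty, KeyCode::KeyQ) => 'a',
        (Layout::RussianJcuken, KeyCode::KeyZ) => 'я',
        (Layout::RussianJcuken, KeyCode::KeyQ) => 'й',
    }
}

/// Text to show the user for a Control shortcut stored as an internal key code.
fn shortcut_label(layout: Layout, key: KeyCode) -> String {
    let upper: String = default_char(layout, key).to_uppercase().collect();
    format!("Ctrl+{}", upper)
}
```

The application stores the binding once as `KeyCode::KeyZ`, and only the displayed label changes with the layout: "Ctrl+Z" on US QWERTY, "Ctrl+W" on French AZERTY, and "Ctrl+Я" on Russian.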

Yes, the keys marked red. But also those found on keyboards for non-Latin scripts.
I understand that you want to use codes from the UIEvents-Code spec for the logical_key values. But these codes are almost arbitrary names to describe keys with a shared location but widely varying functions. I don't know when I would want to use those key values.

Hey, you've gotta have some sort of arbitrary code. QWERTY just happens to be one that isn't arbitrary for a large portion of the world.

I mentioned possibly using some index-based system above, but I've since changed my mind on that. All the foreign-script keyboards I've seen from googling have also had QWERTY markings alongside their non-Latin characters, and if you're programming in Rust you need to have some amount of familiarity with a Latin keyboard to even start using the language.

You made some additions to the keyboard API but it feels rather crude now and heavily relies on the assumption that you know every keyboard layout in existence and can predict how it will be used. (fixed number of key values, how does a numpad work, ...)

How are those unreasonable assumptions to make? From the research I've done, the only differences between keyboard layouts are:

  • Which characters get bound to the alphanumeric keys.
  • If they add any other, miscellaneous keys appropriate for that locale.

There are a limited number of "other, miscellaneous keys"; certainly few enough that we can expose them through a well-formed enum.

As far as assuming how a numpad works: it's a standard that keyboard manufacturers have settled on, and it seems to be standard across every keyboard that has a numpad. If we're making an abstraction we have to make assumptions somewhere, and there's nothing unreasonable about assuming this.

I especially disagree with not providing an API for shifted keyboard symbols. This is available across Windows, Linux, Mac OS, but you prefer to only expose character data.

What's the difference between exposing character input and shifted symbols? I've been working under the assumption that they're the same thing, but you're saying here that they're not; we may be talking about two different things here.

Something we appear to agree on is that there should be a code for physical keyboard locations. Maybe we can add this to the existing API?

Yes, but I think we can go further with more comprehensive improvements. Like I've said elsewhere - I think that most of the ideas behind your proposal are good, I just don't agree with some of the specifics of how things should get exposed.

@inodentry

Feel free to ping me if you need any testing on Wayland and XWayland. 🙂 I am a multilingual user with keyboard layouts (incl. custom ones) for different scripts.

@mahkoh
Contributor

mahkoh commented Oct 21, 2021

Afaict the API as discussed here does not support multiple seats, which makes it cumbersome, if not impossible, to implement on platforms that support multiple seats. E.g. consider the following sequence of events:

  • Seat 1: Press Alt
  • Seat 1: Press A
  • Seat 2: Press Shift
  • Seat 2: Press B
  • Seat 1: Press C

I'm not sure which sequence of events this would generate with this API.
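For comparison, here is a sketch of how that sequence resolves if every event carries a seat identifier and modifier state is tracked per seat. `SeatId`, `Key`, `Modifiers`, and `resolve` are hypothetical; nothing like this is part of the API discussed in this thread.

```rust
use std::collections::HashMap;

// Hypothetical types for illustration; not part of the proposed API.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct SeatId(u32);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Key {
    Alt,
    Shift,
    Char(char),
}

#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
struct Modifiers {
    alt: bool,
    shift: bool,
}

/// Resolves each character press against the modifiers held on its own seat.
/// Only presses are modeled; key releases are left out to keep the sketch short.
fn resolve(presses: &[(SeatId, Key)]) -> Vec<(SeatId, char, Modifiers)> {
    let mut per_seat: HashMap<SeatId, Modifiers> = HashMap::new();
    let mut resolved = Vec::new();
    for &(seat, key) in presses {
        // Each seat gets its own modifier state, created on first use.
        let mods = per_seat.entry(seat).or_default();
        match key {
            Key::Alt => mods.alt = true,
            Key::Shift => mods.shift = true,
            Key::Char(c) => resolved.push((seat, c, *mods)),
        }
    }
    resolved
}
```

Fed the five presses listed above, this yields A with Alt on seat 1, B with Shift on seat 2, and C with Alt on seat 1. A single global modifier state would instead have to choose between reporting C with Alt+Shift or dropping one seat's modifiers.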

@dhardy
Contributor

dhardy commented Oct 21, 2021

I'm pretty sure a winit app would be restricted to one seat and only report events from that. It's not like the apps on Seat 1 should know what is going on at Seat 2.

@mahkoh
Contributor

mahkoh commented Oct 21, 2021

I'm pretty sure a winit app would be restricted to one seat and only report events from that.

That seems like a significant restriction. Has this already been discussed elsewhere?

@mahkoh
Contributor

mahkoh commented Oct 23, 2021

It's not like the apps on Seat 1 should know what is going on at Seat 2.

I missed this when I read your comment the first time. I think there is a misunderstanding about what a seat is. A seat is simply a mouse and/or keyboard set. If you have multiple independent sets, then you have multiple seats. Each seat is represented on the screen by its own cursor.

Iirc on Wayland it is standard that each tablet connected to the PC is its own seat. This means that you have to support multiple seats to fully support tablet input on Wayland.

Another example of multiple seats is VNC applications that represent the remote mouse/keyboard as a separate seat, so that the remote user doesn't have to fight with the local user over control of the cursor.

@veryjos

veryjos commented Nov 25, 2021

This issue is huge and a few years old, so I'm having trouble tracking it.

Is the key repeat solution proposed in the original issue available yet? I can't find anything in the source.

@ArturKovacs
Contributor

I think this will be very helpful to you: #1806

To answer your question, nothing is merged into master yet but the implementations are mostly done actually. See the tracking issue I referenced above.

@kchibisov
Member

#2662 was merged, addressing most of this. For follow-ups, I'd suggest opening separate issues.
