-
Notifications
You must be signed in to change notification settings - Fork 913
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A keyboard input model #753
Comments
Hi, and thanks for taking the time to put this together! Overall, I like the direction this is going, but there are some specific feedback points that come up for this.
Being more general on this would be a good change. I don't like using a full
Unfortunately I can't think of a good replacement that's as flexible as a string while accounting for both of those issues, but it's something that rubs me the wrong way.
I like the idea of having a left/right enum to distinguish between sided keys. However, the
Making scan codes platform-independent is certainly something we should do, although the W3C Whatever mechanism we decide on, there should be some method for translating between
I'd like to leave the hardware-handled keys out of our "officially supported" keys, but this would be a good change. We may also want to create a separate
I'm down with all of these changes. |
Hi, thanks for taking the time to review this!
I couldn't agree more and would have preferred to use Because matching strings is so painful I wrote the
One way to think about keyboard keys is that they have multiple levels. For example the "M" key on my keyboard has four levels that are accessed with different modifier keys: "m", "M", "µ", "º". These values should be a different
Is there a specific reason to do it this way?
One upside of using names referring to the US keyboard layout is that this layout is already familiar to a lot of people and there are plenty of diagrams and photos of the layout for quick reference. Classic scancodes are too short (8-bit) and vary between keyboards from different manufacturers. One language independent index used by X11 can be seen below. (search for X11 keycode names)
Yeah, there should be a list of supported modifiers for each platform in the docs.
I am not sure when I would use |
There's a solution for using let string: String = "Hello".to_string();
// Construct a 'static string at runtime.
let x: &'static str = Box::leak(string.into_boxed_str()); That would let us pass
The purpose of having
Mainly, to make matching more ergonomic. If you wanted to match on both location and key with the types being separate, you'd have to do this: match (key, location) {
(Key::A, _) => (),
(Key::B, _) => (),
(Key::C, _) => (),
(Key::Alt, _) => (),
(Key::Ctrl, Location::Left) => (),
(Key::Ctrl, Location::Right) => (),
_ => ()
} With them combined into one type, it looks like this: match key {
Key::A => (),
Key::B => (),
Key::C => (),
Key::Alt(_) => (),
Key::Ctrl(Location::Left) => (),
Key::Ctrl(Location::Right) => (),
_ => ()
} The second version is nicer to read, and it also lets the reader know when a key's specific location is being ignored, versus when a key only has one possible location. The first version doesn't communicate that information. Regarding adding a location to an existing key being a breaking change - there shouldn't be any reason we ever have to do that! Keyboard layouts are fairly static, and only a limited subset of keys are going to have multiple locations on the keyboard. We should be able to keep track of which ones have multiple locations and structure the enum as necessary.
My main issue with using the QWERTY keys to specify a layout-independent keymap feels against the spirit of providing such an API. Something in the vein of that X11 index seems like a decent solution, though.
If users could always keep track of which modifier keys have been pressed with keydown and keyup events, we wouldn't need to expose a modifiers parameter at all. The reason we expose them is because if someone presses a modifier key outside of the window then focuses the window, or presses the modifier key inside the window and unfocuses the window, the key-down/key-up events won't be properly delivered. The reason I was floating a separate |
Actually, regarding device-dependent virtual key-codes - what real purpose do they provide that isn't provided by exposing the received character and the device-independent key code? I can't think of a reason for using the virtual key-codes that isn't better-served by one of the other two methods; keyboard mappings should generally be done with the device-independent keys, and character input is best done with received character events. |
The UI Event Specification explains how keyboards work. It discusses why each part of the event is useful and how they relate to each other. |
Looks like we are talking about different things then. You seem to associate the visual markings on the key cap with key codes.
What I propose is that the character that's outputted is the key. Received character is then redundant. To match with separate
If a user does not care about key locations they don't have to |
Not quite - if the user has switched their keyboard layout away from what's printed on their keyboard (say, to Dvorak) the key code would correspond to the remapped keybindings. Otherwise that seems fairly accurate.
So, following the UI Events specification would have us mix character input and other keypresses (ctrl, alt, arrow keys, etc.) into a single API, right? I really don't like the idea of doing that. Having that API in addition to the physical key-press and character composition APIs leads to a situation where there's a lot of overlap for what each API does:
The functional key-press API doesn't have its own specific purpose: sometimes it does things the physical keypress API does, and because it handles the majority of unicode input it make the character composition API easy to ignore. I'd rather only have two keyboard input APIs:
Under this design, the purpose of each API is much more clear: the physical keypress API handles mapping each key to a function, and the character input API handles... well, all character input. Skimming through the UI Events spec it seems like it would be possible to map this API onto that, as well.
That's the point of merging those two events - to force users to decide whether they care or not. Whether you like that is up to personal preference, I guess; I like it because it improves the readability of the code (you know when someone's opting out of considering location vs. when there's no location to consider) and the documentation (we don't have to manually specify which keys have locations - if a key has a location, it's inherent to the declaration of the variant). |
Fine. How do you handle keyboard shortcuts like Control+Z (for undo)? Keep in mind that the placement of the Z key varies across common layouts and reasonable people may move the functionality of the Control key to another physical key. |
I... hmm. That's something that crossed my mind briefly when I was first writing that comment, and I'll admit that that design doesn't handle this case well. Ideally, we'd be able to keep the same physical keymap across layouts (which is what you want for things like videogame keymaps), but that also leads to problems when other software developers haven't done that, causing our applications to violate those UX standards! Something we could do is use the UIEvents-Code keycodes (or an equivalent), and structure keyboard events like this: struct KeyboardInput {
/// The pressed key, ignoring keyboard layout.
///
/// Alphanumeric keys always correspond to their location on a QWERTY keyboard,
/// regardless of whether or not the user is using an alternate keymap. For instance,
/// pressing the Z key on a QWERTZ keyboard will result in `KeyCode::KeyY` getting
/// sent. This also ignores any other remappings (e.g. even if the user has bound
/// Control to Caps Lock, pressing the Caps Lock key will result in `KeyCode::CapsLock`.)
///
/// This is useful for things like videogame keymaps, where the physical location of a
/// key is more important than the actual key being pressed.
physical_key: KeyCode,
/// The pressed key, taking keyboard layout into account.
///
/// If the user is using an alternate keyboard layout or have remapped any of their keys,
/// their preferred mappings will be sent. Unlike `physical_key`, pressing Z on a QWERTZ
/// keyboard will output `KeyCode::KeyZ`, and rebound keys as mentioned above will output
/// the rebound key.
///
/// This is useful for desktop application keymaps, where maintaining keybinding
/// consistency with other applications is more important than the exact location of the
/// key pressed.
logical_key: KeyCode,
/* other fields intentionally omitted */
} EDIT: I have |
Well this design is a lot better. What happens if I want to detect the "Page Up" key on my numpad? If "Num Lock" is on I want to receive the character "9" instead. |
There are two ways I can think of to do that:
My feeling is that we should take the first approach for |
I understand that if I press the "Page Up" key I will get a
|
That is correct. I realize that this may be inconsistent with my stance on the alphanumeric keys, but it feels like there's a difference here since enabling/disabling numlock fundamentally changes how those keys interact with applications, rather than just outputting a different variation of a character.
You're talking about these sorts of keys, right? For those, I'd use the
Because vim mainly uses character input for its controls, I'd say to use the character input API. |
Yes the keys marked red. But also those found on keyboards for non Latin scripts. I understand that you want to use codes from the I don't think we can reach a consensus on keyboard events. You appear to prefer an API with just a physical location value and a separate API for character input. You made some additions to the keyboard API but it feels rather crude now and heavily relies on the assumption that you know every keyboard layout in existence and can predict how it will be used. (fixed number of key values, how does a numpad work, ...) I especially disagree with not providing an API for shifted keyboard symbols. This is available across Windows, Linux, Mac OS, but you prefer to only expose character data. I would recommend that if winit changes its keyboard API it copies one from an existing system and does not try to have a unique variant. Something we appear to agree on, is that there should be a code for physical keyboard locations. Maybe we can add this to the existing API? |
To be clear: I'd like character input to be delivered alongside the struct InputEvent {
keyboard_event: Option<KeyboardEvent>,
composition_input: Option<CompositionEvent>,
}
struct KeyboardEvent {
physical_key: PhysicalKey,
logical_key: LogicalKey,
key_state: ElementState,
}
enum CompositionEvent {
Char(char),
CompositionStart(String),
CompositionUpdate(String),
CompositionEnd(String),
} That general structure associates character input with keyboard input, but exposes them as two separate things. I'm not comfortable with exposing character input events and keyboard input events through the same enumeration (i.e. having About keyboard shortcuts: let's say that we exposed a match (key, modifiers) {
(Key::UnicodeKey('Z'), Modifiers{ control: true, alt: false, shift: false, logo: false})
=> /*whatever undo stuff*/,
_
} The issue there is, because they're Regarding the second point: if a developer with a Latin-script keyboard creates a layout that associates Neither of those are API compromises that I'm willing to accept. That's why I don't want to adopt the UI Events API verbatim - I think it's fundamentally flawed in ways that aren't obvious, but concretely harm both users and developers. One thing that I haven't said but probably should've mentioned sooner: I'm in favor of having a mechanism for translating between our internal key enumeration and the default character output for the keyboard's layout. The intention would be to have a standardized internal structure for keyboard input and then display to the user whatever key value is associated with each particular key for their keyboard layout. I'm sorry I hadn't communicated that before - it's something that was in my head as a given, but seeing as I never wrote it down there's no way you would know that 😅.
Hey, you've gotta have some sort of arbitrary code. QWERTY just happens to be one that isn't arbitrary for a large portion of the world. I mentioned possibly using some index-based system above, but I've since changed my mind on that. All the foreign-script keyboards I've seen from googling have also had QWERTY markings alongside their non-Latin characters, and if you're programming in Rust you need to have some amount of familiarity with a Latin keyboard to even start using the language.
How are those unreasonable assumptions to make? From the research I've done, the only difference in keyboard layouts are:
There are a limited number of "other, miscellaneous keys"; certainly few enough that we can expose them through a well-formed As far as assuming how a numpad works: it's a standard that keyboard manufacturers have settled on, and it seems to be standard across every keyboard that has a numpad. If we're making an abstraction we have to make assumptions somewhere, and there's nothing unreasonable about assuming this.
What's the difference between exposing character input and shifted symbols? I've been working under the assumption that they're the same thing, but you're saying here that they're not; we may be talking about two different things here.
Yes, but I think we can go further with more comprehensive improvements. Like I've said elsewhere - I think that most of the ideas behind your proposal are good, I just don't agree with some of the specifics of how things should get exposed. |
Feel free to ping me if you need any testing on Wayland and XWayland. 🙂 I am a multilingual user with keyboard layouts (incl. custom ones) for different scripts. |
Afaict the API as discussed here does not support multiple seats which makes it
I'm not sure which sequence of events this would generate with this API. |
I'm pretty sure a |
That seems like a significant restriction. Has this already been discussed elsewhere? |
I missed this when I read your comment the first time. I think there is a misunderstanding what a seat is. A seat is simply a mouse and/or keyboard set. If you have multiple independent sets, then you have multiple seats. Each seat is represented on the screen by its own cursor. Iirc on wayland it is standard that each tablet connected to the PC is its own seat. This means that you have to support multiple seats to fully support tablet input on wayland. Another example of multiple seats are VNC applications that represent the remote mouse/keyboard as a separate seat so that the remote user doesn't have to fight with the local user over control of the cursor. |
This issue is huge and a few years old, I'm having trouble tracking.. Is the key repeat solution proposed in the original issue available yet? I can't find anything in the source. |
I think this will be very helpful to you: #1806 To answer your question, nothing is merged into master yet but the implementations are mostly done actually. See the tracking issue I referenced above. |
The #2662 was merged addressing most of this. For follow ups I'd suggest to open separate issues. |
Winit is used for many applications that need
to handle different kinds of keyboard input.
WASD for movement and actions.
Text inpput for names and chat.
KeyboardEvent
well.Currently there are two events for text input in Winit:
KeyboardInput
andReceivedCharacter
.The
KeyboardInput
event carries information about keys pressed and released.scancode
is a platform-dependent code identifying the physical key.virtual_keycode
optionally describes the meaning of the key.It indicates ASCII letters, some punctuation and some function keys.
modifiers
tells if the Shift, Control, Alt and Logo keys are currently pressed.The
ReceivedCharacter
event sends a single Unicode codepoint. The character canbe pushed to the end of a string and if this is done for all events the user
will see the text they intended to enter.
Shortcomings
This is my personal list in no particular order.
VirtualKeyCode
is seen as incomplete (EnableLess
andGreater
Keys on X11 #71, Support keypress events with non-ascii chars #59).Without a given list it is hard to decide which keys to include
and when the list is complete.
Also it is necessary to define each virtual key code so multiple platforms will
map keys to the same virtual key codes.
While it probably uncontroversial that ASCII keys should be included
for non-ASCII single keys found on many keyboards like é, µ, or ü
it is more difficult to decide and to create an exhaustive list.
VirtualKeyCode
should capture the meaning of the key thereare different codes for e.g. "0":
Key0
andNumpad0
orLControl
andRControl
.ScanCode
is platform dependent. Therefore apps wanting to use keys likeWASD for navigation will assume an QWERTY layout instead of
using the key locations.
act on the first keypress and ignore all following repeated keys. Right
now these applications need to do extra tracking and are probably not
correct if the keyboard focus changes while a key is held down. ( A way to disable key repeats #310)
ReceivedCharacter
andKeyboardInput
events. While this is not necessary for every application some
(like browsers) need it and have to use ugly (and incorrect) work-arounds. (Associate received characters with key inputs #34)
In general there are many issues that are platform-dependant and where it is
unclear what the correct behavior is or it is not documented.
Both alacritty and Servo just to name two applications have multiple
issues where people mention that keyboard input does not work as expeced.
Proposed Solution
Winit is not the first software that needs to deal with keyboard input on
a variety of platforms. In particular the web platform has a complete
specification how keyboard events should behave which is implemented on
all platforms that Winit aims to support.
While the specification talks about JS objects it can be easily ported
to Rust. Some information is duplicated in
KeyboardEvent
forbackwards compatibility but this can be omitted in Rust so Winit stays simpler.
See the keyboard-types for how keyboard events can look like in Rust.
VirtualKeyCode
is replaced with aKey
. This is an enumwith all the values for functional keys and a variant for Unicode values
that stores printable characters both from the whole Unicode range.
Specification
like "Control" but if necessary one can distinguish left/right or
keyboard/numpad keys by their location attribute.
ScanCode
is complemented byCode
. Codes describephysical key locations in a cross-platform way.
Specification
repeat
attribute is added.**Specification
Note: W3C decided to include some keys that are usually handled
in hardware and don't emit keyboard events (like
Fn
,FnLock
)(exceptions see below)
is introduced. It describes the text that should be added at
the current cursor position. Specification
Note: The introduction composition events makes it a bit harder to
get "just the text" which is currently emitted by
ReceivedCharacter
.Either
ReceivedCharacter
is kept around for easier use or a utilityfunction is provided that takes keyboard and composition events and
emits the printable text.
Implementation
This is obviously a breaking change so there needs to be a new release of winit and release notes.
While the proposed events are very expressive it is possible to convert Winit to the new
events first and then improve each backend to emit the additional information about key-codes,
locations, repeating keys etc.
Thank you for writing and maintaining Winit! I hope this helps to get a discussion about keyboard input handling started and maybe some ideas or even the whole proposal is implemented in Winit.
The text was updated successfully, but these errors were encountered: