How to represent screen-reader specific modifier keys with interaction.pressKeys #34

Open
zcorpan opened this issue Nov 7, 2022 · 6 comments
Labels
Deprioritised: Work that is deprioritised for now

Comments

@zcorpan
Member

zcorpan commented Nov 7, 2022

From #26

Include a special modifier for the screen-reader specific modifier keys.

This is not done yet. How do we want to support this? Should there be a special string for each vendor in place of the raw key string, e.g. "nvda", "macOS VoiceOver", etc.?

{
  "method": "interaction.pressKeys",
  "params": {
    "keys": ["nvda", "a"]
  }
}

Or should there be a boolean property in InteractionPressKeysParameters to indicate that the screen-reader-specific modifier keys should be pressed?

{
  "method": "interaction.pressKeys",
  "params": {
    "keys": ["a"],
    "vendorModifier": true
  }
}
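
To compare the two shapes side by side, here is a minimal Python sketch of how an implementation might normalize either request into the same internal form. The `VENDOR_TOKENS` set, the `normalize_press_keys` function, and its return value are hypothetical; only `keys` and the proposed `vendorModifier` property come from the drafts above.

```python
from typing import Any

# Hypothetical vendor strings for the first shape; the issue only suggests
# "nvda" and "macOS VoiceOver" as examples.
VENDOR_TOKENS = {"nvda", "macOS VoiceOver"}


def normalize_press_keys(params: dict[str, Any]) -> tuple[bool, list[str]]:
    """Return (uses_vendor_modifier, raw_keys) for either proposed shape."""
    keys = list(params["keys"])
    # Second shape: an explicit boolean flag alongside the raw keys.
    uses_modifier = bool(params.get("vendorModifier", False))
    # First shape: a vendor string takes the place of a raw key string.
    raw_keys = [key for key in keys if key not in VENDOR_TOKENS]
    if len(raw_keys) != len(keys):
        uses_modifier = True
    return uses_modifier, raw_keys


shape_one = {"keys": ["nvda", "a"]}
shape_two = {"keys": ["a"], "vendorModifier": True}
print(normalize_press_keys(shape_one))
print(normalize_press_keys(shape_two))
```

In this sketch both requests reduce to the same result, but the first shape names a vendor in the request itself while the second stays vendor-neutral.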

Originally posted by @zcorpan in #26 (comment)

@zcorpan changed the title from "How to represent modifier keys with interaction.pressKeys" to "How to represent screen-reader specific modifier keys with interaction.pressKeys" on Nov 7, 2022
@zcorpan zcorpan mentioned this issue Nov 7, 2022
@jugglinmike
Contributor

@jscholes This feature originates from our discussion on 2022-07-07, but I can't remember the motivating use case. Using "interaction.pressKeys" requires writing vendor-specific instructions, and an abstraction for a vendor's modifier key doesn't seem to change that. Could you say a bit about what this feature would enable?

@zcorpan
Member Author

zcorpan commented Nov 23, 2022

I think an important aspect is that the special modifier key is configurable for some screen readers.

@jugglinmike
Contributor

Good point! That means the "meta" key would be contextual not just to the screen reader under test but also to its internal state.

To put this in terms of use cases: is this feature about letting folks control screen readers where the configuration is unknown?

@jscholes

@jugglinmike Not sure I fully understand the ask/context here, but I'll give it a go.

  • Most (all?) screen readers support a modifier key, or set of keys, to carry out functions specific to that screen reader. Some, like VoiceOver, make very heavy use of these modifiers in most commands.
  • ARIA-AT tests use these keys where appropriate. For example, almost all commands targeting VoiceOver will include the VO modifiers, whereas only some commands targeting JAWS and NVDA require a modifier.
  • The VoiceOver modifiers, by default, are Control+Option. These are standard, system-wide modifier keys, possible to represent in most systems that facilitate programmatic keyboard simulation (e.g. they are registered as modifiers in HID standards).
  • JAWS and NVDA do not use system modifiers. They use Insert, Caps Lock, and/or Numpad Insert as a single modifier key, all of which are usually considered standard keys in keyboard simulation software. E.g. if you were sending a bitfield of modifiers, Insert wouldn't be part of it.
  • It is also possible, on macOS, to configure VoiceOver to use Caps Lock as a modifier, for users who find a single key easier/more convenient to hold down. I'm not sure if this is the default, or has to be turned on; it's been a while since I set up my Mac.

With all of this context in mind, VoiceOver may already be well covered by any standard allowing a modifiers field to be included, if it includes Control and Option in its definition. But on Windows, this would not be the case, and hence some "special" handling seems required. Maybe we could just abstract the details away to a single screen reader/meta key (although "meta" has connotations), and behind the scenes, each implementation can respond appropriately?
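
As a rough illustration of the "single screen reader key" idea, the following Python sketch resolves an abstract key to concrete keys behind the scenes. The key name `ScreenReaderModifier`, the `resolve_keys` function, and the `configured_modifier` argument are all hypothetical. The VoiceOver default follows the description above; the single `Insert` default for NVDA and JAWS is an assumption made for brevity.

```python
# Default modifier keys per screen reader. The VoiceOver entry follows the
# comment above; the single "Insert" default for NVDA and JAWS is an
# assumption, since those screen readers accept several alternatives.
DEFAULT_MODIFIERS = {
    "voiceover": ["Control", "Option"],
    "nvda": ["Insert"],
    "jaws": ["Insert"],
}

SCREEN_READER_KEY = "ScreenReaderModifier"  # hypothetical abstract key name


def resolve_keys(keys, screen_reader, configured_modifier=None):
    """Replace the abstract key with the concrete key(s) for this session."""
    # A user-configured modifier (e.g. Caps Lock) overrides the default.
    modifier = configured_modifier or DEFAULT_MODIFIERS[screen_reader]
    resolved = []
    for key in keys:
        if key == SCREEN_READER_KEY:
            resolved.extend(modifier)
        else:
            resolved.append(key)
    return resolved


print(resolve_keys([SCREEN_READER_KEY, "t"], "voiceover"))
print(resolve_keys([SCREEN_READER_KEY, "t"], "nvda", ["CapsLock"]))
```

The point of the sketch is that the caller never needs to know which physical key is configured; only the implementation does.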

@jugglinmike
Contributor

Thanks, @jscholes! It seems like the term "modifier" may have a couple of meanings. Let me see if I've got this right:

  • Keyboard simulation systems make a technical distinction between "standard" keys and "modifier" keys. The latter cannot be pressed in isolation (or even in a specific sequence); they can only be enabled for the entirety of a sequence of "standard" key presses.
  • Screen readers also use the term "modifier," but with a different meaning. For screen readers, "modifier" keys are those that signal the beginning of a keyboard command.

Is that accurate?
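
To make the first meaning concrete, here is a small Python sketch of a simulation layer that carries system modifiers as a bitfield and everything else as ordinary keys. The `SystemModifier` flags, their bit values, and the `split_keys` function are illustrative assumptions, not taken from any particular keyboard simulation API.

```python
from enum import IntFlag


class SystemModifier(IntFlag):
    """Illustrative modifier bitfield; the bit values are arbitrary."""
    CONTROL = 1
    SHIFT = 2
    OPTION = 4


def split_keys(keys):
    """Separate keys into a modifier bitfield and the remaining standard keys."""
    flags = SystemModifier(0)
    standard = []
    for key in keys:
        member = SystemModifier.__members__.get(key.upper())
        if member is None:
            # Not a system modifier, so it stays in the ordinary key list.
            standard.append(key)
        else:
            flags |= member
    return flags, standard


print(split_keys(["Control", "Option", "Right"]))
print(split_keys(["Insert", "t"]))
```

Under this model the default VoiceOver modifiers land in the bitfield, while Insert stays in the list of standard keys, which matches the earlier observation that Insert would not be part of a modifier bitfield.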

@jugglinmike
Contributor

Today, the folks at the 2022-12-05 Community Group meeting discussed this.

First, we acknowledged that some system APIs support simulating the pressing of "system modifier" keys (e.g. Control and Shift) declaratively as a refinement to a sequence of additional keys to be pressed. For instance, they support instructions like: "press the X key, Y key, and Z key in that order, and ensure Control is depressed for the entirety of the sequence."

We were not convinced that such a convention is necessary for this proposal because the same sequence can be modeled with an unrefined series of keys-to-be-pressed, and that's already possible with the API which @zcorpan has drafted (see gh-26). The earlier example can be expressed in these terms: "press the Control key, X key, Y key, and Z key in that order."
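
The two ways of modeling that example can be sketched in Python as low-level key events. Both function names are hypothetical, and the release order used for the flat sequence (reverse of press order) is an assumption; the draft API does not specify it here.

```python
def hold_modifier(modifier, keys):
    """Declarative form: the modifier stays down while each key is tapped."""
    events = [("down", modifier)]
    for key in keys:
        events += [("down", key), ("up", key)]
    events.append(("up", modifier))
    return events


def press_in_order(keys):
    """Flat form: press every key in order, then release in reverse order."""
    return [("down", key) for key in keys] + [("up", key) for key in reversed(keys)]


declarative = hold_modifier("Control", ["x", "y", "z"])
flat = press_in_order(["Control", "x", "y", "z"])
print(declarative)
print(flat)
```

In both event streams Control is the first key pressed and the last key released, so it is held for the whole sequence. The streams differ in when X, Y, and Z are released, and whether that difference matters to a given screen reader is something implementation experience would have to show.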

We didn't come to any formal conclusions, but we do feel more confident about waiting for implementation experience before designing any API around system modifier keys.

@lolaodelola lolaodelola added Deprioritised Work that is deprioritised for now and removed Agenda+Automation labels Jan 29, 2024