AT Automation API Roadmap #15

Open
4 of 7 tasks
zcorpan opened this issue Apr 7, 2022 · 5 comments

zcorpan commented Apr 7, 2022

This is a proposed roadmap of milestones for the AT Automation API specification (see https://github.com/w3c/aria-at-automation#proposal-specify-a-new-service-to-compliment-webdriver )

The relative order of the milestones below is somewhat arbitrary, and some could be rearranged or happen in parallel. Any dependencies on other milestones are documented. Security considerations for each milestone are also documented.

MVP is milestones 0 through 3.

Milestone 0: Protocol

Design an architecture, API shape, and protocol.

security

  • opt in to API
  • use an existing widely supported network protocol (e.g. WebSocket, like WebDriver BiDi)
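
As a rough illustration of the protocol shape, here is a minimal sketch of a command envelope in the style of WebDriver BiDi, where each command carries an `id`, a `method`, and `params`, and the matching response echoes the `id`. Whether the AT Automation protocol ends up with exactly these fields is an assumption. The helper names (`make_command`, `encode`, `match_response`) are hypothetical, and no actual WebSocket transport is shown.

```python
import itertools
import json

# Monotonic counter so every command gets a unique id (assumed, BiDi-style).
_command_ids = itertools.count(1)


def make_command(method, params=None):
    """Build a command envelope with id, method, and params."""
    return {
        "id": next(_command_ids),
        "method": method,
        "params": dict(params or {}),
    }


def encode(command):
    """Serialize a command to the JSON text that would be sent over the socket."""
    return json.dumps(command)


def match_response(command, raw_message):
    """Return the result if raw_message answers command, else None.

    Raises RuntimeError when the response reports an error.
    """
    message = json.loads(raw_message)
    if message.get("id") != command["id"]:
        return None  # Response to some other command, or an event.
    if "error" in message:
        raise RuntimeError(message["error"])
    return message.get("result", {})
```

Matching on `id` lets a client keep several commands in flight over a single connection, which is the main reason to prefer this shape over strict request/response ordering.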

Milestone 1: Settings

Vendor-specific settings (also see #16)

security

  • opt in to API

Milestone 2: Capture output

API to capture spoken output without changing the TTS voice (also see #24)

security

  • opt in to API

  • sandbox (e.g. do not capture output when the expected applications do not have focus)
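
The sandbox idea above could look roughly like the following sketch: speech is recorded only while one of the expected applications has focus. The class name `OutputCapture`, the `on_speech` hook, and the way the focused application is identified are all assumptions made for illustration, not part of any screen reader's API.

```python
from dataclasses import dataclass, field


@dataclass
class OutputCapture:
    """Collects spoken output, but only for an allowlist of applications."""

    allowed_apps: frozenset
    captured: list = field(default_factory=list)

    def on_speech(self, text, focused_app):
        """Record text if an expected application has focus.

        Returns True when the text was captured and False when it was dropped.
        """
        if focused_app not in self.allowed_apps:
            return False  # Sandbox: never capture output from other apps.
        self.captured.append(text)
        return True
```

For example, a capture created with `OutputCapture(frozenset({"firefox"}))` would keep speech produced while Firefox has focus and silently drop anything spoken while a different application is focused.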

Milestone 3: Keypresses

API to simulate keypresses (also see #12)

security

  • opt in to API

  • not HID level simulated keypresses

  • sandbox (e.g. do not allow sending keypresses when the expected applications do not have focus)

  • session
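
A minimal sketch of how the session and sandbox checks above might gate a keypress request before anything is dispatched. The method name `interaction.pressKeys` follows the naming used in a commit referenced from this issue; the `session` field, the `keys` parameter, and the function `handle_press_keys` are assumptions. The function only validates and returns key names, which the screen reader would then feed to its own input handling rather than to an HID-level simulator.

```python
class KeypressRejected(Exception):
    """Raised when a keypress request fails a security check."""


def handle_press_keys(message, *, active_sessions, focused_app, allowed_apps):
    """Validate a pressKeys request and return the key names to dispatch."""
    if message.get("method") != "interaction.pressKeys":
        raise KeypressRejected("unsupported method")
    if message.get("session") not in active_sessions:
        raise KeypressRejected("unknown session")
    if focused_app not in allowed_apps:
        raise KeypressRejected("expected application does not have focus")
    keys = message.get("params", {}).get("keys")
    if not isinstance(keys, list) or not keys:
        raise KeypressRejected("keys must be a non-empty list")
    if not all(isinstance(key, str) and key for key in keys):
        raise KeypressRejected("every key must be a non-empty string")
    return list(keys)
```

Rejecting the request outright, instead of queueing it until focus returns, avoids keypresses landing in an unexpected application later.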

Milestone 4: Activate commands

Vendor-specific API to activate commands (also see #12). Example: go to the next heading. At minimum, this covers setting "modes" (as used in aria-at).

security

  • opt in to API

  • sandbox

  • session

  • exclude access to any security-sensitive commands

Straw-person message structure example:

{
  "method": "nvda:activateCommand",
  "params": {
    "command": "change to browse mode"
  }
}

Return Type: EmptyResult
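
To show how the straw-person message above might be handled, here is a small sketch that splits the vendor prefix from the method name and refuses any command that is not on an allowlist, which is one possible way to exclude security-sensitive commands. The allowlist contents and the function names are hypothetical; the empty dict stands in for EmptyResult.

```python
# Commands considered safe to activate (hypothetical list for this sketch).
ALLOWED_COMMANDS = {"change to browse mode", "change to focus mode"}


def split_method(method):
    """Split 'vendor:name' into its two parts."""
    vendor, separator, name = method.partition(":")
    if not (separator and vendor and name):
        raise ValueError(f"expected '<vendor>:<name>', got {method!r}")
    return vendor, name


def activate_command(message):
    """Check an activateCommand message and return an empty result."""
    _vendor, name = split_method(message["method"])
    if name != "activateCommand":
        raise ValueError(f"unsupported method: {name!r}")
    command = message["params"]["command"]
    if command not in ALLOWED_COMMANDS:
        raise PermissionError(f"command not allowed: {command!r}")
    # A real implementation would trigger the screen reader command here.
    return {}
```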

Milestone 5: Internal state

Depends on: milestone 4

New API to expose internal state or information in screen readers that is not directly exposed to users but is still useful for testing purposes, e.g. virtual focus position or mode (interaction mode vs. reading mode). At minimum, this covers getting the current "mode" (as used in aria-at).

security

  • opt in to API
  • exclude access to any security-sensitive information

Straw-person message structure example:

{
  "method": "nvda:getState",
  "params": {
    "state": "mode"
  }
}

Return Type: TBD
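
A sketch of how a getState handler could limit itself to an explicit set of exposed states, so that security-sensitive information is never returned. Since the return type is still TBD, the `{"value": ...}` shape below is purely a placeholder, and `FakeScreenReader`, `EXPOSED_STATES`, and `get_state` are hypothetical names.

```python
# States considered safe to expose; anything else is refused (hypothetical).
EXPOSED_STATES = {"mode"}


class FakeScreenReader:
    """Stand-in for a real screen reader, used only in this sketch."""

    def __init__(self):
        self.mode = "browse"
        self.clipboard = "secret"  # Example of state that must not leak.


def get_state(screen_reader, message):
    """Return the requested state if it is on the exposed list."""
    state = message["params"]["state"]
    if state not in EXPOSED_STATES:
        raise PermissionError(f"state not exposed: {state!r}")
    # Placeholder result shape; the roadmap leaves the return type open.
    return {"value": getattr(screen_reader, state)}
```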

Milestone 6: Headless mode

Depends on: milestone 2

Turn off output to TTS (headless mode) (also see #13)

security

  • opt in to API

  • signal to the user somehow that the screen reader is active (visual + audio)?

jscholes commented Apr 7, 2022

@zcorpan Thanks for writing this up! Some comments:

> enunciate punctuation

This is quite a complex setting, so we'll need to scope out exactly what we want/need here. E.g. different screen readers have different predefined levels, but also some additional customisation on top of that (such as symbols dictionaries in NVDA).

> Start reading

I don't know what this command is/would be expected to do. Do you mean starting a say all, to read from the cursor position to the end of the page? Note that we don't currently use that in any ARIA-AT tests.

> Move to first status menu in menu bar

Not sure what this refers to. Which menu bar?

> Find next/previous misspelled word

We don't currently have any ARIA-AT tests relying on this, and I'm not sure which screen readers even support it in virtual web content. Definitely doesn't seem like a Milestone 4 command to me.

zcorpan commented Apr 8, 2022

> enunciate punctuation
>
> This is quite a complex setting, so we'll need to scope out exactly what we want/need here. E.g. different screen readers have different predefined levels, but also some additional customisation on top of that (such as symbols dictionaries in NVDA).

OK.

> Start reading
>
> I don't know what this command is/would be expected to do. Do you mean starting a say all, to read from the cursor position to the end of the page? Note that we don't currently use that in any ARIA-AT tests.

I believe that's what the command does, yes. I don't know if we need it for aria-at, though it might be useful for more general testing of websites or web apps.

> Move to first status menu in menu bar
>
> Not sure what this refers to. Which menu bar?

I'm not sure. It doesn't seem relevant for testing web content, so I'll remove it from the list.

> Find next/previous misspelled word
>
> We don't currently have any ARIA-AT tests relying on this, and I'm not sure which screen readers even support it in virtual web content. Definitely doesn't seem like a Milestone 4 command to me.

Indeed, I'll remove it.

Thanks!

@mfairchild365

For Milestone 4, I think we are missing "Navigate to the previous element".

zcorpan added a commit that referenced this issue Apr 25, 2022
A start of milestone 0 and milestone 1 in #15

This is modeled after the WebDriver BiDi spec: https://w3c.github.io/webdriver-bidi/
zcorpan commented Jun 22, 2022

I've edited the milestones in OP to reflect our current thinking. In particular:

  • Milestone 1 (settings) is now vendor-specific and can include all settings (except any excluded for security reasons)
  • Milestone 4 (activate commands) is similarly vendor-specific
  • Removed milestones 6 and 7 (previously "more settings" and "more commands")
  • Milestones 0 through 3 should represent a good MVP

zcorpan commented Aug 16, 2022

Based on our conversation in the CG meeting yesterday (minutes), I think we should make the following adjustments to the roadmap:

  • Milestone 4: Activate commands
  • Milestone 5: Headless mode
  • Milestone 6: Internal state

becomes

  • Milestone 4: Activate commands - vendor-specific commands, at minimum setting "modes" (as used in aria-at)
  • Milestone 5: Internal state - expose vendor-specific state, at minimum getting the current "mode" (as used in aria-at)
  • Milestone 6: Headless mode

zcorpan added a commit that referenced this issue Aug 16, 2022
Milestone 0 in #15

This is modeled after the WebDriver BiDi spec: https://w3c.github.io/webdriver-bidi/

Co-authored-by: Michael Fairchild
Co-authored-by: Mike Pennisi
jugglinmike added a commit that referenced this issue Nov 22, 2022
* Add commands: interaction.sendKeys and actions.perform

Milestone 3 in #15.

* Remove Actions API, change sendKeys to pressKeys

* Missed to update names in a few places

* Add an inline issue about SR-specific modifier keys

* Correct reference to "result type"

Co-authored-by: Mike Pennisi