Product Version Requirements for Assistive Technologies and Browsers
This page explains:
- Background describing ARIA-AT goals and the technologies that affect them.
- The goals of manual testing.
- How product versions impact manual test goals.
- Resulting requirements for product versions.
The ARIA-AT project is focused on making it possible to deliver high-quality web experiences to all assistive technology users.
Web developers can make experiences that work quite well for people who don't need assistive technology, as long as the people accessing their web page are using a reasonably modern computer with nearly any reasonably modern browser. This is possible because the technologies used to deliver those web experiences are predictable. There are millions of behavioral promises contained in a multitude of specifications, and they are continuously tested as developers change the technologies.
Delivering consistently reliable, high-quality web experiences to assistive technology users is not yet possible because assistive technologies are not yet sufficiently predictable. There are no specifications making promises about how assistive technologies should behave. The ARIA specification and various accessibility API specifications make promises about what will be delivered to assistive technologies, but the last mile to the user, the assistive technology, is not standardized.
While ARIA-AT is not developing an assistive technology standard, it is writing tests and seeking industry consensus for the expectations defined by the tests. ARIA-AT tests are exclusively focused on the behavior of assistive technologies, not the upstream technologies that feed them, because upstream technologies are already covered by other projects and specifications.
ARIA-AT tests assert something about how an assistive technology should behave when it is presenting a user interface that contains ARIA semantics. However, the behaviors of assistive technologies are affected by their operating environment. The outcome of every ARIA-AT test can be affected by:
- Version of the assistive technology
- Version of the browser
- Version of the operating system and supporting accessibility APIs
Thus, while ARIA-AT tests check whether an assistive technology behaves in a certain way given certain web application code, the assistive technologies being tested do not directly process the code on a web page. Browsers and accessibility APIs process the information in the web code as it flows downstream to the assistive technology. So, running an ARIA-AT test is testing not only the assistive technology but every layer of technology in the accessible experience rendering stack.
When deciding what to prioritize when managing product version dependencies, a key factor to consider is that ARIA-AT is focused on resolving interoperability problems rooted in the last mile to the user -- the mile paved by assistive technologies. If an accessibility semantic does not have sufficient upstream support in browsers and accessibility APIs, it is essentially untestable with assistive technologies. Thus, when ARIA-AT work uncovers an upstream dependency that impedes correct assistive technology behavior, it is important that:
- ARIA-AT reports do not attribute the lack of upstream support for an accessibility semantic to the assistive technology.
- The upstream issue is communicated to the appropriate stakeholder.
The ARIA-AT plan is to automate testing and reporting for at least 5 screen readers by the end of 2023. Automation will make testing with nearly any number of combinations of product versions feasible. In the meantime, to make more immediate progress squashing interoperability bugs, manual testing is essential.
Note also that the automation plan does not eliminate the need for manual testing. Manual testing is part of the process of ensuring the tests are asserting correct expected behaviors and are coded correctly. That is, we don't automate a test until humans have verified that the test and its expected results are good.
The following assumptions inform the ARIA-AT approach to product versioning for manual testing:
- It is generally practical to require testers to use a recent production release of an assistive technology when running tests.
- Testers might have only one testing system, where it is practical to maintain only one test environment (one version of a specific browser and operating system).
- Testers might have systems where the test environment changes automatically, e.g., their IT department forces browser and operating system upgrades.
While the configurable test environment offered by AssistivLabs.com could prove extremely helpful for removing environmental constraints for some of the testing, it also introduces some complexities and challenges for testers. Even if it doesn't represent a complete solution, we are optimistic it could at least extend the range of product version combinations that can feasibly be tested.
The ARIA-AT community group Working Mode calls for manual testing to achieve two objectives:
- Test plan validation: Review and refine each plan until the expected assistive technology behaviors asserted by the plan have community consensus.
- Expected output definition: Produce test results that define correct assistive technology output for each assistive technology and browser combination. Correct outputs can then be utilized by automated tests.
Those objectives represent the long term, persistent need for manual test execution. In the near term, while automation capabilities remain nascent, manual testing is also necessary to further the primary goal of improving assistive technology interoperability by testing assistive technologies to identify failures.
When planning manual testing, a key question is how important it is to require testers to use specific versions of the products in the testing stack. The impact of differences in product versions among manual testers is as follows.
- Test plan validation: There is no need to require equivalent product versions when running tests to validate a test plan. Testing any version of an assistive technology in one browser is sufficient for gathering test plan issues. This is because:
- The goals of test plan validation are to ensure a plan is complete, well-crafted, and aligns with community expectations.
- It is not necessary for assistive technologies to satisfy the assertions in order to validate a test plan.
- Test plans cannot have dependencies on a specific browser.
- Expected output definition: Testing a recent production release of each assistive technology in any version of each browser is sufficient. This is because:
- The goal is to capture output that satisfies each assertion. That output can then serve as input for automated tests, which use it as an expected result that satisfies the assertion with full confidence.
- For a given assistive technology, satisfactory output may vary from browser to browser but should rarely depend on the exact version of a single browser.
- If multiple testers deliver equivalent output using different versions of technologies, the objective is achieved.
- If output for a given assistive technology and browser combination varies among testers, the ARIA-AT app will highlight the difference for analysis. If both variations are satisfactory and differences are due to product versions, that information can be captured for utilization within the automated test system.
- Assistive technology testing: Using recent versions of assistive technologies and browsers best serves project goals by producing failure information that is useful to assistive technology developers. However, it is not essential that all testers use precisely the same versions of the test environment and assistive technology. This is because:
- The reporting system can track and break down results based on the version of assistive technology, browser, and operating system. Variations among testers thus enrich the result set.
- The Working Mode requires equivalent manual test results from multiple testers to help ensure that testers do not make mistakes. If equivalent results are obtained from slightly different version stacks, the intent of the comparison is achieved. If results differ, investigation will reveal whether the difference is due to differences in product versions, test execution, or test interpretation. If the variation is due only to product versions, data from both testers could be accepted and regarded as equivalent.
To achieve ARIA-AT objectives, it is not necessary to impose strict requirements for product versions for the systems people use for manual testing. The ARIA-AT Working Mode, the test runner, and the reporting system help identify situations where behaviors that depend on product versions are relevant.
However, since we have limited manual testing capacity, it is important to use that capacity to produce the most useful data possible. So:
- Because automation is still nascent and we want to make progress finding interoperability bugs for assistive technology developers to fix, it is best to use the latest versions of assistive technologies.
- Since we want reports to be useful to the broadest audience possible, it is best to use only public releases of products.
In addition, when running manual tests, it is important to record detailed version information for the assistive technology, browser, and operating system. This information can then be leveraged in reporting.
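As an illustration, a test run record might capture this version information roughly as follows. This is a minimal sketch; the interface and field names are hypothetical and do not reflect the ARIA-AT app's actual data model.

```typescript
// Hypothetical shape of one recorded manual test result.
// Field names are illustrative only, not the ARIA-AT app's schema.
interface TestRunRecord {
  testPlan: string;                             // e.g., "checkbox"
  testId: string;                               // identifies the individual test within the plan
  at: { name: string; version: string };        // e.g., { name: "JAWS", version: "2021.2105.53.400" }
  browser: { name: string; version: string };   // e.g., { name: "Google Chrome", version: "92.0.4515.131" }
  os: { name: string; version: string };        // e.g., { name: "Windows", version: "10.0.19043" }
  output: string;                               // verbatim assistive technology output captured by the tester
  assertionResults: { assertion: string; passed: boolean }[];
}
```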
It is useful to know when test results have changed because of a change in an assistive technology or browser rather than because of a change in a test plan. For instance, suppose two test plan runs for two different versions of the same assistive technology with the same browser yield different output for some tests but have exactly matching assertion outcomes, i.e., all assertions that passed in run 1 also passed in run 2 and all assertions that failed in run 1 also failed in run 2. Assuming no tester errors, then:
- Manual test results will yield conflicts, and the runs are not comparable. Another run with one of the versions should be conducted.
- If the variable output is associated with passing assertions, the expected output for automated test runs needs to be adjusted.
- People writing automated tests of their own web components need to be aware of acceptable changes in output due to version differences.
This section defines how the ARIA-AT project will track comparability of product versions.
- Let A represent a specific assistive technology (e.g., JAWS).
- Let B represent a specific browser (e.g., Google Chrome).
- Let A/B represent use of a specific assistive technology with a specific browser (e.g., JAWS with Chrome).
- Let N represent a specific product version number.
- Let A.N represent a specific assistive technology at a specific version (e.g., JAWS version 2021.2105.53.400).
- Let B.N represent a specific browser at a specific version (e.g., Google Chrome version 92.0.4515.131).
- Let N+1 represent the next version of a product (e.g., A.N+1 is the next version of A after version N).
The AT/Browser combination A.N/B.N yields equivalent results to A.N+1/B.N for test T if:
- A.N/B.N and A.N+1/B.N yield identical assertion support for every command in T:
  - Every assertion supported by A.N is also supported by A.N+1.
  - Every assertion not supported by A.N is also not supported by A.N+1.
  - Every assertion incorrectly supported by A.N is also incorrectly supported by A.N+1.
- A.N/B.N and A.N+1/B.N yield identical output for every command in test T, including unexpected excess output.
If A.N/B.N and A.N+1/B.N yield equivalent results for all tests in a plan, their test plan results are said to be equivalent.
Two combinations of a given AT and browser that differ in the version of the AT, the browser, or both are said to be comparable for a given test if the combinations yield equivalent results for that test. Similarly, they are comparable for a test plan if the combinations yield equivalent test plan results.
The ARIA-AT project tracks comparability of product versions only for combinations of an AT and browser. While a given AT may yield equivalent results in two different browsers, equivalent results across browsers are not essential to interoperability. For instance, a screen reader may yield slightly different output in two browsers while that output meets expectations in both cases.
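To make the definitions above concrete, here is a minimal sketch of the equivalence and comparability checks, assuming a simplified representation of per-command results. The type and function names are illustrative and are not part of the ARIA-AT app.

```typescript
// Assertion support mirrors the three cases in the definition:
// supported, not supported, or incorrectly supported.
type AssertionSupport = 'supported' | 'notSupported' | 'incorrectlySupported';

interface CommandResult {
  command: string;                               // e.g., "Tab" or "Down Arrow"
  output: string;                                // verbatim output, including any unexpected excess output
  assertions: Record<string, AssertionSupport>;  // assertion text -> support observed
}

// One test's results: one entry per command in the test.
type TestResult = CommandResult[];

// A.N/B.N and A.N+1/B.N yield equivalent results for a test when every command
// has identical assertion support and identical output.
function equivalentResults(a: TestResult, b: TestResult): boolean {
  if (a.length !== b.length) return false;
  return a.every((cmdA, i) => {
    const cmdB = b[i];
    if (cmdA.command !== cmdB.command || cmdA.output !== cmdB.output) return false;
    const assertionsA = Object.keys(cmdA.assertions);
    const assertionsB = Object.keys(cmdB.assertions);
    if (assertionsA.length !== assertionsB.length) return false;
    return assertionsA.every((key) => cmdA.assertions[key] === cmdB.assertions[key]);
  });
}

// Equivalent test plan results: equivalent results for every test in the plan.
// Two AT/browser version combinations are "comparable" for a test or a plan
// exactly when the corresponding check returns true.
function equivalentPlanResults(
  planA: Record<string, TestResult>,
  planB: Record<string, TestResult>
): boolean {
  const testIds = Object.keys(planA);
  if (testIds.length !== Object.keys(planB).length) return false;
  return testIds.every((id) => id in planB && equivalentResults(planA[id], planB[id]));
}
```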