Add tests for TokenSubject #4121

BenHenning · 2022-01-21T06:06:05Z

#4051 introduces a new custom Truth test subject: TokenSubject. It would be advantageous to add thorough tests for this to ensure that:

- Each of the subject methods correctly allows users to test specific properties of the protos each subject helps verify
- Each of the subject methods is verified to fail when expected (i.e. when used with an incompatible proto property), with the correct failure reason being verified
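
As a rough illustration of the two kinds of tests described above, here is a minimal, self-contained sketch. Everything in it is hypothetical: `Token`, `TokenSubjectSketch`, `isPositiveIntegerWhoseValue`, `assertThatToken`, and `captureFailure` are stand-ins invented for this example and are not the real classes or method names from #4051. To stay dependency-free it uses a hand-rolled subject that throws `AssertionError` rather than an actual Truth `Subject`; real tests would presumably use Truth and JUnit instead.

```kotlin
// Hypothetical stand-ins: the real Token and TokenSubject are defined in the
// Oppia Android codebase and are likely shaped differently.
sealed class Token {
  data class PositiveInteger(val value: Int) : Token()
  data class VariableName(val name: String) : Token()
}

// Hand-rolled stand-in for a Truth subject; failures are reported by throwing
// AssertionError with a reason describing the mismatch.
class TokenSubjectSketch(private val actual: Token) {
  fun isPositiveIntegerWhoseValue(): Int {
    val token = actual as? Token.PositiveInteger
      ?: throw AssertionError("Expected a PositiveInteger token, but was: $actual")
    return token.value
  }
}

fun assertThatToken(actual: Token): TokenSubjectSketch = TokenSubjectSketch(actual)

// Runs the block and returns the AssertionError it throws, or null if it passes.
fun captureFailure(block: () -> Unit): AssertionError? =
  try {
    block()
    null
  } catch (e: AssertionError) {
    e
  }

// Case 1: the subject method passes for a compatible token and exposes its property.
fun testIsPositiveInteger_withIntegerToken_returnsValue() {
  val value = assertThatToken(Token.PositiveInteger(42)).isPositiveIntegerWhoseValue()
  check(value == 42)
}

// Case 2: the subject method fails for an incompatible token, and the failure
// reason itself is verified.
fun testIsPositiveInteger_withVariableToken_failsWithReason() {
  val failure = captureFailure {
    assertThatToken(Token.VariableName("x")).isPositiveIntegerWhoseValue()
  }
  check(failure != null) { "Expected the subject method to fail" }
  check(failure.message.orEmpty().contains("Expected a PositiveInteger token"))
}
```

The second test is the part that is easy to skip: it checks not only that the subject rejects the wrong token type, but also that the reported reason is the expected one.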

See #4097 for a similar issue.

…4051)

## Explanation

Fix part of #4044

Originally copied from #2173 when it was in proof-of-concept form

This PR principally introduces the lexical tokenizer (lexer) for math expressions, as defined by the [formal grammar](https://docs.google.com/document/d/1JMpbjqRqdEpye67HvDoqBo_rtScY9oEaB7SwKBBspss/edit). The tokenizer converts a string into a sequence of ``Token``s which provide context to the characters available in the string. While the tokenizer is not yet hooked up, it will be leveraged in an upcoming PR to add support for parsing math expressions and equations. Note that special care was taken to ensure the parser is an LL(1) parser to simplify implementation, ease future grammar additions, and to allow for a table-based parsing method in the future if increased performance is necessary.

Some caveats of the tokenizer:
- It's implemented on top of a sequence with lazy iteration to ensure that no processing time is spent parsing more than necessary (i.e. if the parser short-circuits due to an error then no additional tokenization cost is paid)
- Due to a limitation in the grammar being LL(1), function name parsing is a bit more nuanced and limiting. The lexer has to assume that the start of a function name is actually a name and not a series of variables. To account for this, a special token exists for invalid function names. See the tokenizer's test suite for more details on valid and invalid character combinations.
- Care is taken to ensure that overflowed integers & doubles result in failing tokens
- The lexer takes on a bit of the actual parsing (for integers & reals) to simplify the parser, but the majority of parsing is not done in the lexer

This PR introduces a new Truth subject for the tokenizer's ``Token`` class, though similar to subjects introduced in previous PRs this one does not include tests. #4121 is tracking adding them in the future.
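
To make the lazy-iteration and overflow caveats above more concrete, here is a minimal sketch of a lazily evaluated tokenizer. It is not the tokenizer from the PR: `SketchToken` and `tokenizeLazily` are hypothetical names, the token set is reduced to three cases, and the sketch indexes the string directly instead of using the ``PeekableIterator`` the PR describes.

```kotlin
// Hypothetical, heavily simplified token type (not the PR's Token class).
sealed class SketchToken {
  data class PositiveInteger(val value: Int) : SketchToken()
  data class Symbol(val char: Char) : SketchToken()
  data class Invalid(val text: String) : SketchToken()
}

// Builds a lazy sequence: each token is only computed when a consumer asks for
// it, so a parser that stops early never pays for tokenizing the rest.
fun tokenizeLazily(input: String): Sequence<SketchToken> = sequence {
  var index = 0
  while (index < input.length) {
    val char = input[index]
    when {
      char.isWhitespace() -> index++
      char in '0'..'9' -> {
        val start = index
        while (index < input.length && input[index] in '0'..'9') index++
        val digits = input.substring(start, index)
        // toIntOrNull() returns null when the digits overflow Int, which this
        // sketch turns into a failing (invalid) token.
        val value = digits.toIntOrNull()
        yield(
          if (value != null) SketchToken.PositiveInteger(value) else SketchToken.Invalid(digits)
        )
      }
      else -> {
        yield(SketchToken.Symbol(char))
        index++
      }
    }
  }
}
```

With this sketch, `tokenizeLazily("1 + 2").first()` only scans the leading `1`, while calling `toList()` on the sequence forces the whole input to be tokenized.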
To enforce the LL(1) grammar, the tokenizer makes use of a PeekableIterator that doesn't allow for lookahead beyond one character. This iterator will also be used by the upcoming parser to keep the LL(1) property.

To enable better testing for the tokenizer, I decided to add parameterized tests. Unfortunately, there wasn't an obvious way to do this. While parameterized JUnit tests work, there are some limitations:
- They won't work with instrumentation or Robolectric
- They don't support combining parameterized & non-parameterized tests

Instead, I decided to implement a custom parameterized test runner that addresses both of the points above. I used ``AndroidJUnit4`` and ``ParameterizedRobolectricTestRunner`` as key references for understanding how to actually build this. While the implementation we're now using is quite different from both, they serve as parts of the basis. ``AndroidJUnit4`` is final, so we needed to implement a similar switching mechanism to select Robolectric or Espresso test helpers based on the platform (this is controlled via a test class-level annotation). Further, Robolectric's runner was important to understand how to even set up a parameterized runner, and to better understand how to make Robolectric work in this environment.

The result seems to be a very clean test runner, but I'm keen to hear reviewers' thoughts on this.

Note a couple things:
- While there's an Espresso-supported runner, I didn't actually verify that it works since we haven't yet gotten instrumentation tests working natively with Bazel (though I do expect that it'll work)
- I didn't test this on Gradle, but per the CI run it appears to work just fine
- There are no tests for any of the new runners or their utilities. The utilities are fairly trivial, and the runners are difficult to test. I decided not to test them or file a tracking issue as they'll likely be the last components we add tests for when we aim to raise the codebase's code coverage (I didn't feel the difficulty was worth overcoming here vs. manual verification).
- The runner has very robust error detection to try and reduce potential errors, including enforcing parameter order consistency
- I ran into a little snag with ktlint--see #4122
- The previous iteration of this led to platform selection between Espresso & Robolectric based on Bazel dependencies, but I decided to change this to be an explicit annotation. This has a number of considerations:
  - It leads to automatic build fixes (i.e. you can't compile a reference to a class without the Bazel dependency being complete)
  - It allowed me to introduce a JUnit-specific test suite (after changing MathTokenizerTest to this, I saw an almost 40x decrease in runtime; I plan to use this for all future parameterized math tests since Robolectric isn't actually needed for these and is **much** slower)
  - It doesn't facilitate shared tests like AndroidJUnit4, but we could conceivably create a shared runner in the future that uses either the instrumentation or Robolectric runner based on the current platform (similar to how AndroidJUnit4 works--see the commit history for the old code that did this)
- Finally, see the KDoc for ``OppiaParameterizedTestRunner`` for many more details on the runner, including both a code example and suggestions on when to actually use the runner. Further, ``MathTokenizerTest`` demonstrates real uses of the runner.

The way that the runner works is it generates a test suite that combines one runner for all non-parameterized tests with a runner for each parameterized iteration (so the total number of tests in a suite is ``number_of_non_parameterized_tests + sum(parameterized_iterations_in_all_parameterized_tests)``). The iteration names are appended to parameterized tests' names to provide both Bazel filtering support and direct error reporting when tests fail (so that the exact iteration is known & can be run/debugged in isolation). Android Studio will run all iterations when filtering on a method (since that should match against all), but this might be environment-specific (I verified this with Bazel in Android Studio, but not with Gradle).

### Script asset changes

Test file exemptions were added for the new JUnit rules & the TokenSubject class--see above for the rationale.

Regex content exemptions were added for ``ParameterizedMethod`` since it uses ``Locale`` and ``capitalize``. It doesn't have access to ``MachineLocale`` (at least, currently), and is only running in tests (so localization correctness isn't as important). Finally, the chosen locale to use for capitalization is ``US`` which should more or less match field names in tests for the Oppia codebase.

Further, a new regex check was added to require all uses of the new test runner to require approval (i.e. by adding an exemption). This will help ensure it doesn't get abused/used too broadly since parameterization should be an exceptional case rather than the norm.

## Essential Checklist

- [x] The PR title and explanation each start with "Fix #bugnum: " (If this PR fixes part of an issue, prefix the title with "Fix part of #bugnum: ...".)
- [x] Any changes to [scripts/assets](https://github.com/oppia/oppia-android/tree/develop/scripts/assets) files have their rationale included in the PR explanation.
- [x] The PR follows the [style guide](https://github.com/oppia/oppia-android/wiki/Coding-style-guide).
- [x] The PR does not contain any unnecessary code changes from Android Studio ([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#undo-unnecessary-changes)).
- [x] The PR is made from a branch that's **not** called "develop" and is up-to-date with "develop".
- [x] The PR is **assigned** to the appropriate reviewers ([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#clarification-regarding-assignees-and-reviewers-section)).

## For UI-specific PRs only

N/A -- This PR introduces a non-user facing utility, and the utility isn't yet hooked up (it will be in a subsequent PR).

Commit history:

* Copy proto-based changes from #2173.
* Introduce math.proto & refactor math extensions. Much of this is copied from #2173.
* Migrate tests & remove unneeded prefix.
* Add needed newline.
* Some needed Fraction changes.
* Introduce math expression + equation protos. Also adds testing libraries for both + fractions & reals (new structure). Most of this is copied from #2173.
* Add protos + testing lib for commutative exprs.
* Add protos & test libs for polynomials.
* Lint fix.
* Lint fixes.
* Add math tokenizer + utility & tests. This is mostly copied from #2173.
* Fix broken test post-refactor.
* Post-merge fix.
* Add regex check, docs, and resolve TODOs. This also changes regex handling in the check to be more generic for better flexibility when matching files.
* Lint fix.
* Fix failing static checks.
* Fix broken CI checks. Adds missing KDocs, test file exemptions, and fixes the Gradle build.
* Lint fixes.
* Add docs & exempted tests.
* Remove blank line.
* Add docs + tests.
* Add parameterized test runner. This commit introduces a new parameterized test runner that allows proper combinations of parameterized & non-parameterized tests in the same suite, and in a way that should work on both Robolectric & Espresso (though the latter isn't currently verified). Further, this commit also introduces a TokenSubject that will be used more explicitly by the follow-up commit for verifying MathTokenizer.
* Add & update tests. This introduces tests for PeekableIterator, and reimplements all of MathTokenizer's tests to be more structured, thorough, and a bit more maintainable (i.e. by leveraging parameterized tests).
* Lint fixes. This includes a fix for 'fun interface' not working with ktlint (see #4122).
* Remove internals that broke things.
* Add regex exemptions.
* Address reviewer comments + other stuff. This also fixes a typo and incorrectly ordered exemptions list I noticed during development of downstream PRs.
* Move StringExtensions & fraction parsing. This splits fraction parsing between UI & utility components.
* Address reviewer comments.
* Alphabetize test exemptions.
* Fix typo & add regex check. The new regex check makes it so that all parameterized testing can be more easily tracked by the Android TL.
* Add missing KDocs.
* Remove the ComparableOperationList wrapper.
* Change parameterized method delimiter.
* Use more intentional epsilons for float comparing.
* Treat en-dash as a subtraction symbol.
* Add explicit platform selection for parameterized. This adds explicit platform selection support rather than it being automatic based on deps. While less flexible for shared tests, this offers better control for tests that don't want to use Robolectric for local tests. This also adds a JUnit-only test runner, and updates MathTokenizerTest to use it (which led to an almost 40x decrease in runtime).
* Exemption fixes. Also, fix name for the AndroidJUnit4 runner.
* Remove failing test.
* Address reviewer comment. Clarifies the documentation in the test runner around parameter injection.
* Fix broken build.
* Fix broken build post-merge.
* Post-merge fix.
* More post-merge fixes.
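
The commit message above mentions a PeekableIterator that allows no more than one element of lookahead in order to preserve the LL(1) property. The following is a minimal sketch of that idea only. `PeekableIteratorSketch` is a hypothetical name, and the real class in the PR may expose a different API; this version simply buffers at most one element from the backing iterator.

```kotlin
// Hypothetical sketch of a single-lookahead iterator (not the PR's class).
// T is restricted to non-null types so that null can mean "nothing buffered".
class PeekableIteratorSketch<T : Any>(private val backing: Iterator<T>) : Iterator<T> {
  // Holds at most one element that has been peeked but not yet consumed.
  private var peeked: T? = null

  // Returns the next element without consuming it, or null if none remain.
  // Repeated calls return the same element; there is no way to look further ahead.
  fun peek(): T? {
    if (peeked == null && backing.hasNext()) peeked = backing.next()
    return peeked
  }

  override fun hasNext(): Boolean = peeked != null || backing.hasNext()

  // Consumes the buffered element if there is one, else advances the backing iterator.
  override fun next(): T {
    val result = peeked ?: backing.next()
    peeked = null
    return result
  }
}
```

Because only one element is ever buffered, any code built on this iterator has to decide what to do from the current element plus a single peeked one, which is the constraint an LL(1) lexer or parser needs.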

BenHenning mentioned this issue Jan 21, 2022

Fix part of #4044: Add math tokenizer & parameterized test support #4051

Merged

6 tasks

BenHenning added temp: triage for beta and removed temp: triage for beta labels Jun 11, 2022

Broppia added the issue_type_infrastructure label Jun 13, 2022

Broppia added Impact: Low Low perceived user impact (e.g. edge cases). user_team labels Jul 29, 2022

BenHenning added Issue: Needs Clarification Indicates that an issue needs more detail in order to be able to be acted upon. Z-ibt Temporary label for Ben to keep track of issues he's triaged. issue_user_developer labels Sep 15, 2022

seanlip added enhancement End user-perceivable enhancements. and removed issue_type_infrastructure labels Mar 28, 2023

seanlip added this to [Team] Core Learner and Mastery flows & UI Frontend - Android Jun 4, 2023

github-project-automation bot moved this to Todo in [Team] Core Learner and Mastery flows & UI Frontend - Android Jun 4, 2023

adhiamboperes added the Work: Low Solution is clear and broken into good-first-issue-sized chunks. label Aug 14, 2023

adhiamboperes removed this from [Team] Core Learner and Mastery flows & UI Frontend - Android Oct 9, 2023

adhiamboperes added this to [Team] Developer Workflow & Infrastructure - Android Oct 9, 2023

github-project-automation bot moved this to Todo in [Team] Developer Workflow & Infrastructure - Android Oct 9, 2023
