TestSuiteCleanup
EpubCheck's test suite has grown organically. It consists of both packaged and unpackaged EPUB content, follows no naming convention, and is barely organized.
This document describes the steps required to clean up the test suite so that it is easier to work with and maintain.
The names of the test cases (test data and Java test methods) currently don't follow any convention. It is difficult to look up an existing test, and one never knows where to add a new one. Test names can be, for example, testValidateEPUBPLoremMimetype, testIssue226, or testFXL_WithSVGNotInSpine.
Proposed convention: lowercase descriptive filenames with hyphens. Prefixes for dedicated domains. Examples:
- svg-not-in-spine describes a file with SVG not in the spine
- edu-basic describes an EPUB for Education basic test file
- meta-inf-not-in-opf describes a file that is missing the metainf file in the OPF
- mr-collections describes a multiple renditions EPUB with a collection element
We should agree on a well-defined naming convention.
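As a starting point for that discussion, the proposed convention can be expressed as a small check. This is only a sketch: the regular expression, the function names, and the prefix set (edu and mr, inferred from the examples above) are assumptions, not an agreed rule.

```python
import re

# Assumed rule: lowercase alphanumeric words separated by single hyphens.
NAME_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")

# Illustrative domain prefixes inferred from the examples; not an agreed list.
DOMAIN_PREFIXES = {"edu", "mr"}


def is_valid_test_name(name):
    """Return True if the whole name is lowercase and hyphen-separated."""
    return NAME_PATTERN.fullmatch(name) is not None


def domain_prefix(name):
    """Return the leading domain prefix of a name, or None if it has none."""
    first = name.split("-", 1)[0]
    return first if first in DOMAIN_PREFIXES else None
```

A check like this could run over the test data names to flag the ones that still need renaming, such as testIssue226.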
Test cases are distributed in several places:
- under directories 20 or 30 (according to the tested EPUB version) or under the directory com/adobe/epubcheck/test for the tests added by the Nook contribution.
- some test data is packaged (zipped EPUB), some are unpackaged (expanded directories).
Along with the naming convention, we should come up with a well-defined directory structure and make sure that all the test data is organized accordingly.
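To get an overview before reorganizing, an inventory of the existing data can help. The sketch below walks a test data root and separates packaged from unpackaged content. The detection rules are assumptions: a packaged test is taken to be a file ending in .epub, and an unpackaged test a directory that directly contains a META-INF directory or a mimetype file. Deliberately broken test data may match neither rule and would need manual review.

```python
from pathlib import Path


def classify_test_data(root):
    """Split entries under root into (packaged, unpackaged) lists of paths."""
    packaged, unpackaged = [], []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() == ".epub":
            # Assumed rule: zipped EPUBs carry the .epub extension.
            packaged.append(path)
        elif path.is_dir() and (
            (path / "META-INF").is_dir() or (path / "mimetype").is_file()
        ):
            # Assumed rule: an expanded EPUB has META-INF or mimetype at its top level.
            unpackaged.append(path)
    return packaged, unpackaged
```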
Some tests are based on anonymized real-world content, some on the "lorem ipsum" base EPUB, and others have been created from scratch. In many cases the test data is not minimal, so an update to the validation logic can change the test results for reasons unrelated to the feature under test.
All the tests should be based off a minimal EPUB sample, including nothing but the very bare minimum content.
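To make "bare minimum" concrete, here is a sketch that writes a very small EPUB 3 container: the mimetype entry, the container file, a package document, and a navigation document that is also the only spine item. The file names, the identifier, the title, and the date are placeholders, and the result has not been checked against EpubCheck, so treat it as a draft of a base sample rather than a validated one.

```python
import zipfile

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="EPUB/package.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

PACKAGE_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="uid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="uid">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Minimal EPUB</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2000-01-01T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
  </manifest>
  <spine>
    <itemref idref="nav"/>
  </spine>
</package>
"""

NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Minimal EPUB</title></head>
  <body>
    <nav epub:type="toc">
      <ol><li><a href="nav.xhtml">Start</a></li></ol>
    </nav>
  </body>
</html>
"""


def write_minimal_epub(path):
    """Write a placeholder minimal EPUB 3 file to path."""
    with zipfile.ZipFile(path, "w") as zf:
        # The mimetype entry must be the first entry and must be stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML, compress_type=zipfile.ZIP_DEFLATED)
        zf.writestr("EPUB/package.opf", PACKAGE_OPF, compress_type=zipfile.ZIP_DEFLATED)
        zf.writestr("EPUB/nav.xhtml", NAV_XHTML, compress_type=zipfile.ZIP_DEFLATED)
```

Each test case would then differ from this base by one change only, which keeps the expected messages stable when unrelated checks evolve.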
For each test:
- check that only a single feature / failure scenario is tested
- if not, split the test into several test cases
- clean up the test case
- clean up the EPUB test content to prune useless content (i.e. base the test on the minimal base EPUB)
- rename the test according to the naming convention.
- remove the stored output + collected data (results.txt or results.json), which is more painful than helpful.
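The last step of the list above lends itself to a quick audit. The sketch below lists the stored output files left under a test data root. It assumes that such files can be recognized by a name ending in results.txt or results.json; the function name and that matching rule are illustrative.

```python
from pathlib import Path

# Assumed naming: stored outputs end with one of these suffixes.
STORED_OUTPUT_SUFFIXES = ("results.txt", "results.json")


def find_stored_outputs(root):
    """Return the sorted paths of stored output files found under root."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.name.endswith(STORED_OUTPUT_SUFFIXES)
    )
```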
For the entire collection:
- make sure that each error message is tested at least once (this is almost true already).
- add some missing tests if required.
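The coverage check for error messages can be sketched as a set difference between all known message IDs and the IDs mentioned in the test sources. The ID format used here (three uppercase letters, a hyphen or underscore, three digits) and the sample IDs in the usage note are assumptions for illustration; the real list would come from the message catalog.

```python
import re

# Assumed ID format: e.g. ABC-123 in text or ABC_123 in Java constants.
MESSAGE_ID = re.compile(r"\b([A-Z]{3})[-_](\d{3})\b")


def untested_messages(all_ids, test_sources):
    """Return the IDs from all_ids that appear in none of the test sources.

    all_ids: iterable of IDs written with a hyphen, such as "ABC-123".
    test_sources: iterable of strings holding the text of the test files.
    """
    tested = set()
    for text in test_sources:
        for match in MESSAGE_ID.finditer(text):
            # Normalize both spellings to the hyphenated form.
            tested.add(f"{match.group(1)}-{match.group(2)}")
    return sorted(set(all_ids) - tested)
```

For example, given the IDs RSC-005, OPF-003 and PKG-006 and test sources that only mention RSC_005 and OPF-003, the function returns a list containing PKG-006, which is the message that still needs a test.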
Once the test data is reorganized, the parent Java runner classes can be refactored and simplified.