TestSuiteCleanup

EpubCheck Test Suite Cleanup

EpubCheck's test suite has grown organically. It consists of both packaged and unpackaged EPUB content, follows no naming convention, and is barely organized.

This document describes the steps required to clean up the test suite so that it is easier to work with and maintain.

Define a naming convention

The names of the test cases (test data and Java test methods) currently don't follow any convention. Tests are difficult to look up, and it is never clear where to add a new one. Current test names include, for example, testValidateEPUBPLoremMimetype, testIssue226, and testFXL_WithSVGNotInSpine.

Proposed convention: lowercase, descriptive, hyphen-separated filenames, with prefixes for dedicated domains. Examples:

svg-not-in-spine describes a file with SVG not in the spine
edu-basic describes an EPUB for Education basic test file
meta-inf-not-in-OPF describes a file that is missing the metainf file in the OPF
mr-collections describes a multiple renditions EPUB with a collection element

We should agree on a well-defined naming convention.
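Once a convention is agreed on, it could be enforced mechanically. The following is a minimal sketch, assuming the convention is read strictly as "lowercase letters and digits in hyphen-separated segments". The class name TestNameConvention and the exact pattern are hypothetical, not part of EpubCheck.

```java
import java.util.regex.Pattern;

public class TestNameConvention {

    // Assumed reading of the proposed convention: one or more segments of
    // lowercase letters or digits, separated by single hyphens.
    private static final Pattern NAME = Pattern.compile("[a-z0-9]+(-[a-z0-9]+)*");

    /** Returns true if the given test name matches the assumed convention. */
    public static boolean isValid(String name) {
        return NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] names = {
            "svg-not-in-spine", "edu-basic", "mr-collections",
            "testIssue226", "testFXL_WithSVGNotInSpine"
        };
        for (String name : names) {
            System.out.println(name + " -> " + isValid(name));
        }
    }
}
```

Under this strict reading, the example meta-inf-not-in-OPF from the list above is rejected because of its uppercase suffix, which is one of the details the final convention would need to settle.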

Rework the test directory organization

Test cases are distributed in several places:

under directories 20 or 30 (according to the tested EPUB version), or under the directory com/adobe/epubcheck/test for the tests added by the Nook contribution.
some test data is packaged (zipped EPUB), some is unpackaged (expanded directories).

Along with the naming convention, we should come up with a well-defined directory structure and make sure that all the test data is organized accordingly.
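Before reorganizing, it may help to take an inventory of what exists today. The sketch below walks a directory tree and reports which entries look like packaged or unpackaged EPUBs. The class TestDataInventory is hypothetical, and the detection rules are assumptions: a packaged EPUB is taken to be a regular file ending in .epub, and an unpackaged one a directory that directly contains a mimetype file. Test data that deliberately omits the mimetype file would be missed by this heuristic.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class TestDataInventory {

    public enum Kind { PACKAGED, UNPACKAGED, OTHER }

    /** Classifies a path using the heuristics described above (assumptions). */
    public static Kind classify(Path path) {
        if (Files.isRegularFile(path) && path.getFileName().toString().endsWith(".epub")) {
            return Kind.PACKAGED;
        }
        if (Files.isDirectory(path) && Files.isRegularFile(path.resolve("mimetype"))) {
            return Kind.UNPACKAGED;
        }
        return Kind.OTHER;
    }

    public static void main(String[] args) throws IOException {
        // Root of the test resources; defaults to the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        try (Stream<Path> paths = Files.walk(root)) {
            paths.filter(p -> classify(p) != Kind.OTHER)
                 .forEach(p -> System.out.println(classify(p) + "\t" + root.relativize(p)));
        }
    }
}
```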

Define a minimal base EPUB test case

Some tests are based on anonymized real-world content, some are based on the "lorem ipsum" base EPUB, and others have been created from scratch. In many cases, the test data is not minimal, and this can introduce changes in the test results when the validation logic is updated.

All tests should be based on a minimal EPUB sample that includes nothing but the bare minimum content.
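As an illustration of what "bare minimum" could mean, the sketch below builds a small EPUB 3 archive in memory: a mimetype entry, META-INF/container.xml, a package document, and a navigation document that is also the only spine item. The class name MinimalEpub, the file layout under EPUB/, and the placeholder metadata values are assumptions made for this example, not the base EPUB the project has settled on, and the result should still be checked with EpubCheck itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class MinimalEpub {

    static final String CONTAINER = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<container version=\"1.0\" xmlns=\"urn:oasis:names:tc:opendocument:xmlns:container\">\n"
        + "  <rootfiles>\n"
        + "    <rootfile full-path=\"EPUB/package.opf\" media-type=\"application/oebps-package+xml\"/>\n"
        + "  </rootfiles>\n"
        + "</container>\n";

    // Placeholder identifier, title, and date: assumptions for illustration.
    static final String PACKAGE = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<package xmlns=\"http://www.idpf.org/2007/opf\" version=\"3.0\" unique-identifier=\"uid\">\n"
        + "  <metadata xmlns:dc=\"http://purl.org/dc/elements/1.1/\">\n"
        + "    <dc:identifier id=\"uid\">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>\n"
        + "    <dc:title>Minimal EPUB</dc:title>\n"
        + "    <dc:language>en</dc:language>\n"
        + "    <meta property=\"dcterms:modified\">2020-01-01T00:00:00Z</meta>\n"
        + "  </metadata>\n"
        + "  <manifest>\n"
        + "    <item id=\"nav\" href=\"nav.xhtml\" media-type=\"application/xhtml+xml\" properties=\"nav\"/>\n"
        + "  </manifest>\n"
        + "  <spine>\n"
        + "    <itemref idref=\"nav\"/>\n"
        + "  </spine>\n"
        + "</package>\n";

    static final String NAV = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<html xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:epub=\"http://www.idpf.org/2007/ops\""
        + " lang=\"en\" xml:lang=\"en\">\n"
        + "<head><title>Minimal EPUB</title></head>\n"
        + "<body>\n"
        + "  <nav epub:type=\"toc\"><ol><li><a href=\"nav.xhtml\">Start</a></li></ol></nav>\n"
        + "</body>\n"
        + "</html>\n";

    /** Builds the EPUB as an in-memory ZIP archive. */
    public static byte[] build() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            // The mimetype entry comes first and is stored uncompressed;
            // a STORED entry needs its size and CRC set before it is written.
            byte[] mimetype = "application/epub+zip".getBytes(StandardCharsets.US_ASCII);
            ZipEntry entry = new ZipEntry("mimetype");
            entry.setMethod(ZipEntry.STORED);
            entry.setSize(mimetype.length);
            entry.setCompressedSize(mimetype.length);
            CRC32 crc = new CRC32();
            crc.update(mimetype);
            entry.setCrc(crc.getValue());
            zip.putNextEntry(entry);
            zip.write(mimetype);
            zip.closeEntry();

            add(zip, "META-INF/container.xml", CONTAINER);
            add(zip, "EPUB/package.opf", PACKAGE);
            add(zip, "EPUB/nav.xhtml", NAV);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Adds a UTF-8 text entry using the default (deflated) method.
    private static void add(ZipOutputStream zip, String name, String content) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(content.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    public static void main(String[] args) throws IOException {
        java.nio.file.Files.write(java.nio.file.Paths.get("minimal.epub"), build());
    }
}
```

Individual test cases could then be derived from such a base by changing exactly one thing, which keeps each test focused on a single feature or failure scenario.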

Cleanup (for each test)

For each test:

check that only a single feature / failure scenario is tested
- if not, split the test into several test cases
clean up the test case
- clean up the EPUB test content to prune useless content (i.e. base the test off the minimal base EPUB)
- rename the test according to the naming convention
- remove the stored output and collected data (results.txt or results.json), which is more painful than helpful

For the entire collection:

make sure that each error message is tested at least once (this is almost true already).
add some missing tests if required.

Rewrite/Refactor the Java test superclasses

Once the test data is reorganized, the parent Java runner classes can be refactored and simplified.
