Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support larger integer test arguments #6716

Conversation

gilles-peskine-arm
Copy link
Contributor

@gilles-peskine-arm gilles-peskine-arm commented Dec 4, 2022

This is a collection of improvements to how the unit test framework handles test case arguments.

  • Support larger (and smaller) integer arguments to test functions, such as long and mbedtls_mpi_sint. Addresses Support other integral types in unit tests #4913. (I'm working on unsigned types, but it requires more refactoring of the Python code.)
  • Support void test_function(void). Resolves Fix generate_test_code.py to allow void foo(void) #1757.
  • Make backslash escapes in test case string data more regular: backslash always escapes the next non-alphanumeric character. (Impacts some existing X.509 tests.)
  • A little cleanup here and there. No completeness objective there.

Planned follow-ups:

  1. Some refactoring in the Python code, which I won't start while there are unreviewed changes to that code.
  2. Add support for unsigned integer types.
  3. Update documentation in the knowledge base.
  4. Get rid of all the foo_t x = x_arg hacks in many test functions (especially PSA).

Gatekeeper checklist

@gilles-peskine-arm gilles-peskine-arm added enhancement needs-ci Needs to pass CI tests size-s Estimated task size: small (~2d) component-test Test framework and CI scripts labels Dec 4, 2022
@gilles-peskine-arm gilles-peskine-arm force-pushed the test-argument-types-union branch from 383ae9c to 0ee469d Compare December 4, 2022 16:41
@gilles-peskine-arm gilles-peskine-arm added needs-review Every commit must be reviewed by at least two team members, needs-reviewer This PR needs someone to pick it up for review and removed needs-ci Needs to pass CI tests labels Dec 4, 2022
@gilles-peskine-arm gilles-peskine-arm added the priority-medium Medium priority - this can be reviewed as time permits label Dec 15, 2022
@gilles-peskine-arm gilles-peskine-arm force-pushed the test-argument-types-union branch from 0ee469d to ac58d3b Compare January 19, 2023 11:27
@gilles-peskine-arm
Copy link
Contributor Author

Force-pushed with the new code style.

@gilles-peskine-arm gilles-peskine-arm force-pushed the test-argument-types-union branch from ac58d3b to 032af91 Compare February 3, 2023 12:58
@gilles-peskine-arm
Copy link
Contributor Author

Rebased due to trivial conflict (consecutive changes)

@@ -31,6 +31,31 @@ exit:
}
/* END_CASE */

/* BEGIN_CASE */
void printf_long_max(const char *format, long value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understand, this test is designed to make sure the test framework could process long int value (actually only LONG_MAX could be tested) correctly. So the parameters seem to be redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing value is necessary because the main intent of this test is to check that the test framework passes the value correctly.

Passing format is not strictly necessary, but this way we're testing the mbedtls_printf function and not a possible compiler optimization. Also it makes the structure of the different tests more uniform. Also it may make it easier to generalize the test in the future.

Copy link
Contributor

@lpy4105 lpy4105 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, except a discussion around the type naming.

Copy link

@yanrayw yanrayw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me, but I left some questions to discuss plus some potential improvements.

if not INT_VAL_REGEX.match(val):
return False
# Limit the range to what is guaranteed to get through strtol()
return abs(int(val, 0)) <= 0x7fffffff
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we support long after commit acfffeb, should we increase the size to LONG_MAX(0x7fffffffffffffff)? If not, the value longer than 0x7fffffff is considered as exp not an integer type. So verify_int won't process a value larger than 0x7fffffff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the comment indicates, currently, the value needs to be supported by strtol. 0x7fffffff is the largest value that is guaranteed to fit in long. long can be a 32-bit type (and it usually is a 32-bit type on 32-bit architectures, and also on 64-bit Windows).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we don't have to use strtol() - we could use strtoll() or even write a simple conversion routine for ourselves (this is in the test code after all) - we only need bases 10 and 16

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, because it's test code, we needn't bother reimplement the standard library.

And anyway that doesn't matter. The exp path has to exist anyway (so we can use any constant expression, including macro names that the python code could not recognize). So we might as well use it. The int path is just an optimization to reduce the size of the generated code.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the comment indicates, currently, the value needs to be supported by strtol. 0x7fffffff is the largest value that is guaranteed to fit in long. long can be a 32-bit type (and it usually is a 32-bit type on 32-bit architectures, and also on 64-bit Windows).

This makes sense to me. I was assuming LONG_MAX(0x7fffffffffffffff) is the largest value to fit in long but I didn't consider it could be a 32-bit type.

We're using the non-standard function strcasecmp() just so that the case
of digits beyond 9 can be different in the library and in the test data.
Use matching case in the test data, and use a standard function for the
comparison.

Signed-off-by: Gilles Peskine <[email protected]>
It is no longer used.

Signed-off-by: Gilles Peskine <[email protected]>
We now require at least Visual Studio 2013, which has stdint.h per
 https://learn.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2013/y4hta57s(v=vs.120)
so the workaround to define C99 types on pre-C99 MSVC is no longer needed.

Signed-off-by: Gilles Peskine <[email protected]>
This is just a quick improvement, not meant to tackle the problem as a
whole.

Signed-off-by: Gilles Peskine <[email protected]>
No intended behavior change. This commit is mainly to satisfy pylint, which
complains that gen_from_test_data now has too many variables. But it's a
good thing anyway to make the function a little more readable.

Signed-off-by: Gilles Peskine <[email protected]>
Treat backslash as a universal escape character: "\n" is a newline,
backslash escapes any non-alphanumeric character.

This affects some test cases that had "\," standing for backslash-comma.
With the new uniform treatment of backslashes, this needs to be "\\,".

Signed-off-by: Gilles Peskine <[email protected]>
In the .datax parser, since we're calling strtol() anyway, rely on it for
verification. This makes the .datax parser very slightly more
liberal (leading spaces and '+' are now accepted), and changes the
interpretation of numbers with leading zeros to octal.

Before, an argument like :0123: was parsed as decimal, but an argument like
:0123+1: was parsed as a C expression and hence the leading zero marked an
octal representation. Now, a leading zero is always interpreted according to
C syntax, namely indicating octal. There are no nonzero integer constants
with a leading zero in a .data file, so this does not affect existing test
cases.

In the .datax generator, allow negative arguments to be 'int' (before, they
were systematically treated as 'exp' even though they didn't need to be).

In the .datax parser, validate the range of integer constants. They have to
fit in int32_t. In the .datax generator, use 'exp' instead of 'int' for
integer constants that are out of range.

Signed-off-by: Gilles Peskine <[email protected]>
The test framework stores size_t and int32_t values in the parameter store
by converting them all to int. This is ok in practice, since we assume int
covers int32_t and we don't have test data larger than 2GB. But it's
confusing and error-prone. So make the parameter store a union, which allows
size_t values not to be potentially truncated and makes the code a little
clearer.

Signed-off-by: Gilles Peskine <[email protected]>
Use normalization the equality comparisons instead of loose regular
expressions to determine the type of an argument of a test function.

Now declarations are parsed in a stricter way: there can't be ignored junk
at the beginning or at the end. For example, `long long unsigned int x`
was accepted as a test function argument (but not `long long unsigned x`),
although this was misleading since the value was truncated to the range of
int. Now only recognized types are accepted.

The new code is slightly looser in that it accepts `char const*` as well as
`const char*`.

Signed-off-by: Gilles Peskine <[email protected]>
Change the type of signed integer arguments from int32_t to intmax_t.
This allows the C code to work with test function arguments with a range
larger than int32_t. A subsequent commit will change the .datax generator
to support larger types.

Signed-off-by: Gilles Peskine <[email protected]>
Now that the C code supports the full range of intmax_t, allow any size of
signed integer type in the .data file parser.

Signed-off-by: Gilles Peskine <[email protected]>
Now that the test framework can pass arbitrary values of type
mbedtls_mpi_sint, just do that.

Signed-off-by: Gilles Peskine <[email protected]>
Signed-off-by: Gilles Peskine <[email protected]>
Signed-off-by: Gilles Peskine <[email protected]>
@gilles-peskine-arm gilles-peskine-arm force-pushed the test-argument-types-union branch from e3f49d1 to 9a75131 Compare April 26, 2023 17:40
@gilles-peskine-arm
Copy link
Contributor Author

I rebased on top of the latest development to handle a conflict. There were two places where the commit "Simplify string escapes" changes a line that's adjacent to a depend_on line that changed in development. No change in development adds or changes a line containing \, that would need to be changed.

I will address Yanray's comments in subsequent commits.

The test framework used to treat them specially (but no longer does). Add
these test cases as non-regression for how the test framework allows "?"
and especially "??" (which I think in the very distant path needed special
handling because the test data was embedded in a .c file, and thus ?? could
be interpreted as the prefix of a trigraph).

Signed-off-by: Gilles Peskine <[email protected]>
Signed-off-by: Gilles Peskine <[email protected]>
@gilles-peskine-arm gilles-peskine-arm added needs-ci Needs to pass CI tests and removed needs-review Every commit must be reviewed by at least two team members, needs-reviewer This PR needs someone to pick it up for review labels Apr 26, 2023
@gilles-peskine-arm gilles-peskine-arm added needs-review Every commit must be reviewed by at least two team members, needs-backports Backports are missing or are pending review and approval. and removed needs-ci Needs to pass CI tests labels Apr 26, 2023
Copy link

@yanrayw yanrayw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me, thanks!

Copy link
Contributor

@lpy4105 lpy4105 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gilles-peskine-arm gilles-peskine-arm added approved Design and code approved - may be waiting for CI or backports and removed needs-review Every commit must be reviewed by at least two team members, labels Apr 28, 2023
@gilles-peskine-arm gilles-peskine-arm merged commit 2eff33c into Mbed-TLS:development May 4, 2023
@gilles-peskine-arm gilles-peskine-arm mentioned this pull request May 25, 2023
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Design and code approved - may be waiting for CI or backports component-test Test framework and CI scripts enhancement needs-backports Backports are missing or are pending review and approval. priority-medium Medium priority - this can be reviewed as time permits size-s Estimated task size: small (~2d)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix generate_test_code.py to allow void foo(void)
4 participants