Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v1.14] Add more tests for TCPMappings #4477

Merged
merged 5 commits into from
Sep 16, 2022
Merged

Conversation

LukeShu
Copy link
Contributor

@LukeShu LukeShu commented Sep 1, 2022

Description

The goal of this PR is to have a baseline set of tests for "this is how TCPMappings worked in 1.14". It is not quite as complete as I would like it to be; see the "TODO" comments at the bottom of t_tcpmapping.py.

Also included are a number of backports from v2.x; some to make my test-writing experience better, some to help avoid pointless conflicts when repatriating this in to v2.4.

The KAT_RUN_MODE=envoy tests are failing in CI (err, raises the flake rate to unbearable levels--I can't believe only one job failed this last time; I've never seen any of the envoy jobs pass until now), because apparently this small handful of tests pushes it over what will fit on a CircleCI xlarge runner, and we'd have to upgrade our plan to unlock 2xlarge runners (even though 2xlarge runners are already twice the price-per-minute, you have to pay CircleCI more money for the privilege of paying them different more money). So I guess this PR doesn't get to land, and instead just gets to serve as the starting point for #4493. Still, I wouldn't close this PR as long as v1.14 is a supported version.

Related Issues

Testing

I've verified that the tests work with KAT_RUN_MODE=envoy.

Checklist

  • I made sure to update CHANGELOG.md. - no applicable changes

    Remember, the CHANGELOG needs to mention:

    • Any new features
    • Any changes to our included version of Envoy
    • Any non-backward-compatible changes
    • Any deprecations
  • This is unlikely to impact how Ambassador performs at scale. - not a runtime change

    Remember, things that might have an impact at scale include:

    • Any significant changes in memory use that might require adjusting the memory limits
    • Any significant changes in CPU use that might require adjusting the CPU limits
    • Anything that might change how many replicas users should use
    • Changes that impact data-plane latency/scalability
  • My change is adequately tested. - that's all this is

    Remember when considering testing:

    • Your change needs to be specifically covered by tests.
      • Tests need to cover all the states where your change is relevant: for example, if you add a behavior that can be enabled or disabled, you'll need tests that cover the enabled case and tests that cover the disabled case. It's not sufficient just to test with the behavior enabled.
    • You also need to make sure that the entire area being changed has adequate test coverage.
      • If existing tests don't actually cover the entire area being changed, add tests.
      • This applies even for aspects of the area that you're not changing – check the test coverage, and improve it if needed!
    • We should lean on the bulk of code being covered by unit tests, but...
    • ... an end-to-end test should cover the integration points
  • I updated DEVELOPING.md with any any special dev tricks I had to use to work on this code efficiently.

  • The changes in this PR have been reviewed for security concerns and adherence to security best practices.

These two bumps are required for pytest to work with Python 3.10.
This doesn't matter for builds, or for in CI, but this does matter for
developers whose system Python is up-to-date.

Since Emissary 1.14 is in maintenance mode, I did the smallest
upgrades possible to get things working.

For typed-ast, that means including
python/typed_ast#158, which was first included
in 1.4.3.

For pytest, that means 6.2.5, since that's when the changelog mentions
"Python 3.10 is now supported."
https://docs.pytest.org/en/stable/changelog.html#pytest-6-2-5-2021-08-29

Signed-off-by: Luke Shumaker <[email protected]>
@LukeShu LukeShu changed the base branch from master to release/v1.14 September 1, 2022 04:36
@LukeShu LukeShu force-pushed the lukeshu/v1.14-tcpmapping branch 4 times, most recently from 48213ab to 2835001 Compare September 6, 2022 19:29
Signed-off-by: Luke Shumaker <[email protected]>
(cherry picked from commit fc40a4d)
Signed-off-by: Luke Shumaker <[email protected]>
(cherry picked from commit 03be3e9)
@LukeShu LukeShu force-pushed the lukeshu/v1.14-tcpmapping branch 2 times, most recently from 178450b to ee072c9 Compare September 7, 2022 20:36
Signed-off-by: Luke Shumaker <[email protected]>

(cherry picked from commit e3af51a)

This cherry pick is needed because several of the certs are now expired;
I don't care about the SAN, I care about the timestamps.

I also had to touch up t_tls.py, since it's checking for some details
of the certs, and those details changed.  I didn't have to do that in
the original commit because I guess those tests had been marked as
xfail.  Yikes!

Signed-off-by: Luke Shumaker <[email protected]>
@LukeShu LukeShu force-pushed the lukeshu/v1.14-tcpmapping branch from ee072c9 to 02a69a9 Compare September 7, 2022 20:47
@LukeShu LukeShu changed the title Lukeshu/v1.14 tcpmapping [v1.14] Add more tests for TCPMappings Sep 7, 2022
@LanceEa
Copy link
Contributor

LanceEa commented Sep 12, 2022

@LukeShu - General question on the Gold Tests...I thought we dropped those in v3, correct? What was the reason for dropping these? Did we not feel they were beneficial, duplicates of other tests, or some other reason? I see these are still available in v2 so we at least kept them around for that.

@LukeShu
Copy link
Contributor Author

LukeShu commented Sep 13, 2022

General question on the Gold Tests...I thought we dropped those in v3, correct? What was the reason for dropping these? Did we not feel they were beneficial, duplicates of other tests, or some other reason?

The annotation fixes in 2.1.2 (#4016) willfully broke our ability to run the gold tests; as the mechanisms it uses to run those tests need substantially reworked to work with the fixes.

That said, in the months since then, I have come to believe that we don't get value from the gold tests. The gold tests were an hacky optimization; avoiding CI-billing, dev-wait-time, and CI-flakes associated with running the full KAT_RUN_MODE=envoy tests.

  • However, we had previously (in the 1.x timeframe) had several incidents where we were bitten by a developer mistakenly checking in bad gold files and a regression getting shipped; as a corrective action, we started running the KAT_RUN_MODE=envoy tests in CI (you should be able to find postmortems in Notion). In my opinion; at that point the gold tests are redundant; but potentially still useful for saying "oh, the envoy failed but the gold didn't change; this is almost certainly a CI flake". I'm not sure anyone intended for "have CI always run both gold and envoy" to be a long term situation; but a temporary CI bloat while we figure out how to safely trim CI back down.
  • The value that we'd get by switching back to running the tests with gold is minimal:
    • We have migrated from CircleCI to GitHub Actions; GHA is free for Emissary, so there's no costs to cut there; though there is in Edge Stack.
    • We have split the envoy tests across multiple jobs; this has both cut time dev-wait-time and reduced flakes. The "slow" envoy tests now run faster than the "fast" gold tests ever did.
    • As we've been forced to live with the Envoy tests in CI, we've dramatically improved their flakiness.

That said, I never managed to convince @kflynn that the gold tests aren't worth bringing back; so the scripts related to them haven't been deleted. Flynn, would you care to weigh in on why you believe that the gold tests are worth bringing back?

I see these are still available in v2 so we at least kept them around for that.

No? The gold directory does not exist in release/v2.4. Some of the machinery around gold is still there; but because it's not getting run, the gold files are dead weight, so we deleted them. If we start running them again, all that it would take to bring them back is make pytest-gold.

@LukeShu LukeShu mentioned this pull request Sep 13, 2022
5 tasks
@LukeShu LukeShu marked this pull request as ready for review September 13, 2022 23:46
@LanceEa LanceEa self-requested a review September 14, 2022 13:41
@LukeShu
Copy link
Contributor Author

LukeShu commented Sep 16, 2022

I've filed #4515

@haq204
Copy link

haq204 commented Sep 16, 2022

do any fixes from #4493 need to be ported over?

@LukeShu
Copy link
Contributor Author

LukeShu commented Sep 16, 2022

do any fixes from #4493 need to be ported over?

No, the only fixes that #4493 has are 2.x-specific tests.

@LukeShu LukeShu merged commit c9086e4 into release/v1.14 Sep 16, 2022
@LukeShu LukeShu deleted the lukeshu/v1.14-tcpmapping branch September 16, 2022 15:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants