Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v3.2] Repatriate from v2.4 #4519

Merged
merged 49 commits into from
Sep 21, 2022
Merged

[v3.2] Repatriate from v2.4 #4519

merged 49 commits into from
Sep 21, 2022

Conversation

d6e-automaton
Copy link
Collaborator

@d6e-automaton d6e-automaton commented Sep 16, 2022

This PR merges release/v2.4 in to master.


notes from @LukeShu :

This comes out to be just the TCPMapping PRs, since everything else in 2.4 was backports.

LukeShu and others added 30 commits August 31, 2022 20:03
These two bumps are required for pytest to work with Python 3.10.
This doesn't matter for builds, or for in CI, but this does matter for
developers whose system Python is up-to-date.

Since Emissary 1.14 is in maintenance mode, I did the smallest
upgrades possible to get things working.

For typed-ast, that means including
python/typed_ast#158, which was first included
in 1.4.3.

For pytest, that means 6.2.5, since that's when the changelog mentions
"Python 3.10 is now supported."
https://docs.pytest.org/en/stable/changelog.html#pytest-6-2-5-2021-08-29

Signed-off-by: Luke Shumaker <[email protected]>
Signed-off-by: Luke Shumaker <[email protected]>
(cherry picked from commit fc40a4d)
Signed-off-by: Luke Shumaker <[email protected]>
(cherry picked from commit 03be3e9)
The value was never passed from the configs to the IR objects. This
commit fixes that and adds to test to ensure the clusters for these
services include the alt_stat_name.

Co-authored-by: Lance Austin <[email protected]>
Signed-off-by: Paul Salaberria <[email protected]>
(cherry picked from commit 0e0bd6a)
In emissary-ingress v3.y.z we removed support for Envoy
Configuration V2 but is still valid in Emissary-Ingress
v2.y.z.

This commit enriches the existing fixes test to include
testing envoy configuration V3. There was also a number
of formatting differences due to adopting Black for
formating in v3.y branch.

Signed-off-by: Lance Austin <[email protected]>
Signed-off-by: Luke Shumaker <[email protected]>

(cherry picked from commit e3af51a)

This cherry pick is needed because several of the certs are now expired;
I don't care about the SAN, I care about the timestamps.

I also had to touch up t_tls.py, since it's checking for some details
of the certs, and those details changed.  I didn't have to do that in
the original commit because I guess those tests had been marked as
xfail.  Yikes!

Signed-off-by: Luke Shumaker <[email protected]>
Although this exist in other places it has a lot of dependencies
and is hard to figure out what command to run. This adds a
simple target that only needs to be run once when developing
against the python components of emissary-ingress.

It provides a simple developer QOL improvement.

Signed-off-by: Lance Austin <[email protected]>
Updates to the latest flask version to address cve warnings and ensures
we are on the latest release for our deps. It also updated a few of the
indirect dependencies for flask

Signed-off-by: Lance Austin <[email protected]>
Signed-off-by: David Dymko <[email protected]>
Signed-off-by: David Dymko <[email protected]>
Signed-off-by: Luke Shumaker <[email protected]>
When splitting KAT tests among multiple jobs, we need a listing of all
of the KAT tests; we'll assign each entry in this list to a different
job.

The problem with that right now is that the expressions in that
listing are sometimes ambiguous.  For example, "AuthenticationTest"
shows up as `kat[AuthenticationTest`.  Notice that it doesn't include
a trailing `]`; this causes it to also match the
"AuthenticationTestV1".  If both of those tests were going to be
scheduled on the same job, that's fine; but if they were going to be
scheduled on different jobs, then that "AuthenticationTestV1" test
will run in *both* jobs, potentially overloading the one where it
wasn't supposed to run.

This is probably less of a problem for that one test, and a bigger
problem for common prefixes like "TLS".

Here is a listing of all such ambiguities:

    $ grep -f <(<build-aux/pytest-kat.txt sed -e 's/\[/\\[/g' -e 's/$/./') build-aux/pytest-kat.txt
    kat[AuthenticationTestV1
    kat[ClientCertificateAuthenticationContext
    kat[ClientCertificateAuthenticationContextCRL
    kat[DnsTtlDnsType
    kat[DnsTtlEndpoint
    kat[ErrorResponseMappingBypassAlternate
    kat[ErrorResponseOnStatusCodeMappingCRD
    kat[HostCRDDoubleCrossNamespace
    kat[HostCRDManualContextCRL
    kat[LogicalDnsTypeEndpoint
    kat[RedirectTestsInvalidSecret
    kat[RedirectTestsWithProxyProto
    kat[TLSCoalescing
    kat[TLSContextCipherSuites
    kat[TLSContextIstioSecretTest
    kat[TLSContextProtocolMaxVersion
    kat[TLSContextProtocolMinVersion
    kat[TLSContextTest
    kat[TLSContextsTest
    kat[TLSIngressTest
    kat[TLSInheritFromModule
    kat[TLSInvalidSecret
    kat[TLSOriginationSecret
    kat[TracingTestLongClusterName
    kat[TracingTestSampling
    kat[TracingTestShortTraceId
    kat[TracingTestZipkinV1
    kat[TracingTestZipkinV2

Signed-off-by: Luke Shumaker <[email protected]>
(cherry picked from commit 67279ec)
Yep, because of the horror that is self.skip_node!  Eradicating skip_node
seems like a bigger task, so just accept the horror as is.

Signed-off-by: Luke Shumaker <[email protected]>
…hods

The motivation of this is

 1. To help shorten the lengthy __init__() method so that it can fit
    on my screen and in my head, so I can understand it in order to
    edit it.

 2. To make more things read-only so I don't need to worry so much
    about mutable state.

Signed-off-by: Luke Shumaker <[email protected]>
This reduces the amount of remembered state we need to keep track of;
it shouldn't cost us any CPU to defer it until later.

Signed-off-by: Luke Shumaker <[email protected]>
…do for HTTP chains

This is in-line with the previously-seen theme of "help shorten the
lengthy __init__() method so that it can fit on my screen and in my
head".  But also making the two types of things to be more similar is
a good thing.

Signed-off-by: Luke Shumaker <[email protected]>
TestBootStrapNoNotifyBeforeSync creates ConfigMaps during its tests which doesn't get cleaned up afterwards
potentially infecting other tests. Add t.Cleanup to the test to clean up those ConfigMaps after the test is finished.

Signed-off-by: Hamzah Qudsi <[email protected]>
The Accumulator struct attempts to coalece changes into a single snapshot update as a way to do graceful load shedding.
However, while this was the behavior on bootstrap, it didn't always happen mid-watch - each event that was received turned into a single snapshot update, thus not really satisfying this requirement.

We add a new option to batch changes for a specified window interval before sending a snapshot update.
The batching behavior is as follows:
 - The Accumulator will receive raw changes up until the window period where it will then send a change, even if new updates are still coming in.
   This is to prevent the potential of a scenario where a change is never sent due to an extremely volatile cluster.
   While there may be a way to be more dynamic in how long to wait before sending this change, this approach is simpler and more predicable.

 - If an isolated updated comes in (e.g. last change was submitted an hour ago but the window period is set to 10s), it may not neccessarily wait until the window period before sending change, it can send immediately.

 - The default interval is set to 1s to be inline with current change velocity.

 - A snapshot update won't be sent until all resources are fully bootstrapped, regardless of what interval is set.
   This is the ensure that the other requirements for the Accumulator are still satisfied.

For testing, we add new test cases.

Signed-off-by: Hamzah Qudsi <[email protected]>
AMBASSADOR_RECONFIG_MAX_DELAY controls the interval to wait before sending snapshot updates when listening for K8s resources, especially when many resources are updated in quick succession.

Signed-off-by: Hamzah Qudsi <[email protected]>
Signed-off-by: Hamzah Qudsi <[email protected]>
This type of magic makes the codebase inconsistent and surprising to
newbies.  And in this case, it happens to not be doing anything for
us, so drop it.

Signed-off-by: Luke Shumaker <[email protected]>
LukeShu and others added 18 commits September 15, 2022 16:19
I couldn't follow what belonged to the Listener and what belonged to the
Host; use fewer pointless variables, or give them better names.

Signed-off-by: Luke Shumaker <[email protected]>
I had to do some soul-searching on this one.

OK, so what happens when:
* A `Listener` with `TLS` in its `protocolStack`, and
* An `IRHost` that has `secure_action = 'Route'`, but does *not* have
  a TLS configuration

> Note that evidently someone at one point clearly intended for
> `IRHost.secure_action` to eventually be configurable, but that it is
> _always_ 'Route'.  (Unless maybe somehow `IRHost.setup()` hasn't
> been called yet, in which case it's empty?)

Current behavior: Emit a FilterChain for that `IRHost` with (1)
`transport_protocol: tls`, with no (2) `transport_socket` (for xDS v3;
`tls_context` for xDS v2), and with (3) an http connection manager as
the first filter.

Change that behavior to: Don't emit the FilterChain for that `IRHost`.

There are arguments you can make that this new behavior is wrong, and
that actually it should emit a FilterChain with a different
`transport_protocol` or something.  But here's the thing: this new
behavior is *less* wrong than the current behavior, and is a safe baby
step.

Justification:
 - If the Listener says `TLS` in it, then listening for something
   without terminating TLS is flat out NOT OK EVER.
   +  counterpoint: layering with PROXY proto; what if we decrypt
      before handling PROXY?
      * counter-counterpoint: there's no way that works right now
        anyway.
 - I'm 99% sure when the not-decrypted traffic (because there's no
   `transport_socket`/`tls_context`) hits the http connection manager,
   the http connection manager will just drop the connection because
   it can't understand it; so the FilterChain was lame anyway.

As the cherry on top, I don't believe that we have _any_ tests that
check what the behavior here is.  There are a handful of tests that
trigger it, but only incidentally.  Why don't we have tests!?  Wasn't
stuff around the half the point of adding the `Listener` resource and
doing a 2.0 major breaking change!?  This is existentially troubling
for me.

Anyway, implementation notes:

The main thing is just adding `and host.context` to the `if` check
that decides whether to call `self.add_chain()`.  Further down, I
remove a now redundant `if chain.context:` check around setting
`transport_socket`/`tls_context`.

Signed-off-by: Luke Shumaker <[email protected]>
…ad of an IRHost

Setting self.context after initialization was awkward.  Taking hostname
separately is setting up for a later bugfix.

Signed-off-by: Luke Shumaker <[email protected]>
…hing Host's hostname when creating filter chains

Co-authored-by: Flynn <[email protected]>
Co-authored-by: Luke Shumaker <[email protected]>

Signed-off-by: Flynn <[email protected]>
Signed-off-by: Luke Shumaker <[email protected]>
This means no longer adding a Host to a chain if we're just going to
ignore it for a TCPMapping.

Signed-off-by: Luke Shumaker <[email protected]>
On pytest-kat-envoy2-5-of-5 CI jobs, I'm seeing weird things about the
XFPRedirect test's Host not appearing.  I'm wondering if (hoping that?)
it's because there are conflicting `ambassador-listener-8080` resources
owned by different tests, and that's causing it to not notice the Host?
I'm not entirely sure that makes sense, but it's my stab in the dark.

Even if that's not it, this is a good cleanup.

Signed-off-by: Luke Shumaker <[email protected]>
I'd accidentally included an extra `extra_port` in one of the tests
because I'd copied it from a different test.

Thanks to @haq204 for pointing out this mistake.

Signed-off-by: Luke Shumaker <[email protected]>
Signed-off-by: David Dymko <[email protected]>
@LukeShu LukeShu changed the title [v2.5] Repatriate from v2.4 [v3.1] Repatriate from v2.4 Sep 20, 2022
@LukeShu LukeShu force-pushed the ci/repatriate/from-v2.4-to-v2.5 branch from 10c937b to 22ecd95 Compare September 21, 2022 17:14
@LukeShu LukeShu changed the title [v3.1] Repatriate from v2.4 [v3.2] Repatriate from v2.4 Sep 21, 2022
@LukeShu LukeShu force-pushed the ci/repatriate/from-v2.4-to-v2.5 branch from 2ae946c to 80eb9c2 Compare September 21, 2022 17:57
Copy link
Contributor

@LanceEa LanceEa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm...thanks!

@LanceEa
Copy link
Contributor

LanceEa commented Sep 21, 2022

Oh just noticed...lgtm once green :). I'm guessing flakes.

@LukeShu LukeShu merged commit f606fe0 into master Sep 21, 2022
@LukeShu LukeShu deleted the ci/repatriate/from-v2.4-to-v2.5 branch September 21, 2022 19:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants