Add Versioned API and Antctl for NetworkPolicyEvaluation (effective policy rule prediction) #5740

qiyueyao · 2023-11-21T14:55:01Z

Added antctl support for querying the effective policy rule with networkpolicyevaluation:
antctl query networkpolicyevaluation -S ns1/pod1 -D ns2/pod2
Added a versioned API for NetworkPolicy evaluation, served through a POST call:
curl -d "@<test json file>" -H "Content-Type: application/json" -X POST <k8s-apiserver>:8001/apis/controlplane.antrea.io/v1beta2/networkpolicyevaluation

This PR adds the API and antctl query above, which return the predicted effective NetworkPolicy rule for traffic from ns1/pod1 to ns2/pod2. When several rules match the query, the one with the highest priority is picked.

pkg/antctl/antctl.go

pkg/apiserver/handlers/networkpolicyanalysis/handler.go

pkg/antctl/command_definition.go

pkg/apiserver/handlers/networkpolicyanalysis/handler.go

docs/antctl.md

pkg/antctl/antctl.go

pkg/antctl/command_definition.go

pkg/controller/networkpolicy/endpoint_querier.go

pkg/apiserver/handlers/networkpolicyanalysis/handler.go

antoninbas · 2023-12-19T20:04:40Z

pkg/apiserver/handlers/networkpolicyanalysis/handler.go

+
+// HandleFunc creates a http.HandlerFunc which uses an AgentNetworkPolicyInfoQuerier
+// to query network policy rules in current agent.
+func HandleFunc(eq networkpolicy.EndpointQuerier) http.HandlerFunc {


I was just curious as to whether some thought was given regarding implementing all this logic in antctl, rather than in the controller API server? I imagine that in theory antctl query networkpolicyanalysis and antctl query endpoint could share a single API (with extra information compared to what /endpoints provide today) and that the rule comparison / ordering could happen in antctl instead of in the server. If however, we think that there is a need for a dedicated API providing the end result (e.g., because it needs to be consumed by some other component), then that would explain why we need a dedicated API.

Yes we do intend to have this API consumed by some downstream components

Since this tends to be a public API and not only consumed by antctl, should we make a well-defined, structured, and versioned resource API? The API looks helpful to me and could perhaps be extended to test Pod-to-IP as well.
If it's a non-resource API like the current one, it may be hard to evolve and handle compatibility with its consumers.
I think it could be an API under the controlplane API group, like the group membership APIs. There are quite a few similar examples among the Kubernetes APIs that serve as ephemeral queries, such as AdmissionReview, SubjectAccessReview, TokenReview, TokenRequest, and CertificateSigningRequest.
There will be plenty of benefits by making it a resource API:

It can be exposed via APIService, so any consumer running outside the cluster (including antctl) can access it publicly.

Generated client code makes it easier to consume, instead of rewriting the parser code in every client.

It's versioned, so it's easier to support different versions of clients.

The request and response can be better structured when extending it.

For example, the data type in my mind:

type NetworkPolicyAccessReview struct {
	metav1.TypeMeta
	Request  *NetworkPolicyAccessRequest
	Response *NetworkPolicyAccessResponse
}

type Entity struct {
	Namespace string
	Pod       string
	// It can be added when we can support IP check.
	IP string
}

type NetworkPolicyAccessRequest struct {
	Source      Entity
	Destination Entity
	// It can be added when we can support port level check.
	Protocol        Protocol
	SourcePort      int
	DestinationPort int
}

type NetworkPolicyAccessResponse struct {
	// The expected action.
	Action Action
	// The reference of the effective NetworkPolicy.
	NetworkPolicy NetworkPolicyReference
	Direction     cpv1beta.Direction
	RuleIndex     int
	// The content of the effective rule. Type is runtime.Object because it can be different types.
	Rule runtime.Object
}

Changed it to a resource API and updated the antctl command to call it.

pkg/apiserver/handlers/networkpolicyanalysis/handler.go

pkg/antctl/antctl.go

docs/antctl.md

pkg/antctl/antctl.go

jianjuns · 2024-01-09T23:17:58Z

pkg/antctl/antctl.go

-			transformedResponse: reflect.TypeOf(controllernetworkpolicy.EndpointQueryResponse{}),
+			transformedResponse: reflect.TypeOf(endpointServer.EndpointQueryResponse{}),
+		},
+		{use: "networkpolicyanalysis",


Should we name it "effectivepolicyrule"?

Currently implementing this comment: #5740 (comment), so perhaps this will be "networkpolicyaccess"?

I assume that comment is about making the API definition generic? But the antctl command here is just for a single operation of returning the effective rule. Or do you plan to extend the command to support generic policy queries later?

Named the API networkpolicyaccessreview and named the antctl command effectivepolicyrule.

Let us hear what @antoninbas and @tnqn may suggest too.

Sorry, I used NetworkPolicyAccessReview in #5740 (comment) just as an example and didn't think too much about the naming. Now I feel it sounds different from what it actually represents. How about naming it NetworkPolicyEnforcement, which may also be used as the cmd name, i.e. antctl query networkpolicyenforcement?

Since the API and command don't enforce the current configs but only query them, perhaps "enforcement" would be misleading? How about NetworkPolicyEffectiveRule, NetworkPolicyPrimeRule, NetworkPolicyEvaluation, or NetworkPolicyAnalysis?

I agree with Qiyue that without the query verb in antctl, NetworkPolicyEnforcement as an API name could be misleading as it does not return the actual policy enforcement state. I personally would vote for NetworkPolicyEvaluation but love to hear what @tnqn and @jianjuns think as well

NetworkPolicyEvaluation sounds good to me too.

Renamed the types.

pkg/apis/controlplane/types.go

pkg/apiserver/apiserver.go

pkg/apiserver/handlers/endpoint/handler.go

pkg/controller/networkpolicy/endpoint_querier.go

pkg/antctl/command_definition.go

pkg/antctl/antctl.go

pkg/apis/controlplane/types.go

antoninbas · 2024-02-07T19:20:55Z

pkg/antctl/transform/networkpolicy/transform.go

+	return ns, pod
+}
+
+func NewNetworkPolicyEvaluation(args map[string]string) (runtime.Object, error) {


I thought this package was meant for output transforms. For parameter transform (which is a new concept), maybe it should be in a separate package?

Created a new package parameter under antctl.

pkg/apiserver/handlers/endpoint/handler.go

pkg/apiserver/registry/networkpolicy/networkpolicyevaluation/rest.go

pkg/controller/networkpolicy/endpoint_querier_test.go

tnqn · 2024-02-09T08:58:35Z

pkg/antctl/parameter/parameter.go

+	return ns, pod
+}
+
+func NewNetworkPolicyEvaluation(args map[string]string) (runtime.Object, error) {


Should it be moved to transform/networkpolicy, like the type of the response? Otherwise this file would include all parameters of discrete commands.

The function was moved here as a new package parameter based on this comment, if I understood it correctly.

We should probably find a better package name / location in a future PR though

Sure, do you have any suggestion or insight? To include it inside transform or refactor some of the existing structs?

Thinking about it some more, maybe we could keep it in transform/networkpolicy, but as a separate file to clearly mark the difference? Maybe we could have transform/networkpolicy/request.go and transform/networkpolicy/response.go?

Sounds good, will open a PR for this.
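To make the "parameter transform" idea in this thread concrete, here is a rough sketch of turning the `-S`/`-D` style arguments into a request. It is not the code from this PR: `parsePodRef`, `newRequest`, the `source`/`destination` map keys, and the fallback to the `default` namespace are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// podRef is a hypothetical, simplified reference to a Pod.
type podRef struct {
	Namespace string
	Name      string
}

// request is a hypothetical stand-in for the evaluation request.
type request struct {
	Source      podRef
	Destination podRef
}

// parsePodRef splits "[namespace/]pod" into its parts. Falling back to the
// "default" namespace when none is given is an assumption of this sketch.
func parsePodRef(arg string) (podRef, error) {
	parts := strings.Split(arg, "/")
	var ref podRef
	switch len(parts) {
	case 1:
		ref = podRef{Namespace: "default", Name: parts[0]}
	case 2:
		ref = podRef{Namespace: parts[0], Name: parts[1]}
	default:
		return podRef{}, fmt.Errorf("invalid Pod reference %q: expected [namespace/]pod", arg)
	}
	if ref.Namespace == "" || ref.Name == "" {
		return podRef{}, fmt.Errorf("invalid Pod reference %q: empty namespace or Pod name", arg)
	}
	return ref, nil
}

// newRequest builds a request from a flag map; the key names are assumed.
func newRequest(args map[string]string) (*request, error) {
	src, err := parsePodRef(args["source"])
	if err != nil {
		return nil, err
	}
	dst, err := parsePodRef(args["destination"])
	if err != nil {
		return nil, err
	}
	return &request{Source: src, Destination: dst}, nil
}

func main() {
	req, err := newRequest(map[string]string{"source": "ns1/pod1", "destination": "ns2/pod2"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", *req)
}
```

Keeping request construction in its own file, separate from response transforms, matches the request.go / response.go split suggested above.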

pkg/antctl/transform/networkpolicy/transform.go

pkg/apis/controlplane/v1beta2/types.go

tnqn · 2024-02-09T09:11:55Z

pkg/apiserver/apiserver.go

@@ -214,6 +216,7 @@ func installAPIGroup(s *APIServer, c completedConfig) error {
 	cpv1beta2Storage["appliedtogroups"] = appliedToGroupStorage
 	cpv1beta2Storage["networkpolicies"] = networkPolicyStorage
 	cpv1beta2Storage["networkpolicies/status"] = networkPolicyStatusStorage
+	cpv1beta2Storage["networkpolicyevaluation"] = networkPolicyEvaluationStorage


should we use "networkpolicies/evaluation" as the path to indicate their relationship? Like "pods/log", "pods/attach", "pods/exec" APIs?

I did some testing and found that, since NetworkPolicyEvaluation is currently a resource, using networkpolicies/evaluation would require making evaluation a subresource of networkpolicies. Unlike status and scale, however, a NetworkPolicyEvaluation request is not tied to a particular NetworkPolicy, so I cannot provide a NetworkPolicy name for the subresource. I feel this may have to stay a separate resource, like appliedtogroups.

makes sense

pkg/apiserver/handlers/endpoint/handler.go

pkg/controller/networkpolicy/endpoint_querier.go

qiyueyao · 2024-02-15T00:28:08Z

Depends on #5989
Merged and rebased.

build/charts/antrea/templates/antctl/clusterrole.yaml

tnqn · 2024-02-27T16:24:03Z

pkg/apiserver/apiserver.go

makes sense

pkg/controller/networkpolicy/endpoint_querier.go

tnqn · 2024-02-27T16:55:07Z

pkg/controller/networkpolicy/endpoint_querier.go

+	if len(commonRules) > 0 {
+		commonRule = commonRules[0]
+		// filter Antrea-native policy rules with Pass action
+		// if pass rule currently has the highest precedence, skip the remaining rules
+		// until the next K8s rule or Baseline rule, or return the pass rule otherwise
+		isPass := func(ruleInfo *controlplane.NetworkPolicyRule) bool {
+			return ruleInfo.Action != nil && *ruleInfo.Action == crdv1beta1.RuleActionPass
+		}
+		if isPass(commonRule.Rule) {
+			for _, rule := range commonRules[1:] {
+				if rule.Policy.SourceRef.Type == controlplane.K8sNetworkPolicy ||
+					(rule.Policy.TierPriority != nil && *rule.Policy.TierPriority == BaselineTierPriority && !isPass(rule.Rule)) {
+					commonRule = rule
+					break
+				}
+			}
+		}
+	}
+	return


Is the following equivalent?

for _, rule := range commonRules {
	if rule.Policy.SourceRef.Type == controlplane.K8sNetworkPolicy ||
		*rule.Policy.TierPriority == BaselineTierPriority ||
		rule.Rule.Action == nil ||
		*rule.Rule.Action != crdv1beta1.RuleActionPass {
		return rule
	}
}
return nil

It does not seem so. We discussed two cases: 1) if Pass is the first rule but no qualifying rule is found after it, the first Pass rule should be returned; 2) if Pass is the first rule, we need to skip all ACNP/ANNP rules until a K8s rule or Baseline rule appears, whereas the loop above returns the next non-Pass rule.

makes sense
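The difference between the two variants discussed in this thread can be seen in a small self-contained sketch. The `rule` type below is a hypothetical simplification (boolean flags instead of the real controlplane types), and the input slice is assumed to be already sorted by precedence; only the selection logic follows the diff and the proposed loop.

```go
package main

import "fmt"

// rule is a hypothetical, simplified view of a matching rule.
type rule struct {
	Name     string
	K8s      bool // rule comes from a K8s NetworkPolicy
	Baseline bool // rule belongs to the Baseline tier
	Pass     bool // rule action is Pass
}

// pickWithPassHandling follows the logic in the diff: if the first rule is a
// Pass rule, skip ahead to the next K8s rule or non-Pass Baseline rule, and
// fall back to the Pass rule itself when no such rule exists.
func pickWithPassHandling(rules []rule) *rule {
	if len(rules) == 0 {
		return nil
	}
	chosen := &rules[0]
	if chosen.Pass {
		for i := 1; i < len(rules); i++ {
			r := &rules[i]
			if r.K8s || (r.Baseline && !r.Pass) {
				chosen = r
				break
			}
		}
	}
	return chosen
}

// pickFirstNonPass follows the single-loop alternative proposed in review.
func pickFirstNonPass(rules []rule) *rule {
	for i := range rules {
		r := &rules[i]
		if r.K8s || r.Baseline || !r.Pass {
			return r
		}
	}
	return nil
}

func name(r *rule) string {
	if r == nil {
		return "<none>"
	}
	return r.Name
}

func main() {
	skipCase := []rule{
		{Name: "acnp-pass", Pass: true},
		{Name: "acnp-drop"},
		{Name: "k8s-allow", K8s: true},
	}
	onlyPass := []rule{{Name: "acnp-pass", Pass: true}}

	// Case 2 from the reply: ACNP rules after a Pass must be skipped.
	fmt.Println(name(pickWithPassHandling(skipCase)), name(pickFirstNonPass(skipCase)))
	// Case 1 from the reply: a lone Pass rule should still be returned.
	fmt.Println(name(pickWithPassHandling(onlyPass)), name(pickFirstNonPass(onlyPass)))
}
```

In this sketch the first line prints `k8s-allow acnp-drop` and the second prints `acnp-pass <none>`, which corresponds to the two cases raised in the reply.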

build/yamls/externalnode/vm-agent-rbac.yml

tnqn · 2024-02-28T03:07:54Z

pkg/controller/networkpolicy/endpoint_querier.go

makes sense

Adds a versioned API and antctl query for NetworkPolicy evaluation that returns the predicted effective NetworkPolicy rule, which affects traffic from ns1/pod1 to ns2/pod2. Signed-off-by: Qiyue Yao <[email protected]>

tnqn

LGTM

Dyanngg

LGTM, thanks for addressing all the comments on this giant PR

tnqn · 2024-02-29T10:43:55Z

@jianjuns @antoninbas do you have other comments?

antoninbas · 2024-03-01T20:55:31Z

@tnqn I took another quick look, LGTM

tnqn · 2024-03-04T05:58:01Z

/test-all

tnqn · 2024-03-04T16:28:02Z

Ignoring the following tests:

"Upgrade from Antrea version N-1" and "API compatible with client version N-1" should be related to the issue fixed by Do not load unnecessary images in ci/kind/test-upgrade-antrea.sh #6039
"Antrea-native (VLAN) secondary network tests on a Kind cluster on Linux" should be fixed by [e2e tests] Wait for secondary IPs before running ping test #6041

Dyanngg reviewed Nov 21, 2023

Dyanngg reviewed Dec 6, 2023

qiyueyao marked this pull request as ready for review December 8, 2023 01:28

qiyueyao changed the title ~~[WIP] Add Antctl NetworkPolicy Rule Prediction Analysis Query~~ Add Antctl NetworkPolicy Rule Prediction Analysis Query Dec 8, 2023

qiyueyao mentioned this pull request Dec 11, 2023

Fix Endpoint Querier RuleIndex In Response #5783

Merged

qiyueyao force-pushed the ods-antctl branch 5 times, most recently from 9bf6c0f to e859616 Compare December 13, 2023 22:42

qiyueyao requested a review from tnqn December 14, 2023 01:09

Dyanngg mentioned this pull request Dec 14, 2023

Add policy processed verification for NetworkPolicyEvaluation #5801

Closed

Dyanngg added this to the Antrea v1.15 release milestone Dec 14, 2023

Dyanngg added the area/component/antctl Issues or PRs releated to the command line interface component label Dec 14, 2023

Dyanngg reviewed Dec 14, 2023

qiyueyao force-pushed the ods-antctl branch from e859616 to fd51860 Compare December 14, 2023 22:56

antoninbas reviewed Dec 19, 2023

Dyanngg reviewed Jan 4, 2024

jianjuns reviewed Jan 9, 2024

qiyueyao changed the title ~~Add Antctl NetworkPolicy Rule Prediction Analysis Query~~ Add Versioned API and Antctl for Effective NetworkPolicy Rule Prediction Jan 24, 2024

Dyanngg reviewed Jan 24, 2024

luolanzone modified the milestones: Antrea v1.15 release, Antrea v1.16 release Jan 24, 2024

qiyueyao force-pushed the ods-antctl branch 2 times, most recently from 886c1a2 to 71e6128 Compare January 25, 2024 09:56

Dyanngg reviewed Jan 25, 2024

qiyueyao force-pushed the ods-antctl branch from 71e6128 to 6bf7bb7 Compare January 26, 2024 21:13

qiyueyao modified the milestones: Antrea v1.16 release, Antrea v1.15 release Jan 26, 2024

Dyanngg reviewed Jan 29, 2024

tnqn reviewed Feb 1, 2024

qiyueyao force-pushed the ods-antctl branch from d2e94fc to f4a8b81 Compare February 6, 2024 05:39

qiyueyao changed the title ~~Add Versioned API and Antctl for Effective NetworkPolicy Rule Prediction~~ Add Versioned API and Antctl for NetworkPolicyEvaluation (effective policy rule prediction) Feb 6, 2024

antoninbas reviewed Feb 7, 2024

qiyueyao force-pushed the ods-antctl branch 3 times, most recently from 47108e1 to 319cab6 Compare February 9, 2024 02:06

tnqn reviewed Feb 9, 2024

qiyueyao mentioned this pull request Feb 15, 2024

Refactor Endpoint Query Controller and Antctl #5989

Merged

qiyueyao force-pushed the ods-antctl branch from b6b1af3 to 568d54c Compare February 19, 2024 09:29

qiyueyao force-pushed the ods-antctl branch 2 times, most recently from 03bc071 to 78aaba6 Compare February 26, 2024 23:15

tnqn reviewed Feb 27, 2024

qiyueyao force-pushed the ods-antctl branch from 6c99e30 to bde3644 Compare February 28, 2024 01:52

tnqn reviewed Feb 28, 2024

Add API and antctl for NetworkPolicyEvaluation

3cfad4e

Adds a versioned API and antctl query for NetworkPolicy evaluation that returns the predicted effective NetworkPolicy rule, which affects traffic from ns1/pod1 to ns2/pod2. Signed-off-by: Qiyue Yao <[email protected]>

qiyueyao force-pushed the ods-antctl branch from bde3644 to 3cfad4e Compare February 28, 2024 06:26

tnqn approved these changes Feb 28, 2024

Dyanngg approved these changes Feb 28, 2024

antoninbas added the action/release-note Indicates a PR that should be included in release notes. label Mar 1, 2024

tnqn merged commit 5396f58 into antrea-io:main Mar 4, 2024
49 of 55 checks passed

qiyueyao deleted the ods-antctl branch March 4, 2024 19:59

qiyueyao mentioned this pull request Mar 5, 2024

Followup Antctl Tranform Code Move #6062

Merged

qiyueyao mentioned this pull request Mar 15, 2024

NetworkPolicy Evaluation E2E Tests #6111

Merged
