Support multi-cluster Service API #3199

Merged
merged 23 commits into main from feature/multi-cluster on Jan 20, 2022

Conversation

luolanzone
Contributor

No description provided.

@luolanzone
Contributor Author

/test-e2e
/test-conformance
/test-networkpolicy

@codecov-commenter

codecov-commenter commented Jan 16, 2022

Codecov Report

Merging #3199 (2397032) into main (f9dba58) will decrease coverage by 10.29%.
The diff coverage is 49.28%.

Impacted file tree graph

@@             Coverage Diff             @@
##             main    #3199       +/-   ##
===========================================
- Coverage   59.81%   49.51%   -10.30%     
===========================================
  Files         306      459      +153     
  Lines       26178    43609    +17431     
===========================================
+ Hits        15659    21595     +5936     
- Misses       8811    19715    +10904     
- Partials     1708     2299      +591     
Flag Coverage Δ
integration-tests 34.02% <ø> (?)
kind-e2e-tests 44.57% <ø> (-1.19%) ⬇️
unit-tests 41.22% <49.28%> (+0.68%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
multicluster/cmd/multicluster-controller/leader.go 0.00% <0.00%> (ø)
multicluster/cmd/multicluster-controller/main.go 0.00% <0.00%> (ø)
multicluster/cmd/multicluster-controller/member.go 0.00% <0.00%> (ø)
...ontrollers/multicluster/clusterclaim_controller.go 0.00% <0.00%> (ø)
...llers/multicluster/member_clusterset_controller.go 0.00% <0.00%> (ø)
...rs/multicluster/resourceexportfilter_controller.go 0.00% <0.00%> (ø)
...rs/multicluster/resourceimportfilter_controller.go 0.00% <0.00%> (ø)
pkg/agent/client.go 77.41% <ø> (ø)
pkg/apiserver/apiserver.go 88.81% <ø> (+2.75%) ⬆️
pkg/apiserver/certificate/cacert_controller.go 56.66% <ø> (+0.29%) ⬆️
... and 172 more

@luolanzone
Contributor Author

/test-e2e
/test-conformance
/test-networkpolicy

1 similar comment
@luolanzone
Contributor Author

/test-e2e
/test-conformance
/test-networkpolicy

@luolanzone
Contributor Author

/test-e2e
/test-networkpolicy

@luolanzone added the area/multi-cluster label (Issues or PRs related to multi cluster) on Jan 18, 2022
@luolanzone
Contributor Author

/test-e2e
/test-networkpolicy

@jianjuns
Contributor

@luolanzone : the plan is to squash all commits and merge?

@luolanzone
Contributor Author

@jianjuns I suppose we would lose the author information if we squash all commits. Maybe I can squash the commits that are owned by me to reduce the total count; what's your suggestion?

@luolanzone force-pushed the feature/multi-cluster branch from 0b0febf to 0e71762 on January 19, 2022 03:00
@jianjuns
Contributor

@luolanzone: as we discussed, could you check the commits that impact other Antrea code/functionality, and make sure they do not break anything?

@tnqn added this to the Antrea v1.5 release milestone on Jan 19, 2022
@luolanzone
Contributor Author

@jianjuns sure, I will check.

@luolanzone
Contributor Author

/test-e2e
/test-conformance
/test-networkpolicy
/test-multicluster-e2e
/test-multicluster-integration

@luolanzone force-pushed the feature/multi-cluster branch from 0e71762 to c684800 on January 19, 2022 05:59
@luolanzone
Contributor Author

/test-multicluster-e2e
/test-multicluster-integration

@hjiajing
Contributor

/test-multicluster-e2e

@luolanzone force-pushed the feature/multi-cluster branch from c684800 to 6d79091 on January 19, 2022 09:05
@tnqn
Member

tnqn commented Jan 20, 2022

@luolanzone : the plan is to squash all commits and merge?

@jianjuns We could use "create a merge commit" to merge the PR, so the branch history will be preserved, only one merge commit will land in the main branch, and the whole PR can be reverted atomically.

@jianjuns
Contributor

create a merge commit
Will individual commits be counted toward different contributors in the GitHub stats?

@tnqn
Member

tnqn commented Jan 20, 2022

Will individual commits be counted toward different contributors in the GitHub stats?

Git tools preserve the contribution correctly. For example, git blame can show the original commit and author that added a line of code, instead of the merge commit and its author. But I'm not sure how GitHub stats count the contribution.

@tnqn
Member

tnqn commented Jan 20, 2022

@jianjuns I just confirmed antrea.devstats.cncf.io has counted contributions correctly when contributors merged their code into the feature branch.

suwang48404 and others added 2 commits January 20, 2022 11:29
Components added: CRD definitions, controllers, manifest.

Signed-off-by: Su Wang <[email protected]>
Status sub-resource for multi cluster CRDs

Signed-off-by: Abhishek Raut <[email protected]>
luolanzone and others added 20 commits January 20, 2022 11:29
1. remove the namespace in config/manager/manager.yaml, so we won't have
a `kube-system` Namespace definition in multi-cluster.yaml
2. use `KUSTOMIZE = $(shell pwd)/bin/kustomize` in the Makefile;
   otherwise, it will do nothing when KUSTOMIZE is left empty by `shell which kustomize`
3. add a common label `app: antrea` to all resources
4. other changes are auto-generated by `make manifests`

Signed-off-by: Lan Luo <[email protected]>
add the `// +genclient` marker so we can use *-gen tools to generate client code automatically (see the sketch below).

Signed-off-by: Lan Luo <[email protected]>
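
As a hedged illustration of that marker (the type and fields below are examples, not the actual Antrea source):

```go
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// +genclient
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object

// ClusterSet stands in for any multi-cluster CRD type here. The
// +genclient marker tells client-gen to produce a typed clientset for
// the type; the deepcopy marker makes deepcopy-gen implement
// runtime.Object for it.
type ClusterSet struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec ClusterSetSpec `json:"spec,omitempty"`
}

// ClusterSetSpec is a placeholder spec for this example.
type ClusterSetSpec struct {
	Leaders []string `json:"leaders,omitempty"`
}
```
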
1. add a new CRD MultiClusterConfig for custom config settings
2. reorganize the build/yaml folder
3. add a command wrapper

Signed-off-by: Lan Luo <[email protected]>
1. refine cacert_controller and reuse it in MCS
2. remove cert-manager related manifests

Signed-off-by: Lan Luo <[email protected]>
add a multicluster unit test step to the Go workflow and
enable it on the feature branch temporarily.

Signed-off-by: Lan Luo <[email protected]>
MCS CommonArea and ClusterSetReconciler implementation

MCS is a feature that allows resource and policy configuration
that spans multiple Kubernetes clusters. A ClusterSet defines
such a group of Kubernetes clusters which requires resource
exchange.

A ClusterSet needs one or more clusters chosen as a Leader,
which facilitates the resource exchange so that we don't need
a full mesh between all clusters in the ClusterSet. Leaders
can be dedicated, which means they don't participate in the
resource exchange themselves. Or they can be both a leader
and a member, so they facilitate and also take part in the
resource exchange.

Members of the ClusterSet write certain resources into the
leader cluster's common area, and all members pull resources
from that common area. Common area refers to the Namespace
within a leader cluster where the leader controller is running
and where resources are written to and read from.

The MCS controller needs to run in either leader or member
mode. Because the operations performed on a leader and a
member are different, two sub-commands (leader, member) are
provided to specify which mode the controller runs in.

Validations for ClusterSet configuration
----------------------------------------
The following validations are performed by the
ClusterSetReconciler (see the sketch after this list):
1. There must be a valid ClusterClaim with a ClusterID and a
ClusterSetID
2. The configured ClusterID is defined as either a member or a
leader in the ClusterSet
3. Only one ClusterSet configuration is allowed
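
A minimal Go sketch of these checks, using simplified stand-in types and the well-known ClusterClaim names; all names here are assumptions, and the real Antrea controller differs in detail:

```go
package multicluster

import "fmt"

// Simplified stand-ins for the real CRD types.
type ClusterClaim struct{ Name, Value string }
type MemberCluster struct{ ClusterID string }
type ClusterSetSpec struct{ Members, Leaders []MemberCluster }
type ClusterSet struct{ Spec ClusterSetSpec }

// validateClusterSet condenses the three checks described above.
func validateClusterSet(cs *ClusterSet, claims []ClusterClaim, clusterSetCount int) error {
	var clusterID, clusterSetID string
	for _, c := range claims {
		switch c.Name {
		case "id.k8s.io":
			clusterID = c.Value
		case "clusterset.k8s.io":
			clusterSetID = c.Value
		}
	}
	// 1. There must be a valid ClusterClaim with ClusterID and ClusterSetID.
	if clusterID == "" || clusterSetID == "" {
		return fmt.Errorf("missing ClusterClaim for cluster ID or ClusterSet ID")
	}
	// 2. The configured ClusterID must be a member or a leader of the ClusterSet.
	found := false
	for _, m := range append(cs.Spec.Members, cs.Spec.Leaders...) {
		if m.ClusterID == clusterID {
			found = true
			break
		}
	}
	if !found {
		return fmt.Errorf("cluster %q is neither a member nor a leader of the ClusterSet", clusterID)
	}
	// 3. Only one ClusterSet configuration is allowed.
	if clusterSetCount > 1 {
		return fmt.Errorf("only one ClusterSet is allowed, found %d", clusterSetCount)
	}
	return nil
}
```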

MemberClusterSetReconciler
------------------------------
MemberClusterSetReconciler runs only in a member cluster
deployment. It initializes a RemoteCommonAreaManager to manage
connections to every leader's common area, each of which is
represented as a RemoteCommonArea. Once a connection is
established, two things occur:
1. A MemberClusterAnnounce resource is written to all
    RemoteCommonAreas periodically (see the sketch below).
2. A leader election is performed to choose an elected leader
    among all configured leaders.
An elected leader is the leader cluster from which resources
will be imported for the resource exchange pipeline.
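
A rough sketch of the periodic announce write, using unstructured objects so the snippet stays self-contained; the interval, naming scheme, and error handling are assumptions rather than the actual implementation:

```go
package commonarea

import (
	"context"
	"time"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/klog/v2"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// announceLoop periodically writes a MemberClusterAnnounce into one
// remote common area until the context is cancelled. The real
// controller uses typed CRDs and updates the existing object; this
// sketch just creates it and tolerates AlreadyExists errors.
func announceLoop(ctx context.Context, remote client.Client, namespace, clusterID string) {
	ticker := time.NewTicker(30 * time.Second) // assumed interval
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			announce := &unstructured.Unstructured{}
			announce.SetGroupVersionKind(schema.GroupVersionKind{
				Group:   "multicluster.crd.antrea.io",
				Version: "v1alpha1",
				Kind:    "MemberClusterAnnounce",
			})
			announce.SetNamespace(namespace)
			announce.SetName("member-announce-from-" + clusterID) // assumed naming
			if err := remote.Create(ctx, announce); err != nil && !apierrors.IsAlreadyExists(err) {
				klog.ErrorS(err, "Failed to write MemberClusterAnnounce", "leaderNamespace", namespace)
			}
		}
	}
}
```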

LeaderClusterSetReconciler
-----------------------------
LeaderClusterSetReconciler runs only in a leader cluster
deployment and mainly performs the validations above.

ClusterSet status:
------------------
Status of the ClusterSet resource is not implemented in this
patch and will come in a future patch.

MCS Manifests
---------------
Three manifests are generated:
1. Leader global - this contains all the global resources that need
to be applied once on the leader cluster.
2. Leader namespaced - the leader MCS controller runs within the
context of a Namespace. A namespaced yaml must be generated for the
required Namespace and applied on the leader cluster. The provided
yaml uses `changeme` as the Namespace for all resources and must be
changed to the correct Namespace. Alternatively, a new yaml can be
generated using
 $ antrea/multicluster/hack/generate-manifest.sh -l <namespace>
3. Member - this is the manifest that needs to be applied on the
member cluster.

Key differences between the Member and Leader Namespaced yamls:
Controller:
----------
A member runs only one MCS controller, which reconciles resources
across all Namespaces in the cluster.
A leader runs one MCS controller in a given Namespace, which
reconciles resources only within that Namespace.
RBAC is configured to give the member controller access to the
entire cluster, and the leader controller access only to the given
Namespace.

Webhooks:
-----------
Webhooks are configured globally in the member deployment.
Webhooks are configured within a Namespace in the leader deployment.
So in a leader cluster that has multiple MCS Deployments, there will
be multiple webhooks, each redirecting requests within a given
Namespace to the corresponding controller. This was a conscious
choice to avoid a hierarchical controller. Also, the webhooks
perform Namespace-specific validations, so they cannot be handled
by a global service.

A cluster that is both a leader and member:
-------------------------------------------
A cluster can be both a leader and a member; in this case, there
will be two controllers running: one member controller, and one
leader controller within a Namespace.
TODO: We need special handling so the member controller does not
reconcile resources within the leader Namespace even though it has
cluster-wide access.

Unit testing:
=========
RemoteCommonArea Tests
---------------------------
Test that MemberClusterAnnounce is written and connectivity is set
to true.

TODO: Find out if fakeClient can fail writes so we can test the
disconnection case

NOTE: used mockgen to generate mocks for Manager from the
controller-runtime library and for RemoteCommonAreaManager. TODO:
Add mockgen to the Makefile so the mocks always get regenerated
automatically.

RemoteCommonAreaManager and LeaderElector tests
-------------------------------------------------------
1. Test adding a new leader cluster
2. Test leader election
3. Test leader election when elected leader becomes disconnected

TODO: Test removing a leader cluster

NOTE: Used mockgen to generate a mock for RemoteCluster so that
we can verify StartMonitoring is invoked. TODO: Add mockgen to the
Makefile so the mock always gets regenerated automatically.

Signed-off-by: aravindakidambi <[email protected]>
…3111)

MCS: MemberClusterAnnounce webhook and ClusterSet.Status implementation

MemberClusterAnnounce webhook
-----------------------------------
Implement a custom webhook to authenticate requests that write
MemberClusterAnnounce.
The webhook gets the ClusterID from the MemberClusterAnnounce,
looks up the ServiceAccount associated with that member cluster
in the ClusterSet configuration, and ensures the incoming request
has been authenticated with the same ServiceAccount.
This prevents a member from spoofing the ClusterID when writing
MemberClusterAnnounce (see the sketch below).

NOTE: Removed the default webhook from kubebuilder
because it does not provide access to the user info
of the request.
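
A condensed sketch of such a check with controller-runtime's admission package; the `lookupServiceAccount` helper and the decoded field are assumptions, not Antrea's exact schema:

```go
package multicluster

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"

	"sigs.k8s.io/controller-runtime/pkg/webhook/admission"
)

// memberClusterAnnounceValidator denies MemberClusterAnnounce writes
// whose authenticated user does not match the ServiceAccount that the
// ClusterSet configuration associates with the announced ClusterID.
type memberClusterAnnounceValidator struct {
	// lookupServiceAccount is a hypothetical helper returning the
	// expected "system:serviceaccount:<ns>:<name>" username for a ClusterID.
	lookupServiceAccount func(clusterID string) (string, error)
}

func (v *memberClusterAnnounceValidator) Handle(ctx context.Context, req admission.Request) admission.Response {
	// Decode only the field we need; the real webhook decodes the full CRD.
	var announce struct {
		ClusterID string `json:"clusterID"`
	}
	if err := json.Unmarshal(req.Object.Raw, &announce); err != nil {
		return admission.Errored(http.StatusBadRequest, err)
	}
	expected, err := v.lookupServiceAccount(announce.ClusterID)
	if err != nil {
		return admission.Errored(http.StatusInternalServerError, err)
	}
	// req.UserInfo carries the authenticated identity of the writer,
	// which the default kubebuilder webhook did not expose.
	if req.UserInfo.Username != expected {
		return admission.Denied(fmt.Sprintf("user %q may not announce cluster %q",
			req.UserInfo.Username, announce.ClusterID))
	}
	return admission.Allowed("")
}
```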

ClusterSet.Status computation on leader clusters
--------------------------------------------------
MemberClusterAnnounceReconciler will run an interval timer
which monitors MemberClusterAnnounce resources to compute
status as follows (a sketch of the timer logic follows this list):
1. All configured members will start with:
   a. "Ready" = "Unknown"
   b. "ImportsResources" = "False"
2. The timer determines the following:
   a. If there is at least one update of the MemberClusterAnnounce
      resource from a member within 3 consecutive timer intervals,
      the status of the member will be "Ready" = "True"
   b. If "Ready" = "True", based on the last MemberClusterAnnounce
      received from that member:
      i. "ImportsResources" = "True" if the local leader cluster
         is the elected leader for the member. The message will be
         "Local cluster is the elected leader of member <cluster-id>"
         and the reason "ElectedLeader"
      ii. "ImportsResources" = "False" if some other leader
          was elected by this member. The message will be
          "Local cluster is not the elected leader of member <cluster-id>"
          and the reason "NotElectedLeader"
      iii. "ImportsResources" = "Unknown" if no leader has
           been elected yet by this member. The message will be
           "Leader has not been elected yet"
   c. If there is no update within 3 consecutive timer intervals,
      the member is marked "Ready" = "False" and "ImportsResources" = "False".
      The message will state "No MemberClusterAnnounce update after <timestamp>"
      and the reason will be "Disconnected".
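
An illustrative Go sketch of that 3-interval check; the interval value and field names are assumptions, not the actual Antrea code:

```go
package multicluster

import "time"

// memberStatus is a simplified per-member record kept by the
// reconciler; only what the readiness rule needs is shown.
type memberStatus struct {
	LastAnnounceSeen time.Time
	Ready            string // "True", "False" or "Unknown"
}

const announceCheckInterval = 30 * time.Second // assumed timer interval

// refreshReadiness applies the rule above: a member is Ready while at
// least one MemberClusterAnnounce update arrived within the last 3
// timer intervals, Disconnected otherwise, and Unknown before any update.
func refreshReadiness(members map[string]*memberStatus, now time.Time) {
	deadline := now.Add(-3 * announceCheckInterval)
	for _, m := range members {
		switch {
		case m.LastAnnounceSeen.IsZero():
			m.Ready = "Unknown"
		case m.LastAnnounceSeen.Before(deadline):
			m.Ready = "False" // Reason: Disconnected
		default:
			m.Ready = "True"
		}
	}
}
```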

LeaderClusterSetReconciler will have a goroutine that runs
periodically to update the ClusterSet.Status, as follows:
1. TotalClusters is the number of clusters in the ClusterSet
   resource last processed.
2. ObservedGeneration is the Generation of the ClusterSet
   resource last processed.
3. Individual cluster status is obtained from MemberClusterAnnounceReconciler.
4. ReadyClusters is the number of clusters with "Ready" = "True".
5. The overall condition of the ClusterSet is computed as follows:
   a. "Ready" = "True" if all clusters have
      "Ready" = "True". Message and Reason will be absent.
   b. "Ready" = "Unknown" if all clusters have
      "Ready" = "Unknown". The message will be "All clusters have
      an unknown status" and the reason "NoReadyCluster".
   c. "Ready" = "False" for any other combination of cluster
      statuses across all clusters. The message will be empty and
      the reason "NoReadyCluster".

Note: Other leaders and the local leader cluster will not be present
      in the status, because their status is not meaningful and can
      skew the overall status.

ClusterSet.Status computation on member clusters
--------------------------------------------------
RemoteCommonAreaManager will support getting the status of
all leaders from the individual RemoteCommonAreas.

RemoteCommonArea provides status as follows:
1. When configured, leaders start with:
   a. "Ready" = "Unknown"
   b. "IsElectedLeader" = "False"
2. "Ready" is updated to "True"/"False" based on connectivity to
   the RemoteCommonArea. The message will contain the error if the
   member has any connectivity error. The reason will be
   "Disconnected" in error cases.
3. There will be an additional status for "IsElectedLeader",
   which will be "True" if leader election has finished
   and elected this leader, "False" if some other leader
   was elected, and "Unknown" if leader election is incomplete.
   a. The message will be "This leader cluster is/is not an elected
      leader for local cluster".
   b. If leader election is incomplete,
      Reason = "LeaderElectionInProgress".

MemberClusterSetReconciler will have a goroutine
that runs periodically to update the status, as follows
(a sketch of the overall condition follows this list):
1. TotalClusters is updated based on the number of clusters in
   the last processed config.
2. ObservedGeneration is from the Generation of the last processed
   config.
3. Individual cluster status is obtained from RemoteCommonAreaManager.
4. ReadyClusters is computed based on the number of clusters
   with "Ready" = "True".
5. The overall condition of the ClusterSet is computed as follows:
   a. "Ready" = "True" if it is connected to
      at least one leader and has elected it as the leader.
   b. "Ready" = "False" if it is disconnected
      from all leaders.
   c. "Ready" = "Unknown" otherwise. The message
      will be "Leader not elected yet".

Signed-off-by: aravindakidambi <[email protected]>
1. exporter
  serviceexport_controller is responsible for reconciling ServiceExport
  resources in the member cluster and writing wrapped ResourceExports into
  the leader cluster; meanwhile, it watches Service and Endpoints changes
  via event mapping and updates the ResourceExport when an exported
  Service/Endpoints is updated.

  resourceexport_controller is responsible for reconciling ResourceExport
  resources; it computes all ResourceExports from different member clusters
  into one ResourceExport in the leader cluster if the resources are
  exported with the same namespaced name and the same kind. At the moment,
  only the Service and Endpoints kinds are supported.

2. importer
  remote_resourceimport_controller watches the leader cluster's
  ResourceImport events and creates the corresponding ServiceImport,
  Service and Endpoints with AntreaMCServiceAnnotation in the member
  cluster. The ServiceImport name will be the same as the exported
  Service; newly created Services and Endpoints will have an Antrea
  multi-cluster prefix.

3. stale controller
  The stale controller mainly handles the special case where a resource is
  deleted while the controller is restarting in the leader or member
  cluster; it is triggered only once, when the controller starts.

Notes: serviceexport_controller, stale_controller and
resourceimport_controller run only in the member cluster;
resourceexport_controller runs only in the leader cluster.
(A sketch of the export flow follows.)
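
To make the export flow concrete, here is a hedged sketch of wrapping an exported Service into a ResourceExport object; the naming scheme and the layout under spec are assumptions, not Antrea's exact schema:

```go
package multicluster

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

// wrapServiceExport builds a ResourceExport (as unstructured, to keep
// the sketch self-contained) that carries an exported Service into the
// leader's common area.
func wrapServiceExport(svc *corev1.Service, clusterID, leaderNamespace string) (*unstructured.Unstructured, error) {
	re := &unstructured.Unstructured{}
	re.SetGroupVersionKind(schema.GroupVersionKind{
		Group:   "multicluster.crd.antrea.io",
		Version: "v1alpha1",
		Kind:    "ResourceExport",
	})
	re.SetNamespace(leaderNamespace)
	// Encode the origin in the name so exports of the same Service from
	// different member clusters do not collide in the common area.
	re.SetName(clusterID + "-" + svc.Namespace + "-" + svc.Name + "-service")

	svcMap, err := runtime.DefaultUnstructuredConverter.ToUnstructured(svc)
	if err != nil {
		return nil, err
	}
	if err := unstructured.SetNestedMap(re.Object, svcMap, "spec", "service"); err != nil {
		return nil, err
	}
	if err := unstructured.SetNestedField(re.Object, clusterID, "spec", "clusterID"); err != nil {
		return nil, err
	}
	return re, nil
}
```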

Signed-off-by: Lan Luo <[email protected]>
1. rename `core` package to `commonarea`
2. move ServiceExport and ServiceImport CRD manifests to a separate folder `k8smcs`

Signed-off-by: Lan Luo <[email protected]>
Our imported Service has a name but the Endpoints is without a name,
which causes the Antrea agent to fail to install the OpenFlow rules,
so simply remove the name since it's defined by ourselves.

Signed-off-by: Lan Luo <[email protected]>
1. add initial integration scripts
2. integration code for the resource controllers
3. add integration tests in Docker and a how-to doc

Signed-off-by: Lan Luo <[email protected]>
Co-authored-by: Lan Luo <[email protected]>
Co-authored-by: zbangqi <[email protected]>
Co-authored-by: Aravinda Kidambi <[email protected]>

Co-authored-by: zbangqi <[email protected]>
Co-authored-by: Aravinda Kidambi <[email protected]>
disable leader election in the config yaml
and increase the timeout for the error below, in case we enable it
in the future (see the sketch after the log excerpt):

there is an error in a long-running MC controller which caused the
controller to restart every few minutes.

```
E0106 07:29:05.501113       1 leaderelection.go:361] Failed to update lock: context deadline exceeded
I0106 07:29:05.895992       1 leaderelection.go:278] failed to renew lease antrea-mcs-ns/6536456a.crd.antrea.io: timed out waiting for the condition
2022-01-06T07:29:05.896Z	DEBUG	controller-runtime.manager.events	Normal	{"object": {"kind":"ConfigMap","namespace":"antrea-mcs-ns","name":"6536456a.crd.antrea.io","uid":"a4de74cd-0441-4140-a78b-acf163055f91","apiVersion":"v1","resourceVersion":"23629919"}, "reason": "LeaderElection", "message": "antrea-mc-controller-6dcb88b9d6-vxqvm_e1b1b0a9-b2b5-471f-b424-b11a34343d64 stopped leading"}
2022-01-06T07:29:05.999Z	DEBUG	controller-runtime.manager.events	Normal	{"object": {"kind":"Lease","namespace":"antrea-mcs-ns","name":"6536456a.crd.antrea.io","uid":"6709c340-ee00-459b-b186-e56c15fbde67","apiVersion":"coordination.k8s.io/v1","resourceVersion":"23629901"}, "reason": "LeaderElection", "message": "antrea-mc-controller-6dcb88b9d6-vxqvm_e1b1b0a9-b2b5-471f-b424-b11a34343d64 stopped leading"}
2022-01-06T07:29:05.598Z	DEBUG	controller-runtime.webhook.webhooks	received request	{"webhook": "/validate-multicluster-crd-antrea-io-v1alpha1-memberclusterannounce", "UID": "da938dc5-cbda-4714-a9f3-f25d7f105353", "kind": "multicluster.crd.antrea.io/v1alpha1, Kind=MemberClusterAnnounce", "resource": {"group":"multicluster.crd.antrea.io","version":"v1alpha1","resource":"memberclusterannounces"}}
F0106 07:29:06.099280       1 leader.go:41] Error running controller: error running Manager: leader election lost
```
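
For context, a hedged sketch of how a controller-runtime manager could be configured with leader election disabled, and with longer lease timings for the case where it is re-enabled; the durations are assumptions, while the election ID comes from the log above:

```go
package main

import (
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

// newManagerOptions returns manager options with leader election off
// by default. Longer lease timings reduce spurious "leader election
// lost" restarts on a slow apiserver; the exact values are assumptions.
func newManagerOptions(enableLeaderElection bool) ctrl.Options {
	leaseDuration := 30 * time.Second
	renewDeadline := 20 * time.Second
	retryPeriod := 5 * time.Second
	return ctrl.Options{
		LeaderElection:   enableLeaderElection, // disabled in the config yaml
		LeaderElectionID: "6536456a.crd.antrea.io",
		LeaseDuration:    &leaseDuration,
		RenewDeadline:    &renewDeadline,
		RetryPeriod:      &retryPeriod,
	}
}
```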

Signed-off-by: Lan Luo <[email protected]>
* Allow member and leader to be deployed in one cluster

We'd like to deploy both the member and leader controllers in one
cluster, so we need the two fixes below:

1. make the memberclusterannounce webhook in the member cluster
namespaced, otherwise MemberClusterAnnounce creation will fail.
2. skip any ClusterSet reconciling in the member cluster if its
Namespace is not the same as the member controller's Namespace.

* Update mutation and validation webhooks

* Remove unnecessary memberclusterannounce, resourceexport and
resourceimport webhooks from the member manifests
* Make the clusterclaim and clusterset validation webhooks namespaced.

Signed-off-by: Lan Luo <[email protected]>
Reuse verify-kustomize.sh to install and verify kustomize

Signed-off-by: Lan Luo <[email protected]>
@luolanzone force-pushed the feature/multi-cluster branch from 6d79091 to 687679f on January 20, 2022 04:17
@luolanzone
Contributor Author

/test-e2e
/test-conformance
/test-networkpolicy
/test-multicluster-e2e

@jianjuns
Contributor

Ok, sounds like a good approach to me then!

1. add a doc about the basic architecture
2. add a setup guide and sample yamls

Signed-off-by: Lan Luo <[email protected]>
@tnqn
Member

tnqn commented Jan 20, 2022

/test-e2e
/test-conformance
/test-networkpolicy
/test-multicluster-e2e

@tnqn merged commit 694d3cd into main on Jan 20, 2022
@tnqn deleted the feature/multi-cluster branch on January 20, 2022 13:54