This repository has been archived by the owner on Oct 23, 2024. It is now read-only.

Add support to enable transport encryption TLS #31

Merged
merged 19 commits into from
Feb 7, 2020

Contributor

@shubhanilBag shubhanilBag commented Jan 13, 2020

This PR:

  1. Adds configurable params to enable transport encryption
  2. Supports adding custom TLS CA cert using a configurable k8s secret
  3. Adds a cqlshrc ConfigMap file that allows cqlsh to be used without any additional configuration when TLS is enabled
  4. Adds integration tests to check
    4.1 Node-to-Node transport encryption
    4.2 Client-to-Node encryption
    4.3 Node-to-Node along with Client-to-Node encryption
    4.4 Data read and write using cqlsh
  5. Adds and updates documentation for enabling TLS
  6. Bumps operator version to 0.2.0

@shubhanilBag shubhanilBag requested a review from mpereira January 13, 2020 16:55
@shubhanilBag shubhanilBag self-assigned this Jan 13, 2020
@shubhanilBag shubhanilBag requested a review from zmalik January 13, 2020 17:02
@shubhanilBag shubhanilBag removed the wip label Jan 20, 2020
@shubhanilBag shubhanilBag marked this pull request as ready for review January 20, 2020 08:15
Contributor

@zmalik zmalik left a comment

This is looking really good!
Leaving some comments to improve the parameter docs.

default: "false"

- name: TRANSPORT_ENCRYPTION_REQUIRE_CLIENT_AUTH
description: "TODO"
Contributor

Let's add a description here.

Contributor

+1, and the following ones as well.

@mpereira
Contributor

@shubhanilBag could you please merge in the latest master?

Contributor

@mpereira mpereira left a comment

Hi @shubhanilBag, thanks for the great progress so far!

A few comments from my side.

@@ -1,6 +1,6 @@
apiVersion: kudo.dev/v1beta1
name: "cassandra"
-operatorVersion: "0.1.2"
+operatorVersion: "0.2.0"
Contributor

Let's revert the operator version bump. Version bumps should happen in the stable branches.

name: {{ .Name }}-generate-cqlshrc-sh
namespace: {{ .Namespace }}
data:
generate-cqlshrc.sh: |
Contributor

Could this be a configmap of cqlshrc itself instead of a bash script that generates it?

Contributor Author

@shubhanilBag shubhanilBag Jan 29, 2020

That is what I was going to do initially, but I made it a script so that, if additional cqlsh configuration is needed in the future, it can all live in one script, for example a custom config with additional cqlsh settings. Ref: https://docs.datastax.com/en/cql-oss/3.3/cql/cql_reference/cqlshUsingCqlshrc.html

Contributor

@mpereira mpereira Jan 29, 2020

What's the difference regarding supporting future additional configuration between:
A) a templated bash script that outputs a file
B) a templated file?

Contributor Author

If this is not required for the cqlshrc file, I can revert to the usual approach of mounting a ConfigMap and copying it.

Contributor Author

@mpereira As discussed, we are keeping the script-based approach because the POD_NAME env variable is used in the script to create the hostname.
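
To make the trade-off above concrete, here is a minimal Go sketch of what "generate cqlshrc from POD_NAME" amounts to. It is not the bash script from this PR. The `[connection]` and `[ssl]` keys are standard cqlshrc options, but the service name, namespace, certificate path and fallback pod name are all hypothetical values chosen for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// renderCqlshrc returns cqlshrc contents that point cqlsh at one host over TLS.
// The section and key names are standard cqlshrc options; the values passed in
// are up to the caller.
func renderCqlshrc(hostname string, port int, caCertPath string) string {
	return fmt.Sprintf(`[connection]
hostname = %s
port = %d
ssl = true

[ssl]
certfile = %s
validate = true
`, hostname, port, caCertPath)
}

// nodeHostname builds a per-pod DNS name using the usual StatefulSet shape
// "<pod>.<service>.<namespace>.svc.cluster.local". The service name the
// operator really uses is an assumption here.
func nodeHostname(podName, serviceName, namespace string) string {
	return fmt.Sprintf("%s.%s.%s.svc.cluster.local", podName, serviceName, namespace)
}

func main() {
	// POD_NAME differs per pod and is only known at runtime, so the hostname
	// has to be filled in when the container starts.
	podName := os.Getenv("POD_NAME")
	if podName == "" {
		podName = "cassandra-node-0" // hypothetical fallback for local runs
	}
	host := nodeHostname(podName, "cassandra-svc", "kudo-cassandra")
	fmt.Print(renderCqlshrc(host, 9042, "/etc/tls/certs/ca.crt"))
}
```

A plain templated ConfigMap would render the same file for every pod, which is presumably why the per-pod hostname pushed the decision toward a script.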

@@ -148,6 +154,17 @@ spec:
- name: node-readiness-probe-sh
mountPath: /etc/cassandra/node-readiness-probe.sh
subPath: node-readiness-probe.sh
- name: cassandra-home
mountPath: /home/cassandra/.cassandra/
Contributor

Should cassandra-home be just /home/cassandra/?

Contributor Author

@shubhanilBag shubhanilBag Jan 29, 2020

Both cqlsh and nodetool create files in .cassandra, and /home/cassandra is not used directly,
so I mounted the volume directly at .cassandra; otherwise it would need another mkdir.

cassandra@cassandra-node-0:~ $ ls -lah
total 0
drwxr-xr-x. 3 root root 24 Jan 29 14:46 .
drwxr-xr-x. 1 root root 23 Jan 29 14:46 ..
drwxrwsrwx. 2 root cassandra 66 Jan 29 14:49 .cassandra
cassandra@cassandra-node-0:~$ ls -lah .cassandra/
total 12K
drwxrwsrwx. 2 root cassandra 66 Jan 29 14:49 .
drwxr-xr-x. 3 root root 24 Jan 29 14:46 ..
-rw-------. 1 cassandra cassandra 5 Jan 29 14:49 cqlsh_history
-rw-r--r--. 1 cassandra cassandra 264 Jan 29 14:46 cqlshrc
-rw-r--r--. 1 cassandra cassandra 1.7K Jan 29 14:50 nodetool.history

Contributor

Let's name it something other than "cassandra-home" then, since it implies /home/cassandra. Maybe something like cassandra-configuration-directory or dot-cassandra.

Contributor Author

dot-cassandra SGTM 👍

@@ -128,3 +128,25 @@ func ExecInPodContainer(

return stdout, nil
}

func FetchLogsOfPod(namespaceName, podName, containerName string) (*bytes.Buffer, error) {
Contributor

Would PodContainerLogs be a more representative name for this function?

Contributor Author

Let's go with FetchContainerLogs

Contributor

How is that function different than the already existing k8s.GetPodContainerLogs by the way?

Contributor Author

It's not, I made a duplicate by mistake 🤔! Thanks for pointing it out, will fix that

@@ -120,3 +122,19 @@ func getConfigurationFromNodeLogs(

return configuration, nil
}

// CQLSH TODO function comment.
func CQLSH(
Contributor

Suggested change
func CQLSH(
func cqlsh(

Expect(err).To(BeNil())
assertNumberOfCassandraNodes(NodeCount)
})
It("Check container logs", func() {
Contributor

Try to follow the active voice for the test names, so that it reads like:

It "checks for the container logs"

like the above

It "installs the operator from a directory"

On the other tests as well.

@@ -0,0 +1,214 @@
package cassandra_tls
Contributor

Let's put this test file in tests/suites/tls_test.go.

Contributor Author

@shubhanilBag shubhanilBag Jan 29, 2020

I followed kudo-kafka-operator's way of having tests in different packages, each with its own folder. If tls_test.go and sanity_test.go are under the same package (sanity), they will share global vars and functions, so we won't get full isolation. For example, https://github.com/mesosphere/kudo-cassandra-operator/blob/d6f6784bb48ae6b0473a4ed70fc97af783fad2e2/tests/suites/sanity_test.go#L23
and
https://github.com/mesosphere/kudo-cassandra-operator/blob/8932a7e5f58b52495466ca16265aace4b8b3ca97/tests/suites/cassandra_tls/tls_test.go#L19 will conflict if tls_test.go is in the same folder.

Contributor

I see. In that case, let's move

  • tests/suites/sanity_test.go to tests/suites/sanity/sanity_test.go
  • tests/suites/cassandra_tls/tls_test.go to tests/suites/tls/tls_test.go.

Contributor Author

Sure 👍

@@ -411,7 +411,7 @@ func UpgradeOperator(
func UninstallOperator(
operatorName string, namespaceName string, instanceName string,
) error {
-uninstallScript := "../../scripts/uninstall_operator.sh"
+uninstallScript := "../../../scripts/uninstall_operator.sh"
Contributor

This is required because the suites moved one directory deeper?

Maybe we should pass in the scripts directory as an absolute path at some point and not rely on relative paths. Seems to be ok for now though.

Contributor

Would be ok for now for me too. Is there a way to make this relative to the actual kudo.go file instead, though?

Contributor Author

@ANeumann82 right, later we can do something like this: https://github.com/mesosphere/kudo-kafka-operator/pull/6/files#diff-dc3495b5db55989eec09ceab2776f370R29
Set a REPO_ROOT env var and create abs paths using that

Comment on lines 53 to 56
"NODE_CPU_MC=200",
"NODE_MEM_MIB=800",
"PROMETHEUS_EXPORTER_CPU_MC=100",
"PROMETHEUS_EXPORTER_MEM_MIB=200",
Contributor

These are probably development settings for local testing?

Contributor Author

right! I will remove these ☑️ from here and tls test

Contributor Author

@shubhanilBag shubhanilBag Jan 29, 2020

But it's good practice to have tests that consume as few resources as possible. I added these because the default values were too large to run in a Konvoy cluster with the default config.
It can help tests start up faster in small clusters. Also, if we have extra pods in tests, like a KDC or a client pod for testing application features, those can fail to run due to resource starvation on the k8s nodes. WDYT @ANeumann82?
FYI, the same practice is followed by kudo-kafka, ref: https://github.com/mesosphere/kudo-kafka-operator/blob/b692a7f003afe31a3b35002820542afe62178cbd/tests/utils/client.go#L241

Contributor

Yeah, I'm fine with that. I didn't know Cassandra well enough yet to tune the settings down, but as long as we don't make anything flaky because Cassandra doesn't have enough CPU or memory, I'm fine with having lower values for the tests.

Contributor

These settings are too low, and can likely cause resource throttling issues. I've seen it happen during MWTs. Unless we're specifically testing CPU/MEM parameters, let's use the default parameter values.

shubhanilBag and others added 10 commits January 30, 2020 15:39
* wip

* add todo

Co-authored-by: Murilo Pereira <[email protected]>
Signed-off-by: Shubhanil Bag <[email protected]>
@shubhanilBag
Contributor Author

@zmalik @mpereira @ANeumann82 thanks for the thorough review 👍 added changes as requested in fd0ff10

Contributor

@mpereira mpereira left a comment

Great work @shubhanilBag. Just a few comments from my side regarding documentation, but nothing blocking. Thanks!

README.md Outdated
@@ -48,6 +48,7 @@ kubectl kudo install cassandra
- [Managing](/docs/managing.md)
- [Upgrading](/docs/upgrading.md)
- [Monitoring](/docs/monitoring.md)
- [Security](./docs/secuity.md)
Contributor

Suggested change
- [Security](./docs/secuity.md)
- [Security](./docs/security.md)

docs/security.md Outdated
@@ -0,0 +1,68 @@
# Securing KUDO Cassandra Operator instances

The KUDO Cassandra service supports Cassandra’s native transport **encryption**
Contributor

Let's replace all instances of "service" with "operator" to maintain consistency.

Suggested change
The KUDO Cassandra service supports Cassandra’s native transport **encryption**
The KUDO Cassandra operator supports Cassandra’s native transport **encryption**

docs/security.md Outdated

The KUDO Cassandra service supports Cassandra’s native transport **encryption**
mechanism. The service provides automation and orchestration to simplify the use
of these important features. For more information on Cassandra’s security, read
Contributor

@mpereira mpereira Jan 31, 2020

Let's also replace instances of "Cassandra" with "Apache Cassandra" for the same reason.

Suggested change
of these important features. For more information on Cassandra’s security, read
of these important features. For more information on Apache Cassandra’s security, read

docs/security.md Outdated

```
kubectl kudo install cassandra \
--instance=cassandra --namespace=kudo-cassandra \
Contributor

Let's stay consistent: either all parameters on one line, or one parameter per line when there are lots of them.

Suggested change
--instance=cassandra --namespace=kudo-cassandra \
--instance=cassandra \
--namespace=kudo-cassandra \

@mpereira
Contributor

I pushed two small commits fixing a documentation thing.

@shubhanilBag would you be able to add an "unreleased" changelog entry as well?

@shubhanilBag
Contributor Author

shubhanilBag commented Jan 31, 2020

> I pushed two small commits fixing a documentation thing.
>
> @shubhanilBag would you be able to add an "unreleased" changelog entry as well?

@mpereira Added the doc fixes and added changelog entry in b3bf8f8

@mpereira
Contributor

mpereira commented Feb 1, 2020

🚢

@mpereira
Contributor

mpereira commented Feb 2, 2020

Looks like there are two test failures:

[18:14:38][Step 5/7] Summarizing 2 Failures:
[18:14:38][Step 5/7] 
[18:14:38][Step 5/7] [Fail] sanity-test [It] Configures Cassandra properties through custom properties 
[18:14:38][Step 5/7] /kudo-cassandra-operator/tests/suites/sanity/sanity_test.go:151
[18:14:38][Step 5/7] 
[18:14:38][Step 5/7] [Fail] sanity-test [It] Configures Cassandra JVM options through custom options 
[18:14:38][Step 5/7] /kudo-cassandra-operator/tests/suites/sanity/sanity_test.go:175
[18:14:38][Step 5/7] 
[18:14:38][Step 5/7] Ran 7 of 7 Specs in 1109.357 seconds
[18:14:38][Step 5/7] FAIL! -- 5 Passed | 2 Failed | 0 Pending | 0 Skipped

@shubhanilBag could you take a look when you get a chance?

Signed-off-by: Shubhanil Bag <[email protected]>
@shubhanilBag
Contributor Author

shubhanilBag commented Feb 3, 2020

The sanity test suite was running the tests in this order:

  1. install from community repo
  2. upgrade to local version
  3. Node Scaling tests (3nodes -> 4nodes)
  4. Three parameter update tests

As a result, the three parameter update tests were running with 4 nodes in the Cassandra instance, so the StatefulSet took longer to update itself than the 5-minute timeout, and the tests failed.
In c0fb3bd I moved the node scaling tests to the end. The tests are passing and the update tests are a bit faster now.
cc: @mpereira

@@ -120,3 +122,19 @@ func getConfigurationFromNodeLogs(

return configuration, nil
}

// Cqlsh TODO function comment.
Contributor

add the comment?

Contributor Author

added in a2162f4

Contributor

@zmalik zmalik left a comment

LGTM! 🎉

Signed-off-by: Shubhanil Bag <[email protected]>
Contributor

@ANeumann82 ANeumann82 left a comment

Looks good!

Contributor

@mpereira mpereira left a comment

🚢

@ANeumann82 ANeumann82 merged commit ff33829 into master Feb 7, 2020
@ANeumann82 ANeumann82 deleted the nil/enable-tls branch February 7, 2020 10:39