tutorial: Deploying Kafka on Kubernetes using Strimzi and Pulumi #13818

Open: @dirien wants to merge 2 commits into base: master

Contributor

@dirien dirien commented Jan 15, 2025

Proposed changes

Unreleased product version (optional)

Related issues (optional)

@dirien dirien requested a review from a team as a code owner January 15, 2025 16:10
@dirien dirien marked this pull request as draft January 15, 2025 16:10
@dirien dirien marked this pull request as ready for review January 16, 2025 16:07
@dirien dirien requested a review from interurban January 16, 2025 16:30

[Strimzi](https://strimzi.io/) is an open-source project that provides a way to run an Apache Kafka cluster on Kubernetes. It provides operators and custom resources to deploy and manage Kafka clusters on Kubernetes.

It is a CNCF project and is widely used in the Kubernetes community to deploy Kafka clusters as it makes it very easy handle the lifecycle of Kafka clusters. Before Strimzi, deploying Kafka on Kubernetes was a complex task with many manual steps. From creating topics to managing brokers, everything was a manual task.
Collaborator

Suggested change
It is a CNCF project and is widely used in the Kubernetes community to deploy Kafka clusters as it makes it very easy handle the lifecycle of Kafka clusters. Before Strimzi, deploying Kafka on Kubernetes was a complex task with many manual steps. From creating topics to managing brokers, everything was a manual task.
It is a CNCF project and is widely used in the Kubernetes community to deploy Kafka clusters, as it makes it very easy to handle the lifecycle of Kafka clusters. Before Strimzi, deploying Kafka on Kubernetes was a complex task with many manual steps. From creating topics to managing brokers, everything was a manual task.


## Deploying the Strimzi Kafka Operator

In this tutorial, we will cover two different ways to deploy a Kafka cluster on Kubernetes using Strimzi. The first way is to use Helm to deploy the Strimzi Kafka Operator and then create a Kafka cluster using the Strimzi Kafka Custom Resource Definition (CRD).
Collaborator

I'd suggest we simplify this tutorial to focus on using Pulumi to deploy Kafka rather than on how to use Strimzi. With that, I'd drop the two approaches and highlight how to do this with Pulumi.


KRaft is an event-based implementation of the Raft protocol with a quorum controller maintaining an event log and a single-partition topic named __cluster_metadata to store the metadata. Unlike the other topics, this is special because records are written to disk synchronously, which is required by the Raft algorithm for correctness.

It works in a leader-follower mode, where the leader writes events into the metadata topic which is then replicated to the follower controllers by using the KRaft replication algorithm. The leader of that single-partition topic is actually the controller node of the Kafka cluster. The metadata changes propagation has the benefit of being event-driven via replication instead of using RPCs. The metadata management is part of Kafka with the usage of a new quorum controller service which uses an event-sources storage model. The KRaft protocol is used to ensure that metadata are fully replicated across the quorum.
Collaborator

@interurban interurban Jan 23, 2025

Suggested change
It works in a leader-follower mode, where the leader writes events into the metadata topic which is then replicated to the follower controllers by using the KRaft replication algorithm. The leader of that single-partition topic is actually the controller node of the Kafka cluster. The metadata changes propagation has the benefit of being event-driven via replication instead of using RPCs. The metadata management is part of Kafka with the usage of a new quorum controller service which uses an event-sources storage model. The KRaft protocol is used to ensure that metadata are fully replicated across the quorum.
It works in a leader-follower mode, where the leader writes events into the metadata topic which is then replicated to the follower controllers by using the KRaft replication algorithm. The leader of that single-partition topic is actually the controller node of the Kafka cluster. The propagation of metadata changes benefits from being event-driven through replication instead of relying on RPCs. Metadata management in Kafka utilizes a new quorum controller service that employs an event-sourced storage model. The KRaft protocol is used to ensure that metadata are fully replicated across the quorum.
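
The leader-follower flow described in the quoted paragraph can be illustrated with a toy model. This is only a sketch of the idea (a leader appends metadata events, followers copy what they are missing, and a record counts as committed once a majority holds it), not Kafka's actual implementation. The type names, function names, and event strings are all hypothetical.

```typescript
// Toy model of a replicated metadata log. NOT Kafka's implementation.
type MetadataRecord = { offset: number; event: string };
type Controller = { id: number; log: MetadataRecord[] };

function newController(id: number): Controller {
  return { id, log: [] };
}

// The leader appends a metadata event to its own log first.
function appendEvent(leader: Controller, event: string): number {
  const offset = leader.log.length;
  leader.log.push({ offset, event });
  return offset;
}

// A follower copies every record it is missing from the leader's log.
function replicate(leader: Controller, follower: Controller): void {
  for (let i = follower.log.length; i < leader.log.length; i++) {
    follower.log.push(leader.log[i]);
  }
}

// Records below the returned offset are stored on a majority of controllers.
function committedOffset(controllers: Controller[]): number {
  const ends = controllers.map((c) => c.log.length).sort((a, b) => b - a);
  const majority = Math.floor(ends.length / 2) + 1;
  return ends[majority - 1];
}

const leader = newController(1);
const followerA = newController(2);
const followerB = newController(3);
const quorumMembers = [leader, followerA, followerB];

appendEvent(leader, "CreateTopic orders"); // hypothetical event label
appendEvent(leader, "RegisterBroker 4"); // hypothetical event label
replicate(leader, followerA);

console.log(committedOffset(quorumMembers)); // prints 2
```

With three controllers, two copies are enough for a majority, so both records are committed even though `followerB` has not caught up yet.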


#### What is KRaft?

To overcome the limitations of relying on ZooKeeper, the Kafka community came up with the idea of using Kafka itself to store metadata and of using an event-driven pattern to propagate updates across the nodes. The work started with KIP-500 in late 2019 with the introduction of a built-in consensus protocol based on Raft, named Kafka Raft (**KRaft**).
Collaborator

Should we explain more about the ZooKeeper usage? Is there a blog/article to reference?

Collaborator

Also, should we link out to more detail on KRaft?


In this section, we will use Pulumi to deploy the Strimzi Kafka Operator and create a Kafka cluster using Pulumi [CustomResource](/registry/packages/kubernetes/api-docs/apiextensions/customresource/)

### Select your Pulumi supported language
Collaborator

I don't think this section block is required


Similar to the Helm installation, we will install the Strimzi Kafka Operator using the [Pulumi Kubernetes provider](/registry/packages/kubernetes/).

Replace the content of created Pulumi program with the following code:
Collaborator

Suggested change
Replace the content of created Pulumi program with the following code:
Replace the content of the created Pulumi program with the following code:


Background:

* We create a dedicated namespace called `kafka` (optional, but often recommended).
Collaborator

Suggested change
* We create a dedicated namespace called `kafka` (optional, but often recommended).
* We create a dedicated namespace called `kafka` (optional, but recommended).
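
The PR's program code is not shown in this thread. As a minimal sketch of what the operator install step could look like, the block below builds the inputs for a Helm release as a plain object. It assumes the Strimzi chart name `strimzi-kafka-operator` and the chart repository `https://strimzi.io/charts/`. The `ReleaseInputs` interface and the `strimziReleaseArgs` helper are hypothetical names introduced for illustration.

```typescript
// Local type mirroring the subset of Helm release inputs used here (hypothetical name).
interface ReleaseInputs {
  chart: string;
  namespace: string;
  repositoryOpts: { repo: string };
  version?: string;
}

// Namespace name taken from the tutorial text.
const kafkaNamespace = "kafka";

// Builds the release inputs; the chart version is optional and left to the caller.
function strimziReleaseArgs(namespace: string, version?: string): ReleaseInputs {
  const args: ReleaseInputs = {
    chart: "strimzi-kafka-operator",
    namespace,
    repositoryOpts: { repo: "https://strimzi.io/charts/" },
  };
  return version ? { ...args, version } : args;
}

// In a Pulumi program, with `import * as k8s from "@pulumi/kubernetes"`:
//   const ns = new k8s.core.v1.Namespace("kafka", { metadata: { name: kafkaNamespace } });
//   const operator = new k8s.helm.v3.Release(
//     "strimzi", strimziReleaseArgs(kafkaNamespace), { dependsOn: [ns] });

console.log(JSON.stringify(strimziReleaseArgs(kafkaNamespace), null, 2));
```

Keeping the inputs in a plain function makes them easy to inspect before they are handed to the `Release` resource.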


- Learn more about Pulumi and Kubernetes in the [Kubernetes documentation](/docs/iac/clouds/kubernetes/).
- Learn more about the `Release` resource in the [Pulumi Kubernetes API documentation](/registry/packages/kubernetes/api-docs/helm/v3/release/).
- Try the out the `Chart` [tutorial](/tutorials/kubernetes-helm-part-two) to learn how to install Helm charts on Kubernetes using the `Chart` resource.
Collaborator

Suggested change
- Try the out the `Chart` [tutorial](/tutorials/kubernetes-helm-part-two) to learn how to install Helm charts on Kubernetes using the `Chart` resource.
- Try out the `Kubernetes Helm Chart` [tutorial](/tutorials/kubernetes-helm-part-two) to learn how to install Helm charts on Kubernetes using the `Chart` resource.

* Sets up both plaintext (plain) and TLS listeners (tls).
* Includes the Entity Operator, which manages users and topics.
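
The two bullets above correspond to fields in the Strimzi `Kafka` custom resource. The following is a deliberately partial sketch of that manifest, assuming the `kafka.strimzi.io/v1beta2` API version and the usual internal ports 9092 and 9093. It omits broker counts, storage, and KRaft node pool settings, and the `kafkaClusterManifest` helper and the cluster name `my-cluster` are hypothetical.

```typescript
// Builds a partial Kafka custom resource with plain and TLS listeners
// and the Entity Operator (Topic Operator and User Operator) enabled.
function kafkaClusterManifest(name: string, namespace: string) {
  return {
    apiVersion: "kafka.strimzi.io/v1beta2",
    kind: "Kafka",
    metadata: { name, namespace },
    spec: {
      kafka: {
        listeners: [
          { name: "plain", port: 9092, type: "internal", tls: false },
          { name: "tls", port: 9093, type: "internal", tls: true },
        ],
      },
      // Empty objects enable both operators with their default settings.
      entityOperator: { topicOperator: {}, userOperator: {} },
    },
  };
}

// In a Pulumi program, with `import * as k8s from "@pulumi/kubernetes"`:
//   new k8s.apiextensions.CustomResource(
//     "my-cluster", kafkaClusterManifest("my-cluster", "kafka"));

const clusterManifest = kafkaClusterManifest("my-cluster", "kafka");
console.log(clusterManifest.spec.kafka.listeners.map((l) => l.name).join(","));
```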

### Creating a Kafka Topic
Collaborator

Suggested change
### Creating a Kafka Topic
### Create a Kafka Topic
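
For reference, a topic managed by the Topic Operator is declared as a `KafkaTopic` custom resource that points at its cluster through the `strimzi.io/cluster` label. The block below is a minimal sketch of such a manifest. The helper name, topic name, cluster name, and the partition and replica counts are hypothetical values chosen for illustration.

```typescript
// Builds a KafkaTopic manifest bound to a cluster via the strimzi.io/cluster label.
function kafkaTopicManifest(
  name: string,
  cluster: string,
  namespace: string,
  partitions: number,
  replicas: number,
) {
  return {
    apiVersion: "kafka.strimzi.io/v1beta2",
    kind: "KafkaTopic",
    metadata: {
      name,
      namespace,
      labels: { "strimzi.io/cluster": cluster },
    },
    spec: { partitions, replicas },
  };
}

// In a Pulumi program, with `import * as k8s from "@pulumi/kubernetes"`:
//   new k8s.apiextensions.CustomResource(
//     "my-topic", kafkaTopicManifest("my-topic", "my-cluster", "kafka", 3, 1));

const topicManifest = kafkaTopicManifest("my-topic", "my-cluster", "kafka", 3, 1);
console.log(JSON.stringify(topicManifest.spec));
```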


{{% /choosable %}}

### Creating a Kafka User (Optional)
Collaborator

Suggested change
### Creating a Kafka User (Optional)
### Create a Kafka User (Optional)

Also, why make this step optional?
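
To make the discussion of the user step concrete, here is a hedged sketch of a `KafkaUser` manifest as handled by the User Operator. It assumes TLS client authentication and simple ACL authorization, including the `operations` list form of an ACL rule. The helper name, user name, cluster name, and topic name are hypothetical.

```typescript
// Builds a KafkaUser manifest with TLS authentication and read access to one topic.
function kafkaUserManifest(name: string, cluster: string, namespace: string, topic: string) {
  return {
    apiVersion: "kafka.strimzi.io/v1beta2",
    kind: "KafkaUser",
    metadata: {
      name,
      namespace,
      labels: { "strimzi.io/cluster": cluster },
    },
    spec: {
      authentication: { type: "tls" },
      authorization: {
        type: "simple",
        acls: [
          {
            // Assumption: literal match on a single topic name.
            resource: { type: "topic", name: topic, patternType: "literal" },
            operations: ["Read", "Describe"],
          },
        ],
      },
    },
  };
}

// In a Pulumi program, with `import * as k8s from "@pulumi/kubernetes"`:
//   new k8s.apiextensions.CustomResource(
//     "my-user", kafkaUserManifest("my-user", "my-cluster", "kafka", "my-topic"));

const userManifest = kafkaUserManifest("my-user", "my-cluster", "kafka", "my-topic");
console.log(userManifest.spec.authorization.acls[0].operations.join(","));
```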


![img_1.png](img_1.png)

## Deploying the Strimzi Kafka Operator
Collaborator

Suggested change
## Deploying the Strimzi Kafka Operator
## Deploy the Strimzi Kafka Operator


{{% /choosable %}}

### Deploying the whole stack
Collaborator

Suggested change
### Deploying the whole stack
### Deploy the Kafka Kubernetes Cluster

... or something similar

- kubernetes
---

## What is Strimzi?
Collaborator

I'd suggest we start with an overview of Kafka and Kubernetes and what that architecture looks like (with a great visual like the one you have below), and cover the common use cases for this scenario before talking about Strimzi. Currently it's a tutorial about Strimzi, but it should primarily be about using Pulumi to deploy a Kafka cluster. It would still be great to say that there's a new open source project that makes this much easier than it was in the past.

title: Deploying Kafka on Kubernetes using Strimzi and Pulumi
layout: single
description: |
Learn how to deploy a Kafka cluster on Kubernetes using Strimzi and Pulumi.
Collaborator

Suggested change
Learn how to deploy a Kafka cluster on Kubernetes using Strimzi and Pulumi.
Learn how to deploy a Kafka cluster on Kubernetes using Strimzi and Pulumi.

Per my feedback below, it seems like we should make the highlight and topic about creating a Kafka cluster with Pulumi and not highlight Strimzi in the title, but rather leave that for the tutorial itself.
