Clarify when changing number of cores is supported #850

kbatuigas · 2024-11-12T19:27:40Z

Description

Encourage users to upgrade to 24.3 if they want to use intra-broker partition balancing in production.
Add a note that decreasing the core count is available only in v24.2 and later. Decreasing the core count is a separate capability that is supported only if `node_local_core_assignment` is enabled, the same feature flag that enables intra-broker partition balancing.

Resolves https://github.com/redpanda-data/documentation-private/issues/2706
Review deadline: 13 Nov

Page previews

Cluster balancing > Intra-broker partition balancing

Checks

New feature
Content gap
Support Follow-up
Small fix (typos, links, copyedits, etc)

netlify · 2024-11-12T19:27:56Z

✅ Deploy Preview for redpanda-docs-preview ready!

Name	Link
🔨 Latest commit	`85e7c6a`
🔍 Latest deploy log	https://app.netlify.com/sites/redpanda-docs-preview/deploys/673b6172ca63710008309770
😎 Deploy Preview	https://deploy-preview-850--redpanda-docs-preview.netlify.app

kbatuigas · 2024-11-12T19:34:42Z

modules/manage/pages/cluster-maintenance/cluster-balancing.adoc

@@ -133,6 +133,8 @@ Prior to Redpanda version 24.2, this meant that some cores on a broker could ina

 Starting in v24.2, topic-aware intra-broker partition balancing allows for dynamically reassigning partitions within a broker.  Redpanda prioritizes an even distribution of a topic's partition replicas across all cores in a broker. If a broker's core count changes, when the broker starts back up, Redpanda can check partition assignments across the broker's cores and reassign partitions, so that a balanced assignment is maintained across all cores. Redpanda can also check partition assignments when partitions are added to or removed from a broker, and rebalance the remaining partitions between cores.

+NOTE: Changing the number of CPU cores in a running cluster is supported from v24.2 only, and requires that you enable `node_local_core_assignment` as described in the previous note.


@daisukebe @wzzzrd86 For this note, are we referring to a manual change in the number of cores? I'm not sure whether some process can change the number dynamically and, if so, whether that's worth mentioning.

daisukebe · 2024-11-13

@kbatuigas Yes, this refers to a manual change in the number of cores. To be clear, the general steps are:

Stop a broker that originally uses 10 cores.

Start the broker again with a lower core count specified, for example 9.

From the beginning this did not work as expected (the broker used to fail to start after the number of cores was decreased), but it works today. Does this answer your question?
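
The restart scenario above can be illustrated with a toy model. This is a minimal sketch, not Redpanda's actual balancing algorithm: `rebalance_across_cores` and `replicas_per_core` are hypothetical helpers, and the topic names and partition counts are made up. The sketch spreads one broker's partition replicas evenly over the available cores, first for 10 cores and then for 9.

```python
from collections import defaultdict


def rebalance_across_cores(replicas, core_count):
    """Toy model: spread each topic's replicas round-robin over cores.

    `replicas` is a list of (topic, partition) tuples hosted on one broker.
    Hypothetical helper for illustration only; not Redpanda's real algorithm.
    """
    if core_count < 1:
        raise ValueError("core_count must be at least 1")
    by_topic = defaultdict(list)
    for topic, partition in replicas:
        by_topic[topic].append(partition)

    assignment = {}
    next_core = 0  # carried across topics so per-core totals stay even too
    for topic in sorted(by_topic):
        for partition in sorted(by_topic[topic]):
            assignment[(topic, partition)] = next_core
            next_core = (next_core + 1) % core_count
    return assignment


def replicas_per_core(assignment, core_count):
    """Count how many replicas landed on each core."""
    counts = [0] * core_count
    for core in assignment.values():
        counts[core] += 1
    return counts


# Made-up workload: 30 replicas on a broker that goes from 10 cores to 9.
replicas = [("orders", p) for p in range(20)] + [("logs", p) for p in range(10)]
before = rebalance_across_cores(replicas, 10)
after = rebalance_across_cores(replicas, 9)
print(replicas_per_core(before, 10))
print(replicas_per_core(after, 9))
```

In this model, no replica stays assigned to the core that no longer exists after the restart, and the per-core counts differ by at most one.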

ztlpn · 2024-11-13T14:25:29Z

modules/manage/pages/cluster-maintenance/cluster-balancing.adoc

@@ -133,6 +133,8 @@ Prior to Redpanda version 24.2, this meant that some cores on a broker could ina

 Starting in v24.2, topic-aware intra-broker partition balancing allows for dynamically reassigning partitions within a broker.  Redpanda prioritizes an even distribution of a topic's partition replicas across all cores in a broker. If a broker's core count changes, when the broker starts back up, Redpanda can check partition assignments across the broker's cores and reassign partitions, so that a balanced assignment is maintained across all cores. Redpanda can also check partition assignments when partitions are added to or removed from a broker, and rebalance the remaining partitions between cores.

+NOTE: Changing the number of CPU cores in a running cluster is supported from v24.2 only, and requires that you enable `node_local_core_assignment` as described in the previous note.


Increasing has always been supported (but additional cores were initially idle).

Decreasing has not been supported until 24.2 (with the flag enabled).

Also, starting in 24.3, the flag is enabled automatically.
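
The three points in this comment can be summarized as a small decision function. This is a sketch under stated assumptions: `core_change_supported` and its parameters are hypothetical names, versions are modeled as `(major, minor)` tuples, and the logic encodes only what the comment states. It does not capture the production versus development distinction discussed later in the thread.

```python
def core_change_supported(version, old_cores, new_cores, flag_enabled=False):
    """Return True if restarting a broker with `new_cores` is supported.

    version: (major, minor) tuple, for example (24, 2).
    flag_enabled: whether node_local_core_assignment is enabled. It is only
    consulted for 24.2; per this thread the flag is enabled automatically
    starting in 24.3.
    Hypothetical helper for illustration only.
    """
    if new_cores >= old_cores:
        # Increasing (or keeping) the core count has always been supported.
        return True
    if version >= (24, 3):
        # Flag is enabled automatically, so decreasing is supported.
        return True
    if version >= (24, 2):
        # Decreasing works on 24.2 only with the flag enabled.
        return flag_enabled
    # Before 24.2, decreasing the core count was not supported.
    return False


print(core_change_supported((24, 2), 10, 9, flag_enabled=True))
```

Python compares tuples element by element, so `(24, 3) >= (24, 2)` behaves like a version comparison here.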

kbatuigas · 2024-11-13T18:41:36Z

@daisukebe @wzzzrd86 I was able to confirm with @ztlpn that `node_local_core_assignment` is disabled by default for as long as you're running 24.2, so in this PR we'll tell users to upgrade if they want to use intra-broker balancing in production. I have opened a separate PR, #855, to remove the "beta" note and mention that you couldn't decrease the core count before 24.2. That PR also takes out any mention of the feature flag, because users won't have to interact with it starting in 24.3. Do these changes look good to you? Let me know if it might still confuse readers.

daisukebe

Sounds good. Thank you!

daisukebe · 2024-11-13T21:34:57Z

I've found other pages that mention "decreasing core count is not supported". We need to update these too @kbatuigas (not sure if you want to open a separate PR, anyway).

https://docs.redpanda.com/current/reference/k-redpanda-helm-spec/#resources-cpu-cores
https://docs.redpanda.com/current/manage/kubernetes/k-manage-resources/#limitations

kbatuigas · 2024-11-13T22:03:40Z

Good spot @daisukebe, thank you! @JakeSCahill, I went ahead and drafted something for the Manage Pod Resources doc, but I'm not sure about interacting with feature flags in K8s. Would you be able to double-check? Also, is the Helm spec page autogenerated? I figured I'd leave it for you to update.

daisukebe · 2024-11-15T02:21:45Z

modules/manage/pages/kubernetes/k-manage-resources.adoc

+
+In v24.2, you must enable the `node_local_core_assignment` flag to be able to decrease the core count.
+
+In v24.3 or later, decreasing CPU cores is enabled by default.


In v24.3 or later, the `node_local_core_assignment` flag is enabled and decreasing CPU cores is supported for production clusters.

daisukebe · 2024-11-15T02:22:13Z

modules/manage/pages/kubernetes/k-manage-resources.adoc

-Redpanda does not support decreasing the CPU cores for brokers in an existing cluster.
+Redpanda supports decreasing the CPU cores for brokers in an existing cluster as of version 24.2.
+
+In v24.2, you must enable the `node_local_core_assignment` flag to be able to decrease the core count.


Can you add a note that decreasing CPU core count is not supported for production clusters?

JakeSCahill · 2024-11-15T13:38:07Z

Looks good. The Helm spec is autogenerated. The change is here: redpanda-data/helm-charts#1601

JakeSCahill · 2024-11-15T14:06:56Z

modules/manage/pages/kubernetes/k-manage-resources.adoc

+
+In v24.2, you must enable the `node_local_core_assignment` flag to be able to decrease the core count.
+
+In v24.3 or later, decreasing CPU cores is enabled by default.


I think we need a separate PR for the beta version of 24.3 docs that removes the limitations section.
We don't expect users to switch off feature flags.

In this PR, all we need is something like this (with links to intra-broker partition balancing and v24.3 docs):

Decreasing the number of CPU cores in a production cluster is not supported in this version of Redpanda. However, you can enable intra-broker partition balancing to reduce CPU cores in a development cluster. Starting from version 24.3, decreasing the number of CPU cores in a production cluster is supported. Upgrade to version 24.3 or later to access this feature.

kbatuigas · 2024-11-15T17:07:31Z

modules/manage/pages/kubernetes/k-manage-resources.adoc

+Decreasing the number of CPU cores in a production cluster is not supported in this version of Redpanda. However, you can enable xref:manage:cluster-maintenance/cluster-balancing.adoc#intra-broker-partition-balancing[intra-broker partition balancing] to reduce CPU cores in a development cluster.
+
+Starting from version 24.3, decreasing the number of CPU cores in a production cluster is supported. xref:24.3@ROOT:upgrade:index.adoc[Upgrade to version 24.3] or later to access this feature.


@daisukebe does this phrasing sound good to you?

daisukebe · 2024-11-18

Yes, sounds good and correct.

JakeSCahill · 2024-11-16T06:32:53Z

modules/manage/pages/cluster-maintenance/cluster-balancing.adoc

 ====

 In Redpanda, every partition replica is assigned to a CPU core on a broker. While Redpanda's default <<partition-replica-balancing,partition balancing>> monitors cluster-level events, such as the addition of new brokers or broker failure to balance partition assignments, it does not account for the distribution of partitions _within_ an individual broker. 

-Prior to Redpanda version 24.2, this meant that some cores on a broker could inadvertently host many partitions of heavily-used topics and cause the CPU to be xref:manage:monitoring.adoc#cpu-usage[overburdened]. Additionally, when the partition rebalance moved some partitions away from a broker, the remaining partitions did not necessarily rebalance across the broker's cores. Or, if a broker's core count was increased, Redpanda did not assign any partitions to the new cores until new partitions were created or old partitions were moved out.
+Prior to this version, this meant that some cores on a broker could inadvertently host many partitions of heavily-used topics and cause the CPU to be xref:manage:monitoring.adoc#cpu-usage[overburdened]. Additionally, when the partition rebalance moved some partitions away from a broker, the remaining partitions did not necessarily rebalance across the broker's cores. Or, if a broker's core count was increased, Redpanda did not assign any partitions to the new cores until new partitions were created or old partitions were moved out.


I think we should remove this. It’s something that should have been explained in release notes rather than being part of the doc. Users don’t typically care about what a previous version used to do unless they’re reading release notes.

kbatuigas commented Nov 12, 2024

kbatuigas requested review from wzzzrd86, daisukebe and ztlpn November 12, 2024 19:34

kbatuigas marked this pull request as ready for review November 12, 2024 19:35

kbatuigas requested a review from a team as a code owner November 12, 2024 19:35

ztlpn reviewed Nov 13, 2024

kbatuigas changed the base branch from main to v-WIP/24.3 November 13, 2024 15:26

kbatuigas changed the base branch from v-WIP/24.3 to main November 13, 2024 16:39

kbatuigas force-pushed the DOC-521-Document-core-change-capability-is-limited branch from 480a1a8 to 047809d Compare November 13, 2024 16:56

kbatuigas added 2 commits November 13, 2024 13:06

Add admomition regarding core number change

864544a

Point users to upgrade RP version

72809b5

kbatuigas force-pushed the DOC-521-Document-core-change-capability-is-limited branch from 047809d to 72809b5 Compare November 13, 2024 18:06

kbatuigas mentioned this pull request Nov 13, 2024

Change status of intra broker balancing from beta to GA #855

Merged

4 tasks

kbatuigas requested a review from ztlpn November 13, 2024 18:41

daisukebe approved these changes Nov 13, 2024

Decrease core count is also mentioned in other pages

864391d

daisukebe reviewed Nov 15, 2024

JakeSCahill mentioned this pull request Nov 15, 2024

Redpanda supports decreasing core count in v24.3 redpanda-data/helm-charts#1601

Closed

JakeSCahill reviewed Nov 15, 2024

kbatuigas added 2 commits November 15, 2024 11:56

Add link to upgrade to 24.3

c91ff9c

Remove explicit mention of version number

a9d6c88

kbatuigas commented Nov 15, 2024

kbatuigas requested a review from JakeSCahill November 15, 2024 17:10

JakeSCahill reviewed Nov 16, 2024

JakeSCahill approved these changes Nov 16, 2024

JakeSCahill reviewed Nov 16, 2024

kbatuigas and others added 2 commits November 18, 2024 10:46

Better to discuss RP behavior prior to this version in release notes and not docs

dda647a

Update modules/manage/pages/cluster-maintenance/cluster-balancing.adoc

85e7c6a

Co-authored-by: Jake Cahill

kbatuigas merged commit 736f6f4 into main Nov 18, 2024
7 checks passed

kbatuigas deleted the DOC-521-Document-core-change-capability-is-limited branch November 18, 2024 16:20
