Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow creating KafkaTopic CR for topic that already present on the Kafka cluster #914

Closed
wants to merge 13 commits into from

Conversation

bartam1
Copy link
Contributor

@bartam1 bartam1 commented Dec 19, 2022

Q A
Bug fix? no
New feature? yes
API breaks? no
Deprecations? no
License Apache 2.0

What's in this PR?

It allows to create a KafkaTopic CR when the referenced topic is already present on the Kafka cluster.

Why?

Let the user create KafkaTopic resources for that topic which is already on the Kafka cluster
Users can modify the topic configuration and partition number for that topic through the created KafkaTopic CR

Checklist

  • Implementation tested
  • Error handling code meets the guideline
  • Logging code meets the guideline
  • User guide and development docs updated (if needed)

@bartam1 bartam1 requested a review from a team as a code owner December 19, 2022 13:48
Copy link
Member

@hi-im-aren hi-im-aren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the use case behind this?

  • If the topic was created without a CR then it's either an internal topic, which means there is probably an other system managing this topic already
  • or the user created it manually inside kafka which means they know what they are doing and skipped the CR creation on purpose

Also I think just because we remove the validation from the webhook, kafka will still come back with an error that the topic already exists because our reconciler will try to create the topic in kafka.

@panyuenlau
Copy link
Member

What's the use case behind this?

@hi-im-aren One example use case is for the replicated/migrated Kafka topics: for example, if users want to migrate their legacy Kafka clusters to under Koperator's management, they will need to create the KafkaTopic CRs after the replication/migration is done so Koperator can start managing the actual underline Kafka resources

@hi-im-aren
Copy link
Member

hi-im-aren commented Dec 19, 2022

Not sure that the correct way of doing this is removing the validation from the webhook. We are removing a feature because it gets in the way 1% of the time while it's useful in 99% of other use cases.

I'm sure we can figure something better out than removing the check all together.

@bartam1
Copy link
Contributor Author

bartam1 commented Dec 20, 2022

  1. If the topic was created without a CR then it's either an internal topic, which means there is probably an other system managing this topic already
  • It will not try to create I checked it. It checks its configuration and when it differs from the actual topic then the Koperator will sync the new configuration to the actual one.
    if existing != nil {
  1. Not sure that the correct way of doing this is removing the validation from the webhook. We are removing a feature because it gets in the way 1% of the time while it's useful in 99% of other use cases:
  • Im not sure about why so useful is this. For me It is much better to get all of the topic into KafkaTopic CR so somebody who doesn't like dashboards can check all of the presented topic through kubernetes.
    I also can change partition count and topic configuration for those topics.
    Can you collect your points against this modification so we can argue on those points in parking lot or on a meeting to find what would be the best. (I know this is a protection against users to avoid that case when the user accidentally change configuration for an existing topic)
  1. We should choose a middle way so we only permit adding that kind of KafkaTopic when it has a special annotation. So we can avoid noob users to overwrite topics accidentally

@bartam1 bartam1 requested a review from hi-im-aren December 20, 2022 10:01
@play-io
Copy link

play-io commented Jan 25, 2023

What's the use case behind this?

  • If the topic was created without a CR then it's either an internal topic, which means there is probably an other system managing this topic already
  • or the user created it manually inside kafka which means they know what they are doing and skipped the CR creation on purpose

Also I think just because we remove the validation from the webhook, kafka will still come back with an error that the topic already exists because our reconciler will try to create the topic in kafka.

This is also my case:

  • I started using koperator in 2018 or so
  • There was no KafkaTopic cr or maybe I overlooked it
  • I'm managing topics with a custom scripts
  • However want to migrate to KafkaTopic cr without deletion anything

@bartam1
Copy link
Contributor Author

bartam1 commented Feb 13, 2023

The problem is when the topic is internal or it is managed by 3rd party application and we let the user to create KafkaTopic CR for it

  1. then the user can modify topic configs (partition number ...) -> this can cause error in the 3rd party app or in kafka
  2. when it is modified and the kafka or 3rd party application change its configuration then corresponding KafkaTopic will be out of sync.

Solution A

  1. We should add some kind of confirmation protection when the user want to create KafkaTopic cr for a present topic. When a KafkaTopic has a specific label like: unmanaged=true, then the webhook checks is the present topic differs from the KafkaTopic CR and if yes, then the webhook reject that KafkaTopic with error message.
    When there is a label unmanaged=false then the user can modify the topic values.
    When there is no such label then we dont allow creating that KafkaTopic
  2. We cant guarantee that the settings what are in the KafkaTopic is the same as the present topic's settings

Solution B

  • We can make a controller which purpose to create KafkaTopic for the kafka topics and do a sync time by time (I think we should sync only the unmanaged topics).
    There would be unmanaged=true label by default for the internal topics. Users can only change those KafkaTopic CR settings when they change the unmanaged label to false.
  • There would be a timestamp when the configuration is synced by the controller.
  • the unmanaged=true label cannot be removed (webhook check) from those KafkaTopics that reflect to those topics that are not created by the user. It can be removed when its value set to unmanaged=false first.

mihaialexandrescu and others added 11 commits February 17, 2023 10:05
* add Unwrap method to package errorfactory and a unit test for it

* temporary: add webhooks err check functions and unit tests

* rename removingStorageMsg var and add comment about the use of errorDuringValidationMsg

* sanitize error unit tests and reorder public error check functions

* refactor to use errors.As instead of errors.Cause which no longer fits

* replace pkg alias apiErrors with apierrors in pkg webhooks

* camelCase testName and testCases in errors_test.go from pkg webhooks

* add comment for Unwrap method and shorten public error comparison functions in pkg webhooks
* add just kubebuilder validation markers; code generation NOT run

* update CRDs via make manifests
…929)

* chore(go-mod): fixed Koperator submodule tags

Previously invalid tags were specified for
Koperator sub-Go-modules (api, properties), tagged
these and referenced the latest tags.

* chore(go-mod): removed local replacements

Previously the local sub-Go-modules (api,
properties) were depended upon by the root Go
module using a local replacement.

Removed this because it is not transitive to
downstream usage - where Koperator is also a
dependency - and is anyway only neede for
development purposes which can be kept on local
machine instead of having it committed to the
repo.
* pkg webhooks: update new error functions names to IsAdmission prefix; NOT API breaking

* pkg webhooks: update IsInvalidReplicationFactor exported func name with IsAdmission prefix; API breaking change
* chore(README): removed obsolete badges

We transitioned from docker.io to ghcr.io and from
CircleCI to GHA so those badges are not relevant
anymore.

* chore(README): refactored badges

Added CI and image.
Added Go version and released version and date.

---------

Co-authored-by: Darren Lau <[email protected]>
As a preparation for the release of 0.23.0.
…and re-creation cycles indefinitely (#928)

* Prevent kafka controller from running into NodePort service deletion and re-creation cycles indefinitely

* Rename variable

* Refactor existing implementation to improve readability

* Fix unintended change

* Update logs and comments to reduce confusion
@bartam1
Copy link
Contributor Author

bartam1 commented Feb 17, 2023

This PR has been closed because it is re-created here: #934

@bartam1 bartam1 closed this Feb 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants