DEVX-2277: Change topic creation to use Confluent Server v3 REST API #300

Merged
javabrett merged 12 commits into 6.0.1-post from topic-creation-via-rest-v3
Jan 6, 2021

Conversation

javabrett
Member

Description

What behavior does this PR change, and why?

I was using the v3 REST API with cp-demo and figured that changing topic creation to use it would be a good addition. This PR retires shelling into a container to run the kafka-topics CLI tool, replacing it with POST calls to https://localhost:8091/kafka/v3/clusters/${KAFKA_CLUSTER_ID}/topics.

This appears to be much faster, and would eliminate the need for efforts to parallelise topic creation per #285.

curl was missing from pre-flight installed checks, so added that too.

Author Validation

Describe the validation already done, or needs to be done, by the PR submitter.

  • Documentation
  • Run cp-demo
  • jmx-monitoring-stacks

Reviewer Tasks

Describe the tasks/validation that the PR submitter is requesting to be done by the reviewer.

Some questions for reviewer:

  • Currently we don't bother to use the same user principal to create the topics as before - they are all created by superUser. Does it matter which user creates the topic?
  • Note that we currently send an Authorization: Basic header with base64-encoded credentials - would curl -u / --user be preferred?
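
On the second question, the two forms should put the same header on the wire, because curl's -u / --user option builds a Basic Authorization header from user:password. A minimal sketch of that equivalence follows; basic_auth_header is a hypothetical helper name used only for illustration, not something in this PR.

```shell
# Hypothetical helper: build the header that `curl -u user:password` would send.
basic_auth_header() {
  # printf (not echo) keeps a trailing newline out of the encoded credentials.
  printf 'Authorization: Basic %s\n' \
    "$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')"
}

basic_auth_header superUser superUser
# → Authorization: Basic c3VwZXJVc2VyOnN1cGVyVXNlcg==
```

Base64 is an encoding rather than encryption, so the pre-encoded header is no more secret than -u superUser:superUser; the choice is mostly about readability.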

@javabrett javabrett requested a review from ybyzek December 4, 2020 11:30
@ybyzek
Contributor

ybyzek commented Dec 4, 2020

@javabrett absolutely phenomenal idea! And it showcases REST Proxy. Will test shortly and leave feedback.

--header 'Authorization: Basic c3VwZXJVc2VyOnN1cGVyVXNlcg==' \
--data-binary @<(jq -n --arg topic_name users --arg confluent_value_schema_validation "true" -f ${DIR}/topic.jq) | jq

curl -s --insecure --location --request POST "https://localhost:8091/kafka/v3/clusters/${KAFKA_CLUSTER_ID}/topics" \
Contributor

@ybyzek ybyzek Dec 4, 2020

For comparison (and really promoting the idea that these examples can be or should be consistent), the cp-demo docs show a curl example for topic creation as follows:

      docker-compose exec restproxy curl -X POST \
         -H "Content-Type: application/json" \
         -H "accept: application/json" \
         -d "{\"topic_name\":\"dev_users\",\"partitions_count\":64,\"replication_factor\":2,\"configs\":[{\"name\":\"cleanup.policy\",\"value\":\"compact\"},{\"name\":\"compression.type\",\"value\":\"gzip\"}]}" \
         --cert /etc/kafka/secrets/mds.certificate.pem \
         --key /etc/kafka/secrets/mds.key \
         --tlsv1.2 \
         --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \
         -u appSA:appSA \
         "https://kafka1:8091/kafka/v3/clusters/${KAFKA_CLUSTER_ID}/topics" | jq

There are differences between what the cp-demo docs show and what is written here. Can we think about if it makes sense to synchronize them in terms of:

  • running on docker container or local host
  • the user/auth
  • --insecure versus not
  • passing in the certs
  • etc

Member Author

Script code is now mostly consistent with this, remaining differences:

  • -d data, ours is dynamically generated for each call from a jq template
  • We aren't sending a key ... we rely on Basic auth ... above sets both key and user, should probably pick one.
  • Example sets --tlsv1.2 forcing a less-secure-than-available protocol (1.3 is best, available) - doc-bug?
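
The jq-templated payload mentioned in the first bullet could be rendered roughly as below. This is only a sketch: the PR's topic.jq is not shown in this conversation, so the function name, the inline filter, and the partition and replication values are all assumptions. The field names follow the docs example quoted above.

```shell
# Hypothetical stand-in for the PR's topic.jq template (its contents are not shown here).
render_topic_payload() {
  local topic_name="$1" schema_validation="$2"
  jq -n -c \
    --arg topic_name "$topic_name" \
    --arg schema_validation "$schema_validation" \
    '{
      topic_name: $topic_name,
      partitions_count: 2,      # assumed value
      replication_factor: 2,    # assumed value
      configs: [
        {name: "confluent.value.schema.validation", value: $schema_validation}
      ]
    }'
}
```

The rendered JSON can then be piped to curl with --data-binary @-, which reads the request body from stdin, instead of using process substitution.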

Contributor

@javabrett

We aren't sending a key ... we rely on Basic auth ... above sets both key and user, should probably pick one.

When removing key, the command errors out with: parse error: Invalid numeric literal at line 1, column 5. Does this happen in your env and do you know why?

Example sets --tlsv1.2 forcing a less-secure-than-available protocol (1.3 is best, available) - doc-bug?

IIRC v1.2 was specified NOT instead of v1.3 but instead of v1.1. Is your recommendation to remove the version completely from the command examples in the docs, or to explicitly set v1.3?

Member Author

Does this happen in your env and do you know why?

IIUC this is because user appSA has only been granted ResourceOwner on some topics, so it might not have a grant to create a topic? I might be misreading the grants.

Changing to -u superUser:superUser with no keys or certs works with Basic auth. Maybe it's more correct to use the certs anyway?

Member Author

IIRC v1.2 was specified NOT instead of v1.3 but instead of v1.1. Is your recommendation to remove the version completely from the command examples in the docs, or to explicitly set v1.3?

@ybyzek you are correct here, my error ... --tlsv1.2 sets a minimum, so TLS v1.3 is typically being negotiated anyway, preventing TLS 1.1 or worse only.

I'll update the PR to include the same guard-rails.

Member Author

Added --tlsv1.2 to curl for create topic.

done
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null && pwd )"

KAFKA_CLUSTER_ID=$(curl -s --insecure --location --request GET 'https://localhost:8091/kafka/v3/clusters' --header 'Authorization: Basic c3VwZXJVc2VyOnN1cGVyVXNlcg==' | jq -r '.data[0].cluster_id')
Contributor

For your consideration, there's a function that does this get_kafka_cluster_id_from_container --> https://github.com/confluentinc/cp-demo/blob/6.0.0-post/scripts/helper/functions.sh#L58-L68 . It would be nice to pull in the helper scripts and then use that function to get the ID.

Member Author

Done, depends on the tools container though.

Contributor

@ybyzek ybyzek left a comment

@javabrett awesome time improvement (~55s before, 8s after). Please note requested changes

@ybyzek ybyzek changed the title Change topic creation to use Confluent Server v3 REST API DEVX-2277: Change topic creation to use Confluent Server v3 REST API Dec 4, 2020
@javabrett javabrett force-pushed the topic-creation-via-rest-v3 branch from d39835c to eea6de8 Compare December 12, 2020 05:08
@javabrett javabrett changed the base branch from 6.0.0-post to 6.0.1-post December 12, 2020 05:08
@javabrett
Member Author

I am blocked implementing code-review requested changes:

  • Existing get_kafka_cluster_id_from_container relies on jq, is only called from create-role-bindings.sh which runs on tools.
  • I also want to use jq both as a templating-engine for rendering the POST payload for create topic, and for pretty-printing results.
  • As previously discussed, tools is scheduled for deprecation and shouldn't be used further, so a new mechanism will be required for get_kafka_cluster_id_from_container. That might be a new CLI container, but waiting to learn if it will include tools like jq in the base image.
  • If tools is going away without a replacement jq in-container, options include a) adding jq to custom Connect image since we already build one, b) pull-up get_kafka_cluster_id_from_container to run on host since the host already has curl and jq dependencies, c) create a different tools container based on existing base containers, d) wait for new CLI container and possibly add jq to it, e) investigate running on openldap, which has jq but no curl.

@javabrett javabrett force-pushed the topic-creation-via-rest-v3 branch from eea6de8 to 6d4bb44 Compare December 20, 2020 06:21

if [[ $RC -ne 0 || -z $RESULT || $RESULT =~ "error_code" ]]; then
echo "ERROR: create topic failed $RESULT"
return 1
Contributor

Should this error code be handled by the calling script, i.e., fail fast?

Member Author

This should already fail-fast I think (and according to a manual test), via:

set -euo pipefail

If I change:

diff --git a/scripts/helper/create-topics.sh b/scripts/helper/create-topics.sh
index 7adfca4..d8700b1 100755
--- a/scripts/helper/create-topics.sh
+++ b/scripts/helper/create-topics.sh
@@ -10,6 +10,8 @@ echo "KAFKA_CLUSTER_ID: ${KAFKA_CLUSTER_ID}"

 auth="superUser:superUser"

+create_topic kafka1:8091 ${KAFKA_CLUSTER_ID} wikipedia.parsed.count-by-domain false ${auth}
+
 create_topic kafka1:8091 ${KAFKA_CLUSTER_ID} users true ${auth}
 create_topic kafka1:8091 ${KAFKA_CLUSTER_ID} wikipedia.parsed true ${auth}
 create_topic kafka1:8091 ${KAFKA_CLUSTER_ID} wikipedia.parsed.count-by-domain false ${auth}

... the script stops with:

ERROR: create topic failed {"error_code":40002,"message":"Topic 'wikipedia.parsed.count-by-domain' already exists."}

Is anything further required for fail-fast?
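
For reference, the fail-fast behaviour described here can be reproduced without a broker. The sketch below uses a stub in place of create_topic (create_topic_stub and run_fail_fast_demo are hypothetical names) and feeds it canned responses, so it only demonstrates how set -e reacts to the non-zero return, not the real REST call.

```shell
# Runs a small demo script in a child bash so that set -e aborts the child, not the caller.
run_fail_fast_demo() {
  bash <<'EOF'
set -euo pipefail

# Stub standing in for create_topic: it only inspects a canned response string.
create_topic_stub() {
  local result="$1"
  if [[ -z $result || $result =~ "error_code" ]]; then
    echo "ERROR: create topic failed $result"
    return 1
  fi
  echo "created"
}

create_topic_stub '{"topic_name":"users"}'
create_topic_stub '{"error_code":40002,"message":"already exists"}'
echo "not reached"
EOF
}
```

The second call returns 1, so the child shell exits before the final echo. This only holds while the call is a plain command; inside an if condition or on the left of || or &&, set -e ignores the failure.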

Contributor

@javabrett this test does not fail fast:

[11:53:59] ~/git/cp-demo(topic-creation-via-rest-v3) ✗: git diff
diff --git a/scripts/helper/create-topics.sh b/scripts/helper/create-topics.sh
index 7adfca4..64f386a 100755
--- a/scripts/helper/create-topics.sh
+++ b/scripts/helper/create-topics.sh
@@ -8,7 +8,7 @@ source ${DIR}/functions.sh
 KAFKA_CLUSTER_ID=$(get_kafka_cluster_id_from_container)
 echo "KAFKA_CLUSTER_ID: ${KAFKA_CLUSTER_ID}"
 
-auth="superUser:superUser"
+auth="superUser:superUser2"
 
 create_topic kafka1:8091 ${KAFKA_CLUSTER_ID} users true ${auth}
 create_topic kafka1:8091 ${KAFKA_CLUSTER_ID} wikipedia.parsed true ${auth}

stdout shows:

{
  "servlet": "default",
  "message": "Unauthorized",
  "url": "/kafka/v3/clusters/AO6RqB0HSnaagGflWwIy4w/topics",
  "status": "401"
}

But the start script continues and does not exit

Member Author

@ybyzek OK I see what happens there - for auth 401 errors, the error response is actually different, so we don't detect the error.

curl is well known for making it difficult to capture successful responses, error responses, and HTTP status codes all at once - do we have an idiomatic way of doing that?

Contributor

I haven't played with this too much, but some suggestions to consider:

  1. Parse output for "Unauthorized"
  2. Check existence of topics afterwards
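
The second suggestion might be sketched as follows. It assumes that a GET on the same .../topics URL returns the topics under .data[].topic_name (mirroring the .data[0].cluster_id shape used for the cluster list earlier in this PR); topic_in_list is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if the named topic appears in a v3 topic-list response.
# Assumes the list response carries topics under .data[].topic_name.
topic_in_list() {
  local topic_name="$1" list_json="$2"
  printf '%s' "$list_json" |
    jq -e --arg t "$topic_name" 'any(.data[]; .topic_name == $t)' >/dev/null
}
```

The list_json argument would come from a curl GET using the same credentials as topic creation; jq -e turns a false result into a non-zero exit status, so the helper can be used directly under set -e.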

Member Author

@ybyzek I've pushed a commit which adds fetching of the HTTP response code from the curl command for create_topic, so we can test that in addition to the output scraping. Unfortunately curl makes it fairly hard to fetch both output and response code, so the new bash script fragment is necessarily complicated.

The create_topic call now fails fast for at least topic-already-exists (400) and unauthorized/bad-credentials (401). Any response code >299 will be treated as a failure.
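
One common way to get both the body and the status code out of a single curl call is to append the code with --write-out and split it off afterwards. The sketch below illustrates that idiom and is not necessarily what the pushed commit does; all function names are hypothetical, and it treats anything other than 2xx as a failure, which is slightly stricter than the >299 rule above.

```shell
# Hypothetical wrapper: print the response body, then the HTTP status code on a final line.
http_request() {
  curl -s -w '\n%{http_code}' "$@"
}

# Split a captured "body + newline + code" string into its two parts.
response_code() { printf '%s\n' "$1" | tail -n 1; }
response_body() { printf '%s\n' "$1" | sed '$d'; }

# Return non-zero unless the status code is 2xx (curl reports 000 if no response arrived).
check_response() {
  local response="$1" code
  code="$(response_code "$response")"
  case "$code" in
    2[0-9][0-9]) return 0 ;;
  esac
  echo "ERROR: create topic failed $(response_body "$response")"
  return 1
}
```

A caller would capture response="$(http_request ...)" with the usual curl arguments and then run check_response "$response", which keeps the error body available for the message while still failing on the status code.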

Contributor

@javabrett beauty:

ERROR: create topic failed {
"servlet":"default",
"message":"Unauthorized",
"url":"/kafka/v3/clusters/QKMFRq6RSbeOfk4-6G7pPw/topics",
"status":"401"
}

It fails fast!!

@ybyzek
Contributor

ybyzek commented Dec 30, 2020

@javabrett please note PR #316 will impact this PR due to topic renaming. Maybe this PR should retarget 6.1.x to avoid point merge conflicts?

@javabrett
Member Author

javabrett commented Jan 1, 2021

@ybyzek Ack that there will be changes/conflict to resolve here, assuming #316 is merged first.

Will those changes also be required assuming #316 applies also on 6.1.x? I'm happy to target that branch, but equally happy to stay on 6.0.1-post, unless there are other indications that suggest 6.1.x is better.

Edit: oh I only just noticed that #316 targets 6.1.x.

@javabrett javabrett force-pushed the topic-creation-via-rest-v3 branch from 66f3b7d to d064952 Compare January 1, 2021 03:13
@javabrett javabrett requested a review from ybyzek January 1, 2021 04:04
@ybyzek ybyzek requested a review from mikebin January 6, 2021 21:01
Contributor

@ybyzek ybyzek left a comment

@javabrett LGTM -- wonderful improvement in time and showcases REST Proxy for topic creation -- win/win. Thank you!

@javabrett javabrett merged commit 851accf into 6.0.1-post Jan 6, 2021
@javabrett javabrett deleted the topic-creation-via-rest-v3 branch January 6, 2021 21:13

2 participants