Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add thanos to hub quickstart cluster & remote write from all clusters #634

Merged
merged 2 commits into from
May 10, 2024

Conversation

david-martin
Copy link
Member

  • adds installation of thanos to the first/hub cluster
  • configures grafana in the first/hub cluster with thanos-query as a data source
  • configures every cluster to remote write metrics to the hub thanos instance

To verify, the PR branch thanos will need to be referenced until changes are back on main.

  • Setup a hub cluster with KUADRANT_REF=thanos ./hack/quickstart-setup.sh
  • then at least 1 more cluster with KUADRANT_REF=thanos ./hack/quickstart-setup.sh again.

In the first/hub cluster, you should see metrics from both clusters in the Grafana explore view. e.g. the up{app="metallb",component="controller"} metric should have 1 entry for each cluster.
To access grafana in the hub, you'll need to port-forward to the service with kubectl --context kind-kuadrant-local -n monitoring port-forward svc/grafana 3000:3000 then access it at http://127.0.0.1:3000/explore

@david-martin david-martin requested a review from a team as a code owner May 9, 2024 11:32
Copy link

codecov bot commented May 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.50%. Comparing base (ece13e8) to head (431f205).
Report is 85 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #634      +/-   ##
==========================================
+ Coverage   80.20%   82.50%   +2.29%     
==========================================
  Files          64       72       +8     
  Lines        4492     5360     +868     
==========================================
+ Hits         3603     4422     +819     
- Misses        600      634      +34     
- Partials      289      304      +15     
Flag Coverage Δ
integration 73.91% <ø> (+2.63%) ⬆️
unit 32.22% <ø> (+2.18%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
api/v1beta1 (u) 71.42% <ø> (ø)
api/v1beta2 (u) 93.58% <100.00%> (+2.16%) ⬆️
pkg/common (u) 83.78% <ø> (-5.05%) ⬇️
pkg/istio (u) 75.14% <ø> (+1.23%) ⬆️
pkg/log (u) 94.73% <ø> (ø)
pkg/reconcilers (u) ∅ <ø> (∅)
pkg/rlptools (u) 79.84% <ø> (+0.39%) ⬆️
controllers (i) 80.72% <84.03%> (+3.91%) ⬆️

see 30 files with indirect coverage changes

@ehearneRedHat ehearneRedHat requested review from ehearneRedHat and removed request for ehearneRedHat May 9, 2024 13:23
kubectl apply -k ${KUADARNT_THANOS_KUSTOMIZATION}
success "thanos installed successfully."
fi

# Install observability stack
info "Installing observability stack in ${KUADRANT_CLUSTER_NAME}..."
kubectl kustomize ${KUADARNT_OBSERVABILITY_KUSTOMIZATION} | docker run --rm -i ryane/kfilt -i kind=CustomResourceDefinition | kubectl apply --server-side -f -
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script fails at this point for me saying docker command not found. I do have a alias but maybe the pipe is causing weird things with it. Its not a change from this pr that's causing the issue but Im thinking we should use the CONTAINER_RUNTIME_BIN env from the top of the script instead

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.
Pushed up a fix for that

Copy link
Contributor

@R-Lawton R-Lawton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM works as expected
Screenshot 2024-05-10 at 12 33 27

@R-Lawton R-Lawton merged commit bca5130 into main May 10, 2024
22 checks passed
@ehearneRedHat ehearneRedHat self-requested a review May 10, 2024 12:50
Copy link
Contributor

@ehearneRedHat ehearneRedHat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this PR has been approved and merged, but I wanted to document an issue I had when trying to review this PR and have since found the solution.

My system runs on FedoraOS, and without changing any settings, works fine for using just single cluster quickstart. However, I had issues earlier when trying to run multicluster quickstart and have discovered why.

My issue was that I could not get past the INFO: Waiting for cert-manager deployments to be ready part of the script.

The reason for this, was that inotify did not have enough resources to create more than the pods in a single cluster.

Kind has this page which explains why this issue occurs. After I ran:

sudo sysctl fs.inotify.max_user_watches=524288
sudo sysctl fs.inotify.max_user_instances=512

After deleting the faulty cluster, and re-running the quickstart as seen in the description, I was able to verify the changes working as expected.

Commenting this for future reference. :)

@eguzki eguzki deleted the thanos branch June 3, 2024 16:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants