Update monitoring docs to use the in-cluster monitoring stack #2586

Merged
merged 44 commits into from
Apr 24, 2023
Commits
66db837
Update monitoring docs to use OpenShift built-in monitoring stack
dkwon17 Apr 13, 2023
3185519
Update modules/administration-guide/partials/ref_grafana-dashboards-f…
dkwon17 Apr 19, 2023
55001e5
Update modules/administration-guide/partials/ref_grafana-dashboards-f…
dkwon17 Apr 19, 2023
1f1ec51
Update modules/administration-guide/partials/ref_grafana-dashboards-f…
dkwon17 Apr 19, 2023
261f30b
Update modules/administration-guide/partials/ref_grafana-dashboards-f…
dkwon17 Apr 19, 2023
4c485f2
Update modules/administration-guide/partials/ref_grafana-dashboards-f…
dkwon17 Apr 19, 2023
34ca26a
Update modules/administration-guide/partials/proc_viewing-dev-workspa…
dkwon17 Apr 19, 2023
d2f513b
Update modules/administration-guide/partials/proc_collecting-che-metr…
dkwon17 Apr 20, 2023
ec457a6
Update modules/administration-guide/partials/proc_collecting-che-metr…
dkwon17 Apr 20, 2023
8232bc0
Update modules/administration-guide/partials/proc_collecting-che-metr…
dkwon17 Apr 20, 2023
69aa231
Update modules/administration-guide/partials/proc_collecting-che-metr…
dkwon17 Apr 20, 2023
fb22bff
Update modules/administration-guide/partials/proc_collecting-dev-work…
dkwon17 Apr 20, 2023
a5adbd2
Update modules/administration-guide/partials/proc_viewing-che-metrics…
dkwon17 Apr 20, 2023
79ac26a
Update modules/administration-guide/partials/proc_viewing-che-metrics…
dkwon17 Apr 20, 2023
90b2444
Update modules/administration-guide/partials/proc_viewing-che-metrics…
dkwon17 Apr 20, 2023
637b970
Update modules/administration-guide/partials/proc_viewing-dev-workspa…
dkwon17 Apr 20, 2023
c321680
Update modules/administration-guide/partials/proc_viewing-che-metrics…
dkwon17 Apr 20, 2023
bbfadd2
Update modules/administration-guide/partials/proc_viewing-dev-workspa…
dkwon17 Apr 20, 2023
128c0b3
Update modules/administration-guide/partials/proc_viewing-dev-workspa…
dkwon17 Apr 20, 2023
0daa7c1
Update modules/administration-guide/partials/proc_collecting-dev-work…
dkwon17 Apr 20, 2023
271a7c1
Update modules/administration-guide/partials/proc_collecting-dev-work…
dkwon17 Apr 20, 2023
0ac7c26
Update modules/administration-guide/partials/proc_collecting-che-metr…
dkwon17 Apr 20, 2023
e92491e
Update modules/administration-guide/partials/proc_collecting-che-metr…
dkwon17 Apr 20, 2023
c1b6866
Update modules/administration-guide/partials/proc_collecting-che-metr…
dkwon17 Apr 20, 2023
fa9e555
Update modules/administration-guide/partials/proc_viewing-dev-workspa…
dkwon17 Apr 20, 2023
65c504c
Update modules/administration-guide/partials/proc_viewing-che-metrics…
dkwon17 Apr 20, 2023
68db756
Add suggestion for troubleshoot tips in DWO
dkwon17 Apr 20, 2023
dbf8709
Update modules/administration-guide/partials/proc_viewing-dev-workspa…
dkwon17 Apr 21, 2023
27f3dd4
Update modules/administration-guide/partials/proc_viewing-dev-workspa…
dkwon17 Apr 21, 2023
d776e3d
Update modules/administration-guide/partials/ref_grafana-dashboards-f…
dkwon17 Apr 21, 2023
995453a
Update modules/administration-guide/partials/proc_collecting-dev-work…
dkwon17 Apr 21, 2023
fac3016
Update modules/administration-guide/partials/proc_collecting-dev-work…
dkwon17 Apr 21, 2023
2b58837
Update modules/administration-guide/partials/proc_viewing-che-metrics…
dkwon17 Apr 21, 2023
f9a49f8
Update modules/administration-guide/partials/proc_collecting-che-metr…
dkwon17 Apr 21, 2023
0e7695f
Hardcode OpenShift-related values
dkwon17 Apr 21, 2023
1c0bfac
Add empty lines withing the TIP text
dkwon17 Apr 21, 2023
5e8b36c
Create disclaimer file
dkwon17 Apr 21, 2023
80f9351
Updates to make the JVM Server metrics page more similar to DWO metri…
dkwon17 Apr 21, 2023
07450c7
Update modules/administration-guide/partials/proc_collecting-che-metr…
dkwon17 Apr 21, 2023
04b226b
Update modules/administration-guide/partials/proc_viewing-che-metrics…
dkwon17 Apr 21, 2023
c97ad6e
Update modules/administration-guide/partials/proc_viewing-che-metrics…
dkwon17 Apr 21, 2023
5a0f7df
Update Note admonition
dkwon17 Apr 21, 2023
e5980bb
Remove unnecessary +
dkwon17 Apr 21, 2023
cb72b29
Replace two AsciIDoc periods with AsciiDoc asterisks.
max-cx Apr 24, 2023
2 changes: 0 additions & 2 deletions modules/administration-guide/nav.adoc
@@ -49,8 +49,6 @@
**** xref:creating-a-telemetry-plugin.adoc[]
*** xref:configuring-server-logging.adoc[]
*** xref:collecting-logs-using-chectl.adoc[]
*** xref:monitoring-with-prometheus-and-grafana.adoc[]
**** xref:installing-prometheus-and-grafana.adoc[]
**** xref:monitoring-the-dev-workspace-operator.adoc[]
**** xref:monitoring-che.adoc[]
** xref:configuring-networking.adoc[]

This file was deleted.

@@ -8,8 +8,7 @@
[id="monitoring-the-dev-workspace-operator"]
= Monitoring the {devworkspace} Operator


You can configure an example monitoring stack to process metrics exposed by the {devworkspace} Operator.
You can configure the OpenShift in-cluster monitoring stack to scrape metrics exposed by the {devworkspace} Operator.

include::partial$proc_collecting-dev-workspace-operator-metrics-with-prometheus.adoc[leveloffset=+1]


This file was deleted.

@@ -3,69 +3,131 @@
[id="collecting-{prod-id-short}-metrics-with-prometheus"]
= Collecting {prod-short} Server metrics with Prometheus

To use Prometheus to collect, store, and query JVM metrics for {prod-short} Server:
To use the in-cluster Prometheus instance to collect, store, and query JVM metrics for {prod-short} Server:

.Prerequisites

* {prod-short} is exposing metrics on port `8087`. See xref:enabling-and-exposing-{prod-id-short}-metrics[Enabling and exposing {prod-short} server JVM metrics].
* Your organization's instance of {prod-short} is installed and running in Red Hat OpenShift.

* An active `oc` session with administrative permissions to the destination OpenShift cluster. See link:https://docs.openshift.com/container-platform/{ocp4-ver}/cli_reference/openshift_cli/getting-started-cli.html[Getting started with the CLI].

* Prometheus 2.26.0 or later is running. The Prometheus console is running on port `9090` with a corresponding Service. See link:https://prometheus.io/docs/introduction/first_steps/[First steps with Prometheus].
* {prod-short} is exposing metrics on port `8087`. See xref:enabling-and-exposing-{prod-id-short}-metrics[Enabling and exposing {prod-short} server JVM metrics].

.Procedure

. Configure Prometheus to scrape metrics from port `8087`.
. Create the ServiceMonitor for detecting the {prod-short} JVM metrics Service.
+
NOTE: The xref:installing-prometheus-and-grafana.adoc[example monitoring stack] already creates the `prometheus-config` ConfigMap with an empty configuration. To provide the Prometheus configuration details, edit the `data` field of the ConfigMap.
.ServiceMonitor
====
[source,yaml,subs="+quotes,+attributes,+macros"]
----
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: che-host
  namespace: {prod-namespace} <1>
spec:
  endpoints:
    - interval: 10s <2>
      port: metrics
      scheme: http
  namespaceSelector:
    matchNames:
      - openshift-devspaces
  selector:
    matchLabels:
      app.kubernetes.io/name: devspaces
----
<1> The {prod-short} namespace. The default is `{prod-namespace}`.
<2> The rate at which a target is scraped.
====

. Create a Role and RoleBinding to allow Prometheus to view the metrics.

+
.Role
====
[source,yaml,subs="+quotes,+attributes,+macros"]
----
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: prometheus-k8s
  namespace: {prod-namespace} <1>
rules:
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - ''
    resources:
      - services
      - endpoints
      - pods
----
<1> The {prod-short} namespace. The default is `{prod-namespace}`.
====

+
.Prometheus configuration
.RoleBinding
====
[source,yaml,subs="+quotes,+attributes,+macros"]
----
apiVersion: v1
kind: ConfigMap
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: prometheus-config
data:
prometheus.yml: |-
global:
scrape_interval: 5s <1>
evaluation_interval: 5s <2>
scrape_configs: <3>
- job_name: '{prod-short} Server'
static_configs:
- targets: ['che-host.__<{prod-short}_{orch-namespace}>__:8087'] <4>
name: view-openshift-monitoring-prometheus-k8s
namespace: {prod-namespace} <1>
subjects:
- kind: ServiceAccount
name: prometheus-k8s
namespace: openshift-monitoring
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: prometheus-k8s
----
<1> The rate at which a target is scraped.
<2> The rate at which the recording and alerting rules are re-checked.
<3> The resources that Prometheus monitors. In the default configuration, a single job, `{prod-short} Server`, scrapes the time series data exposed by {prod-short} Server.
<4> The scrape target for the metrics from port `8087`. Replace `__<{prod-short}_{orch-namespace}>__` with the {prod-short} {orch-namespace}. The default {prod-short} {orch-namespace} is `{prod-namespace}`.
<1> The {prod-short} namespace. The default is `{prod-namespace}`.
====

. Scale the `Prometheus` Deployment down and up to read the updated ConfigMap from the previous step.
. Allow the in-cluster Prometheus instance to detect the ServiceMonitor in the {prod-short} namespace. The default {prod-short} namespace is `{prod-namespace}`.
+
[source,terminal,subs="+attributes,quotes"]
----
$ {orch-cli} scale --replicas=0 deployment/prometheus -n monitoring && {orch-cli} scale --replicas=1 deployment/prometheus -n monitoring
$ oc label namespace {prod-namespace} openshift.io/cluster-monitoring=true
----
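
The labeling step can be double-checked from the terminal. The following is a minimal sketch, not part of the documented procedure: it assumes an active `oc` session, and the helper names `has_monitoring_label` and `namespace_is_monitored` are hypothetical. It reads the `LABELS` column printed by `oc get --show-labels` and looks for `openshift.io/cluster-monitoring=true`.

```shell
#!/usr/bin/env bash
# Sketch only: confirm that a namespace carries the cluster-monitoring label.
# Assumes an active `oc` session; both helper names are hypothetical.

# Succeeds when a comma-separated label list (the LABELS column printed by
# `oc get --show-labels`) contains openshift.io/cluster-monitoring=true.
has_monitoring_label() {
  case ",$1," in
    *",openshift.io/cluster-monitoring=true,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Reads the labels of the given namespace from the cluster and checks them.
namespace_is_monitored() {
  local ns="$1" labels
  # The last whitespace-separated column of the output is the label list.
  labels="$(oc get namespace "$ns" --no-headers --show-labels | awk '{print $NF}')"
  has_monitoring_label "$labels"
}
```

For example, `namespace_is_monitored openshift-devspaces && echo "label present"` prints a confirmation only when the label is set on that namespace.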

.Verification

. Use port forwarding to access the `Prometheus` Service locally:
. In the *Administrator* view of the OpenShift web console, go to *Observe* -> *Metrics*.

. Run a PromQL query to confirm that the metrics are available. For example, enter `process_uptime_seconds{job="che-host"}` and click *Run queries*.
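
The same query can also be run from a terminal instead of the web console. This is a hedged sketch rather than part of the documented procedure: it assumes an active `oc` session, that the default `thanos-querier` route exists in the `openshift-monitoring` namespace, and that the route certificate is trusted by `curl`. The function names `promql_for_job` and `query_cluster_metrics` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: run the verification query against the in-cluster monitoring
# stack from a terminal. Function names are hypothetical.

# Builds a PromQL selector of the form metric{job="<job>"}.
promql_for_job() {
  local metric="$1" job="$2"
  printf '%s{job="%s"}' "$metric" "$job"
}

# Sends an instant query to the Prometheus-compatible HTTP API exposed by the
# thanos-querier route and prints the raw JSON response.
query_cluster_metrics() {
  local query="$1" host token
  # Assumption: the default route name and namespace are unchanged.
  host="$(oc get route thanos-querier -n openshift-monitoring -o jsonpath='{.spec.host}')"
  # Token of the currently logged-in user.
  token="$(oc whoami -t)"
  curl -sS -G "https://${host}/api/v1/query" \
    -H "Authorization: Bearer ${token}" \
    --data-urlencode "query=${query}"
}
```

For example, `query_cluster_metrics "$(promql_for_job process_uptime_seconds che-host)"` sends the query from the verification step. A response with an empty `result` array suggests that the target is not being scraped yet.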

[TIP]
====

To troubleshoot missing metrics, view the Prometheus container logs for possible RBAC-related errors:

. Get the name of the Prometheus pod:
+
[source,terminal,subs="+attributes,quotes"]
[source,terminal,subs="+quotes"]
----
$ {orch-cli} port-forward svc/prometheus 9090:9090 -n monitoring
$ oc get pods -l app.kubernetes.io/name=prometheus -n openshift-monitoring -o=jsonpath='{.items[*].metadata.name}'
----
. Verify that all targets are up by viewing the `targets` endpoint at `localhost:9090/targets`.
. Use the Prometheus console to view and query metrics:
** View metrics at `localhost:9090/metrics`.
** Query metrics from `localhost:9090/graph`.

. Print the last 20 lines of the `prometheus` container logs for the pod from the previous step:
+
For more information, see link:https://prometheus.io/docs/introduction/first_steps/#using-the-expression-browser[Using the expression browser].
[source,terminal,subs="+quotes"]
----
$ oc logs --tail=20 __<prometheus_pod_name>__ -c prometheus -n openshift-monitoring
----

====

[role="_additional-resources"]
.Additional resources

* link:https://prometheus.io/docs/prometheus/latest/configuration/configuration/[Configuring Prometheus]

* link:https://prometheus.io/docs/prometheus/latest/querying/basics/[Querying Prometheus]
