[Improve] Improve clickhouse monitor And Improve Pulsar monitor #2015

Merged: 4 commits, May 21, 2024
139 changes: 80 additions & 59 deletions home/docs/help/clickhouse.md
@@ -1,73 +1,94 @@
---
id: clickhouse
title: Monitoring:Clickhouse database monitoring
sidebar_label: Clickhouse database
keywords: [open source monitoring tool, open source database monitoring tool, monitoring clickhouse database metrics]
title: Monitoring ClickHouse Database Monitoring
sidebar_label: ClickHouse Database
keywords: [open source monitoring system, open source database monitoring, ClickHouse database monitoring]
---
> Collect and monitor general performance metrics for the ClickHouse database.

> Collect and monitor the general performance Metrics of Clickhouse database.
### Configuration Parameters

### Configuration parameter
| Parameter Name | Parameter Description |
| -------------- | ------------------------------------------------------------------------- |
| Monitor Host   | IPv4 address, IPv6 address, or domain name of the monitored host. ⚠️ Do not include a protocol prefix (e.g., https://, http://). |
| Task Name      | Name that identifies this monitoring task; it must be unique.            |
| Port | Port number of the database exposed to the outside, default is 8123. |
| Query Timeout | Timeout for SQL queries to respond, in milliseconds (ms), default is 6000ms. |
| Database Name | Name of the database instance, optional. |
| Username | Username for database connection, optional. |
| Password | Password for database connection, optional. |
| Collection Interval | Interval for periodic data collection during monitoring, in seconds, with a minimum interval of 30 seconds. |
| Tag Binding | Used for categorizing and managing monitored resources. |
| Description | Additional information to identify and describe this monitoring, where users can add remarks. |
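
The constraints in the table above can be checked before a monitor is created. The following is a minimal Python sketch, not HertzBeat's own validation code: the `MonitorParams` class and the `validate` function are hypothetical names introduced for illustration, and the port range check is an added assumption. Only the defaults (port 8123, 6000 ms timeout) and the 30-second minimum interval come from the table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MonitorParams:
    """Hypothetical container mirroring the parameter table above."""
    host: str
    task_name: str
    port: int = 8123              # default from the table
    query_timeout_ms: int = 6000  # default from the table
    database: Optional[str] = None
    username: Optional[str] = None
    password: Optional[str] = None
    interval_s: int = 30          # minimum allowed by the table


def validate(params: MonitorParams) -> List[str]:
    """Return a list of problems; an empty list means the parameters pass."""
    problems = []
    if "://" in params.host:
        problems.append("host must not include a protocol prefix")
    if not params.task_name.strip():
        problems.append("task name must not be empty")
    if not 1 <= params.port <= 65535:  # assumption: ordinary TCP port range
        problems.append("port must be between 1 and 65535")
    if params.query_timeout_ms <= 0:
        problems.append("query timeout must be positive")
    if params.interval_s < 30:
        problems.append("collection interval must be at least 30 seconds")
    return problems
```

For example, `validate(MonitorParams(host="http://10.0.0.1", task_name="ch-1"))` reports the protocol prefix, while a bare host name or IP address passes.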

| Parameter name | Parameter help description |
|--------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Monitoring Host | Monitored IPV4, IPV6 or domain name. Note⚠️Without protocol header (eg: https://, http://) |
| Monitoring name | Identify the name of this monitoring. The name needs to be unique |
| Port | Port provided by the database. The default is 8123 |
| Query timeout | Set the timeout time when SQL query does not respond to data, unit: ms, default: 6000ms |
| Database name | Database instance name, optional |
| Username | Database connection user name, optional |
| Password | Database connection password, optional |
| Collection interval | Interval time of monitor periodic data collection, unit: second, and the minimum interval that can be set is 30 seconds |
| Bind Tags | Used to classify and manage monitoring resources |
| Description remarks | For more information about identifying and describing this monitoring, users can note information here |
### Collected Metrics

### Collection Metric
#### Metric Set: ping Availability

#### Metric set:ping_available
| Metric Name | Metric Unit | Metric Description |
| ------------- | ----------- | ------------------ |
| responseTime | N/A | Response time |

| Metric name | Metric unit | Metric help description |
| ----------- | ----------- |------------------------------------|
| responseTime | none | response time |
#### Metric set:Data in the system.metrics table
#### Metric Set: Data from system.metrics table

| Metric name | Metric unit | Metric help description |
| ----------- |-------------| ----------- |
| Query | none | Number of executing queries |
| Merge | none | Number of executing background merges |
| PartMutation | none | Number of mutations (ALTER DELETE/UPDATE) |
| ReplicatedFetch| none | Number of data parts being fetched from replica |
| ReplicatedSend| none | Number of data parts being sent to replicas |
| ReplicatedChecks| none | Number of data parts checking for consistency |
| BackgroundMergesAndMutationsPoolTask| none | Number of active merges and mutations in an associated background pool |
| BackgroundFetchesPoolTask| none | Number of active fetches in an associated background pool |
| BackgroundCommonPoolTask| none | Number of active tasks in an associated background pool |
| BackgroundMovePoolTask| none | Number of active tasks in BackgroundProcessingPool for moves |
| Metric Name | Metric Unit | Metric Description |
| ---------------------- | ----------- | ------------------------------------------------------------- |
| Query | N/A | Number of queries being executed |
| Merge | N/A | Number of background merges being executed |
| Move | N/A | Number of background moves being executed |
| PartMutation           | N/A         | Number of mutations (ALTER DELETE/UPDATE)                     |
| ReplicatedFetch        | N/A         | Number of data parts being fetched from a replica             |
| ReplicatedSend         | N/A         | Number of data parts being sent to replicas                   |
| ReplicatedChecks       | N/A         | Number of data parts being checked for consistency            |
| QueryPreempted         | N/A         | Number of queries stopped and waiting due to the priority setting |
| TCPConnection | N/A | Number of TCP connections |
| HTTPConnection | N/A | Number of HTTP connections |
| OpenFileForRead | N/A | Number of open readable files |
| OpenFileForWrite | N/A | Number of open writable files |
| QueryThread | N/A | Number of threads processing queries |
| ReadonlyReplica | N/A | Number of Replicated tables in read-only state |
| EphemeralNode | N/A | Number of ephemeral nodes in ZooKeeper |
| ZooKeeperWatch | N/A | Number of ZooKeeper event subscriptions |
| StorageBufferBytes | Bytes | Bytes in Buffer tables |
| VersionInteger | N/A | ClickHouse version number |
| RWLockWaitingReaders   | N/A         | Number of threads waiting for read lock on a table            |
| RWLockWaitingWriters | N/A | Number of threads waiting for write lock on a table |
| RWLockActiveReaders | N/A | Number of threads holding read lock on a table |
| RWLockActiveWriters | N/A | Number of threads holding write lock on a table |
| GlobalThread | N/A | Number of threads in global thread pool |
| GlobalThreadActive | N/A | Number of active threads in global thread pool |
| LocalThread | N/A | Number of threads in local thread pool |
| LocalThreadActive | N/A | Number of active threads in local thread pool |
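
To spot-check these values outside the monitor, the same rows can be read from `system.metrics` over ClickHouse's HTTP interface, which listens on port 8123 by default. The sketch below is an illustration only: it does not show how HertzBeat itself collects the data, and `build_metrics_url` and `parse_tsv` are hypothetical helper names. It only builds the request URL and parses a tab-separated response, so no server is needed to run it.

```python
from typing import Dict, Iterable
from urllib.parse import quote, urlencode


def build_metrics_url(host: str, metrics: Iterable[str], port: int = 8123) -> str:
    """Build a GET URL that selects the named rows from system.metrics.

    Note: an IPv6 literal would need square brackets around it in the URL;
    this sketch does not add them.
    """
    names = list(metrics)
    for name in names:
        # Accept only plain alphanumeric names so nothing needs SQL escaping.
        if not name.isalnum():
            raise ValueError(f"unexpected metric name: {name!r}")
    in_list = ", ".join(f"'{n}'" for n in names)
    sql = (
        "SELECT metric, value FROM system.metrics "
        f"WHERE metric IN ({in_list}) FORMAT TabSeparated"
    )
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"http://{host}:{port}/?" + urlencode({"query": sql}, quote_via=quote)


def parse_tsv(body: str) -> Dict[str, int]:
    """Parse 'metric<TAB>value' lines into a dictionary."""
    result = {}
    for line in body.splitlines():
        if not line:
            continue
        name, value = line.split("\t")
        result[name] = int(value)
    return result
```

The URL can be fetched with any HTTP client (for example `urllib.request.urlopen`), passing credentials if the server requires them, and the response body handed to `parse_tsv`.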

#### Metric Set: Data from system.events table

#### Metric set:Data for the system.events table
| Metric Name | Metric Unit | Metric Description |
| ------------------------------------- | ----------- | ---------------------------------------------------------------------------------------------------- |
| Query                                 | N/A         | Number of queries to be interpreted and potentially executed. Excludes queries that failed to parse or were rejected due to AST size limits, quota limits, or limits on the number of simultaneously running queries. May include internal queries initiated by ClickHouse itself. Subqueries are not counted. |
| SelectQuery                           | N/A         | Same as Query, but only for SELECT queries                                                          |
| InsertQuery                           | N/A         | Same as Query, but only for INSERT queries                                                          |
| InsertedRows | N/A | Number of rows inserted into all tables |
| InsertedBytes | Bytes | Number of bytes inserted into all tables |
| FailedQuery | N/A | Number of failed queries |
| FailedSelectQuery | N/A | Number of failed Select queries |
| FileOpen | N/A | Number of file openings |
| MergeTreeDataWriterRows | N/A | Number of data rows written to MergeTree tables |
| MergeTreeDataWriterCompressedBytes | Bytes | Number of compressed data bytes written to MergeTree tables |
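
The values in `system.events` are cumulative counters since server start, so a single reading says little on its own; the difference between two collections is usually what gets charted. The following minimal sketch shows one way to turn two samples into per-second rates. The `event_rates` function is a hypothetical helper, not part of HertzBeat, and it assumes that a counter going backwards means the server restarted and that an event absent from the earlier sample had a count of zero.

```python
from typing import Dict


def event_rates(prev: Dict[str, int], curr: Dict[str, int], interval_s: float) -> Dict[str, float]:
    """Convert two cumulative samples into per-second rates."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    rates = {}
    for name, value in curr.items():
        # Assumption: an event missing from the earlier sample counted zero.
        delta = value - prev.get(name, 0)
        if delta < 0:
            # Assumption: the counter was reset by a restart, so count from zero.
            delta = value
        rates[name] = delta / interval_s
    return rates
```

With a 30-second collection interval, samples of `{"Query": 100}` and then `{"Query": 160}` give a rate of 2.0 queries per second.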

| Metric name | Metric unit | Metric help description |
| ----------- |-------------| ----------- |
| Query | none | Number of queries to be interpreted and potentially executed. Does not include queries that failed to parse or were rejected due to AST size limits, quota limits or limits on the number of simultaneously running queries. May include internal queries initiated by ClickHouse itself. Does not count subqueries. |
| SelectQuery | none | Same as Query, but only for SELECT queries. |
| FailedQuery | none | Number of failed queries. |
| FailedSelectQuery | none | Same as FailedQuery, but only for SELECT queries. |
| QueryTimeMicroseconds | none | Total time of all queries. |


#### Metric set:Data from the system.asynchronous_metrics table

| Metric name | Metric unit | Metric help description |
| ----------- |-------------| ----------- |
| AsynchronousMetricsCalculationTimeSpent | none | Time spent on asynchronous metrics calculation. |
| jemalloc.arenas.all.muzzy_purged | none | Number of muzzy pages purged. |
| jemalloc.arenas.all.dirty_purged | none | Number of dirty pages purged. |
| BlockReadBytes_ram1 | none | Number of bytes read from RAM. |
| jemalloc.background_thread.run_intervals | none | Number of background thread run intervals. |
| BlockQueueTime_nbd13 | none | Time spent in block queue. |
| jemalloc.background_thread.num_threads | none | Number of background threads. |
| jemalloc.resident | none | Resident memory size. |
| InterserverThreads | none | Number of inter-server threads. |
| BlockWriteMerges_nbd7 | none | Number of block write merges. |
#### Metric Set: Data from system.asynchronous_metrics table

| Metric Name | Metric Unit | Metric Description |
| -------------------------------------- | ----------- | -------------------------------------- |
| AsynchronousMetricsCalculationTimeSpent | N/A | Time spent calculating asynchronous metrics (seconds) |
| jemalloc.arenas.all.muzzy_purged | N/A | Number of purged muzzy pages |
| jemalloc.arenas.all.dirty_purged | N/A | Number of purged dirty pages |
| BlockReadBytes_ram1                    | N/A         | Number of bytes read from the ram1 block device |
| jemalloc.background_thread.run_intervals | N/A      | Number of intervals jemalloc background thread ran |
| BlockQueueTime_nbd13                   | N/A         | Time I/O requests spent queued on the nbd13 block device |
| jemalloc.background_thread.num_threads | N/A | Number of jemalloc background threads |
| jemalloc.resident | N/A | Physical memory size allocated by jemalloc (bytes) |
| InterserverThreads | N/A | Number of Interserver threads |
| BlockWriteMerges_nbd7                  | N/A         | Number of merged write operations on the nbd7 block device |
| MarkCacheBytes                         | N/A         | Size of the mark cache, in bytes       |
| MarkCacheFiles                         | N/A         | Number of mark files in the mark cache |
| MaxPartCountForPartition               | N/A         | Maximum number of active data parts in a single partition |