-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[exporter/clickhouse] export metrics to clickhouse #16477
[exporter/clickhouse] export metrics to clickhouse #16477
Conversation
@Frapschen you need to fix all faileds CI jobs. :-P follwoing scrpits may be helpful: make addlicense
make goporto |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awsome work. could you add some design and example to README.md?
Foresight Summary
View More Details✅ tracegen workflow has finished in 8 minutes 21 seconds (
|
Job | Failed Steps | Tests | |
---|---|---|---|
build-dev | - 🔗 | N/A | See Details |
publish-stable | - 🔗 | N/A | See Details |
publish-latest | - 🔗 | N/A | See Details |
✅ telemetrygen workflow has finished in 9 minutes 33 seconds (⚠️ 7 minutes 2 seconds more than main
branch avg.) and finished at 19th Jan, 2023.
Job | Failed Steps | Tests | |
---|---|---|---|
build-dev | - 🔗 | N/A | See Details |
publish-latest | - 🔗 | N/A | See Details |
publish-stable | - 🔗 | N/A | See Details |
✅ check-links workflow has finished in 17 minutes 1 second (⚠️ 15 minutes 16 seconds more than main
branch avg.) and finished at 19th Jan, 2023.
Job | Failed Steps | Tests | |
---|---|---|---|
changed files | - 🔗 | N/A | See Details |
check-links | - 🔗 | N/A | See Details |
✅ prometheus-compliance-tests workflow has finished in 18 minutes 45 seconds (⚠️ 11 minutes 18 seconds more than main
branch avg.) and finished at 19th Jan, 2023.
Job | Failed Steps | Tests | |
---|---|---|---|
prometheus-compliance-tests | - 🔗 | ✅ 21 ❌ 0 ⏭ 0 🔗 | See Details |
✅ build-and-test workflow has finished in 1 hour 17 minutes 36 seconds (⚠️ 29 minutes 37 seconds more than main
branch avg.) and finished at 19th Jan, 2023.
Job | Failed Steps | Tests | |
---|---|---|---|
unittest-matrix (1.19, processor) | - 🔗 | ✅ 1488 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.19, internal) | - 🔗 | ✅ 666 ❌ 0 ⏭ 0 🔗 | See Details |
correctness-traces | - 🔗 | ✅ 17 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.19, extension) | - 🔗 | ✅ 528 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.18, extension) | - 🔗 | ✅ 528 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.18, receiver-0) | - 🔗 | ✅ 2565 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.19, receiver-0) | - 🔗 | ✅ 2565 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.19, exporter) | - 🔗 | ✅ 2500 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.18, exporter) | - 🔗 | ✅ 2500 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.19, other) | - 🔗 | ✅ 4482 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.18, other) | - 🔗 | ✅ 4482 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.18, receiver-1) | - 🔗 | ✅ 1894 ❌ 0 ⏭ 0 🔗 | See Details |
integration-tests | - 🔗 | ✅ 55 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.19, receiver-1) | - 🔗 | ✅ 1894 ❌ 0 ⏭ 0 🔗 | See Details |
setup-environment | - 🔗 | N/A | See Details |
checks | - 🔗 | N/A | See Details |
check-codeowners | - 🔗 | N/A | See Details |
check-collector-module-version | - 🔗 | N/A | See Details |
lint-matrix (receiver-0) | - 🔗 | N/A | See Details |
lint-matrix (receiver-1) | - 🔗 | N/A | See Details |
lint-matrix (processor) | - 🔗 | N/A | See Details |
lint-matrix (extension) | - 🔗 | N/A | See Details |
lint-matrix (exporter) | - 🔗 | N/A | See Details |
lint-matrix (internal) | - 🔗 | N/A | See Details |
unittest-matrix (1.18, processor) | - 🔗 | N/A | See Details |
unittest-matrix (1.18, internal) | - 🔗 | N/A | See Details |
build-examples | - 🔗 | N/A | See Details |
correctness-metrics | - 🔗 | N/A | See Details |
lint-matrix (other) | - 🔗 | N/A | See Details |
unittest (1.19) | - 🔗 | N/A | See Details |
unittest (1.18) | - 🔗 | N/A | See Details |
lint | - 🔗 | N/A | See Details |
cross-compile (darwin, amd64) | - 🔗 | N/A | See Details |
cross-compile (darwin, arm64) | - 🔗 | N/A | See Details |
cross-compile (linux, 386) | - 🔗 | N/A | See Details |
cross-compile (linux, amd64) | - 🔗 | N/A | See Details |
cross-compile (linux, arm) | - 🔗 | N/A | See Details |
cross-compile (linux, arm64) | - 🔗 | N/A | See Details |
cross-compile (linux, ppc64le) | - 🔗 | N/A | See Details |
cross-compile (windows, 386) | - 🔗 | N/A | See Details |
cross-compile (windows, amd64) | - 🔗 | N/A | See Details |
build-package (deb) | - 🔗 | N/A | See Details |
build-package (rpm) | - 🔗 | N/A | See Details |
windows-msi | - 🔗 | N/A | See Details |
publish-check | - 🔗 | N/A | See Details |
publish-stable | - 🔗 | N/A | See Details |
publish-dev | - 🔗 | N/A | See Details |
✅ load-tests workflow has finished in 50 minutes 53 seconds (⚠️ 36 minutes 53 seconds more than main
branch avg.) and finished at 19th Jan, 2023.
Job | Failed Steps | Tests | |
---|---|---|---|
loadtest (TestIdleMode) | - 🔗 | ✅ 1 ❌ 0 ⏭ 0 🔗 | See Details |
loadtest (TestTraceAttributesProcessor) | - 🔗 | ✅ 3 ❌ 0 ⏭ 0 🔗 | See Details |
loadtest (TestMetricResourceProcessor|TestTrace10kSPS) | - 🔗 | ✅ 12 ❌ 0 ⏭ 0 🔗 | See Details |
loadtest (TestMetric10kDPS|TestMetricsFromFile) | - 🔗 | ✅ 6 ❌ 0 ⏭ 0 🔗 | See Details |
loadtest (TestTraceNoBackend10kSPS|TestTrace1kSPSWithAttrs) | - 🔗 | ✅ 8 ❌ 0 ⏭ 0 🔗 | See Details |
loadtest (TestBallastMemory|TestLog10kDPS) | - 🔗 | ✅ 19 ❌ 0 ⏭ 0 🔗 | See Details |
loadtest (TestTraceBallast1kSPSWithAttrs|TestTraceBallast1kSPSAddAttrs) | - 🔗 | ✅ 10 ❌ 0 ⏭ 0 🔗 | See Details |
setup-environment | - 🔗 | N/A | See Details |
⭕ changelog workflow has finished in 3 seconds (2 minutes 1 second less than main
branch avg.) and finished at 19th Jan, 2023.
Job | Failed Steps | Tests | |
---|---|---|---|
changelog | - 🔗 | N/A | See Details |
⭕ build-and-test-windows workflow has finished in 4 seconds (39 minutes 8 seconds less than main
branch avg.) and finished at 19th Jan, 2023.
Job | Failed Steps | Tests | |
---|---|---|---|
windows-unittest-matrix | - 🔗 | N/A | See Details |
windows-unittest | - 🔗 | N/A | See Details |
*You can configure Foresight comments in your organization settings page.
anyone know how to fix this workflow error?
|
pls resolve conflicts. |
0dd0b55
to
010e3b7
Compare
3816a56
to
b565a2f
Compare
All of the work is finished, and ready to merge. @JaredTan95 @hanjm @dmitryax |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
if err := cfg.Validate(); err != nil { | ||
return nil, err | ||
} | ||
if err := createDatabase(cfg); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to move this code to a start method, that runs as part of the exporter lifecycle?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function newMetricsExporter()
is follow the style of newTracesExporter()
, I think we can use another PR to deal with them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I ask you to create an issue to follow that up with @Frapschen ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok,here #17558
func attributesToMap(attributes pcommon.Map) map[string]string { | ||
m := make(map[string]string, attributes.Len()) | ||
attributes.Range(func(k string, v pcommon.Value) bool { | ||
m[k] = v.Str() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not all values are strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In most scenarios, attribute value is string, if it contain other types, the Str()
will return a empy strings, I think it's ok for current usecase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please put a comment to that effect and add tests for this function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be using AsString()
instead of Str()
in the event that non string values are passed in don't cause an empty string to be returned as described by the docs.
c1267c5
to
2273bbf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a decent amount of comments that I would consider need addressing.
I would comfortable to approve this PR once an the *placeholders
are used as needed and not copied into an array of of a constant string.
If you have the time as well, I would appreciate seeing some benchmarks for some of these operations to help quantify the expected performance.
@@ -39,7 +39,7 @@ exporter/awsxrayexporter/ @open-telemetry/collect | |||
exporter/azuremonitorexporter/ @open-telemetry/collector-contrib-approvers @pcwiese | |||
exporter/azuredataexplorerexporter/ @open-telemetry/collector-contrib-approvers @asaharan @ag-ramachandran | |||
exporter/carbonexporter/ @open-telemetry/collector-contrib-approvers @pjanotti | |||
exporter/clickhouseexporter/ @open-telemetry/collector-contrib-approvers @hanjm @dmitryax | |||
exporter/clickhouseexporter/ @open-telemetry/collector-contrib-approvers @hanjm @dmitryax @Frapschen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I think it's ok.
``` | ||
|
||
The OTLP Metrics [define two type value for one datapoint](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/metrics/v1/metrics.proto#L358), | ||
clickhouse only use one value of float64 to store them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does that mean for users sending data with Int
type?
Are the dropped? Automatically converted? Are they required to convert them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
convert to float64 Automatically, this come from #16477 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets make it clear to the user that this is automatically done for you and doesn't require any additional work :)
Before you make a metrics query, you need to know the type of metric you wish to use. If your metrics come from | ||
Prometheus(or someone else uses OpenMetrics protocol), you also need to know the | ||
[compatibility](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/compatibility/prometheus_and_openmetrics.md#prometheus-and-openmetrics-compatibility) | ||
between Prometheus(OpenMetrics) and OTLP Metrics. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like it could leave for a bad UX for user's who want to get started with clickhouse, is there any issues in future that look to improve this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's not friendly for users, considering this is designed for storing data in an organized way which means the user query experience is not the primary. I'm not expecting End Users to direct make a query, I think there needs a query layer to hide those details and give a universal way to query. This needs more discussion and it's out of the exporter's scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it was possible to create an issue that can users can track or follow along, either with Clickhouse or in this project. Just so it isn't forgotten or ignored that it can be improved.
if err := cfg.Validate(); err != nil { | ||
return nil, err | ||
} | ||
if err := createDatabase(cfg); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I ask you to create an issue to follow that up with @Frapschen ?
func attributesToMap(attributes pcommon.Map) map[string]string { | ||
m := make(map[string]string, attributes.Len()) | ||
attributes.Range(func(k string, v pcommon.Value) bool { | ||
m[k] = v.Str() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be using AsString()
instead of Str()
in the event that non string values are passed in don't cause an empty string to be returned as described by the docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like my comments have been addressed so I will remove my requests for changes however, I do believe there is some optimisations to be had with handling strings.
mutex := sync.Mutex{} | ||
var items int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In future, if you wanted to not use a mutex, you could use atomic.Int64
instead :)
return metrics | ||
} | ||
|
||
func mustPushMetricsData(t *testing.T, exporter *metricsExporter, md pmetric.Metrics) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use tb testing.TB
instead of *testing.T
so that they can be used between benchmarks, tests, and fuzzing :)
require.NoError(t, err) | ||
} | ||
|
||
func newTestMetricsExporter(t *testing.T, dsn string, fns ...func(*Config)) *metricsExporter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
createSummaryTableSQL: {}, | ||
} | ||
|
||
var logger *zap.Logger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer not using a package level variable since it can be rather hazardous due to concurrent usage.
// SetLogger set a logger instance | ||
func SetLogger(l *zap.Logger) { | ||
logger = l | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method isn't safe for concurrent usage and not a pattern I'd like to encourage in the project.
If a method / function needs a logger, please add it as part of the params or type instead.
return fmt.Errorf("db.Begin: %w", err) | ||
} | ||
defer func() { | ||
_ = tx.Rollback() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that on each Transaction that a rollback is performed?
rm.Resource().Attributes().PutStr("service.name", "demo 1") | ||
rm.Resource().Attributes().PutStr("Resource Attributes 1", "value1") | ||
rm.Resource().SetDroppedAttributesCount(10) | ||
rm.SetSchemaUrl("Resource SchemaUrl 1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use a properly formatted URL here, http://example.com/schema/1.0.0
is fine to use.
TracesTableName: "otel_traces", | ||
TTLDays: 7, | ||
TimeoutSettings: exporterhelper.NewDefaultTimeoutSettings(), | ||
QueueSettings: QueueSettings{QueueSize: exporterhelper.NewDefaultQueueSettings().QueueSize}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to do:
QueueSettings: exporterhelper.NewDefaultQueueSettings()
CREATE TABLE IF NOT EXISTS %s_exponential_histogram ( | ||
ResourceAttributes Map(LowCardinality(String), String) CODEC(ZSTD(1)), | ||
ResourceSchemaUrl String CODEC(ZSTD(1)), | ||
ScopeName String CODEC(ZSTD(1)), | ||
ScopeVersion String CODEC(ZSTD(1)), | ||
ScopeAttributes Map(LowCardinality(String), String) CODEC(ZSTD(1)), | ||
ScopeDroppedAttrCount UInt32 CODEC(ZSTD(1)), | ||
ScopeSchemaUrl String CODEC(ZSTD(1)), | ||
MetricName String CODEC(ZSTD(1)), | ||
MetricDescription String CODEC(ZSTD(1)), | ||
MetricUnit String CODEC(ZSTD(1)), | ||
Attributes Map(LowCardinality(String), String) CODEC(ZSTD(1)), | ||
StartTimeUnix DateTime64(9) CODEC(Delta, ZSTD(1)), | ||
TimeUnix DateTime64(9) CODEC(Delta, ZSTD(1)), | ||
Count Int64 CODEC(Delta, ZSTD(1)), | ||
Sum Float64 CODEC(ZSTD(1)), | ||
Scale Int32 CODEC(ZSTD(1)), | ||
ZeroCount UInt64 CODEC(ZSTD(1)), | ||
PositiveOffset Int32 CODEC(ZSTD(1)), | ||
PositiveBucketCounts Array(UInt64) CODEC(ZSTD(1)), | ||
NegativeOffset Int32 CODEC(ZSTD(1)), | ||
NegativeBucketCounts Array(UInt64) CODEC(ZSTD(1)), | ||
Exemplars Nested ( | ||
FilteredAttributes Map(LowCardinality(String), String), | ||
TimeUnix DateTime64(9), | ||
Value Float64, | ||
SpanId String, | ||
TraceId String | ||
) CODEC(ZSTD(1)), | ||
Flags UInt32 CODEC(ZSTD(1)), | ||
Min Float64 CODEC(ZSTD(1)), | ||
Max Float64 CODEC(ZSTD(1)) | ||
) ENGINE MergeTree() | ||
%s | ||
PARTITION BY toDate(TimeUnix) | ||
ORDER BY (MetricName, Attributes, toUnixTimestamp64Nano(TimeUnix)) | ||
SETTINGS index_granularity=8192, ttl_only_drop_parts = 1; | ||
` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way this can be tested?
|
||
start := time.Now() | ||
err := doWithTx(ctx, db, func(tx *sql.Tx) error { | ||
_, err := tx.ExecContext(ctx, fmt.Sprintf("%s %s", e.insertSQL, strings.TrimSuffix(b.String(), ",")), valueArgs...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a concern with large amounts of inserts being done within the same transaction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no. it's not a transaction, clickhouse driver will do batch insert.
Leaving final approval for code owner
Description: export metrics to clickhouse
This is a init pr for clickhouse metrics exporter. please get me some suggestions, I can work on.
issue: #16478