Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[exporter/clickhouse] export metrics to clickhouse #16477

Merged
merged 18 commits into from
Jan 20, 2023

Conversation

Frapschen
Copy link
Contributor

@Frapschen Frapschen commented Nov 25, 2022

Description: export metrics to clickhouse

This is a init pr for clickhouse metrics exporter. please get me some suggestions, I can work on.

issue: #16478

@JaredTan95
Copy link
Member

JaredTan95 commented Nov 25, 2022

@Frapschen you need to fix all faileds CI jobs. :-P

follwoing scrpits may be helpful:

make addlicense
make goporto

Copy link
Member

@hanjm hanjm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awsome work. could you add some design and example to README.md?

exporter/clickhouseexporter/internal/metrics_model.go Outdated Show resolved Hide resolved
exporter/clickhouseexporter/internal/metrics_model.go Outdated Show resolved Hide resolved
@github-actions github-actions bot requested a review from hanjm November 27, 2022 14:29
@github-actions github-actions bot requested a review from hanjm November 29, 2022 09:21
@runforesight
Copy link

runforesight bot commented Dec 4, 2022

Foresight Summary

    
Major Impacts

build-and-test-windows duration(4 seconds) has decreased 39 minutes 8 seconds compared to main branch avg(39 minutes 12 seconds).
View More Details

✅  tracegen workflow has finished in 8 minutes 21 seconds (⚠️ 6 minutes more than main branch avg.) and finished at 19th Jan, 2023.


Job Failed Steps Tests
build-dev -     🔗  N/A See Details
publish-stable -     🔗  N/A See Details
publish-latest -     🔗  N/A See Details

✅  telemetrygen workflow has finished in 9 minutes 33 seconds (⚠️ 7 minutes 2 seconds more than main branch avg.) and finished at 19th Jan, 2023.


Job Failed Steps Tests
build-dev -     🔗  N/A See Details
publish-latest -     🔗  N/A See Details
publish-stable -     🔗  N/A See Details

✅  check-links workflow has finished in 17 minutes 1 second (⚠️ 15 minutes 16 seconds more than main branch avg.) and finished at 19th Jan, 2023.


Job Failed Steps Tests
changed files -     🔗  N/A See Details
check-links -     🔗  N/A See Details

✅  prometheus-compliance-tests workflow has finished in 18 minutes 45 seconds (⚠️ 11 minutes 18 seconds more than main branch avg.) and finished at 19th Jan, 2023.


Job Failed Steps Tests
prometheus-compliance-tests -     🔗  ✅ 21  ❌ 0  ⏭ 0    🔗 See Details

✅  build-and-test workflow has finished in 1 hour 17 minutes 36 seconds (⚠️ 29 minutes 37 seconds more than main branch avg.) and finished at 19th Jan, 2023.


Job Failed Steps Tests
unittest-matrix (1.19, processor) -     🔗  ✅ 1488  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, internal) -     🔗  ✅ 666  ❌ 0  ⏭ 0    🔗 See Details
correctness-traces -     🔗  ✅ 17  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, extension) -     🔗  ✅ 528  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.18, extension) -     🔗  ✅ 528  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.18, receiver-0) -     🔗  ✅ 2565  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, receiver-0) -     🔗  ✅ 2565  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, exporter) -     🔗  ✅ 2500  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.18, exporter) -     🔗  ✅ 2500  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, other) -     🔗  ✅ 4482  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.18, other) -     🔗  ✅ 4482  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.18, receiver-1) -     🔗  ✅ 1894  ❌ 0  ⏭ 0    🔗 See Details
integration-tests -     🔗  ✅ 55  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, receiver-1) -     🔗  ✅ 1894  ❌ 0  ⏭ 0    🔗 See Details
setup-environment -     🔗  N/A See Details
checks -     🔗  N/A See Details
check-codeowners -     🔗  N/A See Details
check-collector-module-version -     🔗  N/A See Details
lint-matrix (receiver-0) -     🔗  N/A See Details
lint-matrix (receiver-1) -     🔗  N/A See Details
lint-matrix (processor) -     🔗  N/A See Details
lint-matrix (extension) -     🔗  N/A See Details
lint-matrix (exporter) -     🔗  N/A See Details
lint-matrix (internal) -     🔗  N/A See Details
unittest-matrix (1.18, processor) -     🔗  N/A See Details
unittest-matrix (1.18, internal) -     🔗  N/A See Details
build-examples -     🔗  N/A See Details
correctness-metrics -     🔗  N/A See Details
lint-matrix (other) -     🔗  N/A See Details
unittest (1.19) -     🔗  N/A See Details
unittest (1.18) -     🔗  N/A See Details
lint -     🔗  N/A See Details
cross-compile (darwin, amd64) -     🔗  N/A See Details
cross-compile (darwin, arm64) -     🔗  N/A See Details
cross-compile (linux, 386) -     🔗  N/A See Details
cross-compile (linux, amd64) -     🔗  N/A See Details
cross-compile (linux, arm) -     🔗  N/A See Details
cross-compile (linux, arm64) -     🔗  N/A See Details
cross-compile (linux, ppc64le) -     🔗  N/A See Details
cross-compile (windows, 386) -     🔗  N/A See Details
cross-compile (windows, amd64) -     🔗  N/A See Details
build-package (deb) -     🔗  N/A See Details
build-package (rpm) -     🔗  N/A See Details
windows-msi -     🔗  N/A See Details
publish-check -     🔗  N/A See Details
publish-stable -     🔗  N/A See Details
publish-dev -     🔗  N/A See Details

✅  load-tests workflow has finished in 50 minutes 53 seconds (⚠️ 36 minutes 53 seconds more than main branch avg.) and finished at 19th Jan, 2023.


Job Failed Steps Tests
loadtest (TestIdleMode) -     🔗  ✅ 1  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestTraceAttributesProcessor) -     🔗  ✅ 3  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestMetricResourceProcessor|TestTrace10kSPS) -     🔗  ✅ 12  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestMetric10kDPS|TestMetricsFromFile) -     🔗  ✅ 6  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestTraceNoBackend10kSPS|TestTrace1kSPSWithAttrs) -     🔗  ✅ 8  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestBallastMemory|TestLog10kDPS) -     🔗  ✅ 19  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestTraceBallast1kSPSWithAttrs|TestTraceBallast1kSPSAddAttrs) -     🔗  ✅ 10  ❌ 0  ⏭ 0    🔗 See Details
setup-environment -     🔗  N/A See Details

⭕  changelog workflow has finished in 3 seconds (2 minutes 1 second less than main branch avg.) and finished at 19th Jan, 2023.


Job Failed Steps Tests
changelog -     🔗  N/A See Details

⭕  build-and-test-windows workflow has finished in 4 seconds (39 minutes 8 seconds less than main branch avg.) and finished at 19th Jan, 2023.


Job Failed Steps Tests
windows-unittest-matrix -     🔗  N/A See Details
windows-unittest -     🔗  N/A See Details

🔎 See details on Foresight

*You can configure Foresight comments in your organization settings page.

@Frapschen
Copy link
Contributor Author

anyone know how to fix this workflow error?

Error: factory.go:110:6: SA1019: component.ExporterConfig is deprecated: [v0.67.0] use Config. (staticcheck)
	cfg component.ExporterConfig,

@JaredTan95
Copy link
Member

pls resolve conflicts.

@Frapschen Frapschen force-pushed the clickhouse_metrics_exporter branch from 0dd0b55 to 010e3b7 Compare December 8, 2022 10:34
exporter/clickhouseexporter/README.md Outdated Show resolved Hide resolved
@Frapschen Frapschen force-pushed the clickhouse_metrics_exporter branch 2 times, most recently from 3816a56 to b565a2f Compare December 11, 2022 09:43
@Frapschen
Copy link
Contributor Author

Frapschen commented Dec 11, 2022

All of the work is finished, and ready to merge. @JaredTan95 @hanjm @dmitryax

Copy link
Member

@JaredTan95 JaredTan95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

if err := cfg.Validate(); err != nil {
return nil, err
}
if err := createDatabase(cfg); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to move this code to a start method, that runs as part of the exporter lifecycle?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function newMetricsExporter() is follow the style of newTracesExporter(), I think we can use another PR to deal with them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I ask you to create an issue to follow that up with @Frapschen ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok,here #17558

func attributesToMap(attributes pcommon.Map) map[string]string {
m := make(map[string]string, attributes.Len())
attributes.Range(func(k string, v pcommon.Value) bool {
m[k] = v.Str()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not all values are strings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In most scenarios, attribute value is string, if it contain other types, the Str() will return a empy strings, I think it's ok for current usecase.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please put a comment to that effect and add tests for this function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be using AsString() instead of Str() in the event that non string values are passed in don't cause an empty string to be returned as described by the docs.

@Frapschen Frapschen force-pushed the clickhouse_metrics_exporter branch from c1267c5 to 2273bbf Compare January 10, 2023 10:41
Copy link
Contributor

@MovieStoreGuy MovieStoreGuy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a decent amount of comments that I would consider need addressing.

I would comfortable to approve this PR once an the *placeholders are used as needed and not copied into an array of of a constant string.

If you have the time as well, I would appreciate seeing some benchmarks for some of these operations to help quantify the expected performance.

@@ -39,7 +39,7 @@ exporter/awsxrayexporter/ @open-telemetry/collect
exporter/azuremonitorexporter/ @open-telemetry/collector-contrib-approvers @pcwiese
exporter/azuredataexplorerexporter/ @open-telemetry/collector-contrib-approvers @asaharan @ag-ramachandran
exporter/carbonexporter/ @open-telemetry/collector-contrib-approvers @pjanotti
exporter/clickhouseexporter/ @open-telemetry/collector-contrib-approvers @hanjm @dmitryax
exporter/clickhouseexporter/ @open-telemetry/collector-contrib-approvers @hanjm @dmitryax @Frapschen
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets check with @hanjm and @dmitryax if they are okay with an additional code owner.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I think it's ok.

```

The OTLP Metrics [define two type value for one datapoint](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/metrics/v1/metrics.proto#L358),
clickhouse only use one value of float64 to store them.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does that mean for users sending data with Int type?
Are the dropped? Automatically converted? Are they required to convert them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convert to float64 Automatically, this come from #16477 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets make it clear to the user that this is automatically done for you and doesn't require any additional work :)

Comment on lines +230 to +233
Before you make a metrics query, you need to know the type of metric you wish to use. If your metrics come from
Prometheus(or someone else uses OpenMetrics protocol), you also need to know the
[compatibility](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/compatibility/prometheus_and_openmetrics.md#prometheus-and-openmetrics-compatibility)
between Prometheus(OpenMetrics) and OTLP Metrics.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it could leave for a bad UX for user's who want to get started with clickhouse, is there any issues in future that look to improve this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's not friendly for users, considering this is designed for storing data in an organized way which means the user query experience is not the primary. I'm not expecting End Users to direct make a query, I think there needs a query layer to hide those details and give a universal way to query. This needs more discussion and it's out of the exporter's scope.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it was possible to create an issue that can users can track or follow along, either with Clickhouse or in this project. Just so it isn't forgotten or ignored that it can be improved.

if err := cfg.Validate(); err != nil {
return nil, err
}
if err := createDatabase(cfg); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I ask you to create an issue to follow that up with @Frapschen ?

exporter/clickhouseexporter/exporter_metrics.go Outdated Show resolved Hide resolved
func attributesToMap(attributes pcommon.Map) map[string]string {
m := make(map[string]string, attributes.Len())
attributes.Range(func(k string, v pcommon.Value) bool {
m[k] = v.Str()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be using AsString() instead of Str() in the event that non string values are passed in don't cause an empty string to be returned as described by the docs.

exporter/clickhouseexporter/internal/metrics_model.go Outdated Show resolved Hide resolved
exporter/clickhouseexporter/internal/metrics_model.go Outdated Show resolved Hide resolved
exporter/clickhouseexporter/internal/metrics_model.go Outdated Show resolved Hide resolved
exporter/clickhouseexporter/internal/metrics_model.go Outdated Show resolved Hide resolved
Copy link
Contributor

@MovieStoreGuy MovieStoreGuy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like my comments have been addressed so I will remove my requests for changes however, I do believe there is some optimisations to be had with handling strings.

Comment on lines +35 to +36
mutex := sync.Mutex{}
var items int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In future, if you wanted to not use a mutex, you could use atomic.Int64 instead :)

return metrics
}

func mustPushMetricsData(t *testing.T, exporter *metricsExporter, md pmetric.Metrics) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use tb testing.TB instead of *testing.T so that they can be used between benchmarks, tests, and fuzzing :)

require.NoError(t, err)
}

func newTestMetricsExporter(t *testing.T, dsn string, fns ...func(*Config)) *metricsExporter {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

createSummaryTableSQL: {},
}

var logger *zap.Logger
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not using a package level variable since it can be rather hazardous due to concurrent usage.

Comment on lines +59 to +62
// SetLogger set a logger instance
func SetLogger(l *zap.Logger) {
logger = l
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method isn't safe for concurrent usage and not a pattern I'd like to encourage in the project.

If a method / function needs a logger, please add it as part of the params or type instead.

return fmt.Errorf("db.Begin: %w", err)
}
defer func() {
_ = tx.Rollback()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that on each Transaction that a rollback is performed?

rm.Resource().Attributes().PutStr("service.name", "demo 1")
rm.Resource().Attributes().PutStr("Resource Attributes 1", "value1")
rm.Resource().SetDroppedAttributesCount(10)
rm.SetSchemaUrl("Resource SchemaUrl 1")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a properly formatted URL here, http://example.com/schema/1.0.0 is fine to use.

TracesTableName: "otel_traces",
TTLDays: 7,
TimeoutSettings: exporterhelper.NewDefaultTimeoutSettings(),
QueueSettings: QueueSettings{QueueSize: exporterhelper.NewDefaultQueueSettings().QueueSize},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to do:

QueueSettings: exporterhelper.NewDefaultQueueSettings()

Comment on lines +31 to +68
CREATE TABLE IF NOT EXISTS %s_exponential_histogram (
ResourceAttributes Map(LowCardinality(String), String) CODEC(ZSTD(1)),
ResourceSchemaUrl String CODEC(ZSTD(1)),
ScopeName String CODEC(ZSTD(1)),
ScopeVersion String CODEC(ZSTD(1)),
ScopeAttributes Map(LowCardinality(String), String) CODEC(ZSTD(1)),
ScopeDroppedAttrCount UInt32 CODEC(ZSTD(1)),
ScopeSchemaUrl String CODEC(ZSTD(1)),
MetricName String CODEC(ZSTD(1)),
MetricDescription String CODEC(ZSTD(1)),
MetricUnit String CODEC(ZSTD(1)),
Attributes Map(LowCardinality(String), String) CODEC(ZSTD(1)),
StartTimeUnix DateTime64(9) CODEC(Delta, ZSTD(1)),
TimeUnix DateTime64(9) CODEC(Delta, ZSTD(1)),
Count Int64 CODEC(Delta, ZSTD(1)),
Sum Float64 CODEC(ZSTD(1)),
Scale Int32 CODEC(ZSTD(1)),
ZeroCount UInt64 CODEC(ZSTD(1)),
PositiveOffset Int32 CODEC(ZSTD(1)),
PositiveBucketCounts Array(UInt64) CODEC(ZSTD(1)),
NegativeOffset Int32 CODEC(ZSTD(1)),
NegativeBucketCounts Array(UInt64) CODEC(ZSTD(1)),
Exemplars Nested (
FilteredAttributes Map(LowCardinality(String), String),
TimeUnix DateTime64(9),
Value Float64,
SpanId String,
TraceId String
) CODEC(ZSTD(1)),
Flags UInt32 CODEC(ZSTD(1)),
Min Float64 CODEC(ZSTD(1)),
Max Float64 CODEC(ZSTD(1))
) ENGINE MergeTree()
%s
PARTITION BY toDate(TimeUnix)
ORDER BY (MetricName, Attributes, toUnixTimestamp64Nano(TimeUnix))
SETTINGS index_granularity=8192, ttl_only_drop_parts = 1;
`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way this can be tested?


start := time.Now()
err := doWithTx(ctx, db, func(tx *sql.Tx) error {
_, err := tx.ExecContext(ctx, fmt.Sprintf("%s %s", e.insertSQL, strings.TrimSuffix(b.String(), ",")), valueArgs...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a concern with large amounts of inserts being done within the same transaction?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no. it's not a transaction, clickhouse driver will do batch insert.

@MovieStoreGuy MovieStoreGuy self-requested a review January 16, 2023 10:37
@MovieStoreGuy MovieStoreGuy dismissed their stale review January 16, 2023 10:38

Leaving final approval for code owner

exporter/clickhouseexporter/README.md Outdated Show resolved Hide resolved
@dmitryax dmitryax added the Skip Changelog PRs that do not require a CHANGELOG.md entry label Jan 19, 2023
@dmitryax dmitryax merged commit 8821292 into open-telemetry:main Jan 20, 2023
@plantfansam plantfansam mentioned this pull request Jul 21, 2023
@Frapschen Frapschen deleted the clickhouse_metrics_exporter branch August 3, 2024 10:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exporter/clickhouse Skip Changelog PRs that do not require a CHANGELOG.md entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants