Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[extension/jaegerremotesampling]: Replace Thrift-gen with Proto-gen types for sampling strategies #18485

Conversation

frzifus
Copy link
Member

@frzifus frzifus commented Feb 9, 2023

Description:

Replace Thrift-gen with Proto-gen types for sampling strategies. This change breaks existing clients.

Do we have to take this into account and first inform the users a few releases ahead?
It seems to be a bit more effort to support both formats over a few releases and then deprecate thrift.

wdyt?

Link to tracking Issue:
Closes #18401
Ref #18047

--

cc @jpkrohling

@frzifus frzifus requested a review from a team February 9, 2023 13:40
@frzifus frzifus changed the title [extension/jaegerremotesampling]: sampling - s/thrift-gen/proto-gen/ [extension/jaegerremotesampling]: Replace Thrift-gen with Proto-gen types for sampling strategies Feb 9, 2023
@runforesight
Copy link

runforesight bot commented Feb 9, 2023

Foresight Summary

    
Major Impacts

build-and-test duration(27 minutes 56 seconds) has decreased 30 minutes 27 seconds compared to main branch avg(58 minutes 23 seconds).
View More Details

⭕  build-and-test-windows workflow has finished in 5 seconds (40 minutes 44 seconds less than main branch avg.) and finished at 9th Feb, 2023.


Job Failed Steps Tests
windows-unittest-matrix -     🔗  N/A See Details
windows-unittest -     🔗  N/A See Details

✅  check-links workflow has finished in 1 minute 22 seconds (1 minute 12 seconds less than main branch avg.) and finished at 9th Feb, 2023.


Job Failed Steps Tests
changed files -     🔗  N/A See Details
check-links -     🔗  N/A See Details

✅  telemetrygen workflow has finished in 1 minute 23 seconds (1 minute 35 seconds less than main branch avg.) and finished at 9th Feb, 2023.


Job Failed Steps Tests
publish-latest -     🔗  N/A See Details
build-dev -     🔗  N/A See Details
publish-stable -     🔗  N/A See Details

✅  changelog workflow has finished in 3 minutes 18 seconds and finished at 9th Feb, 2023.


Job Failed Steps Tests
changelog -     🔗  N/A See Details

❌  prometheus-compliance-tests workflow has finished in 7 minutes 32 seconds (1 minute 12 seconds less than main branch avg.) and finished at 9th Feb, 2023. 1 job failed.


Job Failed Steps Tests
prometheus-compliance-tests Run make otelcontribcol     🔗  N/A See Details

❌  e2e-tests workflow has finished in 8 minutes 56 seconds (11 minutes 39 seconds less than main branch avg.) and finished at 9th Feb, 2023. 1 job failed.


Job Failed Steps Tests
kubernetes-test Build Collector     🔗  N/A See Details

❌  build-and-test workflow has finished in 27 minutes 56 seconds (30 minutes 27 seconds less than main branch avg.) and finished at 9th Feb, 2023. 6 jobs failed.


Job Failed Steps Tests
correctness-metrics -     🔗  ✅ 2  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.18, internal) -     🔗  ✅ 561  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, internal) -     🔗  ✅ 561  ❌ 0  ⏭ 0    🔗 See Details
correctness-traces -     🔗  ✅ 17  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, processor) -     🔗  ✅ 1472  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, extension) -     🔗  ✅ 546  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.18, extension) -     🔗  ✅ 546  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.18, processor) -     🔗  ✅ 1472  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.18, receiver-0) -     🔗  ✅ 2574  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, receiver-0) -     🔗  ✅ 2574  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, other) Run Unit Tests     🔗  ✅ 62  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, exporter) -     🔗  ✅ 2444  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, receiver-1) -     🔗  ✅ 1389  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.18, receiver-1) -     🔗  ✅ 1116  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.18, exporter) -     🔗  ✅ 1683  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.18, other) -     🔗  ✅ 62  ❌ 0  ⏭ 0    🔗 See Details
integration-tests -     🔗  ✅ 55  ❌ 0  ⏭ 0    🔗 See Details
setup-environment -     🔗  N/A See Details
check-codeowners -     🔗  N/A See Details
checks -     🔗  N/A See Details
build-examples Build Examples     🔗  N/A See Details
lint-matrix (receiver-0) -     🔗  N/A See Details
check-collector-module-version -     🔗  N/A See Details
lint-matrix (receiver-1) -     🔗  N/A See Details
lint-matrix (processor) -     🔗  N/A See Details
lint-matrix (exporter) -     🔗  N/A See Details
lint-matrix (extension) -     🔗  N/A See Details
lint-matrix (other) Lint     🔗  N/A See Details
lint-matrix (internal) -     🔗  N/A See Details
unittest (1.19) Interpret result     🔗  N/A See Details
unittest (1.18) Interpret result     🔗  N/A See Details
lint Interpret result     🔗  N/A See Details
cross-compile -     🔗  N/A See Details
windows-msi -     🔗  N/A See Details
build-package -     🔗  N/A See Details
publish-stable -     🔗  N/A See Details
publish-dev -     🔗  N/A See Details
publish-check -     🔗  N/A See Details

✅  load-tests workflow has finished in 20 minutes 15 seconds (⚠️ 3 minutes 31 seconds more than main branch avg.) and finished at 9th Feb, 2023.


Job Failed Steps Tests
loadtest (TestTraceAttributesProcessor) -     🔗  ✅ 3  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestIdleMode) -     🔗  ✅ 1  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestMetric10kDPS|TestMetricsFromFile) -     🔗  ✅ 6  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestTraceNoBackend10kSPS|TestTrace1kSPSWithAttrs) -     🔗  ✅ 8  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestMetricResourceProcessor|TestTrace10kSPS) -     🔗  ✅ 12  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestTraceBallast1kSPSWithAttrs|TestTraceBallast1kSPSAddAttrs) -     🔗  ✅ 10  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestBallastMemory|TestLog10kDPS) -     🔗  ✅ 19  ❌ 0  ⏭ 0    🔗 See Details
setup-environment -     🔗  N/A See Details

🔎 See details on Foresight

*You can configure Foresight comments in your organization settings page.

@frzifus frzifus force-pushed the jaegerremotesampling_replace_thrift-gen_with_proto-gen branch 3 times, most recently from dbc6a68 to ae861fc Compare February 9, 2023 14:43
@github-actions github-actions bot added the cmd/otelcontribcol otelcontribcol command label Feb 9, 2023
Copy link
Contributor

@codeboten codeboten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general breaking changes are done through the use of a feature gate first.

  • first release: Starting with the gate as an opt-in with a warning of the deprecation for the breaking change,
  • second release: changing the gate to be an opt-out and changing the default behaviour to the new one
  • third release: finally removing the gate

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Feb 24, 2023
@frzifus frzifus removed the Stale label Mar 1, 2023
@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label May 26, 2023
@github-actions
Copy link
Contributor

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Jun 10, 2023
@frzifus frzifus reopened this Jun 12, 2023
@frzifus
Copy link
Member Author

frzifus commented Jun 12, 2023

Ups - I thought that was already done. I will look into it.

@github-actions github-actions bot removed the Stale label Jun 13, 2023
@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jun 27, 2023
@jpkrohling jpkrohling removed the Stale label Jun 27, 2023
@frzifus frzifus force-pushed the jaegerremotesampling_replace_thrift-gen_with_proto-gen branch 2 times, most recently from 724a896 to ed44c9c Compare July 1, 2023 01:10
@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jul 22, 2023
@frzifus frzifus removed the Stale label Jul 24, 2023
Signed-off-by: Benedikt Bongartz <[email protected]>
@frzifus frzifus force-pushed the jaegerremotesampling_replace_thrift-gen_with_proto-gen branch from e3d6034 to b3045aa Compare August 31, 2023 01:04
@jpkrohling
Copy link
Member

Once @codeboten approves, this is ready to be merged.

@frzifus
Copy link
Member Author

frzifus commented Aug 31, 2023

Resolved another conflict

@frzifus frzifus force-pushed the jaegerremotesampling_replace_thrift-gen_with_proto-gen branch from d76a9bd to 6468cce Compare August 31, 2023 14:54
Copy link
Contributor

@codeboten codeboten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really close, just a couple of questions

receiver/jaegerreceiver/factory.go Outdated Show resolved Hide resolved
@@ -72,7 +72,10 @@ func TestEndpointsAreWired(t *testing.T) {
resp.Body.Close()

body := string(samplingStrategiesBytes)
assert.Equal(t, `{"strategyType":"PROBABILISTIC"}`, body)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this question been answered?

Signed-off-by: Benedikt Bongartz <[email protected]>
@frzifus frzifus requested a review from codeboten August 31, 2023 21:12
@frzifus frzifus force-pushed the jaegerremotesampling_replace_thrift-gen_with_proto-gen branch from 3d48cf7 to 4ea501c Compare August 31, 2023 22:41
@frzifus frzifus force-pushed the jaegerremotesampling_replace_thrift-gen_with_proto-gen branch from 4ea501c to f99ee72 Compare August 31, 2023 22:54
@frzifus
Copy link
Member Author

frzifus commented Sep 4, 2023

@codeboten should be fixed. Is there something else missing? :)

Copy link
Contributor

@codeboten codeboten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @frzifus

@codeboten codeboten merged commit 6044cc2 into open-telemetry:main Sep 6, 2023
90 checks passed
@github-actions github-actions bot added this to the next release milestone Sep 6, 2023
@frzifus frzifus deleted the jaegerremotesampling_replace_thrift-gen_with_proto-gen branch September 6, 2023 16:35
@zcross
Copy link
Contributor

zcross commented Sep 14, 2023

@frzifus

I'm trying to use this locally (flipping on and off the feature gate) and I keep seeing the warning message about upcoming deprecation.

Does that mean that I'm:

  1. Simply specifying the command name incorrectly?
  2. Running into Featuregates don't work until late in initialization opentelemetry-collector#4967 which suggests that CLI-specified feature gate settings are not necessarily visible to component factories due to a race condition?

@mx-psi
Copy link
Member

mx-psi commented Sep 15, 2023

@zcross Thanks for reporting this, would you mind opening a separate issue for this?

@frzifus
Copy link
Member Author

frzifus commented Sep 16, 2023

@zcross the warning will always be printed. But... 😮‍💨

Something is defiantly wrong. :/ Adding a fmt.Println(protoGate.IsEnabled()) and running the following command multiple times shows different results.

./bin/otelcontribcol_linux_amd64 --config=config.yaml --feature-gates=+extension.jaegerremotesampling.replaceThriftWithProto

What wonders me even more. Because in the generated main file, the factories of all existing components are created first and then afterwards the CLI commands are evaluated.

otelcol Stacktrace
runtime/debug.Stack()
	runtime/debug/stack.go:24 +0x5e
runtime/debug.PrintStack()
	runtime/debug/stack.go:16 +0x13
go.opentelemetry.io/collector/featuregate.(*Registry).Set(0x7ffdbf16fbf9?, {0x7ffdbf16fbfa?, 0x1229970?}, 0x1?)
	go.opentelemetry.io/collector/featuregate@v1.0.0-rcv0014/registry.go:114 +0xb1
go.opentelemetry.io/collector/featuregate.(*flagValue).Set(0xc0001320d8, {0x7ffdbf16fbf9?, 0xc0004bc410?})
	go.opentelemetry.io/collector/featuregate@v1.0.0-rcv0014/flag.go:52 +0x76
github.com/spf13/pflag.(*flagValueWrapper).Set(0xf3b6e0?, {0x7ffdbf16fbf9?, 0x10c87ac?})
	github.com/spf13/pflag@v1.0.5/golangflag.go:53 +0x22
github.com/spf13/pflag.(*FlagSet).Set(0xc000162500, {0x10c87ac, 0xd}, {0x7ffdbf16fbf9, 0x36})
	github.com/spf13/pflag@v1.0.5/flag.go:463 +0xa2
github.com/spf13/pflag.(*FlagSet).Parse.func1(0xf3b6e0?, {0x7ffdbf16fbf9?, 0x7ffdbf16fbeb?})
	github.com/spf13/pflag@v1.0.5/flag.go:1138 +0x32
github.com/spf13/pflag.(*FlagSet).parseLongArg(0xc000162500, {0x7ffdbf16fbe9, 0x46}, {0xc000136200, 0x0, 0x0}, 0xc00053fa50)
	github.com/spf13/pflag@v1.0.5/flag.go:1000 +0x24f
github.com/spf13/pflag.(*FlagSet).parseArgs(0xc000162500, {0xc0001361f0?, 0xc000004600?, 0xdfa500?}, 0xc000162b00?)
	github.com/spf13/pflag@v1.0.5/flag.go:1108 +0x1ae
github.com/spf13/pflag.(*FlagSet).Parse(0xc000162500, {0xc0001361f0, 0x2, 0x2})
	github.com/spf13/pflag@v1.0.5/flag.go:1141 +0xd8
github.com/spf13/cobra.(*Command).ParseFlags(0xc000004600, {0xc0001361f0, 0x2, 0x2})
	github.com/spf13/cobra@v1.7.0/command.go:1782 +0xd2
github.com/spf13/cobra.(*Command).execute(0xc000004600, {0xc0001361f0, 0x2, 0x2})
	github.com/spf13/cobra@v1.7.0/command.go:861 +0x137
github.com/spf13/cobra.(*Command).ExecuteC(0xc000004600)
	github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5
github.com/spf13/cobra.(*Command).Execute(0xc0004be450?)
	github.com/spf13/cobra@v1.7.0/command.go:992 +0x13
main.runInteractive({{0xc0004be450, 0xc0004be4e0, 0xc0004be480, 0xc0004be420, 0xc0004be510}, {{0x10c6d67, 0xb}, {0x10f27a3, 0x33}, {0x10c5f93, ...}}, ...})
	go.opentelemetry.io/collector/cmd/otelcorecol/main.go:35 +0x45
main.run(...)
	go.opentelemetry.io/collector/cmd/otelcorecol/main_others.go:11
main.main()
	go.opentelemetry.io/collector/cmd/otelcorecol/main.go:28 +0x1d8

@davispuh
Copy link
Contributor

davispuh commented Oct 18, 2024

In logs I see this

{"level":"warn","ts":1729291563.3518195,"caller":"[email protected]/factory.go:49","msg":"jaeger receiver will deprecate Thrift-gen and replace it with Proto-gen to be compatbible to jaeger 1.42.0 and higher. See #18485 for more details.","kind":"receiver","name":"jaeger","data_type":"traces"}

How do I switch to the new behavior (Proto-gen) so that it doesn't output this warning?

My current config is like this:

receivers:
  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14250
      thrift_http:
        endpoint: 0.0.0.0:14268

@mx-psi
Copy link
Member

mx-psi commented Oct 21, 2024

@davispuh Would you mind filing a separate issue for this?

@davispuh
Copy link
Contributor

Okay I created #35894

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bump github.com/jaegertracing/jaeger from 1.41.0 to 1.42.0 causes issues
9 participants