diff --git a/.github/workflows/markdown.yml b/.github/workflows/markdown.yml new file mode 100644 index 00000000000..412e6a3cae4 --- /dev/null +++ b/.github/workflows/markdown.yml @@ -0,0 +1,33 @@ +name: markdown +on: + push: + branches: + - main + pull_request: +jobs: + changedfiles: + name: changed files + runs-on: ubuntu-latest + outputs: + md: ${{ steps.changes.outputs.md }} + steps: + - name: Checkout Repo + uses: actions/checkout@v2 + with: + fetch-depth: 0 + - name: Get changed files + id: changes + run: | + echo "::set-output name=md::$(git diff --name-only --diff-filter=ACMRTUXB origin/${{ github.event.pull_request.base.ref }} ${{ github.event.pull_request.head.sha }} | grep .md$ | xargs)" + lint: + name: lint markdown files + runs-on: ubuntu-latest + needs: changedfiles + if: ${{needs.changedfiles.outputs.md}} + steps: + - name: Checkout Repo + uses: actions/checkout@v2 + - name: Run linter + uses: docker://avtodev/markdown-lint:v1 + with: + args: ${{needs.changedfiles.outputs.md}} diff --git a/CHANGELOG.md b/CHANGELOG.md index fd15776c2b9..b87369dddf8 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -10,11 +10,25 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm ### Added +- Adds `otlpgrpc.WithRetry`option for configuring the retry policy for transient errors on the otlp/gRPC exporter. (#1832) + - The following status codes are defined as transient errors: + | gRPC Status Code | Description | + | ---------------- | ----------- | + | 1 | Cancelled | + | 4 | Deadline Exceeded | + | 8 | Resource Exhausted | + | 10 | Aborted | + | 10 | Out of Range | + | 14 | Unavailable | + | 15 | Data Loss | + ### Changed - Make `NewSplitDriver` from `go.opentelemetry.io/otel/exporters/otlp` take variadic arguments instead of a `SplitConfig` item. - `NewSplitDriver` now automically implements an internal `noopDriver` for `SplitConfig` fields that are not initialized. (#1798) + `NewSplitDriver` now automatically implements an internal `noopDriver` for `SplitConfig` fields that are not initialized. (#1798) - `resource.New()` now creates a Resource without builtin detectors. Previous behavior is now achieved by using `WithBuiltinDetectors` Option. (#1810) +- Move the `Event` type from the `go.opentelemetry.io/otel` package to the `go.opentelemetry.io/otel/sdk/trace` package. (#1846) +- BatchSpanProcessor now report export failures when calling `ForceFlush()` method. (#1860) ### Deprecated @@ -25,6 +39,10 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm ### Fixed +- Only report errors from the `"go.opentelemetry.io/otel/sdk/resource".Environment` function when they are not `nil`. (#1850, #1851) +- The `Shutdown` method of the simple `SpanProcessor` in the `go.opentelemetry.io/otel/sdk/trace` package now honors the context deadline or cancellation. (#1616, #1856) +- BatchSpanProcessor now drops span batches that failed to be exported. (#1860) + ### Security ## [0.20.0] - 2021-04-23 diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 73df4022c30..f11608b5e19 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -15,7 +15,7 @@ join the meeting or get in touch on You can view and edit the source code by cloning this repository: -```bash +```sh git clone https://github.com/open-telemetry/opentelemetry-go.git ``` @@ -166,25 +166,25 @@ to the reasons why. ### Configuration -When creating an instantiation function for a complex `struct` it is useful -to allow variable number of options to be applied. However, the strong type -system of Go restricts the function design options. There are a few ways to -solve this problem, but we have landed on the following design. +When creating an instantiation function for a complex `type T struct`, it is +useful to allow variable number of options to be applied. However, the strong +type system of Go restricts the function design options. There are a few ways +to solve this problem, but we have landed on the following design. #### `config` Configuration should be held in a `struct` named `config`, or prefixed with specific type name this Configuration applies to if there are multiple -`config` in the package. This `struct` must contain configuration options. +`config` in the package. This type must contain configuration options. ```go // config contains configuration options for a thing. type config struct { - // options ... + // options ... } ``` -In general the `config` `struct` will not need to be used externally to the +In general the `config` type will not need to be used externally to the package and should be unexported. If, however, it is expected that the user will likely want to build custom options for the configuration, the `config` should be exported. Please, include in the documentation for the `config` @@ -200,13 +200,13 @@ all options to create a configured `config`. ```go // newConfig returns an appropriately configured config. func newConfig([]Option) config { - // Set default values for config. - config := config{/* […] */} - for _, option := range options { - option.Apply(&config) - } - // Preform any validation here. - return config + // Set default values for config. + config := config{/* […] */} + for _, option := range options { + option.apply(&config) + } + // Preform any validation here. + return config } ``` @@ -224,10 +224,14 @@ To set the value of the options a `config` contains, a corresponding ```go type Option interface { - Apply(*config) + apply(*config) } ``` +Having `apply` unexported makes sure that it will not be used externally. +Moreover, the interface becomes sealed so the user cannot easily implement +the interface on its own. + The name of the interface should be prefixed in the same way the corresponding `config` is (if at all). @@ -250,53 +254,53 @@ func With*(…) Option { … } ```go type defaultFalseOption bool -func (o defaultFalseOption) Apply(c *config) { - c.Bool = bool(o) +func (o defaultFalseOption) apply(c *config) { + c.Bool = bool(o) } -// WithOption sets a T* to have an option included. +// WithOption sets a T to have an option included. func WithOption() Option { - return defaultFalseOption(true) + return defaultFalseOption(true) } ``` ```go type defaultTrueOption bool -func (o defaultTrueOption) Apply(c *config) { - c.Bool = bool(o) +func (o defaultTrueOption) apply(c *config) { + c.Bool = bool(o) } -// WithoutOption sets a T* to have Bool option excluded. +// WithoutOption sets a T to have Bool option excluded. func WithoutOption() Option { - return defaultTrueOption(false) + return defaultTrueOption(false) } -```` +``` ##### Declared Type Options ```go type myTypeOption struct { - MyType MyType + MyType MyType } -func (o myTypeOption) Apply(c *config) { - c.MyType = o.MyType +func (o myTypeOption) apply(c *config) { + c.MyType = o.MyType } -// WithMyType sets T* to have include MyType. +// WithMyType sets T to have include MyType. func WithMyType(t MyType) Option { - return myTypeOption{t} + return myTypeOption{t} } ``` #### Instantiation -Using this configuration pattern to configure instantiation with a `New*` +Using this configuration pattern to configure instantiation with a `NewT` function. ```go -func NewT*(options ...Option) T* {…} +func NewT(options ...Option) T {…} ``` Any required parameters can be declared before the variadic `options`. @@ -320,12 +324,12 @@ type config struct { // DogOption apply Dog specific options. type DogOption interface { - ApplyDog(*config) + applyDog(*config) } // BirdOption apply Bird specific options. type BirdOption interface { - ApplyBird(*config) + applyBird(*config) } // Option apply options for all animals. @@ -335,16 +339,16 @@ type Option interface { } type weightOption float64 -func (o weightOption) ApplyDog(c *config) { c.Weight = float64(o) } -func (o weightOption) ApplyBird(c *config) { c.Weight = float64(o) } +func (o weightOption) applyDog(c *config) { c.Weight = float64(o) } +func (o weightOption) applyBird(c *config) { c.Weight = float64(o) } func WithWeight(w float64) Option { return weightOption(w) } type furColorOption string -func (o furColorOption) ApplyDog(c *config) { c.Color = string(o) } +func (o furColorOption) applyDog(c *config) { c.Color = string(o) } func WithFurColor(c string) DogOption { return furColorOption(c) } type maxAltitudeOption float64 -func (o maxAltitudeOption) ApplyBird(c *config) { c.MaxAltitude = float64(o) } +func (o maxAltitudeOption) applyBird(c *config) { c.MaxAltitude = float64(o) } func WithMaxAltitude(a float64) BirdOption { return maxAltitudeOption(a) } func NewDog(name string, o ...DogOption) Dog {…} diff --git a/RELEASING.md b/RELEASING.md index eb9c6e7171f..b6cbd1231b7 100644 --- a/RELEASING.md +++ b/RELEASING.md @@ -75,6 +75,13 @@ After releasing verify that examples build outside of the repository. The script copies examples into a different directory removes any `replace` declarations in `go.mod` and builds them. This ensures they build with the published release, not the local copy. -## Contrib Repository +## Post-Release + +### Contrib Repository Once verified be sure to [make a release for the `contrib` repository](https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/RELEASING.md) that uses this release. + +### Website Documentation + +Update [the documentation](./website_docs) for [the OpenTelemetry website](https://opentelemetry.io/docs/go/). +Importantly, bump any package versions referenced to be the latest one you just released and ensure all code examples still compile and are accurate. diff --git a/bridge/opencensus/README.md b/bridge/opencensus/README.md index 58ac90c8387..8e1c0922e7c 100644 --- a/bridge/opencensus/README.md +++ b/bridge/opencensus/README.md @@ -10,7 +10,7 @@ In a perfect world, one would simply migrate their entire go application --inclu However, if you create the following spans in a go application: -```golang +```go ctx, ocSpan := opencensus.StartSpan(context.Background(), "OuterSpan") defer ocSpan.End() ctx, otSpan := opentelemetryTracer.Start(ctx, "MiddleSpan") @@ -54,11 +54,11 @@ Starting from an application using entirely OpenCensus APIs: 4. Remove OpenCensus exporters and configuration To override OpenCensus' DefaultTracer with the bridge: -```golang +```go import ( - octrace "go.opencensus.io/trace" - "go.opentelemetry.io/otel/bridge/opencensus" - "go.opentelemetry.io/otel" + octrace "go.opencensus.io/trace" + "go.opentelemetry.io/otel/bridge/opencensus" + "go.opentelemetry.io/otel" ) tracer := otel.GetTracerProvider().Tracer("bridge") @@ -102,12 +102,12 @@ Starting from an application using entirely OpenCensus APIs: 4. Remove OpenCensus Exporters and configuration. For example, to swap out the OpenCensus logging exporter for the OpenTelemetry stdout exporter: -```golang +```go import ( "go.opencensus.io/metric/metricexport" - "go.opentelemetry.io/otel/bridge/opencensus" + "go.opentelemetry.io/otel/bridge/opencensus" "go.opentelemetry.io/otel/exporters/stdout" - "go.opentelemetry.io/otel" + "go.opentelemetry.io/otel" ) // With OpenCensus, you could have previously configured the logging exporter like this: // import logexporter "go.opencensus.io/examples/exporter" diff --git a/example/otel-collector/go.sum b/example/otel-collector/go.sum index 65606007ace..cf9ba6783d4 100644 --- a/example/otel-collector/go.sum +++ b/example/otel-collector/go.sum @@ -4,6 +4,8 @@ github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03 github.com/antihax/optional v1.0.0/go.mod h1:uupD/76wgC+ih3iEmQUL+0Ugr19nfwCT1kdvxnR2qWY= github.com/benbjohnson/clock v1.0.3 h1:vkLuvpK4fmtSCuo60+yC63p7y0BmQ8gm5ZXGuBCJyXg= github.com/benbjohnson/clock v1.0.3/go.mod h1:bGMdMPoPVvcYyt1gHDf4J2KE153Yf9BuiUKYMaxlTDM= +github.com/cenkalti/backoff/v4 v4.1.0 h1:c8LkOFQTzuO0WBM/ae5HdGQuZPfPxp7lqBRwQRm4fSc= +github.com/cenkalti/backoff/v4 v4.1.0/go.mod h1:scbssz8iZGpm3xbr14ovlUdkxfGXNInqkPWOWmG2CLw= github.com/census-instrumentation/opencensus-proto v0.2.1/go.mod h1:f6KPmirojxKA12rnyqOA5BBL4O983OfeGPqjHWSTneU= github.com/client9/misspell v0.3.4/go.mod h1:qj6jICC3Q7zFZvVWo7KLAzC3yx5G7kyvSDkc90ppPyw= github.com/cncf/udpa/go v0.0.0-20191209042840-269d4d468f6f/go.mod h1:M8M6+tZqaGXZJjfX53e64911xZQV5JYwmTeXPW+k8Sc= diff --git a/example/prom-collector/go.sum b/example/prom-collector/go.sum index d5b8faed505..a216f187f13 100644 --- a/example/prom-collector/go.sum +++ b/example/prom-collector/go.sum @@ -29,7 +29,10 @@ github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM= github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw= github.com/bgentry/speakeasy v0.1.0/go.mod h1:+zsyZBPWlz7T6j88CTgSN5bM796AkVf0kBD4zp0CCIs= github.com/casbin/casbin/v2 v2.1.2/go.mod h1:YcPU1XXisHhLzuxH9coDNf2FbKpjGlbCg3n9yuLkIJQ= +github.com/cenkalti/backoff v2.2.1+incompatible h1:tNowT99t7UNflLxfYYSlKYsBpXdEet03Pg2g16Swow4= github.com/cenkalti/backoff v2.2.1+incompatible/go.mod h1:90ReRw6GdpyfrHakVjL/QHaoyV4aDUVVkXQJJJ3NXXM= +github.com/cenkalti/backoff/v4 v4.1.0 h1:c8LkOFQTzuO0WBM/ae5HdGQuZPfPxp7lqBRwQRm4fSc= +github.com/cenkalti/backoff/v4 v4.1.0/go.mod h1:scbssz8iZGpm3xbr14ovlUdkxfGXNInqkPWOWmG2CLw= github.com/census-instrumentation/opencensus-proto v0.2.1/go.mod h1:f6KPmirojxKA12rnyqOA5BBL4O983OfeGPqjHWSTneU= github.com/cespare/xxhash/v2 v2.1.1 h1:6MnRN8NT7+YBpUIWxHtefFZOKTAPgGjpQSxqLNn0+qY= github.com/cespare/xxhash/v2 v2.1.1/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs= diff --git a/exporters/otlp/README.md b/exporters/otlp/README.md index 0cd71f330bc..5c66b93bb67 100644 --- a/exporters/otlp/README.md +++ b/exporters/otlp/README.md @@ -4,28 +4,12 @@ This exporter exports OpenTelemetry spans and metrics to the OpenTelemetry Collector. - ## Installation and Setup The exporter can be installed using standard `go` functionality. ```bash -$ go get -u go.opentelemetry.io/otel/exporters/otlp +go get -u go.opentelemetry.io/otel/exporters/otlp ``` A new exporter can be created using the `NewExporter` function. - -## Retries - -The exporter will not, by default, retry failed requests to the collector. -However, it is configured in a way that it can be easily enabled. - -To enable retries, the `GRPC_GO_RETRY` environment variable needs to be set to `on`. For example, - -``` -GRPC_GO_RETRY=on go run . -``` - -The [default service config](https://github.com/grpc/proposal/blob/master/A6-client-retries.md) used by default is defined to retry failed requests with exponential backoff (`0.3seconds * (2)^retry`) with [a max of `5` retries](https://github.com/open-telemetry/oteps/blob/be2a3fcbaa417ebbf5845cd485d34fdf0ab4a2a4/text/0035-opentelemetry-protocol.md#export-response)). - -These retries are only attempted for reponses that are [deemed "retry-able" by the collector](https://github.com/grpc/proposal/blob/master/A6-client-retries.md#validation-of-retrypolicy). diff --git a/exporters/otlp/go.mod b/exporters/otlp/go.mod index fe4fdba8535..70e9232c7cb 100644 --- a/exporters/otlp/go.mod +++ b/exporters/otlp/go.mod @@ -8,6 +8,7 @@ replace ( ) require ( + github.com/cenkalti/backoff/v4 v4.1.0 github.com/google/go-cmp v0.5.5 github.com/stretchr/testify v1.7.0 go.opentelemetry.io/otel v0.20.0 @@ -17,6 +18,7 @@ require ( go.opentelemetry.io/otel/sdk/metric v0.20.0 go.opentelemetry.io/otel/trace v0.20.0 go.opentelemetry.io/proto/otlp v0.7.0 + google.golang.org/genproto v0.0.0-20200526211855-cb27e3aa2013 google.golang.org/grpc v1.37.0 google.golang.org/protobuf v1.26.0 ) diff --git a/exporters/otlp/go.sum b/exporters/otlp/go.sum index 65606007ace..cf9ba6783d4 100644 --- a/exporters/otlp/go.sum +++ b/exporters/otlp/go.sum @@ -4,6 +4,8 @@ github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03 github.com/antihax/optional v1.0.0/go.mod h1:uupD/76wgC+ih3iEmQUL+0Ugr19nfwCT1kdvxnR2qWY= github.com/benbjohnson/clock v1.0.3 h1:vkLuvpK4fmtSCuo60+yC63p7y0BmQ8gm5ZXGuBCJyXg= github.com/benbjohnson/clock v1.0.3/go.mod h1:bGMdMPoPVvcYyt1gHDf4J2KE153Yf9BuiUKYMaxlTDM= +github.com/cenkalti/backoff/v4 v4.1.0 h1:c8LkOFQTzuO0WBM/ae5HdGQuZPfPxp7lqBRwQRm4fSc= +github.com/cenkalti/backoff/v4 v4.1.0/go.mod h1:scbssz8iZGpm3xbr14ovlUdkxfGXNInqkPWOWmG2CLw= github.com/census-instrumentation/opencensus-proto v0.2.1/go.mod h1:f6KPmirojxKA12rnyqOA5BBL4O983OfeGPqjHWSTneU= github.com/client9/misspell v0.3.4/go.mod h1:qj6jICC3Q7zFZvVWo7KLAzC3yx5G7kyvSDkc90ppPyw= github.com/cncf/udpa/go v0.0.0-20191209042840-269d4d468f6f/go.mod h1:M8M6+tZqaGXZJjfX53e64911xZQV5JYwmTeXPW+k8Sc= diff --git a/exporters/otlp/internal/otlpconfig/options.go b/exporters/otlp/internal/otlpconfig/options.go index 3fd4a30dd38..53fe06e0618 100644 --- a/exporters/otlp/internal/otlpconfig/options.go +++ b/exporters/otlp/internal/otlpconfig/options.go @@ -42,31 +42,16 @@ const ( // DefaultTimeout is a default max waiting time for the backend to process // each span or metrics batch. DefaultTimeout time.Duration = 10 * time.Second - // DefaultServiceConfig is the gRPC service config used if none is - // provided by the user. - DefaultServiceConfig = `{ - "methodConfig":[{ - "name":[ - { "service":"opentelemetry.proto.collector.metrics.v1.MetricsService" }, - { "service":"opentelemetry.proto.collector.trace.v1.TraceService" } - ], - "retryPolicy":{ - "MaxAttempts":5, - "InitialBackoff":"0.3s", - "MaxBackoff":"5s", - "BackoffMultiplier":2, - "RetryableStatusCodes":[ - "CANCELLED", - "DEADLINE_EXCEEDED", - "RESOURCE_EXHAUSTED", - "ABORTED", - "OUT_OF_RANGE", - "UNAVAILABLE", - "DATA_LOSS" - ] - } - }] -}` +) + +var ( + // defaultRetrySettings is a default settings for the retry policy. + defaultRetrySettings = otlp.RetrySettings{ + Enabled: true, + InitialInterval: 5 * time.Second, + MaxInterval: 30 * time.Second, + MaxElapsedTime: time.Minute, + } ) type ( @@ -88,17 +73,16 @@ type ( Metrics SignalConfig Traces SignalConfig - // General configurations + // HTTP configurations + Marshaler otlp.Marshaler MaxAttempts int Backoff time.Duration - // HTTP configuration - Marshaler otlp.Marshaler - // gRPC configurations ReconnectionPeriod time.Duration ServiceConfig string DialOptions []grpc.DialOption + RetrySettings otlp.RetrySettings } ) @@ -118,7 +102,7 @@ func NewDefaultConfig() Config { }, MaxAttempts: DefaultMaxAttempts, Backoff: DefaultBackoff, - ServiceConfig: DefaultServiceConfig, + RetrySettings: defaultRetrySettings, } return c @@ -280,15 +264,9 @@ func WithMetricsURLPath(urlPath string) GenericOption { }) } -func WithMaxAttempts(maxAttempts int) GenericOption { - return newGenericOption(func(cfg *Config) { - cfg.MaxAttempts = maxAttempts - }) -} - -func WithBackoff(duration time.Duration) GenericOption { +func WithRetry(settings otlp.RetrySettings) GenericOption { return newGenericOption(func(cfg *Config) { - cfg.Backoff = duration + cfg.RetrySettings = settings }) } @@ -374,3 +352,15 @@ func WithMetricsTimeout(duration time.Duration) GenericOption { cfg.Metrics.Timeout = duration }) } + +func WithMaxAttempts(maxAttempts int) GenericOption { + return newGenericOption(func(cfg *Config) { + cfg.MaxAttempts = maxAttempts + }) +} + +func WithBackoff(duration time.Duration) GenericOption { + return newGenericOption(func(cfg *Config) { + cfg.Backoff = duration + }) +} diff --git a/exporters/otlp/internal/otlptest/data.go b/exporters/otlp/internal/otlptest/data.go index 84e00ccb42e..0a61f101cf7 100644 --- a/exporters/otlp/internal/otlptest/data.go +++ b/exporters/otlp/internal/otlptest/data.go @@ -97,7 +97,7 @@ func SingleSpanSnapshot() []*tracesdk.SpanSnapshot { StartTime: time.Date(2020, time.December, 8, 20, 23, 0, 0, time.UTC), EndTime: time.Date(2020, time.December, 0, 20, 24, 0, 0, time.UTC), Attributes: []attribute.KeyValue{}, - MessageEvents: []trace.Event{}, + MessageEvents: []tracesdk.Event{}, Links: []trace.Link{}, StatusCode: codes.Ok, StatusMessage: "", diff --git a/exporters/otlp/internal/transform/span.go b/exporters/otlp/internal/transform/span.go index d24cb1e6e59..342f349a24f 100644 --- a/exporters/otlp/internal/transform/span.go +++ b/exporters/otlp/internal/transform/span.go @@ -168,7 +168,7 @@ func links(links []trace.Link) []*tracepb.Span_Link { } // spanEvents transforms span Events to an OTLP span events. -func spanEvents(es []trace.Event) []*tracepb.Span_Event { +func spanEvents(es []tracesdk.Event) []*tracepb.Span_Event { if len(es) == 0 { return nil } diff --git a/exporters/otlp/internal/transform/span_test.go b/exporters/otlp/internal/transform/span_test.go index 4d9d1710f4c..9d89a80b7bd 100644 --- a/exporters/otlp/internal/transform/span_test.go +++ b/exporters/otlp/internal/transform/span_test.go @@ -73,13 +73,13 @@ func TestNilSpanEvent(t *testing.T) { } func TestEmptySpanEvent(t *testing.T) { - assert.Nil(t, spanEvents([]trace.Event{})) + assert.Nil(t, spanEvents([]tracesdk.Event{})) } func TestSpanEvent(t *testing.T) { attrs := []attribute.KeyValue{attribute.Int("one", 1), attribute.Int("two", 2)} eventTime := time.Date(2020, 5, 20, 0, 0, 0, 0, time.UTC) - got := spanEvents([]trace.Event{ + got := spanEvents([]tracesdk.Event{ { Name: "test 1", Attributes: []attribute.KeyValue{}, @@ -101,9 +101,9 @@ func TestSpanEvent(t *testing.T) { } func TestExcessiveSpanEvents(t *testing.T) { - e := make([]trace.Event, maxMessageEventsPerSpan+1) + e := make([]tracesdk.Event, maxMessageEventsPerSpan+1) for i := 0; i < maxMessageEventsPerSpan+1; i++ { - e[i] = trace.Event{Name: strconv.Itoa(i)} + e[i] = tracesdk.Event{Name: strconv.Itoa(i)} } assert.Len(t, e, maxMessageEventsPerSpan+1) got := spanEvents(e) @@ -215,7 +215,7 @@ func TestSpanData(t *testing.T) { Name: "span data to span data", StartTime: startTime, EndTime: endTime, - MessageEvents: []trace.Event{ + MessageEvents: []tracesdk.Event{ {Time: startTime, Attributes: []attribute.KeyValue{ attribute.Int64("CompressedByteSize", 512), diff --git a/exporters/otlp/otlpgrpc/connection.go b/exporters/otlp/otlpgrpc/connection.go index d904b56e058..097388aabf8 100644 --- a/exporters/otlp/otlpgrpc/connection.go +++ b/exporters/otlp/otlpgrpc/connection.go @@ -16,12 +16,18 @@ package otlpgrpc import ( "context" + "fmt" "math/rand" "sync" "sync/atomic" "time" "unsafe" + "github.com/cenkalti/backoff/v4" + "google.golang.org/genproto/googleapis/rpc/errdetails" + "google.golang.org/grpc/codes" + "google.golang.org/grpc/status" + "google.golang.org/grpc/encoding/gzip" "go.opentelemetry.io/otel/exporters/otlp" @@ -276,3 +282,145 @@ func (c *connection) contextWithStop(ctx context.Context) (context.Context, cont }(ctx, cancel) return ctx, cancel } + +func (c *connection) doRequest(ctx context.Context, fn func(context.Context) error) error { + expBackoff := newExponentialBackoff(c.cfg.RetrySettings) + + for { + err := fn(ctx) + if err == nil { + // request succeeded. + return nil + } + + if !c.cfg.RetrySettings.Enabled { + return err + } + + // We have an error, check gRPC status code. + st := status.Convert(err) + if st.Code() == codes.OK { + // Not really an error, still success. + return nil + } + + // Now, this is this a real error. + + if !shouldRetry(st.Code()) { + // It is not a retryable error, we should not retry. + return err + } + + // Need to retry. + + throttle := getThrottleDuration(st) + + backoffDelay := expBackoff.NextBackOff() + if backoffDelay == backoff.Stop { + // throw away the batch + err = fmt.Errorf("max elapsed time expired: %w", err) + return err + } + + var delay time.Duration + + if backoffDelay > throttle { + delay = backoffDelay + } else { + if expBackoff.GetElapsedTime()+throttle > expBackoff.MaxElapsedTime { + err = fmt.Errorf("max elapsed time expired when respecting server throttle: %w", err) + return err + } + + // Respect server throttling. + delay = throttle + } + + // back-off, but get interrupted when shutting down or request is cancelled or timed out. + err = func() error { + dt := time.NewTimer(delay) + defer dt.Stop() + + select { + case <-ctx.Done(): + return ctx.Err() + case <-c.stopCh: + return fmt.Errorf("interrupted due to shutdown: %w", err) + case <-dt.C: + } + + return nil + }() + + if err != nil { + return err + } + + } +} + +func shouldRetry(code codes.Code) bool { + switch code { + case codes.OK: + // Success. This function should not be called for this code, the best we + // can do is tell the caller not to retry. + return false + + case codes.Canceled, + codes.DeadlineExceeded, + codes.ResourceExhausted, + codes.Aborted, + codes.OutOfRange, + codes.Unavailable, + codes.DataLoss: + // These are retryable errors. + return true + + case codes.Unknown, + codes.InvalidArgument, + codes.Unauthenticated, + codes.PermissionDenied, + codes.NotFound, + codes.AlreadyExists, + codes.FailedPrecondition, + codes.Unimplemented, + codes.Internal: + // These are fatal errors, don't retry. + return false + + default: + // Don't retry on unknown codes. + return false + } +} + +func getThrottleDuration(status *status.Status) time.Duration { + // See if throttling information is available. + for _, detail := range status.Details() { + if t, ok := detail.(*errdetails.RetryInfo); ok { + if t.RetryDelay.Seconds > 0 || t.RetryDelay.Nanos > 0 { + // We are throttled. Wait before retrying as requested by the server. + return time.Duration(t.RetryDelay.Seconds)*time.Second + time.Duration(t.RetryDelay.Nanos)*time.Nanosecond + } + return 0 + } + } + return 0 +} + +func newExponentialBackoff(rs otlp.RetrySettings) *backoff.ExponentialBackOff { + // Do not use NewExponentialBackOff since it calls Reset and the code here must + // call Reset after changing the InitialInterval (this saves an unnecessary call to Now). + expBackoff := &backoff.ExponentialBackOff{ + InitialInterval: rs.InitialInterval, + RandomizationFactor: backoff.DefaultRandomizationFactor, + Multiplier: backoff.DefaultMultiplier, + MaxInterval: rs.MaxInterval, + MaxElapsedTime: rs.MaxElapsedTime, + Stop: backoff.Stop, + Clock: backoff.SystemClock, + } + expBackoff.Reset() + + return expBackoff +} diff --git a/exporters/otlp/otlpgrpc/connection_test.go b/exporters/otlp/otlpgrpc/connection_test.go new file mode 100644 index 00000000000..ea0e5d70b01 --- /dev/null +++ b/exporters/otlp/otlpgrpc/connection_test.go @@ -0,0 +1,90 @@ +// Copyright The OpenTelemetry Authors +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package otlpgrpc + +import ( + "testing" + "time" + + "github.com/stretchr/testify/require" + + "google.golang.org/genproto/googleapis/rpc/errdetails" + "google.golang.org/grpc/codes" + "google.golang.org/grpc/status" + "google.golang.org/protobuf/types/known/durationpb" +) + +func TestGetThrottleDuration(t *testing.T) { + tts := []struct { + stsFn func() (*status.Status, error) + throttle time.Duration + }{ + { + stsFn: func() (*status.Status, error) { + return status.New( + codes.OK, + "status with no retry info", + ), nil + }, + throttle: 0, + }, + { + stsFn: func() (*status.Status, error) { + st := status.New(codes.ResourceExhausted, "status with retry info") + return st.WithDetails( + &errdetails.RetryInfo{RetryDelay: durationpb.New(15 * time.Millisecond)}, + ) + }, + throttle: 15 * time.Millisecond, + }, + { + stsFn: func() (*status.Status, error) { + st := status.New(codes.ResourceExhausted, "status with error info detail") + return st.WithDetails( + &errdetails.ErrorInfo{Reason: "no throttle detail"}, + ) + }, + throttle: 0, + }, + { + stsFn: func() (*status.Status, error) { + st := status.New(codes.ResourceExhausted, "status with error info and retry info") + return st.WithDetails( + &errdetails.ErrorInfo{Reason: "no throttle detail"}, + &errdetails.RetryInfo{RetryDelay: durationpb.New(13 * time.Minute)}, + ) + }, + throttle: 13 * time.Minute, + }, + { + stsFn: func() (*status.Status, error) { + st := status.New(codes.ResourceExhausted, "status with two retry info should take the first") + return st.WithDetails( + &errdetails.RetryInfo{RetryDelay: durationpb.New(13 * time.Minute)}, + &errdetails.RetryInfo{RetryDelay: durationpb.New(18 * time.Minute)}, + ) + }, + throttle: 13 * time.Minute, + }, + } + + for _, tt := range tts { + sts, _ := tt.stsFn() + t.Run(sts.Message(), func(t *testing.T) { + th := getThrottleDuration(sts) + require.Equal(t, tt.throttle, th) + }) + } +} diff --git a/exporters/otlp/otlpgrpc/driver.go b/exporters/otlp/otlpgrpc/driver.go index c5df20566c7..39544564ca9 100644 --- a/exporters/otlp/otlpgrpc/driver.go +++ b/exporters/otlp/otlpgrpc/driver.go @@ -145,10 +145,13 @@ func (md *metricsDriver) uploadMetrics(ctx context.Context, protoMetrics []*metr if md.metricsClient == nil { return errNoClient } - _, err := md.metricsClient.Export(ctx, &colmetricpb.ExportMetricsServiceRequest{ - ResourceMetrics: protoMetrics, + + return md.connection.doRequest(ctx, func(ctx context.Context) error { + _, err := md.metricsClient.Export(ctx, &colmetricpb.ExportMetricsServiceRequest{ + ResourceMetrics: protoMetrics, + }) + return err }) - return err }() if err != nil { md.connection.setStateDisconnected(err) @@ -183,10 +186,12 @@ func (td *tracesDriver) uploadTraces(ctx context.Context, protoSpans []*tracepb. if td.tracesClient == nil { return errNoClient } - _, err := td.tracesClient.Export(ctx, &coltracepb.ExportTraceServiceRequest{ - ResourceSpans: protoSpans, + return td.connection.doRequest(ctx, func(ctx context.Context) error { + _, err := td.tracesClient.Export(ctx, &coltracepb.ExportTraceServiceRequest{ + ResourceSpans: protoSpans, + }) + return err }) - return err }() if err != nil { td.connection.setStateDisconnected(err) diff --git a/exporters/otlp/otlpgrpc/mock_collector_test.go b/exporters/otlp/otlpgrpc/mock_collector_test.go index 7183b9511fc..13ce21039ab 100644 --- a/exporters/otlp/otlpgrpc/mock_collector_test.go +++ b/exporters/otlp/otlpgrpc/mock_collector_test.go @@ -34,11 +34,12 @@ import ( tracepb "go.opentelemetry.io/proto/otlp/trace/v1" ) -func makeMockCollector(t *testing.T) *mockCollector { +func makeMockCollector(t *testing.T, mockConfig *mockConfig) *mockCollector { return &mockCollector{ t: t, traceSvc: &mockTraceService{ storage: otlptest.NewSpansStorage(), + errors: mockConfig.errors, }, metricSvc: &mockMetricService{ storage: otlptest.NewMetricsStorage(), @@ -49,10 +50,12 @@ func makeMockCollector(t *testing.T) *mockCollector { type mockTraceService struct { collectortracepb.UnimplementedTraceServiceServer - mu sync.RWMutex - storage otlptest.SpansStorage - headers metadata.MD - delay time.Duration + errors []error + requests int + mu sync.RWMutex + storage otlptest.SpansStorage + headers metadata.MD + delay time.Duration } func (mts *mockTraceService) getHeaders() metadata.MD { @@ -77,9 +80,19 @@ func (mts *mockTraceService) Export(ctx context.Context, exp *collectortracepb.E if mts.delay > 0 { time.Sleep(mts.delay) } - reply := &collectortracepb.ExportTraceServiceResponse{} + mts.mu.Lock() - defer mts.mu.Unlock() + defer func() { + mts.requests++ + mts.mu.Unlock() + }() + + reply := &collectortracepb.ExportTraceServiceResponse{} + if mts.requests < len(mts.errors) { + idx := mts.requests + return reply, mts.errors[idx] + } + mts.headers, _ = metadata.FromIncomingContext(ctx) mts.storage.AddSpans(exp) return reply, nil @@ -122,6 +135,11 @@ type mockCollector struct { stopOnce sync.Once } +type mockConfig struct { + errors []error + endpoint string +} + var _ collectortracepb.TraceServiceServer = (*mockTraceService)(nil) var _ collectormetricpb.MetricsServiceServer = (*mockMetricService)(nil) @@ -192,13 +210,17 @@ func runMockCollector(t *testing.T) *mockCollector { } func runMockCollectorAtEndpoint(t *testing.T, endpoint string) *mockCollector { - ln, err := net.Listen("tcp", endpoint) + return runMockCollectorWithConfig(t, &mockConfig{endpoint: endpoint}) +} + +func runMockCollectorWithConfig(t *testing.T, mockConfig *mockConfig) *mockCollector { + ln, err := net.Listen("tcp", mockConfig.endpoint) if err != nil { t.Fatalf("Failed to get an endpoint: %v", err) } srv := grpc.NewServer() - mc := makeMockCollector(t) + mc := makeMockCollector(t, mockConfig) collectortracepb.RegisterTraceServiceServer(srv, mc.traceSvc) collectormetricpb.RegisterMetricsServiceServer(srv, mc.metricSvc) mc.ln = newListener(ln) diff --git a/exporters/otlp/otlpgrpc/options.go b/exporters/otlp/otlpgrpc/options.go index dd7201f94a7..390bb276094 100644 --- a/exporters/otlp/otlpgrpc/options.go +++ b/exporters/otlp/otlpgrpc/options.go @@ -200,3 +200,12 @@ func WithTracesTimeout(duration time.Duration) Option { func WithMetricsTimeout(duration time.Duration) Option { return otlpconfig.WithMetricsTimeout(duration) } + +// WithRetry configures the retry policy for transient errors that may occurs when +// exporting metrics or traces. An exponential back-off algorithm is used to +// ensure endpoints are not overwhelmed with retries. If unset, the default +// retry policy will retry after 5 seconds and increase exponentially after each +// error for a total of 1 minute. +func WithRetry(settings otlp.RetrySettings) Option { + return otlpconfig.WithRetry(settings) +} diff --git a/exporters/otlp/otlpgrpc/otlp_integration_test.go b/exporters/otlp/otlpgrpc/otlp_integration_test.go index b83900c5062..4eb7306faa0 100644 --- a/exporters/otlp/otlpgrpc/otlp_integration_test.go +++ b/exporters/otlp/otlpgrpc/otlp_integration_test.go @@ -22,8 +22,10 @@ import ( "testing" "time" + "google.golang.org/genproto/googleapis/rpc/errdetails" "google.golang.org/grpc/codes" "google.golang.org/grpc/status" + "google.golang.org/protobuf/types/known/durationpb" "github.com/stretchr/testify/assert" "github.com/stretchr/testify/require" @@ -150,6 +152,7 @@ func TestNewExporter_collectorConnectionDiesThenReconnectsWhenInRestMode(t *test reconnectionPeriod := 20 * time.Millisecond ctx := context.Background() exp := newGRPCExporter(t, ctx, mc.endpoint, + otlpgrpc.WithRetry(otlp.RetrySettings{Enabled: false}), otlpgrpc.WithReconnectionPeriod(reconnectionPeriod)) defer func() { require.NoError(t, exp.Shutdown(ctx)) }() @@ -200,12 +203,280 @@ func TestNewExporter_collectorConnectionDiesThenReconnectsWhenInRestMode(t *test require.NoError(t, nmc.Stop()) } +func TestExporterExportFailureAndRecoveryModes(t *testing.T) { + tts := []struct { + name string + errors []error + rs otlp.RetrySettings + fn func(t *testing.T, ctx context.Context, exp *otlp.Exporter, mc *mockCollector) + opts []otlpgrpc.Option + }{ + { + name: "Do not retry if succeeded", + fn: func(t *testing.T, ctx context.Context, exp *otlp.Exporter, mc *mockCollector) { + require.NoError(t, exp.ExportSpans(ctx, []*sdktrace.SpanSnapshot{{Name: "Spans"}})) + + span := mc.getSpans() + + require.Len(t, span, 1) + require.Equal(t, 1, mc.traceSvc.requests, "trace service must receive 1 success request.") + }, + }, + { + name: "Do not retry if 'error' is ok", + errors: []error{ + status.Error(codes.OK, ""), + }, + fn: func(t *testing.T, ctx context.Context, exp *otlp.Exporter, mc *mockCollector) { + require.NoError(t, exp.ExportSpans(ctx, []*sdktrace.SpanSnapshot{{Name: "Spans"}})) + + span := mc.getSpans() + + require.Len(t, span, 0) + require.Equal(t, 1, mc.traceSvc.requests, "trace service must receive 1 error OK request.") + }, + }, + { + name: "Fail three times and succeed", + rs: otlp.RetrySettings{ + Enabled: true, + MaxElapsedTime: 300 * time.Millisecond, + InitialInterval: 2 * time.Millisecond, + MaxInterval: 10 * time.Millisecond, + }, + errors: []error{ + status.Error(codes.Unavailable, "backend under pressure"), + status.Error(codes.Unavailable, "backend under pressure"), + status.Error(codes.Unavailable, "backend under pressure"), + }, + fn: func(t *testing.T, ctx context.Context, exp *otlp.Exporter, mc *mockCollector) { + require.NoError(t, exp.ExportSpans(ctx, []*sdktrace.SpanSnapshot{{Name: "Spans"}})) + + span := mc.getSpans() + + require.Len(t, span, 1) + require.Equal(t, 4, mc.traceSvc.requests, "trace service must receive 3 failure requests and 1 success request.") + }, + }, + { + name: "Permanent error should not be retried", + rs: otlp.RetrySettings{ + Enabled: true, + MaxElapsedTime: 300 * time.Millisecond, + InitialInterval: 2 * time.Millisecond, + MaxInterval: 10 * time.Millisecond, + }, + errors: []error{ + status.Error(codes.InvalidArgument, "invalid arguments"), + }, + fn: func(t *testing.T, ctx context.Context, exp *otlp.Exporter, mc *mockCollector) { + require.Error(t, exp.ExportSpans(ctx, []*sdktrace.SpanSnapshot{{Name: "Spans"}})) + + span := mc.getSpans() + + require.Len(t, span, 0) + require.Equal(t, 1, mc.traceSvc.requests, "trace service must receive 1 error requests.") + }, + }, + { + name: "Test all transient errors and succeed", + rs: otlp.RetrySettings{ + Enabled: true, + MaxElapsedTime: 500 * time.Millisecond, + InitialInterval: 1 * time.Millisecond, + MaxInterval: 2 * time.Millisecond, + }, + errors: []error{ + status.Error(codes.Canceled, ""), + status.Error(codes.DeadlineExceeded, ""), + status.Error(codes.ResourceExhausted, ""), + status.Error(codes.Aborted, ""), + status.Error(codes.OutOfRange, ""), + status.Error(codes.Unavailable, ""), + status.Error(codes.DataLoss, ""), + }, + fn: func(t *testing.T, ctx context.Context, exp *otlp.Exporter, mc *mockCollector) { + require.NoError(t, exp.ExportSpans(ctx, []*sdktrace.SpanSnapshot{{Name: "Spans"}})) + + span := mc.getSpans() + + require.Len(t, span, 1) + require.Equal(t, 8, mc.traceSvc.requests, "trace service must receive 9 failure requests and 1 success request.") + }, + }, + { + name: "Retry should honor server throttling", + rs: otlp.RetrySettings{ + Enabled: true, + MaxElapsedTime: time.Minute, + InitialInterval: time.Nanosecond, + MaxInterval: time.Nanosecond, + }, + opts: []otlpgrpc.Option{ + otlpgrpc.WithTimeout(time.Millisecond * 100), + }, + errors: []error{ + newThrottlingError(codes.ResourceExhausted, time.Second*30), + }, + fn: func(t *testing.T, ctx context.Context, exp *otlp.Exporter, mc *mockCollector) { + err := exp.ExportSpans(ctx, []*sdktrace.SpanSnapshot{{Name: "Spans"}}) + require.Error(t, err) + require.Equal(t, "context deadline exceeded", err.Error()) + + span := mc.getSpans() + + require.Len(t, span, 0) + require.Equal(t, 1, mc.traceSvc.requests, "trace service must receive 1 failure requests and 1 success request.") + }, + }, + { + name: "Retry should fail if server throttling is higher than the MaxElapsedTime", + rs: otlp.RetrySettings{ + Enabled: true, + MaxElapsedTime: time.Millisecond * 100, + InitialInterval: time.Nanosecond, + MaxInterval: time.Nanosecond, + }, + errors: []error{ + newThrottlingError(codes.ResourceExhausted, time.Minute), + }, + fn: func(t *testing.T, ctx context.Context, exp *otlp.Exporter, mc *mockCollector) { + err := exp.ExportSpans(ctx, []*sdktrace.SpanSnapshot{{Name: "Spans"}}) + require.Error(t, err) + require.Equal(t, "max elapsed time expired when respecting server throttle: rpc error: code = ResourceExhausted desc = ", err.Error()) + + span := mc.getSpans() + + require.Len(t, span, 0) + require.Equal(t, 1, mc.traceSvc.requests, "trace service must receive 1 failure requests and 1 success request.") + }, + }, + { + name: "Retry stops if takes too long", + rs: otlp.RetrySettings{ + Enabled: true, + MaxElapsedTime: time.Millisecond * 100, + InitialInterval: time.Millisecond * 50, + MaxInterval: time.Millisecond * 50, + }, + errors: []error{ + status.Error(codes.Unavailable, "unavailable"), + status.Error(codes.Unavailable, "unavailable"), + status.Error(codes.Unavailable, "unavailable"), + status.Error(codes.Unavailable, "unavailable"), + status.Error(codes.Unavailable, "unavailable"), + status.Error(codes.Unavailable, "unavailable"), + }, + fn: func(t *testing.T, ctx context.Context, exp *otlp.Exporter, mc *mockCollector) { + err := exp.ExportSpans(ctx, []*sdktrace.SpanSnapshot{{Name: "Spans"}}) + require.Error(t, err) + + require.Equal(t, "max elapsed time expired: rpc error: code = Unavailable desc = unavailable", err.Error()) + + span := mc.getSpans() + + require.Len(t, span, 0) + require.LessOrEqual(t, 1, mc.traceSvc.requests, "trace service must receive at least 1 failure requests.") + }, + }, + { + name: "Disabled retry", + rs: otlp.RetrySettings{ + Enabled: false, + }, + errors: []error{ + status.Error(codes.Unavailable, "unavailable"), + }, + fn: func(t *testing.T, ctx context.Context, exp *otlp.Exporter, mc *mockCollector) { + err := exp.ExportSpans(ctx, []*sdktrace.SpanSnapshot{{Name: "Spans"}}) + require.Error(t, err) + + require.Equal(t, "rpc error: code = Unavailable desc = unavailable", err.Error()) + + span := mc.getSpans() + + require.Len(t, span, 0) + require.Equal(t, 1, mc.traceSvc.requests, "trace service must receive 1 failure requests.") + }, + }, + } + + for _, tt := range tts { + t.Run(tt.name, func(t *testing.T) { + ctx := context.Background() + + mc := runMockCollectorWithConfig(t, &mockConfig{ + errors: tt.errors, + }) + + opts := []otlpgrpc.Option{ + otlpgrpc.WithRetry(tt.rs), + } + + if len(tt.opts) != 0 { + opts = append(opts, tt.opts...) + } + + exp := newGRPCExporter(t, ctx, mc.endpoint, opts...) + + tt.fn(t, ctx, exp, mc) + + require.NoError(t, mc.Stop()) + require.NoError(t, exp.Shutdown(ctx)) + }) + } + +} + +func TestPermanentErrorsShouldNotBeRetried(t *testing.T) { + permanentErrors := []*status.Status{ + status.New(codes.Unknown, "Unknown"), + status.New(codes.InvalidArgument, "InvalidArgument"), + status.New(codes.NotFound, "NotFound"), + status.New(codes.AlreadyExists, "AlreadyExists"), + status.New(codes.FailedPrecondition, "FailedPrecondition"), + status.New(codes.Unimplemented, "Unimplemented"), + status.New(codes.Internal, "Internal"), + status.New(codes.PermissionDenied, ""), + status.New(codes.Unauthenticated, ""), + } + + for _, sts := range permanentErrors { + t.Run(sts.Code().String(), func(t *testing.T) { + ctx := context.Background() + + mc := runMockCollectorWithConfig(t, &mockConfig{ + errors: []error{sts.Err()}, + }) + + exp := newGRPCExporter(t, ctx, mc.endpoint) + + err := exp.ExportSpans(ctx, []*sdktrace.SpanSnapshot{{Name: "Spans"}}) + require.Error(t, err) + require.Len(t, mc.getSpans(), 0) + require.Equal(t, 1, mc.traceSvc.requests, "trace service must receive 1 permanent error requests.") + + require.NoError(t, mc.Stop()) + require.NoError(t, exp.Shutdown(ctx)) + }) + } +} + +func newThrottlingError(code codes.Code, duration time.Duration) error { + s := status.New(code, "") + + s, _ = s.WithDetails(&errdetails.RetryInfo{RetryDelay: durationpb.New(duration)}) + + return s.Err() +} + func TestNewExporter_collectorConnectionDiesThenReconnects(t *testing.T) { mc := runMockCollector(t) reconnectionPeriod := 50 * time.Millisecond ctx := context.Background() exp := newGRPCExporter(t, ctx, mc.endpoint, + otlpgrpc.WithRetry(otlp.RetrySettings{Enabled: false}), otlpgrpc.WithReconnectionPeriod(reconnectionPeriod)) defer func() { require.NoError(t, exp.Shutdown(ctx)) }() @@ -367,7 +638,7 @@ func TestNewExporter_WithTimeout(t *testing.T) { }() ctx := context.Background() - exp := newGRPCExporter(t, ctx, mc.endpoint, otlpgrpc.WithTimeout(tt.timeout)) + exp := newGRPCExporter(t, ctx, mc.endpoint, otlpgrpc.WithTimeout(tt.timeout), otlpgrpc.WithRetry(otlp.RetrySettings{Enabled: false})) defer func() { _ = exp.Shutdown(ctx) }() @@ -406,6 +677,7 @@ func TestNewExporter_withInvalidSecurityConfiguration(t *testing.T) { expectedErr := fmt.Sprintf("traces exporter is disconnected from the server %s: grpc: no transport security set (use grpc.WithInsecure() explicitly or set credentials)", mc.endpoint) + require.Error(t, err) require.Equal(t, expectedErr, err.Error()) defer func() { diff --git a/exporters/otlp/retry.go b/exporters/otlp/retry.go new file mode 100644 index 00000000000..65c918ec29e --- /dev/null +++ b/exporters/otlp/retry.go @@ -0,0 +1,34 @@ +// Copyright The OpenTelemetry Authors +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package otlp + +import ( + "time" +) + +// RetrySettings defines configuration for retrying batches in case of export failure +// using an exponential backoff. +type RetrySettings struct { + // Enabled indicates whether to not retry sending batches in case of export failure. + Enabled bool + // InitialInterval the time to wait after the first failure before retrying. + InitialInterval time.Duration + // MaxInterval is the upper bound on backoff interval. Once this value is reached the delay between + // consecutive retries will always be `MaxInterval`. + MaxInterval time.Duration + // MaxElapsedTime is the maximum amount of time (including retries) spent trying to send a request/batch. + // Once this value is reached, the data is discarded. + MaxElapsedTime time.Duration +} diff --git a/exporters/stdout/trace_test.go b/exporters/stdout/trace_test.go index 1e5378aee2c..8fa64feff58 100644 --- a/exporters/stdout/trace_test.go +++ b/exporters/stdout/trace_test.go @@ -60,7 +60,7 @@ func TestExporter_ExportSpan(t *testing.T) { attribute.String("key", keyValue), attribute.Float64("double", doubleValue), }, - MessageEvents: []trace.Event{ + MessageEvents: []tracesdk.Event{ {Name: "foo", Attributes: []attribute.KeyValue{attribute.String("key", keyValue)}, Time: now}, {Name: "bar", Attributes: []attribute.KeyValue{attribute.Float64("double", doubleValue)}, Time: now}, }, diff --git a/exporters/trace/jaeger/jaeger_test.go b/exporters/trace/jaeger/jaeger_test.go index 1889b78ae0b..cb9a4549593 100644 --- a/exporters/trace/jaeger/jaeger_test.go +++ b/exporters/trace/jaeger/jaeger_test.go @@ -279,7 +279,7 @@ func Test_spanSnapshotToThrift(t *testing.T) { attribute.Float64("double", doubleValue), attribute.Int64("int", intValue), }, - MessageEvents: []trace.Event{ + MessageEvents: []sdktrace.Event{ { Name: eventNameValue, Attributes: []attribute.KeyValue{attribute.String("k1", keyValue)}, diff --git a/exporters/trace/zipkin/model.go b/exporters/trace/zipkin/model.go index bcf7558b17b..74a3ffc71a6 100644 --- a/exporters/trace/zipkin/model.go +++ b/exporters/trace/zipkin/model.go @@ -136,7 +136,7 @@ func toZipkinKind(kind trace.SpanKind) zkmodel.Kind { return zkmodel.Undetermined } -func toZipkinAnnotations(events []trace.Event) []zkmodel.Annotation { +func toZipkinAnnotations(events []tracesdk.Event) []zkmodel.Annotation { if len(events) == 0 { return nil } diff --git a/exporters/trace/zipkin/model_test.go b/exporters/trace/zipkin/model_test.go index 6e1e7789ffb..ab0979a0554 100644 --- a/exporters/trace/zipkin/model_test.go +++ b/exporters/trace/zipkin/model_test.go @@ -60,7 +60,7 @@ func TestModelConversion(t *testing.T) { attribute.String("attr2", "bar"), attribute.Array("attr3", []int{0, 1, 2}), }, - MessageEvents: []trace.Event{ + MessageEvents: []tracesdk.Event{ { Time: time.Date(2020, time.March, 11, 19, 24, 30, 0, time.UTC), Name: "ev1", @@ -93,7 +93,7 @@ func TestModelConversion(t *testing.T) { attribute.Int64("attr1", 42), attribute.String("attr2", "bar"), }, - MessageEvents: []trace.Event{ + MessageEvents: []tracesdk.Event{ { Time: time.Date(2020, time.March, 11, 19, 24, 30, 0, time.UTC), Name: "ev1", @@ -129,7 +129,7 @@ func TestModelConversion(t *testing.T) { attribute.Int64("attr1", 42), attribute.String("attr2", "bar"), }, - MessageEvents: []trace.Event{ + MessageEvents: []tracesdk.Event{ { Time: time.Date(2020, time.March, 11, 19, 24, 30, 0, time.UTC), Name: "ev1", @@ -165,7 +165,7 @@ func TestModelConversion(t *testing.T) { attribute.Int64("attr1", 42), attribute.String("attr2", "bar"), }, - MessageEvents: []trace.Event{ + MessageEvents: []tracesdk.Event{ { Time: time.Date(2020, time.March, 11, 19, 24, 30, 0, time.UTC), Name: "ev1", @@ -204,7 +204,7 @@ func TestModelConversion(t *testing.T) { attribute.String("net.peer.ip", "1.2.3.4"), attribute.Int64("net.peer.port", 9876), }, - MessageEvents: []trace.Event{ + MessageEvents: []tracesdk.Event{ { Time: time.Date(2020, time.March, 11, 19, 24, 30, 0, time.UTC), Name: "ev1", @@ -240,7 +240,7 @@ func TestModelConversion(t *testing.T) { attribute.Int64("attr1", 42), attribute.String("attr2", "bar"), }, - MessageEvents: []trace.Event{ + MessageEvents: []tracesdk.Event{ { Time: time.Date(2020, time.March, 11, 19, 24, 30, 0, time.UTC), Name: "ev1", @@ -276,7 +276,7 @@ func TestModelConversion(t *testing.T) { attribute.Int64("attr1", 42), attribute.String("attr2", "bar"), }, - MessageEvents: []trace.Event{ + MessageEvents: []tracesdk.Event{ { Time: time.Date(2020, time.March, 11, 19, 24, 30, 0, time.UTC), Name: "ev1", @@ -334,7 +334,7 @@ func TestModelConversion(t *testing.T) { Attributes: []attribute.KeyValue{ attribute.String("error", "false"), }, - MessageEvents: []trace.Event{ + MessageEvents: []tracesdk.Event{ { Time: time.Date(2020, time.March, 11, 19, 24, 30, 0, time.UTC), Name: "ev1", diff --git a/oteltest/provider.go b/oteltest/provider.go index babba528e12..cf6aea17880 100644 --- a/oteltest/provider.go +++ b/oteltest/provider.go @@ -67,7 +67,7 @@ func (p *TracerProvider) Tracer(instName string, opts ...trace.TracerOption) tra return t } -// DefaulTracer returns a default tracer for testing purposes. +// DefaultTracer returns a default tracer for testing purposes. func DefaultTracer() trace.Tracer { return NewTracerProvider().Tracer("") } diff --git a/sdk/resource/resource.go b/sdk/resource/resource.go index c64baaa6e9f..baf9f2c977e 100644 --- a/sdk/resource/resource.go +++ b/sdk/resource/resource.go @@ -159,7 +159,7 @@ func Default() *Resource { func Environment() *Resource { detector := &fromEnv{} resource, err := detector.Detect(context.Background()) - if err == nil { + if err != nil { otel.Handle(err) } return resource diff --git a/sdk/trace/batch_span_processor.go b/sdk/trace/batch_span_processor.go index f63aa7a940f..6687839ee43 100644 --- a/sdk/trace/batch_span_processor.go +++ b/sdk/trace/batch_span_processor.go @@ -156,16 +156,14 @@ func (bsp *batchSpanProcessor) Shutdown(ctx context.Context) error { func (bsp *batchSpanProcessor) ForceFlush(ctx context.Context) error { var err error if bsp.e != nil { - wait := make(chan struct{}) + wait := make(chan error) go func() { - if err := bsp.exportSpans(ctx); err != nil { - otel.Handle(err) - } + wait <- bsp.exportSpans(ctx) close(wait) }() // Wait until the export is finished or the context is cancelled/timed out select { - case <-wait: + case err = <-wait: case <-ctx.Done(): err = ctx.Err() } @@ -216,11 +214,18 @@ func (bsp *batchSpanProcessor) exportSpans(ctx context.Context) error { defer cancel() } - if len(bsp.batch) > 0 { - if err := bsp.e.ExportSpans(ctx, bsp.batch); err != nil { + if l := len(bsp.batch); l > 0 { + err := bsp.e.ExportSpans(ctx, bsp.batch) + + // A new batch is always created after exporting, even if the batch failed to be exported. + // + // It is up to the exporter to implement any type of retry logic if a batch is failing + // to be exported, since it is specific to the protocol and backend being sent to. + bsp.batch = bsp.batch[:0] + + if err != nil { return err } - bsp.batch = bsp.batch[:0] } return nil } @@ -244,7 +249,7 @@ func (bsp *batchSpanProcessor) processQueue() { case sd := <-bsp.queue: bsp.batchMutex.Lock() bsp.batch = append(bsp.batch, sd) - shouldExport := len(bsp.batch) == bsp.o.MaxExportBatchSize + shouldExport := len(bsp.batch) >= bsp.o.MaxExportBatchSize bsp.batchMutex.Unlock() if shouldExport { if !bsp.timer.Stop() { diff --git a/sdk/trace/batch_span_processor_test.go b/sdk/trace/batch_span_processor_test.go index db7ead846a2..0cea6e29468 100644 --- a/sdk/trace/batch_span_processor_test.go +++ b/sdk/trace/batch_span_processor_test.go @@ -37,6 +37,9 @@ type testBatchExporter struct { batchCount int shutdownCount int delay time.Duration + errors []error + droppedCount int + idx int err error } @@ -44,6 +47,13 @@ func (t *testBatchExporter) ExportSpans(ctx context.Context, ss []*sdktrace.Span t.mu.Lock() defer t.mu.Unlock() + if t.idx < len(t.errors) { + t.droppedCount += len(ss) + err := t.errors[t.idx] + t.idx++ + return err + } + time.Sleep(t.delay) select { @@ -338,12 +348,8 @@ func TestBatchSpanProcessorForceFlushSucceeds(t *testing.T) { // Force flush any held span batches err := ssp.ForceFlush(context.Background()) - gotNumOfSpans := te.len() - spanDifference := option.wantNumSpans - gotNumOfSpans - if spanDifference > 10 || spanDifference < 0 { - t.Errorf("number of exported span not equal to or within 10 less than: got %+v, want %+v\n", - gotNumOfSpans, option.wantNumSpans) - } + assertMaxSpanDiff(t, te.len(), option.wantNumSpans, 10) + gotBatchCount := te.getBatchCount() if gotBatchCount < option.wantBatchCount { t.Errorf("number batches: got %+v, want >= %+v\n", @@ -353,31 +359,92 @@ func TestBatchSpanProcessorForceFlushSucceeds(t *testing.T) { assert.NoError(t, err) } +func TestBatchSpanProcessorDropBatchIfFailed(t *testing.T) { + te := testBatchExporter{ + errors: []error{errors.New("fail to export")}, + } + tp := basicTracerProvider(t) + option := testOption{ + o: []sdktrace.BatchSpanProcessorOption{ + sdktrace.WithMaxQueueSize(0), + sdktrace.WithMaxExportBatchSize(2000), + }, + wantNumSpans: 1000, + wantBatchCount: 1, + genNumSpans: 1000, + } + ssp := createAndRegisterBatchSP(option, &te) + if ssp == nil { + t.Fatalf("%s: Error creating new instance of BatchSpanProcessor\n", option.name) + } + tp.RegisterSpanProcessor(ssp) + tr := tp.Tracer("BatchSpanProcessorWithOption") + generateSpan(t, option.parallel, tr, option) + + // Force flush any held span batches + err := ssp.ForceFlush(context.Background()) + assert.Error(t, err) + assert.EqualError(t, err, "fail to export") + + // First flush will fail, nothing should be exported. + assertMaxSpanDiff(t, te.droppedCount, option.wantNumSpans, 10) + assert.Equal(t, 0, te.len()) + assert.Equal(t, 0, te.getBatchCount()) + + // Generate a new batch, this will succeed + generateSpan(t, option.parallel, tr, option) + + // Force flush any held span batches + err = ssp.ForceFlush(context.Background()) + assert.NoError(t, err) + + assertMaxSpanDiff(t, te.len(), option.wantNumSpans, 10) + gotBatchCount := te.getBatchCount() + if gotBatchCount < option.wantBatchCount { + t.Errorf("number batches: got %+v, want >= %+v\n", + gotBatchCount, option.wantBatchCount) + t.Errorf("Batches %v\n", te.sizes) + } +} + +func assertMaxSpanDiff(t *testing.T, want, got, maxDif int) { + spanDifference := want - got + if spanDifference < 0 { + spanDifference = spanDifference * -1 + } + if spanDifference > maxDif { + t.Errorf("number of exported span not equal to or within %d less than: got %+v, want %+v\n", + maxDif, got, want) + } +} + +type indefiniteExporter struct{} + +func (indefiniteExporter) Shutdown(context.Context) error { return nil } +func (indefiniteExporter) ExportSpans(ctx context.Context, _ []*sdktrace.SpanSnapshot) error { + <-ctx.Done() + return ctx.Err() +} + func TestBatchSpanProcessorForceFlushTimeout(t *testing.T) { - var bp testBatchExporter - bsp := sdktrace.NewBatchSpanProcessor(&bp) // Add timeout to context to test deadline ctx, cancel := context.WithTimeout(context.Background(), time.Nanosecond) defer cancel() <-ctx.Done() - if err := bsp.ForceFlush(ctx); err == nil { - t.Error("expected context DeadlineExceeded error, got nil") - } else if !errors.Is(err, context.DeadlineExceeded) { - t.Errorf("expected context DeadlineExceeded error, got %v", err) + bsp := sdktrace.NewBatchSpanProcessor(indefiniteExporter{}) + if got, want := bsp.ForceFlush(ctx), context.DeadlineExceeded; !errors.Is(got, want) { + t.Errorf("expected %q error, got %v", want, got) } } func TestBatchSpanProcessorForceFlushCancellation(t *testing.T) { - var bp testBatchExporter - bsp := sdktrace.NewBatchSpanProcessor(&bp) ctx, cancel := context.WithCancel(context.Background()) // Cancel the context cancel() - if err := bsp.ForceFlush(ctx); err == nil { - t.Error("expected context canceled error, got nil") - } else if !errors.Is(err, context.Canceled) { - t.Errorf("expected context canceled error, got %v", err) + bsp := sdktrace.NewBatchSpanProcessor(indefiniteExporter{}) + if got, want := bsp.ForceFlush(ctx), context.Canceled; !errors.Is(got, want) { + t.Errorf("expected %q error, got %v", want, got) } } diff --git a/sdk/trace/event.go b/sdk/trace/event.go new file mode 100644 index 00000000000..4eb556c53be --- /dev/null +++ b/sdk/trace/event.go @@ -0,0 +1,37 @@ +// Copyright The OpenTelemetry Authors +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package trace + +import ( + "time" + + "go.opentelemetry.io/otel/attribute" +) + +// Event is a thing that happened during a Span's lifetime. +type Event struct { + // Name is the name of this event + Name string + + // Attributes describe the aspects of the event. + Attributes []attribute.KeyValue + + // DroppedAttributeCount is the number of attributes that were not + // recorded due to configured limits being reached. + DroppedAttributeCount int + + // Time at which this event was recorded. + Time time.Time +} diff --git a/sdk/trace/provider.go b/sdk/trace/provider.go index 601c239c0e0..83e1fe5e8d3 100644 --- a/sdk/trace/provider.go +++ b/sdk/trace/provider.go @@ -236,6 +236,12 @@ func (p *TracerProvider) Shutdown(ctx context.Context) error { // WithSyncer registers the exporter with the TracerProvider using a // SimpleSpanProcessor. +// +// This is not recommended for production use. The synchronous nature of the +// SimpleSpanProcessor that will wrap the exporter make it good for testing, +// debugging, or showing examples of other feature, but it will be slow and +// have a high computation resource usage overhead. The WithBatcher option is +// recommended for production use instead. func WithSyncer(e SpanExporter) TracerProviderOption { return WithSpanProcessor(NewSimpleSpanProcessor(e)) } diff --git a/sdk/trace/simple_span_processor.go b/sdk/trace/simple_span_processor.go index b66a87a2a3f..5b935e6a6af 100644 --- a/sdk/trace/simple_span_processor.go +++ b/sdk/trace/simple_span_processor.go @@ -33,6 +33,12 @@ var _ SpanProcessor = (*simpleSpanProcessor)(nil) // NewSimpleSpanProcessor returns a new SpanProcessor that will synchronously // send completed spans to the exporter immediately. +// +// This SpanProcessor is not recommended for production use. The synchronous +// nature of this SpanProcessor make it good for testing, debugging, or +// showing examples of other feature, but it will be slow and have a high +// computation resource usage overhead. The BatchSpanProcessor is recommended +// for production use instead. func NewSimpleSpanProcessor(exporter SpanExporter) SpanProcessor { ssp := &simpleSpanProcessor{ exporter: exporter, @@ -60,16 +66,34 @@ func (ssp *simpleSpanProcessor) OnEnd(s ReadOnlySpan) { func (ssp *simpleSpanProcessor) Shutdown(ctx context.Context) error { var err error ssp.stopOnce.Do(func() { + stopFunc := func(exp SpanExporter) (<-chan error, func()) { + done := make(chan error) + return done, func() { done <- exp.Shutdown(ctx) } + } + + // The exporter field of the simpleSpanProcessor needs to be zeroed to + // signal it is shut down, meaning all subsequent calls to OnEnd will + // be gracefully ignored. This needs to be done synchronously to avoid + // any race condition. + // + // A closure is used to keep reference to the exporter and then the + // field is zeroed. This ensures the simpleSpanProcessor is shut down + // before the exporter. This order is important as it avoids a + // potential deadlock. If the exporter shut down operation generates a + // span, that span would need to be exported. Meaning, OnEnd would be + // called and try acquiring the lock that is held here. ssp.exporterMu.Lock() - exporter := ssp.exporter - // Set exporter to nil so subsequent calls to OnEnd are ignored - // gracefully. + done, shutdown := stopFunc(ssp.exporter) ssp.exporter = nil ssp.exporterMu.Unlock() - // Clear the ssp.exporter prior to shutting it down so if that creates - // a span that needs to be exported there is no deadlock. - err = exporter.Shutdown(ctx) + go shutdown() + + select { + case err = <-done: + case <-ctx.Done(): + err = ctx.Err() + } }) return err } diff --git a/sdk/trace/simple_span_processor_test.go b/sdk/trace/simple_span_processor_test.go index 34ad9356d55..bfdb1eecb9d 100644 --- a/sdk/trace/simple_span_processor_test.go +++ b/sdk/trace/simple_span_processor_test.go @@ -16,7 +16,9 @@ package trace_test import ( "context" + "errors" "testing" + "time" "go.opentelemetry.io/otel/trace" @@ -142,3 +144,24 @@ func TestSimpleSpanProcessorShutdownOnEndConcurrency(t *testing.T) { stop <- struct{}{} <-done } + +func TestSimpleSpanProcessorShutdownHonorsContextDeadline(t *testing.T) { + ctx, cancel := context.WithTimeout(context.Background(), time.Nanosecond) + defer cancel() + <-ctx.Done() + + ssp := sdktrace.NewSimpleSpanProcessor(&testExporter{}) + if got, want := ssp.Shutdown(ctx), context.DeadlineExceeded; !errors.Is(got, want) { + t.Errorf("SimpleSpanProcessor.Shutdown did not return %v, got %v", want, got) + } +} + +func TestSimpleSpanProcessorShutdownHonorsContextCancel(t *testing.T) { + ctx, cancel := context.WithCancel(context.Background()) + cancel() + + ssp := sdktrace.NewSimpleSpanProcessor(&testExporter{}) + if got, want := ssp.Shutdown(ctx), context.Canceled; !errors.Is(got, want) { + t.Errorf("SimpleSpanProcessor.Shutdown did not return %v, got %v", want, got) + } +} diff --git a/sdk/trace/span.go b/sdk/trace/span.go index f13967d2b59..b7f5296cddf 100644 --- a/sdk/trace/span.go +++ b/sdk/trace/span.go @@ -45,7 +45,7 @@ type ReadOnlySpan interface { EndTime() time.Time Attributes() []attribute.KeyValue Links() []trace.Link - Events() []trace.Event + Events() []Event StatusCode() codes.Code StatusMessage() string Tracer() trace.Tracer @@ -295,7 +295,7 @@ func (s *span) addEvent(name string, o ...trace.EventOption) { s.mu.Lock() defer s.mu.Unlock() - s.messageEvents.add(trace.Event{ + s.messageEvents.add(Event{ Name: name, Attributes: c.Attributes, DroppedAttributeCount: discarded, @@ -372,11 +372,11 @@ func (s *span) Links() []trace.Link { } // Events returns the events of this span. -func (s *span) Events() []trace.Event { +func (s *span) Events() []Event { s.mu.Lock() defer s.mu.Unlock() if len(s.messageEvents.queue) == 0 { - return []trace.Event{} + return []Event{} } return s.interfaceArrayToMessageEventArray() } @@ -469,10 +469,10 @@ func (s *span) interfaceArrayToLinksArray() []trace.Link { return linkArr } -func (s *span) interfaceArrayToMessageEventArray() []trace.Event { - messageEventArr := make([]trace.Event, 0) +func (s *span) interfaceArrayToMessageEventArray() []Event { + messageEventArr := make([]Event, 0) for _, value := range s.messageEvents.queue { - messageEventArr = append(messageEventArr, value.(trace.Event)) + messageEventArr = append(messageEventArr, value.(Event)) } return messageEventArr } @@ -595,7 +595,7 @@ type SpanSnapshot struct { // from StartTime by the duration of the span. EndTime time.Time Attributes []attribute.KeyValue - MessageEvents []trace.Event + MessageEvents []Event Links []trace.Link StatusCode codes.Code StatusMessage string diff --git a/sdk/trace/trace_test.go b/sdk/trace/trace_test.go index efffde19366..6495215f116 100644 --- a/sdk/trace/trace_test.go +++ b/sdk/trace/trace_test.go @@ -559,7 +559,7 @@ func TestEvents(t *testing.T) { }), Parent: sc.WithRemote(true), Name: "span0", - MessageEvents: []trace.Event{ + MessageEvents: []Event{ {Name: "foo", Attributes: []attribute.KeyValue{k1v1}}, {Name: "bar", Attributes: []attribute.KeyValue{k2v2, k3v3}}, }, @@ -608,7 +608,7 @@ func TestEventsOverLimit(t *testing.T) { }), Parent: sc.WithRemote(true), Name: "span0", - MessageEvents: []trace.Event{ + MessageEvents: []Event{ {Name: "foo", Attributes: []attribute.KeyValue{k1v1}}, {Name: "bar", Attributes: []attribute.KeyValue{k2v2, k3v3}}, }, @@ -781,7 +781,7 @@ func TestSetSpanStatusWithoutMessageWhenStatusIsNotError(t *testing.T) { func cmpDiff(x, y interface{}) string { return cmp.Diff(x, y, cmp.AllowUnexported(attribute.Value{}), - cmp.AllowUnexported(trace.Event{}), + cmp.AllowUnexported(Event{}), cmp.AllowUnexported(trace.TraceState{})) } @@ -1119,7 +1119,7 @@ func TestRecordError(t *testing.T) { Name: "span0", StatusCode: codes.Unset, SpanKind: trace.SpanKindInternal, - MessageEvents: []trace.Event{ + MessageEvents: []Event{ { Name: semconv.ExceptionEventName, Time: errTime, @@ -1489,7 +1489,7 @@ func TestAddEventsWithMoreAttributesThanLimit(t *testing.T) { Parent: sc.WithRemote(true), Name: "span0", Attributes: nil, - MessageEvents: []trace.Event{ + MessageEvents: []Event{ { Name: "test1", Attributes: []attribute.KeyValue{ diff --git a/trace/trace.go b/trace/trace.go index d372e7d9d72..fe8d7458922 100644 --- a/trace/trace.go +++ b/trace/trace.go @@ -21,7 +21,6 @@ import ( "encoding/json" "regexp" "strings" - "time" "go.opentelemetry.io/otel/attribute" "go.opentelemetry.io/otel/codes" @@ -539,22 +538,6 @@ type Span interface { SetAttributes(kv ...attribute.KeyValue) } -// Event is a thing that happened during a Span's lifetime. -type Event struct { - // Name is the name of this event - Name string - - // Attributes describe the aspects of the event. - Attributes []attribute.KeyValue - - // DroppedAttributeCount is the number of attributes that were not - // recorded due to configured limits being reached. - DroppedAttributeCount int - - // Time at which this event was recorded. - Time time.Time -} - // Link is the relationship between two Spans. The relationship can be within // the same Trace or across different Traces. // diff --git a/website_docs/exporting_data.md b/website_docs/exporting_data.md index 8b468eed8c1..cf64e161162 100644 --- a/website_docs/exporting_data.md +++ b/website_docs/exporting_data.md @@ -13,7 +13,7 @@ A sampler needs to be set on the tracer provider when its configured, as follows ```go provider := sdktrace.NewTracerProvider( - sdktrace.WithSampler(sdktrace.AlwaysSample()), + sdktrace.WithSampler(sdktrace.AlwaysSample()), ) ``` @@ -34,14 +34,14 @@ Resources should be assigned to a tracer provider at its initialization, and are ```go resources := resource.New( - attribute.String("service.name", "myService"), - attribute.String("service.version", "1.0.0"), - attribute.String("instance.id", "abcdef12345"), + attribute.String("service.name", "myService"), + attribute.String("service.version", "1.0.0"), + attribute.String("instance.id", "abcdef12345"), ) provider := sdktrace.NewTracerProvider( - ... - sdktrace.WithResources(resources), + ... + sdktrace.WithResources(resources), ) ``` diff --git a/website_docs/instrumentation.md b/website_docs/instrumentation.md index 756779f338a..93532535c08 100644 --- a/website_docs/instrumentation.md +++ b/website_docs/instrumentation.md @@ -21,20 +21,20 @@ In Go, the `context` package is used to store the active span. When you start a ```go func parentFunction() { - ctx := context.Background() - var parentSpan trace.Span - ctx, parentSpan = tracer.Start(ctx, "parent") - defer parentSpan.End() - // call our child function - childFunction(ctx) - // do more work, when this function ends, parentSpan will complete. + ctx := context.Background() + var parentSpan trace.Span + ctx, parentSpan = tracer.Start(ctx, "parent") + defer parentSpan.End() + // call our child function + childFunction(ctx) + // do more work, when this function ends, parentSpan will complete. } func childFunction(ctx context.Context) { - var childSpan trace.Span - ctx, childSpan = tracer.Start(ctx, "child") - defer childSpan.End() - // do work here, when this function returns, childSpan will complete. + var childSpan trace.Span + ctx, childSpan = tracer.Start(ctx, "child") + defer childSpan.End() + // do work here, when this function returns, childSpan will complete. } ```