
[exporter/awss3] add marshaller for Sumo Logic Installed Collector format #4

Closed
wants to merge 474 commits

Conversation

kasia-kujawa
Owner

Description:

Link to tracking Issue:

Testing:

Documentation:

@kasia-kujawa kasia-kujawa force-pushed the awss3exporter-add-sumo-logs-marshaller branch from c77d701 to 042f4fb Compare June 14, 2023 11:39
atoulme and others added 28 commits July 12, 2023 13:32
Deprecated the component. 
Closes open-telemetry#19737
---------

Signed-off-by: Juraci Paixão Kröhling <[email protected]>
…pen-telemetry#24016)

**Description:** This PR changes the format used for the
`k8s.pod.start_time` value. Previously the
[.String()](https://pkg.go.dev/time#Time.String) method was used but
documentation for that says it should only be used for debugging.
Instead use [.MarshalText()](https://pkg.go.dev/time#Time.MarshalText)
which formats in RFC3339.

I have listed this as a breaking change because it is possible that end
users are making assertions on the format of this timestamp value.

Timestamp output:
before: `2023-07-10 12:34:39.740638 -0700 PDT m=+0.020184946`
after `2023-07-10T12:39:53.112485-07:00`
**Link to tracking Issue:** n/a

**Testing:** Updated unit tests.

**Documentation:** Add blurb at bottom of readme
…nces (open-telemetry#23279)

When an ECS task is in the Provisioning/Pending state, it can contain
containers that do not yet have an EC2 instance. Such containers have a
`nil` instance ARN.

This change fixes service discovery error:
```error [email protected]/error.go:77 attachContainerInstance failed:
describe container instanced failed offset=0: ecs.DescribeContainerInstance
failed: InvalidParameterException: Container instance can not be blank.
{"kind": "extension", "name": "ecs_observer", "ErrScope": "Unknown"}
```


**Testing:** Related unit test is added.
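The shape of the fix can be sketched in standalone Go; the type and function names here are hypothetical, not the extension's actual code:

```go
package main

import "fmt"

// container mirrors the relevant part of an ECS task description.
type container struct {
	ContainerInstanceArn *string
}

// instanceARNs skips containers whose instance ARN has not been assigned
// yet (tasks still Provisioning/Pending), so an empty ARN is never passed
// on to ecs.DescribeContainerInstances.
func instanceARNs(containers []container) []string {
	arns := make([]string, 0, len(containers))
	for _, c := range containers {
		if c.ContainerInstanceArn == nil || *c.ContainerInstanceArn == "" {
			continue
		}
		arns = append(arns, *c.ContainerInstanceArn)
	}
	return arns
}

func main() {
	arn := "arn:aws:ecs:us-east-1:123456789012:container-instance/example"
	fmt.Println(instanceARNs([]container{{nil}, {&arn}})) // only the non-nil ARN survives
}
```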
…f the client side aggregation code (open-telemetry#23881)

This pull request updates the trace exporter code and gets rid of the
client-side aggregation / populating of the `services`, `span_count` and
`error_count` attributes.

This client-side aggregation is no longer needed since it now happens
on the server side.

It's worth noting that this client side aggregation was meant as a quick
short term workaround / hack early in the development cycle (special
thanks to @martin-majlis-s1 for implementing this workaround so quickly
on the client side).

It was never meant as a long-term solution since it had multiple edge
cases and problems associated with it - it would only really work
correctly / as expected when all the spans belonging to the same trace
are part of the same batch received by the plugin. And of course that
won't be true in many scenarios - different spans are routed to
different collectors, long-running spans are not part of the same batch,
there are no guarantees about how batches are assembled, etc.

As part of this change, `traces.aggregate` and `traces.max_wait` config
options have also been made obsolete / redundant.

Those config options have also been removed - since the plugin is still
in alpha (some breaking changes are expected) we still have the luxury
of being able to do that.

Removing this code also makes the plugin code much simpler - and who
doesn't like removing code and simplifying things (fewer things to
maintain, fewer edge cases to worry about, etc.) :)

---------

Co-authored-by: Martin Majlis <[email protected]>
…24048)

**Description:**

Breaking change! Allows enabling / disabling Exemplars.

**Link to tracking Issue:**

open-telemetry#23872
**Testing:**

- Added unit test
**Documentation:**
- Added docs

---------

Co-authored-by: Albert <[email protected]>
Co-authored-by: Pablo Baeyens <[email protected]>
…lemetry#24182)

Change k8s.job metrics to use mdatagen.

**Link to tracking Issue:**
open-telemetry#4367
**Description:**
Hi all,

This enables the processor to perform context-based routing for payloads
that are received on the HTTP server of the OTLP receiver. It defaults
to the original gRPC metadata extraction, but if it cannot extract the
gRPC metadata, it attempts to extract it from client.Info. Currently the
routing processor always uses the default route if the payload was
received through the HTTP server.

**Link to tracking Issue:**
resolves open-telemetry#20913 
**Testing:**
Added test cases for traces, metrics and logs that cover context-based
routing when the metadata is in client.Info.
**Documentation:**
…-dependent processors (open-telemetry#23870) (open-telemetry#24246)

**Description:** Makes clear that `tail_sampling` must be placed after
`k8sattributes` and other processors that clobber context. Please see
issue below for why this is an easy trap to fall into.

**Link to tracking Issue:** open-telemetry#23870
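A minimal pipeline sketch of the ordering the readme now calls out (component and exporter names are assumed for illustration):

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      # k8sattributes must run before tail_sampling; processors that clobber
      # context break attribute enrichment when placed after the sampler.
      processors: [k8sattributes, tail_sampling]
      exporters: [otlp]
```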
…pen-telemetry#24239)

**Description:**

Do not return empty host ID

**Link to tracking Issue:** open-telemetry#24230

**Testing:** Unit tests

**Documentation:** N/A

---------

Signed-off-by: Dominik Rosiek <[email protected]>
Co-authored-by: Pablo Baeyens <[email protected]>
This change improves the collector.yaml example to contain information
about container tags and metrics.
…ad definitions for host metadata and gohai (open-telemetry#24267)

**Description:** 

Host metadata and gohai payloads were moved to a new `pkg/inframetadata`
Go module. This PR is just an internal refactor using the new payloads.
Description: Allows time comparison by enabling boolean behavior for
time objects.

Link to tracking Issue: Closes open-telemetry#22008

Testing: Unit tests

Documentation:

---------

Co-authored-by: Tyler Helmuth <[email protected]>
**Description:**

Follow up from
open-telemetry#23840
to report all errors that occur during parsing. The error isn't pretty,
but this way users know all syntax errors in their OTTL statements up
front.

I played around with trying to format this to make it more readable, but
wasn't able to come up with anything that was satisfactory.

Example config and resulting error:

```yaml
processors:
  transform:
    trace_statements:
      - context: span
        statements: 
        - ParseJSON("...")
        - false
        - set[[]]
    metric_statements:
      - context: datapoint
        statements: 
        - set(attributes, attributes)
        - true
        - set(
```

```
Error: invalid configuration: processors::transform: unable to parse OTTL statement "ParseJSON(\"...\")": editor names must start with a lowercase letter but got 'ParseJSON'; unable to parse OTTL statement "0": statement has invalid syntax: 1:1: unexpected token "0"; unable to parse OTTL statement "set[[]]": statement has invalid syntax: 1:4: unexpected token "[" (expected "(" (Value ("," Value)*)? ")" Key*); unable to parse OTTL statement "1": statement has invalid syntax: 1:1: unexpected token "1"; unable to parse OTTL statement "set(": statement has invalid syntax: 1:5: unexpected token "<EOF>" (expected ")" Key*)
```

I aggregated the errors both in the `Config` struct and on processor
startup, though realistically we are highly unlikely to get an error
from the parser on startup because we've already validated the
statements. I updated it anyway so we're consistent between validating
the config and reporting errors from the OTTL parser.
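The aggregation pattern can be sketched with Go's standard `errors.Join`; the real OTTL parser is stood in for by a trivial check of the lowercase-first-letter rule from the error message above:

```go
package main

import (
	"errors"
	"fmt"
)

// parseOne stands in for the real OTTL statement parser; only the
// lowercase-first-letter rule is modeled here.
func parseOne(s string) error {
	if len(s) == 0 || s[0] < 'a' || s[0] > 'z' {
		return errors.New("editor names must start with a lowercase letter")
	}
	return nil
}

// parseStatements collects every parse error instead of stopping at the
// first one, so users see all syntax problems up front.
func parseStatements(stmts []string) error {
	var errs []error
	for _, s := range stmts {
		if err := parseOne(s); err != nil {
			errs = append(errs, fmt.Errorf("unable to parse OTTL statement %q: %w", s, err))
		}
	}
	return errors.Join(errs...) // nil when every statement parsed
}

func main() {
	err := parseStatements([]string{"set(a, 1)", "ParseJSON(...)", "False"})
	fmt.Println(err) // both failing statements are reported together
}
```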

---------

Co-authored-by: Evan Bradley <[email protected]>
…entries get committed (open-telemetry#24273)

See open-telemetry/opentelemetry-collector#8089
for origin.

When creating a chore PR, we don't check chloggen entries at all. If a
chore introduces a changelog entry, we should validate it.
…efault (open-telemetry#24010)

**Description:** 

Disable setting `host.id` by default on the `system` detector.
Users can restore the previous behavior by setting
`system::resource_attributes::host.id::enabled` to `true`.

**Link to tracking Issue:** Fixes open-telemetry#21233

**Testing:** Amended existing tests
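Assuming this is the `resourcedetection` processor's configuration surface, restoring the previous behavior would look roughly like:

```yaml
processors:
  resourcedetection:
    detectors: [system]
    system:
      resource_attributes:
        host.id:
          enabled: true   # restore the pre-change default
```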
…en-telemetry#24255)

**Description:** Call the correct function on shutdown

After we have changed the underlying library we haven't updated the
exporter code to call the correct function. This PR is fixing this.

**Link to tracking Issue:** open-telemetry#24253

**Testing:** 

1. Run the OpenTelemetry Collector
2. Terminate it - `docker kill --signal SIGTERM ...`
3. Observe the logs:
```

2023-07-13T11:30:54.191Z	info	otelcol/collector.go:236	Received signal from OS	{"signal": "terminated"}
2023-07-13T11:30:54.191Z	info	service/service.go:157	Starting shutdown...
2023-07-13T11:30:54.191Z	info	adapter/receiver.go:139	Stopping stanza receiver	{"kind": "receiver", "name": "filelog/log", "data_type": "logs"}
...
...
2023-07-13T11:31:00.917Z	info	client/client.go:387	Buffers' Queue Stats:	{"kind": "exporter", "data_type": "logs", "name": "dataset/logs", "processed": 32, "enqueued": 32, "dropped": 0, "waiting": 0, "processingS": 24.727153498}
2023-07-13T11:31:00.919Z	info	client/client.go:399	Events' Queue Stats:	{"kind": "exporter", "data_type": "logs", "name": "dataset/logs", "processed": 200005, "enqueued": 200005, "waiting": 0, "processingS": 24.727153498}
2023-07-13T11:31:00.919Z	info	client/client.go:413	Transfer Stats:	{"kind": "exporter", "data_type": "logs", "name": "dataset/logs", "bytesSentMB": 150.06610584259033, "bytesAcceptedMB": 150.06610584259033, "throughputMBpS": 6.068879131385992, "perBufferMB": 4.689565807580948, "successRate": 1, "processingS": 24.727153498, "processing": 24.727153498}
2023-07-13T11:31:00.919Z	info	client/add_events.go:294	Finishing with success	{"kind": "exporter", "data_type": "logs", "name": "dataset/logs"}
2023-07-13T11:31:00.919Z	info	extensions/extensions.go:44	Stopping extensions...
2023-07-13T11:31:00.919Z	info	service/service.go:171	Shutdown complete.
```


**Documentation:** There is no change in the documentation needed.
After migrating the detection processor to the new config interface, the
docker detector stopped setting any attributes.

Fixes
open-telemetry#24280
…prometheus (open-telemetry#24026)

**Description:** 

Implement down-scaling of exponential histograms to Prometheus native
histograms in Prometheus remote writer.
Configuration of down-scaling is TBD.
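The core down-scaling step can be sketched in standalone Go (an illustration of the technique, not the remote writer's actual code): reducing an exponential histogram's scale by `delta` merges every `2^delta` adjacent buckets into one.

```go
package main

import "fmt"

// downscale merges exponential-histogram bucket counts when reducing the
// scale by delta: bucket index i at the old scale maps to i >> delta at
// the new scale (arithmetic shift gives floor division for negatives).
func downscale(counts []uint64, offset, delta int32) ([]uint64, int32) {
	if delta <= 0 || len(counts) == 0 {
		return counts, offset
	}
	newOffset := offset >> delta
	newLen := ((offset + int32(len(counts)) - 1) >> delta) - newOffset + 1
	out := make([]uint64, newLen)
	for i, c := range counts {
		out[((offset+int32(i))>>delta)-newOffset] += c
	}
	return out, newOffset
}

func main() {
	// Four buckets at scale s collapse pairwise into two at scale s-1.
	counts, off := downscale([]uint64{1, 2, 3, 4}, 0, 1)
	fmt.Println(counts, off) // [3 7] 0
}
```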

**Link to tracking Issue:**

Fixes: open-telemetry#17565 

**Testing:**

Unit tests.

**Documentation:**

TBD

---------

Signed-off-by: György Krajcsovits <[email protected]>
Co-authored-by: Ruslan Kovalov <[email protected]>
Co-authored-by: Juraci Paixão Kröhling <[email protected]>
Co-authored-by: Anthony Mirabella <[email protected]>
Adds `capabilities` to supervisor configuration.
…etry#24235)

When finding files, we would check for files with duplicate fingerprints
and deduplicate them if `a.StartsWith(b)` or `b.StartsWith(a)`. This
`StartsWith` logic is useful for recognizing files that have new content
since the previous poll interval. However, when deduplicating files
observed at the same moment, the only case that we need to consider is
copy/truncate rotation. In this case, a file may have been copied but
the original has not yet been truncated. At that moment we should expect
to find two files with exactly the same fingerprint. Therefore, we do
not need to check StartsWith cases.
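The prefix check described above can be sketched as follows (the names are illustrative, not the fileconsumer's actual types):

```go
package main

import (
	"bytes"
	"fmt"
)

// fingerprint holds the first N bytes of a file.
type fingerprint struct {
	firstBytes []byte
}

// startsWith reports whether f begins with all of other's bytes - i.e. f
// looks like the same file after more content was appended.
func (f fingerprint) startsWith(other fingerprint) bool {
	return len(other.firstBytes) > 0 &&
		len(f.firstBytes) >= len(other.firstBytes) &&
		bytes.Equal(f.firstBytes[:len(other.firstBytes)], other.firstBytes)
}

func main() {
	old := fingerprint{[]byte("2023-07-13 first log line")}
	grown := fingerprint{[]byte("2023-07-13 first log line\nsecond line")}
	fmt.Println(grown.startsWith(old)) // true
	fmt.Println(old.startsWith(grown)) // false
}
```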
atoulme and others added 26 commits August 6, 2023 17:16
@aboguszewski-sumo aboguszewski-sumo force-pushed the awss3exporter-add-sumo-logs-marshaller branch from 9277a8b to d084d23 Compare August 7, 2023 08:30
kasia-kujawa pushed a commit that referenced this pull request Oct 13, 2024
… Histo --> Histogram (open-telemetry#33824)

## Description

This PR adds a custom metric function to the transformprocessor to
convert exponential histograms to explicit histograms.

Link to tracking issue: Resolves open-telemetry#33827

**Function Name**
```
convert_exponential_histogram_to_explicit_histogram
```

**Arguments:**

- `distribution` (_upper, midpoint, uniform, random_)
- `ExplicitBoundaries: []float64`

**Usage example:**

```yaml
processors:
  transform:
    error_mode: propagate
    metric_statements:
    - context: metric
      statements:
        - convert_exponential_histogram_to_explicit_histogram("random", [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]) 
```

**Converts:**

```
Resource SchemaURL: 
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope  
Metric #0
Descriptor:
     -> Name: response_time
     -> Description: 
     -> Unit: 
     -> DataType: ExponentialHistogram
     -> AggregationTemporality: Delta
ExponentialHistogramDataPoints #0
Data point attributes:
     -> metric_type: Str(timing)
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2024-07-31 09:35:25.212037 +0000 UTC
Count: 44
Sum: 999.000000
Min: 40.000000
Max: 245.000000
Bucket (32.000000, 64.000000], Count: 10
Bucket (64.000000, 128.000000], Count: 22
Bucket (128.000000, 256.000000], Count: 12
        {"kind": "exporter", "data_type": "metrics", "name": "debug"}
```

**To:**

```
Resource SchemaURL: 
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope  
Metric #0
Descriptor:
     -> Name: response_time
     -> Description: 
     -> Unit: 
     -> DataType: Histogram
     -> AggregationTemporality: Delta
HistogramDataPoints #0
Data point attributes:
     -> metric_type: Str(timing)
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2024-07-30 21:37:07.830902 +0000 UTC
Count: 44
Sum: 999.000000
Min: 40.000000
Max: 245.000000
ExplicitBounds #0: 10.000000
ExplicitBounds #1: 20.000000
ExplicitBounds #2: 30.000000
ExplicitBounds #3: 40.000000
ExplicitBounds #4: 50.000000
ExplicitBounds #5: 60.000000
ExplicitBounds #6: 70.000000
ExplicitBounds #7: 80.000000
ExplicitBounds #8: 90.000000
ExplicitBounds #9: 100.000000
Buckets #0, Count: 0
Buckets #1, Count: 0
Buckets #2, Count: 0
Buckets #3, Count: 2
Buckets #4, Count: 5
Buckets #5, Count: 0
Buckets #6, Count: 3
Buckets #7, Count: 7
Buckets #8, Count: 2
Buckets #9, Count: 4
Buckets #10, Count: 21
        {"kind": "exporter", "data_type": "metrics", "name": "debug"}
```

### Testing

- Several unit tests have been created. We have also tested by ingesting
and converting exponential histograms from the `statsdreceiver`, as well
as directly via the `otlpreceiver` over gRPC, over several hours with a
large amount of data.

- We have clients that have been running this solution in production for
a number of weeks.

### Readme description:

### convert_exponential_hist_to_explicit_hist

`convert_exponential_hist_to_explicit_hist([ExplicitBounds])`

The `convert_exponential_hist_to_explicit_hist` function converts an
ExponentialHistogram to an explicit (_normal_) Histogram.

`ExplicitBounds` represents the list of bucket boundaries for the new
histogram. This argument is __required__ and __cannot be empty__.

__WARNING:__

The process of converting an ExponentialHistogram to an explicit
Histogram is not perfect and may result in a loss of precision. It is
important to define an appropriate set of bucket boundaries to minimize
this loss. For example, selecting boundaries that are too high or too
low may result in histogram buckets that are too wide or too narrow,
respectively.

---------

Co-authored-by: Kent Quirk <[email protected]>
Co-authored-by: Tyler Helmuth <[email protected]>