[CONTINT-4412] Upgrade k8s dependencies #30061
go.mod
Outdated
github.com/DataDog/datadog-go/v5 v5.5.0
github.com/DataDog/datadog-operator v1.8.0-rc.1
github.com/DataDog/datadog-operator v0.7.1-0.20241010110733-dbbe6d120655
This is the commit of DataDog/datadog-operator#1445.
To be updated as soon as this PR is merged.
go.mod
Outdated
@@ -165,7 +167,7 @@ require (
github.com/DataDog/opentelemetry-mapping-go/pkg/quantile v0.20.0
github.com/DataDog/sketches-go v1.4.6
github.com/DataDog/viper v1.13.5
github.com/DataDog/watermarkpodautoscaler v0.6.1
github.com/DataDog/watermarkpodautoscaler v0.5.3-0.20241011111846-034635582ee1
This is the commit of DataDog/watermarkpodautoscaler#221.
To be updated as soon as this PR is merged.
Go Package Import Differences
This comment was omitted because it was over 65,536 characters. Please check the Gitlab Job logs to see its output.
1e31c20 to 58f9908
Regression Detector
Regression Detector Results
Metrics dashboard
Baseline: e830ee5
Optimization Goals: ✅ No significant changes detected
perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
---|---|---|---|---|---|---|
➖ | pycheck_lots_of_tags | % cpu utilization | +4.91 | [+1.29, +8.53] | 1 | Logs |
➖ | idle_all_features | memory utilization | +2.60 | [+2.47, +2.73] | 1 | Logs bounds checks dashboard |
➖ | idle | memory utilization | +1.41 | [+1.36, +1.47] | 1 | Logs bounds checks dashboard |
➖ | quality_gate_idle_all_features | memory utilization | +1.34 | [+1.22, +1.47] | 1 | Logs bounds checks dashboard |
➖ | uds_dogstatsd_to_api_cpu | % cpu utilization | +1.22 | [+0.50, +1.95] | 1 | Logs |
➖ | quality_gate_idle | memory utilization | +1.21 | [+1.16, +1.27] | 1 | Logs bounds checks dashboard |
➖ | file_tree | memory utilization | +1.18 | [+1.05, +1.31] | 1 | Logs |
➖ | tcp_syslog_to_blackhole | ingress throughput | +0.58 | [+0.52, +0.63] | 1 | Logs |
➖ | file_to_blackhole_1000ms_latency | egress throughput | +0.21 | [-0.27, +0.70] | 1 | Logs |
➖ | file_to_blackhole_300ms_latency | egress throughput | +0.07 | [-0.12, +0.26] | 1 | Logs |
➖ | tcp_dd_logs_filter_exclude | ingress throughput | +0.00 | [-0.01, +0.01] | 1 | Logs |
➖ | uds_dogstatsd_to_api | ingress throughput | -0.00 | [-0.09, +0.08] | 1 | Logs |
➖ | file_to_blackhole_100ms_latency | egress throughput | -0.01 | [-0.27, +0.26] | 1 | Logs |
➖ | file_to_blackhole_0ms_latency | egress throughput | -0.01 | [-0.42, +0.41] | 1 | Logs |
➖ | file_to_blackhole_500ms_latency | egress throughput | -0.03 | [-0.28, +0.22] | 1 | Logs |
➖ | basic_py_check | % cpu utilization | -0.62 | [-4.49, +3.26] | 1 | Logs |
Bounds Checks: ❌ Failed
perf | experiment | bounds_check_name | replicates_passed | links |
---|---|---|---|---|
❌ | quality_gate_idle | memory_usage | 5/10 | bounds checks dashboard |
❌ | idle | memory_usage | 7/10 | bounds checks dashboard |
❌ | idle_all_features | memory_usage | 9/10 | bounds checks dashboard |
✅ | file_to_blackhole_0ms_latency | lost_bytes | 10/10 | |
✅ | file_to_blackhole_0ms_latency | memory_usage | 10/10 | |
✅ | file_to_blackhole_1000ms_latency | memory_usage | 10/10 | |
✅ | file_to_blackhole_100ms_latency | lost_bytes | 10/10 | |
✅ | file_to_blackhole_100ms_latency | memory_usage | 10/10 | |
✅ | file_to_blackhole_300ms_latency | memory_usage | 10/10 | |
✅ | file_to_blackhole_500ms_latency | memory_usage | 10/10 | |
✅ | quality_gate_idle_all_features | memory_usage | 10/10 | bounds checks dashboard |
Explanation
Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%
Performance changes are noted in the perf column of each table:
- ✅ = significantly better comparison variant performance
- ❌ = significantly worse comparison variant performance
- ➖ = no significant change in performance
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we classify a change in performance as a "regression" (a change worth investigating further) if all of the following criteria are true:
- Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
- Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
- Its configuration does not mark it "erratic".
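The three criteria can be read as a simple predicate. The sketch below is only an illustration of that reading, not the Regression Detector's actual implementation. The type `experimentResult` and the function `isRegression` are hypothetical names, and the sample rows are taken from the results table.

```go
package main

import (
	"fmt"
	"math"
)

// experimentResult is a hypothetical container for one row of the results table.
type experimentResult struct {
	name         string
	deltaMeanPct float64 // estimated Δ mean %
	ciLow        float64 // lower bound of the Δ mean % CI
	ciHigh       float64 // upper bound of the Δ mean % CI
	erratic      bool    // true if the experiment's configuration marks it erratic
}

// isRegression applies the three criteria: effect size at or above the
// tolerance, a confidence interval that excludes zero, and not erratic.
func isRegression(r experimentResult, tolerancePct float64) bool {
	bigEnough := math.Abs(r.deltaMeanPct) >= tolerancePct
	excludesZero := r.ciLow > 0 || r.ciHigh < 0
	return bigEnough && excludesZero && !r.erratic
}

func main() {
	const tolerancePct = 5.00 // effect size tolerance from the report
	rows := []experimentResult{
		{name: "pycheck_lots_of_tags", deltaMeanPct: 4.91, ciLow: 1.29, ciHigh: 8.53},
		{name: "basic_py_check", deltaMeanPct: -0.62, ciLow: -4.49, ciHigh: 3.26},
	}
	for _, r := range rows {
		fmt.Printf("%s: regression=%t\n", r.name, isRegression(r, tolerancePct))
	}
}
```

Under this reading, pycheck_lots_of_tags is not flagged even though its interval excludes zero, because its estimated change stays below the 5.00% tolerance.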
go.mod
Outdated
// Prevent a false-positive detection by the Google and Ikarus security vendors on VirusTotal
exclude go.opentelemetry.io/proto/otlp v1.1.0

replace github.com/google/gopacket v1.1.19 => github.com/DataDog/gopacket v0.0.0-20240626205202-4ac4cee31f14

// Remove once https://github.com/kubernetes-sigs/custom-metrics-apiserver/pull/184 is merged
replace sigs.k8s.io/custom-metrics-apiserver v1.30.0 => github.com/L3n41c/custom-metrics-apiserver v0.0.0-20241014100211-ccded82d0da2
This is the commit of kubernetes-sigs/custom-metrics-apiserver#184.
To be updated as soon as this PR is merged.
38710a7 to 7c22c88
Test changes on VM
Use this command from test-infra-definitions to manually test this PR changes on a VM:
inv create-vm --pipeline-id=48445688 --os-family=ubuntu
Note: This applies to commit 4479926
37696fe to b686d5f
b686d5f to 58f6136
Gitlab CI Configuration Changes
Modified Jobs
docker_image_build_otel
docker_image_build_otel:
before_script:
- mkdir -p $GOPATH/pkg/mod/cache && tar xJf modcache.tar.xz -C $GOPATH/pkg/mod/cache
- rm -f modcache.tar.xz
- mkdir -p /tmp/otel-ci
- cp comp/otelcol/collector-contrib/impl/manifest.yaml /tmp/otel-ci/
- cp Dockerfiles/agent-ot/Dockerfile.agent-otel /tmp/otel-ci/
- cp test/integration/docker/otel_agent_build_tests.py /tmp/otel-ci/
- wget https://github.com/mikefarah/yq/releases/download/3.4.1/yq_linux_amd64 -O
/usr/bin/yq && chmod +x /usr/bin/yq
- export OTELCOL_VERSION=v$(/usr/bin/yq r /tmp/otel-ci/manifest.yaml dist.otelcol_version)
- yq w -i /tmp/otel-ci/manifest.yaml "receivers[+] gomod" "github.com/open-telemetry/opentelemetry-collector-contrib/receiver/k8sobjectsreceiver
${OTELCOL_VERSION}"
- yq w -i /tmp/otel-ci/manifest.yaml "processors[+] gomod" "github.com/open-telemetry/opentelemetry-collector-contrib/processor/metricstransformprocessor
${OTELCOL_VERSION}"
image: registry.ddbuild.io/ci/datadog-agent-buildimages/docker_x64$DATADOG_AGENT_BUILDIMAGES_SUFFIX:$DATADOG_AGENT_BUILDIMAGES
needs:
- go_deps
- integration_tests_otel
rules:
- if: $CI_PIPELINE_SOURCE =~ /^schedule.*$/
when: never
- if: $CI_COMMIT_TAG
when: never
- if: $CI_COMMIT_MESSAGE =~ /.*\[skip cancel\].*/
when: never
- if: $CI_COMMIT_REF_NAME =~ /.*-skip-cancel$/
when: never
- when: always
script:
- - docker build -t agent-byoc:latest -f /tmp/otel-ci/Dockerfile.agent-otel /tmp/otel-ci
+ - docker build --build-arg AGENT_BRANCH=$CI_COMMIT_BRANCH --tag agent-byoc:latest
+   -f /tmp/otel-ci/Dockerfile.agent-otel /tmp/otel-ci
- OT_AGENT_IMAGE_NAME=agent-byoc OT_AGENT_TAG=latest python3 /tmp/otel-ci/otel_agent_build_tests.py
stage: integration_test
tags:
- runner:docker
Changes Summary
ℹ️ Diff available in the job log.
dd1e9c5 to 2c7fa17
f70036b to 5b37b98
Looks good to me, thanks for fixing my integration test.
Edit: @liustanley, if you have time today, I'd love to look over the e2e tests with you to get a better understanding of why they are failing before this gets merged.
7393cf8 to 235b2e2
235b2e2 to 8f59c4a
Performance issue is now fixed: dashboard
until they are fixed by OTEL-2192 and #30669
Debug info
If you have questions, we are happy to help, come visit us in the #serverless slack channel and provide a link to this comment.
These dependencies were added to the serverless extension by this pull request:
View dependency graphs for each added dependency in the artifacts section of the github action.
We suggest you consider adding the
Serverless Benchmark Results
tl;dr
Use these benchmarks as an insight tool during development.
What is this benchmarking?
The benchmark is run using a large variety of lambda request payloads. In the charts below, there is one row for each event payload type.
How do I interpret these charts?
The charts below comes from The benchstat docs explain how to interpret these charts.
I need more help
First off, do not worry if the benchmarks are failing. They are not tests. The intention is for them to be a tool for you to use during development. If you would like a hand interpreting the results come chat with us in
Benchmark stats
# TODO: https://datadoghq.atlassian.net/browse/OTEL-2192
# To be removed once https://github.com/DataDog/datadog-agent/pull/30669 is merged.
test/new-e2e/tests/otel/otlp-ingest:
  - TestOTLPIngest
  - TestOTLPIngestSampling/TestSampling
The PR significantly increases agent package size (~50MB), and in particular just k8s.io/client-go goes from 5MB to 12MB in binary size.
This is known, it was deployed and didn't show significant regression, and it will be worked on later, so approving !
/merge
What does this PR do?
Previously, in order to store/get the redacted config, we needed to unmarshal the confmap.Conf passed to NotifyConfig into an otelcol.Config in order to take advantage of the redaction of configopaque.String.
Since the changes in the following PR: [otelcol] Obtain the Collector's effective configuration from otelcol.Config open-telemetry/opentelemetry-collector#10139, the confmap.Conf passed to NotifyConfig is redacted already as it is built from an otelcol.Config (see here) so we can remove all unmarshalling logic and store the confmap.Conf directly.
Motivation
Describe how to test/QA your changes
Launch the converged agent without any extensions. Then trigger a flare.
Ensure that:
Possible Drawbacks / Trade-offs
Additional Notes
Supersedes #28585.
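The redaction that the description under "What does this PR do?" relies on can be illustrated with a standard-library stand-in. This sketch does not use the collector packages: `opaqueString` is a hypothetical substitute for configopaque.String, assumed here to mask its value whenever it is marshaled as text, and the "[REDACTED]" placeholder and the `exporterConfig` fields are assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// opaqueString is a hypothetical stand-in for configopaque.String.
// Assumption: the real type masks its value when marshaled as text.
type opaqueString string

// MarshalText hides the underlying secret from any text-based encoder.
func (s opaqueString) MarshalText() ([]byte, error) {
	return []byte("[REDACTED]"), nil
}

// exporterConfig is an illustrative typed configuration with one secret field.
type exporterConfig struct {
	Endpoint string       `json:"endpoint"`
	APIKey   opaqueString `json:"api_key"`
}

// effectiveConfig builds a generic map from the typed config. Because the map
// is produced by marshaling the typed value, secrets are already masked and
// the map can be stored as-is, with no further unmarshalling step.
func effectiveConfig(cfg exporterConfig) (map[string]any, error) {
	raw, err := json.Marshal(cfg)
	if err != nil {
		return nil, err
	}
	var out map[string]any
	if err := json.Unmarshal(raw, &out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	conf, err := effectiveConfig(exporterConfig{
		Endpoint: "https://example.invalid",
		APIKey:   "super-secret",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(conf["endpoint"], conf["api_key"])
}
```

The point of the sketch is the direction of the conversion: once the generic map is derived from the typed configuration, the consumer no longer needs to rebuild the typed form just to get masking.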