Introduce a tiered platform support model #8209

Closed
rrschulze opened this issue Aug 8, 2023 · 5 comments

@rrschulze
Contributor

rrschulze commented Aug 8, 2023

Description

This is a proposal to introduce a tiered platform support model for the OpenTelemetry Collector. Tiered platform support models exist today for various open-source projects (e.g., rustc, qiskit, mozilla, python, node.js). They help a project meet its objective of supporting as many platforms as possible despite limited test resources and platform access. Such models describe different tiers of support based on the level of verification and support that each platform receives from the open-source project. They usually classify platforms, i.e., combinations of operating systems and processor architectures, into support tiers and guarantee core functionality and stability for primary platforms, while other platforms might receive fewer verifications or depend on dedicated community contributions. In most cases the support tiers have semantics like (1) fully supported, (2) partially supported and (3) community supported or experimental. The tier with the highest level of support is usually referred to as Tier 1, and a higher tier number implies less support. Target policies may be in place that describe in detail the criteria and conditions for classifying platforms into the support tiers. Common characteristics of support tiers are, for instance:

  1. The level of operational guarantees, where in the fully supported tiers the binaries for the platforms are guaranteed to work, while in other tiers they are only guaranteed to build.
  2. Success of compile and build is mandatory for the platforms of the fully supported tiers, while in less supported tiers, failures may be accepted.
  3. The ownership of the platforms: it can either be overseen by the community at large or specifically by platform maintainers/owners.
  4. The extent of automated test coverage can vary: verification tests are always run in full in fully supported tiers, while other tiers have only partial runs, and some have no automated testing at all.
  5. The impact of test failures on a release can differ; in fully supported tiers, a test failure blocks the release for all platforms, whereas for others, it only delays the deliverables of the failing platform.
  6. Maintenance often aligns with the support tiers. Defects reported on platforms in fully supported tiers are usually treated as high-priority issues, while defects on less supported tiers might not receive the same prioritization, and fixes depend on the contributions of designated platform maintainers.

Background

This proposal is a follow-up to an earlier comment by @mx-psi in the context of the request to support AIX, where the need for a tiered platform support model was emphasized. It is prompted by the fact that, with Linux on s390x (#378), one more platform has been proposed for support by the OTel Collector.

Current Test Strategy

The current verification process of the OpenTelemetry Collector includes unit and performance tests for core, plus additional end-to-end and integration tests for contrib. In the end-to-end tests, receivers, processors, exporters, etc. are tested in a testbed, while the integration tests rely on actual instances and available container images. Additional stability tests are in preparation. All verification tests are run today on Ubuntu 22.04 Linux on amd64 as the primary platform. In addition, unit tests are run for the contrib collector on Microsoft Windows Server 2022 (amd64).

The cross compile currently supports two MacOS/Darwin targets (amd64 and arm64), five Linux platforms (amd64, arm64, i386, arm and ppc64le) and two Windows binaries (amd64, i386). Except for Linux on amd64 as the primary platform, none of those platforms is tested today. The OpenTelemetry Collector can be installed using apk, deb or rpm files and is available as container images (core, contrib) for deployment on Kubernetes and Docker. The container images are built using qemu and published to Docker Hub and ghcr.io for Linux on amd64, arm64, i386, arm/v7 and ppc64le. The end-to-end test for the contrib container images is run on Ubuntu 22.04 Linux for the Kubernetes versions v1.23 to v1.26.
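To make the gap between built and tested targets concrete, the short sketch below enumerates the cross-compile targets named above and subtracts those that receive any testing at all, counting the contrib unit tests on Windows. It is an illustration only, not project tooling; the `os/arch` string format and all variable names are assumptions.

```python
# Cross-compile targets as listed in the text above.
CROSS_COMPILE_TARGETS = {
    "darwin": ["amd64", "arm64"],
    "linux": ["amd64", "arm64", "i386", "arm", "ppc64le"],
    "windows": ["amd64", "i386"],
}

# Platforms on which tests run today, per the text above.
# The "os/arch" key format is an assumption made for this sketch.
TESTED = {
    "linux/amd64": "all verification tests",
    "windows/amd64": "contrib unit tests only",
}

# Flatten the mapping into "os/arch" strings.
all_targets = [
    f"{os_name}/{arch}"
    for os_name, arches in CROSS_COMPILE_TARGETS.items()
    for arch in arches
]

# Targets that are built but never exercised by any test.
untested = [t for t in all_targets if t not in TESTED]

print(f"{len(all_targets)} targets built, {len(untested)} without any tests")
for target in untested:
    print("  untested:", target)
```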

Tiered platform support model

The OpenTelemetry Collector will be supported following a tiered platform support model to balance between the aim to support as many platforms as possible and to guarantee stability for the most important platforms. The platform support for the OpenTelemetry Collector is broken into three tiers with different levels of support for each tier.

Tier 1 – Primary Support

The Tier 1 supported platforms are guaranteed to work. Precompiled binaries are built on the platform, fully supported for all collector add-ons (receivers, processors, exporters, etc.), and continuously tested as part of the development processes to ensure any proposed change will function correctly. Build and test infrastructure is provided by the project. All tests are executed on the platform as part of automated continuous integration (CI) for each pull request and the biweekly release cycle. Any build or test failure blocks the release of the collector distribution for all platforms. Defects are addressed with priority and, depending on severity, fixed for the previous release in a bug-fix release.

Tier 1 platforms are currently:

  • Linux amd64
  • Kubernetes amd64

Tier 2 – Secondary Support

Tier 2 platforms are guaranteed to work with specified limitations. Precompiled binaries are built and tested on the platform as part of the biweekly release cycle. Build and test infrastructure is provided by the platform maintainers. All tests are executed on the platform as far as they are applicable and all prerequisites are fulfilled. Tests that were not executed and collector add-ons (receivers, processors, exporters, etc.) that were not tested are published with the release of the collector distribution. Any build or test failure delays the release of the binaries for the respective platform but not the collector distribution for all other platforms. Defects are addressed, but not with the same priority as for Tier 1, and, if specific to the platform, require the support of the platform maintainers.

Tier 2 platforms are currently:

  • None

Tier 3 – Community Support

Tier 3 platforms are guaranteed to build. Precompiled binaries are made available as part of the release process, as the result of a cross-compile build on Linux amd64, but the binaries are not tested at all. Any build failure delays the release of the binaries for the respective platform but not the collector distribution for all other platforms. Defects are addressed based on community contributions. Core developers might provide guidance or code reviews, but direct fixes may be limited.

Tier 3 platforms are currently:

  • MacOS/Darwin amd64
  • MacOS/Darwin arm64
  • Linux arm64
  • Linux i386
  • Linux arm/v7
  • Linux ppc64le
  • Windows amd64
  • Windows i386
  • Kubernetes arm64, i386, arm/v7 and ppc64le
  • Docker amd64, arm64, i386, arm/v7 and ppc64le

The proposed additional platforms Linux on s390x (#378) and AIX on ppc64 (#19195) will be included in Tier 3 once they're added to the OpenTelemetry Collector as platforms.
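The release rules of the three tiers can be sketched as a small decision function: a failure on a Tier 1 platform blocks the distribution for everyone, while a failure on a Tier 2 or Tier 3 platform only holds back that platform's binaries. The Python below is a minimal illustration under stated assumptions, not the project's release tooling. The names `PLATFORM_TIERS`, `ReleaseDecision` and `decide_release` and the `os/arch` key format are hypothetical, and the Kubernetes and Docker entries are left out for brevity. For Tier 3, where binaries are not tested, only build failures can occur in practice.

```python
from dataclasses import dataclass

# Tier assignments copied from the lists above (binaries only).
# The "os/arch" key format is an assumption made for this sketch.
PLATFORM_TIERS = {
    "linux/amd64": 1,
    "darwin/amd64": 3,
    "darwin/arm64": 3,
    "linux/arm64": 3,
    "linux/i386": 3,
    "linux/arm": 3,
    "linux/ppc64le": 3,
    "windows/amd64": 3,
    "windows/i386": 3,
}


@dataclass(frozen=True)
class ReleaseDecision:
    """Hypothetical result type: what a set of failures means for a release."""
    blocked: bool       # True if the release is blocked for all platforms
    delayed: frozenset  # Tier 2/3 platforms whose binaries are held back


def decide_release(failures, tiers=PLATFORM_TIERS):
    """Apply the proposed failure rules to a set of failing platforms."""
    unknown = set(failures) - set(tiers)
    if unknown:
        raise ValueError(f"platforms without a tier: {sorted(unknown)}")
    # Tier 1: any build or test failure blocks the whole distribution.
    blocked = any(tiers[p] == 1 for p in failures)
    # Tier 2 and 3: a failure only delays the binaries of that platform.
    delayed = frozenset(p for p in failures if tiers[p] != 1)
    return ReleaseDecision(blocked=blocked, delayed=delayed)


if __name__ == "__main__":
    print(decide_release({"linux/ppc64le"}))
    print(decide_release({"linux/amd64", "windows/amd64"}))
```

Keeping the tier table as plain data means that adding a platform such as Linux on s390x would be a one-line change in this sketch, with no change to the decision logic.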

@atoulme
Contributor

atoulme commented Aug 8, 2023

I think we need more guarantees around Windows support and Docker. What does Kubernetes amd64 stand for here? As it is typically built around our Docker image, wouldn't that qualify Docker amd64 for Tier 1 support?

@rrschulze
Contributor Author

@atoulme Updated Current Test Strategy above to reflect current practice: In addition, unit tests are run for the contrib collector on Microsoft Windows Server 2022 (amd64).

With only the unit tests executed on Windows amd64, it would not yet qualify for Tier 1 or Tier 2 under the current proposal. We need to discuss if and how the test strategy is to be adjusted while introducing the Tiered Platform Support Model.

Current tests for container images are done on Kubernetes on Ubuntu 22.04 (Linux amd64) during e2e_test for the contrib collector. A decision needs to be made on whether this counts as Docker support in Tier 1 (strictly speaking, no testing is done on Docker).

@mx-psi
Member

mx-psi commented Aug 9, 2023

Thanks for the issue @rrschulze! I have a bunch of comments, but I think it would be easier to discuss them if this was a document on a pull request. Could you open a pull request over at https://github.com/open-telemetry/opentelemetry-collector for this? Prior work on defining support that affected all three repositories related to the Collector has ended up being in documents there, so I think it makes more sense to have it there.

I will also transfer the issue to opentelemetry-collector since this is where we have typically had these kinds of discussions that affect the whole Collector project and not just the artifacts.

@mx-psi mx-psi transferred this issue from open-telemetry/opentelemetry-collector-releases Aug 9, 2023
@rrschulze
Contributor Author

@mx-psi yes, will open a PR for a document.

codeboten pushed a commit that referenced this issue Sep 25, 2023
This PR adds documentation to introduce a tiered
platform support model for the OpenTelemetry Collector. The tiered
platform support model provides clarity to the project and its users
about how existing platforms are supported today and how requests for
new platforms can be supported in future, while balancing between the
aim to support as many platforms as possible and to guarantee stability
for the most important platforms.

**Link to tracking Issue:** #8209

---------

Co-authored-by: Aunsh Chaudhari
Co-authored-by: Pablo Baeyens
Co-authored-by: Alex Boten
@codeboten
Contributor

Documented in #8224
