Add support for community supported plugins #937

SylvainJuge · 2019-11-21T10:53:40Z

Issue description

Community contributions are split in two broad categories

agent "core" and common infrastructure code
agent "plugins", which provide support for specific frameworks/libraries.

Plugins (2) are the easiest way to add features to the agent and are also the most common contribution type.
We (Elastic Java Agent team) currently take ownership of all those added features and merge them to the plugins, which means asking for changes in pull-requests to meet quality/style standards.

While doing so seems fair, it means that merging contributions takes a considerable effort and time, and that our team is often the bottleneck here, often with back and forth with contributors.
Also, keeping pull-requests open for a while makes them harder to merge.

Proposed solution

https://docs.google.com/document/d/1LFTBDLM3-Sj7cU1pHFxfgI3donyTMp8tvjFpyJM7dF8/edit

Create apm-agent-plugin-sdk: used both by internal and external/community plugins
- agnostic of the tracer API
External plugins can use the public API or the OpenTracing API bridge
External plugins are loadable from $AGENT_HOME/plugins
Testing: relies on internal API and internal test helpers such as AbstractInstrumentationTest and MockReporter
- Creating and maintaining abstractions that are backwards compatible doesn't seem feasible and will always lag behind
- The important objective of the plugin being compatible for the users of the plugin is still maintained
- When the plugin maintainer bumps the agent version, they have to resolve breaking changes in test code
- Makes it easier to eventually pull external plugins into the core agent as the same testing framework is used.
Plugin registry
- Would be just the supported technologies page linking to external GH repos
- The plugins can be downloaded from the releases page

Tasks

The text was updated successfully, but these errors were encountered:

SylvainJuge · 2019-11-25T10:51:00Z

Also, we currently have "incubating" features (currently JMS is one of them), those are disabled by default because disable_instrumentations=incubating in the default configuration.

There are two possible approaches:

reuse the same feature and set disable_instrumentations=incubating,community by default
create a dedicated opt-in option for enabled community instrumentations enabled_community_instrumentations that would be empty by default.

Pros/cons

will automatically enable community-supported plugins if default value is changed, even if only a few users do, that would be a breaking change
adds another configuration option

felixbarny · 2019-11-26T12:55:03Z

I would prefer the explicit opt-in via enabled_community_instrumentations

SylvainJuge · 2020-03-06T13:50:51Z

I've changed a bit my mind for the naming part, thus I feel now that community is a better choice for the following reasons:

it's independent from maturity of the plugin: alpha/beta/incubating, we currently use instrumentation group names for this purpose.
it focuses on ownership, and here our goal is to 1) make contributions easier and 2) set users expectations

Also, we should probably start to have a distinction between core plugins and community plugins for naming, but we will keep the existing structure for the core part.

+1 on using a single enabled_community_instrumentations to opt-in those, as adding new ones will not be made active when updating the agent.

felixbarny · 2020-03-11T16:41:16Z

allow building plugins outside of agent codebase & load them dynamically

Let's discuss whether that would be a feasible option.

It has the following advantages:

Users can independently develop and distribute plugins
Clearer ownership as the plugins would be in a separate GitHub repo
Plugins can incubate independently from the main repo and as they mature, we can pull them in

Downsides of the initial suggestion of creating a dedicated contrib folder within the main repo:

we might miss the main objective of having community-owned plugins: we would still be the "gatekeepers" and might be slow to respond to PRs or to merge them in.

The challenges

Our internal API can always change without considering that a breaking change
Only the public API is guaranteed to stay stable within a major version
- The public API is does not have the full functionality of the internal API
- The performance is worse, as transactions and spans have to be wrapped
- There's no support for auto-instrumentation

What we currently recommend to users that want to add plugins for their custom frameworks, which should not be included in the main repo, is to fork the repo and add an additional plugin in the apm-agents-plugin folder. They already have to deal with changes in the internal API and have to rebase with the upstream and build their own fork whenever they want to update the agent version.

I think it makes sense to fold the two use cases of community plugins and custom plugins and improve the experience of developing them in the following ways:

Create examples/tutorials/blogs/templates/maven archetypes for external plugins
Make plugins loadable from the $AGENT_HOME/plugins directory
List community plugins in the contrib repo or in the supported technologies docs.
Documenting the path how to get a community plugin into the main repo
As those plugins work in the same way as internal plugins, it is very easy, from a technical standpoint, to just move those plugins into the apm-agent-plugins folder.
On a best effort basis, list changes in the internal API in the release notes

The plugin development for Kibana and Elasticsearch is actually quite similar. Maybe the situation has changed since I had my last touchpoint with it but in Kibana, you had to explicitly declare for which version an external plugin works because the internal APIs can change without notice.

SylvainJuge · 2020-03-11T17:03:24Z

I really like this proposal as it's definitely a "best of both worlds" option, without much extra complexity on agent architecture.

We should probably add a way to check for agent/plugin compatibility, probably using jar manifest, that would allow us to still break things when needed, and then incompatible plugins would be disabled with a proper warning and an optional link to plugin github repository (if any).

On a side note, that probably means publishing our internal APIs as regular jars, which might be a bit confusing when people are searching maven central (plugins outside of our codebase will depend on those). This is quite minor and if we make a good walkthrough/doc we can make this user-proof.

felixbarny · 2020-03-11T17:12:05Z

On a side note, that probably means publishing our internal APIs as regular jars

I don’t think we’d be doing that. Users need to compile against the shaded version of Byte Buddy, for example. Do they’d just have a provided dependency on the elastic-apm-agent module, which we already publish on maven central.

eyalkoren · 2020-03-12T05:39:42Z

I like this discussion and I agree that moving outside the repo is important in order to achieve what we want. I am not feeling comfortable doing that and relying on our internal APIs though.

I just did two refactors lately that we wanted to do in our internal API. One would have practically break all community plugins and the other would have break those creating transactions (currently in progress are such for Ratpack, WebFlux, RocketMQ, Dubbo and probably Scala). We already know of a future refactor we have in mind that will practically break all in multiple places. I can easily see how having 30 plugins being affected by such changes affect our decisions, or the effort they take. Start breaking things for all the users of those plugins upon upgrade is unpleasant.

What if we did what @felixbarny suggested but allow usage of the public API only?
Regarding the related listed concerns:

The public API is does not have the full functionality of the internal API

True, but it is already reacher than the OpenTracing API that was proved useful for creation of custom plugins. Like we did so far, we can be attentive and adjust it to support future requirements.

The performance is worse, as transactions and spans have to be wrapped

I am pretty sure that wouldn't be the major performance concern for most community plugins that we are not involved in developing. The fact that these plugins are likely to be not-as-efficient is inherent and I believe acceptable. Going back to OpenTracing- it has the same (or worse) limitations and it got its adoption.

There's no support for auto-instrumentation

We can do the same as suggested above - loading plugins from the $AGENT_HOME/plugins directory.

felixbarny · 2020-03-12T07:13:25Z

Definitely valid concerns.

I can easily see how having 30 plugins being affected by such changes affect our decisions, or the effort they take. Start breaking things for all the users of those plugins upon upgrade is unpleasant.

Part of this would be educating users that things break at any time. It should not influence our decisions about refactoring the internal API, however.

Also, note that we're already in this situation for users that are adding plugins for their internal frameworks. But I acknowledge that promoting public API-based community plugins would expose many more users to this problem.

What if we did what @felixbarny suggested but allow usage of the public API only?

Why not both 🙂
We could let the user decide which API (internal or public API) to use.

However, there are some technical complications with allowing the public API in auto-instrumentation plugins.

Plugins using the public API would still have a provided dependency to elastic-apm-agent in order to leverage type matchers and advice.
At the moment, the core of the agent is loaded form the bootstrap class loader
- As a consequence, plugins can't ship with any libraries, including the public API, as it could introduce version conflicts.
- We'd have to load the plugins from an isolated class loader to avoid conflicts. However, the consequence is that advices, which are inlined in the methods they instrument, can't access static variables declared in that plugin for example.
- Public API-based auto instrumentation plugins could become viable again as we rework the agent/advice loading architecture to require using method handles to call the agent from within an advice.

In the meantime, I'd suggest starting with internal API plugins and consider public API plugins after we have refactored the agent to allow for that.

eyalkoren · 2020-03-12T08:18:04Z

Plugins using the public API would still have a provided dependency to elastic-apm-agent in order to leverage type matchers and advice.

Right, although this is not agent code. They can rely on ByteBuddy and the build will shade those. Or other solution.

As a consequence, plugins can't ship with any libraries, including the public API, as it could introduce version conflicts.

They don't need to ship with it, the API can be provided as well. There is no expectation that these plugins will work without an agent.

SylvainJuge · 2020-03-12T13:52:33Z

Update on the spec after today's discussion.

Summary

community plugins should not be part of our codebase
we will profide and maintain a list of available community plugins
those plugins will be separate binaries loaded by the agent at runtime
we defined a custom instrumentation ladder (see below) where community plugins are the 3rd level

Custom instrumentation ladder

1st level

Instrument application itself through explicit changes in application code
Relies on pubic agent API
User manual: mostly documentation

2cnd level

Instrument framework using interceptors (filters, event hooks, framework interceptors, ...)
Relies on pubic agent API
Shipped as regular application dependencies
Promoted through list of community plugins
User manual: documentation + blog posts + contrib repository samples

3rd level

Instrument framework using byte-buddy advices (auto-instrumentation)
Relies on internal agent API (thus more likely to break).
Shipped as external community agent plugins, outside of application dependencies
Major technical requirement: depends on changing current plugin classloading/isolation
Strong version agent/plugin version check (up to patch version).
Promoted through list of community plugins
User manual: Documentation + blog posts + contrib repository samples

4th level

Instrument framework using byte-buddy advices (auto-instrumentation)
Shipped as core agent plugins, embedded within the agent
Plugin is part of our official roadmap
Plugins promoted to this stage require a 3rd level plugin

Given there is a strong technical dependency on solving classloading/isolation, we will need to solve this first as it's a pre-requisite.

felixbarny · 2020-03-12T16:59:42Z

@roncohen @alvarolobato As you raised some concerns with the previous plan to have community plugins within the main repo, could we get your opinion on the new plan?

See #937 (comment) and #937 (comment)

alvarolobato · 2020-03-12T18:41:20Z

Thanks team for looking into this, this is a key initiative

I read through the proposal and it looks good to me with the exception of allowing plugins to rely on the internal API.
You argue that it is a decision of the user to use the internal API or not, but the reality is that it is a decision of the developer of the plugin, not the final user of the plugin.

My concern is getting into a situation where plugins are hard to maintain because we break the internal API, maintainers take a long time to update and our users start feeling the pain.

This also can happen with public API's, but that's the objective of public API's and that's why we are so careful when breaking them.

I've lived this situation and is very painful and in the end, it tends to hamper your progress because you are conditioned to not break your own internal API.

Having the plugins in an external repo is something we need, but it will make all this worse by making it very hard to do a breaking change that also fixes the plugin (we also don't want to do for community plugins we don't maintain).

I don't fully understand why we want to allow internal API, what are the advantages over just allowing public API?

felixbarny · 2020-03-13T11:29:40Z

The truth is that to develop an auto-instrumentation plugin, you not only rely on the tracer API (to create transactions/spans). You also need to define the instrumentation. Byte Buddy helps in that regard as it lets you define matchers (which methods to instrument) and advices (the code to be injected into methods). We have developed lots of glue code that helps to navigate the issues around class visibility. This glue code is subject to change as well. In fact, we have plans for a major overhaul of this in order to properly support runtime attachment.

If we'd only allow access to the public API for external plugins, developers would be left without support when it comes to the hardest part of agent development: classloading/visibility. If we give them access to those helpers, we're basically allowing them to access the internals of the agent: any changes we make would be breaking ones.

Maybe we could extract some of these helpers to a separate module which is guaranteed to be stable. Or even try to work towards getting some of those changes into Byte Buddy.

But developing plugins only with the public API will always be quite limiting. There are a number of things you wouldn't be able to do like

Adding configuration options
Add lifecycle listeners (callbacks on start/stop/pause/resume)
Add activation listeners (callback when a span is activated/deactivated)
object pooling
I have probably forgotten some

While not all plugins may need that, for some the need will eventually arise. In that case, there's no way around relying on internal APIs

felixbarny · 2020-03-23T12:19:58Z

Maybe we could extract some of these helpers to a separate module which is guaranteed to be stable. Or even try to work towards getting some of those changes into Byte Buddy.

I just had a chat with @roncohen about this.

He also shared concerns around the implications of more people relying on the internal API. We would either break lots of plugins by making changes in the internal API or be forced to maintain backwards compatibility for our own internal API. Both don't sound like good options.

We have discussed what we could have a plugin SDK, possibly in a separate repo, that allows people to write bytecode instrumentation plugins without relying on the internal API. The SDK would contain the public API (which could be made a bit more powerful in future versions) and an opinionated Byte Buddy setup, including things similar to ElasticApmInstrumentation and HelperClassManager.
However, there should be no shared code among the internal API and the plugin SDK so that we don't break the SDK API when changing something in the agent core. Instead, we may adapt some the agent core classes and duplicate them in the plugin SDK repo. This makes it easier to maintain compatibility.

Over time, we can make the SDK more powerful. For example, we could allow adding additional configuration options. But to start with it can be lightweight, even if it means it's not as powerful as the internal API.

This has a dependency on #1085, as we are revising the class loading and HelperClassManager mechanism in the core agent.

alvarolobato · 2020-03-23T12:22:42Z

SGTM, thanks

felixbarny · 2020-06-30T12:47:48Z

I've created a doc with the proposed changes in the module structure of the agent https://docs.google.com/document/d/1LFTBDLM3-Sj7cU1pHFxfgI3donyTMp8tvjFpyJM7dF8/edit

TL;DR:

Create apm-agent-plugin-sdk: used both by internal and external/community plugins
- agnostic of the tracer API
External plugins can use the public API or the OpenTracing API bridge
External plugins are loadable from $AGENT_HOME/plugins
Testing: relies on internal API and internal test helpers such as AbstractInstrumentationTest and MockReporter
- Creating and maintaining abstractions that are backwards compatible doesn't seem feasible and will always lag behind
- The important objective of the plugin being compatible for the users of the plugin is still maintained
- When the plugin maintainer bumps the agent version, they have to resolve breaking changes in test code
- Makes it easier to eventually pull external plugins into the core agent as the same testing framework is used.
Plugin registry
- Would be just the supported technologies page linking to external GH repos
- The plugins can be downloaded from the releases page

AlexanderWert · 2022-05-31T08:20:37Z

first iteration on external plugins support is completed

SylvainJuge added the type: enhancement label Nov 25, 2019

SylvainJuge added this to the 7.7 milestone Mar 10, 2020

SylvainJuge removed this from the 7.7 milestone Mar 12, 2020

SylvainJuge added the [zube]: Backlog label May 18, 2020

This was referenced Jul 2, 2020

Add GlobalTracer #1270

Closed

First step towards enabling external plugins #1271

Merged

This was referenced Aug 13, 2020

Stop shading libraries #1339

Closed

Publish test module for apm-agent-core #1340

Closed

SylvainJuge added enhancement Enhancement of an existing feature and removed [zube]: Backlog labels Feb 1, 2021

AlexanderWert added this to the 7.16 milestone Aug 26, 2021

AlexanderWert added the agent-java label Aug 26, 2021

SylvainJuge added the 8.0-candidate label Oct 18, 2021

AlexanderWert modified the milestones: 7.16, 8.0 Oct 18, 2021

AlexanderWert removed the 8.0-candidate label Oct 18, 2021

AlexanderWert added the 8.1-candidate label Dec 6, 2021

AlexanderWert modified the milestones: 8.0, 8.1 Dec 6, 2021

AlexanderWert removed the 8.1-candidate label Dec 6, 2021

AlexanderWert assigned jackshirazi Dec 6, 2021

AlexanderWert added the 8.2-candidate label Jan 10, 2022

AlexanderWert removed this from the 8.1 milestone Jan 10, 2022

AlexanderWert added 8.3-candidate and removed 8.2-candidate labels Jan 24, 2022

AlexanderWert added this to the 8.3 milestone Mar 28, 2022

AlexanderWert removed the 8.3-candidate label Mar 28, 2022

AlexanderWert mentioned this issue May 31, 2022

[META] External plugins - Enhancements #2654

Open

3 tasks

AlexanderWert closed this as completed May 31, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for community supported plugins #937

Add support for community supported plugins #937

SylvainJuge commented Nov 21, 2019 •

edited by zube bot

Loading

SylvainJuge commented Nov 25, 2019

felixbarny commented Nov 26, 2019

SylvainJuge commented Mar 6, 2020

felixbarny commented Mar 11, 2020

SylvainJuge commented Mar 11, 2020

felixbarny commented Mar 11, 2020

eyalkoren commented Mar 12, 2020

felixbarny commented Mar 12, 2020

eyalkoren commented Mar 12, 2020

SylvainJuge commented Mar 12, 2020

felixbarny commented Mar 12, 2020

alvarolobato commented Mar 12, 2020

felixbarny commented Mar 13, 2020

felixbarny commented Mar 23, 2020

alvarolobato commented Mar 23, 2020

felixbarny commented Jun 30, 2020

AlexanderWert commented May 31, 2022

Add support for community supported plugins #937

Add support for community supported plugins #937

Comments

SylvainJuge commented Nov 21, 2019 • edited by zube bot Loading

SylvainJuge commented Nov 25, 2019

felixbarny commented Nov 26, 2019

SylvainJuge commented Mar 6, 2020

felixbarny commented Mar 11, 2020

SylvainJuge commented Mar 11, 2020

felixbarny commented Mar 11, 2020

eyalkoren commented Mar 12, 2020

felixbarny commented Mar 12, 2020

eyalkoren commented Mar 12, 2020

SylvainJuge commented Mar 12, 2020

Summary

Custom instrumentation ladder

1st level

2cnd level

3rd level

4th level

felixbarny commented Mar 12, 2020

alvarolobato commented Mar 12, 2020

felixbarny commented Mar 13, 2020

felixbarny commented Mar 23, 2020

alvarolobato commented Mar 23, 2020

felixbarny commented Jun 30, 2020

AlexanderWert commented May 31, 2022

SylvainJuge commented Nov 21, 2019 •

edited by zube bot

Loading