experimental: config builder + nette schema #498

Closed
wants to merge 7 commits into from

@brettmc (Collaborator) commented Dec 2, 2021

using nette/schema to generate a config from environment + user-supplied variables

  • user vars replace env vars
  • some defaults provided in config (more could be?)
  • enforces config structure and types (where possible)
  • switching some factories to fromConfig (more to do)

@codecov bot commented Dec 2, 2021

Codecov Report

Merging #498 (737d9e8) into main (8649206) will not change coverage.
The diff coverage is n/a.

@@            Coverage Diff            @@
##               main     #498   +/-   ##
=========================================
  Coverage     94.73%   94.73%           
  Complexity      961      961           
=========================================
  Files            94       94           
  Lines          2375     2375           
=========================================
  Hits           2250     2250           
  Misses          125      125           

]),
'propagators' => Expect::string('tracecontext,baggage'),
'service' => Expect::structure([
'name' => Expect::string(),
Collaborator Author

we could add all defaults here, and remove from the various classes that look up env vars

@tidal (Member) commented Dec 2, 2021

Looks good at first sight. However, I will add some feedback.

One thing I found in the repos of other SIGs is that they often have an experimental directory beside the src one. Maybe we can adopt this, so it would be easier to collaborate on certain things, and once everybody is happy, we move it to src?
One reason is that I find it way easier to review code locally than in GitHub's PR GUI. (There are tools to help with this, but they don't work with the restrictions of the OTEL repos, afaik.)

Another thing is that, while creating the SDK bundle, I found it quite difficult to reason about all the config options and possibilities in theory.
So the config "examples" of the integration tests happened to be the first thing I created, and then I made them work (while still making some adjustments). Basically, I created a Symfony test application to see if everything works in the end, and the integration tests just turn the test app into executable tests. The configuration also does not have unit tests, and you can find the reasons in the PR.

class ConfigBuilder
{
//single-value env vars
private array $single = [
Member

This array could go into a constant, and maybe even into an interface.
I'm definitely guilty of over-engineering constants sometimes (cough), but the rationale is that there is only one place where a static value is defined, and it can be referenced from everywhere else. So when a value needs to change, it only needs to change in that single place. (It's also a bit more memory efficient than an array property, but that would be micro-optimization as the only reason.)

I also stumbled over the name of the property ($single). I prefer "speaking variable names" over comments. (I did not come up with this myself, I read it in a book a while ago.)
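
A minimal sketch of that suggestion, assuming the list of single-value env vars moves into an interface constant with a descriptive name. The interface, the class, and the two sample entries are hypothetical and are not taken from the PR.

```php
<?php

declare(strict_types=1);

// Hypothetical interface: the single place where the list is defined.
// The two variable names are only sample entries for this sketch.
interface KnownEnvironmentVariables
{
    public const SINGLE_VALUE_VARIABLES = [
        'OTEL_SERVICE_NAME',
        'OTEL_LOG_LEVEL',
    ];
}

// Hypothetical reader that references the constant instead of owning an array property.
final class SketchEnvironmentReader implements KnownEnvironmentVariables
{
    /** @return array<string, string> variables that are actually set */
    public function read(): array
    {
        $values = [];
        foreach (self::SINGLE_VALUE_VARIABLES as $variableName) {
            $value = getenv($variableName);
            if ($value !== false) {
                $values[$variableName] = $value;
            }
        }

        return $values;
    }
}

putenv('OTEL_SERVICE_NAME=demo');
$values = (new SketchEnvironmentReader())->read();
```

Any other class can reference `KnownEnvironmentVariables::SINGLE_VALUE_VARIABLES` without depending on the reader itself.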

//@phan-ignore PhanUndeclaredClassMethod
$schema = Expect::structure([
'log' => Expect::structure([
'stream' => Expect::string('php://stdout'),
Member

I think we should expect a PSR logger instance here, instead of a stream.

'attribute_count' => Expect::string()->castTo('int'),
'attribute_value_length' => Expect::string()->castTo('int'),
]),
'attributes' => Expect::array(),
Member

I found that one can even define the structure of arrays. So, in this case it could be:

'attributes' => Expect::arrayOf(
    Expect::string(),
    Expect::string()
),

'attributes' => Expect::array(),
]),
'trace' => Expect::structure([
'sampler' => Expect::string('parentbased_always_on'),
Member

What's your reasoning for having both sampler and samplers (etc.) entries? How do you think this will work?

Collaborator Author

I found this method easier to reason about, because the config options for different types of sampler (or exporter/processor/etc) are easily found within their respective key in "samplers", where "sampler" is the list of samplers to be used.
So, iterate over "sampler" to work out which to create, and then look up sampler-specific config in samplers[type] as required.
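
The lookup described above could be sketched roughly as follows, assuming the validated config is a plain nested array and "sampler" is a comma-separated string. The helper name `resolveSamplerConfigs` and the `traceidratio` options are hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Pair each selected sampler type with its type-specific options.
 * Hypothetical helper, not part of the PR.
 *
 * @param array{sampler: string, samplers: array<string, array>} $traceConfig
 * @return array<string, array> sampler type => options for that type
 */
function resolveSamplerConfigs(array $traceConfig): array
{
    $resolved = [];
    // "sampler" names the sampler types to create (assumed CSV here).
    foreach (explode(',', $traceConfig['sampler']) as $type) {
        $type = trim($type);
        if ($type === '') {
            continue;
        }
        // Options for a type live under samplers[type]; fall back to none.
        $resolved[$type] = $traceConfig['samplers'][$type] ?? [];
    }

    return $resolved;
}

$traceConfig = [
    'sampler' => 'parentbased_always_on',
    'samplers' => [
        'traceidratio' => ['probability' => 0.5],
    ],
];

$selected = resolveSamplerConfigs($traceConfig);
```

Only the types named in "sampler" are returned, so options for unused sampler types can sit in "samplers" without any effect.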


/** @phan-file-suppress PhanUndeclaredClassMethod */

class ConfigBuilder
Member

I think this ConfigBuilder is doing too much. It's a) defining a configuration schema, and b) acting upon user-defined values via env vars. I think the providers you outlined here are a cleaner approach to deal with user-provided values and allow for single responsibility.

]),
'exporter' => Expect::string('grpc'),
'exporters' => Expect::structure([
'zipkin' => Expect::structure([
Member

I would prefer not to have the schemas for the contrib exporters here. This is of course not a direct dependency on the Contrib "package", but an indirect one. It's expecting something which may not be there.

Also this looks like it's always expecting configuration for all possible exporters, no?

Collaborator Author

Yes, that makes sense. It does raise a question though: I'm starting to think that the two OTLP exporters should not be in contrib but rather in the SDK. This is the OpenTelemetry project, after all.

Member

> this is the OpenTelemetry project, after all.

That's true; however, from the perspective of the library, the OTLP collectors are 3rd-party applications which just happen to live under the same umbrella organization. Moving the gRPC exporter into the SDK would also make the corresponding extension a must-have dependency for the SDK again. And since there must be a solution to handle Zipkin, Jaeger and co. in the contrib directory anyway, it's not much of a hassle to do the same for the OTLP exporters.

]),
]);
$env = $this->env();
$resourceAttributes = $this->resourceAttributes();
Member

I don't think temporary variables help readability.

return strpos($key, 'OTEL_') !== false;
}, ARRAY_FILTER_USE_KEY);
$output = [];
foreach ($vars as $key => $val) {
Member (@tidal, Dec 2, 2021)

A long time ago I read a book on code readability, so you can blame the authors for me complaining about this stuff. :)

However, while a lot of things there seem nit-picky at first, they make sense once one thinks about them and gets used to them. One point is to not use "short variable names". While you, and maybe I as well, know what's meant by "$val", it might not be obvious to everyone.

'stream' => Expect::string('php://stdout'),
'level' => Expect::string('info'),
]),
'propagators' => Expect::string('tracecontext,baggage'),
Member

Why do you expect a CSV here? I think it's cleaner to expect an array and normalize values from the env var accordingly.

Collaborator Author

I had two different techniques going during development, and originally CSV values were extracted into an array as you've suggested. I went back to the CSV method because arrays did not work with default values (ie, if you define default propagator(s) as array values, they are not removable).
So it looks like with nette/schema, we really need to choose between having it handle default values, or having it just enforce the schema (and responsibility for default values stays in the various factories/constructors). I'm happy for it to just enforce the structure.

Member (@tidal, Dec 4, 2021)

I don't understand. Why should there be default propagators and why should there be defaults for array items?
However I think it's better anyway to have classes or factories handle defaults than maintaining them in two places, including possible bugs.

(There's no plugin mechanism for propagators in the SDK yet anyway)
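
As a rough illustration of the two points in this thread (normalize the CSV env value into an array, and let the consuming code own the default), here is a minimal sketch. The helper name `listFromEnv` is hypothetical, and the default is passed in by the caller rather than declared in a schema.

```php
<?php

declare(strict_types=1);

/**
 * Split a comma-separated env value into a clean list.
 * Hypothetical helper; returns $default when the variable is unset or empty.
 *
 * @param list<string> $default
 * @return list<string>
 */
function listFromEnv(string $name, array $default = []): array
{
    $raw = getenv($name);
    if ($raw === false || trim($raw) === '') {
        return $default;
    }

    // Trim each entry and drop empty ones left by stray commas.
    $items = array_filter(
        array_map('trim', explode(',', $raw)),
        static function (string $item): bool {
            return $item !== '';
        }
    );

    return array_values($items);
}

putenv('OTEL_PROPAGATORS=tracecontext, baggage');
$propagators = listFromEnv('OTEL_PROPAGATORS', ['tracecontext', 'baggage']);
```

Because the default is only a fallback, a user-supplied list replaces it entirely instead of being merged into it, which sidesteps the "defaults are not removable" problem mentioned above.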

@brettmc changed the title from "config builder + nette config" to "experimental: config builder + nette schema" on Dec 4, 2021
@brettmc (Collaborator Author) commented Dec 4, 2021

@tidal - a change of direction. I've created an experimental dir as you suggested, and have two different approaches:

  • Experimental\NetteConfig - the first approach (unchanged), using nette schema to create one big schema of all possible config
  • Experimental\Config - instead of nette schema, use classes and interfaces and a ConfigBuilder. I like this approach more:
    • exporters and span processors must be made available to the builder (but the main ones are there by default), which I think will help to remove the sdk->contrib dependency
    • config classes are responsible for only configuring themselves
    • experimental\examples\ConfigExample.php demonstrates creating a config from a mixture of env and user-provided input. The output is a mixture of arrays and objects - lazy but it's experimental :)
      I have not tried very hard to make it pretty, since it might all end up on the scrap heap.


echo 'Creating Config From Environment and user config' . PHP_EOL;
$config = (new ConfigBuilder())
->withExporterConfig(ZipkinConfig::class)
Member

What's the point of expecting a class, when this has to be added programmatically anyway?

Collaborator Author

My thinking here is to allow making contrib exporters available without directly referencing contrib from sdk, and potentially to make them settable from an env var (OTLP_PHP_EXPORTER_CLASS ?).
Without some sort of discovery mechanism, I don't see a way to have something in the SDK that is capable of configuring classes from contrib. Perhaps a class_exists check for something like Contrib\ExporterBuilder, so that the configuration for these exporters actually lives in contrib; if that class does not exist, then only the SDK SpanExporters are available.
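
One way the class_exists idea could look, as a sketch only: the SDK holds candidate class names as strings and keeps those that are actually loadable. The function name, the stand-in class, and the `Example\Contrib\ZipkinConfig` name are all hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Return the exporter config classes that are actually installed.
 * Candidates are plain strings, so the caller never references contrib directly.
 *
 * @param array<string, string> $candidates exporter name => class name
 * @return array<string, string> the subset whose class can be loaded
 */
function discoverExporterConfigs(array $candidates): array
{
    return array_filter($candidates, static function (string $class): bool {
        return class_exists($class);
    });
}

// Stand-in for a config class shipped with the SDK (hypothetical).
final class InMemoryExporterConfig
{
}

$available = discoverExporterConfigs([
    'memory' => InMemoryExporterConfig::class,
    // Hypothetical contrib class; not defined here, so it is filtered out.
    'zipkin' => 'Example\\Contrib\\ZipkinConfig',
]);
```

This keeps the dependency optional at runtime, although the SDK would still need to know the candidate names from somewhere.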

Member

Ok, got you, I just find the constructor a bit clunky in that case. But I think you commented something regarding this below.

->withExporterConfig(ZipkinConfig::class)
->withExporterConfig(NewRelicConfig::class)
->withUserConfig([
'span.processor.batch.max_queue_size' => 333,
Member

What's the point of this dot-notation? This has to be resolved on every request?

Collaborator Author

that's a "todo" - it should require a proper nested config as might be created from yaml/json/whatever.
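
For illustration, dot-notation keys can be expanded into a nested array once, when the config is built, so nothing has to resolve paths later. This is a sketch under that assumption; `expandDotKeys` is a hypothetical helper, not something in the PR.

```php
<?php

declare(strict_types=1);

/**
 * Expand dot-notation keys into a nested array.
 *
 * @param array<string, mixed> $flat e.g. ['a.b' => 1]
 * @return array<string, mixed> e.g. ['a' => ['b' => 1]]
 */
function expandDotKeys(array $flat): array
{
    $nested = [];
    foreach ($flat as $path => $value) {
        // Walk down the tree by reference, creating levels as needed.
        $cursor = &$nested;
        foreach (explode('.', (string) $path) as $segment) {
            if (!isset($cursor[$segment]) || !is_array($cursor[$segment])) {
                $cursor[$segment] = [];
            }
            $cursor = &$cursor[$segment];
        }
        $cursor = $value;
        unset($cursor); // drop the reference before the next path
    }

    return $nested;
}

$nestedConfig = expandDotKeys([
    'span.processor.batch.max_queue_size' => 333,
    'resource.limits.attribute_value_length' => 444,
]);
```

The result has the same shape a YAML or JSON file would decode to, so both input styles could feed the same builder.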

->withUserConfig([
'span.processor.batch.max_queue_size' => 333,
'resource.limits.attribute_value_length' => 444,
'exporter.new_relic.license_key' => 'secret',
Member

So I can only have one newrelic exporter?

Collaborator Author

It never occurred to me that somebody would want to have multiple instances of the same exporter. In fact, the Batch and Simple span processors only allow for 0..1 exporter. MultiSpanProcessor does look like it allows any number of span processors, each of which could have their own exporter though - so I guess you can!

Member (@tidal, Dec 5, 2021)

> It never occurred to me that somebody would want to have multiple instances of the same exporter.

I guess you never had to develop packages/libraries to be used by customers/clients. They can sometimes come up with, let's say, kind of "exotic" requests or requirements. That's why I'm used to at least considering "exotic" requirements, to be prepared. ;)

putenv('OTEL_ATTRIBUTE_COUNT_LIMIT=111');
putenv('OTEL_PROPAGATORS=tracecontext');

echo 'Creating Config From Environment and user config' . PHP_EOL;
Member

What is "user config"? Are env vars not provided by users?

Collaborator Author

True, they are. I could merge them all together, but haven't because it requires sdk->contrib references - at least, the way I'm thinking of it does. As mentioned in an earlier comment, this might be resolved by either having that logic sit in contrib, or even removing env vars from feature and keeping that knowledge where it is now, in the various constructors.

public array $service;
public ResourceConfig $resource;

public function __construct(array $userConfig, array $environmentConfig)
Member

What's the point of having two arguments instead of an array of configurations?

public function build(): object
{
$config = new Config($this->userConfig, $this->environmentConfig);
foreach ($this->buildExporters() as $name => $exporterConfig) {
Member

What's the point of passing the whole config tree into dedicated configs? What's the point of having them in the first place, then, as this tightly couples them to the global config schema?

Collaborator Author

I think I see an improvement here. I think that the config builder should create all of the different sub-configs and assemble them, passing to each only the config that it needs.
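
A minimal sketch of that improvement, assuming a nested config array: the builder slices the tree and each sub-config only sees its own options. Both class names are hypothetical, and the 2048 fallback is just an assumed value for the sketch.

```php
<?php

declare(strict_types=1);

// Hypothetical sub-config: it only ever receives its own options.
final class BatchProcessorConfig
{
    public int $maxQueueSize;

    public function __construct(array $options)
    {
        // 2048 is an assumed fallback for this sketch.
        $this->maxQueueSize = (int) ($options['max_queue_size'] ?? 2048);
    }
}

// Hypothetical builder: the only place that knows the global tree layout.
final class SketchConfigBuilder
{
    private array $tree;

    public function __construct(array $tree)
    {
        $this->tree = $tree;
    }

    public function buildBatchProcessor(): BatchProcessorConfig
    {
        // Hand over just the relevant slice, or nothing if it is absent.
        $slice = $this->tree['span']['processor']['batch'] ?? [];

        return new BatchProcessorConfig($slice);
    }
}

$builder = new SketchConfigBuilder([
    'span' => ['processor' => ['batch' => ['max_queue_size' => 333]]],
    'resource' => ['limits' => ['attribute_count' => 111]],
]);
$batch = $builder->buildBatchProcessor();
```

With this split, `BatchProcessorConfig` can be constructed and tested without any knowledge of where its options sit in the global schema.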


class Config implements ConfigInterface, ExporterConfigInterface
{
public ?string $endpoint;
Member

This will create a runtime error without defaulting to null.
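
To illustrate the point (PHP 7.4 or later): a typed property without a default is "uninitialized" rather than null, and reading it before assignment throws an Error. The two class names below are hypothetical; whether the PR's class actually reads the property before assigning it is not visible in this excerpt.

```php
<?php

declare(strict_types=1);

final class WithoutDefault
{
    public ?string $endpoint; // typed, never assigned: stays uninitialized
}

final class WithDefault
{
    public ?string $endpoint = null; // explicit default makes early reads safe
}

$message = null;
try {
    // Reading an uninitialized typed property throws an Error.
    $unused = (new WithoutDefault())->endpoint;
} catch (Error $error) {
    $message = $error->getMessage();
}

// With the explicit default, the same read simply yields null.
$endpoint = (new WithDefault())->endpoint;
```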

$this->attributeValueCount = (int) ($userConfig['resource.limits.attribute_value_length'] ?? $environmentConfig['OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT'] ?? 128);
}

private function intOrNull($value): ?int
Member

Is this a question?

{
public function __construct(array $userConfig, array $environmentConfig)
{
}
Member

???

Collaborator Author

It's because the constructor is defined in an interface...I will change it.

Member

@brettmc
I know, this was just a lazy comment from me. :)


@tidal (Member) commented Dec 4, 2021

"Our heads are round so that our thoughts can change direction." (Francis Picabia)

However, your new approach looks a bit rough, so I can only provide some "rough" comments, I'm afraid. And it looks like it only covers tracing configuration, in a way that metrics and logger configuration can't be added.

In general, there is no need for a league/config or nette/schema configuration to be one big ball of mud; schemas can be merged as well.

> which I think will help to remove the sdk->contrib dependency

Not the way it's implemented at the moment. This just adds an additional layer of indirection and possibly other dependency problems.

> config classes are responsible for only configuring themselves

Maybe, but they seem to have dependencies on the complete config tree.

@brettmc (Collaborator Author) commented Dec 15, 2021

Discussed in SIG, I will close this and put the related issue on hold. open-telemetry/opentelemetry-specification#2207 may help a lot here, by defining an SDK configuration schema.

@brettmc brettmc closed this Dec 15, 2021
@brettmc brettmc deleted the nette-config branch October 25, 2022 06:56