-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Output refactoring #1869
Output refactoring #1869
Conversation
This also should allow k6 to support xk6 extensions for outputs.
b9e01e5
to
b7abb73
Compare
Codecov Report
@@ Coverage Diff @@
## master #1869 +/- ##
==========================================
- Coverage 71.50% 70.76% -0.74%
==========================================
Files 180 181 +1
Lines 13969 14163 +194
==========================================
+ Hits 9988 10022 +34
- Misses 3344 3518 +174
+ Partials 637 623 -14
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I will need to try it out some as trying to figure out if we've broken a config without trying to use it ... is too complicated :P
Hmm actually with the changes from e5afaa3 that @mstoykov suggested, I think even #1423 and #1606 are fixed... 🎉 I'll add some tests for the JSON output and remove the |
Did some runs with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM. I left a few comments we can discuss, and the only blocker for me would be the way built-in outputs are "registered" and that whole built-in/external distinction.
Some general notes:
-
I like the
output.Params
approach and every output plucking what it needs from it. -
The config consolidation in each output feels like a future wart, but maybe we can't improve that before Configuration issues #883.
-
👍 for abandoning the "collector" naming and standardizing everything on "output".
-
The
PeriodicFlusher
is a neat abstraction to avoid every output re-implementing it. 👍
e.logger.Debugf("Stopping %d outputs...", upToID) | ||
for i := 0; i < upToID; i++ { | ||
if err := e.outputs[i].Stop(); err != nil { | ||
e.logger.WithError(err).Errorf("Stopping output %d failed", i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add Name()
to the Output
interface so that this message can be friendlier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem here is that I don't have a Name()
, I have a Description()
, which wouldn't be suitable here 😞 ... To improve the error message, I either have to change the output.Output
interface and force every output to expose a Name()
, or I'd have to add a wrapper struct for outputs in the Engine
that contains its type (or whole Params
). And I can't just use a map, since users can have multiple outputs with the same type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I either have to change the
output.Output
interface and force every output to expose aName()
👍 That sounds OK to me. Description()
is too "loose" of an interface and outputs can return whatever they want there. The description should rather be assembled from different output information so that we can control how it's rendered. Maybe even Version()
should be added, and make it no-op for built-in outputs.
Anyway, it's a minor thing, not a big deal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm 👎 for Version()
... It would be something that only extensions would use, and it would add unnecessary complexity for writing them. Only xk6 cares about the versions for now, and if we ever start caring, I'd rather figure out some way to extract their versions from runtime/debug.BuildInfo
than forcing extension authors to deal with built-time linker flags or bumping an in-code Version
constant on every release of their extension.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
about Name()
, cc @mstoykov what do you think? I'd personally rather have a wrapper struct around output.Output
than forcing every output to have one extra helper method for information we already have elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I also think it's better if we can just use the name we already have ... otherwise I would argue that we should get the name from Name()
everywhere.
I just realized is that this makes it ... harder to reason about if you decide to have multiple outputs of the same type .... is this even supported now?
I am not certain that was supported even before that, and I am not certain is all that needed, but seems like something that can be useful especially if outputs can filter metrics on their own. So you basically can (for example) shard based on something between multiple influxdb instances ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
otherwise I would argue that we should get the name from Name() everywhere.
We can't, since Name()
would be a method of an output instance, and we can't instantiate the output without giving it some valid parameters first. That's why we have the output type as a string in https://github.com/loadimpact/k6/blob/7e1bc5dcab910f0d197a63e843e5591de4c60423/output/extensions/registry.go#L56
The only way around that is to have 2 interfaces, the current Output
and a new OutputConstructor
which has a Name() string
property and a New(Params) Output
method. I don't mind that refactoring, but it would require every output to have these two interfaces, and likely an empty struct with the second one that is passed to the Register()
function.
I just realized is that this makes it ... harder to reason about if you decide to have multiple outputs of the same type .... is this even supported now?
Yes, it was supported even before this PR... The main problem with it is that JSON and environment variable configuration would be shared between all outputs, because they are based on the type names. Only the CLI flag config could be distinct. I've explored this issue and potential solution a bit before (#587 (comment) and in #883), but I don't think it's a very urgent problem to solve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
cloudapi/config.go
Outdated
|
||
// GetConsolidatedConfig combines the default config values with the JSON config | ||
// values and environment variables and returns the final result. | ||
func GetConsolidatedConfig(jsonRawConf json.RawMessage, env map[string]string) (Config, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like some tech debt we'll have to address when working on #883. :-/
Ideally outputs shouldn't have to deal with raw configuration sources and this consolidation should happen before they're initialized. It is a bit cleaner and more consistent than before, but there's still a lot of duplication and slight differences between each implementation which is probably what made it difficult to abstract away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this consolidation should happen before they're initialized
It does 😕 This is precisely what this bit of code does: https://github.com/loadimpact/k6/blob/7e1bc5dcab910f0d197a63e843e5591de4c60423/cmd/outputs.go#L68-L72
And this isn't new technical debt, it was already part of k6 and #883, it was just squirreled away in cmd/
in a much, much worse way, see these snippets from master
:
https://github.com/loadimpact/k6/blob/51790fc8df37113ac20b586b28051e21672dab16/cmd/config.go#L68
https://github.com/loadimpact/k6/blob/51790fc8df37113ac20b586b28051e21672dab16/cmd/config.go#L96
https://github.com/loadimpact/k6/blob/51790fc8df37113ac20b586b28051e21672dab16/cmd/config.go#L174
https://github.com/loadimpact/k6/blob/51790fc8df37113ac20b586b28051e21672dab16/cmd/config.go#L196
https://github.com/loadimpact/k6/blob/51790fc8df37113ac20b586b28051e21672dab16/cmd/collectors.go#L90-L97
Though I just saw that I missed config.Name = null.StringFrom(arg)
from that last code snippet, so I'll fix that... 😅
But yeah, we have to fix this configuration mess - the current state is definitely not the end goal, it is just slightly more sane than before, #883 is still very much required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does 😕 This is precisely what this bit of code does
I know. My point was that each output package shouldn't handle raw configuration at all, and should just receive the consolidated values, which should be done much earlier and only once / in a single location.
But yeah, maybe something for #883 then. At least it's consistently bad now 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
each output package shouldn't handle raw configuration at all,
We can debate this when we start dealing with #883, but if I correctly understand what you want, I am not sure I agree. I don't think having a central configuration clearing house is wise - why should we have one huge module that deals with every type of configuration? Why not have each component expose a unified interface for dealing with raw configuration sources and returning errors, but leave the particular configuration logic internal for every module?
Besides, this will never work with extensions, there is no way for k6 to know how they are configured, so the only way is to to pass the "raw" values to the extension and return any resulting validation values, since it's the only thing that knows its own config.
Btw I think that the new outputs would be ideal testing grounds for potential #883 solutions. They are now self-contained, and each one receives a raw JSON chunk, a CLI argument, and a map of environment variables in their constructor. So we can start converting their configuration parsing and validation to something saner (that avoids Apply()
and envconfig
and all of the other issues) one by one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed the cloud regression in 626694d
func getAllOutputConstructors() (map[string]func(output.Params) (output.Output, error), error) { | ||
// Start with the built-in outputs | ||
result := map[string]func(output.Params) (output.Output, error){ | ||
"json": json.New, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hhmm why aren't all built-in outputs registered the same way as "extension" outputs are? I think this should be consistent with the way JS modules are registered, and is much more elegant than a static map IMO. Plus there would be no implementation differences between built-in and external outputs, making potential external->core merges easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started that way, but then I refactored it the way it currently is, for several different reasons, and I'd advocate maybe refactoring the JS extensions as well in the future. Here are the reasons:
- This allows us to print much better error messages instead of panicking when there are name conflicts, see below.
- It allows us to very easily distinguish between built-in outputs and extension, which might be very useful if we want to send the latter in the usage reports.
- Package
init()
functions are a bit of an antipattern (see this, for example), and in this case it's completely unnecessary to have them for built-in modules. - This doesn't look more elegant than a static map, to me it looks ugly as hell, I'd normally consider such imports code smells in a Go program 😅 : https://github.com/loadimpact/k6/blob/51790fc8df37113ac20b586b28051e21672dab16/js/modules.go#L23-L34
You have a valid argument for easier extension->core merges, but I think this is a minor concern. Cleaning up the extension would take 1 minute of just removing the init()
from it and adding the same constructor that was referenced in it to this map instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but those are arguments for where Register()
is being called, which could just as well happen in a way that avoids these anti-patterns. Where do you envision Register()
being called from for external outputs now?
My main issue is there being a distinction between internal and external outputs, but if you think there are benefits to that, I'm OK with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do you envision Register() being called from for external outputs now?
For extensions it would have to be in init()
again, that unfortunately can't be easily changed due to the xk6 architecture. It's the simplest way to go about it, and is good enough for extensions. We just don't need to do the same in k6, we have perfect knowledge of the built-in outputs, we can even write a simple //go:generate
helper to generate the list for us, if the number of built-in outputs ever grows substantially.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am mostly for everything to be done the same way (in this case in init()
) so we don't by accident break something and then for it to work for our "special" internal ones but not for the extensions.
Introducing externally developed extensions in the base k6 by removing init
is not as interesting IMO but it also ... kind of prevents us from introducing module/outputs by still letting them be outside the k6
repo(and be able to be imported as external) and by just importing them and registering as "internal". Or more accurately while we can import them and register them as internal, the fact that they have init
basically means that they will be registered as non 'internal' as well, which I don't like ;).
I think I brought this up back when xk6 was designed but there is nothing stopping us from requiring that parh.to/extension/module/name
to have a function NewModule() interface{}
and for xk6 to generate additional_extensions.go
file in the root of the k6
project that just calls the correct Register
function based on something ... for example the name or the returned type. This even means that a single "extension" can both register a js module and an output if we go the "name" variant.
The above suggestion can still be made and combined with #1802 and/or #1858 both of which will likely require some changes to the already written extensions.
With the above we can completely remove the init
from anywhere but additional_extensions.go
- or just have it instead of adding additional files to add to change
main.go
for example - or for that matter just generate static maps in the corresponding files as in nedomi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't by accident break something and then for it to work for our "special" internal ones but not for the extensions.
I doubt we'll ever touch the extension registration after we have it set up once, and it already uses the same interfaces as internal modules, so this is mostly a theoretical concern. In any case, the proper way to ensure extension compatibility would be to to have a CI task that actually uses xk6. I created #1873
This even means that a single "extension" can both register a js module and an output if we go the "name" variant.
I am confused, this can easily be done now 😕 A single extension repo can register however many extension types and modules it wants to. Right now xk6 essentially does a import _ whatever
and that's all it does, I don't think it cares one bit for what every extension does. So I don't see a benefit for the extra complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still have some minor reservations about some of the design decisions here, but you've explained the reasons behind them, so no need to keep discussing them.
This also should allow k6 to support xk6 extensions for outputs. So far the JSON output is the only one that has been fully transitioned to the new output.Output interface, the rest will be done in future PRs.
This completely fixes #917, fixes #1423, and fixes #1606. It also mostly fixes #1075. I think we can close it now and do the transition of the remaining Collectors to Outputs in separate PRs, since they should be perfectly usable with the simple
Output
adapter that wraps them for now.This PR already includes the new
json
output and an adapter for every otherCollector
-type one. The current plan is to also move thecloud
andcsv
outputs to the newOutput
interface for k6 v0.31.0, whileinfluxdb
,statsd
,datadog
, andkafka
are left with the oldCollector
interface and moved in k6 v0.32.0. The major bugs (#1075 and #1606) should be fixed for all outputs in v0.31.0 regardless, we're just leaving the semi-mechanical translation for later. We can use the additional time to also revisit some other issues (e.g. #1801, #1052, #737 and many more).