-
Notifications
You must be signed in to change notification settings - Fork 524
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor processors and move remaining models into "models". #1137
Conversation
3369068
to
5b1f4aa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate the separate PR for this refactoring!
Since this is the first bit of the code restructure, have you thought about moving eg. the decoding part to the decoder package, the transformation part to a transformer package (probably not the best name), etc. I am not a huge fan of having logic within models
, but rather use models as pure domain types that get handled by services
or processors
or what not.
processor/error/error.go
Outdated
@@ -15,30 +15,22 @@ | |||
// specific language governing permissions and limitations | |||
// under the License. | |||
|
|||
package healthcheck | |||
package processor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be package error
I guess
model/error/event.go
Outdated
errorCounter = monitoring.NewInt(Metrics, "errors") | ||
stacktraceCounter = monitoring.NewInt(Metrics, "stacktraces") | ||
frameCounter = monitoring.NewInt(Metrics, "frames") | ||
processorEntry = common.MapStr{"name": processorName, "event": errorDocType} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you move these counters to the event
when they are used in the payload
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rifght. They'll be used in the event instead in the future, but I'll moved them back for now.
model/metric/payload_test.go
Outdated
|
||
// its ok to ignore the outcome of this cast here because | ||
// we assert everything below | ||
payload, _ := payloadInterface.(*Payload) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I'd say that this should even be a require.NoError
, as if this conversion is not ok, the other checks don't make any sense and would just result in weird error messages.
model/metric/processor.go
Outdated
assert.True(t, ok) | ||
assert.IsType(t, &processor{}, p) | ||
} | ||
package metric |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this file good for?
@@ -15,20 +15,13 @@ | |||
// specific language governing permissions and limitations | |||
// under the License. | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the files get moved in a weird way, e.g. this one moves from processor/error/processor_test.go
to model/payload.go
. Could you maybe solve this by applying explicit git rm
and git mv
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't able to fix this easily unfortunately. It's possible to adjust the similarity index, but we also have some renames that are correctly detected as renames and changing the similarity index could jeopardize those. If you think it's important i can another stab at it
processor/processor.go
Outdated
|
||
"github.com/elastic/apm-server/model" | ||
"github.com/elastic/apm-server/validation" | ||
"github.com/elastic/beats/libbeat/monitoring" | ||
) | ||
|
||
type NewProcessor func() Processor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not needed anymore
|
||
type PayloadDecoder func(map[string]interface{}) (model.Payload, error) | ||
|
||
type PayloadProcessor struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this naming very confusing.
Now you have a Processor
interface, that is implemented by one type PayloadProcessor
. That actually returns a payload
for the decoding part, instead of acting on a payload
. Then the actual model.payload
types again do some decoding and validation.
I suggest to avoid having a name Payload
in the processor
package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see why this would be confusing. It's partly due to the fact that it's been split out from the mega PR initially had.
I'll share my mental model to see if that helps:
"Processing" to me includes taking the raw data and validating it, decoding it into a go structure and eventually transformining it into an event (transformation happens by way of a method on the go struct, so it's not technically defined by the processor, but a separate refactor can change this if we want). In intake v1, the APM Server deals only with Payloads. For intake v2, we'll also have to deal with individual objects/models.
There's a general interface Processor
today is only implemented by the PayloadProcessor
s. In the future we would have a ModelProcessor
that would know how to take individual data pieces and validate, decode and transform them.
Let me know if it makes more sense in this context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explaining. I think I am clear on the context. I still think that this is very confusing to call this PayloadProcessor
as it doesn't operate on what is called a payload further on.
I'd rather call it something like ProcessorImpl
than adding Payload
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed offline and decided to keep as is for now given that subsequent PRs will address this.
processor/sourcemap/sourcemap.go
Outdated
|
||
var ( | ||
validateCount = monitoring.NewInt(sm.Metrics, "validation.count") | ||
validateError = monitoring.NewInt(sm.Metrics, "validation.errors") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why instantiating the validationCount
and validationError
outside of the processor.PayloadProcessor
?
@roncohen could you please also provide benchmark results? I don't expect any big changes, but we should rely on numbers for refactoring PRs. |
beater/handlers.go
Outdated
@@ -36,11 +36,11 @@ import ( | |||
conf "github.com/elastic/apm-server/config" | |||
"github.com/elastic/apm-server/decoder" | |||
"github.com/elastic/apm-server/processor" | |||
perr "github.com/elastic/apm-server/processor/error" | |||
"github.com/elastic/apm-server/processor/healthcheck" | |||
err "github.com/elastic/apm-server/processor/error" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will end up shadowed a bunch, some other alias would be better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point
6963569
to
13f1317
Compare
benchmark results show there that are no difference in the number of allocations at all. The performance number for this PR looks better than Benchmark details
|
processor/error/error.go
Outdated
|
||
import ( | ||
pr "github.com/elastic/apm-server/processor" | ||
err "github.com/elastic/apm-server/model/error" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please also change err
here.
er "github.com/elastic/apm-server/processor/error" | ||
"github.com/elastic/apm-server/processor/error/generated/schema" | ||
"github.com/elastic/apm-server/model/error/generated/schema" | ||
err "github.com/elastic/apm-server/processor/error" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same for err
@@ -27,7 +27,7 @@ import ( | |||
s "github.com/go-sourcemap/sourcemap" | |||
|
|||
"github.com/elastic/apm-server/config" | |||
er "github.com/elastic/apm-server/processor/error" | |||
err "github.com/elastic/apm-server/processor/error" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from the missing rebase and the view minor comments left, LGTM
This moves error, transaction, sourcemap and metric models into directories in the "model" directory. It changes processors to be simple structs which can potentially override Decode/Validate (only sourcemaps do this). This removes a bunch of boilerplate from the processors. This will help move towards decoding and validating models individually, which is needed in the future.
aa1ebb0
to
3236b6e
Compare
backport waiting for elastic/beats#7629 |
…#1137) This moves error, transaction, sourcemap and metric models into directories in the "model" directory. It changes processors to be simple structs which can potentially override Decode/Validate (only sourcemaps do this). This removes a bunch of boilerplate from the processors. This will help move towards decoding and validating models individually, which is needed in the future. And healthcheck processor is now a simple http.Handler.
…1176) This moves error, transaction, sourcemap and metric models into directories in the "model" directory. It changes processors to be simple structs which can potentially override Decode/Validate (only sourcemaps do this). This removes a bunch of boilerplate from the processors. This will help move towards decoding and validating models individually, which is needed in the future. And healthcheck processor is now a simple http.Handler.
This moves error, transaction, sourcemap and metric models into directories in the "model" directory. It changes processors to be simple structs which can potentially override Decode/Validate (only sourcemaps overrides these). This removes a bunch of boilerplate from the processors. It will help move towards decoding and validating models individually, which is needed in the future for intake-v2.
I also got rid of the healthcheck processor, which is now a simple http.Handler.
There are still more we can do to separate models from processing, for example, counting (validation, transformations etc.) is spread between models and processor now, but I was unable to completely fix it in this PR. Subsequent PRs will address this.