Move source mapping to model processing #5631

Merged: 6 commits, Jul 12, 2021

axw
Member

@axw axw commented Jul 7, 2021

Motivation/summary

Source mapping transformation is rewritten as a model.BatchProcessor, moved out of the model package and into the sourcemap package. Moving such logic out of the model package takes us another step towards generating the model types and beat.Event transformation.

One side effect of the move to a processor, worth debating, is that source mapping now occurs in the HTTP handler goroutine rather than in the publisher, which could increase request latency slightly. On the other hand, this means that source mapping can now happen concurrently for more clients than there are CPUs (the publisher is limited to the number of CPUs in this regard), and the handlers will block for up to one second anyway if the publisher is busy or its queue is full.

If a stack frame cannot be mapped, we no longer set sourcemap.error or log anything. Just because RUM and source mapping are enabled does not mean that all stacks must be source mapped; therefore these "errors" are mostly just noise. Likewise, we now only set sourcemap.updated when it is true.

I removed the TestSourcemapCacheExpiration system test as I believe it is redundant. In the test immediately above that one we prove that caching is effective, and cache expiration is unit tested.

Checklist

How to test these changes

  1. Enable RUM, upload a source map, check that the source map is applied to indexed events
  2. Enable RUM, don't upload a source map, check that events are indexed without sourcemap.error or sourcemap.updated set

Related issues

Prerequisite for #4120 and #3565
Closes #4958

@axw axw added the v7.15.0 label Jul 7, 2021
@axw axw force-pushed the sourcemap-processor branch from 69ecda7 to 277cb56 Compare July 7, 2021 09:56
@apmmachine
Contributor

apmmachine commented Jul 7, 2021

💚 Build Succeeded

Build stats

  • Start Time: 2021-07-12T07:30:41.588+0000

  • Duration: 49 min 17 sec

  • Commit: 16987cf

Test stats 🧪

Test Results
Failed 0
Passed 5881
Skipped 14
Total 5895

@axw axw force-pushed the sourcemap-processor branch 2 times, most recently from 4c88168 to a73185e Compare July 7, 2021 10:00
@axw axw force-pushed the sourcemap-processor branch from a73185e to 7b67f55 Compare July 7, 2021 10:02
@axw axw marked this pull request as ready for review July 7, 2021 12:48
@axw
Member Author

axw commented Jul 7, 2021

/test

@axw axw marked this pull request as draft July 7, 2021 14:05
@axw
Member Author

axw commented Jul 7, 2021

The failures highlight an obvious issue, which is that if Elasticsearch is unavailable it should not prevent ingestion. When we have ditched the old direct-to-Elasticsearch source map implementation this won't be an issue, but for now we'll need to continue logging and recording these errors in sourcemap.error.

If there is an error fetching a source map,
we once again set `sourcemap.error` on the
frame and log the error at debug level. The
logging is now using zap's rate limited
logging, rather than storing in a map.
@axw axw force-pushed the sourcemap-processor branch from 1f61072 to 4fd5174 Compare July 8, 2021 05:34
@axw
Member Author

axw commented Jul 8, 2021

I reinstated the logging/recording of source map fetching errors.

It is possible for the processor to take a significant amount of time due to Elasticsearch search retries, e.g. when the server is unavailable. In a realistic setup I can't see moving this from the publisher to the handler making a practical difference: if the delay occurred in the publisher, its queue would back up and cause the handler to time out. I have added a configurable timeout for the sourcemap processor, defaulting to 5s. This is currently ineffective due to elastic/go-elasticsearch#300.

In the future when we're fully on Fleet, the catalogue of source maps will be pushed to APM Server. I think it would then be reasonable for APM Server to eagerly fetch and cache these, rather than fetching them on demand. Then the fetching will not be in the critical path of the handler, and it can just use them if they're in memory.

@axw axw force-pushed the sourcemap-processor branch from 6db6c7f to 552f353 Compare July 8, 2021 14:00
@axw axw marked this pull request as ready for review July 9, 2021 06:55
@axw axw requested a review from a team July 9, 2021 09:10
Contributor

@stuartnelson3 stuartnelson3 left a comment

One small change and a question; but LGTM!

beater/beater.go (outdated)
@@ -46,8 +47,14 @@ var (
validateError = monitoring.NewInt(registry, "validation.errors")
)

// AddedNotifier is an interface for notifying of sourcemap additions.
// This is implemented by sourcemap.Store.
type AddedNotifier interface {
Contributor

What's the motivation for exporting this interface? It doesn't cost anything to export it, just curious since the interface isn't being referenced as an argument or struct member anywhere else.

Member Author

It's used in the Handler function below. Am I misunderstanding your question?

Member Author

Going to merge as-is, happy to update if you think there's an improvement to be had.

Contributor

I don't think it's an improvement, I was just wondering about your personal opinion on exporting or not exporting an interface. In this case, AddedNotifier is only referenced once in the project, as an argument to the function below in the same package. A non-exported addedNotifier would also have worked, so I was curious whether there was a specific motivation behind the choice.

Member Author

Gotcha. My preference is to always have the interface exported if it is referenced by some exported type/function. IMO you should be able to understand how to use the interface just by clicking through the godoc UI, without delving into source.

Contributor

cool, thanks!

@axw axw enabled auto-merge (squash) July 12, 2021 06:34
Contributor

@simitt simitt left a comment

LGTM

@axw
Member Author

axw commented Jul 12, 2021

/test

@axw axw merged commit ac3dc27 into elastic:master Jul 12, 2021
mergify bot pushed a commit that referenced this pull request Jul 12, 2021
* Move source mapping to model processing

Source mapping transformation is rewritten as
a model.BatchProcessor, moved out of the model
package and into the sourcemap package.

One side-effect of the move to a process, worth
debating, is that source mapping now occurs in
the HTTP handler goroutine rather than in the
publisher, which could increase request latency
slightly. On the other hand this means that source
mapping for more clients than there are CPUs can
now happen concurrently (the publisher is limited
to the number of CPUs in this regard); and the
handlers will block for up to one second anyway
if the publisher is busy/queue is full.

If a stack frame cannot be mapped, we no longer
set `sourcemap.error` or log anything. Just because
RUM and source mapping is enabled, does not mean
that all stacks _must_ be source mapped; therefore
these "errors" are mostly just noise. Likewise we
now only set `sourcemap.updated` when it is true.

* Reintroduce `sourcemap.error`

If there is an error fetching a source map,
we once again set `sourcemap.error` on the
frame and log the error at debug level. The
logging is now using zap's rate limited
logging, rather than storing in a map.

* Introduce sourcemap timeout config

* Remove unnecessary fleetCfg param

(cherry picked from commit ac3dc27)

# Conflicts:
#	changelogs/head.asciidoc
@axw axw deleted the sourcemap-processor branch July 12, 2021 08:45
axw added a commit that referenced this pull request Jul 12, 2021
* Move source mapping to model processing (#5631)

* Delete head.asciidoc

Co-authored-by: Andrew Wilkins <[email protected]>
@stuartnelson3 stuartnelson3 self-assigned this Aug 26, 2021
@stuartnelson3
Contributor

Confirmed with BC2

@stuartnelson3 stuartnelson3 removed their assignment Aug 26, 2021
Development

Successfully merging this pull request may close these issues.

RUM: sourcemap.error on and sourcemap.updated are set overzealously
4 participants