Add transaction_ignore_urls spec #333

Merged
merged 8 commits into from
Oct 5, 2020

Conversation

felixbarny
Member

@felixbarny felixbarny commented Aug 26, 2020

supersedes #144

Note that the implementation of this spec also includes adding the option to Kibana.

@apmmachine

apmmachine commented Aug 26, 2020

💔 Build Failed

Build stats

  • Build Cause: [Started by timer]

  • Start Time: 2020-10-05T04:05:01.013+0000

  • Duration: 3 min 45 sec

Steps errors

  • Name: Shell Script
    • Description: [2020-10-05T04:07:45.167Z] + git diff --name-only 0f5151f5a333730e0e78f789ac398b24375ed82d...3c0076f

    • Duration: 0 min 0 sec

    • Start Time: 2020-10-05T04:07:44.875+0000

Log output

[2020-10-05T04:07:07.227Z] Running on apm-ci-immutable-ubuntu-1804-1601870722075821012 in /var/lib/jenkins/workspace/ared_apm-update-specs-mbp_PR-333
[2020-10-05T04:07:07.355Z] [INFO] Override default checkout
[2020-10-05T04:07:07.398Z] Sleeping for 10 sec
[2020-10-05T04:07:20.362Z] using credential f6c7695a-671e-4f4f-a331-acdce44ff9ba
[2020-10-05T04:07:20.435Z] Wiping out workspace first.
[2020-10-05T04:07:20.471Z] Cloning the remote Git repository
[2020-10-05T04:07:20.471Z] Using shallow clone with depth 4
[2020-10-05T04:07:20.471Z] Avoid fetching tags
[2020-10-05T04:07:20.502Z] Cloning repository git@github.com:elastic/apm.git
[2020-10-05T04:07:20.561Z]  > git init /var/lib/jenkins/workspace/ared_apm-update-specs-mbp_PR-333 # timeout=10
[2020-10-05T04:07:20.625Z] Fetching upstream changes from git@github.com:elastic/apm.git
[2020-10-05T04:07:20.625Z]  > git --version # timeout=10
[2020-10-05T04:07:20.631Z]  > git --version # 'git version 2.17.1'
[2020-10-05T04:07:20.632Z] using GIT_SSH to set credentials GitHub user @elasticmachine SSH key
[2020-10-05T04:07:20.662Z]  > git fetch --no-tags --progress -- git@github.com:elastic/apm.git +refs/heads/*:refs/remotes/origin/* # timeout=15
[2020-10-05T04:07:21.422Z] Cleaning workspace
[2020-10-05T04:07:21.447Z] Using shallow fetch with depth 4
[2020-10-05T04:07:21.447Z] Pruning obsolete local branches
[2020-10-05T04:07:22.074Z] Merging remotes/origin/master commit adc3d3746c218bdfd7979de468ecfbaa50a9988d into PR head commit 3c0076f4f880a8e8d4526881ccaedd223f8649a3
[2020-10-05T04:07:21.388Z]  > git config remote.origin.url git@github.com:elastic/apm.git # timeout=10
[2020-10-05T04:07:21.394Z]  > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
[2020-10-05T04:07:21.407Z]  > git config remote.origin.url git@github.com:elastic/apm.git # timeout=10
[2020-10-05T04:07:21.426Z]  > git rev-parse --verify HEAD # timeout=10
[2020-10-05T04:07:21.432Z] No valid HEAD. Skipping the resetting
[2020-10-05T04:07:21.433Z]  > git clean -fdx # timeout=10
[2020-10-05T04:07:21.455Z] Fetching upstream changes from git@github.com:elastic/apm.git
[2020-10-05T04:07:21.455Z] using GIT_SSH to set credentials GitHub user @elasticmachine SSH key
[2020-10-05T04:07:21.459Z]  > git fetch --no-tags --progress --prune -- git@github.com:elastic/apm.git +refs/pull/333/head:refs/remotes/origin/PR-333 +refs/heads/master:refs/remotes/origin/master # timeout=15
[2020-10-05T04:07:22.085Z]  > git config core.sparsecheckout # timeout=10
[2020-10-05T04:07:22.097Z]  > git checkout -f 3c0076f4f880a8e8d4526881ccaedd223f8649a3 # timeout=15
[2020-10-05T04:07:22.138Z]  > git remote # timeout=10
[2020-10-05T04:07:22.205Z] Merge succeeded, producing 4d247c30ed0ba5aba5475a66be2e4bd0a36d218a
[2020-10-05T04:07:22.206Z] Checking out Revision 4d247c30ed0ba5aba5475a66be2e4bd0a36d218a (PR-333)
[2020-10-05T04:07:22.163Z]  > git config --get remote.origin.url # timeout=10
[2020-10-05T04:07:22.173Z] using GIT_SSH to set credentials GitHub user @elasticmachine SSH key
[2020-10-05T04:07:22.177Z]  > git merge adc3d3746c218bdfd7979de468ecfbaa50a9988d # timeout=10
[2020-10-05T04:07:22.198Z]  > git rev-parse HEAD^{commit} # timeout=10
[2020-10-05T04:07:22.210Z]  > git config core.sparsecheckout # timeout=10
[2020-10-05T04:07:22.216Z]  > git checkout -f 4d247c30ed0ba5aba5475a66be2e4bd0a36d218a # timeout=15
[2020-10-05T04:07:25.895Z] Commit message: "Merge commit 'adc3d3746c218bdfd7979de468ecfbaa50a9988d' into HEAD"
[2020-10-05T04:07:25.912Z] First time build. Skipping changelog.
[2020-10-05T04:07:25.912Z] Cleaning workspace
[2020-10-05T04:07:25.902Z]  > git rev-list --no-walk ef777a83a1c83b02ae3a240a478461432cff2a1b # timeout=10
[2020-10-05T04:07:25.917Z]  > git rev-parse --verify HEAD # timeout=10
[2020-10-05T04:07:25.934Z] Resetting working tree
[2020-10-05T04:07:25.934Z]  > git reset --hard # timeout=10
[2020-10-05T04:07:25.949Z]  > git clean -fdx # timeout=10
[2020-10-05T04:07:27.010Z] Masking supported pattern matches of $JOB_GCS_BUCKET or $NOTIFY_TO
[2020-10-05T04:07:27.046Z] Timeout set to expire in 3 hr 0 min
[2020-10-05T04:07:27.056Z] The timestamps step is unnecessary when timestamps are enabled for all Pipeline builds.
[2020-10-05T04:07:27.266Z] [INFO] 'shallow' is forced to be disabled when running on PullRequests
[2020-10-05T04:07:27.275Z] Running in /var/lib/jenkins/workspace/ared_apm-update-specs-mbp_PR-333/src/github.com/elastic/apm
[2020-10-05T04:07:27.286Z] [INFO] gitCheckout: Checkout SCM PR-333 with some customisation.
[2020-10-05T04:07:27.300Z] [INFO] Override default checkout
[2020-10-05T04:07:27.321Z] Sleeping for 10 sec
[2020-10-05T04:07:37.468Z] using credential f6c7695a-671e-4f4f-a331-acdce44ff9ba
[2020-10-05T04:07:37.546Z] Cloning the remote Git repository
[2020-10-05T04:07:37.565Z] Cloning repository git@github.com:elastic/apm.git
[2020-10-05T04:07:37.596Z]  > git init /var/lib/jenkins/workspace/ared_apm-update-specs-mbp_PR-333/src/github.com/elastic/apm # timeout=10
[2020-10-05T04:07:37.610Z] Fetching upstream changes from git@github.com:elastic/apm.git
[2020-10-05T04:07:37.610Z]  > git --version # timeout=10
[2020-10-05T04:07:37.620Z]  > git --version # 'git version 2.17.1'
[2020-10-05T04:07:37.620Z] using GIT_SSH to set credentials GitHub user @elasticmachine SSH key
[2020-10-05T04:07:37.627Z]  > git fetch --tags --progress -- git@github.com:elastic/apm.git +refs/heads/*:refs/remotes/origin/* # timeout=10
[2020-10-05T04:07:38.257Z]  > git config remote.origin.url git@github.com:elastic/apm.git # timeout=10
[2020-10-05T04:07:38.264Z]  > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
[2020-10-05T04:07:38.272Z]  > git config remote.origin.url git@github.com:elastic/apm.git # timeout=10
[2020-10-05T04:07:38.285Z] Fetching upstream changes from git@github.com:elastic/apm.git
[2020-10-05T04:07:38.286Z] using GIT_SSH to set credentials GitHub user @elasticmachine SSH key
[2020-10-05T04:07:38.291Z]  > git fetch --tags --progress -- git@github.com:elastic/apm.git +refs/pull/333/head:refs/remotes/origin/PR-333 +refs/heads/master:refs/remotes/origin/master # timeout=10
[2020-10-05T04:07:38.905Z] Checking out Revision 3c0076f4f880a8e8d4526881ccaedd223f8649a3 (origin/PR-333)
[2020-10-05T04:07:38.942Z] Commit message: "Update metrics.md (#339)"
[2020-10-05T04:07:38.942Z] First time build. Skipping changelog.
[2020-10-05T04:07:38.902Z]  > git rev-parse origin/PR-333^{commit} # timeout=10
[2020-10-05T04:07:38.909Z]  > git config core.sparsecheckout # timeout=10
[2020-10-05T04:07:38.919Z]  > git checkout -f 3c0076f4f880a8e8d4526881ccaedd223f8649a3 # timeout=10
[2020-10-05T04:07:39.655Z] Masking supported pattern matches of $GIT_USERNAME or $GIT_PASSWORD
[2020-10-05T04:07:40.276Z] + git fetch https://****:****@github.com/elastic/apm.git +refs/pull/*/head:refs/remotes/origin/pr/*
[2020-10-05T04:07:40.907Z] Archiving artifacts
[2020-10-05T04:07:41.559Z] + git rev-parse HEAD
[2020-10-05T04:07:41.926Z] + git rev-parse HEAD
[2020-10-05T04:07:42.239Z] + git rev-parse origin/pr/333
[2020-10-05T04:07:42.273Z] [INFO] githubEnv: Found Git Build Cause: pr
[2020-10-05T04:07:42.554Z] Masking supported pattern matches of $GITHUB_TOKEN
[2020-10-05T04:07:43.721Z] [INFO] githubPrCheckApproved: Title: Add transaction_ignore_urls spec - User: felixbarny - Author Association: MEMBER
[2020-10-05T04:07:44.386Z] Stashed 279 file(s)
[2020-10-05T04:07:44.812Z] Running in /var/lib/jenkins/workspace/ared_apm-update-specs-mbp_PR-333/src/github.com/elastic/apm
[2020-10-05T04:07:45.167Z] + git diff --name-only 0f5151f5a333730e0e78f789ac398b24375ed82d...3c0076f4f880a8e8d4526881ccaedd223f8649a3
[2020-10-05T04:07:45.167Z] fatal: Invalid symmetric difference expression 0f5151f5a333730e0e78f789ac398b24375ed82d...3c0076f4f880a8e8d4526881ccaedd223f8649a3
[2020-10-05T04:07:45.220Z] Stage "Send Pull Request for BDD specs" skipped due to earlier failure(s)
[2020-10-05T04:07:45.238Z] Stage "Send Pull Request for JSON specs" skipped due to earlier failure(s)
[2020-10-05T04:07:45.426Z] Running on Jenkins in /var/lib/jenkins/workspace/ared_apm-update-specs-mbp_PR-333
[2020-10-05T04:07:45.501Z] [INFO] getVaultSecret: Getting secrets
[2020-10-05T04:07:45.652Z] Masking supported pattern matches of $VAULT_ADDR or $VAULT_ROLE_ID or $VAULT_SECRET_ID
[2020-10-05T04:07:46.229Z] + chmod 755 generate-build-data.sh
[2020-10-05T04:07:46.229Z] + ./generate-build-data.sh https://apm-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/apm-shared/apm-update-specs-mbp/PR-333/ https://apm-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/apm-shared/apm-update-specs-mbp/PR-333/runs/17 FAILURE 164955
[2020-10-05T04:07:46.480Z] INFO: curl https://apm-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/apm-shared/apm-update-specs-mbp/PR-333/runs/17/steps/?limit=10000 -o steps-info.json
[2020-10-05T04:07:47.391Z] INFO: curl https://apm-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/apm-shared/apm-update-specs-mbp/PR-333/runs/17/tests/?status=FAILED -o tests-errors.json
[2020-10-05T04:07:47.391Z] Retry 1/3 exited 22, retrying in 1 seconds...
[2020-10-05T04:07:48.302Z] Retry 2/3 exited 22, retrying in 2 seconds...

@felixbarny felixbarny linked an issue Aug 26, 2020 that may be closed by this pull request
@felixbarny felixbarny requested review from beniwohli and removed request for beniwohli August 26, 2020 14:23
beniwohli and others added 2 commits September 3, 2020 10:53
…e_urls

* upstream/master:
  [CI] compare with the calculated SHA commit (elastic#336)
  add link to PHP documentation
  Link to create-agent-issues.sh in spec process
@beniwohli beniwohli requested a review from mikker September 3, 2020 12:42
Contributor

@mikker mikker left a comment

green-light

@beniwohli beniwohli marked this pull request as ready for review September 3, 2020 14:29
@beniwohli beniwohli requested review from a team as code owners September 3, 2020 14:29
http://whatever.com/home/index?value1=123
```

NOTE:
Contributor

What is the motivation for this exception?

Contributor

@beniwohli beniwohli Sep 4, 2020

This has always been the case: transaction sampling/ignoring has no effect on exception tracking.

```

NOTE:
All errors that are captured during a request to an ignored URL are still sent to the APM Server regardless of this setting.
Contributor

Does it mean errors for ignored transactions won’t have parent_id, transaction_id, transaction, context and trace_id?

Contributor

transaction_id will obviously have to be set to null or equivalent; I'm not so sure about the others. trace_id could theoretically be around from distributed tracing, but it might involve overhead to parse it and make it available for error collection, which is exactly what we want to avoid with this setting.

In the Python agent, we generate the context for exceptions separately, but that might be considered an implementation detail.

@@ -35,6 +35,31 @@ Request and response headers, cookies, and form bodies should be sanitised (i.e.

Agents may include additional patterns if there are common conventions specific to language frameworks.

##### `transaction_ignore_urls` configuration

Used to restrict requests to certain URLs from being instrumented.
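
As a rough illustration of what such a setting implies, here is a minimal Python sketch of an ignore check. It assumes that patterns are simple `*` wildcards matched case-insensitively against the request path only (not the query string); the matching rules are not quoted in this thread, so treat those details as assumptions. The names `compile_patterns` and `should_ignore` and the sample patterns are hypothetical and do not come from any agent.

```python
import re
from urllib.parse import urlsplit


def compile_patterns(patterns):
    """Turn simple `*` wildcard patterns into case-insensitive regexes."""
    compiled = []
    for pattern in patterns:
        # Escape each literal piece, then let `*` match any run of characters.
        regex = ".*".join(re.escape(part) for part in pattern.split("*"))
        compiled.append(re.compile(regex, re.IGNORECASE))
    return compiled


def should_ignore(url, compiled_patterns):
    """Return True if the path of `url` fully matches any ignore pattern."""
    # Assumption: only the path is matched, so the query string is dropped.
    path = urlsplit(url).path
    return any(p.fullmatch(path) for p in compiled_patterns)


# Hypothetical configuration value: health checks and static stylesheets.
patterns = compile_patterns(["/health*", "*.css"])
ignored = should_ignore("http://whatever.com/health/live", patterns)
```

An agent would typically run a check like this once per incoming request, before starting a transaction, so that ignored requests incur as little overhead as possible.
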
Contributor

Should the agent propagate data related to distributed tracing?

Member Author

I guess the simplest thing to do is to treat the ignored URLs as if there's no agent at all. I expect most of the use cases to exclude health checks and static resources. For those cases, typically no downstream services are involved. Obviously, there can be exceptions to that rule.
Do any of our agents propagate the context for ignored URLs?

I'm also fine to not explicitly specify that bit. If it becomes an important question, we can follow up on it.

Contributor

@SergeyKleyman SergeyKleyman Sep 4, 2020

If the use case for this feature is to completely exclude parts of the site/application from being monitored (which is what I assumed when I first heard about this feature), then why do we need the exception for errors, discussed a few comments above? I think it would be a much cleaner mental model if the agent doesn't do anything for ignored URLs: it doesn't report anything, doesn't propagate distributed tracing data, etc.

Member Author

You're probably onto something here. This is how we implemented error handling in the context of processing a request that corresponds to an ignored URL:

When the user manually captures an error we still send that error.

However, we won't catch exceptions that happen during the request processing and automatically capture an error, as we'd do for non-ignored URLs.

This might be different from how other agents handle it. If that's the case, I think it does make sense to align eventually but I'd like to handle that in a follow-up and not block the progress on this spec.
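
The behavior described in this comment can be sketched roughly as follows. This is only an illustration in Python under stated assumptions, not the code of any agent: `SketchAgent`, `capture_error`, `handle_request`, and `sent_errors` are hypothetical names, and the list stands in for whatever queue delivers events to the APM Server.

```python
class SketchAgent:
    """Hypothetical agent; not the API of any real Elastic APM agent."""

    def __init__(self, is_ignored):
        self.is_ignored = is_ignored  # callable: path -> bool
        self.sent_errors = []         # stands in for the queue to APM Server

    def capture_error(self, exc):
        # Manual capture: always sent, even while handling an ignored URL.
        self.sent_errors.append(exc)

    def handle_request(self, path, handler):
        if self.is_ignored(path):
            # Ignored URL: run the handler uninstrumented, so an exception
            # propagates without being captured automatically.
            return handler(self)
        try:
            return handler(self)
        except Exception as exc:
            # Non-ignored URL: capture automatically, then re-raise.
            self.capture_error(exc)
            raise
```

With this shape, an exception thrown while serving an ignored URL reaches the application's own error handling untouched, while an explicit `capture_error` call inside the same request is still recorded.
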

Contributor

So wouldn’t it be better to leave both error reporting and distributed tracing data propagation aspects unspecified?

Member Author

SGTM, unless all agents currently behave the same anyways. But seems like they don't.

Contributor

@SergeyKleyman @felixbarny so if I understand the discussion correctly, the spec is OK as is? Or would you like a note that error handling and trace propagation remains unspecified for now?

Contributor

It seems that existing implementations by the agents differ in error handling and trace propagation parts so the simplest approach at this point would be to have those aspects unspecified. I'm not sure a note is necessary - we can just not include those parts in the spec.

@beniwohli
Contributor

@elastic/apm-agent-rum can you confirm that you're OK with this spec?

Contributor

@hmdhk hmdhk left a comment

@beniwohli, we don't have any plans to add this config option to Kibana at the moment, but we will align the name with other agents once that happens.

@alex-fedotyev alex-fedotyev left a comment

Looking good!

@felixbarny felixbarny merged commit d5b2c87 into elastic:master Oct 5, 2020
@felixbarny felixbarny deleted the transaction_ignore_urls branch October 5, 2020 17:37
@felixbarny felixbarny added this to the 7.11 milestone Oct 20, 2020