Audit: logging and improvements #28056

Merged: 4 commits, Aug 12, 2024

@peteski22 commented Aug 12, 2024

Description

This PR improves the audit subsystem:

  • Updates the go-eventlogger library to v0.2.10, which includes context errors when needed in logging requests/responses
  • Adds TRACE logging to indicate when a new context must be derived because the existing one is not viable for logging requests/responses
  • Improves the sensitivity of 'sink' nodes (file, socket) to context cancellation

HashiCorp Checklist

  • Labels: If this PR is the CE portion of an ENT change, and that ENT change is
    getting backported to N-2, use the new style backport/ent/x.x.x+ent labels
    instead of the old style backport/x.x.x labels.
  • Labels: If this PR is a CE only change, it can only be backported to N, so use
    the normal backport/x.x.x label (there should be only 1).
  • ENT Breakage: If this PR either 1) removes a public function OR 2) changes the signature
    of a public function, even if that change is in a CE file, double check that
    applying the patch for this PR to the ENT repo and running tests doesn't
    break any tests. Sometimes ENT only tests rely on public functions in CE
    files.
  • Jira: If this change has an associated Jira, it's referenced either
    in the PR description, commit message, or branch name.
  • RFC: If this change has an associated RFC, please link it in the description.
  • ENT PR: If this change has an associated ENT PR, please link it in the
    description. Also, make sure the changelog is in this PR, not in your ENT PR.

…ntext sensitivity of sink nodes (file, socket), update eventlogger to include context info in error
@peteski22 added the core/audit, hashicorp-contributed-pr, backport/ent/1.16.x+ent, and backport/1.17.x labels Aug 12, 2024
@peteski22 added this to the 1.16.8 milestone Aug 12, 2024

github-actions bot commented Aug 12, 2024

CI Results:
All Go tests succeeded! ✅

github-actions bot commented Aug 12, 2024

Build Results:
All builds succeeded! ✅

@kubawi (Contributor) left a comment

LGTM

    case <-ctx.Done():
        return ctx.Err()
    default:
        if s.fileLock.TryLock() {
Contributor

Should there be a small sleep in an else case here? I worry that something could get spinning in here up to 100% CPU. If the lock's not ready then this thread is super busy spinning through this loop non-stop. We had a similar bug recently in Agent/Proxy leading to high CPU usage.

Contributor

This same question is true for the similar bit of code in internal/observability/event/sink_socket.go

Author

Yeah, it's a good point. I've pushed up some changes to just let us queue for the lock (letting Go figure it out) then check the context straight away when we get the lock (potentially just releasing the lock if we're 'done').

@VioletHynes (Contributor) left a comment

Looks great! Thanks for thinking about the CPU thing :)

sink.fileLock.Unlock()

// Just a little bit of time to make sure that 'log' returned and err was set.
corehelpers.RetryUntil(t, 3*time.Second, func() error {
Contributor

Nice!

@peteski22 merged commit b061606 into main Aug 12, 2024
83 checks passed
@peteski22 deleted the peteski22/audit/context-improvements branch August 12, 2024 17:36
Labels: backport/ent/1.16.x+ent, core/audit, do-not-merge, hashicorp-contributed-pr

3 participants