Vault 1.15 audit log (SIGHUP) reload doesn't release and reopen file #23596

Closed
Langleu opened this issue Oct 11, 2023 · 2 comments
Labels: bug, core/audit, reproduced

Langleu commented Oct 11, 2023

Describe the bug
We have had Vault 1.15 running for 8+ days and got an alert today that the audit log disk was close to reaching 10 GB. On inspection, the audit log file was 0 bytes, yet the disk (mounted only for the audit log) had close to 10 GB in use with no file present to account for it.

Example from an instance where the files in /var/log total 4M but df -h reports more than 200M used:

/var/log # ls -lh
total 4M
-rw-r--r--    1 root     root        8.0K Oct 11 08:14 audit.db
-rw-r--r--    1 root     root       32.0K Oct 11 08:15 audit.db-shm
-rw-r--r--    1 root     root        3.9M Oct 11 08:15 audit.db-wal
drwx------    2 root     root       16.0K Jul 13  2021 lost+found
-rw-------    1 root     root           0 Oct 11 08:15 vault_audit.log
---
df -h
/dev/sdb                  9.7G    211.9M      9.5G   2% /var/log
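
One way to quantify this kind of mismatch is to compare the total size of the files visible under the mount point with what the filesystem reports as used. The following is a minimal sketch in Python and not part of the original report; the function names are hypothetical, and a small gap is always expected from filesystem metadata and block rounding.

```python
import os
import shutil
import stat


def visible_bytes(directory):
    """Sum the sizes of regular files reachable under `directory`."""
    total = 0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total


def unaccounted_bytes(directory):
    """Filesystem 'used' bytes minus the bytes visible under `directory`.

    Only meaningful when `directory` is the mount point of a dedicated
    filesystem, as /var/log is in this report.
    """
    used = shutil.disk_usage(directory).used
    return used - visible_bytes(directory)
```

A large and steadily growing result from `unaccounted_bytes("/var/log")` would be consistent with space being held by a file that no longer has a name in the directory.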

Our setup uses fluent-bit to read the log file and push the data to BigQuery. We rotate the log with logrotate every 200 MB and then send a SIGHUP to Vault. This has always worked until now; I retested with 1.14 and it still works there.

We have 3 Vault instances for different envs (dev / stage / prod) and all of them are experiencing the issue.
My theory is that the SIGHUP no longer makes Vault release the existing file and open the new one. The old file is gone from the filesystem, but Vault somehow keeps writing to it rather than to the new file.
Nothing changed on the logrotate side and I can also see the reload (via SIGHUP) in the log.
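
The mechanism behind this theory is standard POSIX behavior: removing a file's name does not close descriptors that are already open on it, so a writer can keep consuming disk space through a file that no longer appears in the directory. The following is a minimal sketch in Python (not Vault's code; the function name is hypothetical) that reproduces the effect without Vault or logrotate.

```python
import os


def write_after_unlink(path, payload=b"audit entry\n"):
    """Keep writing through a descriptor after its file name is removed.

    Returns (link_count, size_in_bytes, name_still_exists) as observed
    through the still-open descriptor. Relies on POSIX unlink semantics.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, payload)
        os.unlink(path)        # stand-in for rotation removing the old file
        os.write(fd, payload)  # the write still succeeds
        st = os.fstat(fd)
        return st.st_nlink, st.st_size, os.path.exists(path)
    finally:
        os.close(fd)           # the space is reclaimed only after this close
```

On Linux, files in this state typically show up with a "(deleted)" suffix on the symlink targets under /proc/<pid>/fd of the process that holds them open.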

To Reproduce
Steps to reproduce the behavior:

  1. Run a Vault 1.15 instance with a file audit device writing to /var/log.
  2. Run logrotate as a sidecar container with a shared process namespace to rotate the log and send SIGHUP to Vault via postrotate.
  3. Observe that the file is empty but disk usage keeps growing, even though the disk is mounted only for the audit log and the growth matches Vault's audit log volume.

Expected behavior
SIGHUP should make Vault release the old file and reopen the file at the configured path, just like it did in 1.14 and earlier.
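
For illustration, the reopen-on-SIGHUP pattern that log rotation relies on can be sketched as follows. This is a minimal Python sketch of the general technique, not Vault's implementation (Vault is written in Go); the class and function names are hypothetical.

```python
import os
import signal


class ReopenableLog:
    """Append-only log file that can be closed and reopened by path."""

    def __init__(self, path):
        self.path = path
        self._fh = open(path, "ab")

    def write(self, data):
        self._fh.write(data)
        self._fh.flush()

    def reopen(self, signum=None, frame=None):
        """Release the current file and open whatever now lives at path.

        The (signum, frame) parameters let this double as a signal handler.
        """
        self._fh.close()
        self._fh = open(self.path, "ab")

    def close(self):
        self._fh.close()


def install_sighup_reopen(log):
    """Reopen `log` whenever the process receives SIGHUP (Unix only)."""
    signal.signal(signal.SIGHUP, log.reopen)
```

With a rename-then-create rotation, any write made between the rotation and the SIGHUP lands in the rotated file; once `reopen` has run, writes go to the new file at the original path and the old file can be released.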

Environment:

  • Vault Server Version (retrieve with vault status): 1.15.0, built 2023-09-22T16:53:10Z
  • Vault CLI Version (retrieve with vault version): v1.15.0 (b4d0727), built 2023-09-22T16:53:10Z
  • Server Operating System/Architecture: Kubernetes GKE cluster - 1.27.3

Vault server configuration file(s):

Vault is set up via Terraform.
Relevant config:

resource "vault_audit" "file_logs" {
  type  = "file"
  local = true

  options = {
    file_path = "/var/log/vault_audit.log"
  }
}

Additional context
I checked the docs and changelog and nothing obvious changed around the file audit log.

The docs state

Vault will continue writing to the same audit log file even if it was moved or renamed as part of log rotation; issuing a SIGHUP to the Vault process is necessary for the file to be released. 

and I feel like the first part is exactly what's happening even though SIGHUP was triggered.

peteski22 commented Oct 11, 2023

Hi @Langleu, thanks so much for reporting this issue. Unfortunately, it does seem like a bug 🐛.

As a result of this issue, we've created and merged a fix due for release in Vault 1.15.1, which isn't too far away.

We appreciate that between now and 1.15.1's release we need to do something to unblock Vault users, so we've documented this as a known issue, along with a temporary workaround that reverts Vault's audit system to its pre-1.15 behavior.

We hope this helps, and thanks again for taking the time to file a comprehensive report.

https://developer.hashicorp.com/vault/docs/release-notes/1.15.0

https://developer.hashicorp.com/vault/docs/upgrading/upgrade-to-1.15.x#file-audit-devices-do-not-honor-sighup-signal-to-reload

Langleu commented Oct 12, 2023

Thanks for the quick fix, @peteski22 🚀.
The workaround works great!
