Add log rotation based on log size #13641
Shall we consider writing a system test? We could add a pre-populated events file and configure its size as the limit in a containers.conf we'd use.
@baude PTAL
We shouldn't be rotating, IMO. Rotation is good for container logs. The events logs should straight truncate. This matches Docker, which only retains a maximum of... I want to say 4096?... events at one time, purely in-memory.
And, honestly, this won't solve our memory exhaustion issues - the files are still stored in tmpfs, so having two files versus one doesn't reduce the amount of RAM we consume.
@rhatdan and @baude requested rotation/deletion. The motivation was to keep things simple, solve the disk issue and ultimately get users to use the journald driver.
The underlying issue was disk consumption, not RAM.
As long as logfile-1 is deleted when it is rotated, we should only ever have at most 2x the max log size, correct?
We are free to decide the semantics. Currently, the rotation is really just a deletion.
There are many parts of the code which rely on reading events. If we just truncate, we run into problems elsewhere. IMO we want rotation.
We'll always face these risks: at some point a file gets deleted with rotation. We'd increase the time window for sure if rotating more than 0 files.
@vrothberg It's definitely a memory issue. These may be stored on disk, but they live in our temporary files directory, which is always on a tmpfs. Disk consumption is also a concern, but unbounded memory growth is much more of one. I continue to not really see the point of rotation. The events log is definitely referred to, but only for events that have occurred recently - a container that exited seconds ago, for example. Rotation doesn't help with that at all, because
I would say the event read code must still read the rotated file. If you just truncate the file, a second podman events process called in parallel will show no events. This looks wrong to me.
My expectation would ideally be that, on exceeding maximum size, we cut off the first N events (lines) of the file and continue writing to the end. It may be that it's simpler to do this via rotation, given that syscalls to shrink a file seem rather limited; but the behavior must be such that we act like a single logfile in all respects from the outside. Removal of excess files must also be performed automatically to remove potential for unbounded growth.
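To make the "cut off the first N events" idea concrete, here is a minimal sketch that keeps only the newest part of a log while respecting line boundaries. It is illustrative only: `trimOldest` is a hypothetical helper, not a function in libpod, and it works on an in-memory byte slice rather than on the events file itself.

```go
package main

import (
	"bytes"
	"fmt"
)

// trimOldest (hypothetical helper) drops whole lines from the front of data
// until at most keep bytes remain. A line is never split, so the result can
// be shorter than keep.
func trimOldest(data []byte, keep int) []byte {
	if len(data) <= keep {
		return data
	}
	// Index of the first byte that could be kept.
	cut := len(data) - keep
	// If cut lands in the middle of a line, skip forward to the next line start.
	if cut > 0 && data[cut-1] != '\n' {
		idx := bytes.IndexByte(data[cut:], '\n')
		if idx < 0 {
			return nil // no complete line fits into the budget
		}
		cut += idx + 1
	}
	return data[cut:]
}

func main() {
	log := []byte("event-1\nevent-2\nevent-3\nevent-4\n")
	// Keep roughly the newest 50% of the content.
	fmt.Printf("%q\n", trimOldest(log, len(log)/2))
}
```

Writing the trimmed content back needs a temporary copy, which is the "double the size for a short moment" cost raised elsewhere in this thread.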
It's unfortunate that we assigned work to an intern for which we obviously have no consensus on the design. The original task was to delete the file once the size limit is exceeded. This will address the RAM/disk issue since we control the size.
That means that we can only use 50 percent of the specified file limit, as we have to create a copy, drop the first N lines and then write it back. Truncating means double the size for a short moment (a similar problem arises if we keep more than one file). That's why @baude and @rhatdan opted for rotation. May I suggest making this a PS on Monday? Please have a look at the implementation and if/how the ideas are doable.
Can we just change the reader to read both files, current and current-1? Then, when current == 50% of max, we rotate it to current-1. Since the old current-1 is overwritten, we solve the problem. Correct?
I think that the suggestion from @rhatdan is probably a reasonable solution. The critical aspects, IMO, are:
|
@nicrowe00, I'll prepare a fix for the cross-build error on Windows and paste the diff here.
The diff below did the trick for me. You can apply it by copying the contents to a file (e.g,
index b7f00a4d390e..1745095fb190 100644
--- a/libpod/events/events.go
+++ b/libpod/events/events.go
@@ -3,11 +3,9 @@ package events
import (
"encoding/json"
"fmt"
- "os"
"time"
"github.com/containers/storage/pkg/stringid"
- "github.com/nxadm/tail"
"github.com/pkg/errors"
)
@@ -221,14 +219,3 @@ func StringToStatus(name string) (Status, error) {
}
return "", errors.Errorf("unknown event status %q", name)
}
-
-func (e EventLogFile) getTail(options ReadOptions) (*tail.Tail, error) {
- reopen := true
- seek := tail.SeekInfo{Offset: 0, Whence: os.SEEK_END}
- if options.FromStart || !options.Stream {
- seek.Whence = 0
- reopen = false
- }
- stream := options.Stream
- return tail.TailFile(e.options.LogFilePath, tail.Config{ReOpen: reopen, Follow: stream, Location: &seek, Logger: tail.DiscardingLogger, Poll: true})
-}
diff --git a/libpod/events/logfile.go b/libpod/events/logfile.go
index 3f6d736c9ab5..6c2cfce77f70 100644
--- a/libpod/events/logfile.go
+++ b/libpod/events/logfile.go
@@ -1,3 +1,6 @@
+//go:build linux
+// +build linux
+
package events
import (
@@ -11,6 +14,7 @@ import (
"github.com/containers/podman/v4/pkg/util"
"github.com/containers/storage/pkg/lockfile"
+ "github.com/nxadm/tail"
"github.com/pkg/errors"
"github.com/sirupsen/logrus"
"golang.org/x/sys/unix"
@@ -69,6 +73,17 @@ func (e EventLogFile) writeString(s string) error {
return nil
}
+func (e EventLogFile) getTail(options ReadOptions) (*tail.Tail, error) {
+ reopen := true
+ seek := tail.SeekInfo{Offset: 0, Whence: os.SEEK_END}
+ if options.FromStart || !options.Stream {
+ seek.Whence = 0
+ reopen = false
+ }
+ stream := options.Stream
+ return tail.TailFile(e.options.LogFilePath, tail.Config{ReOpen: reopen, Follow: stream, Location: &seek, Logger: tail.DiscardingLogger, Poll: true})
+}
+
// Reads from the log file
func (e EventLogFile) Read(ctx context.Context, options ReadOptions) error {
defer close(options.EventChannel)
Remote system tests can be fixed by adding the following line at the beginning of the new test:
|
@vrothberg Thanks, changes have been applied. |
Can you remove libpod/events/.events.go.swp and libpod/events/.logfile.go.swp from the commit? You can use git rm for that.
Just one minor nit.
@containers/podman-maintainers PTAL
I think we can merge as is. One question I have is whether containers.conf should set a non-zero default to enable log-rotation by default.
We can still change certain details later but before the 4.1 release.
@nicrowe00 feel free to mark the PR as ready for review and drop the |
LGTM. We need to get LogSize into containers.conf and default it to something reasonable, so everyone will have a limited eventlog if they use the file driver.
We already have it in the containers.conf but it defaults to zero. @containers/podman-maintainers PTanotherL
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: nicrowe00, rhatdan The full list of commands accepted by this bot can be found here. The pull request process is described here
LGTM
@edsantiago PTAL at the system tests.
Nice work! A few nits, one typo, and one bug.
Add new functions to logfile.go for rotating and truncating
the events log file once the log file and its contents
exceed the maximum size limit while keeping 50% of the
log file's content.
Also add tests to verify log rotation and truncation.
Signed-off-by: Niall Crowe <[email protected]>
Signed-off-by: Valentin Rothberg <[email protected]>
@edsantiago PTanotherL
I restarted the failing job - was a flake.
LGTM
/lgtm
Add a new function to logfile.go for rotating and deleting
the events log file once the log file and its contents
exceed the maximum size limit.
Also add tests to verify log rotation for different
scenarios.
Signed-off-by: Niall Crowe [email protected]