-
Notifications
You must be signed in to change notification settings - Fork 36.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rotate debug.log file #22350
Rotate debug.log file #22350
Conversation
This is an alternative to #21603. That PR's motivation also applies to this PR:
An advantage I see with this approach is that it addresses the concern with log messages being dropped. With this approach, log messages are never dropped; they just don't go back an infinite amount of time :). Another advantage that could prove very useful is that debug log categories can be enabled for an indefinite amount of time (#21603 sensibly does not limit debug logging). It's quite possible that a bug occurs rarely and at an unpredictable time. By the time the bug occurs, it's too late to enable the relevant debug log category. Also, it might be nice to not have to worry so much about adding more logging in general (without even having to condition it on a category). There may be useful stuff to log that we don't currently, because of this disk space concern. Some background information: Currently, The rotated log files are kept in a separate directory, to avoid cluttering up the data directory (see discussion below). This directory has the fixed name An easy way to manually test this without having to add artificial logging is to enable log rotation by adding these lines to
Then start
There's no harm in manually removing these rotated log files, even if Please comment or review: @jnewbery @practicalswift @dergoegge |
src/logging.cpp
Outdated
rotate /= "debug-" + now + ".log"; | ||
try { | ||
fs::rename(m_file_path, rotate); | ||
} catch (const fs::filesystem_error&) {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're ignoring any error from rename
, not sure that's right.
src/logging.cpp
Outdated
// reopen the log file, if requested | ||
if (m_reopen_file) { | ||
m_reopen_file = false; | ||
reopen_file(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If logrotate
is being used, sending a SIGHUP
will continue to cause debug.log
to be reopened (this PR doesn't change that behavior), but if the user wants to continue using that mechanism (logrotate(8)
), they should disable this PR's log rotation (by setting -debuglogrotatekeep=0
in the configuration). But if they forget (definitely need release notes on this), I think it will be okay, because either this PR's log rotation will always happen (since it waits only until debug.log
reaches 1 MB by default), or the logrotate(8)
will happen (if the file size threshold is less than 1 MB). It would be confusing if both were happening! Maybe more thought is needed here.
An alternative might be to just remove this behavior (SIGHUP
causing debug.log
to be reopened), but then logrotate(8)
will no longer work, and there may be existing users that are all set up to use logrotate(8)
.
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
91d796a
to
da13502
Compare
128c0ca
to
b526922
Compare
I think this is an interesting approach and i like the concept of log rotation but this is a bigger behavior change than #21603 and i am not sure if always dropping (old) logs should be the default behavior.
Continuously dropping logs seems like an aggressive change to me because why drop the logs when there is no attack? I haven't had problems with logfile sizes in normal node operation. (maybe thats just me)
I think that the current behavior is actually quite useful because in case of an issue (e.g.: a crash) we have the entire log of the nodes lifespan. This also seems to opens a similar attack vector as the global rate limiting did, an attacker (that knows of two bugs one of which is a disk filling bug) can execute an attack and then use the disk filling bug to rotate the logs out that had info about the attack. |
src/logging.cpp
Outdated
m_file_size += str_prefixed.size(); | ||
|
||
// If debug.log is large, rotate it (rename and recreate). | ||
if (m_rotate_keep > 0 && m_file_size > m_file_limit * 1e6) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if I want to rotate, but never delete?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if I want to rotate, but never delete?
-debuglogrotatekeep=9999999
or is that ugly? Should there be a separate -debuglogenable
boolean (default true; if set to false, behave as today), and then -debuglogrotatekeep=0
could mean "never delete". Would that be preferable? Or something else? Thanks for the comments!
I come from a storage system background, and dropping the oldest logs was standard practice. (This is why there's a
That's true, but for reasons I already stated, that's not practical. The most useful information is likely to be in the most recent logs. The user can always increase the amount of logging (in effect, to infinity if needed). Or disable this feature (work as today). Consider another option the user has: Run some kind of small agent (even just a small shell script) that watches for new rotated debug log files, and copies them off to some other place with tons of storage. Or compresses them. Or filters out interesting stuff from them and deletes them.
Well, this is unlikely-squared. This isn't something I'm worried about, but maybe I'm wrong. Thanks for your comments! |
Added commit d885f0b to address @luke-jr's review comments (aside from "What if I want to rotate, but never delete" because I'm not sure which is the best approach). There are no tests or release notes, because still looking for approach or concept ACK. |
d885f0b
to
a29d978
Compare
Force-pushed to fix CI failure (missing #include directive) |
a29d978
to
dce051a
Compare
Force-pushed to fix CI failures, also added a functional test. |
d0e3d1f
to
3c16201
Compare
src/logging.h
Outdated
@@ -24,6 +25,8 @@ static const bool DEFAULT_LOGTIMESTAMPS = true; | |||
static const bool DEFAULT_LOGTHREADNAMES = false; | |||
static const bool DEFAULT_LOGSOURCELOCATIONS = false; | |||
extern const char * const DEFAULT_DEBUGLOGFILE; | |||
static const int DEFAULT_DEBUGLOGROTATEKEEP = 10; | |||
static const int DEFAULT_DEBUGLOGMB = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To match current behavior, wouldn't you want DEFAULT_DEBUGLOGMB
to be 10 and DEFAULT_LOGROTATEKEEP
to be 1?
Nevermind, that isn't current behavior either. Just curious about the constants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @narula, that's still a good question. The idea of those defaults is that between 10 MB and 11 MB of log messages would be retained (just after a log rotation, 10 MB because there'd be 10 rotated files at 1 MB each plus an empty debug.log
; just before log rotation, 10 MB for the same reason plus a 1 MB debug.log
. This matches the current disk usage if bitcoind
is restarted often (if not, then the debug.log
size can increase without limit).
If the defaults were DEFAULT_DEBUGLOGMB=10
and DEFAULT_LOGROTATEKEEP=1
then the logging storage would vary from 10 MB (just after rotation) to 20 MB (just before rotation).
One could argue that the defaults could be quite a bit larger because the current default (10 MB minimum) was chosen when disks were smaller than they are typically today. But there could be that one user whose disk is almost full, and upgrading to a release with this PR would soon run their disk out to space if the defaults were larger. (You might say that a node's disk space requirement grows indefinitely anyway as more blocks are added to the blockchain, but that's not necessarily so because nodes can be configured with block pruning enabled.)
ea728a3
to
7db585f
Compare
There doesn't seem to be much support for this PR, and at least part of the problem may be that it changes default behavior, and that could surprise some users. So I force-pushed a small change so that by default, there is no functional change (preserve existing behavior). If you want to enable log rotation, add lines like these to your
(This will retain 10 rotated files, each around 1 MB.) Starting with I think this version is much safer; if log rotation proves to be popular, we can consider making it default behavior later. |
Concept ACK I strongly agree with the principle above:
I also agree with the decision to make this optional for users (i.e it will not change default behavior). Having this as a config option would make me feel more comfortable running a node with Also, not really important, but a personal preference: I like having logs split into timestamped files as it makes it much easier for me to search logs during the period where I believe the issue occurred as opposed to opening and searching a very large file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while testing, I noticed there were silent failures when I specified a name other than debug.log
with log rotation requested and silent failures if I specified a debugloglimit
of < 1 (but greater than zero). I see there is code to handle the filename error, but it doesn't appear to be working as intended.
I agree with the concept of log rotation, but I'm not crazy about the approach of restricting the user to a hard coded name for the log file, especially since before this they could specify an arbitrary filename. Seems worth the effort to enable log rotation even with an arbitrary filename from the user, imo.
doc/release-notes-22350.md
Outdated
become (in MB) before it is rotated (default is 10). If you enable | ||
log file rotation and specify an | ||
alternate debug log file name using `-debuglogfile=path`, the last | ||
component of the path must be `debug.log`. (#22350) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
previously when using -debuglogfile=path
, i could specify any name for the file. now, if anything but debug.log
is specified for the name, bitcoind
fails to start without error (confirmed this while testing).
imo, rotation should work regardless of the filename, but at the very least there should be an error if i don't specify debug.log
with log rotation enabled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for your comments; these are very useful.
... rotation should work regardless of the filename ...
I like your idea. After giving this more thought, perhaps the best approach is to let -debuglogfile=path
to work as usual, but create a directory with a fixed name like rotated-logs
(or rotated-debug-logs
) and put all the rotated log files within that directory. (The active debug log file, debug.log
or as specified by -debuglogfile
, would be as it is today, no change.) Because the rotated log files are in a directory with a specific purpose, they can have simple names like 2021-06-27T15:10:55Z.log
. They wouldn't have to be named according to the -debuglogfile
argument.
This would have an added benefit: As I've been running this code to test it, my data directory now looks "messy" or polluted, with all these rotated log files that usually aren't very important -- why should they be so visible and prominent?
$ ls ~/.bitcoin
banlist.dat debug-2021-06-27T21:19:42Z.log fee_estimates.dat signet
banlist.json debug-2021-07-09T00:38:43Z.log indexes testnet3
bitcoin.conf debug-2021-07-14T16:32:30Z.log mempool.dat wallets
bitcoind.pid debug-2021-07-17T14:01:27Z.log peers.dat wallets-empty
blocks debug-2021-08-10T20:03:04Z.log regtest
chainstate debug.log settings.json
$
(And I've got only 5 rotated log files; there could be many more.) A small disadvantage is that it would be slightly more difficult to concatenate (or grep) all of the log files in time-order:
$ cat rotated-logs/* debug.log
(instead of just cat debug*.log
) but that's not bad.
So unless there are objections, I'll make this change within the next few days.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
regarding many log files creating clutter in the root directory, i agree having a logs folder definitely makes sense. personally, i would prefer a logs/
folder containing all the logs as this makes it clear to the user there is nothing special about the debug.log
file vs what's in the logs directory. it also keeps things simple for concatenating and grepping.
since the debuglogfile custom name is known, wouldn't it be possible to attempt to split the filename into a name and extension (<name>-<timestamp>.<ext>
), and then handle the special case where the user doesn't supply an extension (i.e logs
, which would become logs
, logs-<timestamp>
)?
also, feel free to wait for more feedback from others before making changes :)
src/logging.cpp
Outdated
return strprintf( | ||
"The debug log file %s is not named \"debug.log\"; " | ||
"this prevents log rotation. Please disable log rotation by " | ||
"specifying -debuglogrotatekeep=0 or specify a debug log file " | ||
"path that ends in \"/debug.log\" using -debuglogfile.", | ||
m_file_path.string()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this output isn't showing (not sure why). i ran:
./src/bitcoind -daemon -debuglogrotatekeep=3 -debugloglimit=1 -debuglogfile=/home/josibake/bitcoin/j.log
and got "Bitcoin Core Starting," but it fails to start without any errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's because you've started it with -daemon
. It hasn't initialized the logging yet, so it has nowhere to write the error message. If you start without -daemon
you should see the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense. i do think its important to throw an error when -daemon
is used, else the user is likely to assume there node has started fine after seeing "Bitcoin Core Starting." for example, if i try to start my node with:
./src/bitcoind -datadir=foo -daemon
i will get:
Error: Specified data directory "foo" does not exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i do think its important to throw an error when -daemon is used ...
I looked into this more, and it's not trivial to fix. The problem is that bitcoind has already "daemonized" itself by the time it gets around to initializing the logging, which is later than when it tries to open the data directory. This problem happens without the current PR:
$ src/bitcoind -daemon -debuglogfile=some/non-existent/path
Bitcoin Core starting
$
and there's no indication that anything went wrong (the exit status is even 0, success). So I think this can be fixed in a separate PR.
src/logging.cpp
Outdated
"The debug log file %s size limit in MB must be greater than zero.", | ||
m_file_path.string()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i also got a silent failure trying to specify a logsize value < 1. i ran with -debugloglimit=0.5
, got no errors, but the node failed to start. i also ran with -debugloglimit=1.5
without issue so i dont think its the decimal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's because it's parsing the argument as an integer, and when it gets to the decimal point, it stops parsing, interpreting the argument as zero. Specifying fractions of a megabyte is probably unnecessary, but maybe it should print a good error message. I'll see what I can do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i went back and confirmed, even with 1.5 it is still rotating in 1mb chunks. might be worth adding something to the option help that debugloglimit is in MB and must be specified as an integer
Strong Concept ACK I think one of the main reasons why restricting unlimited growth of the log file was never seen as particularly problematic in Bitcoin Core is the fact that running bitcoind with default parameters (that is, w/o pruning) already consumes (potentially) infinite disk space over time. Having a growth rate of > 1 GB per week of the block chain on mainnet, one would hardly be worried about the additional space consumption of relatively small log messages. That said, I still fully agree that we should support log-rotating for the following reasons:
I also think it's better to keep the default behaviour by now as it is. Maybe it's worth it mentioning in the docs that using this new feature is recommended for pruned node users? Feel free to tag me if this PR is out of draft, will be happy to review. |
See https://en.wikipedia.org/wiki/ISO_8601#Times This is needed because log file rotation filenames are constructed using this "basic" format, because it has no ":" (colon) characters, which Windows does not allow.
No functional change.
-debuglogrotatekeep: number of rotated log files to keep -debugloglimit: debug.log size (in MB) before it is rotated The default -debuglogrotatekeep value is 0, which disables rotation.
7db585f
to
2f5e593
Compare
Thanks for the concept review, @theStack, I just force-pushed a new version that's a lot better, I think, and implements the suggestions here: #22350 (comment) |
b2d2f47
to
f5de5b0
Compare
No functional change by default. Add support for log file rotation. If enabled, then when the debug.log file reaches a certain size (default, 10MB), rename it to a file within a separate directory, debug-rotate, named according to the current date and time, and restart (truncate) debug.log. This keeps debug.log from growing unbounded. Retain a limited number of rotated files.
f5de5b0
to
13511ce
Compare
Force-pushed a small change to ensure that rotated debug log files don't have partial lines. Each line is guaranteed to be a full line beginning with a timestamp. This is an improvement over master, because with master (or with this PR without log rotation enabled), if the debug log file is larger than 11 MB when |
Concept ~0 on this. We always used to be against integrating log rotation because it can be handled by external tools (sure, on some operating systems this is more difficult, but I'm sure there are solutions for WIndows too). We should avoid the case where |
It seems to me bitcoind should manage its own log file appropriately. In contrast,
are already being used elsewhere; we're not introducing any new dependencies on filesystem calls. So I don't know that maintenance would be more difficult. |
i think it's also worth mentioning this isn't only about log management and is (more importantly, imo) a solution to mitigate a disk fill attack
as mentioned in the comments, a log rotation solution (as opposed to the rate limiting solution proposed in #21603 ) doesn't result in partial logs, which i think is important. @LarryRuane perhaps it would help to update the description of the PR to be more clear about how this is a mitigation for disk fill attacks (without the reviewer needing context from #21603 ) |
The point is less about what filesystem calls might be used, but the higher level observation that Adding and having to maintain large amounts, of sometimes complex code, to do non-bitcoin related things, i.e log management, settings parsing, cli's, ELF binary parsing, etc, adds a lot of (development) noise, maintenance, and back-compat overhead. Ideally this repository is moving in the opposite direction, where instead of adding more and more code, to do more and more things, we're actually removing more and more code, to the point where we end up with just the code that is bitcoin, and more (useful) libconsensus type things. That effort is somewhat underway, i.e with the separation of the GUI development, de-globalization of the chainstate etc. |
Thanks, @fanquake and @laanwj! Very good points.
You've given me an idea, tell me what you think of this (also @jnewbery and @theStack) as an alternative to this PR: I just did a proof-of-concept experiment on Linux (I don't know if this would work on Windows):
In another window
You'll see the Proposal: What do you think about adding a Python script to This would reproduce the current functionality (except for shrinking But this python script could be enhanced to do log rotation -- and in the future, maybe other stuff like compression, filtering, sending some kind of alert or notification if certain log messages appear, who knows -- all without any changes to |
🐙 This pull request conflicts with the target branch and needs rebase. Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft". |
Closing due to insufficient support; I still think it would be valuable to implement what I described above #22350 (comment) and that would be much less intrusive (only adds to |
EDIT: Change the default behavior to be that log rotation is not enabled.
This PR will mitigate a disk-fill DoS attack (although such an attack is not known to exist today).
If log rotation is enabled, then when the
debug.log
file reaches a certain size (default, 10 MB), rename it to a similar name that includes the current time, and reset (truncate)debug.log
. Keep a limited number of rotated files. This provides a sliding window of debug messages so that disk space isn't consumed without limit.This is similar to Unix
logrotate(8)
but requires no external tools and complex configuration, and works on Windows (which doesn't havelogrotate
).