-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
repl: History file locking and other improvements #7005
Conversation
There are several open issues regarding the repl's history file, and it turns out that these issues are all somewhat related. The code changes here started as basic file locking to prevent simultaneous writes on the history file. This led to the realization that the history file was rewritten in its entirety for each expression evaluated in the repl. This, of course, meant that interleaving writes between multiple processes wasn't really possible. To make that happen the file needs to be open for writing with `'a'` instead of `'w'`. As it turns out, changing this call to `fs.open()` to use `'a'` also should fix a problem with `.node_repl_history` being hidden on Windows systems (see #5261). Although, I do not have a Windows box to test against. Benefits: * Don't rewrite the entire history file for every evaluated expression. * Don't lose all but the last NODE_HISTORY_FILE_SIZE number of lines on repl startup. * No history file corruption from simultaneous writes by multiple processes. * History is interleaved for multiple processes in a single history. * In theory fixes a known issue on Windows. Ref: #5261 Ref: #1634 Ref: #3928 Ref: #4313 repl: Add file locking for repl history file. Adds a .node_repl_history.LCK file to prevent concurrent writes to the NODE_REPL_HISTORY file from different processes. repl: Adjust tests for reverse order of log file. Since the repl now stores history as most recent last, the tests have been adjusted to account for that.
lib/internal/repl.js
Outdated
@@ -127,7 +127,8 @@ function setupHistory(repl, historyPath, oldHistoryPath, ready) { | |||
} | |||
|
|||
if (data) { | |||
repl.history = data.split(/[\n\r]+/, repl.historySize); | |||
repl.history = data.trim().split(/[\n\r]+/) | |||
.reverse().slice(0, repl.historySize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this PR change the ordering in the history file? if so, it is probably semver-major
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Umm… I think the intent of repl.historySize
is to limit the file size, not the length of the history that’s being used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's provided by readline to limit the history in memory. But we use it for both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Fishrock123 Thanks for the clarification… It still seems to me (both from looking at the code and from experimenting) the file grows without bounds with this PR in its current state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addaleax @Fishrock123 historically, historySize
has limited both the file size, and the maximum number of lines loaded into memory at start up. But this almost seems like a side effect of readline
usage, and is not documented anywhere that I could find in the context of the REPL.
This PR changes it so that so that the number of lines stored in memory from the history file at start up is still limited, but the history file size itself is unbounded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR changes it so that so that the number of lines stored in memory from the history file at start up is still limited, but the history file size itself is unbounded.
Is there any particular advantage to limiting the number of lines stored in memory if they are all going to be loaded (and saved?) anyway?
cc also @chrisdickinson who was originally working on this |
@Fishrock123 yes it does. I mentioned it in the description, but yeah I probably should have made that information more prominent. |
@Fishrock123 wrt reversing the the order of the history file, I've read the comments on this PR, and the related (still-open) issue. I would say that given the other issues this PR addresses, there are now compelling reasons to change the order.
@chrisdickinson I'm aware that there have been a few stop/start efforts to make changes to the history file. I don't mean to step on anyone's toes here. Just saw that there were multiple issues, all somewhat related, that could be addressed with a single PR. |
Hmm, I'd like to hear comments from others, but it sounds like a good approach. I haven't really looked at the code yet. |
I think there’s something weird going on with not all lines being saved to the history file (maybe something timeout-related?) And this seems to break |
lib/internal/repl.js
Outdated
repl.emit('flushHistory'); | ||
} | ||
} | ||
unlockHistoryFile(fd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Figured out where my .exit
problem comes from – You need to call unlockHistoryFile
before flushHistory
is emitted, because the latter one is calling process.exit()
in lib/internal/bootstrap_node.js.
(EDIT: Which may also be a good reason to do everything inside of unlockHistoryFile
synchronously.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK - I will add a fix and a test for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm - I can only reproduce this when there is already a stray .node_repl_history.LCK
file at start up, and the history file is never written to. When the REPLServer
is closing it adds a listener for flushHistory
here which never gets emitted, and so the final Interface#close
is never called.
I can add a check for the existence of the .LCK
file at startup and then remove it if necessary. There are maybe a couple of ways this could happen.
- Notify the user. "Hey! There's a lock file, do you want me to delete it?". Probably the safest, but pretty intrusive and kind of looks like its underwear is showing.
- Set one or more brief timeouts to check/recheck for the existence before deleting. This doesn't eliminate the possibility of a race condition, but it definitely minimizes it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm - I can only reproduce this when there is already a stray
.node_repl_history.LCK
file at start up, and the history file is never written to.
Well, yeah, this is how I can reproduce it:
$ ./node
> .exit
$ ./node
> .exit
[hangs]
The first one here leaves the stray lock file lying around.
This is just an idea, but how about this approach when saving the history:
- Read the current history file into memory
- Remove it
- Create the history file again, with flags set to
wx
. If that fails, go back to 1. - Write the desired number of lines to the opened history file
That seems to be free of race conditions and does not require a lock file that could keep lying around if the process crashes (for whatever reason, it doesn’t have to be REPL-related)… what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And btw, this is how bash/readline does it:
- No locking whatsoever
- On exit, append the current session’s history to .bash_history
- After that, read the entire history file, truncate in memory to the desired number of lines, and then save again
This approach obviously racy but apparently that’s sufficiently irrelevant to just not care.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addaleax I've been playing with an implementation of your algorithm. It works, but I'm concerned about removing/recreating the history file. We would want to recreate it with the same file modes as the original. If I use fs.stat()
, can I use the resulting stats.mode
in fs.open()
when recreating the file? I don't think that would work, would it?
Edit: It does work on OSX. Not sure about Windows - specifically the hidden bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lance Valid point, and I don’t know if it works either (but I’d actually guess that it does)… if you have the code for that lying around somewhere, I can try and test it manually.
Hi! I wrote the original version of the persistent REPL history feature. Efficiency wasn't a pressing concern compared to (perceived) consistency, since the write is likely to complete under the threshold of being noticeable by a human being for most cases. My intent was to mimic bash history (to some degree) in the form of "last opened wins" — in this case, the last active REPL wins the history race. Interleaving history between multiple open REPLs is not ideal, and my goal was to avoid that by blowing away the history file on input (so at any given time, the history file would reflect the consistent history from the last repl to have input data.) Notably, #1634 was opened when the accidental interleaving would break history loading due to JSON loading concerns. Given the switch to line-delimited history, I'm unsure the benefit is worth the extra complexity. I worry a bit about processes accidentally leaving lockfiles around after unexpected crashes — it seems like it would break REPL history persistence in subtle ways. Opening the |
After some discussion, it appears that a .LCK file just introduces too many opportunities for error, and that the possibility of a simultaneous write between two processes is small enough to not matter. In addition, the log file ordering has been reverted back to most recent first. We can still open in append mode, but write from the 0th position in the file, which should help Windows users, encountering #5261 The change introduced here still eliminates the file corruption as was seen in #1634 by opening and closing the file on each write instead of holding on to a file descriptor throughout the session.
I've given some thought to this over the holiday weekend (here in the US). Append mode is definitely a win for Windows users. @addaleax your suggested algorithm should work, but does nothing for #5261 on Windows. I've just pushed a commit that eliminates the lock file, still opens in append mode, but writes from the 0th position. @chrisdickinson I see your point with regard to efficiency not being a pressing concern. My most recent push embraces this philosophy. :) |
@addaleax if you can test the delete/recreate behavior on windows that would be great. Please be sure to test with |
@lance So, apparently the |
@addaleax thanks for checking that for me. So, I would like some feedback on where to go next with this. Originally, this PR was meant to address 3 things.
Regarding file corruption, without a lock file of some kind (or other means of interprocess communication), we can't prevent simultaneous writes between multiple processes. The drawback with a lock file is that there are circumstances where a stale one could exist. There are likely algorithmic ways of dealing with this, but it's not clear to me if the problem is significant enough to bother with. However, this race condition should not be confused with #1634. I think this is occurring because each repl holds an open With the solution described here, it seems like it might be unwise to change the history file order. This was originally done so that the repl could simply append each new statement to the history file. Since that's no longer the case, I don't see a compelling reason to make a change like this. @addaleax @Fishrock123 @chrisdickinson your feedback on this analysis/approach is appreciated. |
I'm not sure I have a firm grasp of the issue anymore, are you saying we should change it (back to) append mode? I thought it already did append? |
@Fishrock123 the way it currently works, the REPL history is rewritten with each statement - there is no appending. This PR originally was intended to address multiple issues, in part by changing to append mode, and only writing the most recent statement from the REPL. However, this introduced a couple of issues.
So, given all of that, I've removed the line-by-line append approach. This means that #3928 will not be addressed with this PR. But there are still two other issues which I think could be resolved with this PR.
I think that #1634 can be partially resolved by opening in append mode, writing from 0th position in the file, and the closing the file. There is still the possibility that 2 repls could simultaneously perform this action and still result in corruption, but input from multiple repls doesn't get lost this way. And I believe that #5261 will be solved because the file is opened in append mode (but I cannot test this on Windows currently). If it makes sense to continue with this PR, I will clean it up/finalize changes. If not, no hard feelings. I'm learning more and more with each PR, so all is not lost. 👍 |
Is there anything wrong for going for a not-line-by-line approach that appends the current history session on shutdown? That shouldn't interleave? |
@Fishrock123 do you mean the file is only written once in append mode by a given repl instance, and that's on the What about history file length? In this scenario, it's potentially unbounded. To me, that doesn't seem like a serious issue, since it would take a lot of activity in a repl to actually make the file size problematic. And the simple solution for this is to I suppose this would still be semver-major given the behavior change. |
@lance What I'm suggesting would be similar to how bash handles it... I think. i'm not really sure how to do size limiting.... |
This commit modifies the history file persistence algorithm for the repl as so: * At repl startup, the history file is read into memory * Statements on the repl are maintained in memory until 'close' * On the repl's 'close' event, the history file is re-read, synchronously. This history is then merged with all new statements, and rewritten. This prevents statements between multiple repls from interleaving, but maintains history between them by grouping statements together from any given session. When the file is finally written, it is opened in append mode to (hopefully) address the Windows issue with hidden files. All history file actions on close are synchronous, and the `exit` event is now emitted on the next tick.
@Fishrock123 I think I've got something for you here with this latest commit. |
@Fishrock123 @addaleax @chrisdickinson any further thoughts on this? |
lib/internal/repl.js
Outdated
} | ||
const fd = fs.openSync(historyPath, 'a'); | ||
fs.writeSync(fd, history.join('\n'), 0, 'utf8'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You’ll want to append an extra newline, otherwise the first entry of the current session is added to the last entry of the previous one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addaleax that's not actually the case. Since the history file is read in its entirety and rewritten each time, a trailing newline is not necessary. The pattern is:
- On startup read the history file into memory and split on newline to create the in-memory history array.
- Keep track of the number of lines found in the original history file.
- On shutdown, extract the new lines added to the in-memory history.
- Re-read the history file and concatenate the file data with the new lines from our in-memory history.
- Write the entire history file with the data that was read from the file, plus the new lines that have been concatenated to it.
- Close the history file.
No newline required! Also, note that the final fs.openSync
used to write the history file is opened with the a
flag, to get around the Windows issues with hidden files. When I do the subsequent fs.write
, the write just begins at position 0
in the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to confirm, I ran 3 simultaneous REPLs and observed correct behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m going to try this again, but wouldn’t that mean that the a
flag should not be used here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
> rm ~/.node_repl_history
$ ./node
> 1
1
> 2
2
> 3
3
> .exit
$ ./node
> 4
4
> 5
5
> 6
6
> .exit
$ cat ~/.node_repl_history
.exit
3
2
1.exit
6
5
4
.exit
3
2
1
Eek. Also, I personally consider it good practice to have all text files terminate with newlines. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right on all points. I have been using ^D to exit the REPL in my tests
which must account for the difference. I will address this.
On Tue, Jul 26, 2016 at 6:14 PM Anna Henningsen [email protected]
wrote:
In lib/internal/repl.js
#7005 (comment):}
- const fd = fs.openSync(historyPath, 'a');
- fs.writeSync(fd, history.join('\n'), 0, 'utf8');
rm ~/.node_repl_history
$ ./node
11
22
33
.exit
$ ./node
44
55
66
.exit
$ cat ~/.node_repl_history.exit321.exit654.exit321Eek. Also, I personally consider it good practice to have all text files
terminate with newlines. :)—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/nodejs/node/pull/7005/files/a74ca07852baafc5944c142d4c103c2067bb12cc#r72366519,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAA-UKgrWsySmgECsaBZY2tlwkI1jV5Rks5qZrDzgaJpZM4In8hN
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addaleax this is strange. I am not seeing this. I'll grant you that a newline terminating the file is good practice, and will address that. But for the record, I am not seeing the behavior you show above. I don't know why.....
~/s/node git:repl-history-file ❯❯❯ git status ✭
On branch repl-history-file
nothing to commit, working directory clean
~/s/node git:repl-history-file ❯❯❯ git log -n1 ✭
commit f5302b44eeb48039d1e140b988e06b930fdeac5e
Author: Lance Ball <[email protected]>
Date: Tue Jul 26 17:36:59 2016 -0700
repl: History file improvements
Address concerns in tests by using common.mustCall.
~/s/node git:repl-history-file ❯❯❯ rm ~/.node_repl_history ✭
remove /Users/lanceball/.node_repl_history? y
~/s/node git:repl-history-file ❯❯❯ ./node ✭
> 1
1
> 2
2
> 3
3
> .exit
~/s/node git:repl-history-file ❯❯❯ ./node ✭
> 4
4
> 5
5
> 6
6
> .exit
~/s/node git:repl-history-file ❯❯❯ cat ~/.node_repl_history ✭
.exit
6
5
4
.exit
3
2
1%
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird – I still can’t reproduce the behaviour it has for you… without much deeper digging, I’d assume that the a
flag in this line is the reason for the doubling in my repl history file; is that wrong?
|
||
onClose(); | ||
}); | ||
repl.once('exit', onExit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto for common.mustCall
?
Looks pretty okay overall, sorry for the delay getting back to you. :) |
Address concerns in tests by using common.mustCall.
@addaleax thanks for following up. Nits and |
@addaleax newline appended. Strangely, I was not able to reproduce what you were seeing using those same steps (see outdated line note). |
Hrrrm - seeing failures on Windows on CI: https://ci.nodejs.org/job/node-test-pull-request/3442/ Specifically, this:
This occurrs consistently on the following builds:
Frustratingly, other Windows builds are fine. |
@lance If I’ve understood the |
Not sure what should be done here. I'm leaning towards closing this. It solves some of the issues noted, but just introduces others. Any opinion about that @addaleax or @Fishrock123? |
Sorry for not really following up here… I can probably find the time to take another look. But generally, it’s not going to land while CI is failing on Windows, and I don’t really know how to help with that. :( |
@addaleax no hard feelings! I just don't want the PR to linger in limbo. I'm OK to close it, if it seems that this is overall not the right approach. Also, given the major overhaul of the REPL that @princejwesley is proposing, this may become obsolete. However, if you think it's still worth pursuing this, I can keep at it. |
@lance I’m not sure. I’m fine with having this land if it fixes the actual issues that it touches, but as someone who hasn’t experienced them herself so far or even just has taken the time to get a really deep understanding of what they are, I just don’t feel qualified to make any decision here. ;) |
c133999
to
83c7a88
Compare
@lance ... still want to pursue this? |
@jasnell I think maybe this issue is dead at this point. Sorry for the delay in responding. I'm going to close this for now. If you think it should be reopened, I would be happy to continue looking at it. |
Checklist
Affected core subsystem(s)
repl
Description of change
There are several open issues regarding the repl's history file, and it
turns out that these issues are all somewhat related.
The code changes here started as basic file locking to prevent
simultaneous writes on the history file. This led to the realization
that the history file was rewritten in its entirety for each expression
evaluated in the repl. This, of course, meant that interleaving writes
between multiple processes wasn't really possible. To make that happen
the file needs to be open for writing with
'a'
instead of'w'
.As it turns out, changing this call to
fs.open()
to use'a'
alsoshould fix a problem with
.node_repl_history
being hidden on Windowssystems (see #5261). Although, I do
not have a Windows box to test against.
Benefits:
repl startup.
processes.
Ref: #5261
Ref: #1634
Ref: #3928
Ref: #4313
repl: Add file locking for repl history file.
Adds a .node_repl_history.LCK file to prevent concurrent writes to the
NODE_REPL_HISTORY file from different processes.
repl: Adjust tests for reverse order of log file.
Since the repl now stores history as most recent last, the tests have
been adjusted to account for that.
Remaining work
Existing tests have been adjusted to account for the fact that the repl now appends log file data. But no tests have been written yet which exercise the interleaving of repl history from two separate instances. I'm working on it, but it has been a little tricky. I wanted to get this PR submitted as a placeholder, and if the direction I'm taking is OK, then I'll finish up the test.
I don't think this requires a documentation change, because these are mostly bug fixes and improvements. But there is the fact that the history file is now reversed. It's not currently documented how the history file should look, and I think this is probably not necessary to document. But I wasn't sure, so I left the checkbox there and unchecked.