-
Notifications
You must be signed in to change notification settings - Fork 5
Wait for process to be killed before temporary directory gets removed #65
Conversation
08a742b
to
86bce41
Compare
86bce41
to
d3431ca
Compare
}); | ||
|
||
// remove the temporary directory | ||
return fs.remove(this.dir); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.remove()
returns a promise, which was previously ignored... the other issues were that:
- tree-kill didn't kill the processes very well
- the directory was removed before the process got killed (caused problems with Ruby hooks)
} | ||
pids.forEach((pid) => { | ||
try { | ||
process.kill(pid, 'SIGKILL'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SIGKILL doesn't give the subprocess anytime for their own cleanup / time for gracefully shutting down.
Would it be better to do SIGTERM and then SIGKILL after some timeout (5-10 seconds?) That gives the processes some time to do their own cleanup before force killed.
Maybe in this case if a process doesn't gracefully exit from SIGTERM that should be a test failure too (instead of having to SIGKILL it).
If I understand correctly, these processes are child processes. If you use child_process.spawn
to create them, then we can observe when they exit and use the APIs on them to observe when exitted. For example (pseudo code and probably maybe python in areas :)) -- architecture can be changed a lot but this is to show the rough idea:
const assert = require('assert');
const processes = [];
let timeoutTimer;
let completion;
function spawn() {
const process = ...;
processes.push(process);
process.on('exit', () => {
process.remove(process)
if (processes.length === 0) {
clearTimeout(timeoutTimer);
if (completion) { completion() }
}
});
}
function shutdown(done) {
if (timeoutTimer) {
assert.fail('shutting down while already shutting down');
}
completion = done;
const timeoutInterval = 2000;
timeoutTimer = setTimeout(() => {
// bad processes didn't exit in time
processes.forEach(process => process.kill('SIGKILL'));
}, timeoutInterval);
processes.forEach(process => process.kill());
}
Can probably come up with something better if you can use promise.. Each process lifecycle can be wrapped in promise, kill can return collection of promises and then if timeout happens first they get force-killed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to keep this as minimal as possible - it is an after-test tear down cleanup. I know SIGKILL is brutal, but I'd say tests tear down is allowed to use brutal means to cleanup after tests on a few lines of code. There are three situations atm when killing can happen:
- Dredd malfunction. Dredd isn't able to kill it's children and they get stuck as running processes. I wonder whether this would even track the children down in case Dredd, the parent process, is already finished 🤔 But it's the best I have right now... in any way, that would be malfunction, and those can cause other malfunctions in chain reaction (in this case malfunction of tests cleanup), that's kind of expected.
- Dredd itself gets stuck in limbo because of some unexpected malfunction. This should definitely kill it and all its children with no mercy and I think that's fine.
- There are some tests running the hooks handler as a standalone process, sending TCP messages its way and inspecting what it returns. Then they rely on this cleanup to tear down the process (in user environment one would Ctrl+C to
SIGINT
it).
The test suite template could add to the contract that the hooks handler process should be able to handle SIGTERM, but I'd see that as a separate feature, an addition to what this test suite imposes on hooks handlers implementation. I filed #67 to discuss that. Also, now it could make some of the hooks handlers not passing the test suite.
However, I think we could use your comment (and perhaps the additional code you sent me over Slack?) as a suggestion to improve how Dredd's subprocess management works. Currently it lives here: https://github.com/apiaryio/dredd/blob/72220703a644234436c53de42849ab7fa045653a/lib/childProcess.js Sometimes it feels flaky in the CI and it still has bugs, even though it's much more reliable than it was previously (before I revamped it and put into childProcess.js
). Because there, in Dredd, that's IMHO the place where care should be taken and where exactly what you write happens - nicely trying SIGTERM first, waiting, then killing with SIGKILL, etc.
🙏 |
🎉 This PR is included in version 1.1.5 🎉 The release is available on: Your semantic-release bot 📦🚀 |
No description provided.