-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Close the fd that we're removing from the pollfds list #96
Conversation
We currently dup the file descriptor in add() by calling consume_fd(). We never close this dup'd fd. We should close it when we remove it from the pollfds list.
r? @pcwalton |
cc @antrik |
@antrik Does this look right to you? |
@pcwalton yes, as the FD is owned by the receiver set once it's added to it, I'm pretty sure it's correct to close it when it's dropped from the set... Another case to convince me we need a proper FD type to handle these things properly :-) (Would be nice if the fix came with test cases, though...) |
@antrik How would you test that? |
@nox well, most important would be a baseline test to check that channel close in a receiver set is still handled correctly. So -- unless there is already an existing test case that covers this -- we'd do something like dropping the send end of one channel in a set, and check that the select correctly receives the notification, and that "ordinary" receives on the other channels in the set still work. Ideally, it would be nice also to check that the FD is actually not leaked any more. (Mostly to prevent regressions in the future.) Admittedly, I don't know how to do this correctly off-hand... Might be non-trivial when tests are executed in parallel. |
Is there a linux tool that will tell you what fds got closed in program termination? Similar to how valgrind can find leaked memory at the end of execution? |
It looks like there's another problem introduced by this patch: |
Indeed. Here's what happens:
we iterate over the results vec:
To avoid this we should probably not close the fds until after they are removed from the HashMap. |
@jrmuizel took me quite a while to figure out what you are talking about... So, I didn't realise that the FD is returned along with the And it would be really nice if you could put that into a test case :-) |
👍 but FYI this exposes an intermittent that I hit in earlier revisions of #94
|
So I looked into this a bit more: and the whole idea of casting the FD as a plain integer to use as ID seems fundamentally flawed to me. I don't see a way to fix the leak without introducing a race while keeping this approach. The The (The only way I can see to handle this in a both robust and efficient fashion, is to handle lookup entirely within the backends -- with an interface that doesn't expose IDs, but rather takes and returns some arbitrary payload supplied and used by the caller.) |
(For the record, I didn't invent the approach I'm suggesting here as a better alternative: it's just a safe variant of what is known as "protected payloads" in the IPC mechanisms of many modern micro-kernels.) |
@antrik I'm confused—is this patch OK to merge or is it broken in some way? |
This patch exposes a potential race that will cause panics. It's probably better to keep leaking now until the race is fixed. |
I guess this can be closed in favour of #105 ? |
☔ The latest upstream changes (presumably #102) made this pull request unmergeable. Please resolve the merge conflicts. |
We currently dup the file descriptor in add() by calling consume_fd(). We never close this dup'd fd. We should close it when we remove it from the pollfds list.