Broker fails to close used connections and they remain in CLOSE_WAIT #2166

Closed
kzangeli opened this issue May 20, 2016 · 9 comments

@kzangeli
Member

It seems, as discovered by a client, that the broker sometimes fails to close used connections. Eventually all fds are in use and the broker stops working.

The broker uses libmicrohttpd (MHD) to respond to requests and libcurl for notifications and forwarding of messages, and it seems libcurl might be responsible for the detected problem.
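
To check the symptom described above on a Linux host, the number of sockets stuck in CLOSE_WAIT can be counted. The following is a minimal sketch, not something taken from the broker itself: the function name `count_close_wait` is hypothetical, and it assumes the standard `/proc/net/tcp` layout, where the fourth column is the socket state and `08` means CLOSE_WAIT. The count is system-wide, not limited to the broker process.

```python
import os

# State code for CLOSE_WAIT in the "st" column of /proc/net/tcp (Linux).
CLOSE_WAIT = "08"


def count_close_wait(proc_net_tcp_text):
    """Count sockets in CLOSE_WAIT in the given /proc/net/tcp contents."""
    count = 0
    # The first line is a header row, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        # Columns: sl, local_address, rem_address, st, ...
        if len(fields) > 3 and fields[3] == CLOSE_WAIT:
            count += 1
    return count


if __name__ == "__main__":
    path = "/proc/net/tcp"  # IPv4 only; /proc/net/tcp6 has the same layout
    if os.path.exists(path):
        with open(path) as f:
            print("CLOSE_WAIT sockets:", count_close_wait(f.read()))
```

A count that keeps growing over time would be consistent with connections not being closed, but it does not say which library holds them.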

@kzangeli kzangeli self-assigned this May 20, 2016

@fgalan fgalan added this to the 1.2.0 milestone May 20, 2016
@fgalan
Member

fgalan commented May 23, 2016

I understand that the problem was discovered in Orion 0.26.1. Reviewing the changelog from Orion 0.26.1 to Orion 1.1 (the current version), note the following in the Orion 0.28.0 release notes:

Fix: libmicrohttpd 0.9.48 included in contextBroker as static lib (previous Orion versions used 0.9.22 as dynamic library) (Issue #1675)

If the cause of the problem points to libmicrohttpd, we should check whether 0.9.48 has solved it.

@crbrox
Member

crbrox commented May 27, 2016

In the end, this seems to be expected behavior when the file descriptor limit for the process is not high enough. With the notification mode set to threadpool, each thread can hold up to 5 connections (libcurl pool). So, for outgoing requests, 5 * (number of threads in the pool) file descriptors are needed to avoid exhausting the available fds.

The total fd limit should cover the number of simultaneous incoming requests in MHD, plus an allowance (X) for logs, listening sockets and file descriptors used by libraries:
maxCon + 5 * no of threads + X

Alternatively, the number of threads in the pool can be lowered accordingly.

As Fermin has noted, the number of connections to mongoDB held by the driver should also be included in the count:
maxCon + 5 * no of threads + size of mongo driver pool + X
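
The formula above can be turned into a quick check against the fd limit of the current process. This is only a sketch under stated assumptions: the function names and the numbers in the usage part are hypothetical and chosen for illustration, X is whatever allowance is estimated for logs, listening sockets and libraries, and the limit read here is that of the Python process running the script, not of the broker.

```python
import resource


def required_fds(max_con, notif_threads, mongo_pool_size, extra,
                 conns_per_thread=5):
    """maxCon + 5 * no of threads + size of mongo driver pool + X."""
    return max_con + conns_per_thread * notif_threads + mongo_pool_size + extra


def fd_headroom(needed):
    """Soft fd limit minus needed fds, or None if the limit is unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return None
    return soft - needed


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    needed = required_fds(max_con=1020, notif_threads=8,
                          mongo_pool_size=10, extra=32)
    print("fds needed:", needed)
    print("headroom under current soft limit:", fd_headroom(needed))
```

A negative headroom suggests either raising the limit (for example with `ulimit -n` before starting the process) or lowering the number of threads in the pool.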

@fgalan fgalan added doc and removed bug P8 labels May 27, 2016
@fgalan
Member

fgalan commented May 27, 2016

After analyzing this, it seems it isn't a problem in the code, but something that needs to be documented. Thus, removing the "bug" label and adding the "doc" label.

@fgalan
Member

fgalan commented May 30, 2016

Doc modifications in PR #2214

@fgalan fgalan closed this as completed May 30, 2016
@mrutid
Member

mrutid commented Nov 26, 2016

It's happening...

@mrutid mrutid reopened this Nov 26, 2016
@fgalan fgalan modified the milestones: 1.7.0, 1.2.0 Nov 28, 2016
@fgalan fgalan added bug P8 and removed doc labels Nov 28, 2016
@fgalan
Member

fgalan commented Nov 28, 2016

Recovering this issue for the 1.7.0 milestone.

@fgalan
Member

fgalan commented Dec 5, 2016

After researching this, CLOSE_WAIT connections are not the root of the problem: the root of the problem is fd exhaustion when select() was used in MHD (which is the scope of issue #2724).

However, the documentation should be improved to explain this better, so this issue will remain open.
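
One well-known way select() runs into trouble with many fds is that it can only watch descriptor numbers below FD_SETSIZE (typically 1024 on Linux). Whether that is exactly what happened inside MHD is an assumption here; the details are in issue #2724. The sketch below only illustrates the general limitation using Python's `select` module, with a hypothetical helper name `selectable`.

```python
import select


def selectable(fd):
    """Return True if select() accepts this fd number, False if out of range."""
    try:
        # Timeout 0: poll once and return immediately.
        select.select([fd], [], [], 0)
    except ValueError:
        # CPython raises ValueError when the fd number does not fit in an fd_set.
        return False
    except OSError:
        # The number is in range, but it is not an open descriptor.
        return True
    return True


if __name__ == "__main__":
    for fd in (0, 1023, 1024, 5000):
        print(fd, "usable with select():", selectable(fd))
```

So even when the process fd limit is raised well above 1024, a select()-based loop cannot use the extra descriptors, which is why poll() or epoll() is usually preferred for high connection counts.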

@fgalan
Member

fgalan commented Dec 7, 2016

Documentation completed in PR #2755. Closing.

@fgalan fgalan closed this as completed Dec 7, 2016