Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a dedicated thread for tcp listener #4523

Merged

Conversation

pwojcikdev
Copy link
Contributor

While I'm not aware of any problems with previous implementation, the async way of implementing tcp_listener was generally harder to understand and to evolve over time. It was especially problematic when making changes to other aspects of networking code. Given that our connections are long-lived and relatively infrequent, there is very little need for tcp acceptor code to be focused on performance and instead we should prioritize readability and maintainability. This modifies it to instead run on a single, dedicated thread and use blocking IO operations.

@pwojcikdev pwojcikdev changed the title Use a dedicated thread for tcp listener component Use a dedicated thread for tcp listener Mar 25, 2024
clemahieu
clemahieu previously approved these changes Mar 25, 2024
Copy link
Contributor

@clemahieu clemahieu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are minor points to consider/discard/accept.

Overall I think moving to a single-threaded tcp_listener is a major benefit.

nano/test_common/testutil.hpp Show resolved Hide resolved
nano/node/transport/tcp_listener.hpp Show resolved Hide resolved
nano/node/transport/socket.cpp Outdated Show resolved Hide resolved
nano/node/transport/tcp_listener.cpp Show resolved Hide resolved
nano/node/transport/tcp_listener.cpp Outdated Show resolved Hide resolved
}

void nano::transport::tcp_listener::on_connection (std::function<bool (std::shared_ptr<nano::transport::socket> const &, boost::system::error_code const &)> callback_a)
void nano::transport::tcp_listener::wait_available_slots ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we close excess incoming connections or let the OS do it? If we don't service connections the OS will queue them up to max_listener https://beta.boost.org/doc/libs/develop/doc/html/boost_asio/reference/basic_socket_acceptor/listen.html
@dsiganos?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we test this behaviour with more sockets queued than the max capacity @gr0vity-dev?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see a problem with this behaviour. This way effectively leaves the queueing to the operating system.
Another way would be to accept a socket and, if the limit is passed, close it.

nano/node/transport/tcp_listener.cpp Outdated Show resolved Hide resolved
nano/core_test/socket.cpp Show resolved Hide resolved
nano/node/transport/tcp_listener.cpp Show resolved Hide resolved
@pwojcikdev
Copy link
Contributor Author

For future reference: there is a problem with using blocking accept () on Linux. Closing acceptor doesn't guarantee that the blocking call to accept will return, as described in their documentation:

This function is used to close the acceptor. Any asynchronous accept operations will be cancelled immediately.

The blessed ASIO way of doing things seems to always be using async functions. Here I decided to go with use_future completion handler to make code as similar as possible to the threaded version and avoid callback spaghetti.

@pwojcikdev pwojcikdev force-pushed the networking-fixes/tcp-listener-2 branch from ef48944 to 75fce7e Compare April 1, 2024 10:24
@pwojcikdev pwojcikdev requested a review from clemahieu April 1, 2024 16:50
clemahieu
clemahieu previously approved these changes Apr 2, 2024
@dsiganos
Copy link
Contributor

dsiganos commented Apr 8, 2024

One thing that confused me is the name tcp_listerner although I don't know how to name it better (tcp_server_container maybe?).
I expected it to do just tcp listening but it does a lot more.
It also creates and manages the tcp_server objects and is responsible for keeping the tcp_server objects alive.

nano/core_test/node.cpp Outdated Show resolved Hide resolved
}

void nano::transport::tcp_listener::on_connection (std::function<bool (std::shared_ptr<nano::transport::socket> const &, boost::system::error_code const &)> callback_a)
void nano::transport::tcp_listener::wait_available_slots ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see a problem with this behaviour. This way effectively leaves the queueing to the operating system.
Another way would be to accept a socket and, if the limit is passed, close it.

nano/node/transport/transport.cpp Outdated Show resolved Hide resolved
nano/node/transport/tcp_listener.cpp Outdated Show resolved Hide resolved
nano/node/transport/tcp_listener.cpp Outdated Show resolved Hide resolved
nano/node/transport/tcp_listener.cpp Outdated Show resolved Hide resolved
nano/node/transport/tcp_listener.cpp Show resolved Hide resolved
@dsiganos
Copy link
Contributor

dsiganos commented Apr 8, 2024

Also, the node does not stop when pressing Control-C because the the accept does not return probably because the io ctx is stopped by the signal handling code.

@dsiganos dsiganos closed this Apr 8, 2024
@dsiganos dsiganos reopened this Apr 8, 2024
# Conflicts:
#	nano/node/transport/socket.cpp
#	nano/node/transport/socket.hpp
# Conflicts:
#	nano/lib/thread_roles.cpp
#	nano/lib/thread_roles.hpp
#	nano/test_common/testutil.hpp
@pwojcikdev
Copy link
Contributor Author

The issue with interrupt signal should now be fixed.

dsiganos
dsiganos previously approved these changes Apr 17, 2024
@pwojcikdev
Copy link
Contributor Author

This seems to have triggered a new TSAN warning, I'm not sure whether it's a false positive or not. I'm going to move to coroutine-based implementation with a strand to avoid these problems altogether. The changes necessary will be minimal.

@pwojcikdev pwojcikdev force-pushed the networking-fixes/tcp-listener-2 branch 4 times, most recently from 719588e to 780a2d9 Compare April 21, 2024 15:07
@pwojcikdev pwojcikdev force-pushed the networking-fixes/tcp-listener-2 branch from 780a2d9 to e93f243 Compare April 22, 2024 08:32
@pwojcikdev
Copy link
Contributor Author

I'm splitting this refactor and coroutines into two separate PRs. The currently approved code provides a solid foundation to build on, but introduces a new TSAN warning that will be fixed by a subsequent coroutine PR.

@pwojcikdev pwojcikdev merged commit 95b62fe into nanocurrency:develop Apr 22, 2024
23 of 26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants