Skip to content

Commit

Permalink
Bug#35594709: Memory leak on xcom_tcp_server_startup()
Browse files Browse the repository at this point in the history
Problem
================================

Group Replication ASAN run failing without any symptom of a
leak, but with shutdown issues:

worker[6] Shutdown report from
/dev/shm/mtr-3771884/var-gr-debug/6/log/mysqld.1.err after tests:
 group_replication.gr_flush_logs
group_replication.gr_delayed_initialization_thread_handler_error
group_replication.gr_sbr_verifications
group_replication.gr_server_uuid_matches_group_name_bootstrap
group_replication.gr_stop_async_on_stop_gr
group_replication.gr_certifier_message_same_member
group_replication.gr_ssl_mode_verify_identity_error_xcom

Analysis and Fix
================================

It ended up being a leak on gr_ssl_mode_verify_identity_error_xcom test:
Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7f1709fbe1c7 in operator new(unsigned long)
      ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99
    #1 0x7f16ea0df799 in xcom_tcp_server_startup(Xcom_network_provider*)
      (/export/home/tmp/BUG35594709/mysql-trunk/BIN-ASAN/plugin_output_directory
        /group_replication.so+0x65d799)
    #2 0x7f170751e2b2  (/lib/x86_64-linux-gnu/libstdc++.so.6+0xdc2b2)

This happens because we delegated incoming connections
cleanup to the external consumer in incoming_connection_task.
Since it calls incoming_connection() from
Network_provider_manager, in case of a concurrent stop,
a connection could be left orphan in the shared atomic
due to the lack of an Active Provider, thus creating a
memory leak.

The solution is to make this cleanup on
Network_provider_manager, on both stop_provider() and in
stop_all_providers() methods, thus ensuring that no
incoming connection leaks.

Change-Id: I2367c37608ad075dee63785e9f908af5e81374ca
  • Loading branch information
tiagoportelajorge committed Sep 26, 2023
1 parent f3f6b05 commit c357395
Show file tree
Hide file tree
Showing 2 changed files with 22 additions and 1 deletion.
Original file line number Diff line number Diff line change
Expand Up @@ -102,11 +102,14 @@ bool Network_provider_manager::start_network_provider(

bool Network_provider_manager::stop_all_network_providers() {
bool retval = false;

for (auto &&i : m_network_providers) {
// Logical Sum of all stop() operations. If any of the operations fail,
// it will report the whole operation as botched, but it will stop all
// providers
retval |= i.second->stop().first;

this->cleanup_incoming_connection(*(i.second));
}

set_incoming_connections_protocol(get_running_protocol());
Expand All @@ -118,7 +121,13 @@ bool Network_provider_manager::stop_network_provider(
enum_transport_protocol provider_key) {
auto net_provider = this->get_provider(provider_key);

return net_provider ? net_provider->stop().first : true;
auto cleanup_and_stop = [&]() {
this->cleanup_incoming_connection(*net_provider);

return net_provider->stop().first;
};

return net_provider ? cleanup_and_stop() : true;
}

const std::shared_ptr<Network_provider>
Expand Down Expand Up @@ -383,3 +392,13 @@ void Network_provider_manager::finalize_secure_connections_context() {
CLEANUP_NET_PARAMS_FIELD(tls_params.tls_version);
CLEANUP_NET_PARAMS_FIELD(tls_params.tls_ciphersuites);
}

void Network_provider_manager::cleanup_incoming_connection(
Network_provider &provider_ref) {
Network_connection *remaining_connection = provider_ref.get_new_connection();

if (remaining_connection != nullptr) {
provider_ref.close_connection(*remaining_connection);
delete remaining_connection;
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -374,6 +374,8 @@ class Network_provider_manager : public Network_provider_management_interface,
m_incoming_connections_protocol = value;
}

void cleanup_incoming_connection(Network_provider &provider_ref);

std::unordered_map<enum_transport_protocol, std::shared_ptr<Network_provider>,
std::hash<int>>
m_network_providers;
Expand Down

0 comments on commit c357395

Please sign in to comment.