Simple concurrent_unordered_set #1308
Conversation
Perhaps try the Intel TBB
Or perhaps the Microsoft implementation?
Back in my university days I was in a class called "System programming concepts and techniques" or something like that. One day, the professor was talking about mutexes and gave us a pseudocode example for a simple producer/consumer queue. One of the students pointed out an error in the code. The professor admitted (with obvious frustration) that he was giving out slightly different versions of that example code every year, because every year someone would find a new bug in his logic. What I'm trying to say with this is: although I haven't even checked your code, please use a commonly available solution instead of rolling your own. :-)
I agree (on the roll-your-own part), but was surprised to find not many baked versions of a concurrent unordered set. The Intel TBB was the only one I found. I haven't looked over the MS solution. Does anyone have a problem with the Apache license? I'm no lawyer. The begin() and end() stuff is the most troublesome. Things will possibly have to stay locked until the end of the loop, unless we examine each loop individually, looking for optimizations. Often the implementation is "make a lambda as small as possible, acquire the lock with lock_guard, call std::for_each". BTW: concurrency/parallel programming was a good subject at CppCon 2017. Search "Is Parallel Programming Still Hard" on YouTube.
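That lock_guard + std::for_each pattern can be sketched as a thin wrapper. This is a hypothetical illustration of the idiom described above, not the code from this PR; all names are made up:

```cpp
#include <algorithm>
#include <cstddef>
#include <mutex>
#include <unordered_set>

// Hypothetical sketch of the "small lambda + lock_guard + std::for_each"
// pattern; not the actual concurrent_unordered_set from this PR.
template <typename T>
class locked_unordered_set {
public:
    void insert(const T& v) {
        std::lock_guard<std::mutex> g(m_);
        s_.insert(v);
    }

    bool erase(const T& v) {
        std::lock_guard<std::mutex> g(m_);
        return s_.erase(v) > 0;
    }

    // Never hands out begin()/end(); instead the whole traversal runs
    // under the lock, so callers should keep the lambda small.
    template <typename F>
    void for_each(F f) const {
        std::lock_guard<std::mutex> g(m_);
        std::for_each(s_.begin(), s_.end(), f);
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> g(m_);
        return s_.size();
    }

private:
    mutable std::mutex m_;
    std::unordered_set<T> s_;
};
```

The price is that the mutex is held for the entire traversal, which is exactly the contention concern raised above.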
Update: I have begun to include Intel TBB as part of the build process (different branch in case we decide against it). The default (and highly recommended by Intel) is to build shared libraries due to scheduler issues. There is an undocumented (on purpose) workaround to build and link statically. My question to all is: should we use it? Yet another update: I added TBB as a submodule, using static linking, and made a PR: #1315
Let's imagine simple code snippets executed from different threads.
So, I see the problem here.
Yes, I have a test program where I torture an unordered_set with 3 threads. Thread A adds an element, and thread D deletes the first element if one exists. If you put locks in place, the program will continue to run without problems. The "hard" part of fixing this is the iterators. Something intelligent must be done with them.
All of those options require examining each iteration loop to determine its purpose. Then a decision can be made to use option A, B, or C. C is an interesting option. You would think it would be the best choice (performant, little contention), but depending on the implementation and use, it could actually be a bad one (increase in complexity, decrease in performance).
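Whichever option is chosen, the invariant is that an iterator must never be used after the lock under which it was obtained is released. A minimal sketch of the add/delete torture scenario with that rule applied (thread roles and iteration counts are illustrative, not the actual test program):

```cpp
#include <mutex>
#include <thread>
#include <unordered_set>

std::mutex set_mutex;
std::unordered_set<int> the_set;

// Thread A: keeps inserting elements.
void adder(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::lock_guard<std::mutex> guard(set_mutex);
        the_set.insert(i);
    }
}

// Thread D: deletes the first element if one exists. The iterator from
// begin() is obtained and consumed while the lock is still held.
void deleter(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::lock_guard<std::mutex> guard(set_mutex);
        if (!the_set.empty())
            the_set.erase(the_set.begin());
    }
}
```

With the guards removed, the same two loops can crash or corrupt the set, because an insert can rehash (invalidating all iterators) while the other thread is dereferencing begin().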
https://github.com/bitshares/bitshares-core/wiki/Threading lists a number of tasks handled by the P2P thread. Can you document which collections are used by which tasks, and in what way? That would be a good starting point for a high-level documentation of the P2P layer.
Here is what I found. I am posting it here, as it will probably need to be reformatted to fit the goals of the wiki. Please let me know if you are looking for something different.

This has to do with _handshaking_connections, _active_connections, _closing_connections, and _terminating_connections, which are unordered sets in libraries/net/node.cpp. _active_connections contains peer connections that are completely set up.

Methods that use the collections above, and how they use them:
- node_impl destructor
- fetch_sync_items_loop
- is_item_in_any_peers_inventory
- fetch_items_loop
- advertise_inventory_loop
- terminate_inactive_connections_loop
- fetch_updated_peer_lists_loop
- schedule_peer_for_deletion
- on_address_request_message
- get_number_of_connections
- get_peer_by_node_id
- is_already_connected_to_id
- display_current_connections
- calculate_unsynced_block_count_from_all_peers
- on_blockchain_item_ids_inventory_message
- on_item_ids_inventory_message
- on_connection_closed
- send_sync_block_to_node_delegate
- process_backlog_of_sync_blocks
- process_block_during_normal_operation
- forward_firewall_check_to_next_available_peer
- on_get_current_connections_request_message
- start_synchronizing
- new_peer_just_added
- close
- accept_loop
- connect_to_task
- get_connection_to_endpoint
- move_peer_to_active_list
- move_peer_to_closing_list
- move_peer_to_terminating_list
- dump_node_status
- get_connected_peers
- get_connection_count
- is_connected
- set_advanced_node_parameters
- set_allowed_peers
Thanks.
I do not believe that is possible. I believe that would violate memory/process boundary rules. I will investigate.
How would it yield? Through std::this_thread::yield()? That yields to other threads: it simply allows this_thread to be rescheduled. It does not allow this_thread to jump to another section of code. Or perhaps I am misunderstanding.
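For comparison, here is what std::this_thread::yield() actually does: it is only a scheduling hint, so the thread resumes at the same instruction rather than jumping to another section of code. A minimal illustration, not related to fc::thread's fiber-style yield:

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> ready{false};
std::atomic<int> result{0};

void waiter() {
    // yield() only invites the OS to run some other ready thread;
    // control always returns right here, to the same loop.
    while (!ready.load())
        std::this_thread::yield();
    result.store(42);
}
```

A fiber-style (cooperative) yield, by contrast, transfers control to another task within the same OS thread, which is what fc::thread's scheduler does.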
Thank you for straightening me out. I am digging through boost::context to see how this works. From what I can tell, this would take an explicit call to yield on our (core or fc) part. Some things are still unclear. I am still digging.
Latest notes: I have played in my sandbox and verified that what @pmconrad has said is correct.
The Boost primitives used by fc::thread have also been used to build Boost.Fiber. Boost.Fiber includes much of the functionality of fc::thread, including a recursive_mutex that is "fiber-aware". There are now several ways to go about fixing this issue:
None of these seem appealing. Costs and benefits need to be thought through. I will pause here to welcome other comments / suggestions / ideas.
AFAICS
Yes, it does. I have created a test that proves it. I may check it in to FC, but not sure if it is worth it. See https://gist.github.com/jmjatlanta/6624f524214b7c72184c36e43e346848 I will switch the concurrent_unordered_set to use fc::mutex and see if it works (update: it does).
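One way to make that switch cheap is to parameterize the container on its mutex type, so the same code can compile against std::mutex or, inside the codebase, fc::mutex. This is a sketch only, not the PR's actual class; fc::mutex is not shown, and std::mutex stands in for any type meeting the BasicLockable requirements (lock/unlock):

```cpp
#include <mutex>
#include <unordered_set>

// Sketch: any BasicLockable type (std::mutex, fc::mutex, ...) can be
// plugged in as the Mutex parameter without touching the container code.
template <typename T, typename Mutex = std::mutex>
class concurrent_unordered_set_sketch {
public:
    void insert(const T& v) {
        std::lock_guard<Mutex> g(m_);
        s_.insert(v);
    }

    bool contains(const T& v) const {
        std::lock_guard<Mutex> g(m_);
        return s_.find(v) != s_.end();
    }

    bool erase(const T& v) {
        std::lock_guard<Mutex> g(m_);
        return s_.erase(v) > 0;
    }

private:
    mutable Mutex m_;
    std::unordered_set<T> s_;
};
```

Inside an fc-based codebase, a fiber-aware mutex matters because a plain std::mutex blocks the whole OS thread, stalling every fc task scheduled on it.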
Closing this PR, but will keep the branch. This does not seem to solve the problem it was designed for. Will take another look later.
I thought this PR did fix something; if it's harmless, we can merge it.
It does make the internal sets thread-safe. It does not fix issue #1256. In short, it probably fixes an issue if these sets are accessed from multiple threads; I think we have yet to prove that is the case. Merging it would require more testing. Since such testing is difficult to do thoroughly, it was thought best to close this ticket, leave the branch, and revisit when we have a clearer picture of what is causing #1256 (or another issue).
Merged this to |
Need to remove |
I've been running this in production for a few months, it seems to be working fine, so I will merge it.
Thanks to @jmjatlanta and the reviewers/testers.
Potential fix for #1256
This code is for testing only. I would certainly imagine that optimizations can be found. Please do not do a full code review yet. Please take a look at the approach and comment here.