Replace fibers with threadpool #674

Merged: 46 commits into master, Mar 31, 2022
Conversation

@markaren (Contributor)

This PR replaces #671. It uses a thread pool rather than std::for_each so that compilers older than gcc9 will work.

Note that slave state is unimplemented in this PR. It will eventually be added back once I have figured out how best to reintroduce it after removing async_slave. Additionally, the concurrency test, which concerns file locking, has been commented out.
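
For context, here is a minimal sketch of the parallel std::for_each approach this PR avoids: the C++17 execution policies live in the <execution> header, which gcc only ships from version 9 (typically backed by TBB). The Simulator type and do_step call below are illustrative stand-ins, not the actual libcosim API.

// Sketch only: the parallel-algorithm alternative that requires gcc >= 9.
// `Simulator` and `do_step` are illustrative stand-ins, not libcosim types.
#include <algorithm>
#include <execution>
#include <vector>

struct Simulator
{
    void do_step(double t, double dt) { /* step the FMU from t to t + dt */ }
};

void step_all(std::vector<Simulator*>& simulators, double t, double dt)
{
    std::for_each(std::execution::par, simulators.begin(), simulators.end(),
        [=](Simulator* s) { s->do_step(t, dt); });
}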

Comment on lines 31 to 42
if (!work_queue_.empty()) {

auto task = work_queue_.front();
work_queue_.pop();
task();

lck.unlock();
cv_.notify_one();

} else {
std::this_thread::yield();
}
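
For comparison, here is a minimal sketch of a more conventional condition-variable worker loop that blocks instead of yielding and releases the lock before running the task. The class and member names below mirror the snippet above but are assumptions, not the PR's actual thread_pool implementation.

// Sketch only: a common worker-loop shape; not the PR's thread_pool code.
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>

class worker
{
public:
    void loop()
    {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lck(mutex_);
                // Block until there is work or the pool is shutting down.
                cv_.wait(lck, [this] { return done_ || !work_queue_.empty(); });
                if (done_ && work_queue_.empty()) return;
                task = std::move(work_queue_.front());
                work_queue_.pop();
            } // lock released here
            // Run the task outside the lock so other workers can dequeue concurrently.
            task();
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> work_queue_;
    bool done_ = false;
};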
(Contributor)

Thanks for working on this PR. Would it be possible to have an option to run FMUs without spawning a separate worker thread (i.e. running everything in a main thread sequentially)? For some use cases, the communication overhead between the main and the worker threads seems quite significant.

(Member)

This is possible today, and was in fact the motivation for using fibers in the first place. (Not that fibers are required for sequential execution of FMU functions, but they provide a nice framework for combining sequential execution with asynchronous I/O so one can run both local and remote slaves without the overhead of threads.)

@kyllingstad (Member), Feb 14, 2022

The problem is just that the remote-slave implementation we're using today (proxyfmu) is not built around async I/O, so it requires a worker thread per slave anyway. That is why fibers are simply an extra overhead on top of the threading and I/O costs, which I suppose is what motivated @markaren's PR.

As I see it, there are two ways to improve performance in this area: Drop fibers and go all-in for proxyfmu and one-thread-per-slave (as this PR proposes), or keep fibers and implement async communication with slaves (as was the original plan).

@davidhjp01 (Contributor), Feb 14, 2022

Thanks for the explanation @kyllingstad.

I was able to achieve the required simulation performance by removing std::thread for slaves as well as the recursive fiber creation in the libcosim master branch. I was not able to achieve this via pseudo_async_slave, because it still creates fibers recursively in the FMU interface methods (the getters/setters for variables and do_step).

But now the problem is that I cannot simulate some FMUs, because they run directly in fibers with limited resources (created by slave_simulator::impl::do_step), and I cannot replace or extend slave_simulator, as it is always added in execution::add_slave.

This PR seems to have fixed my issue, but I still had to remove the worker thread to avoid communication overhead (the worker thread also consumes CPU time checking for messages).

@markaren (Contributor, Author), Feb 14, 2022

Would it be possible to have an option to run fmus without spawning a separate worker thread?

Yes, and this is one of the motivations for this PR. IMO the master algorithm itself should decide how it handles execution. In the case of this implementation, it could take numThreads as an argument, where a value < 1 means no pool (a hypothetical sketch follows below).

proxy-fmu was not a motivation for this PR. My motivation was to simplify the code base and make it run faster. I've found that the fiber solution is slower than a threaded solution, with or without proxy-fmu.
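
A hypothetical sketch of what such an option could look like; the class name, the numThreads parameter, and the run_tasks helper are purely illustrative and not the actual fixed_step_algorithm interface.

// Hypothetical sketch of a "numThreads < 1 means no pool" option.
// The names (step_algorithm, numThreads, run_tasks) are illustrative only.
#include <functional>
#include <vector>

class step_algorithm
{
public:
    explicit step_algorithm(int numThreads = 0)
        : numThreads_(numThreads)
    { }

    void run_tasks(const std::vector<std::function<void()>>& tasks)
    {
        if (numThreads_ < 1) {
            // No pool: execute everything sequentially in the calling thread.
            for (const auto& task : tasks) task();
        } else {
            // Otherwise, hand the tasks to a pool of numThreads_ workers
            // (pool implementation omitted in this sketch).
        }
    }

private:
    int numThreads_;
};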

@markaren (Contributor, Author)

It uses a threadpool rather than std::for_each so that compilers older than gcc9 will work

This is an important thing to address if this is something we want to move forward with. Do we have to support old compilers that are not C++17 feature complete?

@markaren (Contributor, Author) commented Feb 16, 2022

If the target branch gets the OK, we should also make it possible to specify the thread count in the XML config.

Edit: This was meant as a comment in #680.

@markaren (Contributor, Author)

Adding the functionality to set the number of worker threads for SSP was trivial, but not so much for the OSP alternative, as it does not provide algorithm-specific configuration options. This is related to #404.

@ljamt (Member) commented Mar 15, 2022

Adding the functionality to set the number of worker threads for SSP was trivial, but not so much for the OSP alternative, as it does not provide algorithm-specific configuration options. This is related to #404.

I see your point, but I don't think this can be solved separately. Should not be a blocker for this PR.

@ljamt (Member) commented Mar 15, 2022

What's the reason for this PR to remain in draft mode? Any remaining issues that must be resolved?

@markaren (Contributor, Author) commented Mar 15, 2022

What's the reason for this PR to remain in draft mode? Any remaining issues that must be resolved?

Nothing other than significant vetting.

And yeah, the concurrent file locking test needs to be fixed (currently commented out).

markaren marked this pull request as ready for review on March 15, 2022 10:47
ljamt requested a review from eidekrist on March 15, 2022 11:49
@davidhjp01 (Contributor) left a comment

Good work! Just added some comments

@ljamt (Member) commented Mar 28, 2022

utility_concurrency_unittest is still commented out. Should be included and fixed before merging.

@ljamt (Member) commented Mar 28, 2022

utility_concurrency_unittest is still commented out. Should be included and fixed before merging.

The test is now included and passing.

ljamt requested a review from kyllingstad on March 28, 2022 14:01
@ljamt (Member) commented Mar 29, 2022

As fibers are removed, I think cosim::utility::shared_mutex can be replaced with std::shared_mutex. I don't want to add more to this PR, so that can be pushed as a separate PR.
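
For reference, a rough sketch of what the standard-library replacement would look like; the guarded value and function names are illustrative, not libcosim code.

// Sketch only: std::shared_mutex usage in place of cosim::utility::shared_mutex.
#include <shared_mutex>

std::shared_mutex mutex_;
int shared_value_ = 0;

int read_value()
{
    std::shared_lock<std::shared_mutex> lock(mutex_); // many readers may hold this
    return shared_value_;
}

void write_value(int v)
{
    std::unique_lock<std::shared_mutex> lock(mutex_); // exclusive writer
    shared_value_ = v;
}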

@ljamt (Member) commented Mar 30, 2022

@markaren, if you are OK with the latest changes to thread_pool.hpp, are we then ready to merge?
@kyllingstad, @eidekrist, share your opinions if you disagree :)

@markaren (Contributor, Author)

How are the observed differences in usage/speed/accuracy on your side? All good?
We can probably set the default number of threads to std::thread::hardware_concurrency() - 1 in fixed_step_algorithm as suggested?

@restenb (Member) commented Mar 30, 2022

How are the observed differences in usage/speed/accuracy on your side? All good? We can probably set the default number of threads to std::thread::hardware_concurrency() - 1 in fixed_step_algorithm as suggested?

Yes. I've added an unsigned int max_threads_ = std::thread::hardware_concurrency() - 1 variable to fixed_step_algorithm. With what is now a blocking-only strategy it may no longer be necessary, but I'd like to extend this in the future to include a spinlock as well. Blocking and resuming threads has a non-negligible overhead when done at a high rate, such as in a simulation with a very small time step.

We seem to be seeing ~15-20% improvements in simulation speed with this PR over the fiber implementation, at least with the example projects like dp-ship.
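
One detail worth noting about that default: std::thread::hardware_concurrency() is allowed to return 0 when the value cannot be determined, in which case an unsigned hardware_concurrency() - 1 wraps around. A defensively clamped default might look like the sketch below (the function name is illustrative).

// Sketch only: a clamped default for the worker-thread count.
#include <thread>

unsigned int default_max_threads()
{
    const unsigned int hc = std::thread::hardware_concurrency(); // may be 0
    return hc > 1 ? hc - 1 : 1; // leave one core for the main thread, never 0
}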

@kyllingstad (Member) left a comment

Great work! I don't have much to add except some stylistic nitpicks here and there. I didn't look at the thread pool implementation in any detail, since it seems to have been thoroughly reviewed by others. Everything else looks good to merge as far as I'm concerned.

include/cosim/orchestration.hpp (outdated review comment; resolved)
boost::fibers::condition_variable condition_;
};


/**
* A shared mutex à la `std::shared_mutex`, but with support for fibers.
(Member)

I don't think we need this class at all anymore; we can just use std::shared_mutex.

@kyllingstad (Member), Mar 30, 2022

Nevermind, I see that @ljamt already suggested we do this as a separate PR.

@restenb (Member), Mar 30, 2022

Yes, we noticed this as well. In addition, cosim::utility::shared_mutex was used in utility_concurrency_unittest.cpp to test that file locking functioned correctly with the custom mutex, which was in turn only necessary because of fibers.

With the removal of fibers, that test can be removed as well, given that there is no point in unit testing std::shared_mutex ourselves. We wanted to push this as a separate PR to avoid more noise in this one, since cosim::utility::shared_mutex is used throughout concurrency.hpp/.cpp.

@kyllingstad (Member)

One more thing: you may want to consider running clang-format on everything before merging. I see there are some includes that are out of alphabetical order after the async_slave --> slave change, and possibly other things. If it's not fixed now, it's going to show up in someone else's PR later.

markaren merged commit a6f3fcc into master on Mar 31, 2022
markaren deleted the parallel-pool branch on March 31, 2022 07:12