multithreading - C++ Fork Join Parallelism Blocking
Suppose you wish to run a section in parallel, then merge back to a main thread, then go back to a section in parallel, and so on. It is similar to the childhood game red light, green light.
I've given an example of what I'm trying to do, where I'm using a condition variable to block the threads at the start, but I wish to start them all in parallel and then block them at the end so the results can be printed out serially. The *= operation stands in for a much larger operation spanning many seconds. Reusing the threads is also important. Using a task queue might be too heavy.
I need to use some kind of blocking construct that isn't just a plain busy loop, because I know how to solve this problem with busy loops.
In English:
- thread 1 creates 10 threads that are blocked
- thread 1 signals all threads to start (without blocking each other)
- threads 2-11 process their exclusive memory
- thread 1 waits until 2-11 are complete (can use an atomic count here)
- threads 2-11 complete; each can notify thread 1 to check its condition if necessary
- thread 1 checks its condition and prints the array
- thread 1 re-signals 2-11 to process again, continuing from step 2
Example code (naively adapted from an example on cplusplus.com):
    // condition_variable example
    #include <iostream>           // std::cout
    #include <thread>             // std::thread
    #include <mutex>              // std::mutex, std::unique_lock
    #include <condition_variable> // std::condition_variable
    #include <atomic>
    #include <unistd.h>           // sleep

    std::mutex mtx;
    std::condition_variable cv;
    bool ready = false;
    std::atomic<int> count(0);
    bool end = false;
    int a[10];

    void doublea (int id) {
      while(!end) {
        std::unique_lock<std::mutex> lck(mtx);
        while (!ready) cv.wait(lck);
        a[id] *= 2;
        count.fetch_add(1);
      }
    }

    void go() {
      std::unique_lock<std::mutex> lck(mtx);
      ready = true;
      cv.notify_all();
      ready = false; // naive
      while (count.load() < 10) sleep(1);
      for(int i = 0; i < 10; i++) {
        std::cout << a[i] << std::endl;
      }
      ready = true;
      cv.notify_all();
      ready = false;
      while (count.load() < 10) sleep(1);
      for(int i = 0; i < 10; i++) {
        std::cout << a[i] << std::endl;
      }
      end = true;
      cv.notify_all();
    }

    int main () {
      std::thread threads[10];
      // spawn 10 threads:
      for (int i=0; i<10; ++i) {
        a[i] = 0;
        threads[i] = std::thread(doublea,i);
      }
      std::cout << "10 threads ready to race...\n";
      go(); // go!
      return 0;
    }
This is not trivial to implement efficiently. Moreover, it does not make sense to do so unless you are learning the subject. A condition variable is not a good choice here because it does not scale well.
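For learning purposes only, here is a minimal sketch of how the red light, green light pattern could be made to block correctly with plain C++11 primitives. The names `ForkJoinTeam`, `run_round` and `run_rounds` are hypothetical, and the array starts at 1 instead of 0 so the doubling is visible. A round counter replaces the `ready` flag, so a worker cannot miss a signal or run the same round twice, and the main thread sleeps on a second condition variable instead of polling. Every worker still passes through one mutex, so this is not meant to scale.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: a fixed team of reusable worker threads. Each call to
// run_round() releases all workers once (fork) and blocks until all of them
// have finished (join). No busy loops are used.
class ForkJoinTeam {
public:
    ForkJoinTeam(int n, std::vector<int>& data) : data_(data) {
        for (int id = 0; id < n; ++id)
            threads_.emplace_back(&ForkJoinTeam::worker, this, id);
    }

    ~ForkJoinTeam() {
        {
            std::lock_guard<std::mutex> lk(m_);
            stop_ = true;
        }
        start_cv_.notify_all();
        for (auto& t : threads_) t.join();
    }

    void run_round() {
        std::unique_lock<std::mutex> lk(m_);
        pending_ = static_cast<int>(threads_.size());
        ++round_;                       // green light: a new round exists
        start_cv_.notify_all();
        // Red light: sleep until the last worker reports completion.
        done_cv_.wait(lk, [this] { return pending_ == 0; });
    }

private:
    void worker(int id) {
        unsigned long seen = 0;         // last round this worker completed
        for (;;) {
            {
                std::unique_lock<std::mutex> lk(m_);
                start_cv_.wait(lk, [&] { return stop_ || round_ != seen; });
                if (stop_) return;
                seen = round_;
            }
            data_[id] *= 2;             // exclusive slot, done outside the lock
            {
                std::lock_guard<std::mutex> lk(m_);
                if (--pending_ == 0) done_cv_.notify_one();
            }
        }
    }

    std::vector<int>& data_;
    std::mutex m_;
    std::condition_variable start_cv_;
    std::condition_variable done_cv_;
    int pending_ = 0;
    unsigned long round_ = 0;
    bool stop_ = false;
    std::vector<std::thread> threads_;
};

// Hypothetical driver: doubles every slot `rounds` times and returns the data.
std::vector<int> run_rounds(int workers, int rounds) {
    std::vector<int> data(workers, 1);
    {
        ForkJoinTeam team(workers, data);
        for (int r = 0; r < rounds; ++r) {
            team.run_round();
            // Serial section: all workers are blocked, so data can be
            // read or printed here without further locking.
        }
    }   // destructor stops and joins the workers
    return data;
}
```

Calling `run_rounds(10, 2)` from `main` would run two parallel rounds on ten reused threads and leave every slot at 4.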
I suggest you look at how mature run-time libraries implement fork-join parallelism and either learn from them or use them in your app. See http://www.openmprtl.org/, http://opentbb.org/, https://www.cilkplus.org/ - all of these are open-source.
OpenMP is the closest model to what you are looking for, and it has the most efficient implementation of fork-join barriers. However, it has its disadvantages because it is designed for HPC and lacks dynamic composability. TBB and Cilk work best for nested parallelism and for usage in modules and libraries which can be used in the context of external parallel regions.
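As a rough illustration of the OpenMP model, the sketch below keeps one team of threads alive across rounds and relies on the implicit barriers at the end of the `omp for` and `omp single` constructs. The function name `omp_rounds` and the initial value of 1 are assumptions made for the example. It needs OpenMP enabled at build time (for example `-fopenmp` with GCC or Clang); without it the pragmas are ignored and the code runs serially with the same result.

```cpp
#include <iostream>
#include <vector>

// Hypothetical helper: doubles every element `rounds` times, printing the
// array serially after each parallel round.
std::vector<int> omp_rounds(int n, int rounds) {
    std::vector<int> a(n, 1);

    #pragma omp parallel            // the thread team is created once
    {
        for (int r = 0; r < rounds; ++r) {
            #pragma omp for         // green light: iterations are shared out
            for (int i = 0; i < n; ++i)
                a[i] *= 2;
            // implicit barrier: all threads wait here until the loop is done

            #pragma omp single      // red light: exactly one thread prints
            {
                for (int i = 0; i < n; ++i)
                    std::cout << a[i] << '\n';
            }
            // implicit barrier: nobody starts the next round while printing
        }
    }
    return a;
}
```

The runtime supplies the barriers here, so no hand-written mutex or condition variable is involved.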