COM-464 Deep Sleep with Rouser #3

Merged
SeverTopan merged 23 commits into synaptive/master from synaptive/dev/sever/COM-464_DeepSleepWithRouser
Mar 20, 2018

Conversation

@SeverTopan

A change set performed in response to the feedback on #2. This pull request primarily improves the robustness of that implementation.

Problems addressed:

  • Compiler operation reordering: no more usage of std::memory_order_relaxed in response to Hayk's comments.
  • Function signature change for SlottedBag::tryEmptyAny.
  • There is an inherent problem with having a worker go to sleep: how does it ensure there is no task present in its queue? Any attempt to use a task counter necessarily introduces spin-locking. This is solved by allowing threads to go into the idle state while there are still tasks in their queue, and introducing a 'Rouser' that periodically (currently every 10ms) wakes threads up, ensuring that no tasks are dropped. The Rouser thread essentially upper-bounds the delay between when a task is posted and when it is processed.

On the performance side, I don't have exact numbers, but based on preliminary results this updated PR is slightly faster than #2 due to the removal of spin-locking. Overall it is still a major performance improvement over the old pplx scheduler.

*/
* @brief push Push data to queue.
* @param data Data to be pushed.
* @param is_strong false if we wish to allow spurious failures

is_strong is no longer a param

Author

Fixed!

*/
* @brief pop Pop data from queue.
* @param data Place to store popped data.
* @param is_strong false if we wish to allow spurious failures

no longer a param

Author

Fixed!

Comment thread include/thread_pool/rouser.hpp Outdated
};

inline Rouser::Rouser(std::chrono::microseconds rouse_period)
: m_running_flag(true)

Should this be true here? Or false, and set to true in start()?

Author

Fixed, Rouser and Workers are now robust to repeated start/stop cycles.

Comment thread include/thread_pool/rouser.hpp Outdated
template <typename Task, template<typename> class Queue>
inline void Rouser::start(std::vector<std::unique_ptr<Worker<Task, Queue>>>* workers, SlottedBag<Queue>* idle_workers, std::atomic<size_t>* num_busy_waiters)
{
m_thread = std::thread(&Rouser::threadFunc<Task, Queue>, this, workers, idle_workers, num_busy_waiters);

@mmthomas mmthomas Mar 5, 2018


on repeated calls to start/stop/start/stop, the m_running_flag state will never be reset back to true.

If we only allow one transition from true to false, then consider throwing here if the value of m_running_flag is not true.

Otherwise, throw if it is not false, and set it to true before starting the thread.

Author

👍 will add the throw. I think it's safe to assume we will not need to be able to restart thread pools.

The ability to restart also begs the question of what to do with the tasks left in the worker queues. Currently they're all just dropped.

Author

Updated.

Comment thread include/thread_pool/slotted_bag.hpp Outdated

/**
* @brief tryEmptyAny Try to empty any slot in the bag.
* @return a pair containing true upon success along with the id of the emptied slot, false otherwise with an id of UINT_MAX.

UINT_MAX? or ((size_t) -1)? is there a constant for that? numeric_limits<size_t>::max()?

Author

Fixed!

*/
* @brief push Push data to queue.
* @param data Data to be pushed.
* @param is_strong false if we wish to allow spurious failures

Comment is outdated, remove the is_strong parameter definition

Author

Fixed!

*/
* @brief pop Pop data from queue.
* @param data Place to store popped data.
* @param is_strong false if we wish to allow spurious failures

the same here

Author

Fixed!

}
}

template <typename T>

Why did you remove the move constructor and move assignment operator? If my comment in the previous review was confusing, it applied to the copy constructor/assignment only.

Author

Reverted!

{
Cell* cell;
- size_t pos = m_enqueue_pos.load(std::memory_order_relaxed);
+ size_t pos = m_enqueue_pos.load(std::memory_order_acquire);

Thanks for changing the memory orders

{
Cell* cell;
- size_t pos = m_dequeue_pos.load(std::memory_order_relaxed);
+ size_t pos = m_dequeue_pos.load(std::memory_order_acquire);

The same here, thanks for changing

Comment thread include/thread_pool/thread_pool.hpp Outdated
/**
* @brief Move ctor implementation.
*/
ThreadPoolImpl(ThreadPoolImpl&& rhs) noexcept = delete;

Move semantics are useful with this type of object. It would be better if you restored the move constructor and assignment operator.

Author

Reverted!

Comment thread include/thread_pool/thread_pool.hpp Outdated
{

template <typename Task, template<typename> class Queue>
class ThreadPoolImpl;

Just a stylistic comment. It's a common practice to use the Impl suffix when you implement "pimpl" idiom. For template implementations of classes you can use GenericThreadPool, BasicThreadPool, ThreadPoolTemplate, etc.

Author

This was the original naming scheme in the upstream. Will alter the naming.

Author

Updated to use ThreadPool and GenericThreadPool

m_next_worker = rhs.m_next_worker.load();
}
return *this;
m_rouser.stop();

There might be an issue with ordering here; it needs to be double-checked. It seems that stops should happen in the inverse order of starts.

Author

Good catch, fixed!

return getWorker().tryPost(std::forward<Handler>(handler));
// If there aren't busy waiters, let's see if we have any idling threads.
// These incur higher overhead to wake up than the busy waiters.
if (m_num_busy_waiters.load(std::memory_order_acquire) == 0)

This type of check gives you no guarantee that the condition still holds by the time execution reaches the next line. This is the same type of issue you had in the queue class in the previous version. If you want this type of decision point, you must view it purely as decreasing the statistical likelihood of hitting a slow path.

You need to model what happens in two scenarios and make sure the logic is correct:

  1. m_num_busy_waiters loads 0, but another thread modifies the variable before the code enters the body of the if statement
  2. m_num_busy_waiters loads a value other than 0, execution jumps over the if statement, and then another thread changes it to 0

If in both cases behavior is what you expect then you are good.

Author

This part of the posting algorithm should be ok given the two points you mentioned. The busy waiter count check does not need to be strict. It just lowers the likelihood of two situations:

  1. Posting the task to the queue of an active worker when non-active workers are available in the pool.
  2. Hitting the slow-ish path of checking for idle threads (which requires a queue pop operation as opposed to just an atomic read).


// We have to ensure that at least one thread is active after our submission.
// Threads could have transitioned into idling under our feet. We need to account for this.
if (m_num_busy_waiters.load(std::memory_order_acquire) == 0)

The same issue is present here. Please carefully model cases when other thread changes the state of the variable.

Author

@SeverTopan SeverTopan Mar 12, 2018


Same as above; this is a soft check that lowers the likelihood of task dropping and improves the response time of posted tasks. The Rouser thread enforces the strict constraint on dropped tasks.

Author

I realize the comments above this check are misleading, will update.

*/
* @brief MPMCBoundedQueue destructor.
*/
virtual ~MPMCBoundedQueue() = default;

Is this class ever derived from? Are there any other virtual methods? It may be worth marking the class as final and removing the virtual dtor.

Author

Fixed!

Comment thread include/thread_pool/slotted_bag.hpp Outdated
/**
* @brief SlottedBag destructor.
*/
virtual ~SlottedBag() = default;

make class final and remove virtual methods?

Author

Fixed!

Comment thread include/thread_pool/thread_pool.hpp Outdated
if (result.first)
{
auto success = m_workers[result.second]->tryPost(std::forward<Handler>(handler));
m_workers[result.second]->wake();

if success is false, is it even worth waking the worker?

Author

tryEmptyAny removes the worker from the idle worker bag, so once we fail to post, we need to ensure that the worker will be re-added to the bag. The wake function spins the worker back up so that it does this.

Realistically this condition has very low probability: the only time posting fails is when the queue is full, and if the worker is idle, its queue will end up empty (barring certain races).

Author

I think we can optimize this a little, though: re-adding the worker to the queue could safely occur here rather than on the worker thread (via wake()), since the caller retains ownership of the worker's idling synchronization at this point in the execution.

Author

Updated to reflect #3 (comment).


// No idle threads. Our threads are either active or busy waiting
// Either way, submit the work item in a round robin fashion.
if (!getWorker().tryPost(std::forward<Handler>(handler)))

Should we retry the next worker if this fails? Should we retry as many times as there are threads in the pool? If so, will we need another getWorker() variant that guarantees retrieving the next thread in the round robin?

Author

I'd like to include post retrying in another pull request; I have a ticket for it (COM-466). It would (hopefully 😉) be the last feature we'd need to add to this thread pool for it to be consumable.


inline void Rouser::stop()
{
if (m_running_flag.exchange(false, std::memory_order_acq_rel))

You probably need a symmetrical exception here. In the start function you throw an exception when it is called on an already-started object; the opposite should be done here.

Author

Implemented!

Comment thread include/thread_pool/worker.hpp Outdated
{
static thread_local size_t tss_id = std::numeric_limits<size_t>::max();
return &tss_id;
static thread_local std::atomic<size_t> tss_id(std::numeric_limits<size_t>::max());

why does a thread local need to be atomic?

Author

There is a race between when this gets assigned (at the start of threadFunc) and the first post that occurs (which can query for tss_id).

Author

My above comment is wrong, will update :)

Author

Fixed!

}

// If post is unsuccessful, we need to re-add the worker to the idle worker bag.
m_idle_workers.fill(result.second);

is there a race here? should we not let the thread add itself back once it is awake?

Author

This should be safe, the previous version had the thread add itself back in, this just short circuits the condition variable overhead (see #3 (comment)).

Comment thread include/thread_pool/rouser.hpp Outdated

inline void Rouser::stop()
{
if (!m_started_flag.load(std::memory_order_acquire))

Actually, I think calling stop() shouldn't throw... in general there is an intractable race in this scenario, so would prefer to always allow stop() to succeed if already stopped (since there is no cost or allocation of resources)

Author

Reverted!

* in worker queues. The second is that it increases the likelihood of at least one worker busy-waiting at any
* point in time, which speeds up task processing response time.
*/
class Rouser final

You need to add a destructor to this class and stop the thread inside it if it's not stopped yet.

Author

Implemented!

@PoyangLiu PoyangLiu removed their request for review March 19, 2018 15:35
@mmthomas

Looks good!

@SeverTopan SeverTopan merged commit c48dd19 into synaptive/master Mar 20, 2018
@SeverTopan SeverTopan deleted the synaptive/dev/sever/COM-464_DeepSleepWithRouser branch March 20, 2018 14:54
template <typename Task, template<typename> class Queue>
void threadFunc(std::vector<std::unique_ptr<Worker<Task, Queue>>>& workers, SlottedBag<Queue>& idle_workers, std::atomic<size_t>& num_busy_waiters);

std::atomic<bool> m_running_flag;

Just a general recommendation with designing this type of logic controlled with flags. You could alternatively have a single atomic variable which represents the states of the object. When you have one variable representing the state it's easier to control transitions between the states with exchange and compare exchange operations and easier to reason about the state.

For this class you can model the following states:

  • not started
  • start requested
  • running
  • stop requested
  • stopped

This is not a necessary change now, but if you have time later you can experiment with it.

Author

My reasoning for keeping the states separate was that the m_running_flag check in the threadFunc loop would be faster with boolean comparison versus an enum.

Not sure if the performance gain is worth the hit to readability?

Member

I concur with Hayk. Modeling this with an enum rather than a bunch of boolean flags would make your life much easier. The difference in performance should be so small as to be negligible, AFAIK.

Author

👍 will change.
