Skip to content
This repository has been archived by the owner on Aug 8, 2023. It is now read-only.

[core] Use an actor model for tile worker concurrency #6337

Merged
merged 2 commits into from
Sep 16, 2016
Merged

Conversation

jfirebaugh
Copy link
Contributor

@jfirebaugh jfirebaugh commented Sep 15, 2016

GeometryTile and RasterTile now communicate with their asynchronous worker according to the actor model.

This has the following benefits:

Code docs

An Actor<O> is an owning reference to an asynchronous object of type O: an "actor".
Communication with an actor happens via message passing: you send a message to the object
(using invoke), passing a pointer to the member function to call and arguments which
are then forwarded to the actor.

The actor receives messages sent to it asynchronously, in a manner defined its Scheduler.
To store incoming messages before their receipt, each actor has a Mailbox, which acts as
a FIFO queue. Messages sent from actor S to actor R are guaranteed to be processed in the
order sent. However, relative order of messages sent by two different actors S1 and S2
to R is not guaranteed (and can't be: S1 and S2 may be acting asynchronously with respect
to each other).

Construction and destruction of an actor is currently synchronous: the corresponding O
object is constructed synchronously by the Actor constructor, and destructed synchronously
by the ~Actor destructor, after ensuring that the O is not currently receiving an
asynchronous message. (Construction and destruction may change to be asynchronous in the
future.)

An Actor<O> can be converted to an ActorRef<O>, a non-owning value object representing
a (weak) reference to the actor. Messages can be sent via the Ref as well.

It's safe -- and encouraged -- to pass Refs between actors via messages. This is how two-way
communication and other forms of collaboration between multiple actors is accomplished.

It's safe for a Ref to outlive its Actor -- the reference is "weak", and does not extend
the lifetime of the owning Actor, and sending a message to a Ref whose Actor has died is
a no-op. (In the future, a dead-letters queue or log may be implemented.)

Please don't send messages that contain shared pointers or references. That subverts the
purpose of the actor model: prohibiting direct concurrent access to shared state.

A Scheduler is responsible for coordinating the processing of messages by
one or more actors via their mailboxes. It's an abstract interface. Currently,
the following concrete implementations exist:

  • ThreadPool can coordinate an unlimited number of actors over any number of
    threads via a pool, preserving the following behaviors:

    • Messages from each individual mailbox are processed in order
    • Only a single message from a mailbox is processed at a time; there is no
      concurrency within a mailbox

    Subject to these constraints, processing can happen on whatever thread in the
    pool is available.

  • RunLoop is a Scheduler that is typically used to create a mailbox and
    ActorRef for an object that lives on the main thread and is not itself wrapped
    as an Actor:

      auto mailbox = std::make_shared<Mailbox>(*util::RunLoop::Get());
      Actor<Worker> worker(threadPool, ActorRef<Foo>(*this, mailbox));
    

Notes on implementation patterns

The actor model is a constrained environment. All you can do is send messages! Implementing behaviors that would be easy in a single threaded or shared memory environment takes a degree of cleverness, or at least knowledge of certain patterns.

Here are the patterns that GeometryTile / GeometryTileWorker use to get their job done:

  • Correlation Identifier allows GeometryTile to know when the worker has finished processing the most recent data it's been sent, such that the tile can be considered "fully loaded".
  • GeometryTileWorker is a State Machine: it transitions between states depending on which messages it receives, and its behavior for messages of a certain type depends on what state it's in.
  • GeometryTileWorker creates a Self-Sent Message: a message it sends to itself, asynchronously. Its state transitions are defined in such a way that this allows it to coalesce the other messages it receives, to avoid doing obsolete work.

If we use actors more, which I think we should, we'll probably reuse these patterns and more.

TODO

Now:

  • Write unit tests for the new types. They're integration tested via existing tests, but should be independently tested.

Future:

  • Convert Thread users to actors, and remove Thread. This is a strictly better model.
  • Think about adding a "supervisor" concept/type and more clearly defined exception handling behavior.

@mention-bot
Copy link

@jfirebaugh, thanks for your PR! By analyzing this pull request, we identified @ansis, @tmpsantos and @mikemorris to be potential reviewers.

@jfirebaugh jfirebaugh force-pushed the worker-actor branch 3 times, most recently from 54e81ca to 13f7484 Compare September 15, 2016 01:23
@kkaefer
Copy link
Member

kkaefer commented Sep 15, 2016

\o/

@jfirebaugh jfirebaugh force-pushed the worker-actor branch 3 times, most recently from 96b49ca to 5468f86 Compare September 15, 2016 18:20
@jfirebaugh
Copy link
Contributor Author

Added unit tests and did some high-intensity panning, zooming, rotating, and style switching on my iPhone. Everything is looking good.

@tmpsantos, @kkaefer, want to review?

Copy link
Contributor

@tmpsantos tmpsantos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job with this refactoring.

void ThreadPool::schedule(std::weak_ptr<Mailbox> mailbox) {
std::lock_guard<std::mutex> lock(mutex);
queue.push(mailbox);
cv.notify_one();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this safe? If all threads are busy will this cause the message on the mailbox to starve because no thread will be waiting on the synchronization point?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's safe because we use the version of condition_variable::wait which accepts a predicate. If all threads are busy, on the subsequent loop whichever thread obtains the mutex first will check the predicate !queue.empty() || terminate.load(), see that it's true, and not wait on the condition variable.

queue.pop();
lock.unlock();

if (auto locked = mailbox.lock()) {
Copy link
Contributor

@tmpsantos tmpsantos Sep 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I particularly find using a = b inside an if confusing as I tend to think it was a typo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the idiomatic use of weak_ptr; see e.g. here. In general, declaring on a separate line would require an extra set of braces and indentation to preserve the same region of locking:

{
    auto locked = mailbox.lock();
    if (locked) {
        ...
    }
}

@@ -99,6 +101,7 @@ Map::Impl::Impl(View& view_,
contextMode(contextMode_),
pixelRatio(view.getPixelRatio()),
asyncUpdate([this] { update(); }),
workerThreadPool(4),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should move the number of works to constants.hpp

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I think we can target that for a followup that also makes the thread pool global, instead of per-Map.

@@ -78,6 +81,14 @@ class RunLoop : private util::noncopyable {

void push(std::shared_ptr<WorkTask>);

void schedule(std::weak_ptr<Mailbox> mailbox) override {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, so when we get rid of util::Thread we can get rid of all the locking and complexity of the RunLoop, I suppose.

class Mailbox;

/*
A `Scheduler` is responsible for coordinating the processing of messages by
Copy link
Contributor

@tmpsantos tmpsantos Sep 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this abstraction. It could be used for creating a out-of-process worker if ever needed.


namespace mbgl {

RasterTileWorker::RasterTileWorker(ActorRef<RasterTileWorker>, ActorRef<RasterTile> parent_)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear to me what is the first parameter on this constructor used for.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll document this.

Copy link
Contributor

@tmpsantos tmpsantos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kkaefer can you also 👀 before we merge?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants