Implementing Actors: Part 1


Introduction

CAF attaches many labels to actors: event-based, blocking, statically typed, dynamically typed, and so on. When delving into the depths of the API, users may get lost in technical details and subtleties without grasping the big picture first.

In this guide, we first take a high-level look at the basic ideas and concepts behind the API before we dissect the class hierarchies and outline how the pieces fit together in the big picture.

The Actor Model

Before we discuss API details of CAF, here is a quick refresher on actors in general. The original formulation of the Actor Model of computation by Hewitt, Bishop, and Steiger ties three things to an actor:

  1. Processing: CPU cycles and internal control flow.
  2. Communications: Sending and receiving messages.
  3. Storage: Member variables and other state.

Conceptually, we can imagine an actor as depicted in this diagram:

Conceptual view of an actor.


As the name suggests, an actor is an active software entity. It shields its inside from the outside world. Only messages may cross the boundary between inside and outside. No internal variable may be accessed from the outside, only the mailbox.

When holding a handle to an actor, you may do one of two things: send a message to the actor by enqueueing to its mailbox or observe the lifetime of the actor by monitoring or linking to it.
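To make the split between inside and outside concrete, here is a minimal sketch in plain standard C++ rather than CAF. The class toy_counter and its string messages are invented for illustration. The only point is that callers can reach the mailbox but never the state.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Toy model, NOT the CAF API: all names here are hypothetical.
class toy_counter {
public:
  // Communications: the only operation visible from the outside.
  void enqueue(std::string msg) {
    mailbox_.push(std::move(msg));
  }

  // Processing: handle queued messages one at a time and collect the
  // replies that a real actor would send back to the requester.
  std::vector<int> drain() {
    std::vector<int> replies;
    while (!mailbox_.empty()) {
      std::string msg = std::move(mailbox_.front());
      mailbox_.pop();
      if (msg == "inc")
        ++count_;
      else if (msg == "get")
        replies.push_back(count_);
      // Unknown messages are silently dropped in this sketch.
    }
    return replies;
  }

private:
  std::queue<std::string> mailbox_; // the mailbox
  int count_ = 0;                   // Storage: unreachable from the outside
};
```

A real actor would run drain on its own and send replies as messages. The sketch returns them directly to stay single-threaded.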

Handle Types: The Outside View

The conceptual view now allows us to categorize some of the building blocks in the CAF API. Remember, there is an inside and an outside. When standing outside of an actor, it appears as a black box with only a mailbox standing out for interacting with it. What kind of mailbox, though? That depends on the handle type:

                     strong              weak
dynamically typed    actor               -
statically typed     typed_actor<...>    -
untyped              strong_actor_ptr    actor_addr

There are four different handle types in CAF in total. We ignore the strong/weak categorization for now. The most important difference between the handle types is: what kind of mailbox do you see on the outside?

The handle type actor gives you a dynamically typed mailbox on the outside. This means you can put any message into the mailbox. Whether the actor actually understands this input or not is decided once the actor processes the message. This means errors only happen at run-time.

The handle type typed_actor<...> gives you a statically typed mailbox on the outside. This means there is a list of allowed message types and you may not enqueue anything else. The compiler catches type errors for you, but typed actors generally require more boilerplate code.
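The difference between the two mailbox kinds loosely resembles the difference between std::any and std::variant in the standard library. The following sketch is only an analogy and does not use CAF. The names understands, typed_message, and holds_int are made up for illustration.

```cpp
#include <any>
#include <cassert>
#include <cstdint>
#include <string>
#include <typeinfo>
#include <variant>

// Dynamically typed: the "mailbox" accepts any value. Whether the receiver
// understands it is only known once the message gets processed.
inline bool understands(const std::any& msg) {
  return msg.type() == typeid(std::int32_t);
}

// Statically typed: only the listed types can be stored at all. Trying to
// construct a typed_message from, e.g., a std::vector<int> does not compile.
using typed_message = std::variant<std::int32_t, std::string>;

inline bool holds_int(const typed_message& msg) {
  return std::holds_alternative<std::int32_t>(msg);
}
```

In the dynamic case, a std::string passes the type checker and is only rejected by understands at run-time. In the static case, the set of legal inputs is part of the type.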

Lastly, there are two untyped handle types. Here, you see no mailbox at all! The best way to think of these two is as a type-erased pointer to an actor. Usually, you need to restore type information via actor_cast before you can do anything useful with these handles.

Typing aside, CAF also distinguishes between strong and weak references to an actor. Actors in CAF are reference counted in order to enable the run-time system to detect and dispose of unreachable actors.

If they do not allow sending messages, what use case do the untyped handles have? Let us start with actor_addr, because it has a very specific use case: down and exit messages. When monitoring an actor, CAF sends a down_msg to the observer once the monitored actor terminates. CAF sends an exit_msg instead when linking to it. In both cases, CAF has no type information about the terminated actor. Further, down and exit messages cannot include a strong reference to the terminated actor, because this actor may have been disposed of as unreachable, i.e., it may have a (strong) reference count of 0!

The most useful thing actor_addr has to offer is its operator==. Once an actor receives a down or exit message, it may compare the received actor_addr to some list or map of related actors. For example, a supervisor may monitor its workers in order to re-spawn them on error. After receiving a down message, the supervisor iterates its list of workers to replace the matching entry with a newly spawned replacement. The handle type actor_addr has little use outside of monitoring and linking.

Our second untyped handle is strong_actor_ptr. Unlike its weak counterpart, this handle type keeps actors from becoming unreachable. The most notable use case for strong_actor_ptr is for storing the sender of a message. CAF has no knowledge of whether a message came from a dynamically or statically typed actor. A receiver can reasonably assume that the sender is going to provide a handler for the response message (if any), but the receiver has no further knowledge regarding the type of the sender. Actors can access the source of a message when inside a message handler by calling self->current_sender(). If CAF used a weak handle for storing sender information, then an actor could become unreachable immediately after sending a message and before receiving the response. Aside from sender information, there remain only a few places in CAF where users may encounter a strong_actor_ptr. Notable examples include the actor registry and communication with the middleman actor.

Implementation Types: The Inside View

At first glance, there seem to be many ways to implement actors with CAF. However, we can prune some choices right away by following this simple rule: do not implement blocking actors! If you ever find yourself in need of a blocking way to interact with other actors, then use scoped_actor. If you need an actor to have its own thread of execution, then spawn it using the detached flag. Consider blocking_actor an implementation detail.

Readers who approach the Actors Section in the official Manual without a solid grasp of the terminology and concepts may find its densely packed overview of the various ways to implement and spawn actors confusing.

As a starting point, we will only look at event_based_actor. The event-based part of the name tells us about the control flow. Remember the three things that an actor encapsulates? This is the Processing part. An event-based actor follows a simple state machine:

  1. Wait for a Message (= Event).
  2. Process the message.
  3. Terminate when done, otherwise goto 1.
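The three steps above can be sketched as a plain loop. This is a single-threaded toy in standard C++ with invented names (toy_event_loop and the "quit" message). How CAF actually schedules actors and waits for messages is not shown here.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>

// Runs the state machine over a pre-filled inbox and returns how many
// messages were processed before the toy actor terminated.
inline int toy_event_loop(std::deque<std::string> inbox) {
  int processed = 0;
  bool done = false;
  while (!done) {
    // 1. Wait for a message. With nothing left to wait for, this sketch
    //    simply stops instead of blocking.
    if (inbox.empty())
      break;
    std::string msg = std::move(inbox.front());
    inbox.pop_front();
    // 2. Process the message.
    ++processed;
    // 3. Terminate when done, otherwise go back to step 1.
    done = (msg == "quit");
  }
  return processed;
}
```

Messages that arrive after the terminating one are never processed, which mirrors an actor that stops reading its mailbox once it is done.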

We postpone discussing the Communications part in more detail for now and instead look at the Storage bit.

Class-based Actors

The most obvious way to store a value in an actor is by using a member variable. As an example, consider this simple cell actor:

class cell_impl : public caf::event_based_actor {
public:
  using super = caf::event_based_actor;

  cell_impl(caf::actor_config& cfg, int32_t value)
  : super(cfg), value_(value) {
    // nop
  }

  caf::behavior make_behavior() override {
    return {
      [this](caf::get_atom) {
        return value_;
      }
    };
  }

  void on_exit() override {
    // nop
  }

  const char* name() const override {
    return "cell";
  }

private:
  int32_t value_;
};

Our cell is derived from event_based_actor. The only constructor available to us in the base class takes an actor_config. We can safely treat this argument as an opaque value that we only need to pass along. The actor_config is always going to be the first constructor argument and additional arguments for the derived type follow after that.

When implementing an actor, we need to override make_behavior. Otherwise, our actor immediately terminates after spawning it because actors without an active behavior are considered done. A behavior is simply a list of message handlers that CAF tries to invoke in order to process a received message. We also override on_exit, but we come back to this member function later.

Our cell actor only stores a single 32-bit value and provides a single message handler that returns the 32-bit value when receiving a get_atom. We have implemented our cell as a class, so we refer to it as a class-based actor.

Source Code

void caf_main(caf::actor_system& sys) {
  auto cell = sys.spawn<cell_impl>(42);
  caf::scoped_actor self{sys};
  self->send(cell, caf::get_atom_v);
  self->receive(
    [](int32_t value) {
      println("cell responded with: ", value);
    });
}

Output

cell responded with: 42

Note that our class takes an actor_config as first argument but we spawn the actor by providing only a value for the second argument. The actor system creates the actor_config for us and then appends whatever arguments we pass to spawn.
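The way spawn supplies the first constructor argument can be sketched with a small factory template. Everything here (toy_config, toy_cell, toy_system) is a hypothetical stand-in written in plain C++. It only illustrates the argument forwarding, not how CAF creates or schedules actors.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Stand-in for caf::actor_config: opaque data owned by the "system".
struct toy_config {
  int id;
};

class toy_cell {
public:
  // The config comes first, user-defined arguments follow.
  toy_cell(toy_config& cfg, int value) : id_(cfg.id), value_(value) {
    // nop
  }

  int id() const { return id_; }
  int value() const { return value_; }

private:
  int id_;
  int value_;
};

class toy_system {
public:
  // Creates the config on behalf of the caller, then appends the
  // arguments that the caller passed to spawn.
  template <class T, class... Ts>
  std::unique_ptr<T> spawn(Ts&&... xs) {
    toy_config cfg{next_id_++};
    return std::make_unique<T>(cfg, std::forward<Ts>(xs)...);
  }

private:
  int next_id_ = 1;
};
```

The caller writes sys.spawn<toy_cell>(42) and never sees the toy_config, just as the example above never constructs an actor_config.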

Now that you know how to implement class-based actors... don't do it. :)

There are better ways to implement actors, as we are going to discuss later, that also require less boilerplate code. However, everything we discuss later builds on top of what we have learned here. Hence, we are not just going to discuss more convenient ways to implement actors but we are also explaining how the abstractions are implemented.

For now, though, we make a quick detour in order to understand the inner workings of CAF before we explore more ways to implement our cell actor.

Object Lifetimes

One cannot program in C++ without thinking about object lifetimes. So this time, we implement the same cell actor slightly differently and add some print statements that allow us to reason about lifetimes of objects:

class cell_impl : public caf::event_based_actor {
public:
  using super = caf::event_based_actor;

  cell_impl(caf::actor_config& cfg, int32_t value) : super(cfg)  {
    println("create actor and state objects");
    value_ = std::make_unique<int32_t>(value);
  }

  ~cell_impl() override {
    println("destroy actor object");
  }

  caf::behavior make_behavior() override {
    return {
      [this](caf::get_atom) {
        return *value_;
      },
    };
  }

  void on_exit() override {
    println("destroy state object");
    value_.reset();
  }

  const char* name() const override {
    return "cell";
  }

private:
  std::unique_ptr<int32_t> value_;
};

The question "is this safe?" may come to mind, because we destroy value_ in on_exit but access it without null pointer check in our behavior. To answer the question: yes, CAF guarantees that this code is safe. We allocate the state in the constructor for the actor. This means value_ points to a heap-allocated integer when first calling make_behavior. When an actor terminates, it cleans up its behavior and calls on_exit. Finally, the actor object itself gets destroyed when the strong reference count drops to zero.

Our slightly different cell actor prints on construction and when destroying either value_ or the object itself. To understand how CAF manages the lifetime of actors, we also add print statements to our caf_main, add a caf::actor_addr that points to our cell in addition to the regular caf::actor handle, and terminate the actor before our handle to it goes out of scope:

Source Code

void caf_main(caf::actor_system& sys) {
  println("enter main");
  caf::actor_addr cell_addr;
  {
    println("enter inner scope");
    caf::actor cell = sys.spawn<cell_impl>(42);
    cell_addr = caf::actor_cast<caf::actor_addr>(cell);
    caf::scoped_actor self{sys};
    self->send(cell, caf::get_atom_v);
    self->receive(
      [](int32_t value) {
        println("cell responded with: ", value);
      });
    self->send_exit(cell, caf::exit_reason::user_shutdown);
    self->wait_for(cell);
    println("leave inner scope (cell goes out of scope)");
  }
  assert(cell_addr != nullptr);
  auto hdl = caf::actor_cast<caf::actor>(cell_addr);
  if (hdl)
    println("hdl != null");
  else
    println("hdl == null");
  println("leave main (cell_addr goes out of scope)");
}

Output

enter main
enter inner scope
create actor and state objects
cell responded with: 42
destroy state object
leave inner scope (cell goes out of scope)
destroy actor object
hdl == null
leave main (cell_addr goes out of scope)

The program prints a deterministic output, although the cell actor runs in another thread. After receiving the result from the cell, we force it to terminate by sending it an exit message. As part of its termination, the actor calls on_exit and prints "destroy state object". Then, we wait in the inner scope until self receives the down message for cell. We could also explicitly call self->monitor(cell) and then wait for a down_msg; the function wait_for is simply a convenience API that does this for us. Afterwards, we leave the inner scope and the actor object gets destroyed. We can see this in the output from the text we print in the destructor. At this point, we still hold onto a weak reference through cell_addr. What about that?

To understand the inner workings of the handle types, think of the strong actor handles as a smart pointer like std::shared_ptr and of actor_addr as a smart pointer like std::weak_ptr. The function actor_cast converts between the various handle types. Converting a strong reference to a weak one always succeeds. However, trying to convert an actor_addr to a strong handle type fails (returns null) if the actor is already destroyed! In the standard library, we would call weak_ptr::lock for this conversion, which behaves in a similar way.
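The standard library side of this analogy can be tried out directly. The sketch below uses only std::shared_ptr and std::weak_ptr. The names upgrade_result and upgrade_demo are invented for illustration, and the comments map each step to the CAF handle it stands in for.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

struct upgrade_result {
  bool while_alive;   // could we upgrade while a strong reference existed?
  bool after_release; // could we upgrade after the last one was gone?
};

inline upgrade_result upgrade_demo() {
  upgrade_result result{};
  std::weak_ptr<std::int32_t> weak; // plays the role of actor_addr
  {
    // Plays the role of a strong handle such as actor.
    auto strong = std::make_shared<std::int32_t>(42);
    weak = strong; // strong to weak always succeeds
    result.while_alive = (weak.lock() != nullptr);
  } // the last strong reference goes away, the integer gets destroyed
  result.after_release = (weak.lock() != nullptr); // lock now yields null
  return result;
}
```

The second lock corresponds to the actor_cast in caf_main that produced "hdl == null".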

We already mentioned strong and weak references a couple of times. The C++ standard library calls these owning and non-owning. However, what does that actually mean for our program? To answer this, we need to look more closely at the reference counting implementation. The following figure depicts a simplified view of what our example looks like in terms of object relations at runtime.

Conceptual view of actor references.


At the very bottom sits the 32-bit integer that we manage with our member variable value_. On the layer above our cell, we see that handles point to a control block rather than to the cell_impl object directly. A control block contains two reference counts: strong and weak. Once the strong reference count drops to zero, the control block destroys the cell_impl object. Once both reference counts drop to zero, CAF destroys the control block itself and releases all memory.

So now, we can fully explain the observed lifetimes. When terminating, the actor calls on_exit. The lifetime of the int32_t ends here. However, the inner scope still holds a strong reference to the cell. When leaving the scope, the strong reference count drops to zero and the control block destroys the cell_impl object. Because the strong reference count is zero, converting the cell_addr to an actor (a strong handle) fails. Finally, we leave caf_main and the weak reference count drops to zero as well. At that point, CAF destroys the control block.
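The bookkeeping described here can be replayed with a small model. This is a hypothetical sketch in plain C++ that follows the simplified two-counter picture from the text. It only tracks counts and logs events, it manages no real memory, and CAF's actual control block differs in detail.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical bookkeeping model of a control block.
class toy_control_block {
public:
  explicit toy_control_block(std::vector<std::string>& log) : log_(log) {
    // nop
  }

  void add_strong() { ++strong_; }
  void add_weak() { ++weak_; }

  void release_strong() {
    if (--strong_ == 0) {
      log_.push_back("destroy actor object");
      maybe_destroy_block();
    }
  }

  void release_weak() {
    if (--weak_ == 0)
      maybe_destroy_block();
  }

  // Models converting a weak handle to a strong one.
  bool try_upgrade() {
    if (strong_ == 0)
      return false;
    ++strong_;
    return true;
  }

private:
  void maybe_destroy_block() {
    if (strong_ == 0 && weak_ == 0)
      log_.push_back("destroy control block");
  }

  std::vector<std::string>& log_;
  int strong_ = 0;
  int weak_ = 0;
};

// Replays the handle operations from the lifetime example.
inline std::vector<std::string> replay_example() {
  std::vector<std::string> log;
  toy_control_block block{log};
  block.add_strong();     // the actor handle named cell
  block.add_weak();       // the actor_addr named cell_addr
  block.release_strong(); // leaving the inner scope
  log.push_back(block.try_upgrade() ? "hdl != null" : "hdl == null");
  block.release_weak();   // leaving caf_main
  return log;
}
```

The resulting log lists the actor object's destruction first, then the failed upgrade, then the destruction of the control block, in the same order as the explanation above.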

Stateful Actors

In our previous example, we managed a single int32_t with a unique_ptr. Of course, we are not going to recommend this contrived way of managing a few bits.

However, the example allowed us to introduce the notion of actors managing state explicitly and releasing that state before the actor object itself gets destroyed eventually. So now, we take our previous example one step further.

struct cell_state {
  explicit cell_state(int32_t init_value) : value(init_value)  {
    // nop
  }

  caf::behavior make_behavior() {
    return {
      [this](caf::get_atom) {
        return value;
      },
    };
  }

  int32_t value;

  static inline const char* name = "cell";
};

If we asked you to change the implementation of the previous cell_impl using the state class, you would probably come up with something like this pretty quickly:

class cell_impl_1 : public caf::event_based_actor {
public:
  using super = caf::event_based_actor;

  cell_impl_1(caf::actor_config& cfg, int32_t value) : super(cfg)  {
    state_ = std::make_unique<cell_state>(value);
  }

  caf::behavior make_behavior() override {
    return state_->make_behavior();
  }

  void on_exit() override {
    state_.reset();
  }

  const char* name() const override {
    return cell_state::name;
  }

private:
  std::unique_ptr<cell_state> state_;
};

You can save yourself some typing! This transformation is so basic, and at the same time so useful, that CAF includes a template class to make working with explicit state simple and easy:

using cell_impl_2 = caf::stateful_actor<cell_state>;

There are slight differences in the two implementations. Before going into more detail, though, let us convince ourselves that the two implementations behave in the same way by spawning one cell of each kind and then asking them for their respective value:

Source Code

void caf_main(caf::actor_system& sys) {
  auto cell1 = sys.spawn<cell_impl_1>(11);
  auto cell2 = sys.spawn<cell_impl_2>(22);
  caf::scoped_actor self{sys};
  self->send(cell1, caf::get_atom_v);
  self->receive(
    [](int32_t value) {
      println("cell 1 responded with: ", value);
    });
  self->send(cell2, caf::get_atom_v);
  self->receive(
    [](int32_t value) {
      println("cell 2 responded with: ", value);
    });
}

Output

cell 1 responded with: 11
cell 2 responded with: 22

No surprise should be hiding in there. Conceptually, both implementation types do the same thing. We recommend caf::stateful_actor over the hand-written version not only because it is more compact. There is a functional difference between the two implementations: our handwritten version allocates the state separately on the heap with make_unique, while caf::stateful_actor embeds the state into the actor object to avoid extra heap allocations.
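One way to picture an embedded state that can still be destroyed early is std::optional, which stores its value inline rather than on the heap. The sketch below is not CAF's implementation: toy_stateful is a hypothetical class in plain C++, and std::optional is a swapped-in technique chosen here for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>

// Same shape as the state class above, minus the CAF-specific behavior.
struct toy_cell_state {
  explicit toy_cell_state(std::int32_t init_value) : value(init_value) {
    // nop
  }

  std::int32_t value;
};

// Generic wrapper: constructs State in place from the constructor
// arguments and destroys it on exit, before the wrapper itself dies.
template <class State>
class toy_stateful {
public:
  template <class... Ts>
  explicit toy_stateful(Ts&&... xs)
    : state_(std::in_place, std::forward<Ts>(xs)...) {
    // nop
  }

  State& state() { return *state_; }

  bool has_state() const { return state_.has_value(); }

  // Counterpart to on_exit in the hand-written actor.
  void on_exit() { state_.reset(); }

private:
  std::optional<State> state_; // embedded storage, no heap allocation
};

using toy_stateful_cell = toy_stateful<toy_cell_state>;
```

Because the wrapper is a template, adding another kind of state requires only a new state class and a type alias.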

To have a functionally equivalent version to cell_impl_2, we need to replace the unique_ptr with an embedded state object that we can construct and destroy manually. Feel free to skip this final cell implementation if you are not familiar with placement new syntax and C++11 unrestricted unions. We include this final implementation here mostly for the sake of completeness.

class cell_impl_3 : public caf::event_based_actor {
public:
  using super = caf::event_based_actor;

  cell_impl_3(caf::actor_config& cfg, int32_t value) : super(cfg)  {
    new (&state) cell_state(value);
  }

  caf::behavior make_behavior() override {
    return state.make_behavior();
  }

  void on_exit() override {
    state.~cell_state();
  }

  const char* name() const override {
    return cell_state::name;
  }

  union {  cell_state state; };
};
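To watch the union technique at work without CAF, the following standalone sketch logs when the embedded state and its owner get constructed and destroyed. The names event_log, logging_state, toy_holder, and union_lifetime_demo are hypothetical. Unlike cell_state, the state here has a non-trivial destructor, so the owner must declare its own destructor.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <vector>

using event_log = std::vector<std::string>;

struct logging_state {
  logging_state(event_log& log, int init_value)
    : log_(log), value(init_value) {
    log_.push_back("construct state");
  }

  ~logging_state() {
    log_.push_back("destroy state");
  }

  event_log& log_;
  int value;
};

class toy_holder {
public:
  toy_holder(event_log& log, int value) : log_(log) {
    // Placement new starts the lifetime of the union member.
    new (&state) logging_state(log, value);
  }

  // Required: the union member has a non-trivial destructor, so the
  // implicit destructor of toy_holder would be deleted. This destructor
  // deliberately does not touch the state.
  ~toy_holder() {
    log_.push_back("destroy holder");
  }

  void on_exit() {
    // Explicit destructor call ends the lifetime of the union member.
    state.~logging_state();
  }

  union { logging_state state; };

private:
  event_log& log_;
};

inline event_log union_lifetime_demo() {
  event_log log;
  {
    toy_holder holder{log, 42};
    log.push_back("value is " + std::to_string(holder.state.value));
    holder.on_exit();
  } // holder gets destroyed here, its state is already gone
  return log;
}
```

Note that skipping on_exit in this sketch would mean the state's destructor never runs, so the approach relies on the owner calling it exactly once.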

By destroying the state of an actor as early as possible, we not only release resources earlier. We also avoid circular references. Consider a supervisor with a set of workers. Supervisors dispatch incoming tasks to workers, so they need to store a handle to each active worker. Each worker in turn holds a handle to the supervisor in order to report back when becoming idle.

In this design, we would have a subtle memory leak if supervisor and workers store the handles as member variables. We cannot destroy the supervisor object because the workers have a strong reference to it and we cannot destroy the worker objects because the supervisor holds a strong reference to each worker.

However, simply switching to a stateful_actor-based design breaks the cycle. Once the supervisor terminates, it destroys its state and thereby automatically releases all references to its workers.
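The same cycle can be reproduced with std::shared_ptr, which makes the effect easy to observe. This is an analogy in plain C++ with hypothetical names (toy_node, survives_without_handles), not CAF code. The on_exit member mirrors releasing the state when an actor terminates.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an actor that stores a handle to a peer.
struct toy_node {
  std::shared_ptr<toy_node> peer; // strong reference kept as "state"

  // Mirrors releasing the state when the actor terminates.
  void on_exit() { peer.reset(); }
};

// Returns true if the supervisor object survives after all external
// handles are gone, i.e., if the cycle keeps it alive.
inline bool survives_without_handles(bool release_state_on_exit) {
  std::weak_ptr<toy_node> observer;
  {
    auto supervisor = std::make_shared<toy_node>();
    auto worker = std::make_shared<toy_node>();
    supervisor->peer = worker;
    worker->peer = supervisor;
    observer = supervisor;
    if (release_state_on_exit) {
      supervisor->on_exit();
      worker->on_exit();
    }
  } // both external handles go out of scope here
  bool alive = !observer.expired();
  // Break the cycle by hand so that the demo itself does not leak.
  if (auto leaked = observer.lock())
    leaked->peer.reset();
  return alive;
}
```

Without the early release, both objects keep each other alive even though nobody else can reach them. With it, they are destroyed as soon as the last external handle disappears.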

Having the state class provide a make_behavior function or static name member variable is optional. When would you implement a state without a behavior, though? Well, you could write a function to initialize the behavior instead.

Function-based Actors

After understanding how to implement actors by deriving from event_based_actor or by providing a state class, there is only one topic left: so-called function-based actors.

We closed the last Section with a remark that we can implement state classes without providing a make_behavior function. So let us revisit our cell_state class again, but this time we omit make_behavior as well as a constructor:

struct cell_state {
  int32_t value = 0;

  static inline const char* name = "cell";
};

Of course we need to define a behavior for our cell actor eventually and we also want to initialize the state properly. However, we use a free function this time to do both and also add a new handler that allows us to set a new value.

caf::behavior cell_fun(caf::stateful_actor<cell_state>* self, int32_t value) {
  self->state.value = value;
  return {
    [self](caf::get_atom) {
      return self->state.value;
    },
    [self](caf::put_atom, int32_t new_value) {
      self->state.value = new_value;
    },
  };
}

The upside to our new approach is that we fully decouple the definition of our state from the definition of the behavior. We could define multiple functions that operate on the same state.

The function needs to access the state of the actor. Hence, we pass the this pointer for the stateful actor as an argument. The state member variable is public, so we can access self->state to get our cell_state and then we can access the member variable value.

Now, how can we put all the pieces together to make this work? We need to construct a stateful_actor with a cell_state at some point. We also would not want to hard-wire cell_fun into our actor implementation, because that would defeat the point of separating state and initialization/behavior.

But on a closer look, all we actually need to achieve is to capture a function pointer plus arguments and then delay calling the function until the actor calls make_behavior. For our caf::stateful_actor<cell_state>, the following class does precisely this:

class cell_impl : public caf::stateful_actor<cell_state> {
public:
  using super = caf::stateful_actor<cell_state>;

  using init_fun = std::function<caf::behavior()>;

  template <class Fun, class... Ts>
  cell_impl(caf::actor_config& cfg, Fun f, Ts... args) : super(cfg) {
    init_ = [init{std::move(f)},
             pack{std::make_tuple(this, std::move(args)...)}]() mutable {
      return std::apply(init, std::move(pack));
    };
  }

  caf::behavior make_behavior() override {
    return init_();
  }

private:
  init_fun init_;
};

Our constructor now takes a function f and any number of arguments args... to it. Per convention, the first argument to the function is always the this pointer. Hence, we call f(this, args...) but delay the actual function invocation until the actor calls make_behavior.
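The capture-now, call-later pattern can be isolated from CAF entirely. In the sketch below, toy_self, toy_init, and defer are hypothetical names, and the deferred function returns an int instead of a behavior so that the example stays self-contained.

```cpp
#include <cassert>
#include <functional>
#include <tuple>
#include <utility>

// Hypothetical stand-in for the actor type behind the self pointer.
struct toy_self {
  int value = 0;
};

// Example init function: by convention, the first argument is the self
// pointer and the remaining arguments come from the caller.
inline int toy_init(toy_self* self, int value) {
  self->value = value;
  return value * 2;
}

// Captures f and its arguments now, calls f(self, args...) later.
template <class Fun, class... Ts>
std::function<int()> defer(toy_self* self, Fun f, Ts... args) {
  return [init{std::move(f)},
          pack{std::make_tuple(self, std::move(args)...)}]() mutable {
    return std::apply(init, std::move(pack));
  };
}
```

Creating the deferred call has no side effects. Only invoking it runs toy_init and touches the toy_self object. Because the argument pack is moved on invocation, the result is meant to be called once, which matches make_behavior running once per actor.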

With this utility class, we may now spawn cell actors, e.g., by calling sys.spawn<cell_impl>(cell_fun, 11). But wait! This transformation once more is so basic... can CAF do that for us? But of course! All we need to do is drop the type parameter altogether: sys.spawn(cell_fun, 11).

Source Code

void caf_main(caf::actor_system& sys) {
  auto cell1 = sys.spawn<cell_impl>(cell_fun, 11);
  auto cell2 = sys.spawn(cell_fun, 22);
  caf::scoped_actor self{sys};
  // Read the initial values.
  self->send(cell1, caf::get_atom_v);
  self->receive(
    [](int32_t value) {
      println("cell 1 responded with: ", value);
    });
  self->send(cell2, caf::get_atom_v);
  self->receive(
    [](int32_t value) {
      println("cell 2 responded with: ", value);
    });
  // Set and retrieve new values.
  self->send(cell1, caf::put_atom_v, int32_t{123});
  self->send(cell1, caf::get_atom_v);
  self->receive(
    [](int32_t value) {
      println("cell 1 responded with: ", value);
    });
  self->send(cell2, caf::put_atom_v, int32_t{321});
  self->send(cell2, caf::get_atom_v);
  self->receive(
    [](int32_t value) {
      println("cell 2 responded with: ", value);
    });
}

Output

cell 1 responded with: 11
cell 2 responded with: 22
cell 1 responded with: 123
cell 2 responded with: 321

As we can see, function-based actors are a pure convenience feature. Covering the actual implementation in CAF would require its own article because it relies heavily on template metaprogramming. Conceptually, though, CAF only generalizes our cell_impl actor to allow:

  • Functions that return void.
  • Functions that return behavior.
  • Functions that return typed_behavior<...>.

Additionally, CAF makes it optional whether to pass in a self pointer as first argument or not and, if present, derives the proper actor type from the self pointer.
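How CAF detects an optional self pointer is not covered here. As a rough idea of what such a dispatch could look like, the following sketch uses C++17 if constexpr together with std::is_invocable_v. All names (toy_self, with_self, without_self, call_init) are hypothetical and the approach is an assumption, not a description of CAF internals.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the actor type behind the self pointer.
struct toy_self {
  int value = 0;
};

// Init function that wants the self pointer as first argument.
inline int with_self(toy_self* self, int x) {
  self->value = x;
  return x;
}

// Init function that does not care about the self pointer.
inline int without_self(int x) {
  return x + 1;
}

// Passes self along only if f accepts it as first argument.
template <class Fun, class... Ts>
auto call_init(toy_self* self, Fun f, Ts&&... xs) {
  if constexpr (std::is_invocable_v<Fun, toy_self*, Ts...>) {
    return f(self, std::forward<Ts>(xs)...);
  } else {
    static_cast<void>(self); // unused in this branch
    return f(std::forward<Ts>(xs)...);
  }
}
```

Both kinds of functions go through the same entry point, and the choice between the two call forms is made at compile time.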

Next Up

We hope part 1 leaves you with a solid grasp of class-based, stateful, and function-based actors as well as insights into the internal vs. external view and object lifetimes in CAF.

In our next part on implementing actors, we revisit what we have learned here and add one additional layer: static type checking via typed_actor<...>. We also extend our discussion on stateful actors and showcase how we can leverage state classes to compose actor implementations.
