Welcome to our archive of CAF Gems: bite-sized tips & tricks!

We update this site regularly. Did we miss an important topic? Do you have ideas on how we could improve this site or individual articles? Questions on any of the topics covered here? Please share your thoughts with us by sending an email to feedback@cafcademy.com!

Dominik Charousset 5/28/21

Typed Message Views

A message in CAF really is just a type-erased tuple for passing values around. At some point, however, we need to restore the type information to access the stored values. Typed message views are a convenient way to do just that.

For the most part, users can write CAF applications without knowing about message. When sending messages between actors, CAF automatically wraps the content for the message into this type-erased container and then matches each incoming message against the message handlers.

However, there are use cases for message outside of the dispatching logic in CAF. For example, actors can store messages that they currently don't want to (or cannot) process in some cache to process them later. Or an application could read messages from custom data sources and then dispatch on their run-time type.

For such use cases, CAF includes typed message views. On construction, a view performs the necessary type checking. On a match, users can then query individual elements from the view via get<Index>(x), much like the interface offered by std::tuple.

Internally, messages are copy-on-write tuples. Hence, there are two flavors of typed message views: const views that represent read-only access and views with mutable access. The latter forces the message to perform a deep copy of its data if there is more than one reference to the data. To cut a long story short, we recommend sticking to the const version whenever possible.

For both view types, the simplest way of using them is to construct a view object and then check whether it is valid. The const version is called const_typed_message_view:

Source Code

auto msg1 = make_message("hello", "world");
auto msg2 = msg1;
if (auto v = make_const_typed_message_view<std::string, std::string>(msg1))
  println("v: ", get<0>(v), ", ", get<1>(v));
// Both messages still point to the same data.
assert(msg1.cptr() == msg2.cptr());

Output

v: hello, world

The mutable version is simply called typed_message_view, but otherwise has a similar interface:

Source Code

auto msg1 = make_message("hello", "world");
auto msg2 = msg1;
if (auto v = make_typed_message_view<std::string, std::string>(msg1)) {
  get<0>(v) = "bye";
  println("v: ", get<0>(v), ", ", get<1>(v));
}
// The messages no longer point to the same data.
assert(msg1.cptr() != msg2.cptr());

Output

v: bye, world
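
To make the type check on construction more tangible, here is a minimal sketch in plain C++ that does not use CAF at all. The names erased_tuple, const_view, and describe are hypothetical stand-ins invented for this illustration (a std::vector<std::any> plays the role of the type-erased message), and CAF's actual implementation works differently. The sketch only shows the principle: check size and element types once, then grant typed access by index.

```cpp
#include <any>
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <typeinfo>
#include <utility>
#include <vector>

// Hypothetical stand-in for a type-erased message; not caf::message.
using erased_tuple = std::vector<std::any>;

// Returns true if element Is of xs holds a value of type Ts, for each pair.
template <class... Ts, std::size_t... Is>
bool types_match(const erased_tuple& xs, std::index_sequence<Is...>) {
  return ((xs[Is].type() == typeid(Ts)) && ...);
}

// Hypothetical read-only view: valid only if size and all types match.
template <class... Ts>
class const_view {
public:
  explicit const_view(const erased_tuple& xs) {
    // The size check runs first, so types_match never reads out of bounds.
    if (xs.size() == sizeof...(Ts)
        && types_match<Ts...>(xs, std::index_sequence_for<Ts...>{}))
      ptr_ = &xs;
  }

  explicit operator bool() const {
    return ptr_ != nullptr;
  }

  // Typed access by index; call only on a valid view.
  template <std::size_t I>
  const auto& at() const {
    using T = std::tuple_element_t<I, std::tuple<Ts...>>;
    return *std::any_cast<T>(&(*ptr_)[I]);
  }

private:
  const erased_tuple* ptr_ = nullptr;
};

// Usage: joins two strings on a match and reports a mismatch otherwise.
std::string describe(const erased_tuple& msg) {
  if (auto v = const_view<std::string, std::string>{msg})
    return v.at<0>() + ", " + v.at<1>();
  return "no match";
}
```

Because the view only stores a pointer to the original container, it never copies the elements, which loosely mirrors why the const flavor is the cheaper choice.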

Dominik Charousset 4/30/21

Message Builders

Sometimes, actors need to assemble messages incrementally and cannot provide all values at once for make_message. With message builders, CAF offers a convenient tool to collect some values for converting them to a message later.

The general workflow for message builders is to call append until all values have been added and then call move_to_message to turn the collection of values into an actual message, as shown below.

Source Code

message_builder mb;
for (int32_t i = 1; i <= 8; i *= 2)
  mb.append(i);
println("result: ", mb.move_to_message());

Output

result: message(1, 2, 4, 8)

Calling move_to_message leaves the builder object in a moved-from state and we may only destroy it after this point. To leave the builder in a state where we can still add more values to it, we can call to_message instead as shown below.

Source Code

std::string hw = "hello world";
message_builder mb;
mb.append(hw.substr(0, 5));
println("result 1: ", mb.to_message());
mb.append(hw.substr(6, 5));
println("result 2: ", mb.move_to_message());

Output

result 1: message("hello")
result 2: message("hello", "world")

Using to_message means that CAF needs to copy each value into the new message instead of moving them. The builder can also copy values from another CAF message:

Source Code

auto msg = make_message("hello", "world", "goodbye");
auto vec = std::vector<int32_t>{1, 2, 3};
message_builder mb;
mb.append_from(msg, 0, 2) // starting at index 0, copying two elements
  .append(vec.begin(), vec.end());
println("result: ", mb.move_to_message());

Output

result: message("hello", "world", 1, 2, 3)

While message builders offer great flexibility, they do come with some performance overhead, mostly due to the extra heap allocations. So we still recommend sticking to make_message whenever feasible.
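
The difference between copying and moving values out of a builder can be sketched in plain C++ without CAF. The class any_builder and its member functions to_vector and move_to_vector are hypothetical names invented for this illustration; they only mimic the copy-versus-move distinction described above and say nothing about how message_builder is implemented.

```cpp
#include <any>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical builder over std::any values; not caf::message_builder.
class any_builder {
public:
  // Stores one more value and returns the builder to allow chaining.
  template <class T>
  any_builder& append(T value) {
    elements_.emplace_back(std::move(value));
    return *this;
  }

  // Copies every element; the builder remains usable afterwards.
  std::vector<std::any> to_vector() const {
    return elements_;
  }

  // Hands over the storage without copying elements. Afterwards, the
  // builder is in a moved-from state and should only be destroyed.
  std::vector<std::any> move_to_vector() {
    return std::move(elements_);
  }

  std::size_t size() const {
    return elements_.size();
  }

private:
  std::vector<std::any> elements_;
};
```

In this sketch, to_vector pays for one copy per element each time it is called, whereas move_to_vector only transfers ownership of the existing storage.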

Dominik Charousset 4/23/21

Copy-on-write Tuples

Copy-on-write (COW) tuples make passing values around cheap while also making sure that the system copies the actual data only when necessary.

CAF uses copy-on-write for its message type, which basically is a type-erased tuple. This allows actors to send the same message to multiple recipients (usually by sending it to a publish/subscribe group) without storing the content in memory multiple times. CAF also includes cow_tuple, which is a std::tuple-like type to wrap potentially expensive data (like strings and lists) into a single unit that actors can pass around cheaply.

Here is a small example to illustrate the API:

Source Code

auto xs = caf::make_cow_tuple(1, 2, 3);
auto ys = xs;
println("After initializing:");
println("xs: ", xs, " (", std::addressof(xs.data()), ")");
println("ys: ", ys, " (", std::addressof(ys.data()), ")");
ys.unshared() = std::tuple{10, 20, 30};
println("After assigning to ys:");
println("xs: ", xs, " (", std::addressof(xs.data()), ")");
println("ys: ", ys, " (", std::addressof(ys.data()), ")");

Output

After initializing:
xs: [1, 2, 3] (0x7fb091405ae0)
ys: [1, 2, 3] (0x7fb091405ae0)
After assigning to ys:
xs: [1, 2, 3] (0x7fb091405ae0)
ys: [10, 20, 30] (0x7fb091405b00)

By default, cow_tuple only grants const access to its elements. With get<I>(xs), users can get access to a single value of xs at the index I. With xs.data(), users get a const reference to the internally stored std::tuple.

In order to gain mutable access, users may call unshared() to get a mutable reference to the std::tuple. This function makes a deep copy of the data if there is more than one reference to the data at the moment. In our example above, ys initially points to the same data as xs. After calling unshared() on ys, however, the two tuples point to different data.
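
The copy-on-write mechanism itself fits into a few lines of plain C++. The class cow_sketch below is a hypothetical illustration built on std::shared_ptr, not the implementation of caf::cow_tuple. It also assumes single-threaded use, since use_count is not a reliable signal when other threads may copy or release the pointer concurrently.

```cpp
#include <cassert>
#include <memory>
#include <tuple>
#include <utility>

// Hypothetical copy-on-write wrapper; a sketch, not caf::cow_tuple.
template <class... Ts>
class cow_sketch {
public:
  explicit cow_sketch(Ts... xs)
    : ptr_(std::make_shared<std::tuple<Ts...>>(std::move(xs)...)) {
  }

  // Read-only access never copies the data.
  const std::tuple<Ts...>& data() const {
    return *ptr_;
  }

  // Mutable access: detach from other owners first if the data is shared.
  std::tuple<Ts...>& unshared() {
    if (ptr_.use_count() > 1)
      ptr_ = std::make_shared<std::tuple<Ts...>>(*ptr_);
    return *ptr_;
  }

private:
  std::shared_ptr<std::tuple<Ts...>> ptr_;
};
```

Copying a cow_sketch only copies the shared pointer. The deep copy happens at the first call to unshared() on an object that still shares its data, and subsequent calls on the same object find a single owner and skip the copy.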

Dominik Charousset 4/2/21

JSON Serialization

Did you know that CAF can generate (and parse) JSON for any inspectable type?

Consider this simple data type for representing a user with a numerical ID, a user name and an optional email address:

#include "caf/json_reader.hpp"
#include "caf/json_writer.hpp"

struct user {
  uint32_t id;
  std::string name;
  std::optional<std::string> email;
};

template <class Inspector>
bool inspect(Inspector& f, user& x) {
  return f.object(x).fields(f.field("id", x.id),
                            f.field("name", x.name),
                            f.field("email", x.email));
}

CAF_BEGIN_TYPE_ID_BLOCK(example_app, caf::first_custom_type_id)

  CAF_ADD_TYPE_ID(example_app, (user))

CAF_END_TYPE_ID_BLOCK(example_app)

The setup above really just follows the basic template for enabling CAF type inspection: provide an inspect overload and assign a type ID. To generate some JSON, all we need to do is pass a user object to a json_writer!

The JSON writer comes with a couple of configuration options. However, passing an object to the inspector always uses apply. For our example, we factor out this step to a utility function that throws an exception on errors and otherwise prints the generated JSON to the terminal:

template <class T>
void json_println(caf::json_writer& writer, const T& obj) {
  if (!writer.apply(obj)) {
    std::cerr << "failed to generate JSON output: "
              << to_string(writer.get_error()) << '\n';
    throw std::logic_error("failed to generate JSON");
  }
  println(writer.str());
  writer.reset();
}

Two notes on this function: str returns a string_view to an internal buffer, and we must rewind the buffer (and the state) by calling reset before applying another object.

With this utility in place, we can print users in JSON format. The writer API has two configuration options: indentation and skip_empty_fields. The former is a numerical value that tells CAF whether it should break after each value and how much indentation it should add on each level of nesting. Setting this to 0 (the default) disables indentation and results in a compact single-line output. Setting skip_empty_fields to true (the default) tells CAF to omit missing fields completely. Otherwise, CAF includes the field in the output and assigns null to it as shown in example (c) below.

Source Code

auto john = user{1234, "John Doe", std::nullopt};
caf::json_writer writer;

println("(a): compact output");
json_println(writer, john);

println("\n(b): indentation = 2");
writer.indentation(2);
json_println(writer, john);

println("\n(c): indentation = 2 && skip_empty_fields = false");
writer.skip_empty_fields(false);
json_println(writer, john);

Output

(a): compact output
{"@type": "user", "id": 1234, "name": "John Doe"}

(b): indentation = 2
{
  "@type": "user",
  "id": 1234,
  "name": "John Doe"
}

(c): indentation = 2 && skip_empty_fields = false
{
  "@type": "user",
  "id": 1234,
  "name": "John Doe",
  "email": null
}

CAF can convert any inspectable type to JSON. This also applies to caf::message, which really is just a tuple. Since JSON has no notion of tuples, the writer outputs CAF messages as lists:

Source Code

auto msg = caf::make_message(user{1234, "John Doe", std::nullopt},
                             user{2345, "Jane Doe", "jane@doe.public"});
caf::json_writer writer;
writer.indentation(2);
json_println(writer, msg);

Output

[
  {
    "@type": "user",
    "id": 1234,
    "name": "John Doe"
  },
  {
    "@type": "user",
    "id": 2345,
    "name": "Jane Doe",
    "email": "jane@doe.public"
  }
]

One last thing. As you can see, CAF also adds an @type annotation with the C++ class name. The main reason for this inclusion is to enable CAF to deserialize its own JSON output again! For this, CAF includes the class json_reader:

Source Code

caf::message msg;
caf::json_reader reader;
// Step 1: parse JSON to an internal buffer.
if (!reader.load(R"([{"@type": "user", "id": 1234, "name": "John Doe"}])")) {
  std::cerr << "failed to parse JSON input: "
            << to_string(reader.get_error()) << '\n';
  throw std::logic_error("failed to parse JSON");
}
// Step 2: try to deserialize a message from the parsed JSON input.
if (reader.apply(msg)) {
  println("parsed JSON: ", msg);
} else {
  println("failed to parse JSON: ", reader.get_error());
}

Output

parsed JSON: message(user(1234, "John Doe", null))

Dominik Charousset 3/18/21

Telemetry Timers

When instrumenting code, timers offer a convenient way for measuring the duration of individual operations.

The metrics API in CAF includes histograms for sampling observations over time. For example, how long it takes to handle incoming requests or to perform some expensive operations.

Sampling time manually is quite tedious, though, as illustrated by this snippet:

caf::telemetry::dbl_histogram* my_histogram = nullptr;
// ... some place later ...
auto t0 = std::chrono::steady_clock::now();
// ... expensive operation ...
auto delta = std::chrono::steady_clock::now() - t0;
// ... convert delta to fractional seconds and pass to my_histogram ...

To automate this process, CAF includes timers. They simply store the current timestamp when created and pass the elapsed time since construction to a histogram when destroyed. Hence, we can replace the verbose version from before simply by putting a timer into the scope of the expensive operation and taking advantage of RAII:

caf::telemetry::dbl_histogram* my_histogram = nullptr;
// ... some place later ...
{
  auto t = caf::telemetry::timer{my_histogram};
  // ... expensive operation ...
}

By the way, passing a null pointer to the constructor of timer is perfectly fine. This accounts for the fact that some metrics may be disabled by default.
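
For readers curious about what such a timer does under the hood, here is a minimal RAII sketch in plain C++. The types sample_sink and scoped_timer are hypothetical and invented for this illustration; sample_sink merely stands in for a histogram, and none of this is the actual code of caf::telemetry::timer. The sketch shows the conversion to fractional seconds and the null-pointer check.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical sink for observations; stands in for a histogram.
struct sample_sink {
  std::vector<double> samples; // observed durations in seconds

  void observe(double seconds) {
    samples.push_back(seconds);
  }
};

// Hypothetical RAII timer; a sketch, not caf::telemetry::timer.
class scoped_timer {
public:
  using clock = std::chrono::steady_clock;

  explicit scoped_timer(sample_sink* sink)
    : sink_(sink), start_(clock::now()) {
  }

  scoped_timer(const scoped_timer&) = delete;
  scoped_timer& operator=(const scoped_timer&) = delete;

  ~scoped_timer() {
    if (sink_ == nullptr)
      return; // metric disabled: nothing to record
    // duration<double> defaults to a period of one second, so count()
    // yields fractional seconds.
    std::chrono::duration<double> elapsed = clock::now() - start_;
    sink_->observe(elapsed.count());
  }

private:
  sample_sink* sink_;
  clock::time_point start_;
};
```

Deleting the copy operations keeps each timer tied to exactly one scope, so every measured region produces at most one sample.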

In a more complete example, a worker that samples expensive operations may follow this template:

struct worker_state {
  explicit worker_state(caf::event_based_actor* self) {
    std::array<double, 6> default_time_buckets{{
      0.00001, //  10us
      0.0001,  // 100us
      0.001,   //   1ms
      0.01,    //  10ms
      0.1,     // 100ms
      1.,      //   1s
    }};
    processing_time = self->system().metrics().histogram_singleton<double>(
      "my-app", "processing-time", default_time_buckets,
      "Time the app needs to perform the expensive computation.", "seconds");
  }

  int32_t expensive_computation() {
    auto t = caf::telemetry::timer{processing_time};
    auto result = int32_t{42};
    // ... expensive number crunching on result ...
    return result;
  }

  caf::behavior make_behavior() {
    return {
      [this](caf::get_atom) {
        return expensive_computation();
      },
    };
  }

  caf::telemetry::dbl_histogram* processing_time;

  static inline const char* name = "worker";
};

After spinning up a worker and exporting your metrics to Prometheus, you can query the custom metric as my_app_processing_time_seconds.

Dominik Charousset 3/3/21

Indenting Trace Logs

Have you ever looked at CAF trace logs before? Sometimes, reproducing a bug with logging enabled is the best way to get to the bottom of unexpected system behavior.

Trace logging of course generates far too much data to run in production. Even for short runs, you are probably looking at mega- or even gigabytes of text output.

During development, however, skimming through logs can often save a lot of time. To make it easier to focus on a single actor, CAF includes a small Python script called indent_trace_log.py in the repository.

But before we look at the script itself, we briefly discuss a few basics and implement a small example to generate some log output to work with.

Trace logging is disabled by default, and not just at run time: users must explicitly enable trace logging when building CAF. When building CAF using the configure script, you can pass --log-level=trace. When building with CMake directly, set CAF_LOG_LEVEL:STRING=TRACE for the build.

To have some actor we can analyze, we implement a simple cell actor that holds an integer value that we can retrieve, set or add to:

struct cell_state {
  explicit cell_state(int32_t init_value) : value(init_value)  {
    // nop
  }

  caf::behavior make_behavior() {
    return {
      [this](caf::get_atom) {
        return value;
      },
      [this](caf::put_atom, int32_t new_value) {
        value = new_value;
      },
      [this](caf::add_atom, int32_t amount) {
        value += amount;
        return value;
      },
    };
  }

  int32_t value;

  static inline const char* name = "cell";
};

using cell_impl = caf::stateful_actor<cell_state>;

In our main, we simply spawn a cell actor with some initial value, then add to it and print the result.

void caf_main(caf::actor_system& sys) {
  using namespace std::literals;
  auto cell = sys.spawn<cell_impl>(11);
  caf::scoped_actor self{sys};
  self->request(cell, 1s, caf::add_atom_v, 9).receive(
    [](int32_t new_value) {
      println("cell responded with: ", new_value);
    },
    [](const caf::error& err) {
      println("cell failed to respond: ", err);
    });
}

Sure enough, the program prints cell responded with: 20. But how can we find our cell actor in the trace log output? And how do we get a trace log in the first place?

To get CAF to create a trace log file, the quickest way is to pass a verbosity level to the logger on the command line:

./example --caf.logger.file.verbosity=trace --caf.logger.file.path=out.log

This should put an out.log file into the current directory with a couple hundred lines of text. Setting a path is not necessary; we use it here to make sure we know the file name in advance.

Now, how to get to our cell actor? You may use more advanced tools, but for now we rely on good ol' grep:

$ grep 'NAME = cell' out.log
... SPAWN ; ID = 6 ; NAME = cell ; TYPE = ...

As we can see in the truncated output, the quickest way to find actors of a particular kind is to look for the name we have assigned to them. Our cell state has the static member variable name, and whatever we assign to it is the string we can start looking for.

CAF logs each spawn event with a line that contains the ID, the name, the C++ type, constructor arguments, node ID and initial group memberships. The interesting bit to us now is the ID, because this is where our gem indent_trace_log.py comes into play. We can use the script to print out only log entries for a particular actor and to indent the output based on entry/exit events by passing -i <ID> before the file name:

$ indent_trace_log.py -i 6 out.log

If you are like us and find yourself working on the command line more often than not, give the script a try. Combined with a bit of syntax highlighting in your favorite command line editor, it can go a long way.

Dominik Charousset 2/17/21

Hashing

Did you know that CAF can generate hash values for any inspectable type out of the box?

Consider this simple POD type with an inspect overload:

struct point_3d {
  int32_t x;
  int32_t y;
  int32_t z;
};

template<class Inspector>
bool inspect(Inspector& f, point_3d& point) {
  return f.object(point).fields(f.field("x", point.x),
                                f.field("y", point.y),
                                f.field("z", point.z));
}

The common algorithm of choice for generating hash values is the FNV algorithm, which is designed for hash tables and is fast. Because this algorithm is so common, CAF ships an inspector that implements it: caf::hash::fnv (include caf/hash/fnv.hpp).

The FNV algorithm uses different constants (an offset basis as the initial value and a prime as the multiplier) depending on whether you are generating a 32-bit hash value or a 64-bit hash value. In CAF, you choose these constants implicitly by instantiating the inspector with uint32_t, uint64_t, or size_t.
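
To show where the width-dependent constants enter the computation, here is a standalone sketch of the FNV-1a variant over raw bytes, using the published 32-bit and 64-bit parameters. The function names fnv1a_32 and fnv1a_64 are made up for this illustration. The sketch hashes the characters of a string and makes no claim about which FNV variant CAF uses or how CAF feeds the fields of an inspected object into the hash, so its results are not expected to match the output of caf::hash::fnv.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// 32-bit FNV-1a: start at the offset basis, then XOR and multiply per byte.
constexpr std::uint32_t fnv1a_32(std::string_view bytes) {
  std::uint32_t hash = 0x811C9DC5u; // 32-bit offset basis
  for (char c : bytes) {
    hash ^= static_cast<std::uint8_t>(c);
    hash *= 0x01000193u; // 32-bit FNV prime
  }
  return hash;
}

// 64-bit FNV-1a: same loop with the 64-bit offset basis and prime.
constexpr std::uint64_t fnv1a_64(std::string_view bytes) {
  std::uint64_t hash = 0xCBF29CE484222325ull; // 64-bit offset basis
  for (char c : bytes) {
    hash ^= static_cast<std::uint8_t>(c);
    hash *= 0x100000001B3ull; // 64-bit FNV prime
  }
  return hash;
}
```

Hashing the empty string returns the offset basis unchanged, which makes for a quick sanity check of either function.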

Because applying any value to the inspector always results in an integer value, caf::hash::fnv has a static member function called compute that takes any number of inspectable values. This means generating a hash boils down to a one-liner!

It also makes it very convenient to specialize std::hash. For our point_3d, the implementation boils down to this:

namespace std {

template <>
struct hash<point_3d> {
  size_t operator()(const point_3d& point) const noexcept {
    return caf::hash::fnv<size_t>::compute(point);
  }
};

} // namespace std

Under the hood, the inspector uses the inspect overload to traverse the object. Hash inspectors ignore type names, field names, etc. So passing a point_3d to the inspector produces the same result as passing x, y and z individually:

Source Code

using hasher = caf::hash::fnv<uint32_t>;
println("hash of (1, 2, 3): ", hasher::compute(1, 2, 3));
println("hash of point_3d(1, 2, 3): ", hasher::compute(point_3d{1, 2, 3}));

Output

hash of (1, 2, 3): 2034659765
hash of point_3d(1, 2, 3): 2034659765