Welcome to our archive of CAF Gems: bite-sized tips & tricks!
We update this site regularly. Did we miss an important topic? Do you have ideas how we could improve this site or individual articles? Questions on any of the topics covered here? Please share your thoughts with us by sending an email to feedback@cafcademy.com!
Typed Message Views
A message in CAF really is just a type-erased tuple for passing values around. At some point, however, we need to restore the type information to access the stored values. Typed message views are a convenient way to do just that.
For the most part, users can write CAF applications without knowing about the message type. When sending messages between actors, CAF automatically wraps the content of the message into this type-erased container and then matches each incoming message against the message handlers.
However, there are use cases for message outside of the dispatching logic in CAF. For example, actors can store messages that they currently don't want to (or cannot) process in some cache for processing them later. Or an application could read messages from custom data sources and then dispatch on their run-time type.
For such use cases, CAF includes typed message views. On construction, a view performs the necessary type checking. On a match, users can then query individual elements from the view via get<Index>(x), much like the interface offered by std::tuple.
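To make the idea of checking types on construction more concrete, here is a minimal sketch that uses only the C++17 standard library. It does not use CAF: erased_tuple and const_view are hypothetical names invented for this illustration, and std::any merely stands in for type-erased storage. The sketch only mirrors the pattern of constructing a view, testing it for validity and then reading elements by index.

```cpp
#include <any>
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for a type-erased tuple; this is not CAF's message.
using erased_tuple = std::vector<std::any>;

// Read-only view (hypothetical): checks arity and element types once, on
// construction, and afterwards hands out typed references.
template <class... Ts>
class const_view {
public:
  explicit const_view(const erased_tuple& xs) {
    if (xs.size() == sizeof...(Ts))
      init(xs, std::index_sequence_for<Ts...>{});
  }

  // True only if the arity and all element types matched.
  explicit operator bool() const noexcept { return valid_; }

  // Typed access by index; only call this on a valid view.
  template <std::size_t I>
  const auto& at() const {
    return *std::get<I>(ptrs_);
  }

private:
  template <std::size_t... Is>
  void init(const erased_tuple& xs, std::index_sequence<Is...>) {
    // The pointer overload of std::any_cast yields nullptr on a mismatch.
    ptrs_ = std::make_tuple(std::any_cast<Ts>(&xs[Is])...);
    valid_ = ((std::get<Is>(ptrs_) != nullptr) && ...);
  }

  std::tuple<const Ts*...> ptrs_{};
  bool valid_ = false;
};

// Usage: a view with matching types is valid, a mismatching one is not.
inline void example() {
  erased_tuple msg{std::string{"hello"}, std::string{"world"}};
  const_view<std::string, std::string> good{msg};
  const_view<int, std::string> bad{msg};
  assert(good && !bad);
  assert(good.at<0>() == "hello");
}
```

Because the check happens once in the constructor, element access afterwards is a plain pointer dereference with no further type tests.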
Internally, messages are copy-on-write tuples. Hence, there are two flavors of typed message views: const views that represent read-only access and views with mutable access. The latter forces the message to perform a deep copy of its data if there is more than one reference to the data. To cut a long story short, we recommend sticking to the const version whenever possible.
For both view types, the simplest way of using them is to construct a view object and then check whether it is valid. The const version is called const_typed_message_view:
Source Code
auto msg1 = make_message("hello", "world");
auto msg2 = msg1;
if (auto v = make_const_typed_message_view<std::string, std::string>(msg1))
  println("v: ", get<0>(v), ", ", get<1>(v));
// Both messages still point to the same data.
assert(msg1.cptr() == msg2.cptr());
Output
v: hello, world
The mutable version is simply called typed_message_view, but otherwise has a similar interface:
Source Code
auto msg1 = make_message("hello", "world");
auto msg2 = msg1;
if (auto v = make_typed_message_view<std::string, std::string>(msg1)) {
  get<0>(v) = "bye";
  println("v: ", get<0>(v), ", ", get<1>(v));
}
// The messages no longer point to the same data.
assert(msg1.cptr() != msg2.cptr());
Output
v: bye, world
Message Builders
Sometimes, actors need to assemble messages incrementally and cannot provide all values at once for make_message. With message builders, CAF offers a convenient tool to collect some values for converting them to a message later.
The general workflow for message builders is calling append until all values have been added and then calling move_to_message to turn the collection of values into an actual message, as shown below.
Source Code
message_builder mb;
for (int32_t i = 1; i <= 8; i *= 2)
  mb.append(i);
println("result: ", mb.move_to_message());
Output
result: message(1, 2, 4, 8)
Calling move_to_message leaves the builder object in a moved-from state and we may only destroy it after this point. To leave the builder in a state where we can still add more values to it, we can call to_message instead, as shown below.
Source Code
std::string hw = "hello world";
message_builder mb;
mb.append(hw.substr(0, 5));
println("result 1: ", mb.to_message());
mb.append(hw.substr(6, 5));
println("result 2: ", mb.move_to_message());
Output
result 1: message("hello")
result 2: message("hello", "world")
Using to_message means that CAF needs to copy each value into the new message instead of moving them. The builder can also copy values from another CAF message:
Source Code
auto msg = make_message("hello", "world", "goodbye");
auto vec = std::vector<int32_t>{1, 2, 3};
message_builder mb;
mb.append_from(msg, 0, 2) // starting at index 0, copying two elements
  .append(vec.begin(), vec.end());
println("result: ", mb.move_to_message());
Output
result: message("hello", "world", 1, 2, 3)
While message builders offer great flexibility, they do come with some performance overhead, mostly due to the extra heap allocations. So we still recommend sticking to make_message whenever feasible.
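The trade-off between copying and moving the collected values can be illustrated with a small standard-library sketch. It is not CAF code: tuple_builder, to_tuple and move_to_tuple are hypothetical names that loosely mirror the builder interface described above, with a std::vector of std::any standing in for the message.

```cpp
#include <any>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a type-erased message; this is not CAF's message.
using erased_tuple = std::vector<std::any>;

// Hypothetical builder that collects values one at a time.
class tuple_builder {
public:
  // Stores one more value and returns *this to allow chaining.
  template <class T>
  tuple_builder& append(T value) {
    values_.emplace_back(std::move(value));
    return *this;
  }

  // Copies every collected value; the builder stays usable afterwards.
  erased_tuple to_tuple() const { return values_; }

  // Hands over the collected values without copying them. The builder
  // should not be used for anything but destruction afterwards.
  erased_tuple move_to_tuple() { return std::move(values_); }

  std::size_t size() const noexcept { return values_.size(); }

private:
  erased_tuple values_;
};

// Usage: take a snapshot first, keep appending, then move the final result.
inline void example() {
  tuple_builder mb;
  mb.append(std::string{"hello"});
  auto snapshot = mb.to_tuple(); // copies "hello"
  mb.append(std::string{"world"});
  auto result = mb.move_to_tuple(); // no copies
  assert(snapshot.size() == 1);
  assert(result.size() == 2);
}
```

Each std::any in this sketch may allocate on the heap for larger values, which loosely corresponds to the allocation overhead mentioned above.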
Copy-on-write Tuples
Copy-on-write (COW) tuples make passing values around cheap while also making sure that the system copies the actual data only when necessary.
CAF uses copy-on-write for its message type, which basically is a type-erased tuple. This allows actors to send the same message to multiple recipients (usually by sending it to a publish/subscribe group) without storing the content in memory multiple times. CAF also includes cow_tuple, which is a std::tuple-like type to wrap potentially expensive data (like strings and lists) into a single unit that actors can pass around cheaply.
Here is a small example to illustrate the API:
Source Code
auto xs = caf::make_cow_tuple(1, 2, 3);
auto ys = xs;
println("After initializing:");
println("xs: ", xs, " (", std::addressof(xs.data()), ")");
println("ys: ", ys, " (", std::addressof(ys.data()), ")");
ys.unshared() = std::tuple{10, 20, 30};
println("After assigning to ys:");
println("xs: ", xs, " (", std::addressof(xs.data()), ")");
println("ys: ", ys, " (", std::addressof(ys.data()), ")");
Output
After initializing:
xs: [1, 2, 3] (0x7fb091405ae0)
ys: [1, 2, 3] (0x7fb091405ae0)
After assigning to ys:
xs: [1, 2, 3] (0x7fb091405ae0)
ys: [10, 20, 30] (0x7fb091405b00)
By default, cow_tuple only grants const access to its elements. With get<I>(xs), users can get access to a single value of xs at the index I. With xs.data(), users get a const reference to the internally stored std::tuple.
In order to gain mutable access, users may call unshared() to get a mutable reference to the std::tuple. This function makes a deep copy of the data if there is more than one reference to the data at the moment. In our example above, ys initially points to the same data as xs. After calling unshared() on ys, however, the two tuples point to different data.
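For readers who want to see the mechanism behind this behavior, the following sketch shows one way to build copy-on-write semantics on top of std::shared_ptr. It is an illustration only and not how cow_tuple is implemented: cow_sketch is a hypothetical class, and relying on use_count() as done here is only reasonable in single-threaded code.

```cpp
#include <cassert>
#include <memory>
#include <tuple>
#include <utility>

// Hypothetical minimal copy-on-write wrapper; not the actual caf::cow_tuple.
template <class... Ts>
class cow_sketch {
public:
  using tuple_type = std::tuple<Ts...>;

  explicit cow_sketch(Ts... xs)
    : ptr_(std::make_shared<tuple_type>(std::move(xs)...)) {
  }

  // Read-only access never copies.
  const tuple_type& data() const noexcept { return *ptr_; }

  // Mutable access: clone the data first if another handle shares it.
  tuple_type& unshared() {
    if (ptr_.use_count() > 1)
      ptr_ = std::make_shared<tuple_type>(*ptr_);
    return *ptr_;
  }

private:
  std::shared_ptr<tuple_type> ptr_;
};

// Usage: copies share storage until one of them asks for mutable access.
inline void example() {
  cow_sketch<int, int, int> xs{1, 2, 3};
  auto ys = xs; // cheap: only bumps the reference count
  assert(&xs.data() == &ys.data());
  ys.unshared() = std::make_tuple(10, 20, 30); // detaches ys from xs
  assert(&xs.data() != &ys.data());
  assert(std::get<0>(xs.data()) == 1);
}
```

Copying a handle only touches the reference count, so the cost of the deep copy is paid at most once and only by the handle that actually writes.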
JSON Serialization
Did you know that CAF can generate (and parse) JSON for any inspectable type?
Consider this simple data type for representing a user with a numerical ID, a user name and an optional email address:
#include "caf/json_reader.hpp"
#include "caf/json_writer.hpp"
struct user {
  uint32_t id;
  std::string name;
  std::optional<std::string> email;
};
template <class Inspector>
bool inspect(Inspector& f, user& x) {
  return f.object(x).fields(f.field("id", x.id),
                            f.field("name", x.name),
                            f.field("email", x.email));
}
CAF_BEGIN_TYPE_ID_BLOCK(example_app, caf::first_custom_type_id)
  CAF_ADD_TYPE_ID(example_app, (user))
CAF_END_TYPE_ID_BLOCK(example_app)
The setup above really just follows the basic template for enabling CAF type inspection: provide an inspect overload and assign a type ID. To generate some JSON, all we need to do is pass a user object to a json_writer!
The JSON writer comes with a couple of configuration options. However, passing an object to the inspector always uses apply. For our example, we factor out this step to a utility function that throws an exception on errors and otherwise prints the generated JSON to the terminal:
template <class T>
void json_println(caf::json_writer& writer, const T& obj) {
  if (!writer.apply(obj)) {
    std::cerr << "failed to generate JSON output: "
              << to_string(writer.get_error()) << '\n';
    throw std::logic_error("failed to generate JSON");
  }
  println(writer.str());
  writer.reset();
}
Two notes on this function: str returns a string_view to an internal buffer, and we must rewind the buffer (and the state) by calling reset before applying another object.
With this utility in place, we can print users in JSON format. The writer API has two configuration options: indentation and skip_empty_fields. The former is a numerical value that tells CAF whether it should break after each value and how much indentation it should add on each level of nesting. Setting this to 0 (the default) disables indentation and results in a compact single-line output. Setting skip_empty_fields to true (the default) tells CAF to omit missing fields completely. Otherwise, CAF includes the field in the output and assigns null to it, as shown in example (c) below.
Source Code
auto john = user{1234, "John Doe", std::nullopt};
caf::json_writer writer;
println("(a): compact output");
json_println(writer, john);
println("\n(b): indentation = 2");
writer.indentation(2);
json_println(writer, john);
println("\n(c): indentation = 2 && skip_empty_fields = false");
writer.skip_empty_fields(false);
json_println(writer, john);
Output
(a): compact output
{"@type": "user", "id": 1234, "name": "John Doe"}
(b): indentation = 2
{
  "@type": "user",
  "id": 1234,
  "name": "John Doe"
}
(c): indentation = 2 && skip_empty_fields = false
{
  "@type": "user",
  "id": 1234,
  "name": "John Doe",
  "email": null
}
CAF can convert any inspectable type to JSON. This also applies to caf::message, which really is just a tuple. Since JSON has no notion of tuples, the writer outputs CAF messages as lists:
Source Code
auto msg = caf::make_message(user{1234, "John Doe", std::nullopt},
                             user{2345, "Jane Doe", "jane@doe.public"});
caf::json_writer writer;
writer.indentation(2);
json_println(writer, msg);
Output
[
  {
    "@type": "user",
    "id": 1234,
    "name": "John Doe"
  },
  {
    "@type": "user",
    "id": 2345,
    "name": "Jane Doe",
    "email": "jane@doe.public"
  }
]
One last thing. As you can see, CAF also adds an @type annotation with the C++ class name. The main reason for this inclusion is to enable CAF to deserialize its own JSON output again! For this, CAF includes the class json_reader:
Source Code
caf::message msg;
caf::json_reader reader;
// Step 1: parse JSON to an internal buffer.
if (!reader.load(R"([{"@type": "user", "id": 1234, "name": "John Doe"}])")) {
  std::cerr << "failed to parse JSON input: "
            << to_string(reader.get_error()) << '\n';
  throw std::logic_error("failed to parse JSON");
}
// Step 2: try to deserialize a message from the parsed JSON input.
if (reader.apply(msg)) {
  println("parsed JSON: ", msg);
} else {
  println("failed to parse JSON: ", reader.get_error());
}
Output
parsed JSON: message(user(1234, "John Doe", null))
Telemetry Timers
When instrumenting code, timers offer a convenient way for measuring the duration of individual operations.
The metrics API in CAF includes histograms for sampling observations over time, for example how long it takes to handle incoming requests or to perform some expensive operation.
Sampling time manually is quite tedious, though, as illustrated by this snippet:
caf::telemetry::dbl_histogram* my_histogram = nullptr;
// ... some place later ...
auto t0 = std::chrono::steady_clock::now();
// ... expensive operation ...
auto delta = std::chrono::steady_clock::now() - t0;
// ... convert delta to fractional seconds and pass to my_histogram ...
To automate this process, CAF includes timers. They simply store the current timestamp when created and pass the elapsed time since construction to a histogram when destroyed. Hence, we can replace the verbose version from before simply by putting a timer into the scope of the expensive operation and taking advantage of RAII:
caf::telemetry::dbl_histogram* my_histogram = nullptr;
// ... some place later ...
{
  auto t = caf::telemetry::timer{my_histogram};
  // ... expensive operation ...
}
By the way, passing a null pointer to the constructor of timer is perfectly fine. This accounts for the fact that some metrics may be disabled by default.
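The RAII pattern behind such a timer is easy to reproduce with the standard library alone. The sketch below is not the CAF implementation: sample_sink and scoped_timer are hypothetical names, and the sink simply records observations in a vector instead of sorting them into histogram buckets. It does show both behaviors described above, namely reporting the elapsed time on destruction and doing nothing when constructed from a null pointer.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical stand-in for a histogram; it records every observation.
struct sample_sink {
  std::vector<double> observations;
  void observe(double seconds) { observations.push_back(seconds); }
};

// Hypothetical RAII timer: reports the lifetime of the enclosing scope in
// fractional seconds. A null sink turns the timer into a no-op.
class scoped_timer {
public:
  using clock = std::chrono::steady_clock;

  explicit scoped_timer(sample_sink* sink) : sink_(sink) {
    if (sink_)
      start_ = clock::now();
  }

  scoped_timer(const scoped_timer&) = delete;
  scoped_timer& operator=(const scoped_timer&) = delete;

  ~scoped_timer() {
    if (sink_) {
      std::chrono::duration<double> elapsed = clock::now() - start_;
      sink_->observe(elapsed.count());
    }
  }

private:
  sample_sink* sink_;
  clock::time_point start_{};
};

// Usage: one observation per timed scope, none for a disabled metric.
inline void example() {
  sample_sink sink;
  {
    scoped_timer t{&sink};
    // ... expensive operation ...
  }
  {
    scoped_timer t{nullptr}; // metric disabled: nothing is recorded
  }
  assert(sink.observations.size() == 1);
}
```

Deleting the copy operations keeps a timer from reporting the same scope twice by accident.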
In a more complete example, a worker that samples an expensive operation may follow this template:
struct worker_state {
  explicit worker_state(caf::event_based_actor* self) {
    std::array<double, 6> default_time_buckets{{
      0.00001, // 10us
      0.0001,  // 100us
      0.001,   // 1ms
      0.01,    // 10ms
      0.1,     // 100ms
      1.,      // 1s
    }};
    processing_time = self->system().metrics().histogram_singleton<double>(
      "my-app", "processing-time", default_time_buckets,
      "Time the app needs to perform the expensive computation.", "seconds");
  }
  int32_t expensive_computation() {
    auto t = caf::telemetry::timer{processing_time};
    auto result = int32_t{42};
    // ... expensive number crunching on result ...
    return result;
  }
  caf::behavior make_behavior() {
    return {
      [this](caf::get_atom) {
        return expensive_computation();
      },
    };
  }
  caf::telemetry::dbl_histogram* processing_time;
  static inline const char* name = "worker";
};
After spinning up a worker and exporting your metrics to Prometheus, you can query the custom metric as my_app_processing_time_seconds.
Indenting Trace Logs
Have you ever looked at CAF trace logs before? Sometimes, reproducing a bug with logging enabled is the best way to get to the bottom of unexpected system behavior.
Trace logging of course generates far too much data to run in production. Even for short runs, you are probably looking at megabytes or even gigabytes of text output.
During development, however, skimming through logs can oftentimes save a lot of time. To make it easier to focus on a single actor, CAF includes a small Python script called indent_trace_log.py in the repository.
But before we look at the script itself, we briefly discuss a few basics and implement a small example to generate some log output to work with.
Trace logging is disabled by default, and not just at run-time: users must explicitly enable trace logging when building CAF. When building CAF using the configure script, you can pass --log-level=trace. When building with CMake directly, set CAF_LOG_LEVEL:STRING=TRACE during build.
To have some actor we can analyze, we implement a simple cell actor that holds an integer value that we can retrieve, set or add to:
struct cell_state {
  explicit cell_state(int32_t init_value) : value(init_value) {
    // nop
  }
  caf::behavior make_behavior() {
    return {
      [this](caf::get_atom) {
        return value;
      },
      [this](caf::put_atom, int32_t new_value) {
        value = new_value;
      },
      [this](caf::add_atom, int32_t amount) {
        value += amount;
        return value;
      },
    };
  }
  int32_t value;
  static inline const char* name = "cell";
};
using cell_impl = caf::stateful_actor<cell_state>;
In our main, we simply spawn a cell actor with some initial value, then add to it and print the result.
void caf_main(caf::actor_system& sys) {
  using namespace std::literals;
  auto cell = sys.spawn<cell_impl>(11);
  caf::scoped_actor self{sys};
  self->request(cell, 1s, caf::add_atom_v, 9).receive(
    [](int32_t new_value) {
      println("cell responded with: ", new_value);
    },
    [](const caf::error& err) {
      println("cell failed to respond: ", err);
    });
}
Sure enough, the program prints "cell responded with: 20". But how can we find our cell actor in the trace log output? And how do we get a trace log in the first place?
To get CAF to create a trace log file, the quickest way is to pass a verbosity level to the logger on the command line:
./example --caf.logger.file.verbosity=trace --caf.logger.file.path=out.log
This should put an out.log file into the current directory with a couple hundred lines of text. Setting a path is not necessary; we use it here to make sure we know the file name in advance.
Now, how to get to our cell actor? You may use more advanced tools, but for now we rely on good ol' grep:
$ grep 'NAME = cell' out.log
... SPAWN ; ID = 6 ; NAME = cell ; TYPE = ...
As we can see in the truncated output, the quickest way to find actors of a particular kind is to look for the name we have assigned to it. Our cell state has the static member variable name and whatever we assign to it is the string we can start looking for.
CAF logs each spawn event with a line that contains the ID, the name, the C++ type, constructor arguments, node ID and initial group memberships. The interesting bit to us now is the ID, because this is where our gem indent_trace_log.py comes into play. We can use the script to print out only log entries for a particular actor and to indent the output based on entry/exit events by passing -i <ID> before the file name:
$ indent_trace_log.py -i 6 out.log
If you are like us and find yourself working on the command line more often than not, give the script a try. Combined with a bit of syntax highlighting in your favorite command line editor, it can go a long way.
Hashing
Did you know that CAF can generate hash values for any inspectable type out of the box?
Consider this simple POD type with an inspect overload:
struct point_3d {
  int32_t x;
  int32_t y;
  int32_t z;
};
template <class Inspector>
bool inspect(Inspector& f, point_3d& point) {
  return f.object(point).fields(f.field("x", point.x),
                                f.field("y", point.y),
                                f.field("z", point.z));
}
A common algorithm of choice for generating hash values is the FNV algorithm, which is fast and designed for hash tables. Because this algorithm is so common, CAF ships an inspector that implements it: caf::hash::fnv (include caf/hash/fnv.hpp).
The FNV algorithm chooses a different prime number as the seed value based on whether you are generating a 32-bit hash value or a 64-bit hash value. In CAF, you choose the seed implicitly by instantiating the inspector with uint32_t, uint64_t, or size_t.
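To show what the algorithm does at the byte level, here is a standalone sketch of the FNV-1a variant using the commonly published parameters, in which both the initial value (the offset basis) and the multiplier (the prime) differ between the 32-bit and 64-bit versions. The function names fnv1a_32 and fnv1a_64 are made up for this illustration. The sketch makes no claim about which FNV variant caf::hash::fnv uses or how it turns values into bytes, so its results should not be expected to match the output of the CAF inspector.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// FNV-1a, 32-bit: offset basis 0x811c9dc5, prime 0x01000193.
constexpr std::uint32_t fnv1a_32(std::string_view bytes) noexcept {
  std::uint32_t hash = 0x811c9dc5u;
  for (char c : bytes) {
    hash ^= static_cast<unsigned char>(c); // mix in the next byte
    hash *= 0x01000193u;                   // multiply, wrapping modulo 2^32
  }
  return hash;
}

// FNV-1a, 64-bit: offset basis 0xcbf29ce484222325, prime 0x100000001b3.
constexpr std::uint64_t fnv1a_64(std::string_view bytes) noexcept {
  std::uint64_t hash = 0xcbf29ce484222325ull;
  for (char c : bytes) {
    hash ^= static_cast<unsigned char>(c);
    hash *= 0x100000001b3ull; // wraps modulo 2^64
  }
  return hash;
}

// Hashing no bytes at all yields the offset basis.
static_assert(fnv1a_32("") == 0x811c9dc5u);
static_assert(fnv1a_64("") == 0xcbf29ce484222325ull);

// Usage: check against a published FNV-1a test vector at run-time.
inline void example() {
  std::string_view key = "a";
  assert(fnv1a_32(key) == 0xe40c292cu);
  assert(fnv1a_64(key) == 0xaf63dc4c8601ec8cull);
}
```

Since the whole computation is one XOR and one multiplication per byte, both functions can be constexpr and evaluated at compile time.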
Because applying any value to the inspector always results in an integer value, caf::hash::fnv has a static member function called compute that takes any number of inspectable values. This means generating a hash boils down to a one-liner!
It also makes it very convenient to specialize std::hash. For our point_3d, the implementation boils down to this:
namespace std {
template <>
struct hash<point_3d> {
  size_t operator()(const point_3d& point) const noexcept {
    return caf::hash::fnv<size_t>::compute(point);
  }
};
} // namespace std
Under the hood, the inspector uses the inspect overload to traverse the object. Hash inspectors ignore type names, field names, etc. So passing a point_3d to the inspector produces the same result as passing x, y and z individually:
Source Code
using hasher = caf::hash::fnv<uint32_t>;
println("hash of (1, 2, 3): ", hasher::compute(1, 2, 3));
println("hash of point_3d(1, 2, 3): ", hasher::compute(point_3d{1, 2, 3}));
Output
hash of (1, 2, 3): 2034659765
hash of point_3d(1, 2, 3): 2034659765