How to make your maps, try_emplace and smart pointers play nicely with each other in C++17.

Posted on Sun 18 November 2018 in C++ • Tagged with C++17, std::map, std::unordered_map.

Trivia:

Lately, I have been working on the reincarnation of a class at work: a hash map. While this class had interesting internals (a sort of dense hash map) and performed really well, its interface was not up to standard, both literally and metaphorically. After a good deal of lipstick had been applied to it, the class now fully mimics the interface of the beloved std::unordered_map from the standard library. A close look at std::unordered_map and its sister std::map reveals a few interesting design choices. Combining this interface with some smart pointer types can present some challenges if you want to squeeze performance out of your maps. We will explore these challenges in this blog post, and try to figure out some solutions.

Disclaimer: C++ being C++, I would not be surprised if 1) I wrote some inaccuracies here 2) some guru could reduce this entire article to a phantasmagoric one-liner.

Some peculiar modifier member functions:

Note: This part of the post will serve as a reminder for those folks who are not well versed in the standard's associative containers. If you are confident, you can always jump straight to the dilemma part.


insert:

If you observe the interface of the associative containers (like std::map or std::unordered_map) in the current standard, you will notice that there are 6 member functions to map a value to a given key: insert, insert_or_assign, emplace, emplace_hint, try_emplace and the subscript operator (operator[]). That number does not even include all the overloads of each of these member functions. It is no wonder that a lot of C++ users tend to make suboptimal calls when inserting values into their associative containers: the choice is not obvious when you have 6 different functions with slightly different behaviours.

Typically, you will often see this pattern within a code-base:

std::unordered_map<std::string, std::string> m;

// Check if the key is already in m.
if (m.find("johannes") == m.end()) { // Often written as m.count("johannes") == 0
    m["johannes"] = "lucio"; // If not, insert the key.
}

Little did your colleague, boss, or tired ego know that such code does a relatively costly job twice: checking the existence of the key in the map. Indeed, in the case of a std::unordered_map, the key "johannes" will be hashed twice: once in find and once in operator[]. In both member functions, the std::unordered_map has to figure out which bucket the key falls into. Worse! If you have collisions between your keys, checking the existence of a key may cost up to N comparisons (even your hash function can be drunk sometimes), where N is the number of stored key-value pairs. Potentially multiplying these comparisons by two is not something you should desire. Such a situation with std::map is even worse: each lookup always costs roughly O(log(N)) comparisons. Comparing two keys may not always be as cheap as it seems, and if you add on top of that the cost of jumping through a linked structure of nodes, this pattern should be considered harmful.

Obviously the answer to this problem is to use insert. insert, as its name implies, will only do work if it can insert the key into the associative container, meaning that the insertion will not happen if the same key is already in the map. If you really care to know whether the insertion happened, insert returns a pair of an iterator and a boolean that you can query: the iterator points to the newly inserted key-value pair or the already existing one, and the boolean indicates whether the insertion happened.

// Use C++17 structured bindings and class template argument deduction (CTAD)
// See more in my previous post on those topics.
auto [it, result] = m.insert(std::pair("johannes", "lucio")); // Construct a pair and insert it if needed.
// Do whatever you want with it and result.

Here only one existence check will be done, that is much better, isn't it? Well, while this snippet is shorter and performs better, there is still room for improvement. We are constructing a pair outside the map, the very same pair that then needs to be created in a node of the map. Internally, a call to the move constructor of the pair will be made, or, far worse, a call to the copy constructor if one of the two types in the pair cannot be moved. Relying on the move constructor of a pair to exist AND to be performant is too much wishful thinking.

emplace:

Thankfully, C++11 added to many containers a new member function called emplace. Given a container of type T, emplace accepts the arguments necessary for an in-situ construction of a new instance of T, meaning that we can easily improve our insertion this way:

auto [it, result] = m.emplace("johannes", "lucio"); // Construct the pair of "johannes", "lucio" straight into m.

I will go slightly against abseil's recommendation and say that emplace should be preferred over insert in C++11, at least for all the associative containers. It performs better (as explained previously) and it also feels more natural (most users think of a key and a value, not a std::pair)!

Now, emplace on associative containers has a vicious specification, which cppreference gently warns you about. For some obscure reason, even if the emplace operation does not succeed because the key already exists, your arguments passed as r-values may have been moved from anyway. The standard mandates effects only on whether the insertion happens and on the return value:

Effects: Inserts a value_­type object t constructed with std::forward<Args>(args)... if and only if there is no element in the container with key equivalent to the key of t.
The bool component of the returned pair is true if and only if the insertion takes place, and the iterator component of the pair points to the element with key equivalent to the key of t.

This cryptic language-lawyer text does not explain in any way what happens to the arguments in case of failure. For all we know, they could be sent over smoke signal up to Hong Kong and inserted into some fortune cookies. Why would we care about that? Well, because it prevents you from writing code like this:

std::unordered_map<std::string, std::unique_ptr<int>> m;
auto my_precious_int = std::make_unique<int>(42);

auto [it, result] = m.emplace("ricky", std::move(my_precious_int)); // We need to move unique pointers.

if (!result) { // Alright the insertion failed, let's do something else with my_precious_int.
    do_something_else(*my_precious_int); // Or can we?
}

Here my_precious_int is unusable right after the call to emplace: it may have been moved-from forever and ever, EVEN if the insertion result is false. Some confused souls will tell you that this is evident: we called std::move, so it MUST be moved-from. Like the famous cake, std::move is a lie! It does not move anything; it simply casts objects to xvalues, which makes them POTENTIALLY moveable-from (this world would have been a better place if std::move had been named std::to_xvalue, std::moveable, std::to_be_moved...).

try_emplace:

This unleashed emplace is a real pain when you are trying to store move-only types in an associative container. The standard committee was aware of the issue and fixed it in C++17 with another member function called try_emplace. Here are the expected effects:

Effects: If the map already contains an element whose key is equivalent to k, there is no effect.
Otherwise inserts an object of type value_type constructed with piecewise_construct, forward_as_tuple(std::move(k)), forward_as_tuple(std::forward<Args>(args)...).

It effectively prevents your arguments from being moved-from (as well as from being packaged into fortune cookies) in case of an insertion failure. Why wasn't emplace simply patched? If you take a look at the definition of emplace, you will see that it accepts as key argument any object whose type is compatible with your key type. Unlike the value argument, the key argument ALWAYS needs to be somehow converted to the key type to check for its existence in the map. That potential conversion would defeat the "there is no effect" policy. try_emplace is stricter and only takes a key of the key type, which guarantees that no conversion sequence will be triggered. At least try_emplace helps us to safely rewrite the previous example:

std::unordered_map<std::string, std::unique_ptr<int>> m;
auto my_precious_int = std::make_unique<int>(42);

auto [it, result] = m.try_emplace("ricky", std::move(my_precious_int)); // We need to move unique pointers.

if (!result) { // Alright the insertion failed, let's do something else with my_precious_int.
    do_something_else(*my_precious_int); // It is safe! 
}

Hurray! After three corrections in the standard, we can effectively mix associative containers with unique_ptrs for maximum fun and profit! Well... no. C++ being C++, it cannot be that easy. There is one last boss to slay.

The last dilemma of a map of unique_ptrs:


Before I start on this topic, I would like to remind everyone of a basic rule: you should try to avoid heap allocations. You should always strive for a std::map<T1, T2> over a std::map<T1, std::unique_ptr<T2>>. That being said, you may face situations where you cannot do otherwise:

  • If you need runtime polymorphism. For instance, you may need to store services in a std::map<std::string, std::unique_ptr<service>> where service is an interface with multiple concrete implementations. That said, there are always ways to hide the inheritance, as explained by Sean Parent...
  • If your weapon of choice is a map closely following the interface of std::unordered_map or std::map minus its stable addressing guarantee. This is often the case for hash maps with excellent performance, like Malte Skarupke's. Not having stable addressing means that querying the address of a value in the map, &map["jeremy"], might give you different results if you perform any modifying operation (insert, erase...) on the map. In such a case, having an extra indirection (a heap allocation) brings back stable addressing.

Not only is dealing with pointers (even smart ones) often tedious, it can also ruin the try_emplace member function of your class. You will have to choose between a costly object creation and the dreaded double key lookup I mentioned right at the beginning of this post. Pick your poison!

Unnecessary object creation or double key lookup:

Let's keep the idea of a map of services, std::map<std::string, std::unique_ptr<service>>, and say you would like to register a service "file_locator" only if none was registered earlier. Hastily, you may write such code:

std::map<std::string, std::unique_ptr<service>> m;
// ...

// Create a file locator that explores a file system on a remote server.
// remote_file_locator implements the service interface.
auto [it, result] = m.try_emplace("file_locator", std::make_unique<remote_file_locator>("8.8.8.8", "/a_folder/"));

if (!result) {
    // Print which file_locator is already in there. 
    log("Could not register a remote_file_locator, a file_locator is already registered: ", it->second->name());
}

If the remote_file_locator is successfully registered, everything is fine! But in the other scenario, where a file_locator is already in the map, this code has a huge pessimisation. Your compiler will emit code that allocates enough memory for a remote_file_locator and then constructs it, in all circumstances. If allocating is already considered slow in the C++ world, opening a connection to a server is pure hell when it comes to speed. If you are not planning to use this really costly object, why would you create it in the first place?

So shall we revert to the double lookup?

auto it = m.find("file_locator");
if (it == m.end()) {
    m["file_locator"] = std::make_unique<remote_file_locator>("8.8.8.8", "/a_folder/");
} else {
    log("Could not register a remote_file_locator, a file_locator is already registered: ", it->second->name());
}

Hell no! I already explained why I discourage you from using such a pattern in the first part of this post. You could argue that here we pay the double lookup overhead only once, when we try to create the "remote_file_locator". Given more time and coffee, I could come up with an architecture where you do such insertions of unique_ptrs in a loop. In any case, C++ is all about not paying for unnecessary things.

But don't worry, C++ being C++, there surely are ways to get around this impediment.

Two clumsy solutions:

I, personally, could come up with two solutions. If you have a better one, you are welcome to express it in the comments.

The first one is actually not so hacky. You can start by trying to emplace an empty unique_ptr; if the insertion succeeds, you can always fix up the entry afterwards with a real allocation:

// Try to emplace an empty `unique_ptr` first.
auto [it, result] = m.try_emplace("file_locator", nullptr);

if (result) {
    // The insertion happened, now we can safely create our remote_file_locator without wasting any performance. 
    it->second = std::make_unique<remote_file_locator>("8.8.8.8", "/a_folder/");
}

I somehow dislike this solution. It is not consistent with the usage of try_emplace on more classic types, which does not require any extra step. It really smells like some kind of two-phase initialisation pattern, which is usually frowned upon. We are temporarily putting our map into a state where "file_locator" cannot be trusted. What if the actual creation of the remote_file_locator throws an exception? That would leave the map with an empty "file_locator", which is not great.

My second solution consists in delaying the construction of the remote_file_locator. To do so, I wrote a very simple helper struct that I called lazy_convert_construct. This struct wraps any kind of lambda that acts like a factory: the lambda returns an instance of a given type, result_type, when called. If at any point the struct needs to be converted to result_type, it will call the internal lambda to generate an instance of result_type. Any good code should speak for itself, so here is the lazy_convert_construct beast in all its beauty:

template<class Factory>
struct lazy_convert_construct
{
    using result_type = std::invoke_result_t<const Factory&>; // Use some traits to check what would be the return type of the lambda if called.

    constexpr lazy_convert_construct(Factory&& factory) 
        : factory_(std::move(factory)) // Let's store the factory for later use.
    {
    }

    //                                     ↓ Respect the same nothrow properties as the lambda factory.
    constexpr operator result_type() const noexcept(std::is_nothrow_invocable_v<const Factory&>) 
    //        ^ enable       ^ the type this struct can be converted to 
    //          conversion
    {
        return factory_();  // Delegate the conversion job to the lambda factory.
    }

    Factory factory_;
};

// Example of usage:
auto l = lazy_convert_construct([]{ return 42; });
//        ^ CTAD again                    ^ Factory lambda that returns an int.
int x = l;
//      ^ Here l is forced to be converted to an int and will therefore call the lambda to do so.
std::cout << x; // Prints 42. 

Note that the lambda will not be called if no conversion is needed, which gives lazy_convert_construct its lazy evaluation. Note also that with all optimisations turned on, the lazy_convert_construct disappears entirely and x is simply initialised to 42 when needed.

The next step is to combine this lazy_convert_construct with try_emplace, which works like a charm:

auto [it, result] = m.try_emplace("file_locator", lazy_convert_construct([]{ return std::make_unique<remote_file_locator>("8.8.8.8", "/a_folder/"); }));

lazy_convert_construct is now able to create a unique_ptr<remote_file_locator> on demand. Even with lazy_convert_construct, try_emplace respects its contract: it has no side effect on the lazy_convert_construct object if the key "file_locator" is already present, meaning that no conversion happens if the key already exists. This rather elegant solution fixes one of the main drawbacks of the previous one: it never leaves the map in a state with a null "file_locator". It is also a one-liner!

Benchmark results:

Some of you may still be a bit skeptical about the importance of optimising the queries on your associative containers. So I wrote a very simple benchmark which explores multiple insertion scenarios on a std::map<std::string, std::unique_ptr<T>>. You can fetch it here.

Using clang 6.0 on my Linux laptop, I obtain the following results:

2018-11-18 14:16:17
Running ./fast_try_emplace
Run on (8 X 3600 MHz CPU s)
CPU Caches:
  L1 Data 32K (x4)
  L1 Instruction 32K (x4)
  L2 Unified 256K (x4)
  L3 Unified 8192K (x1)
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
-----------------------------------------------------------------------------
Benchmark                                      Time           CPU Iterations
-----------------------------------------------------------------------------
insertion_double_lookup                      636 ns        638 ns    1085075
insertion_construct_before_try_emplace       503 ns        506 ns    1387818
insertion_lazy_convert_try_emplace           503 ns        506 ns    1380488

no_insertion_double_lookup                   107 ns        107 ns    6300180
no_insertion_construct_before_try_emplace   8642 ns       8641 ns      80478
no_insertion_lazy_convert_try_emplace         32 ns         32 ns   22393508

Clearly, a successful insertion using the double lookup is more expensive than it should be. The cost will vary depending on the number of key-value pairs already in the map. In the failed insertion scenario, my lazy_convert_construct is also faster than the double lookup. I cannot explain why! Internally, find and try_emplace should share the same lookup mechanism. And of course, creating a costly object and destroying it right after is a really bad choice, which explains why no_insertion_construct_before_try_emplace's record is so damn huge compared to the two other cases (I purposely made the type very costly to create for the no-insertion cases).

GCC gives similar results, without the mysterious advantage of try_emplace + lazy_convert_construct over the double lookup in the no-insertion scenario.

2018-11-18 14:17:53
Running ./fast_try_emplace
Run on (8 X 3600 MHz CPU s)
CPU Caches:
  L1 Data 32K (x4)
  L1 Instruction 32K (x4)
  L2 Unified 256K (x4)
  L3 Unified 8192K (x1)
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
-----------------------------------------------------------------------------
Benchmark                                      Time           CPU Iterations
-----------------------------------------------------------------------------
insertion_double_lookup                      547 ns        543 ns    1310711
insertion_construct_before_try_emplace       493 ns        496 ns    1394343
insertion_lazy_convert_try_emplace           495 ns        500 ns    1000000

no_insertion_double_lookup                    44 ns         44 ns   15787599
no_insertion_construct_before_try_emplace   8659 ns       8658 ns      78869
no_insertion_lazy_convert_try_emplace         44 ns         44 ns   15895884

Now we can proudly claim that we solved all the insertion issues (at least those that I am aware of). But somehow it still feels like something is off with the smart pointer types of the standard library. Are we missing something here?

An in_place constructor:

Warning: you are now entering the somewhat controversial and imaginary part of my post. I am not claiming that this is the direction C++ should take on these issues; I am merely raising my questions on the topic.

Indeed, unique_ptr and its smart pointer cousins (shared_ptr...) are rather special types. On one hand, you could see them as simple RAII wrappers that take care of a basic resource: a pointer. On the other hand, you could, very arguably (<== note this bold statement), see them as funky value wrappers with a very special storage (one that implies pointer semantics).

Value wrappers are types that enhance the properties of another type. The most recent ones in the standard are std::optional, std::variant and std::any. As expected, all of these new value wrappers have constructors that accept an instance of the type they wrap:

struct A {
    A(int args1, std::vector<int> args2) { /* do something with the args */ }
};

// Move construct the newly created A into o.
std::optional o(A{42, std::vector<int>{42}});

While such constructors might be sufficient for most usages of your value wrappers, sometimes you really want to avoid any move or copy constructor. The standard committee was proactive and provided another set of constructors that build the wrapped value in place. To disambiguate them from the usual constructors, these new constructors take as first argument a tag type: std::in_place or std::in_place_type. Here is how std::in_place works with std::optional:

//            ↓ No CTAD            ↓ The arguments needed for constructing a new instance of A.
std::optional<A> o(std::in_place, 42, std::vector<int>{42});
//                   ^ dispatch to in place constructor.

With such a constructor, we can safely assume that the wrapped instance of A was built directly in its storage place. Of course, you can also use this constructor if you are dealing with map::try_emplace:

std::map<std::string, std::optional<A>> m;
m.try_emplace("bruno", std::in_place, 42, std::vector<int>{42});
//                           ^ Will construct the wrapped A deep down in the map.

At this point, if you are following my analogy between std::unique_ptr and value wrappers, you may start to wonder why we could not get a similar set of constructors for our smart pointers. Maybe something like this:

std::unique_ptr<service> x{std::allocate_in_place<remote_file_locator>, 42, std::vector<int>{42}};
//                          ^ tag                   ^ concrete type to instantiate   ^ following args.

I intentionally chose a different tag type than std::in_place for two reasons:

  • This constructor does more than constructing in place: it also allocates. The user should be informed.
  • In a similar fashion to std::in_place_type, we somehow need to encode the concrete type we want to instantiate.

With such a constructor in the standard for unique_ptr, our issue with try_emplace would become trivial to solve. Just call it!

auto [it, result] = m.try_emplace("file_locator", std::allocate_in_place<remote_file_locator>, "8.8.8.8", "/a_folder/");

The constructor of unique_ptr that accepts the allocate_in_place tag would be called only if the key "file_locator" is not already there. No overhead, simple syntax; you could not ask for more!

As a side effect, my guess is that we could also fully deprecate the usage of make_unique and make_shared:

// Before:
auto x = std::make_shared<something>("michel", "christian");
// After
auto x = std::shared_ptr(std::allocate_in_place<something>, "michel", "christian"); 

Obviously the syntax is far from terse. The invented feature also does not take into consideration the allocator arguments a smart pointer constructor could receive. A colleague and I promised ourselves to look a bit more into this topic. Whether we end up formulating a proposal or just having some afterthoughts, rest assured that I will keep you informed in a future post!

Conclusion:

  • If you are inserting a new pair into an associative container, consider using try_emplace first.
  • If you cannot use C++17, prefer to use emplace over insert.
  • If you cannot use C++11, I feel sorry for you!
  • You can borrow my lazy_convert_construct if you are dealing with smart pointers and try_emplace, to get a blazing fast insertion.

A special thanks to my colleague Yo Anes with whom I had a lot of fun discussing this specific topic.


Trip report - CppCon 2018

Posted on Sun 07 October 2018 in C++ • Tagged with C++, event, cppcon

New year, new conference! This time my employer, King, helped me organize a first pilgrimage to CppCon for me and another colleague. You cannot fathom how enthusiastic I was to finally make it there! Although I might be a bit late in the "trip-report race", I think it is still worth relating my overall experience of the event and then moving on to a list of recommended talks you should look out for on YouTube.

About CppCon (2018):

The event:

CppCon is the most renowned conference for all C++ aficionados. So far held annually in the cozy city center of Bellevue, Washington (or, less precisely, somewhere close to Seattle for those like me who are not into North American geography), CppCon lets you explore the C++ world through various talks, keynotes and other activities given by some accomplished members of the community. The event usually happens at the end of September and lasts 5 days, or even more for those attending the training sessions. The program is really plentiful: you would need to watch presentations from 8.00am to 10.00pm, have the capacity to be in 6 rooms simultaneously, and have infinite memory to absorb all the C++ knowledge flowing during these days.

Do not get me wrong: while the conference is impressive when it comes to the amount of content, that does not imply the content is out of reach of the commoners. On one hand you have hairy topics like "compile-time programming" being discussed, and on the other hand you have access to gentle introductions to some features of the language. Dedicated and novice C++ users will appreciate the conference equally, but for different reasons. A novice will bring back home a lot of new keywords and concepts to plunge into. Someone who follows C++ news may not make as many discoveries at CppCon as at CppNow, but she/he will gain a lot of inspiration and motivation from other C++ fellows and will be exposed to other points of view on the language.

Does this sound exciting to you, but you missed the opportunity to come this year? No worries: CppCon has one of the best YouTube channels, where they upload all the talks. I have always been impressed by the quality of their videos and the frequency at which they produce such material. Now I can say that behind the scenes (or actually in front of them...), they have an excellent video-shooting crew in each of the conference rooms. On a side note, all keynotes were introduced by a live rock band, which you can see at the beginning of the videos. Whether you appreciated the music or not, it is hard to deny that the CppCon organizers really put a lot of effort into the entire event!

My experience over there:

Right before leaving Stockholm, I had the great idea to listen to my favorite C++ podcast, CppCast. The guest that week was Bryce Adelstein Lelbach, who is one of the organisers of CppCon. Bryce had four pieces of advice for any new attendee of the conference, which matched my experience at Meeting C++ last year and turned out to be valid for CppCon too:

  • Have no regrets! Indeed, as an attendee you will quickly discover that two or three appealing talks will be in the same time slot. Just randomly pick one of the talks, or none if you need to rest a bit. If luck is not on your side that day and you miss a terrific talk, you will naturally hear about it from others and you will have plenty of time to binge-watch it on YouTube later on. With that in mind, I was much more relaxed at this conference than during my very intense days at Meeting C++. If you really dislike choosing, you will anyway end up following someone you just chit-chatted with in the corridors to the talk of their choice. Which brings us to the second piece of advice you should follow...
  • Engage with the community! If you are going to CppCon expecting a better version of the talks than on YouTube, you are doing it wrong! By going there, you actually lose the possibility to watch the talks at your own pace... What you gain instead, as an attendee, is the opportunity to mingle with people with very different backgrounds, exchange tips and tricks, and feel part of a community bigger than the few C++ colleagues you are working with. Programmers seldom have the reputation of being extroverts, but you will always find someone who can introduce you to their connections and slowly build relationships. I met quite a few people from SwedenCpp over there and it was really fun to see them in another context!
  • Be confused! You should not be afraid to step out of your comfort zone when it comes to the content of the talks. You may be mortified at the idea that a guru will suddenly drop a very complicated C++ concept on stage, and you will sit there, lonely, not knowing what on earth he/she/it is talking about. Truth is, very few people (if any) can claim to know everything about a language as vast as C++. Usually during a "productive" C++ conference, you will draw up a list of keywords and ideas to explore later on. This year, I promised myself to look further into the edge cases of Class Template Argument Deduction (CTAD), prepare myself for C++20's contracts, and play with clang's tooling internals.
  • Be bold! The concentration of "legendary C++ devs" per square meter is even higher than at Meeting C++. While I did not shoot any selfie (not that I wanted to) with any of these legends, I briefly discussed with a few of them, and you can too! People at CppCon are there to celebrate their favorite language, not to act as elitists, which makes everyone very approachable. One of the talks I attended was on a proposal (Woes of Scope Guards and Unique Resource - 5+ years in the making - by Peter Sommerlad) I had implemented at work. During the talk, Peter mentioned a relatively serious bug introduced in one of the revisions. Right after the talk, I had the unique chance to have a face-to-face discussion with him. It turns out my implementation did not suffer from that bug, but was instead hiding another one. I was so glad that I could reach that person so easily!

Following these precepts, I experienced a wonderful week made of C++ and human interactions, and I would highly recommend CppCon to anyone with even a slight interest in the language.

I still have one and only one complaint about the event: the time slot for the lightning talks. For those who are not aware, lightning talks are short presentations of roughly 5 minutes, easier to submit, often light-hearted but also damn interesting. Due to the short format, people often go straight to the point, which is really pleasant to watch. For instance, this year there was an epic lightning talk battle on East const vs West const and a touching "Thank You" speech from Dr. Walter E. Brown. If that sounds interesting to you, you will have to stay awake from 8.30pm to 10.00pm, which is where my grudge comes from. After absorbing C++ since roughly 9.00am, and with a pretty strong jet lag (~9h for central Europeans), you really need to channel all your inner motivation to attend any of these late activities. The lightning talks being such a joyful part of CppCon, I would argue that some of them could be moved to an earlier slot in the day...

Enough of my pseudo-rant on an almost perfect event; let's continue with some more concrete reporting!

The chief's suggestions of the year:

Once again, here is a menu of most of the talks I particularly enjoyed. The legend follows the same rules:

  • 💀 : The difficulty of the talk (💀: Beginner friendly, 💀💀: Intermediate, 💀💀💀: High exposure to C++'s dark corners)
  • ★ : My interest for the talk (★: Good talk, ★★: Tasty talk, ★★★: Legendary talk)

I promise not to spoil everything in the talks, but simply to give an overview of what you can expect from them or some of their conclusions. Although most of the talks are most likely worth seeing, I forced my-very-subjective-self to pick very few of them. I have seen people with very different "favorite talks" and you should not feel sad if your own favorite is not part of this not-so-prestigious list.

[Keynote] Concepts: The Future of Generic Programming (the future is here) - Bjarne Stroustrup - 💀★★★:

The Bjarne himself, creator of C++ (for those really out of the loop), kick-started the conference with a keynote on a long-overdue C++ feature: Concepts. Indeed, as early as 2003, Bjarne and some members of the committee had been pushing for Concepts as an extension to C++ templates to simplify the life of us poor template consumers. Sadly, a first version of the feature, deemed too complicated, was rejected for C++11. Nine years later, Bjarne and his crew are back with a version called "Concepts Lite" that will be integrated into C++20.

Let's face it, C++ templates have always been a pain to use due to the absence of constraints on the template parameter types. Take a look at this very simple piece of code:

template <class T>
void foo(T&& t) {
    t.print();
}

template <class T>
void bar(T&& t) {
    foo(t);
}

struct A {};

int main() {
    A a;
    bar(a);
}

Your never-satisfied compiler will refuse to compile it with the following error message:

<source>: In instantiation of 'void foo(T&&) [with T = A&]':
<source>:10:8:   required from 'void bar(T&&) [with T = A&]'
<source>:17:10:   required from here
<source>:5:7: error: 'struct A' has no member named 'print'
     t.print();
     ~~^~~~~

This is a horrifying error message as we often see nowadays. Your compiler does warn you about the absence of a print method in A, but far down in foo! In this specific case, we only have two layers in the call stack, which is very reasonable. But most often, when using functions provided by the Standard Library, your type will violate some constraints reaaallly deep down. As a newcomer you will often struggle with your compiler vomiting a bizarre message of 100 lines when using templates, which is not the best first experience (to say the least). Moreover, by looking at the signature of bar, nothing tells you that the template parameter T needs to have a print member function. That's frustrating!

With Concepts you can define in an elegant way the constraints that your template parameters must respect. Have a look at what we could do with Concepts in this case:

template <class T> concept bool IsPrintable = requires (T a) { // Define a reusable concept "IsPrintable".
    a.print(); // Any type T respecting "IsPrintable" must have a print member function.
};

template <IsPrintable T> // Explicitly tells that T must respect the concept.
void foo(T&& t) {
    t.print();
}

template <IsPrintable T> // Same here!
void bar(T&& t) {
    foo(t);
}

Which would give us the following much more accurate error message:

<source>:19:10: error: cannot call function 'void bar(T&&) [with T = A&]'
     bar(a);
          ^
<source>:11:6: note:   constraints not satisfied
 void bar(T&& t) {
      ^~~
<source>:1:33: note: within 'template<class T> concept const bool IsPrintable<T> [with T = A&]'
 template <class T> concept bool IsPrintable = requires (T a) {
                                 ^~~~~~~~~~~
<source>:1:33: note:     with 'A& a'
<source>:1:33: note: the required expression 'a.print()' would be ill-formed

Some sceptical gurus will tell you that this can be emulated with some witch-crafted SFINAE. Concepts were not on my Christmas wish-list for C++ this year, but I have to admit that Bjarne's talk hyped me a lot!

[Talk] How to Write Well-Behaved Value Wrappers - Simon Brand - 💀💀💀 ★★:

Good speakers are like good wine, you are seldom disappointed. Having attended Simon's talk on debugger internals last year and enjoyed it a lot, I took the chance to hear what he had to say on value wrappers. You may not know what a value wrapper is, but I surely bet that you have already manipulated some provided by the Standard Library: std::pair, std::tuple, std::optional...

It might be a breeze to work with most of these wrappers (certainly not you, verbose std::variant), but the poor souls writing implementations of them must go to great lengths to make them behave as they should. Value wrappers, as their name implies, try to mimic as closely as possible the type of a given value. All types have some basic properties: can it be explicitly constructed? Are the special member functions noexcept? Can it be called in a constant expression? Can it be compared? And much more... A wrapper must react in exactly the same way when it comes to these properties.

In this talk, Simon compares the old-fashioned way to tackle these issues and presents how it will look as soon as Concepts and a bunch of other proposals arrive in C++20:

  • An explicit(bool) operator akin to the noexcept operator
  • P0748R0, to use concepts-based requires clauses on these member functions to enable or disable them. A.K.A goodbye a lot of unnecessary conditional inheritance.
  • P0847R0, which permits correctly deducing the implicit this parameter type of a member function. A.K.A goodbye all the method overloads on const, volatile and ref qualifiers.
  • P0515R2, which unifies all the comparison operators into one (the spaceship operator<=>). A.K.A goodbye all the operator overloads.
  • ... Some more that I surely forgot.

Regardless of whether you are planning to write such wrappers or not, I would suggest watching the talk to refresh yourself on some tricky C++ mechanisms.

[Talk] Fancy Pointers for Fun and Profit - Bob Steagall - 💀💀 ★★:

Bob Steagall promoted his own talk on fancy pointers in an early September CppCast episode. So here I was, ready to learn more about these mystical beasts!

Allocators in C++ are rather infamous for their over-engineered interface, which is not useful 99.42% of the time. This even forced the committee to come up, in C++17, with a lighter interface called PMR. But this time, the good old full-blown interface found a very clever usage in Bob's hands. Indeed, std::allocator_traits has a nice type property pointer. This means the Standard offers a nice customization point to switch the normal pointers T* of a given allocator to a type that acts like a pointer would. These wanna-be pointers are what you call fancy pointers. Somehow, you can think of fancy pointers as a more general concept than smart pointers.

Now let's say that you would like to serialise to / deserialise from binary a container-like object (vector, list...) with a bunch of trivial objects inside and send it through a network. Provided that you are targeting one and only one architecture, which is often the case for servers, you may be able to use std::memcpy to transfer the representation of this container into a char* buffer. Then, the whole buffer can be wired to another machine. At the endpoint, to deserialise the container from that buffer you can re-use std::memcpy to copy the binary representation back into a container (note that you cannot rely on reinterpret_cast in C++ for that purpose...). This will work smoothly as long as none of the stored PODs have pointer members referencing each other! Pointers are usually not valid as soon as you cross the boundary of a process or a machine. This huge drawback can be avoided by introducing fancy pointers to your code-base.

As an example of how to fix that issue, Bob brings offset_ptr to the table, which permits expressing a reference between two elements as their distance from each other:

struct obj
{
    value_type v;
    offset_ptr<value_type> p; // optional
}; 

Example of the layout of a container of objs:

                               -2
                +-----------------------------+
                |        1                    |
                |     +-----+                 |
 +--------------v-----------v--------------------------------+
 |  v  |  p  |  v  |  p  |  v  |  p  |  V  |  p  |  v  |  p  |
 +-----------------------------------------------------------+
          |                             ^
          |                             |
          +-----------------------------+
                       3

With a bit of boilerplate, this offset_ptr can be handled by a custom allocator that can be injected into a container, making different address mappings a non-issue. I find this solution pretty elegant and is a good showcase on how extensible the Standard Library is.

[Talk] RVO is Harder than it Looks: the story of -Wreturn-std-move - Arthur O'Dwyer - 💀💀 ★:

It is commonly admitted that returning by value seldom has a performance impact in C++. Two mechanisms, (N)RVO and move semantics, will most likely kick in to avoid unnecessary copies:

struct A
{
// ... some members
};

A foo() {
    A a;
    // ...
    return a;
}

A a = foo(); // The return value is directly constructed in a's stack location (RVO), can fall back onto the move-constructor otherwise. 

As time goes by, the C++ standard gives stronger and stronger guarantees that copy elision (RVO) will happen in these situations. At the same time, forcefully moving the return value can be a pretty huge pessimisation and is taught as an anti-pattern:

A foo() {
    A a;
    // ...
    return std::move(a); // Prevents the compiler from applying NRVO; the move constructor is called instead.
}

In the worst case scenario, if the object has no move constructor, the compiler might resort to using the copy constructor, which could have been avoided with RVO.

Now, in the C++ land, nothing really holds true if you look closely at some corner cases. And the "no-move-on-return-values" rule mentioned right above can be debated for that reason. Arthur was valiant enough to inquire into this topic and found a few cases where a call to std::move will BE an optimization. Notably, if you return a value with a type convertible to the function's return type thanks to a conversion operator, you should apply std::move. Arthur introduced a new warning in clang, -Wreturn-std-move, to avoid this pitfall. I will gladly turn that warning on as soon as I can.

I liked the talk for delving into such a precise topic; although Arthur rushed through quite a few slides and even skipped a whole bunch of them, meaning that there was more to say on this theme.

[Talk] State Machines Battlefield - Naive vs STL vs Boost - Kris Jusiak - 💀💀 ★:

Kris Jusiak is the proud author of two libraries aspiring to enter Boost: Boost.DI and Boost.SML. This talk was partly based on the latter. More precisely, Kris compared how different implementations of a state machine fare in terms of performance and ease of maintenance.

Kris started with a good ol' implementation designed around a giant switch case roughly similar to this code:

class connection {
    // ...
    void update(const event& e) {
        switch (state_) {
            case state::connecting:
                if (e.type == event_type::established) {
                    state_ = state::connected;
                    log("connected");
                }
                break;
            case state::connected:
                // ...
                do_connected_things();
                // ...
                break;
            // ...
            default:
                throw std::runtime_error("bad state");
        }
    }
    // ...

    int variable_for_connected_state;
    int another_variable_for_connected_state;
    int variable_for_disconnected_state;
    // ...
    state state_;
};

Surely, this implementation will perform rather decently, but at the cost of being extremely hard to maintain as the number of states increases. Sadly, a lot of code-bases for games or networking have plenty of these ugly state machines sprinkled around. C++ is all about zero-cost abstractions, which means that if you want to avoid some serious post-traumatic stress disorder after working on such projects, you may want to look at choices other than switch.

Therefore, Kris jumped onto other implementations. One of them uses std::variant, which reminded me a lot of a blog post from Kalle Huttunen. std::variant permits you to isolate the variables necessary for your different states and enforces a stricter handling of your state with std::visit. In my opinion this solution is a huge improvement over using a switch and does not require the introduction of an external library into your project. As I will explain later, std::variant may or may not have a slight performance impact.

After dwelling on two oldish and rather slow Boost libraries that can help design state machines, Kris presented us his own work. I have to admit that the DSL provided by his library looks very pleasant to use:

// Coming straight from Kris's slides:
sml::sm connection = []{
    using namespace sml;
    return transition_table{
        * "Disconnected"_s + event<connect> / establish = "Connecting"_s,
        "Connecting"_s + event<established> = "Connected"_s,
        "Connected"_s + event<ping> [ is_valid ] / reset_timeout,
        "Connected"_s + event<timeout> / establish = "Connecting"_s,
        "Connected"_s + event<disconnect> / close = "Disconnected"_s
    };
};

Boost.SML performs very well according to Kris's benchmark and is on par with the switch solution. Boost.SML offers different dispatch strategies to get the current state: recursive branching, jump table, fold expressions... It turns out that recursive branching is amongst the fastest, yielding results as close as possible to the giant switch strategy. I am not so surprised by these results, since we observed a similar pattern at work with our custom implementation of std::visit. As far as I know, clang and gcc visit their std::variant using a jump table, which may explain the slight performance drop compared to a giant switch. This is good news though: it means there is room to improve the Quality of Implementation (QoI) of std::visit in our favorite libraries.

[Talk] Compile-time programming and reflection in C++20 and beyond - Louis Dionne - 💀💀💀 ★★★:

Three skulls, three stars, nothing unusual when it comes to my judgement of Louis Dionne's talks. I am very fond of (template) meta-programming, and I have always been in awe of Louis's work on Boost.Hana and more recently dyno. This year, he was on stage to give us an overview of what we can expect in the upcoming standards concerning constexpr, and how this will unlock a better interface for reflection.

We are slowly but surely reaching the point where we will be able to "allocate" at compile-time and convert most of our code-bases to constexpr within a blink. Louis explained the changes we need to apply to constexpr to be able to use it in expressions where we do allocate:

  • Allowing constexpr non-trivial destructors, heap allocation and placement new, which you will find in P0784R0.
  • Having the new function std::is_constant_evaluated from P0595R0, which queries whether the compiler is currently evaluating the function in a constexpr context or not. Surprisingly, you will NOT use it within an if constexpr statement; that would always be evaluated as constexpr and return true, so a simple if does the job. This function is an absolute necessity if we want to share a single interface for both a constexpr and a runtime implementation of a feature (a std::vector...). Behind the scenes, constexpr code usually has very different demands to perform correctly than standard runtime code.
  • Support for try-catch statements within a constexpr expression, which we would get from P1002R0. Note that this does not imply that you can throw at compile time: the try block is accepted, but a throw in a constant expression remains an error.
  • Some other minor changes that must appear in some other hairy white papers ...

Taking all these changes into consideration, we should be able to slap constexpr on many containers and algorithms from the STL (vector, string...). That would make the usage of constexpr very trivial for any decent C++ developer.

It will also be a great paradigm shift for the planned reflection within the language. The standard committee used to formulate a reflection proposal based on template meta-programming, which dreadfully reminds one of some kind of Boost.MPL. While templates are powerful, the syntax to manipulate types appears alienesque to most human coders. Constexpr-based metaprogramming looks a lot more natural, and having proper containers was the last missing piece of the puzzle to use that syntax for reflection. If you are in doubt, have a look at this very short example from Louis:

struct my_struct
{
    int x;
    std::string y;
    // ...
};

// Get the type of the first member of my_struct using the old template-based syntax:
using my_struct_meta = reflexpr(my_struct);
using members = std::reflect::get_data_members_t<my_struct_meta>; // Some weird template list-like type.
using x_meta = std::reflect::get_element_t<0, members>; // Some hideous index accessor.
using x_type = std::reflect::get_reflected_type_t<x_meta>;

// Get the type of the first member of my_struct with the new fancy constexpr-based syntax:
constexpr std::reflect::Record my_struct_meta = reflexpr(my_struct);
constexpr std::vector members = my_struct_meta.get_data_members(); // Uses the good ol' vector and class template argument deduction from C++17.
constexpr std::reflect::RecordMember x_meta = members[0]; // Just use operator[] as usual...
using type = unreflexpr(x_meta.get_reflected_type()); // Get the actual type of x.

If you want to have a better understanding on the proposed syntax, have a look at P0962R0.

[Keynote] Thoughts on a More Powerful and Simpler C++ (5 of N) - Herb Sutter - 💀💀 ★★★:

For the last two years at CppCon, Herb has brought us his vision of a future C++ full of promises. Both of those talks were accompanied by some concrete actions (white-papers, guidelines, proof-of-concepts...) that Herb was working on with the rest of the fellowship of the C++s. This year, Herb shared with us some more results towards his goals. It might not sound like a thrilling talk... but that would be under-appreciating the two main ideas Herb has been pushing for: lifetimes and meta-classes.

Lifetimes are implicit or explicit rules that directly concern the ownership of an object. If such lifetime rules are applied correctly, your code should be bulletproof against a huge range of memory-related bugs: use after free, dangling pointers... Some languages like Rust even make them a core concept of the language. Arguably, Herb's lifetimes will be slightly more relaxed (no annotations on everything) and more natural to use, at the price of not covering some extreme cases. Let's have a look at what these so-called lifetimes may protect you from:

int& foo() {
    int a;
    return a; // Oops, I am returning a reference to a local variable that will die right after the function returns.
    // Some compilers may warn you about it, some may not!
}

std::reference_wrapper<int> bar() {
    int a;
    return std::reference_wrapper<int>(a); // Same issue. No compiler warns you about it!
}

After applying the rules elaborated by Herb and his crew, the lifetime of a would be deemed as ending at the end of foo, and the compiler would yield a strong warning or a plain error. Here std::reference_wrapper is considered a pointer/reference type and will be highly scrutinised by the compiler. If you combine lifetimes and Concepts, your compiler or linter may be able to discover the pointer types automagically!

Another trivial yet nasty bug often spawning in your code is the dreaded "use-after-move" situation. Here again, lifetimes would avoid an easy shot in the foot:

my_class my_obj;
my_class another_obj = std::move(my_obj);

my_obj.x->bla = 42; // lifetime warning: using a moved-from obj is seldom a good idea.

All these smart lifetime rules are often based on recommendations that you may find in the C++ Core Guidelines. Having them enforced within your projects is amazing. I am eager to try the clang implementation of it. Later in the day, Matthias Gehre and Gabor Horvath showed us the internals of clang that will support this new feature.

After mesmerizing the crowd with the lifetimes, Herb gave us some updates on the meta-classes, which were mainly some changes in the syntax. While I really appreciate the efforts put into meta-classes, I still doubt that I will enjoy such a feature before I retire (roughly 50 years from now). The lifetimes were much more concrete and fathomable when it comes to my daily C++ life.

[Talk] Better C++ using Machine Learning on Large Projects - Nicolas Fleury and Mathieu Nayrolles - 💀 ★:

You can certainly rely on C++ to improve your AI projects, but can you use AI or machine learning to improve your C++ project? The two "cousin-frenchies" Nicolas and Mathieu had the smart idea to detect bugs in pull-requests using some kind of machine learning that analyses previously reported issues in their code-base.

The presentation did not contain much actual C++ code, but was more focused on the process they invented to automatically fetch, analyse and post feedback on any submitted code. I am not an expert on these topics and would not dare to comment on what they presented. It seems that after training, the classifying algorithm they put in place was able to predict with a success rate of 70% whether a piece of code would have a negative impact or not. Their next step would be to add some automatic code correction facilities by applying machine learning to the fixed cases. Triple-A games tend to reuse a lot of variations of the same code across multiple titles, WITHOUT actually sharing it (new games are just cloned from old ones). With this process in place, the projects spread awareness of some issues very easily. It seems like a huge time saver.

In any case, it was a breeze to attend a slightly less C++-oriented talk. There were a lot of questions regarding the human aspect of that technology. Is a 70% success rate high enough not to piss off the users experimenting with the bot? My experience is that a lot of false positives in a linter will invariably make people turn it off at the earliest opportunity... Would you be able to spot the bad programmers in your team with such a tool? Thankfully, the labor rights in Canada (Montréal) should protect the employees on that topic... And many other interesting facts that you can discover in the video.

[Talk] Class Template Argument Deduction for Everyone - Stephan T. Lavavej - 💀💀 ★★:

Class Template Argument Deduction, also known as CTAD, is a tiny new feature added in C++17. While not an amazing game changer, CTAD can be seen as some very tasty syntactic sugar that saves you from specifying the template arguments of a class template when instantiating it. In other, simpler words: it can spare you a call to a make_xxx function when there is enough information for the compiler to deduce the template parameters of a class template. Here is what the CTAD lipstick looks like on std::pair:

// Before C++17:
std::pair<int, const char*> p1 = {42, "test"};
auto p2 = std::make_pair(42, "test");

// After C++17, with CTAD:
std::pair p3 = {42, "test"}; // No need to specify the template arguments "int" and "const char*".
auto p4 = std::pair(42, "test"); // Another way to construct following the always auto rule.

In many instances, you do not have to update your class template to benefit from CTAD. But when you do, you must understand how to help the compiler using some deduction guides. STL (as Stephan T. Lavavej is also known) dedicated, and still dedicates, his life to maintaining the STL (the Standard Template Library) implementation for MSVC. Stephan had first-hand experience with deduction guides when adding CTAD to the standard containers in the STL, and wanted to explain the gist of it.

Deduction guides are "pseudo-constructors" declared outside the targeted class, that are evaluated right before the steps of template parameter deduction, substitution and all the subsequent mess. When instantiating a given class, all the deduction guides follow the overload resolution and template argument deduction rules you would expect if they were applied to normal functions. The return type of the chosen deduction guide will be the one used by the following steps. Now this sounds very wordy, but it is actually fairly trivial to write:

template <class T> 
struct foo          // A trivial class template.    
{
    foo(T&& t) {}   // Do something with t...
};

template <class T>
foo(T t) -> foo<T&>; // Declare a deduction guide: given a type T, I will help to deduce a foo with a first parameter being a reference to this type T. 

int a = 42;
auto bar = foo(a); // Bar will be foo<int&> thanks to the CTAD and this deduction guide.

I chose this example as it has two interesting tidbits. First, you will notice that I apply a transformation on T in the return type: the template parameter becomes a T&. It turns out that you can do a lot more in this place: you can invoke some traits or inject a SFINAE expression (Oh my...! I really have to push that idea further). The second unexpected part is that my guide does not have the same signature as my constructor. Indeed, one takes T as an r-value reference, the other takes it by value. That's really fortunate: unlike the make_xxx functions, which would take universal references and decay the arguments explicitly, deduction guides can rely on the automatic decaying of template parameters taken by value. Stephan has a lot more nitty-gritty details on how deduction guides behave and it would take a full post to explain some of them, so just watch his talk instead!

[Talk] The Bits Between the Bits: How We Get to main() - Matt Godbolt - 💀💀 ★★★:

Matt Godbolt, author of the amazing website godbolt.org, has a very pedagogic mindset and is a naturally gifted speaker (if you ever wanted to understand the Meltdown Attack, you should make a detour to his YouTube video and come back to this blog post afterwards). This time Matt wanted us to have a closer look at the hidden steps from the linking stage of your build until the beginning of the execution of your main entry point. More precisely, how your linker selects the symbols that will appear in your application, and which mechanisms allow the creation of global variables right before entering main.

Being didactic as usual, Matt did some live debugging sessions, using GDB, objdump, readelf and such, to show us how things work under the hood. He covered the sections you can find within an application (.data, .text, ...), the One Definition Rule (ODR), the linker's gathering of information, the mysterious __static_initialization_and_destruction_0 and its associates... His approach of solving one problem at a time using simple tools makes it very easy to fully comprehend what is going on in there.

I made two discoveries during that one hour debugging session:

  • LD, the GNU linker (and very likely clang's one too) uses a scripting language to define the rules for each section of your binary. I wish to never have to dabble in this language for work purpose.
  • ld.so, the GNU dynamic linker, reacts to an environment variable called LD_DEBUG. By setting this variable to all (or something more precise), the dynamic linker will output all operations and some extra info when loading a dynamic library. It is very convenient if you want to know which libraries get loaded by your process, which symbols it will use, etc. Here is what the output would look like if your process somehow fetches getenv:
     15257: binding file wc [0] to /lib/x86_64-linux-gnu/libc.so.6 [0]: normal symbol `getenv' [GLIBC_2.2.5]
     15257: symbol=abort;  lookup in file=wc [0]
     15257: symbol=abort;  lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]

Here we learn that getenv resides within libc.so.6, which is the C runtime library.

[Other]:

Sadly, I would need to (std::)allocate myself a lot more time to be able to cover all the goodies you can get from this CppCon. I watched many different talks and heard great things about some others. Here is what I would consider worth googling:

  • Spectre: Secrets, Side-Channels, Sandboxes, and Security - by Chandler Carruth
  • The Nightmare of Initialization in C++ - by Nicolai Josuttis
  • The Salami Method for Cross Platform Development - by Adi Shavit
  • Compile Time Regular Expressions - by Hana Dusíková
  • And many more...

Conclusion:

The week I spent fully immersed in C++ was really special! A bunch of passionate and dedicated people can build really great events. Hopefully, I will be able to join in Denver next year or the year after, maybe as a speaker this time (I should not stay a simple lurker forever ;))! If not, be sure to find me at some other C++ conferences in Europe (Meeting C++, C++ On Sea...) or some local meetup groups; this community is truly amazing.


Meta Crush Saga: a C++17 compile-time game

Posted on Sat 19 May 2018 in C++ • Tagged with C++17, TMP, meta programming, constexpr

Trivia:

As a quest to obtain the highly coveted title of Lead Senior C++ Over-Engineer, I decided last year to rewrite the game I work on during the day (Candy Crush Saga) using the quintessence of modern C++ (C++17). And... thus was born Meta Crush Saga: a compile-time game. I was highly inspired by Matt Bernier's Nibbler game, which used pure template meta-programming to recreate the famous snake game we played on our Nokia 3310s back in the day.

"What the hell heck is a compile-time game?", "How does it looks like?", "What C++17 features did you use in this project?", "What was your learnings?" might come to your mind. To answer these questions you can either read the rest of this post or accept your inner laziness and watch the video version of this post, which is a talk I made during a Meetup event in Stockholm:

Disclaimer: for the sake of your sanity and the fact that errare humanum est, this article might contain some alternative facts.

A compile-time game you said?

I believe that it's easier to understand what I mean by the "concept" of a compile-time game if you compare the life cycle of such a game with the life cycle of a normal game. So here it is!

Life cycle of a normal game:

As a normal game developer with a normal life, working at a normal job with a normal sanity level, you would usually start by writing your game logic using your favorite language (C++ of course!), then fire up your compiler to transform this, far too often spaghetti-like, logic into an executable. As soon as you double-click on your executable (or use the console), a process will be spawned by your operating system. This process will execute your game logic, which 99.42% of the time consists of a game loop. A game loop will update the state of your game according to some rules and the user inputs, render the newly computed state of your game in some flashy pixels, again, again and again.

Life cycle of a normal game

Life cycle of a compile-time game:

As an over-engineer cooking the next big compile-time game, you will still use your favorite language (still C++ of course!) to write your game logic. You will still have a compilation phase, but... then... here comes the plot twist: you will execute your game logic within this compilation step. One could call it a compilutation. This is where C++ truly comes in handy; it has some features like Template Meta-Programming (TMP) or constexpr to actually have computations happen during the compilation phase. We will dive later into the features you can use to do so. As we are executing the logic of the game during this phase, we must also inject the user inputs at that point in time. Obviously, our compiler will still output an executable. What could it be used for? Well, the executable will not contain any game loop anymore, but it will have a very simple mission: outputting the newly computed state. Let's name this executable a renderer and its output the rendering. Our rendering won't contain any fancy particle effects nor ambient-occlusion shadows; it will be in ASCII. An ASCII rendering of your newly computed state has the nice property that you can easily show it to your player, but also copy it into a text file. Why a text file? Obviously because you can combine it with your code in some way, redo all the previous steps and therefore have a loop.

As you may understand now, a compile-time game is made of a game-loop where each frame of your game is a compilation step. Each compilation step is computing a new state of your game, that you can present to your player and also inject to the following frame / compilation step.

I let you contemplate this magnificent diagram for as much time as it takes you to understand what I just wrote above: Life cycle of a compile-time game

Before we move on the implementation details of such a loop, I am sure that you have one question you would like to shoot at me...

"Why would you even do that?"

Do you really think I would let you break my C++ meta-programming idyll with such a fundamental question? Never!

  • First and foremost, a compile-time game will have amazing runtime performance since most of the computations are done during the compilation phase. Runtime performance is key to the success of your AAA game in ASCII art!
  • You lessen the probability that a wild crustacean appears in your github repository and asks you to rewrite your game in Rust. Its well-prepared speech on security will fall apart as soon as you explain that a dangling pointer cannot exist at compile-time. Smug Haskell programmers might even approve the type-safety of your code.
  • You will gain respect from the Javascript hipster kingdom, where any over-complicated framework with a strong NIH syndrome can reign as long as it has a catchy name.
  • One of my friends used to say that any line of code from a Perl program provides you, de facto, with a very strong password. I surely bet that he never tried generating credentials from compile-time C++.

So what? Aren't you satisfied with my answers? Then, maybe your question should have been: "Why could you even do that?".

As a matter of fact, I really wanted to play with the features introduced by C++17. Quite a few of them focus on improving the expressiveness of the language as well as the meta-programming facilities (mainly constexpr). Instead of writing small code samples, I thought that it would be more fun to turn all of this into a game. Pet projects are a nice way to learn concepts that you may not get to use for quite some time at work. Being able to run the core logic of your game at compile-time proves once again that templates and constexpr are Turing-complete subsets of the C++ language.

Meta Crush Saga: an overview

A Match-3 game:

Meta Crush Saga is a tile-matching game similar to Bejeweled or Candy Crush Saga. The core of the rules consists in matching three or more tiles of the same pattern to increase your score. Here is a sneak peek of a game state I "dumped" (dumping ASCII is pretty damn easy):

R"(
    Meta crush saga      
------------------------  
|                        | 
| R  B  G  B  B  Y  G  R | 
|                        | 
|                        | 
| Y  Y  G  R  B  G  B  R | 
|                        | 
|                        | 
| R  B  Y  R  G  R  Y  G | 
|                        | 
|                        | 
| R  Y  B  Y (R) Y  G  Y | 
|                        | 
|                        | 
| B  G  Y  R  Y  G  G  R | 
|                        | 
|                        | 
| R  Y  B  G  Y  B  B  G | 
|                        | 
------------------------  
> score: 9009
> moves: 27
)"

The gameplay of this specific Match-3 is not so interesting in itself, but what about the architecture running all of this? To understand it, I will try to explain each part of the life cycle of this compile-time game in terms of code.

Injecting the game state:

As a C++ aficionado or a nitpicker, you may have noticed that my previous dumped game state started with the following pattern: R"(. This is indeed a C++11 raw string literal, meaning that I do not have to escape special characters like line feeds. This raw string literal is stored in a file called current_state.txt.

How do we inject this current game state into a compilation step? Let's just include it into the loop inputs!

// loop_inputs.hpp

constexpr KeyboardInput keyboard_input = KeyboardInput::KEYBOARD_INPUT; // Get the current keyboard input as a macro

constexpr auto get_game_state_string = []() constexpr
{
    auto game_state_string = constexpr_string(
        // Include the raw string literal into a variable
        #include "current_state.txt"
    );
    return game_state_string;
};

Whether it is a .txt file or a .h file, the #include directive of the C preprocessor works exactly the same: it copies the content of the file at its location. Here I am copying the ascii-game-state-raw-string-literal into a variable named game_state_string.

Note that this header file loop_inputs.hpp also exposes the keyboard inputs for the current frame / compilation. Unlike the game state, the keyboard state is fairly small and can be easily received as a preprocessor definition.

Compile time computation of the new state:

Now that we have gathered enough data, we can compute a new state. And finally, we reach the point where we have to write a main.cpp file:

// main.cpp
#include "loop_inputs.hpp" // Get all the data necessary for our computation.

// Start: compile-time computation.

constexpr auto current_state = parse_game_state(get_game_state_string); // Parse the game state into a convenient object.

constexpr auto new_state = game_engine(current_state) // Feed the engine with the parsed state,
    .update(keyboard_input);                          // Update the engine to obtain a new state.


constexpr auto array = print_game_state(new_state); // Convert the new state into a std::array<char> representation.

// End: compile-time computation.

// Runtime: just render the state.
for (const char& c : array) {  std::cout << c; }

Strangely, this C++ code does not look too convoluted for what it does. Most of this code is run during the compilation phase and yet follows traditional OOP and procedural paradigms. Only the rendering, the last line, is an impediment to a pure compile-time computation. By sprinkling a bit of constexpr where it belongs, you can have pretty elegant meta-programming in C++17, as we will see later on. I find fascinating the freedom C++ gives us when it comes to mixing runtime and compile-time execution.

You will also notice that this code only executes one frame; there is no game loop in here. Let's solve that issue!

Gluing things together:

If you are repulsed by my C++ tricks, I hope you will not mind contemplating my Bash skills. Indeed, my game loop is nothing more than a bash script repeatedly executing some compilations.

# It is a loop! No wait, it is a game loop!!!
while :; do
    # Start a compilation step using G++
    g++ -o renderer main.cpp -DKEYBOARD_INPUT="$keypressed"

    keypressed=$(get_key_pressed)

    # Clean the screen.
    clear

    # Obtain the rendering
    current_state=$(./renderer)
    echo $current_state # Show the rendering to the player

    # Place the rendering into a current_state.txt file and wrap it into a raw string literal.
    echo "R\"(" > current_state.txt
    echo $current_state >> current_state.txt
    echo ")\"" >> current_state.txt
done

I actually struggled a bit to get the keyboard inputs from the console. I initially wanted to receive the inputs in parallel of the compilation. After lots of trial and error, I got something working-ish using the read Bash command. I would never dare to duel a Bash wizard; that language is way too maleficent!

Now let's agree, I had to resort to another language to handle my game loop. Although, technically, nothing would prevent me from writing that part of the code in C++. It also does not cancel the fact that 90% of the logic of my game is done within this g++ compilation command, which is pretty awesome!

A bit of gameplay to soften your eyes:

Now that you have suffered your way into my explanations on the game's architecture, here comes a bit of eye candy:

Meta Crush Saga in action

This pixelated gif is a record of me playing Meta Crush Saga. As you can see, the game runs smoothly enough to make it playable in real-time. It is clearly not attractive enough to be able to stream it on Twitch and for me to become the new Pewdiepie, but hey... it works! One of the funny aspects of having a game state stored as a .txt is the ability to cheat or test edge-cases really easily.

Now that I have sketched the architecture, we will dive a bit more into the C++17 features used within that project. I will not focus on the game logic, as it is very specific to a Match-3, but will instead discuss subjects of C++ that could be applied in other projects too.

My C++17 learnings:

Unlike C++14 which mainly contained minor fixes, the new C++17 standard has a lot to offer. There were hopes that some long-overdue features would land this time (modules, coroutines, concepts...) and... well... they did not ; which disappointed quite a few of us. But after the mourning, we discovered a myriad of small unexpected gems that made their way through.

I would dare to say that all the meta-programming kids were spoiled this year! A few minor tweaks and additions to the language now permit you to write code that looks very similar whether it runs during compilation or afterwards at runtime.

Constexpr all the things:

As Ben Deane and Jason Turner foretold in their C++14 talk, C++ is quickly improving value-computations at compile-time using the almighty constexpr keyword. By placing this keyword in the appropriate places you can hint to your compiler that an expression is constant and could be directly evaluated at compile-time. In C++11 you could already write such code:

constexpr int factorial(int n) // Combining a function with constexpr makes it potentially evaluable at compile-time.
{
    return n <= 1? 1 : (n * factorial(n - 1));
}

int i = factorial(5); // Call to a constexpr function.
// Can be replaced by a good compiler with:
// int i = 120;
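A quick way to convince yourself that the evaluation really happens during compilation is to feed the result to a static_assert, which only accepts constant expressions (factorial is repeated here so the snippet stands alone):

```cpp
constexpr int factorial(int n)
{
    return n <= 1 ? 1 : (n * factorial(n - 1));
}

// If this compiles, the compiler itself computed the value.
static_assert(factorial(5) == 120, "evaluated during compilation, not at runtime");
```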

While powerful, constexpr had quite a lot of restrictions on its usage and made it cumbersome to write expressive code in this way. C++14 relaxed constexpr a lot, and it felt much more natural to use. Our previous factorial function could be rewritten this way:

constexpr int factorial(int n)
{
    if (n <= 1) {
        return 1;
    }

    return n * factorial(n - 1);
}

Indeed, C++14 lifted the rule stipulating that a constexpr function must only consist of one return statement, which forced us to use the ternary operator as a basic building block. Now C++17 brought even more placements for the constexpr keyword that we can explore!

Compile-time branching:

Did you ever end up in a situation where you wished you could have different behavior according to the template parameter you are manipulating? Let's say that you wanted a generic serialize function that would call .serialize() if your object provides one, and otherwise fall back on calling to_string on it. As explained in more detail in this post about SFINAE, you would very likely write such lovely alien code:

template <class T>
std::enable_if_t<has_serialize_v<T>, std::string> 
serialize(const T& obj) {
    return obj.serialize();
}

template <class T>
std::enable_if_t<!has_serialize_v<T>, std::string> 
serialize(const T& obj) {
    return std::to_string(obj);
}

In your dreams you may be able to rewrite that awkward SFINAE trick into such a majestic piece of code in C++14:

// has_serialize is a constexpr function that tests the presence of serialize on an object.
// See my post on SFINAE to understand how to write such a function. 
template <class T>
constexpr bool has_serialize(const T& /*t*/);

template <class T>
std::string serialize(const T& obj) { // We know that constexpr can be placed before functions.
    if (has_serialize(obj)) {
        return obj.serialize();
    } else {
        return std::to_string(obj);
    }
}

Sadly, as soon as you wake up and start writing C++14 for real, your compiler will spit out an unpleasant message regarding the call serialize(42);. It will explain that the object obj of type int does not have a serialize() member function. As much as you hate it, your compiler is right! Given the current code, it will always try to compile both of the branches return obj.serialize(); and return std::to_string(obj);. For an int, the branch return obj.serialize(); might well be dead code since has_serialize(obj) will always return false, but your compiler will still need to compile it.

As you may expect, C++17 saves us from such an embarrassing situation by introducing the possibility to add constexpr after an if statement to "force" a compile-time branching and discard the unused statements:

// has_serialize...
// ...

template <class T>
std::string serialize(const T& obj) {
    if constexpr (has_serialize(obj)) { // Now we can place constexpr on the 'if' directly.
        return obj.serialize(); // This branch will be discarded and therefore not compiled if obj is an int.
    } else {
        return std::to_string(obj);
    }
}

This is clearly a huge improvement compared to the SFINAE trick we had to employ until now. After that, you will start to get the same addiction as Ben and Jason, which consists in constexpr-ing everything, everywhere, at any time. Alas, there is still one place where the constexpr keyword would fit in well but cannot go yet: constexpr parameters.
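For reference, here is a self-contained sketch of this serialize. Since has_serialize as a constexpr function is not defined here, I substitute a classic void_t-based detection trait; the Point type is purely illustrative:

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Illustrative type providing its own serialize().
struct Point {
    int x, y;
    std::string serialize() const {
        return "(" + std::to_string(x) + ", " + std::to_string(y) + ")";
    }
};

// Hand-rolled detection trait: does T have a serialize() member?
template <class T, class = void>
struct has_serialize : std::false_type {};
template <class T>
struct has_serialize<T, std::void_t<decltype(std::declval<const T&>().serialize())>>
    : std::true_type {};

template <class T>
std::string serialize(const T& obj) {
    if constexpr (has_serialize<T>::value) {
        return obj.serialize();     // Discarded when T has no serialize().
    } else {
        return std::to_string(obj); // Discarded when it does.
    }
}
```

Both calls now compile, and each instantiation only ever contains the branch that makes sense for its type.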

Constexpr parameters:

If you are assiduous, you may have noticed a strange pattern in one of my previous code samples. I am talking about the loop inputs:

// loop_inputs.hpp

constexpr auto get_game_state_string = []() constexpr // Why?
{
    auto game_state_string = constexpr_string(
        // Include the raw string literal into a variable
        #include "current_state.txt"
    );
    return game_state_string;
};

Why is the variable game_state_string encapsulated into a constexpr lambda? Why not make it a constexpr global variable?

Well, I wanted to pass this variable and its content deep down to some functions. For instance, my parse_board needed to be fed with it and used it in some constant expressions:

constexpr int parse_board_size(const char* game_state_string);

constexpr auto parse_board(const char* game_state_string)
{
    std::array<GemType, parse_board_size(game_state_string)> board{};
    //                                       ^ ‘game_state_string’ is not a constant expression
    // ...  
}

parse_board(...something...);

If you are doing it this way, your grumpy compiler will complain that the parameter game_state_string is not a constant expression. When I am building my array of Gems, I need to compute its fixed capacity directly (you cannot use vectors at compile-time as they require allocations) and pass it as a value template argument to std::array. The expression parse_board_size(game_state_string) therefore needs to be a constant expression. While parse_board_size is clearly marked as constexpr, game_state_string is not AND cannot be! Two rules are annoying us in this case:

  • Arguments of a constexpr function are not constexpr!
  • And you cannot add constexpr in front of them!

It boils down to the fact that constexpr functions MUST be usable for both runtime and compile-time computations. Allowing constexpr parameters would discard the possibility of using them at runtime.

Thankfully, there is a way to mitigate that issue. Instead of accepting the value as a normal function parameter, you can encapsulate that value into a type and pass that type as a template parameter:

template <class GameStringType>
constexpr auto parse_board(GameStringType&&) {
    std::array<CellType, parse_board_size(GameStringType::value())> board{};
    // ...
}

struct GameString {
    static constexpr auto value() { return "...something..."; }
};

parse_board(GameString{});

In this code sample, I am creating a struct type GameString which has a static constexpr member function value() that returns the string literal I want to pass to parse_board. In parse_board I receive this type through the template parameter GameStringType thanks to template argument deduction rules. Having GameStringType, I can simply call the static member function value() whenever I want to get the string literal, even in locations where constant expressions are necessary since value() is constexpr.
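Here is a fully self-contained sketch of the same trick, with a hypothetical board_size standing in for parse_board_size and a made-up "RGBY" state:

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for parse_board_size: count the characters of the state.
constexpr std::size_t board_size(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') { ++n; }
    return n;
}

template <class GameStringType>
constexpr auto parse_board(GameStringType) {
    // value() is static and constexpr: usable where a constant expression is required.
    std::array<char, board_size(GameStringType::value())> board{};
    return board;
}

struct GameString {
    static constexpr const char* value() { return "RGBY"; }
};

constexpr auto board = parse_board(GameString{});
static_assert(board.size() == 4, "capacity computed during compilation");
```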

We succeeded in encapsulating our literal and somehow passing it to parse_board in a constexpr way. Now, it gets very annoying to have to define a new type every time you want to send a new literal to parse_board: "...something1...", "...something2...". To solve that issue in C++11, you can rely on some ugly macros and a few indirections using an anonymous union and a lambda. Michael Park has a nice explanation on this topic in one of his posts.

We can do even better in C++17. If you list our current requirements to pass our string literal, we need:

  • A generated function
  • Which is constexpr
  • With a unique or anonymous name

These requirements should ring a bell. What we need is a constexpr lambda! And C++17 rightfully added the possibility to use the constexpr keyword on a lambda. We could rewrite the code sample in such a way:

template <class LambdaType>
constexpr auto parse_board(LambdaType&& get_game_state_string) {
    std::array<CellType, parse_board_size(get_game_state_string())> board{};
    //                                       ^ Allowed to call a constexpr lambda in this context.
}

parse_board([]() constexpr { return ...something...; });
//                ^ Make our lambda constexpr.

Believe me, this already feels much neater than the previous C++11 hackery using macros. I discovered this awesome trick thanks to Björn Fahller, a member of the C++ meetup group I participate in. You can read more about this trick on his blog. Note also that the constexpr keyword is actually not necessary in this case: all the lambdas with the capacity to be constexpr will be constexpr by default. Having an explicit constexpr in the signature just makes it easier to catch mistakes.
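A minimal, compilable sketch of this pattern (with a made-up make_empty_board instead of parse_board): taking the stateless lambda by value lets us call it where a constant expression is required:

```cpp
#include <array>
#include <cstddef>

template <class SizeProvider>
constexpr auto make_empty_board(SizeProvider size_provider) {
    // The lambda is stateless and its call operator is constexpr,
    // so this call is usable as a template argument.
    std::array<char, size_provider()> board{};
    return board;
}

constexpr auto board = make_empty_board([]() constexpr { return std::size_t{8}; });
static_assert(board.size() == 8, "the lambda's value reached a template argument");
```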

Now you should understand why I was forced to use a constexpr lambda to pass down the string representing my game state. Have a look at this lambda and one question should arise again. What is this constexpr_string type I also use to wrap the string literal?

constexpr_string and constexpr_string_view:

When you are dealing with strings, you do not want to deal with them the C way. All these pesky algorithms iterating in a raw manner and checking for null termination should be forbidden! The alternative offered by C++ is the almighty std::string plus the STL algorithms. Sadly, std::string may have to allocate on the heap (even with Small String Optimization) to store its content. One or two standards from now, we may benefit from constexpr new/delete or from being able to pass constexpr allocators to std::string, but for now we have to find another solution.

My approach was to write a constexpr_string class which has a fixed capacity. This capacity is passed as a value template parameter. Here is a short overview of my class:

template <std::size_t N> // N is the capacity of my string.
class constexpr_string {
private:
    std::array<char, N> data_; // Reserve N chars to store anything.  
    std::size_t size_;         // The actual size of the string.
public:
    constexpr constexpr_string(const char(&a)[N]): data_{}, size_(N - 1) { /* copy a into data_ */ }
    // ...
    constexpr iterator begin() {  return data_.data();   }         // Points at the beginning of the storage.
    constexpr iterator end() {  return data_.data() + size_;   }   // Points at the end of the stored string.
    // ...
};

My constexpr_string class tries to mimic as closely as possible the interface of std::string (for the operations that I needed): you can query the begin and end iterators, retrieve the size, get access to data, erase part of it, get a substring using substr, etc. It makes it very straightforward to transform a piece of code from std::string to constexpr_string. You may wonder what happens when you want to use operations that would normally require an allocation in std::string. In such cases, I was forced to transform them into immutable operations that would create a new instance of constexpr_string.

Let's have a look at the append operation:

template <std::size_t N> // N is the capacity of my string.
class constexpr_string {
    // ...
    template <std::size_t M> // M the capacity of the other string.
    constexpr auto append(const constexpr_string<M>& other)
    {

        constexpr_string<N + M> output(*this, size() + other.size());
        //                 ^ Enough capacity for both. ^ Copy the first string into the output.

        for (std::size_t i = 0; i < other.size(); ++i) {
            output[size() + i] = other[i]; // Copy the second string into the output.
        }

        return output; 
    }
    // ...
};

No need to have a Fields Medal to assume that if we have a string of size N and a string of size M, a string of size N + M should be enough to store the concatenation. You may waste a bit of "compile-time storage" since both of your strings may not use their full capacity, but that is a fairly small price to pay for a lot of convenience. I, obviously, also wrote the counterpart of std::string_view, which I named constexpr_string_view.
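To make the idea concrete, here is a compact, self-contained stand-in for constexpr_string that only implements what append() needs; the real class is richer, but the capacity arithmetic is the same:

```cpp
#include <array>
#include <cstddef>

template <std::size_t N>
class constexpr_string {
    template <std::size_t M> friend class constexpr_string; // Access across capacities.
    std::array<char, N> data_{};
    std::size_t size_ = 0;

public:
    constexpr constexpr_string() = default;
    constexpr constexpr_string(const char (&a)[N]) : size_(N - 1) {
        for (std::size_t i = 0; i < N; ++i) { data_[i] = a[i]; } // Copy, '\0' included.
    }

    constexpr std::size_t size() const { return size_; }
    constexpr char operator[](std::size_t i) const { return data_[i]; }

    template <std::size_t M>
    constexpr auto append(const constexpr_string<M>& other) const {
        constexpr_string<N + M> out{}; // N + M is always a sufficient capacity.
        for (std::size_t i = 0; i < size_; ++i) { out.data_[i] = data_[i]; }
        for (std::size_t i = 0; i < other.size_; ++i) { out.data_[size_ + i] = other.data_[i]; }
        out.size_ = size_ + other.size_;
        return out;
    }
};

constexpr constexpr_string first("meta ");
constexpr auto full = first.append(constexpr_string("crush"));
static_assert(full.size() == 10, "concatenated during compilation");
static_assert(full[5] == 'c', "second string copied right after the first");
```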

Having these two classes, I was ready to write elegant code to parse my game state. Think about something like that:

constexpr auto game_state = constexpr_string(...something...);

// Let's find the first blue gem occurrence within my string:
constexpr auto blue_gem = find_if(game_state.begin(), game_state.end(), 
    [](char c) constexpr { return c == 'B'; }
);

It was fairly simple to iterate over the gems on my board - speaking of which, did you notice another C++17 gem in that code sample?

Yes! I did not have to explicitly specify the capacity of my constexpr_string when constructing it. In the past, you had to explicitly specify the arguments of a class template when using it. To avoid this pain, we would provide make_xxx functions, since parameters of function templates could be deduced. Have a look at how class template argument deduction is changing our life for the better:

template <int N>
struct constexpr_string {
    constexpr_string(const char(&a)[N]) {}
    // ..
};

// **** Pre C++17 ****
template <int N>
constexpr_string<N> make_constexpr_string(const char(&a)[N]) {
    // Provide a function template to deduce N           ^ right here
    return constexpr_string<N>(a);
    //                      ^ Forward the parameter to the class template.
}

auto test2 = make_constexpr_string("blablabla");
//                  ^ use our function template for the deduction.
constexpr_string<7> test("blabla");
//               ^ or feed the argument directly and pray that it was the right one.


// **** With C++17 ****
constexpr_string test("blabla");
//           ^ Really smooth to use, the argument is deduced.

In some tricky situations, you may still need to help your compiler deduce your arguments correctly. If you encounter such an issue, have a look at user-defined deduction guides.
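For illustration, here is what such a user-defined deduction guide can look like, on a minimal, hypothetical fixed_string type (for this array constructor the implicit guide would actually suffice; the explicit one just makes the deduction visible):

```cpp
#include <cstddef>

template <std::size_t N>
struct fixed_string {
    char data[N] = {};
    constexpr fixed_string(const char (&a)[N]) {
        for (std::size_t i = 0; i < N; ++i) { data[i] = a[i]; }
    }
};

// Explicit deduction guide: deduce the capacity N from the array length.
template <std::size_t N>
fixed_string(const char (&)[N]) -> fixed_string<N>;

constexpr fixed_string s("blabla"); // N is deduced as 7 ('\0' included).
static_assert(sizeof(s.data) == 7, "deduced from the literal's length");
```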

Free food from the STL:

Alright, you can always rewrite things by yourself. But did the committee members kindly cooked something for us in the standard library?

New utility types:

C++17 introduced std::variant and std::optional to the common vocabulary types, with constexpr in mind. While the former is really interesting since it permits you to express type-safe unions, the implementation provided by the libstdc++ library shipped with GCC 7.2 had issues when used in constant expressions. Therefore, I gave up the idea of introducing std::variant in my code and solely utilised std::optional.

Given a type T, std::optional allows you to create a new type std::optional<T> which either holds a value of type T or nothing. It is pretty similar to nullable value types in C#. Let's consider a find_in_board function that returns the position of the first item in the board that validates a predicate. You may not have such an item in the board at all. To handle such a situation, the position type must be optional:

template <class Predicate>
constexpr std::optional<std::pair<int, int>> find_in_board(GameBoard&& g, Predicate&& p) {
    for (auto item : g.items()) {
        if (p(item)) { return std::make_pair(item.x, item.y); } // Return by value if we find such an item.
    }
    return std::nullopt; // Return the empty state otherwise.
}

auto item = find_in_board(g, [](const auto& item) { return true; });
if (item) {  // Test if the optional is empty or not.
    do_something(*item); // Can safely use the optional by 'dereferencing' with '*'.
    /* ... */
}

Previously, you would have had recourse to pointer semantics, or to adding an "empty state" directly into your position type, or to returning a boolean and taking an out parameter. Let's face it, it was pretty clumsy!

Some already existing types also received a constexpr lifting: tuple and pair. I will not explain their usage as a lot has already been written on these two, but I will share one of my disappointments with you. The committee added to the standard some syntactic sugar to extract the values held by a tuple or pair. Called structured bindings, this new kind of declaration uses square brackets to define the variables in which you would like to store the exploded tuple or pair:

std::pair<int, int> foo() {
    return {42, 1337};
}

auto [x, y] = foo();
// x = 42 and y = 1337.

Really clever! It is just a pity that the committee members [could not, would not, did not have the time, forgot, enjoyed not] to make it constexpr friendly. I would have expected something along the lines of:

constexpr auto [x, y] = foo(); // OR
auto [x, y] constexpr = foo();

We now have fancy containers and utilities, how can we manipulate them easily?

Algorithms:

Upgrading a container to handle constexpr is rather tedious work. Comparatively, bringing constexpr to non-modifying algorithms seems rather straightforward. But strangely enough, C++17 did not see any progress in that domain; it is actually coming in C++20. For instance, the supreme std::find algorithm did not receive its constexpr signature.

Worry not (or "Qu'à cela ne tienne" in French)! As explained by Ben and Jason, you can easily turn an algorithm into a constexpr one by simply copying a current implementation (watch out for copyright though); cppreference being a good fit. Ladies and gentlemen, I present you a constexpr std::find:

template<class InputIt, class T>
constexpr InputIt find(InputIt first, InputIt last, const T& value)
// ^ TADAMMMM!!! I added constexpr here.
{
    for (; first != last; ++first) {
        if (*first == value) {
            return first;
        }
    }
    return last;
}

// Thanks to: http://en.cppreference.com/w/cpp/algorithm/find

I can already hear optimisation aficionados screaming in their chairs! Yes, just adding constexpr in front of a code sample gently provided by cppreference may not give you the best runtime performance. But if we really had to polish this algorithm, it would be for compile-time performance. Keeping things simple usually gives the best results when it comes to speed of compilation, from what I have observed.
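As a sanity check, here is such a constexpr find locating a gem at compile time (the algorithm is repeated so the snippet stands alone, and the board row is made up):

```cpp
// The constexpr find from above, repeated so this check stands alone.
template <class InputIt, class T>
constexpr InputIt find(InputIt first, InputIt last, const T& value) {
    for (; first != last; ++first) {
        if (*first == value) { return first; }
    }
    return last;
}

// Locate the first 'R' gem of a (made-up) board row, entirely at compile time.
constexpr char row[] = {'Y', 'B', 'R', 'G'};
constexpr const char* it = find(row, row + 4, 'R');
static_assert(it - row == 2, "found during compilation");
```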

Performance & bugs:

Every triple-A game must put effort into these topics, right?

Performance:

When I achieved a first half-working-ish version of Meta Crush Saga, things ran rather smoothly. It actually reached a bit more than 3 FPS (Frames Per Second) on my old laptop with an i5 clocked at 1.80GHz (frequency does matter in this case). As in any project, I quickly found my previously written code unsavoury and started to rewrite the parsing of my game state using constexpr_string and standard algorithms. While it made my code much more maintainable, it also had a severe impact on performance; 0.5 FPS was the new ceiling.

Contrary to the old C++ adage, "zero-overhead abstractions" did not apply to my compile-time computations. This really makes sense if you see your compiler as an interpreter of some "compile-time code". There is still room for improvement in the various compilers, but also for us, writers of such code. Here is a non-exhaustive list of a few observations and tips, maybe specific to GCC, that I figured out:

  • C arrays performed significantly better than std::array. std::array is a bit of modern C++ cosmetic over a C-style array and one must pay a certain price to use it in such circumstances.
  • It felt like recursive functions had the advantage (speed-wise) over functions written with loops. It could well be that writing recursive algorithms forces you to tackle your problems in another way, which behaves better. To put in my two pennies' worth, I believe that the cost of compile-time calls might be smaller than executing a complicated function body, especially since compilers (and their implementors) have been exposed to decades of the abusive recursion we used for our own template-metaprogramming ends.
  • Copying data around is also quite expensive, notably if you are dealing with value types. If I wanted to further optimise my game, I would mainly focus on that problem.
  • I only abused one of my CPU cores to do the job. Having only one compilation unit restricted me to spawning only one instance of GCC at a time. I am not quite sure whether you could parallelise my compilutation.

Bugs:

More than once, my compiler regurgitated terrible compilation errors, my code logic being flawed. But how could I find where the bug was located? Without a debugger or printf, things get complicated. If your metaphoric programming beard does not reach down to your knees (neither my metaphoric nor my physical one met the expectations), you may not have the motivation to use templight nor to debug your compiler.

Your first friend will be static_assert, which gives you the possibility to test the value of a compile-time boolean. Your second friend will be a macro to turn on and off constexpr wherever possible:

#define CONSTEXPR constexpr  // Compile-time: no debugging possible

// OR

#define CONSTEXPR           // Debug at runtime

By using this macro, you can force the logic to execute at runtime and therefore attach a debugger to it.
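A small sketch of how such a macro can be wired (the DEBUG_AT_RUNTIME switch and the scoring rule are made up for illustration):

```cpp
// Flip the macro and the exact same logic becomes debuggable at runtime.
#ifdef DEBUG_AT_RUNTIME
#define CONSTEXPR            // Runtime: breakpoints and printf now work.
#else
#define CONSTEXPR constexpr  // Compile-time: no debugging possible.
#endif

CONSTEXPR int score_of_match(int matched_gems) {
    return matched_gems * 100; // Made-up scoring rule, for illustration.
}

#ifndef DEBUG_AT_RUNTIME
static_assert(score_of_match(3) == 300, "still a constant expression");
#endif
```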

Meta Crush Saga II - Looking for a pure compile-time experience:

Clearly, Meta Crush Saga will not win The Game Awards this year. It has some great potential, but the experience is not fully compile-time YET, which may be a showstopper for hardcore gamers... I cannot get rid of the bash script, unless someone adds keyboard inputs and impure logic to the compilation phase (pure madness!). But I believe that, one day, I could entirely bypass the renderer executable and print my game state at compile-time:

Life cycle of a fully compile-time game

A crazy fellow going by the pseudonym saarraz extended GCC to add a static_print statement to the language. This statement takes a few constant expressions or string literals and outputs them during the compilation. I would be glad if such a tool were added to the standard, or at least if static_assert were extended to accept constant expressions.

Meanwhile, there might be a C++17 way to obtain that result. Compilers already output two things: errors and warnings! If we could somehow control or bend a warning to our needs, we might already have a decent output. I tried a few solutions, notably the deprecated attribute:

template <char... words>
struct useless {
    [[deprecated]] void call() {} // Will trigger a warning.
};

template <char... words> void output_as_warning() { useless<words...>().call(); }

output_as_warning<'a', 'b', 'c'>();

// warning: 'void useless<words>::call() [with char ...words = {'a', 'b', 'c'}]' is deprecated 
// [-Wdeprecated-declarations]

While the output is clearly there and parsable, it is unfortunately not playable! If by sheer luck, you are part of the secret-community-of-c++-programmers-that-can-output-things-during-compilation, I would be glad to recruit you in my team to create the perfect Meta Crush Saga II!

Conclusion:

I am done selling you my scam game. Hopefully you found this post entertaining and learned a few things along the way. If you find any mistake or think of any potential improvement, please do not hesitate to reach out to me.

I would like to thank the SwedenCpp team for letting me give a talk on this project during one of their events. And I would particularly like to express my gratitude to Alexandre Gourdeev, who helped me improve Meta Crush Saga in quite a few significant aspects.


Distributed C++ Meetup 0x02

Posted on Sat 07 April 2018 in News • Tagged with C++, Stockholm, London, Berlin, Distributed, Meetup

Here is a quick follow-up on the event I announced in my previous post: the Distributed C++ Meetup 0x02. A quick explanation for those too lazy to click a link or scroll down a bit to read my previous post (not judging you here, I would do the same): the concept of a Distributed C++ Meetup consists of gathering multiple C++ user groups from around the world in one event using video-conference facilities. This time we had the pleasure to bring together the Berlin, London and Stockholm Meetup groups using King's offices.

Distributed C++ Meetup 0x02:

Once again Phil Nash took it upon himself to edit full footage of the event, which you can enjoy right here:

We also have alternative footage of the first two talks, recorded by Harald Achitz:

Being the host of the event makes you easily distracted, and so I was! But thanks to Phil's and Harald's work, I can enjoy a rewatch of the talks in a serene environment.

Harald Achitz gently explained to us the spaceship operator that will arrive in C++20 and what to expect from it. Then came the introduction on Type Safe Flags by Arvid Norberg, which I particularly enjoyed: simple and very practical! Richard Spindler related his experience with Boost.Compute. Finally, Dominic Jones gave a talk on Expression Tree Transform, which reminded me a lot of Reactive Equations by André Bergner that I had the chance to see during Meeting C++ 2017. Although I always enjoy template meta-programming, it was very alien to anyone at the event who wasn't familiar with expression templates. I always admire anyone willing to be on stage to share her/his knowledge, and I appreciate the honor our speakers did us by doing so that evening.

Polarized opinions:

As the tradition goes, we sent a survey after our Meetup event. The feedback we received was substantially more divided than last time. The numbers indicate that a majority of the people are willing to redo such an event: 47.50% really want a re-run, 35.00% liked it as much as a normal event, and 17.50% would have preferred to stick to a normal one. Still, it was clearly less popular than the last event, which scored respectively 82.50%, 15% and 2%. Surprisingly, people changed their minds and would now prefer going back to lightning talks (instead of 30min ones), as suggested by a few feedback comments. As a comparison, you can also read more about the previous event's results on Phil Nash's blog. People also expressed a strong feeling of being less "connected with the remote crowd" this time, which is sad, as that connection is one of our main goals for a Distributed Meetup. It was especially surprising given that, this time, we had questions crossing the virtual borders.

I have been reflecting on what changed in our event's recipe and what could have been done better. Here are my thoughts as well as observations from other co-organizers:

  • The more cities we gather, the less time each location has talks happening physically in front of them. No matter how good the video-conference setup is, anyone would prefer to have it "at home". While as an attendee you will appreciate spending half of the event learning from speakers at another location, two-thirds of the event might be too long. Having more cities also increases the risk of accumulating technical issues. I would suggest coming back to a two-city format.
  • We need to have room for a break in our schedule. Even Supertemplateman would not stand two hours of C++ talks on a cramped bench. Having 15min to make our transitions between countries would be pretty sweet too.
  • Lightning talks may be somewhat frustrating as they barely scratch the surface of a topic. But a 35min talk can also be pretty damn long if you do not like the subject, even more so when projected on a screen. We would like to try 15min talks for the next run of the event. Shorter talks also have a desirable side effect: speakers will tend to write slides that are very expressive. Which brings us to the next issue.
  • I noticed that slides were much harder to read, especially on the weird 4-TV setup we have in the Stockholm office. This was corroborated by the feedback specific to Stockholm. Tiny fonts force you to concentrate to extrapolate information that you cannot obtain with your eyes; it's exhausting and frustrating. We should really advise the speakers to use a minimum font size prior to the event. There is also the possibility to have multiple screens in the Stockholm office, to improve the experience for the people who always sit in our very attractive carousels in the back (every King employee made that mistake once).
  • People don't like the mandatory NDA required to access the London office. I am not sure how this could be solved, as I am not familiar with this office. Be sure that we will explore that topic later on.
  • Finally, as a reminder, never forget to turn the lights on! As much as we are used to darkness in Sweden, it was still a challenge for us to see what was happening in the tenebrous London office.

Conclusion:

I would like to run a third Distributed C++ Meetup with the aforementioned "bug-fixes". I am eager to see whether the concept is actually viable or whether the first run was simply benefiting from the novelty effect. As we are pioneers in these virtual meetups, it is pretty obvious that we will have to use our attendees as guinea pigs to slowly refine our ideas.

Thanks to all the people who helped during this event: Mårten Möller, Vasil Georgiev, Phil Nash, Harald Achitz, Tindy Hellman, Bruno Mikus, our tech and video crew...


Distributed C++ Meetup

Posted on Tue 20 March 2018 in News • Tagged with C++, Stockholm, London, Berlin, Distributed, Meetup

As mentioned in an earlier post, I am an active member of a C++ Meetup group in Stockholm: SwedenCpp. Last year, the main organizer of SwedenCpp and I came up with the idea of gathering different Meetup groups from Europe for a special event. Knowing that we are not as famous as CppCon, and therefore could not easily attract people to visit us in our dark and cold country, we had to come up with a special plan. It happens that my current company, King, has offices of decent size in a few of the major tech hubs of Europe: Berlin, Stockholm, London and Barcelona. It also happens that we have pretty good video-conference facilities that we frequently use for company-wide calls. And it also happens that a huge share of the mobile games (Candy Crush Saga, Bubble Witch Saga...) made at King rely heavily on a C++ tech stack. By using the video-conference facilities, and having some drinks and food, all gently provided by King, we were confident that we could run an event with multiple Meetup groups participating remotely and have a lot of fun.

And thus was born the concept of a Distributed C++ Meetup!

We contacted the London Meetup group and asked them if they wanted to join us in this adventure. Sure enough, Phil Nash, the organizer, said yes. Greatly helped by a few colleagues (Mårten Möller, Tindy Hellman, Bruno Mikus, Brittany Pinder...), we ran a first event in November 2017 with a connection between London and Stockholm.

Distributed C++ Meetup 0x01:

Here is a full video of the first event (Distributed C++ Meetup 0x01) where we had lightning talks from speakers on both sites:

Or if you want to see some specific talks, you can also watch this youtube playlist with videos nicely crafted by our amazing SwedenCpp video crew.

I was pretty nervous to be the host of the event, but I think that everything ran pretty smoothly and the vast majority of the attendees enjoyed the concept. We had a few learnings though:

  • Some people prefer the long format for the talks instead of the lightning talks we used this time.
  • I made a failed attempt at creating interaction between the crowd and the speakers. No one really dares to speak to remote speakers! This could be solved by using apps like Slido.
  • It is always nice to have backups for video and sound recording. The audio recorded in London was not as good as expected.
  • Do not drink the beers and ciders originally bought for the halloween party!

Distributed C++ Meetup 0x02:

The reason I decided to write about this event a few months later is that 1) I am lazy 2) we will soon have another run of this concept. Next week, on the 28th of March 2018, the Distributed C++ Meetup 0x02 will happen. This time, not only London and Stockholm will be involved, but we also bring with us the Berlin Meetup group managed by Sebastian Theophil and Vasil Georgiev (from King)!

If you are reading this post and want to join the event, it is still not too late to sign up:


Trip report - Meeting C++ 2017

Posted on Thu 16 November 2017 in C++ • Tagged with C++, event, meetingcpp

Finally, after years of watching YouTube videos on the topic, I made it to my first international C++ conference! Thanks to my current employer King, I went last week to Meeting C++ in Berlin, which is, as far as I know, the biggest C++ event in Europe. I really enjoyed my time there with a few hundred other fellow C++ enthusiasts. In this post, I will try to relate how I experienced the event and draw up a list of the must-watch talks.

About meeting C++:

The concept:

Held in the magnificent Andels Hotel in Berlin, Meeting C++ offers the possibility to attend keynotes, talks and lightning talks (respectively lasting 2 hours, 50min and 5min) about our favourite language C++ for 3 days (one extra day was added for the 6th edition of the event). C++ being a multi-paradigm, general-purpose programming language, the variety of the topics discussed is pretty wide. It ranges from "(template) meta-programming" to a deep dive into "how a C++ debugger works", from beginner-friendly talks to hairy discussions on the yet-to-be-standardised proposal for std::expected<T, E>.

As some talks happen simultaneously in different rooms, you cannot physically attend all of them. Instead, you would usually prepare your own schedule by trying to guess the content of the talks from their summaries. It is always a dilemma to choose between a topic that you like, with the risk of having nothing new to learn, and a brand-new topic, where sleepiness may kick in midway through the presentation. If you are curious and daring, the lightning talks on the last day let you randomly discover anything about C++ in succinct presentations. In any case, you can always catch up on the missed talks by checking the YouTube channel.

Generally, I was not disappointed by the quality of the slides and the speakers. I picked up quite a few new concepts, prepared myself for the future of C++ (C++20) and refreshed myself on some fields I had not touched for a while.

More than just talks:

Where Meeting C++ really shines is in its capacity to gather roughly 600 passionate developers from various backgrounds (university, gaming industry, finance, Silicon Valley giants...) in one building and share. Just share anything: about C++, about reverse-engineering, about your job, about your country, about the German food in your plate! The C++ community represented at this event is very active, open-minded and willing to help. The catering between the sessions and the dinner parties lets you meet anyone easily. The event team also organises some really fun activities.

In a room full of "world-class developers", it is easy to be intimidated, but you should not hesitate to reach out to them. They will not nitpick your words nor snub you. For some people, it is a dream to meet a Hollywood star in the street; for me, it was really delightful to have casual conversations with these "legendary" coders from the web.

The chief's suggestions of the day:

Here is a menu of most of the talks I attended. The legend is pretty simple:

  • 💀 : The difficulty of the talk (💀: Beginner friendly, 💀💀: Intermediate, 💀💀💀: High exposure to C++'s dark corners)
  • ★ : My interest for the talk (★: Good talk, ★★: Tasty talk, ★★★: Legendary talk)

I will not spoil all the talks, but simply try to give an overview of what you can expect from them. Note that all the talks are generally of high quality and my appreciation is very subjective. I have seen people with very different "favorite talks".

[Keynote] Better Code: Human interface - By Sean Parent - 💀 ★★

Sean Parent is a Principal Scientist at Adobe Systems and has been working on the famous software Photoshop for more than 15 years. He is a regular and prominent speaker at C++ conferences, one of his most famous recent talks being Better Code: Runtime Polymorphism, from the same series of talks (Better Code) as the one he gave at Meeting C++.

Throughout his keynote, Sean conveyed the message that in order to have ergonomic human interfaces, you must design your code to reflect its usage through the UI. By following such a principle, one can easily come up with good naming, semantics and grouping of UI components.

For instance, Sean explained that most of the menu actions in Photoshop somehow map to objects, their methods, their properties and, most importantly, their relations:

  • The menu action Create New Layer will somehow call the constructor of a class called something like Layer.
  • Likewise the action Delete A Layer would call its destructor.
  • A selection in Photoshop will most likely translate in a container of objects.

As a counter-example, he explained that the old version of Gmail used to have a confusing flow for the most trivial usage of a mail service: creating a mail. A link, which implies navigation, was used instead of a button for the "compose message" action.

Gmail failed compose button

Sean put a strong emphasis on the fact that relationships are the most difficult part of an architecture to represent. He came up with a few examples of how std::stable_partition can be used to solve, in an elegant way, the gathering and display of items.

Overall a very nice talk, but on a very abstract topic, since not much has been explored on that subject yet! It is worth paying attention to in game programming, where a good UI is a key part of a game's success.

[Talk] Threads and Locks must Go - Rainer Grimm - 💀💀 ★

In this talk, Rainer Grimm, a German author of multiple C++ books, brought under the spotlight the concurrency features introduced by the new C++ standard, C++17, and the upcoming C++20. Here is a short summary of my favourite features:

For C++17 (and concurrency TS):

std::vector<int> v = {...}; // A big vector...
std::sort(std::execution::par, v.begin(), v.end());
// Due to "par", this **might** execute the sort in parallel using a thread pool of some sort.

std::future<int> foo();

auto f = foo();
f.then([](std::future<int>& f) {
    // Will be called when f is done.
    std::cout << f.get(); // Will therefore not block.
});

Hopefully for C++20:

  • The stackless coroutines as implemented by MSVC and clang. This introduces two keywords, co_await and co_yield. Considering the previous example using std::future, it could be rewritten in the following way:

std::future<int> foo();

int x = co_await foo(); // The continuation is "generated" by the compiler using the keyword co_await.
std::cout << x; // Everything after co_await is implicitly part of the continuation.

  • The synchronized blocks from the Transactional Memory TS:

int x, y; // Global vars.

void foo() { // foo can be called by multiple threads simultaneously.
    // Reads and writes on x and y are now thread-safe and synchronized.
    synchronized {
        ++x;
        ++y;
    }
}

As explained by Rainer Grimm, we will have the possibility to easily bring concurrency to our C++ codebases without resorting to low-level, tricky-to-get-right features like threads and locks. While I appreciated the talk, it lacked a bit of novelty, as I was already aware of most of the features.

[Talk] Strong types for strong interfaces - Jonathan Boccara - 💀 ★★★

A must watch! Even when facing some technical issues, Jonathan is a very good speaker and I was quickly captivated by the topic of strong types. Jonathan is also a talented writer with his famous blog fluentcpp (I would really suggest having a look at it once in a while).

As C++ developers, we heavily rely on the language's type system to express our intentions to other developers, to optimise our code and to avoid shooting ourselves in the foot. Yet, we keep reusing the same types to express very different properties of our systems. For instance, to describe a person, you would use an int for her/his age and another int for her/his weight. Did it ever occur to you that the unit years (for age) should be a very different type than kg (for weight)? The concept of strong types solves this problem by introducing new int-compatible types:

// We need to have these empty tag types to create entirely new types and not just weakly-typed aliases.
using kg = strong_type<int, struct KgTag>;
using years = strong_type<int, struct YearsTag>;

void unsafe_create_person(int age, int weight);
void create_person(years age, kg weight); // Explicit interface.

int age = 42;
int weight = 1337;

unsafe_create_person(weight, age); // Oops, I swapped the arguments, but no compiler error.

create_person(years(age), kg(weight)); // Much less error-prone.

As a bonus, strong types can actually have a positive effect on the performance of your codebase: since the types are now strictly unrelated, the compiler can optimise aggressively without worrying about them aliasing each other.

This concept is not new and is already used in std::chrono or Boost.Unit, but it was really refreshing to have an explanation with simple words and good examples! I am now very keen to use this in my personal projects and at work too.

[Talk] How C++ Debuggers Work - Simon Brand (4/5) - 💀💀 ★★

Simon Brand, also known as TartanLlama (a really fancy fictitious name for a Scot), presented how a mixture of calls to ptrace, injection of the int3 opcode, parsing of the DWARF format and perseverance is the basis for creating a debugger on an x86(_64) architecture on a Unix platform (or Linux only, if you use OS-specific calls like process_vm_readv and process_vm_writev).

Unlike some of the other talks, it would be hard to give succinct code examples, but I truly appreciated his presentation! When it comes to low-level APIs for debugging and reverse-engineering, I have a better understanding of the Windows platform. I think that Simon did an excellent job helping me transfer my knowledge to the Unix world.

If one day I have to tackle the creation of a debugger on Unix, I will certainly come back to this talk or follow his series of blog posts on the same subject. I also think that, as a programmer, it is always beneficial to have some knowledge of the underlying mechanisms of the tools you use (gdb or lldb in this case). I would therefore suggest watching that talk to any C++ enthusiast willing to progress in the art of programming.

[Talk] The Three Little Dots and the Big Bad Lambdas - Joel Falcou - 💀💀💀 ★★★

I am always excited to watch a talk by Joel Falcou: he is a venerable (template) meta-programming wizard with a very didactic approach to explaining things (and also we share the same nationality \O/). Once again, I was not disappointed by his session.

With a lot of humour, Joel introduced a new facet of meta-programming in C++. We used to have template meta-programming to manipulate types at compile time (and, with difficulty, values), then came constexpr to ease value computation, and recently Louis Dionne came up with a powerful combination of these two quadrants with Boost.Hana. Joel's statement was that lambda expressions combined with auto and parameter packs are powerful enough to replace some cases where we would have resorted to template meta-programming or the uglier, infamous macros! Joel came to that conclusion after being inspired by the language MetaOCaml.

Let's say that you want to fill a vector with push_back instructions generated at compile time:

#include <cstddef>
#include <utility>
#include <vector>

template <class F, std::size_t... I>
void apply_imp(F&& f, std::index_sequence<I...>) {
    (f(I), ...); // C++17 fold expression.
}

template <int N, class F>
void apply(F&& f) {
    apply_imp(std::forward<F>(f), std::make_index_sequence<N>{});
}

std::vector<int> bob;

auto bind = [](auto& v, auto f) { return [&v, f](auto x) { f(v, x); }; };
auto push_back = bind(bob, [](auto& v, int x) { v.push_back(x * 3); });

int main() {
    apply<3>(push_back);
    // Generates the equivalent of:
    // bob.push_back(0);
    // bob.push_back(3);
    // bob.push_back(6);
}
This example is fairly trivial, and there is a high chance that you would reach the same assembly output using a simple for loop. But it is very interesting to notice that lambdas are reusable, type-safe units of code that you can transport, combine and "instantiate" at any time. Performance-wise, lambdas are pretty incredible according to Joel's measurements on his linear-algebra project. C++17 constexpr lambdas could also help on that topic. One drawback might be the debugging complexity when navigating nested lambdas. I still need to wrap my head around this new concept and I am eager to rewatch Joel's talk to explore it more!

[Keynote] It's complicated! - Kate Gregory - 💀 ★★★

While excellent, Kate's keynote would be very hard to summarise correctly in a few paragraphs. It makes you reflect on the difficulties of introducing C++ to newcomers. You would hope that there is a subset of the language that could be easily assimilated by anyone; Kate argues that the reality is sadly more complicated than that. Just have a look at how long the C++ Core Guidelines are on passing parameters to function calls. One day or another, a beginner must learn how to pass parameters with good semantics and in an optimised fashion. Well, good luck to her/him! On the other hand, it does not mean that the language could have been designed in a simpler way. What we should strive for instead might be better naming of these concepts: the acronym RAII (Resource Acquisition Is Initialization) is obviously not as straightforward as COW (Copy-on-write). Whether you are a "newbie" or the best "lead over-engineer" of your company, this talk is really worth a look!

[Talk] There Is A New Future - Felix Petriconi - 💀💀 ★★

Felix Petriconi and Sean Parent have been working on the stlab library for quite some time. stlab takes the best of the various future implementations — std::future (C++11), std::future (C++14) or boost::future — and adds a few features of its own on top. For instance, stlab supports passing explicit executors to control where async will execute a task, and where the continuation associated with then will be executed too. Executors are akin to event loops (or message pumps in the .Net world) that will process the tasks.

// This task will be executed on ``an_executor``
auto f = stlab::async(an_executor, [] { return 42; });

// The continuation on another executor.
f.then(another_executor, [](int x) { std::cout << x; });

While executors are present in Boost.Thread, stlab's channels are unique to this future library. Channels are one of the Go language's favorite toys. They are a neat way to create communication between a sender and a receiver on different executors:

auto [sender, receiver] = channel<int>(receive_executor); // Create the link.

receiver | [](int x) { std::cout << x; }; // Define what should happen at the reception.

// Establish the connection.
receiver.set_ready();

sender(42); // Send 42 through the channel.
// The receiver will print 42 when executing the task.

I really like some of the features in stlab; hopefully these could be incorporated into the C++ standard (executors are already in the pipeline of the standardisation process).

[Talk] Introduction to proposed std::expected - Niall Douglas - 💀💀💀 ★

Are you the kind of person who would rather have errors on return values than use exceptions? Niall has a solution for you: std::expected<T, E>. You can see std::expected<T, E> either as a std::optional<T> whose empty state carries an error explaining why it is empty, or as a std::variant<T, E> where you agree that the first alternative is the return value and the second alternative is the potential error. Example:

std::expected<int, std::string> foo(int x) {
    if (x < 0) return std::make_unexpected("x < 0");
    return 42;
}

auto result = foo(-1);

if (!result) std::cout << result.error(); // Prints "x < 0".

std::expected starts to be cumbersome to use when combining or propagating returned results. To alleviate this problem, std::expected exposes a monadic interface, with the bind member function coming directly to mind. If you are a Haskell user, std::expected should remind you of the Maybe monad. Using bind is still verbose, and hopefully we will obtain a dedicated keyword, try, to ease our pain.

[Talk] The most valuable values - Juan Pedro Bolívar Puente - 💀💀 ★★

During his presentation, Juan actively promoted value semantics over reference semantics, and did so with some analogies from our physical world (from our dear philosopher Plato) and code examples. The talk quickly moved on to immutability and functional programming applied to user interfaces. There is a trend in the web sphere to follow a software architectural pattern called flux, whose main implementation is the Redux framework. Arguably, flux is a glorified good old MVC (Model View Controller) architecture with a strong emphasis on the immutability of the model and a strict flow of interactions between the components of MVC. Model, View and Controller also get respectively renamed to Store, View and Dispatcher. An action submitted to the Dispatcher will replace the Store with a new Store in a deterministic way, which will imply a redraw of the View.

Juan succeeded in mimicking Redux in the C++ world using his library immer. To demonstrate the capabilities of his library, Juan recreated an emacs-like editor. The beauty of having an immutable Store is truly expressed in the time-travelling machine that you can create from it: by saving all the states of the Store, you can easily come back to a previous state of your application (similar to undo / redo). You should absolutely watch the video to understand how easy it seems to implement this. On top of that, you will have the chance to watch what might be the most audacious ending of a C++ talk I have ever seen.

[Talk] Reactive Equations - André Bergner - 💀💀💀 ★★★

As a meta-programming aficionado, this was the most "devilish" talk, and therefore highly thrilling, that I attended during the event. It was not the first time I had heard from André Bergner: he attended CppCon in 2015, and I remembered that he presented a nice way to have currying on your functions. This time, André focused on reactive equations. If this sounds foreign to you, you might be more familiar with data binding in Qt QML using Javascript expressions. André's reactive equations are similar, but with simpler expressions:

// ***** Using a pseudo-language, let's define the equations *****
x : float
y : float
z : float

y = x * 42 // When x is updated, y will be automatically updated.
z = y + x // When y or x is updated, z will be automatically updated.

You may notice that I didn't write any C++ code. By default, C++ does not permit expressions that update themselves by "magic" when one variable changes. You could write the update logic manually, but with a lot of long equations this becomes very error-prone. Instead, André created a DSL (Domain Specific Language), which is equivalent to creating a language within C++ itself. To define his DSL, André used expression templates. Expression templates are tricky creatures, which roughly consist of encapsulating C++ expressions into a type at compile time. This type will retain all the operators / functions (let's call them operations) that you applied in your expression. These operations can be queried at compile time to generate other expressions that you will execute at runtime. In André's case, the encapsulated operations from his reactive equations are used to automagically generate the update logic. To facilitate his task, André relied heavily on Boost.Proto. If you are versed in the art of meta-programming, this will certainly be entertaining to you!

[Talk] Free your functions - Klaus Iglberger - 💀 ★★★

This was a glorious hymn to our beloved free functions by Klaus Iglberger. Programmers often resort to member functions and inheritance to provide polymorphism in C++, often overlooking that free functions and overloading would be a smarter choice.

Let's take a situation where you would need to implement a serialise function for a bunch of unrelated types. Would you rather use implementation 1?

struct serialisable {
    virtual ~serialisable() = default;
    virtual std::string serialise() = 0;
};

struct A : serialisable {
    std::string serialise() override { /* return something... */ };
};

struct B : serialisable {
    std::string serialise() override { /* return something... */ };
};

Or solution 2?

struct A {};
struct B {};

std::string serialise(const A& a) { /* return something... */ }
std::string serialise(const B& b) { /* return something... */ }

As Kate explained, it is complicated! If you are looking for runtime polymorphism, then you will certainly use solution 1. If not, solution 2 is actually preferable. It has a lot of advantages, which Klaus explained for one full hour. My favorite one being that you can extend the polymorphism to types that you do not own. Let's say that you want to serialise std::vector; you can simply write an overload for it:

template <class T>
std::string serialise(const std::vector<T>& v) { /* return something... */ }

In practice, nothing prevents you from mixing both solutions to fit your needs. One counter-argument is that free functions have an uglier syntax: v.serialise(); feels more natural than serialise(v);. That issue could have been solved by the unified call syntax proposed by Bjarne Stroustrup and Herb Sutter. Sadly, it was rejected by the C++ committee.

[Talk] Reader-Writer Lock versus Mutex - Understanding a Lost Bet - Jeffrey Mendelsohn - 💀💀💀 ★★

Jeffrey Mendelsohn from Bloomberg had a bet with a colleague on whether a readers-writer lock would be faster than a mutex to protect access to a resource that could be written to by a single writer but read by multiple readers simultaneously. The readers-writer lock follows exactly that behaviour (multiple readers, single writer). The mutex would grant exclusivity to one writer or one reader only! Jeffrey lost the bet, but that did not hinder him from exploring the reasons behind his loss. It was challenging for me to grasp all the implications of this topic, but here is what I understood:

  • Jeffrey's readers-writer lock was made of atomic variables to keep track of the number of readers and writers. If the resource was currently being written to, the readers and other writers would wait on a semaphore to be woken up later on.
  • If the amount of time spent by the readers or the writers on the resource is fairly long, the readers-writer lock will actually perform better than the mutex, as multiple readers can proceed simultaneously.
  • On the other hand, if the writing and reading operations are very fast, the atomic operations on the counters start to be comparatively costly. Atomics tend to have non-negligible effects on the cache lines of your CPU(s). In this case, losing the ability to have multiple readers is actually not as dramatic as you would think in comparison to stressing your cache.
  • Jeffrey came up with a hybrid solution that combines a readers-writer lock with a fallback to a mutex, which outperformed the previous solutions.

Once the video of this talk is uploaded, I will have to rewatch it completely. It is always amusing how our intuitions can be entirely wrong when it comes to concurrency and programming.

[Other] The not-so-secret lightning talks

Videos: link, link, link, link, link, link

Before the last Keynote, we had the pleasure to listen to some not-so-unexpected lightning talks. I will not spoil too much of it!

I just want to express my gratitude to Guy Davidson and Sean Parent for bringing diversity in the C++ community under the spotlight. It was more than welcome and I am glad of these initiatives.

Conclusion:

Once again, I was amazed by the C++ community and how a group of dedicated persons can build such a nice event. I am already eager to fly to one of the big conferences next year: Meeting C++ or CppCon. I would also encourage anyone with a bit of passion for this language, or programming in general, to give conferences or local groups a try; you will discover more than you would expect!


C++ Stockholm 0x02

Posted on Mon 20 February 2017 in News • Tagged with blog, C++17, C++ Stockholm

I had the chance to participate last week in a C++ meetup. This Stockholm meetup group is fairly recent but has a lot of potential! I always enjoy my time during these sessions, surrounded by smart and passionate people. In the unlikely case that you are living around Stockholm, or decided to visit the capital of Scandinavia (and started missing your keyboard), I would highly recommend joining this group.

For the second edition of our conference, C++ Stockholm 0x02, I volunteered to talk about a subject I already addressed on this blog: SFINAE and compile-time introspection. Overall, it was a good exercise for a first talk but I definitely need more training :). I discovered that it is much harder than I expected to explain code live. Thanks to the meetup team, I have a video of this talk with a very nice editing job done on it. It is always a weird experience to watch yourself on video, and I was not so keen to put it here in the first place. But since this video has already leaked among my friends (or foes in that case) and colleagues, I guess I might as well place it here for posterity! So, here it is:

You can find a pdf version of the slides right here.


An introduction to C++'s variadic templates: a thread-safe multi-type map

Posted on Mon 01 February 2016 in C++ • Tagged with C++11, C++14, variadic templates, meta-programming

Trivia:

One of our favorite mottos in our C++ team at work is: you shall use dependency injection instead of singletons! It actually comes from our unit-testing strategy. If the various components of your architecture are too tightly coupled, it becomes a tremendous effort to deeply test small critical chunks of your code. Singletons are that kind of beast that revives itself without your permission and comes from hell to haunt your lovely unit tests. Our main project being multi-threaded (hence highly bug-prone) and vital for the company, "singleton" became a forbidden word. Yet, our team recently started going down the dark path. Thanks to C++11 and its variadic templates, I carefully crafted a thread-safe multi-type map container that simplified our configuration reloading system and saved us from the dark side of the coder force. If you have always wondered what variadic templates are, or how C++11's tuples can be implemented, I am going to present these concepts in this post using my container as a guinea pig.

Note: for the sake of your sanity and the fact that errare humanum est, this article might not be 100% accurate!

Why would I use a thread-safe multi-type map?

Let me explain our odyssey: we are working on a highly modular and multi-threaded application. One of its core features is the ability to reload various configuration files or assets used by some components spread across many threads and a giant hierarchy of objects. The reloading process is automatic, using Linux's inotify filesystem-event monitoring. One thread is dedicated to the reception of filesystem events and must react accordingly by parsing any changes and pushing them to other threads. At first, to pass any newly parsed asset around, we used some thread-safe queues, something analogous to Go channels. Since we did not want to use singletons, we had to pass references to our queues all along our object hierarchy. Sadly, our queue implementation is one-to-one and supports only one type, and none of our config/asset types share a common base type. For each asset type and each component using that asset, we had to create a new queue and pass it all along our hierarchy. That is certainly not convenient! What we really wanted was a hybrid class between a std::map and a std::tuple.

We could have used a std::map with Boost.Variant to store our items, using a type like the following: std::map<std::string, std::shared_ptr<Boost.Variant<ConfigType1, ConfigType2>>>. Boost.Variant permits encapsulating a heterogeneous set of types without a common base type or base class, which solves one of our points. Another solution would be to manually wrap all our configuration classes in the same family of classes, which is pretty cumbersome. But anyway, std::map does not guarantee any safety if you are writing and reading the same map slot at the same time. Secondly, std::shared_ptr does guarantee a thread-safe destruction of the pointee object (i.e: the reference counter is thread-safe) but nothing for the std::shared_ptr object itself. It means that copying a std::shared_ptr that could potentially be modified from another thread might lead to undefined behaviour. Even if we were to protect all these unsafe accesses with mutexes, we would still lack a nice mechanism to get update notifications for our objects. We do not want to constantly poll the latest version and propagate it through our code. And finally, if that solution were elegant enough, why would I currently be writing this blog post?

C++11 brings another collection type called std::tuple. It permits storing a set of elements of heterogeneous types. Take a look at this short example:

auto myTuple = std::make_tuple("Foo", 1337, 42);

std::cout << std::get<0>(myTuple) << std::endl; // Access element by index: "Foo"
std::cout << std::get<1>(myTuple) << std::endl; // Access element by index: 1337
std::cout << std::get<2>(myTuple) << std::endl; // Access element by index: 42
std::cout << std::get<const char*>(myTuple) << std::endl; // Access element by type: "Foo"

// compilation error: static_assert failed "tuple_element index out of range"
std::cout << std::get<3>(myTuple) << std::endl;

// compilation error: static_assert failed "type can only occur once in type list"
std::cout << std::get<int>(myTuple) << std::endl;

Tuples are that kind of C++11 jewelry that should convince your old-fashioned boss to upgrade your team's compiler (and his ugly tie). Not only could I store a const char* and two ints without any compilation error, but I could also access them using compile-time mechanisms. In some way, you can see tuples as a compile-time map using indexes or types as keys to reach its elements. You cannot use an index out of bounds; it will be caught at compile time anyway! Sadly, using a type as a key to retrieve an element is only possible if the type is unique in the tuple. At work, we do have a few config objects sharing the same class. Anyway, tuples did not fit our needs regarding thread safety and update events. Let's see what we can create using tasty tuples as an inspiration.

Note that some tuple implementations were already available before C++11, notably in Boost. C++11's variadic templates are just very handy, as you will see, for constructing such a class.

A teaser for my repository class:

To keep your attention for the rest of this post, here is my thread-safe multi-type map in action:

#include <iostream>
#include <memory>
#include <string>

#include "repository.hpp"

// Incomplete types used as compile-time keys.
struct Key1;
struct Key2;

// Create a type for our repository.
using MyRepository = Repository
    <
        Slot<std::string>, // One slot for std::string.
        Slot<int, Key1>, // Two slots for int.
        Slot<int, Key2> // They must be differentiated using "type keys" (Key1, Key2).
    >;

int main()
{
    MyRepository myRepository;

    myRepository.emplace<std::string>("test"); // Construct the shared_ptr within the repository.
    myRepository.emplace<int, Key1>(1337);
    myRepository.set<int, Key2>(std::make_shared<int>(42)); // Set the shared_ptr manually.

    // Note: I use '*' as get returns a shared_ptr.
    std::cout << *myRepository.get<std::string>() << std::endl; // Print "test".
    std::cout << *myRepository.get<int, Key1>() << std::endl; // Print 1337.
    std::cout << *myRepository.get<int, Key2>() << std::endl; // Print 42.

    std::cout << *myRepository.get<int>() << std::endl;
    //             ^^^ Compilation error: which int shall be selected? Key1 or Key2?

    auto watcher = myRepository.getWatcher<std::string>(); // Create a watcher object to observe changes on std::string.
    std::cout << watcher->hasBeenChanged() << std::endl; // 0: no changes since the watcher creation.

    myRepository.emplace<std::string>("yo"); // Emplace a new value into the std::string slot.
    std::cout << watcher->hasBeenChanged() << std::endl; // 1: the std::string slot has been changed.

    std::cout << *watcher->get() << std::endl; // Poll the value and print "yo".
    std::cout << watcher->hasBeenChanged() << std::endl; // 0: no changes since the last polling.

    return EXIT_SUCCESS;
}

First and foremost, its name, repository, might not be well suited for its responsibility. If your native language is the same as Shakespeare's and you come up with a better term, please feel free to submit it. In our internal usage, config repository sounded great!

I start by describing the slots necessary for my application by creating a new type, MyRepository, using a type alias. As you can see, I use the type of the slots as a key for accessing elements. But in case of collision, I must use a second key: an "empty type" like Key1 and Key2 in this example. If using types as keys seems odd to you, fear not! Here is the most rational explanation I can share with you: we are trying to benefit from our "know-it-all compiler". Your compiler mainly manipulates types, and one can change its flow using these types during the compilation process. Note that these structs are not even complete (no definition); they have no impact on runtime memory or runtime execution, and that's the amazing part of meta-programming. The dispatch of an expression such as myRepository.get<int, Key1>() is done during your build.

You may also notice that every slot actually holds a std::shared_ptr. It enforces clean resource management: in a multi-threaded application, one must be really careful about the lifetime of heap objects. std::shared_ptr in this case permits me to ensure that even if someone replaces a value in a slot, other components on other threads manipulating the old value won't end up with a dangling pointer/reference bomb in their hands. Another solution would be to use plain value objects, but not only would that require copying big objects into every component, it would also forbid polymorphism.

As for update signaling, you first create a watcher object that establishes a contract between a desired slot to watch and your context. You can thereafter query in a thread-safe way whether an update has been made and, if so, poll the latest changes. The watcher object is actually a std::unique_ptr to a special class: it cannot be moved nor copied without your permission, and it will automagically cancel the signaling contract between the slot and your context once destroyed. We will dive deeper into this topic in the coming sections.

Within our application, the repository object is encapsulated in a RuntimeContext object. This RuntimeContext object is created explicitly within our main entry point and passed as a reference to a great part of our components. We therefore keep the possibility to test our code easily by setting up this RuntimeContext with different implementations. Here is a simplified version of our usage:

// runtimecontext.hpp
#include "repository.hpp"

// Incomplete types used as compile-time keys.
struct Key1;
struct Key2;

class ConfigType1; // Defined in another file.
class ConfigType2; // Defined in another file.

// Create a type for our repository.
using ConfigRepository = Repository
    <
        Slot<ConfigType1>,
        Slot<ConfigType2, Key1>,
        Slot<ConfigType2, Key2>
    >;

struct RuntimeContext
{
    ILogger* logger;
    // ...
    ConfigRepository configRepository;
};

// Main.cpp

#include "runtimecontext.hpp"

int main()
{
    RuntimeContext runtimeContext;
    // Setup:
    runtimeContext.logger = new StdOutLogger();
    // ...

    // Let's take a reference to the context and change the configuration repository when necessary. 
    startConfigurationMonitorThread(runtimeContext);

    // Let's take a reference and pass it down to all our components in various threads.
    startOurApplicationLogic(runtimeContext);

    return EXIT_SUCCESS;
}

Time for a C++11 implementation:

We can decompose the solution into 3 steps: first we need to implement a map that accepts multiple types, then we need to work on thread safety, and we finish with the watcher mechanism. Let's first fulfill the mission of this post: introducing you to variadic templates to solve the multiple-type problem.

Variadic templates:

You may not have heard of variadic templates in C++11, but I bet that you already used variadic functions like printf in C (maybe in a previous, unsafe life). As Wikipedia kindly explains, "a variadic function is a function of indefinite arity which accepts a variable number of arguments". In other words, a variadic function has potentially an infinite number of parameters. Likewise, a variadic template has potentially an infinite number of parameters. Let's see how to use them!

Usage for variadic function templates:

Let's say that you wish to create a template that accepts an arbitrary number of classes as arguments. You will use the following notation:

template <class... T>

You specify a group of template parameters, named T, using the ellipsis notation. Note that this ellipsis notation is consistent with C's variadic function notation. This group of parameters, called a parameter pack, can then be used in your function template or your class template by expanding it. One must use the ellipsis notation again (this time after T) to expand the parameter pack T:

template <class... T> void f(T...)
//              ^ pack T       ^expansion
{
    // Your function content.
}

Now that we have expanded T, what can we do, Sir? Well, first you give your expanded parameter types a fancy name like t.

template <class... T> void f(T... t)
//                                ^ your fancy t.
{
    // Your function content.
}

If T = T1, T2, then T... t = T1 t1, T2 t2 and t = t1, t2. Brilliant, but is that all? Surely not! You can then expand t once more using a "suffix-ellipsis":

template <class... T> void f(T... t)
{
    anotherFunction(t...);
    //                ^ t is expanded here! 
}

Finally, you can call this function f as you would with a normal function template:

template <class... T> void f(T... t)
{
    anotherFunction(t...);
}

f(1, "foo", "bar"); // Note: the argument deduction avoids us to use f<int, const char*, const char*>
// f(1, "foo", "bar") calls a generated f(int t1, const char* t2, const char* t3)
// with T1 = int, T2 = const char* and T3 = const char*,
// that itself calls anotherFunction(t1, t2, t3) equivalent to call anotherFunction(1, "foo", "bar");

Actually, the expansion mechanism creates comma-separated replications of the pattern you apply the ellipsis onto. If you think I am tripping out on template-related wording, here is a much more concrete example:

template <class... T> void g(T... t)
{
    anotherFunction(t...);
}

template <class... T> void f(T*... t)
{
    g(static_cast<double>(*t)...);
}

int main()
{
    int a = 2;
    int b = 3;

    f(&a, &b); // Call f(int* t1, int* t2).
    // Do a subcall to g(static_cast<double>(*t1), static_cast<double>(*t2)).

    return EXIT_SUCCESS;
}

I could use the pattern '*' for f's parameters and therefore take them as pointers! In the same manner, I applied the pattern 'static_cast<double>(*)' to dereference each argument and cast it to double before forwarding them to g.

One last example before moving to variadic class templates. One can combine "normal" template parameters with parameter packs and initiate a compile-time recursion on function templates. Let's take a look at this printing function:

#include <iostream>

template <class HEAD> void print(HEAD head)
{
    std::cout << "Stop: " << head << std::endl;
}

template <class HEAD, class... TAIL> void print(HEAD head, TAIL... tail)
{
    std::cout << "Recurse: " << head << std::endl;
    print(tail...);
}

int main()
{
    print(42, 1337, "foo");

    // Print:
    // Recurse: 42
    // Recurse: 1337
    // Stop: foo

    // Call print<int, int, const char*> (second version of print).
    // The first int (head) is printed and we call print<int, const char*> (second version of print).
    // The second int (head again) is printed and we call print<const char*> (first version of print).
    // We reach recursion stopping condition, only one element left.

    return EXIT_SUCCESS;
}

Variadic templates are very interesting and I wouldn't be able to cover all their features within this post. Using them roughly feels like functional programming on your compiler, and even some Haskellers might listen to you if you bring up that topic during a dinner. For those interested, I would challenge you to write a type-safe version of printf using variadic templates with the help of this reference. After that, you will run away screaming at the presence of C's varargs.

"Variadic" inheritance:

Sometimes during my programming sessions, I have a very awkward sensation that my crazy code will never compile and, yet, I finally see "build finished" in my terminal. I am talking about that kind of Frankenstein construction:

struct A { };

struct B { };

template <class... T>
struct C: public T... // Variadic inheritance
{

};

C<A, B> c;

Yes, we can now create a class inheriting from an arbitrary number of bases. If you remember my explanation about pattern replications separated by commas, you can imagine that struct C: public T... will be "transformed" into struct C: public A, public B, with public T being the pattern. We start to be able to combine multiple types, each exposing a small number of methods, to create a flexible concrete type. That's one step closer to our multi-type map, and if you are interested in this concept, take a look at mixins.

Instead of inheriting directly from multiple types, couldn't we inherit from some types that encapsulate our types? Absolutely! A traditional map has some slots accessible using keys, and these slots contain a value. If you give me the base class you are looking for, I can give you access to the value it contains:

#include <iostream>

struct SlotA
{
    int value;
};

struct SlotB
{
    std::string value;
};

// Note: private inheritance, no one other than Repository itself can directly access the slots.
struct Repository: private SlotA, private SlotB
{

    void setSlotA(const int& value)
    {
        // I access the base-class's value
        // Since we have multiple base with a value field, we need to "force" the access to SlotA.
        SlotA::value = value;
    }

    int getSlotA()
    {
        return SlotA::value;
    }

    void setSlotB(const std::string& b)
    {
        SlotB::value = b;
    }

    std::string getSlotB()
    {
        return SlotB::value;
    }
};


int main()
{
    Repository r;

    r.setSlotA(42);
    std::cout << r.getSlotA() << std::endl; // Print: 42.

    r.setSlotB(std::string("toto"));
    std::cout << r.getSlotB() << std::endl; // Print: "toto".

    return EXIT_SUCCESS;
}

This code is not generic at all! But we know how to create a generic Slot using a simple template, and we have acquired the magic "variadic inheritance" skill. If my Repository class inherits from Slot<TypeA> and you call a method template with TypeA as a template argument, I can call the doGet method of the Slot<TypeA> base class and give you back the value of TypeA in that repository. Let's fix the previous ugly copy-paste code:

#include <iostream>
#include <string>

template <class Type>
class Slot
{
protected:
    Type& doGet() // A nice encapsulation that will be useful later on.
    {
        return value_;
    }

    void doSet(const Type& value) // Same encapsulation.
    {
        value_ = value;
    }
private:
    Type value_;
};

template <class... Slots>
class Repository : private Slots... // inherit from our slots...
{
public:
    template <class Type> // Give me a type and,
    Type& get()
    {
        return Slot<Type>::doGet(); // I can select the Base class.
    }

    template <class Type>
    void set(const Type& value)
    {
        Slot<Type>::doSet(value);
    }
};

// Incomplete types used as compile-time keys.
struct Key1;
struct Key2;

// Create a type for our repository.
using MyRepository = Repository
        <
                Slot<int>,       // Let's pick the type of our slots.
                Slot<std::string>
        >;

int main()
{
    MyRepository myRepository;

    myRepository.set<std::string>("toto");
    myRepository.set(42); // Notice the type deduction: we pass an int, so it writes in the int slot.

    std::cout << myRepository.get<int>() << std::endl; // Print: 42.
    std::cout << myRepository.get<std::string>() << std::endl; // Print: "toto".

    return EXIT_SUCCESS;
}

This repository starts to take shape, but we are not done yet! If you try to have two int slots, it will raise a compilation error: "base class 'Slot' specified more than once as a direct base class". We need to add another key type, with a default value, to our Slot class, and we need to modify our repository methods to handle it:

struct DefaultSlotKey; // No need for a definition.

template <class T, class Key = DefaultSlotKey> // The Key type will never be truly used.
class Slot
{
    // ...
};

template <class... Slots>
class Repository : private Slots...
{
public:
    template <class Type, class Key = DefaultSlotKey> // The default key must be here too.
    Type& get()
    {
        return Slot<Type, Key>::doGet();
    }

    template <class Type, class Key = DefaultSlotKey>
    void set(const Type& value)
    {
        Slot<Type, Key>::doSet(value);
    }
};

struct Key1; // No need for definition.
struct Key2;

// Now you can do:
using MyRepository = Repository
    <
            Slot<int>,       // Let's pick the type of our slots.
            Slot<std::string, Key1>,
            Slot<std::string, Key2>
    >;

Here is a UML representation of this Repository using distinct keys for the type std::string: [A nice UML diagram of my classes]

Our repository class is missing an emplace method, right? emplace takes a variable number of arguments of different types and forwards them to construct an object within one of our slots. A variable number of arguments and types must remind you of something... variadic templates! Let's create this variadic emplace method, as well as its equivalent in the Slot class:

// In class Slot:

template <class... Args>
void doEmplace(const Args&... args) // Here the pattern is const Args&.
{
    value_ = Type(args...); // copy-operator (might use move semantics).
}

// In class Repository:
template <class Type, class Key = DefaultSlotKey, class... Args>
void emplace(const Args&... args) // Here the pattern is const Args&.
{
    Slot<Type, Key>::doEmplace(args...);
}

// Usage:
myRepository.emplace<std::string>(4, 'a'); // Create a std::string "aaaa".

One last improvement for the future users of your repositories! If one morning, barely awake, a coworker of yours tries to get a type or key that doesn't exist (like myRepository.get<double>();), he might be welcomed by such a message:

/home/jguegant/Coding/ConfigsRepo/main.cpp:36:33: error: call to non-static member function without an object argument
    return Slot<Type, Key>::doGet();
           ~~~~~~~~~~~~~~~~~^~~~~
/home/jguegant/Coding/ConfigsRepo/main.cpp:67:18: note: in instantiation of function template specialization 'Repository<Slot<int, DefaultSlotKey>, Slot<std::__1::basic_string<char>, DefaultSlotKey> >::get<double, DefaultSlotKey>' requested here
    myRepository.get<double>();
                 ^
/home/jguegant/Coding/ConfigsRepo/main.cpp:36:33: error: 'doGet' is a protected member of 'Slot<double, DefaultSlotKey>'
        return Slot<Type, Key>::doGet();
                                ^
/home/jguegant/Coding/ConfigsRepo/main.cpp:10:11: note: declared protected here
    Type& doGet()
          ^
2 errors generated.

This message is very confusing: our class does not inherit from Slot<double, DefaultSlotKey>! And we are talking about clang's output; I wonder what gcc or MSVC would produce... If you do not want to be assassinated by your moody colleague with a spoon, here is a nice solution using C++11's static_asserts. Static asserts give you the possibility to generate your own compiler error messages, in the same fashion as normal asserts but at compile time. Using a trait like std::is_base_of, you can suggest that the user of your repository check his type twice. Let's put this static_assert at the beginning of all the methods of Repository:

static_assert(std::is_base_of<Slot<Type, Key>, Repository<Slots...>>::value, 
          "Please ensure that this type or this key exists in this repository");

We are done with this part (finally...), time to think about multi-threading! If you want to know more about the magic behind std::is_base_of, I would suggest reading my previous post on SFINAE; it might give you a few hints. Here is a gist of what we achieved so far. Did you notice the change on emplace? If you do not understand it, have a look at this explanation of perfect forwarding. Sadly, it would be way too long a topic for this post (trust me on that point!) and has a minor impact on our repository right now.

Let's play safe:

The repository we just crafted can now be used in a single-threaded environment without further investigation. But the initial goal was to make this class manipulable from multiple threads without any worries concerning the safety of our operations. As explained at the beginning of this post, we will not store direct values as we currently do; instead we will allocate our objects on the heap and use some shared pointers to strictly control their lifetime. No matter which version (recent or deprecated) of the object a thread is manipulating, its lifetime will be extended until the last thread using it finally releases it. This also assumes that the objects themselves are thread-safe. In the case of read-only objects like configs or assets, it shouldn't be too much of a burden. In this gist, you will find a repository version using std::shared_ptrs.

std::shared_ptr is an amazing feature of C++11 when dealing with multi-threading, but it has its weaknesses. Within my code (in the previous gist link), a race condition can occur:

// What if I try to copy value_ at the return point...
std::shared_ptr<Type> doGet() const
{
    return value_;
}

// ... meanwhile another thread is changing value_ to value?
void doSet(const std::shared_ptr<Type> &value)
{
    value_ = value;
}

As specified: "If multiple threads of execution access the same std::shared_ptr object without synchronization and any of those accesses uses a non-const member function of shared_ptr then a data race will occur". Note that we are talking about the same shared pointer. Multiple shared pointer copies pointing to the same object are fine, as long as these copies originated from the same shared pointer in first place. Copies are sharing the same control block, where the reference counters (one for shared_ptr and one for weak_ptr) are located, and the specification says "the control block of a shared_ptr is thread-safe: different std::shared_ptr objects can be accessed using mutable operations, such as operator= or reset, simultaneously by multiple threads, even when these instances are copies, and share the same control block internally.".

Depending on the age of your compiler and its standard library, I suggest two solutions:

1) A global mutex:

A straightforward solution relies on a std::mutex that we lock during doGet and doSet execution:

...
    std::shared_ptr<Type> doGet()
    {
        // The lock is enabled until value_ has been copied!
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }

    void doSet(const std::shared_ptr<Type> &value)
    {
        // The lock is enabled until value has been copied into value_!
        std::lock_guard<std::mutex> lock(mutex_);
        value_ = value;
    }

private:
    std::mutex mutex_;
...

This solution is ideal if, like me, you have a Linux distribution that only ships gcc 4.8.x. While not particularly elegant, it doesn't have a great impact on performance compared to the next solution.

2) Atomic access functions:

Starting from gcc 4.9, one can use atomic access functions to manipulate shared pointers. I dream of a day when a specialisation std::atomic<std::shared_ptr> exists, but for now, we will resort to std::atomic_load and std::atomic_exchange:

...
    std::shared_ptr<Type> doGet() const
    {
        return std::atomic_load(&value_);
    }

    void doSet(const std::shared_ptr<Type> &value)
    {
        std::atomic_exchange(&value_, value);
    }

private:
    std::shared_ptr<Type> value_;
...

Atomics are elegant and can often bring a great performance increase when they use lock-free instructions internally. Sadly, in the case of shared_ptrs, atomic_is_lock_free will return false. By digging into libstdc++ and libc++, you will find some mutexes. When dealing with atomic operations, gcc seems to use a fixed-size "pool" of mutexes, attributed to a shared_ptr according to a hash of its pointee's address. In other words, no rocket science for atomic shared pointers so far.

Our own watchers:

"...I shall live and die at my post. I am the sword in the darkness. I am the watcher on the walls. I am the shield that guards the realms of men..." -- The Night's Watch oath

We want to be able to seal a bond between one of the slots and a context. By context, I mean the lifetime of an object in a thread, a function or a method. If an update has been made on that slot, we must be signaled in that context and be able to retrieve the new update. The bond must be destroyed if the context does not exist anymore. It should remind you of the Night's Watch oath... as well as of the RAII idiom: "holding a resource is tied to object lifetime: resource acquisition is done during object creation, by the constructor, while resource deallocation is done during object destruction, by the destructor. If objects are destroyed properly, resource leaks do not occur.". A strong ownership policy can be obtained with the help of a std::unique_ptr, and the signaling can be done using a boolean flag.

We will, therefore, encapsulate a std::atomic_bool into a class Watcher, automagically registered to a slot once created and unregistered once destroyed. This Watcher class also keeps a reference to the slot in order to query its value, as you can see:

template <class Type, class Key>
class Watcher
{
public:
    Watcher(Slot<Type, Key>& slot):
            slot_(slot),
            hasBeenChanged_(false)
    {
    }

    Watcher(const Watcher&) = delete;               // This class cannot be copied...

    Watcher& operator=(const Watcher&) = delete;    // ...nor assigned.

    bool hasBeenChanged() const
    {
        return hasBeenChanged_;
    }

    void triggerChanges()
    {
        hasBeenChanged_ = true;
    }

    auto get() -> decltype(std::declval<Slot<Type, Key>>().doGet())
    {
        hasBeenChanged_ = false; // Note: even if the value is updated between this line and the doGet one,
        // we will still fetch the latest version.
        // Note 2: atomic_bool implies a memory barrier; the two operations can't be reordered.
        return slot_.doGet();
    }

private:
    Slot<Type, Key>& slot_;
    std::atomic_bool hasBeenChanged_;
};

As for the automatic registration, we will add two private methods, registerWatcher and unregisterWatcher, to our Slot class, which add or remove a watcher from an internal list. This list, always protected by a std::mutex when accessed, tracks all the current watchers that must be signaled when set is called on that slot.

template <class Type, class Key>
class Slot
{
public:
    using ThisType = Slot<Type, Key>;
    using WatcherType = Watcher<Type, Key>;

...
private:
    void registerWatcher(WatcherType* newWatcher)
    {
        std::lock_guard<std::mutex> l(watchers_mutex_);
        watchers_.push_back(newWatcher);
    }

    void unregisterWatcher(WatcherType* toBeDelete)
    {
        std::lock_guard<std::mutex> l(watchers_mutex_);
        watchers_.erase(std::remove(watchers_.begin(), watchers_.end(), toBeDelete), watchers_.end());

        delete toBeDelete; // Now that we removed the watcher from the list, we can proceed to delete it.
    }

    void signal()
    {
        std::lock_guard<std::mutex> l(watchers_mutex_);
        for (auto watcher : watchers_) {
            watcher->triggerChanges(); // Let's raise the hasBeenChanged_ atomic boolean flag. 
        }
    }

private:
    std::vector<WatcherType*> watchers_; // All the registered watchers are in that list.

...
};

You may have noticed that we are passing bare WatcherType pointers. The ownership is actually given to whoever uses that watcher, encapsulated within a std::unique_ptr. C++11's unique pointers are designed so that you can pass a custom deleter, a delete callback so to speak. Hence, we can create a method that gets a Watcher for a Slot and registers, as the deleter of that Watcher, a lambda function designed to call unregisterWatcher. Note that the slot MUST always live longer than the unique pointer and its associated watcher (which should not be a problem in most cases). Let's finish that Slot class forever and ever:

template <class Type, class Key>
class Slot
{
public:
    using ThisType = Slot<Type, Key>;
    using WatcherType = Watcher<Type, Key>;

    // We use unique_ptr for a strong ownership policy.
    // We use std::function to declare the type of our deleter.
    using WatcherTypePtr = std::unique_ptr<WatcherType, std::function<void(WatcherType*)>> ;

...

public:
    WatcherTypePtr doGetWatcher()
    {
        // Create a unique_ptr and pass a lambda as its deleter.
        // The lambda captures "this" and will call unregisterWatcher.
        WatcherTypePtr watcher(new WatcherType(*this), [this](WatcherType* toBeDelete) {
            this->unregisterWatcher(toBeDelete);});

        registerWatcher(watcher.get());

        return watcher;
    }
...
};

Are we done? Hell no, but we will be really soon. All we need is to expose the possibility to acquire a watcher from the repository itself. In the same manner as set and get, we simply dispatch, using the type and the key, to one of our slots:

template <class Type, class Key = DefaultSlotKey>
typename Slot<Type, Key>::WatcherTypePtr getWatcher() // typename is used to disambiguate
{
    return Slot<Type, Key>::doGetWatcher();
}

WAIT, don't close that page too fast! If you want to be able to snub everyone, you can replace this ugly typename Slot<Type, Key>::WatcherTypePtr with auto and claim that your repository class is C++14 only! Grab the full code of what we built together on gist and enjoy!

Conclusion:

Once again, I hope you enjoyed this post about one of my favourite subjects: C++. I might not be the best teacher nor the best author, but I wish that you learnt something today! If you have any suggestions or questions, feel free to post them in the comments. My broken English being what it is, I kindly accept any help with my written mistakes.

Many thanks to my colleagues who greatly helped me by reviewing my code, and for our time together.


An introduction to C++'s SFINAE concept: compile-time introspection of a class member

Posted on Sat 31 October 2015 in C++ • Tagged with C++11, C++14, TMP, meta programming

Trivia:

As a C++ enthusiast, I usually follow the annual C++ conference CppCon, or at least try to keep myself up-to-date with the major events that happen there. One way to catch up, if you can't afford a plane ticket or the entrance fee, is to follow the YouTube channel dedicated to this conference. This year, I was impressed by Louis Dionne's talk entitled "C++ Metaprogramming: A Paradigm Shift". One feature called is_valid, which can be found in Louis's Boost.Hana library, particularly caught my attention. This genius is_valid function relies heavily on an even more "magic" C++ programming technique coined with the term SFINAE, discovered at the end of the previous century. If this acronym doesn't speak to you, don't be scared, we are going to dive straight into the subject.

Note: for the sake of your sanity and the fact that errare humanum est, this article might not be 100% accurate!

Introspection in C++?

Before explaining what SFINAE is, let's explore one of its main usages: introspection. As you might be aware, C++ doesn't excel when it comes to examining the type or properties of an object at runtime. The best ability provided by default is RTTI. Not only is RTTI not always available, but it also gives you barely more than the current type of the manipulated object. Dynamic languages, or those having reflection, on the other hand, are really convenient in some situations like serialization.

For instance, in Python, using reflection, one can do the following:

class A(object):
    # Simply overrides the 'object.__str__' method.
    def __str__(self):
        return "I am a A"

class B(object):
    # A custom method for my custom objects that I want to serialize.
    def serialize(self):
        return "I am a B"

class C(object):
    def __init__(self):
        # Oops! 'serialize' is not a method.
        self.serialize = 0

    def __str__(self):
        return "I am a C"

def serialize(obj):
    # Let's check if obj has an attribute called 'serialize'.
    if hasattr(obj, "serialize"):
        # Let's check if this 'serialize' attribute is a method.
        if hasattr(obj.serialize, "__call__"):
            return obj.serialize()

    # Else we call the __str__ method.
    return str(obj)

a = A()
b = B()
c = C()

print(serialize(a)) # output: I am a A.
print(serialize(b)) # output: I am a B.
print(serialize(c)) # output: I am a C.

As you can see, during serialization, it comes in pretty handy to be able to check whether an object has an attribute and to query the type of this attribute. In our case, it permits us to use the serialize method if available and fall back to the more generic str method otherwise. Powerful, isn't it? Well, we can do it in plain C++!

Here is the C++14 solution mentioned in the Boost.Hana documentation, using is_valid:

#include <boost/hana.hpp>
#include <iostream>
#include <string>

using namespace std;
namespace hana = boost::hana;

// Check if a type has a serialize method.
auto hasSerialize = hana::is_valid([](auto&& x) -> decltype(x.serialize()) { });

// Serialize any kind of objects.
template <typename T>
std::string serialize(T const& obj) {
    return hana::if_(hasSerialize(obj), // Serialize is selected if available!
                     [](auto& x) { return x.serialize(); },
                     [](auto& x) { return to_string(x); }
    )(obj);
}

// Type A with only a to_string overload.
struct A {};

std::string to_string(const A&)
{
    return "I am a A!";
}

// Type B with a serialize method.
struct B
{
    std::string serialize() const
    {
        return "I am a B!";
    }
};

// Type C with a "wrong" serialize member (not a method) and a to_string overload.
struct C
{
    std::string serialize;
};

std::string to_string(const C&)
{
    return "I am a C!";
}

int main() {
    A a;
    B b;
    C c;

    std::cout << serialize(a) << std::endl;
    std::cout << serialize(b) << std::endl;
    std::cout << serialize(c) << std::endl;
}

As you can see, it only requires a bit more boilerplate than Python, but not as much as you would expect from a language as complex as C++. How does it work? Well, if you are too lazy to read the rest, here is the simplest answer I can give you: unlike in dynamically typed languages, your compiler has access to a lot of static type information once fired. It makes sense that we can constrain the compiler to do a bit of work on these types! The next question that comes to your mind is "How?". Well, right below we are going to explore the various options we have to enslave our favorite compiler for fun and profit! And we will eventually recreate our own is_valid.

The old-fashioned C++98-way:

Whether your compiler is a dinosaur, your boss refuses to pay for the latest Visual Studio license, or you simply love archeology, this chapter will interest you. It's also interesting for people stuck between C++11 and C++14. The C++98 solution relies on 3 key concepts: overload resolution, SFINAE and the static behavior of sizeof.

Overload resolution:

A simple function call like "f(obj);" in C++ activates a mechanism to figure out which f function should be called according to the argument obj. If a set of f functions could accept obj as an argument, the compiler must choose the most appropriate one, or in other words resolve the best overload! Here is a good cppreference page explaining the full process: Overload resolution. The rule of thumb is that the candidate function whose parameters match the arguments most closely is the one that is called. Nothing is better than a good example:

void f(std::string s); // int can't be converted into a string.
void f(double d); // int can be implicitly converted into a double, so this version could be selected, but...
void f(int i); // ... this version using the type int directly is an even closer match!

f(1); // Call f(int i);

In C++ you also have some sink-hole functions that accept everything. First, function templates accept any kind of parameter (let's say T). But the true black hole of your compiler, the devil's variable vacuum, the oblivion of forgotten types, is the variadic function. Yes, exactly like the horrible C printf.

std::string f(...); // Variadic functions are so "untyped" that...
template <typename T> std::string f(const T& t); // ...this templated function got the precedence!

f(1); // Call the templated function version of f.

The fact that function templates are less generic than variadic functions is the first point you must remember!

Note: A templated function can actually be a better match than a normal function. However, in case of a tie, the normal function takes precedence.

SFINAE:

I have been teasing you with its power for a few paragraphs now, and here finally comes the explanation of this not-so-complex acronym. SFINAE stands for Substitution Failure Is Not An Error. In rough terms, substitution is the mechanism that tries to replace the template parameters with the provided types or values. If the substitution leads to invalid code, the compiler shouldn't throw a massive amount of errors but simply continue trying the other available overloads. The SFINAE concept simply guarantees such a "sane" behavior for a "sane" compiler. For instance:

/*
 The compiler will try this overload since it's less generic than the variadic.
 T will be replaced by int, which gives us void f(const int& t, int::iterator* it = nullptr);
 int doesn't have an iterator sub-type, but the compiler doesn't throw a bunch of errors.
 It simply tries the next overload. 
*/
template <typename T> void f(const T& t, typename T::iterator* it = nullptr) { }

// The sink-hole.
void f(...) { }

f(1); // Calls void f(...) { }

Not all expressions lead to SFINAE. A broad rule would be to say that all the substitutions outside of the function/method body are "safe". For a better list, please take a look at this wiki page. For instance, a wrong substitution within a function body will lead to a horrible C++ template error:

// The compiler will be really unhappy when it will later discover the call to hahahaICrash. 
template <typename T> void f(T t) { t.hahahaICrash(); }
void f(...) { } // The sink-hole wasn't even considered.

f(1);

The operator sizeof:

The sizeof operator is really a nice tool! It permits us to obtain the size in bytes of a type or an expression at compile time. sizeof is really interesting in that it evaluates an expression as precisely as if it were compiled, without ever executing it. One can for instance do:

typedef char type_test[42];
type_test& f();

// In the following lines f won't even be truly called, but we can still access the size of its return type,
// thanks to the "fake evaluation" of the sizeof operator.
char arrayTest[sizeof(f())];
std::cout << sizeof(f()) << std::endl; // Output 42.

But wait! If we can manipulate some compile-time integers, couldn't we also do some compile-time comparisons? The answer is: absolutely yes, my dear reader! Here we are:

typedef char yes; // Size: 1 byte.
typedef yes no[2]; // Size: 2 bytes.

// Two functions using our type with different size.
yes& f1();
no& f2();

std::cout << (sizeof(f1()) == sizeof(f2())) << std::endl; // Output 0.
std::cout << (sizeof(f1()) == sizeof(f1())) << std::endl; // Output 1.

Combining everything:

Now we have all the tools to create a solution that checks for the existence of a method within a type at compile time. You might even have already figured most of it out by yourself. So let's create it:

template <class T> struct hasSerialize
{
    // For the compile time comparison.
    typedef char yes[1];
    typedef yes no[2];

    // This helper struct permits us to check that serialize is truly a method.
    // The second argument must be of the type of the first.
    // For instance reallyHas<int, 10> would be substituted by reallyHas<int, int 10> and works!
    // reallyHas<int, &C::serialize> would be substituted by reallyHas<int, int &C::serialize> and fail!
    // Note: It only works with integral constants and pointers (so function pointers work).
    // In our case we check that &C::serialize has the same signature as the first argument!
    // reallyHas<std::string (C::*)(), &C::serialize> should be substituted by 
    // reallyHas<std::string (C::*)(), std::string (C::*)() &C::serialize> and work!
    template <typename U, U u> struct reallyHas;

    // Two overloads for yes: one for the signature of a normal method, one is for the signature of a const method.
    // We accept a pointer to our helper struct, in order to avoid to instantiate a real instance of this type.
    // std::string (C::*)() is function pointer declaration.
    template <typename C> static yes& test(reallyHas<std::string (C::*)(), &C::serialize>* /*unused*/) { }
    template <typename C> static yes& test(reallyHas<std::string (C::*)() const, &C::serialize>* /*unused*/) { }

    // The famous C++ sink-hole.
    // Note that sink-hole must be templated too as we are testing test<T>(0).
    // If the method serialize isn't available, we will end up in this method.
    template <typename> static no& test(...) { /* dark matter */ }

    // The constant used as a return value for the test.
    // The test is actually done here, thanks to the sizeof compile-time evaluation.
    static const bool value = sizeof(test<T>(0)) == sizeof(yes);
};

// Using the struct A, B, C defined in the previous hasSerialize example.
std::cout << hasSerialize<A>::value << std::endl;
std::cout << hasSerialize<B>::value << std::endl;
std::cout << hasSerialize<C>::value << std::endl;

The reallyHas struct is kinda tricky but necessary to ensure that serialize is a method and not a simple member of the type. You can run a lot of tests on a type using variants of this solution (test a member, a sub-type...) and I suggest you google a bit more about SFINAE tricks. Note: if you truly want a pure compile-time constant and avoid some errors on old compilers, you can replace the last value evaluation with: "enum { value = sizeof(test<T>(0)) == sizeof(yes) };".

You might also wonder why it doesn't work with inheritance. Inheritance and dynamic polymorphism in C++ are runtime concepts, in other words information that the compiler won't have and can't guess! However, compile-time type inspection is much more efficient (zero impact at runtime) and almost as powerful as its runtime counterpart. For instance:

// Using the previous A struct and hasSerialize helper.

struct D : A
{
    std::string serialize() const
    {
        return "I am a D!";
    }
};

template <class T> bool testHasSerialize(const T& /*t*/) { return hasSerialize<T>::value; }

D d;
A& a = d; // Here we lost the type of d at compile time.
std::cout << testHasSerialize(d) << std::endl; // Output 1.
std::cout << testHasSerialize(a) << std::endl; // Output 0.

Last but not least, our test covers the main cases but not tricky ones like a functor:

struct E
{
    struct Functor
    {
        std::string operator()()
        {
            return "I am a E!";
        }
    };

    Functor serialize;
};

E e;
std::cout << e.serialize() << std::endl; // Successfully calls the functor.
std::cout << testHasSerialize(e) << std::endl; // Output 0.

The trade-off for full coverage would be readability. As you will see, C++11 shines in that domain!

Time to use our genius idea:

Now you might think that it will be super easy to use our hasSerialize to create a serialize function! Okay, let's try it:

template <class T> std::string serialize(const T& obj)
{
    if (hasSerialize<T>::value) {
        return obj.serialize(); // error: no member named 'serialize' in 'A'.
    } else {
        return to_string(obj);
    }
}

A a;
serialize(a);

It might be hard to accept, but the error raised by your compiler is absolutely normal! Consider the code that you will obtain after substitution and compile-time evaluation:

std::string serialize(const A& obj)
{
    if (0) { // Dead branching, but the compiler will still consider it!
        return obj.serialize(); // error: no member named 'serialize' in 'A'.
    } else {
        return to_string(obj);
    }
}

Your compiler is really a good guy and won't drop any dead branch; obj must therefore have both a serialize method and a to_string overload in this case. The solution consists in splitting the serialize function into two different functions: one where we solely use obj.serialize() and one where we use to_string, according to obj's type. We come back to an earlier problem that we already solved: how to dispatch according to a type? SFINAE, for sure! At that point we could rework our hasSerialize function into a serialize function and make it return a std::string instead of a compile-time boolean. But we won't do it that way! It's cleaner to separate the hasSerialize test from its usage in serialize.

We need to find a clever SFINAE solution for the signature of "template <class T> std::string serialize(const T& obj)". I bring you the last piece of the puzzle, called enable_if.

template<bool B, class T = void> // Default template version.
struct enable_if {}; // This struct doesn't define "type" and the substitution will fail if you try to access it.

template<class T> // A specialisation used if the expression is true. 
struct enable_if<true, T> { typedef T type; }; // This struct does have a "type" and won't fail on access.

// Usage:
enable_if<true, int>::type t1; // Compiler happy. t1's type is int.
enable_if<hasSerialize<B>::value, int>::type t2; // Compiler happy. t2's type is int.

enable_if<false, int>::type t3; // Compiler unhappy. no type named 'type' in 'enable_if<false, int>';
enable_if<hasSerialize<A>::value, int>::type t4; // no type named 'type' in 'enable_if<false, int>';

As you can see, we can trigger a substitution failure according to a compile-time expression with enable_if. Now we can use this failure in the "template <class T> std::string serialize(const T& obj)" signature to dispatch to the right version. Finally, we have the true solution to our problem:

template <class T> typename enable_if<hasSerialize<T>::value, std::string>::type serialize(const T& obj)
{
    return obj.serialize();
}

template <class T> typename enable_if<!hasSerialize<T>::value, std::string>::type serialize(const T& obj)
{
    return to_string(obj);
}

A a;
B b;
C c;

// The following lines work like a charm!
std::cout << serialize(a) << std::endl;
std::cout << serialize(b) << std::endl;
std::cout << serialize(c) << std::endl;

Two details are worth noting! Firstly, we use enable_if on the return type in order to keep the parameter deduction; otherwise we would have to specify the type explicitly: "serialize<A>(a)". Secondly, even the version using to_string must use enable_if, otherwise serialize(b) would have two potential overloads available and raise an ambiguity. If you want to check the full code of this C++98 version, here is a gist. Life is much easier in C++11, so let's see the beauty of this new standard!

Note: it's also important to know that this code relies on SFINAE in an expression ("&C::serialize"). While this feature wasn't required by the C++98 standard, it was already supported depending on your compiler. It truly became a safe choice in C++11.

When C++11 came to our help:

After the great century leap year in 2000, people were fairly optimistic about the coming years. Some even decided to design a new standard for the next generation of C++ coders like me! Not only would this standard ease TMP headaches (Template Meta Programming side-effects), it would also be available within the first decade, hence its code name C++0x. Well, the standard sadly came the next decade (2011 ==> C++11), but it brought a lot of features interesting for the purpose of this article. Let's review them!

decltype, declval, auto & co:

Do you remember that the sizeof operator does a "fake evaluation" of the expression that you pass to it and gives you the size of the type of that expression? Well, C++11 adds a new operator called decltype. decltype gives you the type of the expression it would evaluate. As I am kind, I won't let you google an example and will give you one directly:

B b;
decltype(b.serialize()) test = "test"; // Evaluate b.serialize(), which is typed as std::string.
// Equivalent to std::string test = "test";

declval is a utility that gives you a "fake reference" to an object of a type that couldn't be easily constructed. declval is really handy for our SFINAE constructions. The cppreference example is really straightforward, so here is a copy:

struct Default {
    int foo() const {return 1;}
};

struct NonDefault {
    NonDefault(const NonDefault&) {}
    int foo() const {return 1;}
};

int main()
{
    decltype(Default().foo()) n1 = 1; // int n1
//  decltype(NonDefault().foo()) n2 = n1; // error: no default constructor
    decltype(std::declval<NonDefault>().foo()) n2 = n1; // int n2
    std::cout << "n2 = " << n2 << '\n';
}

The auto specifier indicates that the type of the variable being declared will be automatically deduced; auto is the equivalent of var in C#. auto in C++11 also has a less famous but nonetheless useful role in function declarations. Here is a good example:

bool f();
auto test = f(); // Famous usage, auto deduced that test is a boolean, hurray!



//                             vvv t wasn't declared at that point; it will be after, as a parameter!
template <typename T> decltype(t.serialize()) g(const T& t) {   } // Compilation error

// Less famous usage:
//                    vvv auto delayed the return type specification!
//                    vvv                vvv the return type is specified here and use t!
template <typename T> auto g(const T& t) -> decltype(t.serialize()) {   } // No compilation error.

As you can see, auto permits us to use the trailing return type syntax and to use decltype coupled with an expression involving one of the function's arguments. Does it mean that we can use it to test the existence of serialize with SFINAE? Yes, Dr. Watson! decltype will shine really soon; you will have to wait for C++14 for this tricky auto usage (but since it's a C++11 feature, it ends up here).

constexpr:

C++11 also came with a new way to do compile-time computations! The new keyword constexpr is a hint to your compiler, meaning that this expression is constant and could be evaluated directly at compile time. In C++11, constexpr comes with a lot of rules, and only a small subset of expressions can be used (no loops...)! We still have enough for creating a compile-time factorial function:

constexpr int factorial(int n)
{
    return n <= 1? 1 : (n * factorial(n - 1));
}

int i = factorial(5); // Call to a constexpr function.
// Will be replace by a good compiler by:
// int i = 120;

constexpr increased the usage of std::true_type & std::false_type from the STL. As their names suggest, these types encapsulate a constexpr boolean "true" and a constexpr boolean "false". Their most important property is that a class or a struct can inherit from them. For instance:

struct testStruct : std::true_type { }; // Inherit from the true type.

constexpr bool testVar = testStruct(); // Generate a compile-time testStruct.
bool test = testStruct::value; // Equivalent to: test = true;
test = testVar; // true_type has a constexpr converter operator, equivalent to: test = true;

Blending time:

First solution:

In cooking, a good recipe requires mixing all the best ingredients in the right proportions. If you don't want to have spaghetti code dating from 1998 for dinner, let's revisit our C++98 hasSerialize and serialize functions with "fresh" ingredients from 2011. Let's start by replacing the rotting reallyHas trick with a tasty decltype, and baking in a bit of constexpr instead of sizeof. After 15 minutes in the oven (or fighting with a new headache), you will obtain:

template <class T> struct hasSerialize
{
    // We test if the type has serialize using decltype and declval.
    template <typename C> static constexpr decltype(std::declval<C>().serialize(), bool()) test(int /* unused */)
    {
        // We can return values, thanks to constexpr instead of playing with sizeof.
        return true;
    }

    template <typename C> static constexpr bool test(...)
    {
        return false;
    }

    // int is used to give the precedence!
    static constexpr bool value = test<T>(int());
};

You might be a bit puzzled by my usage of decltype. The C++ comma operator "," can create a chain of multiple expressions. In decltype, all the expressions will be validated, but only the last one will be considered for the type. The serialize function doesn't need any changes, except that enable_if is now provided by the STL. For your tests, here is a gist.

Second solution:

Another C++11 solution, described in the Boost.Hana documentation and using std::true_type and std::false_type, would be this one:

// Primary template, inherit from std::false_type.
// ::value will return false. 
// Note: the second unused template parameter is set to default as std::string!!!
template <typename T, typename = std::string>
struct hasSerialize
        : std::false_type
{

};

// Partial template specialisation, inherit from std::true_type.
// ::value will return true. 
template <typename T>
struct hasSerialize<T, decltype(std::declval<T>().serialize())>
        : std::true_type
{

};

This solution is, in my own opinion, sneakier! It relies on a not-so-famous property of default template parameters. But if your soul is already (stack-)corrupted, you may be aware that default parameters are propagated to the specialisations. So when we use hasSerialize<OurType>::value, the default parameter comes into play and we are actually looking for hasSerialize<OurType, std::string>::value, both on the primary template and on the specialisation. In the meantime, the substitution and the evaluation of decltype are processed, and our specialisation has the signature hasSerialize<OurType, std::string> if OurType has a serialize method that returns a std::string; otherwise, the substitution fails. The specialisation therefore takes precedence in the good cases. One can use the C++17 helper std::void_t in these cases. Anyway, here is a gist you can play with!

I told you that this second solution hides a lot of complexity, and we still have a lot of C++11 features unexploited, like nullptr, lambdas, r-values. No worries, we are going to use some of them in C++14!

The supremacy of C++14:

According to the Gregorian calendar in the upper-right corner of my XFCE environment, we are in 2015! I can safely turn on the C++14 compilation flag on my favorite compiler, can't I? Well, I can with clang (is MSVC using a Maya calendar?). Once again, let's explore the new features and use them to build something wonderful! We will even recreate an is_valid, as I promised at the beginning of this article.

auto & lambdas:

Return type inference:

Some cool features in C++14 come from the relaxed usage of the auto keyword (the one used for type inference).

Now, auto can be used on the return type of a function or a method. For instance:

auto myFunction() // Automagically figures out that myFunction returns ints.
{
    return int();
}

It works as long as the return type is easily "guessable" by the compiler. We are coding in C++ after all, not OCaml!

A feature for functional lovers:

C++11 introduced lambdas. A lambda has the following syntax:

[capture-list](params) -> non-mandatory-return-type { ...body... }

A useful example in our case would be:

auto l1 = [](B& b) { return b.serialize(); }; // Return type figured out from the return statement.
auto l2 = [](B& b) -> std::string { return b.serialize(); }; // Fixed return type.
auto l3 = [](B& b) -> decltype(b.serialize()) { return b.serialize(); }; // Return type dependent on the B type.

std::cout << l1(b) << std::endl; // Output: I am a B!
std::cout << l2(b) << std::endl; // Output: I am a B!
std::cout << l3(b) << std::endl; // Output: I am a B!

C++14 brings a small change to lambdas but with a big impact! Lambdas now accept auto parameters: the parameter type is deduced from the argument. Lambdas are implemented as objects of a newly created unnamed type, also called the closure type. If a lambda has some auto parameters, its "functor operator" operator() will simply be templated. Let's take a look:

// ***** Simple lambda unnamed type *****
auto l4 = [](int a, int b) { return a + b; };
std::cout << l4(4, 5) << std::endl; // Output 9.

// Equivalent to:
struct l4UnamedType
{
    int operator()(int a, int b) const
    {
        return a + b;
    }
};

l4UnamedType l4Equivalent = l4UnamedType();
std::cout << l4Equivalent(4, 5) << std::endl; // Output 9 too.



// ***** auto parameters lambda unnamed type *****

// t's type is automagically deduced!
auto l5 = [](auto& t) -> decltype(t.serialize()) { return t.serialize(); };

std::cout << l5(b) << std::endl; // Output: I am a B!
std::cout << l5(a) << std::endl; // Error: no member named 'serialize' in 'A'.

// Equivalent to:
struct l5UnamedType
{
    template <typename T> auto operator()(T& t) const -> decltype(t.serialize()) // /!\ This signature is nice for a SFINAE!
    {
        return t.serialize();
    }
};

l5UnamedType l5Equivalent = l5UnamedType();

std::cout << l5Equivalent(b) << std::endl; // Output: I am a B!
std::cout << l5Equivalent(a) << std::endl; // Error: no member named 'serialize' in 'A'.

More than the lambda itself, we are interested in the generated unnamed type: its operator() can be used for SFINAE! And as you can see, writing a lambda is less cumbersome than writing the equivalent type. It should remind you of the beginning of my initial solution:

// Check if a type has a serialize method.
auto hasSerialize = hana::is_valid([](auto&& x) -> decltype(x.serialize()) { });

And the good news is that we have everything we need to recreate is_valid, right now!

The making-of a valid is_valid:

Now that we have a really stylish manner of generating unnamed types with potential SFINAE properties using lambdas, we need to figure out how to use them! As you can see, hana::is_valid is a function that takes our lambda as a parameter and returns an object. We will call the type of that returned object the container. The container will be in charge of keeping the lambda's unnamed type for later usage. Let's start by writing the is_valid function and its container:

template <typename UnnamedType> struct container
{
    // Remembers UnnamedType.
};

template <typename UnnamedType> constexpr auto is_valid(const UnnamedType& t) 
{
    // We used auto for the return type: it will be deduced here.
    return container<UnnamedType>();
}

auto test = is_valid([](const auto& t) -> decltype(t.serialize()) {});
// Now 'test' remembers the type of the lambda and the signature of its operator()!

The next step consists of extending container with an operator() such that we can call it with an argument. This argument's type will be tested against the UnnamedType! In order to run a test on the argument type, we can once again use SFINAE on a 'recreated' UnnamedType object! This gives us the following solution:

template <typename UnnamedType> struct container
{
// Let's put the test in private.
private:
    // We use std::declval to 'recreate' an object of 'UnnamedType'.
    // We use std::declval to also 'recreate' an object of type 'Param'.
    // We can use both of these recreated objects to test the validity!
    template <typename Param> constexpr auto testValidity(int /* unused */)
    -> decltype(std::declval<UnnamedType>()(std::declval<Param>()), std::true_type())
    {
        // If substitution didn't fail, we can return a true_type.
        return std::true_type();
    }

    template <typename Param> constexpr std::false_type testValidity(...)
    {
        // Our sink-hole returns a false_type.
        return std::false_type();
    }

public:
    // A public operator() that accept the argument we wish to test onto the UnnamedType.
    // Notice that the return type is automatic!
    template <typename Param> constexpr auto operator()(const Param& p)
    {
        // The argument is forwarded to one of the two overloads.
        // The SFINAE on the 'true_type' will come into play to dispatch.
        // Once again, we use the int for the precedence.
        return testValidity<Param>(int());
    }
};

template <typename UnnamedType> constexpr auto is_valid(const UnnamedType& t) 
{
    // We used auto for the return type: it will be deduced here.
    return container<UnnamedType>();
}

// Check if a type has a serialize method.
auto hasSerialize = is_valid([](auto&& x) -> decltype(x.serialize()) { });

If you are a bit lost at this point, I suggest you take your time and re-read all the previous examples. You have all the weapons you need, now fight C++!

Our hasSerialize now takes an argument, so we need some changes to our serialize function. We can simply postpone the return type using auto and use the argument in a decltype, as we learned. Which gives us:

// Notice how I simply swapped the return type on the right?
template <class T> auto serialize(T& obj) 
-> typename std::enable_if<decltype(hasSerialize(obj))::value, std::string>::type
{
    return obj.serialize();
}

template <class T> auto serialize(T& obj) 
-> typename std::enable_if<!decltype(hasSerialize(obj))::value, std::string>::type
{
    return to_string(obj);
}
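Putting all the pieces together gives a self-contained sketch. Note that the `A`, `B`, and `to_string` definitions below are my assumptions, reconstructed to match the outputs used throughout this article:

```cpp
#include <string>
#include <type_traits>
#include <utility>

template <typename UnnamedType> struct container
{
private:
    // SFINAE test: valid only if UnnamedType is callable with a Param.
    template <typename Param> constexpr auto testValidity(int)
    -> decltype(std::declval<UnnamedType>()(std::declval<Param>()), std::true_type())
    {
        return std::true_type();
    }

    // Sink-hole overload, picked when substitution fails above.
    template <typename Param> constexpr std::false_type testValidity(...)
    {
        return std::false_type();
    }

public:
    template <typename Param> constexpr auto operator()(const Param&)
    {
        return testValidity<Param>(int()); // int() gives the true_type overload precedence.
    }
};

template <typename UnnamedType> constexpr auto is_valid(const UnnamedType&)
{
    return container<UnnamedType>();
}

// Check if a type has a serialize method.
auto hasSerialize = is_valid([](auto&& x) -> decltype(x.serialize()) {});

struct A {};                                                        // Assumed: no serialize method.
struct B { std::string serialize() const { return "I am a B!"; } }; // Assumed definition.
std::string to_string(const A&) { return "I am a A!"; }             // Assumed fallback.

template <class T> auto serialize(T& obj)
-> typename std::enable_if<decltype(hasSerialize(obj))::value, std::string>::type
{
    return obj.serialize();
}

template <class T> auto serialize(T& obj)
-> typename std::enable_if<!decltype(hasSerialize(obj))::value, std::string>::type
{
    return to_string(obj);
}
```

With this in place, `serialize(b)` dispatches to the member function while `serialize(a)` falls back to `to_string`.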

FINALLY!!! We have a working is_valid and we can use it for serialization! If I were as vicious as my SFINAE tricks, I would let you copy each code piece to recreate a fully working solution. But today, Halloween's spirit is with me, so here is the gist. Hey, hey! Don't close this article so fast! If you are a true warrior, you can read the last part!

For the fun:

There are a few things I didn't tell you, on purpose. This article would otherwise be twice as long, I fear. I highly suggest you google a bit more about what I am going to talk about.

  • Firstly, if you wish to have a solution that works with the Boost.Hana static if_, you need to change the return types of our testValidity methods to Hana's equivalents, like the following:

    template <typename Param> constexpr auto test_validity(int /* unused */)
    -> decltype(std::declval<UnnamedType>()(std::declval<Param>()), boost::hana::true_c)
    {
        // If substitution didn't fail, we can return Hana's true_c.
        return boost::hana::true_c;
    }
    
    template <typename Param> constexpr decltype(boost::hana::false_c) test_validity(...)
    {
        // Our sink-hole returns Hana's false_c.
        return boost::hana::false_c;
    }
    

    The static if_ implementation is really interesting, but at least as hard as our is_valid problem solved in this article. I might dedicate another article about it, one day!

  • Did you notice that we only check one argument at a time? Couldn't we do something like:

    auto test = is_valid([](auto&& a, auto&& b) -> decltype(a.serialize(), b.serialize()) { });
    A a;
    B b;
    
    std::cout << test(a, b) << std::endl;
    

    Actually we can, using some parameter packs. Here is the solution:

    template <typename UnnamedType> struct container
    {
    // Let's put the test in private.
    private:
        // We use std::declval to 'recreate' an object of 'UnnamedType'.
        // We use std::declval to also 'recreate' an object of type 'Param'.
        // We can use both of these recreated objects to test the validity!
        template <typename... Params> constexpr auto test_validity(int /* unused */)
        -> decltype(std::declval<UnnamedType>()(std::declval<Params>()...), std::true_type())
        {
            // If substitution didn't fail, we can return a true_type.
            return std::true_type();
        }
    
        template <typename... Params> constexpr std::false_type test_validity(...)
        {
            // Our sink-hole returns a false_type.
            return std::false_type();
        }
    
    public:
        // A public operator() that accept the argument we wish to test onto the UnnamedType.
        // Notice that the return type is automatic!
        template <typename... Params> constexpr auto operator()(Params&& ...)
        {
            // The argument is forwarded to one of the two overloads.
            // The SFINAE on the 'true_type' will come into play to dispatch.
            return test_validity<Params...>(int());
        }
    };
    
    template <typename UnnamedType> constexpr auto is_valid(UnnamedType&& t) 
    {
        // We used auto for the return type: it will be deduced here.
        return container<UnnamedType>();
    }
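Exercising this variadic version, under the same assumed A and B types as before, could look like this sketch:

```cpp
#include <string>
#include <type_traits>
#include <utility>

template <typename UnnamedType> struct container
{
private:
    // Valid only if UnnamedType is callable with all the Params at once.
    template <typename... Params> constexpr auto test_validity(int)
    -> decltype(std::declval<UnnamedType>()(std::declval<Params>()...), std::true_type())
    {
        return std::true_type();
    }

    template <typename... Params> constexpr std::false_type test_validity(...)
    {
        return std::false_type();
    }

public:
    template <typename... Params> constexpr auto operator()(Params&&...)
    {
        return test_validity<Params...>(int());
    }
};

template <typename UnnamedType> constexpr auto is_valid(UnnamedType&&)
{
    return container<UnnamedType>();
}

struct A {};                                                        // Assumed: no serialize method.
struct B { std::string serialize() const { return "I am a B!"; } }; // Assumed definition.

// True only if *both* arguments have a serialize method.
auto bothSerializable = is_valid([](auto&& a, auto&& b) -> decltype(a.serialize(), b.serialize()) {});
```

`decltype(bothSerializable(b1, b2))` is `std::true_type` for two `B`s, and `std::false_type` as soon as an `A` is mixed in.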
    
  • This code works even if my types are incomplete, for instance a forward declaration, or a normal declaration with a missing definition. What can I do? Well, you can insert a check on the size of your type, either inside the SFINAE construction or before calling it: "static_assert(sizeof(T), "type is incomplete.");".
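A tiny sketch of that guard (the `check_complete` helper name is my own invention):

```cpp
struct Incomplete; // Forward declaration only: sizeof(Incomplete) will not compile.
struct Complete {};

template <typename T>
constexpr bool check_complete()
{
    // Fails to compile (rather than silently misbehaving) on incomplete types.
    static_assert(sizeof(T), "type is incomplete.");
    return true;
}
```

`check_complete<Complete>()` compiles fine, while `check_complete<Incomplete>()` triggers a hard error at the sizeof, which is exactly what we want here.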

  • Finally, why are we using the "&&" notation for the lambda parameters? Well, these are called forwarding references. It's a really complex topic, and if you are interested, here is a good article about it. You need to use "auto&&" due to the way declval works in our is_valid implementation!

Notes:

This is my first serious article about C++ on the web and I hope you enjoyed it! I would be glad to hear any suggestions or questions you may wish to share with me in the comments.

Anyway, thanks to Naav and Superboum for rereading this article and for their suggestions. A few suggestions were also provided by the reddit community or in the comments of this post, thanks a lot guys!


Why starting this blog?

Posted on Sat 17 October 2015 in News • Tagged with blogLeave a comment

Whilst I never had an urge to browse the various social networks daily, I have always liked to read technical articles and to follow a few blogs. For a few years I have wanted to write my thoughts down somewhere, in order to clarify them and be able to easily fetch them later on. I procrastinated over the idea of running a tech blog for a few years; my aversion to over-engineered web technologies and a lack of motivation restrained me. However, I have more spare time these days and I stumbled upon a simple and elegant Python static blog generator called Pelican. With Pelican, I don't have to pull a thousand npm packages for a sanitizing function, nor use the latest language for hipsters with an eccentric syntax. Static pages also greatly simplify maintenance and reduce the security burdens of dynamic websites.

While this blog is mostly for my own purposes, I also hope that I can get external opinions on my posts, my methods, or even my skills in the beautiful English language.