C++ Async Development (not only for) for C# Developers Part V: Cancellation

In the previous part of the C++ Async Development (not only for) for C# Developers series we looked at exception handling in the C++ Rest SDK/Casablanca. Today we will take a look at cancellation. However, because in the C++ Rest SDK cancellation and exceptions have a lot in common, I recommend reading the post about exception handling (if you have not already read it) before looking at cancellation.

The C++ Rest SDK allows cancelling running or scheduled tasks. Running tasks cannot, however, be forcefully cancelled from the outside. Rather, external code may request cancellation and the task may (but is not required to) cancel an ongoing operation. (This is actually very similar to how cancellation of tasks works in C# async.)

So, cancellation in its simplest form is about performing a check inside the task to determine if cancellation was requested and – if it was – giving up the current activity and returning from the task. While conceptually simple, implementing a reusable mechanism that allows you to “perform a check to determine if cancellation was requested” is not an easy task due to threading and intricacies around capturing variables in lambda expressions. Fortunately, we don’t have to implement such a mechanism ourselves since the C++ Rest SDK already offers one. It is based on two classes – pplx::cancellation_token and pplx::cancellation_token_source. The idea is that you create a pplx::cancellation_token_source instance and capture it in the lambda expression representing a task. Then inside the task you use this pplx::cancellation_token_source instance to obtain a pplx::cancellation_token whose is_canceled method tells you if cancellation was requested. To request cancellation you just invoke the cancel method on the same pplx::cancellation_token_source instance you passed to your task. Typically you will call the cancel method from a different thread. Let’s take a look at this example:

void simple_cancellation()
{
    pplx::cancellation_token_source cts;

    auto t = pplx::task<void>([cts]()
    {
        while (!cts.get_token().is_canceled())
        {
            std::cout << "task is running" << std::endl;
            pplx::wait(90);
        }

        std::cout << "task cancelled" << std::endl;
    });

    pplx::wait(500);

    std::cout << "requesting task cancellation" << std::endl;
    cts.cancel();
    t.get();
}

Here is the output:

task is running
task is running
task is running
task is running
task is running
task is running
requesting task cancellation
task cancelled
Press any key to continue . . .

One noteworthy thing in the code above is that the pplx::cancellation_token_source variable is captured by value. This is possible because pplx::cancellation_token_source behaves like a smart pointer. It also guarantees thread safety. All of this makes it really easy to pass it around or capture it in lambda functions in this multi-threaded environment.
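
A quick sketch illustrates these copy semantics – cancelling through a copy is visible through the original instance:

pplx::cancellation_token_source cts;
auto copy = cts; // a cheap copy - both instances refer to the same underlying source

copy.cancel();   // cancelling via the copy...

// ...is observed through the original instance (prints "true")
std::cout << std::boolalpha << cts.get_token().is_canceled() << std::endl;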

We now know how to cancel a single task. However, in the real world it is very common to have continuations scheduled to run after a task completes. In most scenarios when cancelling a task you also want to cancel its continuations. You could capture the pplx::cancellation_token_source instance and do a check in each of the continuations but fortunately there is a better way. The pplx::task::then() function has an overload that takes a pplx::cancellation_token as a parameter. When using this overload the continuation will not run if the token passed to this function is cancelled. The following sample shows how this works:

void cancelling_continuations()
{
    pplx::cancellation_token_source cts;

    auto t = pplx::task<void>([cts]()
    {
        std::cout << "cancelling continuations" << std::endl;

        cts.cancel();
    })
    .then([]()
    {
        std::cout << "this will not run" << std::endl;
    }, cts.get_token());

    try
    {
        if (t.wait() == pplx::task_status::canceled)
        {
            std::cout << "task has been cancelled" << std::endl;
        }
        else
        {
            t.get(); // if task was not cancelled - get the result
        }
    }
    catch (const std::exception& e)
    {
        std::cout << "excption :" << e.what() << std::endl;
    }
}

Here is the output:

cancelling continuations
task has been cancelled
Press any key to continue . . .

In the code above the first task cancels the token. This results in cancelling its continuation because we pass the cancellation token when scheduling the continuation. This is pretty straightforward. The important thing, however, is to also look at what happens on the main thread. In the previous example we called pplx::task::get(). We did this merely to block the main thread from exiting since that would kill all the outstanding threads including the one our task is running on. Here we call pplx::task::wait(). It blocks the main thread until the task reaches a terminal state and returns a status telling us whether the task ran to completion or was cancelled. Note that if the task threw an exception this exception would be re-thrown from pplx::task::wait(). If this happens we need to handle this exception or the application will crash. Finally, for completeness, we also have code that gets the result of the task (t.get()). It will never be executed in our case since we always cancel the task but in reality, more often than not, the task won’t be cancelled, in which case you would want to get the result of the task.

There are a couple of questions I would like to drill into here:
– Why do we even need to use pplx::task::wait() – couldn’t we just use pplx::task::get()? The difference between pplx::task::wait() and pplx::task::get() is that pplx::task::get() will throw a pplx::task_canceled exception if a task has been cancelled whereas pplx::task::wait() does not throw for cancelled tasks (see the sketch below). Therefore by using pplx::task::wait() you can avoid exceptions for cancelled tasks. This may also simplify your code when you handle exceptions in a different place than where you want to handle cancellation.
– What happens if we don’t call pplx::task::wait()/pplx::task::get() and the task is cancelled? Sometimes we want to start an activity in the “fire-and-forget” manner and we don’t really care about the result so calling pplx::task::wait()/pplx::task::get() does not seem necessary. Unfortunately this won’t fly – if you call neither pplx::task::wait() nor pplx::task::get() on a task that has been cancelled, a pplx::task_canceled exception will eventually be thrown and will crash your app – the same way any other unobserved exception would.
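
To make the difference concrete, here is a small sketch (the function name is made up) contrasting the two calls on an already cancelled task:

void wait_vs_get()
{
    pplx::cancellation_token_source cts;
    cts.cancel();

    // this continuation is cancelled because the token is already cancelled
    auto t = pplx::task_from_result()
        .then([]() {}, cts.get_token());

    // pplx::task::wait() just reports the status - no exception is thrown
    if (t.wait() == pplx::task_status::canceled)
    {
        std::cout << "task was cancelled" << std::endl;
    }

    // pplx::task::get() on the same cancelled task throws pplx::task_canceled
    try
    {
        t.get();
    }
    catch (const pplx::task_canceled&)
    {
        std::cout << "pplx::task_canceled caught" << std::endl;
    }
}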

In the previous example we blocked the main thread to be able to handle the cancellation. This is not good – we use asynchronous programming to avoid blocking so it would be nice if we figured out how not to block the main thread. An alternative to handling a cancellation in the main thread is to handle cancellations in continuations. We can use the same pattern we used before to handle exceptions, i.e. we can create a task based continuation in which we call pplx::task::wait() to get the task status, or pplx::task::get() and catch a potential pplx::task_canceled exception. The important thing is that the continuation must not be scheduled using the overload of the pplx::task::then() function that takes the cancellation token. Otherwise the continuation wouldn’t run if the task was cancelled and the cancellation wouldn’t be handled, resulting in an unobserved pplx::task_canceled exception that would crash your app. Here is an example of handling cancellation in a task based continuation:

void handling_cancellation_in_continuation()
{
    pplx::cancellation_token_source cts;

    auto t = pplx::task<void>([cts]()
    {
        std::cout << "cancelling continuations" << std::endl;

        cts.cancel();
    })
    .then([]()
    {
        std::cout << "this will not run" << std::endl;
    }, cts.get_token())
    .then([](pplx::task<void> previous_task)
    {
        try
        {
            if (previous_task.wait() == pplx::task_status::canceled)
            {
                std::cout << "task has been cancelled" << std::endl;
            }
            else
            {
                previous_task.get();
            }
        }
        catch (const std::exception& e)
        {
            std::cout << "exception " << e.what() << std::endl;
        }
    });

    t.get();
}

and the output:

cancelling continuations
task has been cancelled
Press any key to continue . . .

This is mostly it. There is one more topic related to cancellation – linked cancellation token sources – but it is beyond the scope of this post. I believe that the information included in this post should make it easy to understand how linked cancellation token sources work and how to use them if needed.

C++ Async Development (not only for) for C# Developers Part IV: Exception handling

Last time we were able to run some tasks asynchronously. Things worked great and it was pretty straightforward. However real life scenarios are not as simple as the ones I used in the previous post. For instance the networking environment can be quite hostile – the server you are connecting to may go down for any reason, the connection might get dropped at any time, etc. Any of these conditions will typically result in an exception which, if unhandled, will crash your process and bring down your application. C++ async is no different – you can easily check it for yourself by running this code:

pplx::task_from_result()
    .then([]()
    {
        throw std::exception("test exception");
    }).get();

(the “for C# Developers” part – note that in the .NET Framework 4 UnobservedTaskExceptions would terminate the application. This was changed in .NET Framework 4.5 where UnobservedTaskExceptions no longer terminate the application (it is still recommended to observe and handle exceptions though). The behavior in C++ async with Casablanca is more in line with the .NET Framework 4 – any unobserved exception will eventually lead to a crash.)
You might think that the way to handle this exception is just to wrap the call in a try…catch block. This would work if you blocked (e.g. used .get()) since you would be executing the code synchronously. However if you start a task, it will run on a different thread and the exception will be thrown on this new thread – not where you are trying to catch it. As a result your app would still crash. The idea is that you have to observe exceptions not where tasks are started but where they are completed (i.e. where you call .get() or .wait()). Using a continuation for exception handling seems like a great choice because continuations run only after the previous task has completed. So, let’s build on the previous code snippet and add a continuation that handles the exception. It would look like this (I am still using .get() at the very end but it is only to prevent the main thread from exiting and terminating the other thread):

pplx::task_from_result()
    .then([]()
    {
        throw std::exception("test exception");
    })
    .then([](pplx::task<void> previous_task)
    {
        try
        {
            previous_task.get();
        }
        catch (const std::exception& e)
        {
            std::cout << e.what() << std::endl;
        }
    }).get();

One very important thing to notice is that the continuation I added takes pplx::task<void> as the parameter. This is a so called “task based continuation” and it is different from the continuations we have seen so far which took the value returned by the previous task (or did not take any parameter if the previous task was void). The continuations we had worked with before were “value based continuations” (we will come back to value based continuations in the context of exception handling shortly). With task based continuations you don’t receive the result from the previous task but the task itself. Now you are in the business of retrieving the result yielded by this task. As we know from the previous post the way to get the result returned by a task is to call .get() or .wait(). Since exceptions are in a sense also results of executing a task, if the task threw, calling .get()/.wait() will result in rethrowing this exception. We can then catch and handle it and thus make the exception “observed” so the process will no longer crash. When I first came across this pattern it puzzled me a bit. I thought ‘.get() is blocking and I use async to avoid blocking so isn’t it a contradiction?’. But then I realized that we are already in a continuation so the task has already completed and .get() is no longer blocking – it merely allows getting the result of the previous task (be it a value or an exception).
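
To make the distinction concrete, here is a minimal sketch showing both kinds of continuations side by side:

pplx::task_from_result(42)
    .then([](int value)                      // value based - receives the result itself
    {
        return value + 1;
    })
    .then([](pplx::task<int> previous_task)  // task based - receives the whole task
    {
        // the previous task has already completed here so .get() does not block
        std::cout << previous_task.get() << std::endl; // prints 43
    }).get();
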
Coming back to value based continuations – let’s see what would happen if we added a value based continuation after the continuation that throws but before the continuation that handles this exception – just like this:

pplx::task_from_result()
    .then([]()
    {
        throw std::exception("test exception");
    })
    .then([]()
    {
        std::cout << "calculating The Answer..." << std::endl;
        return 42;
    })
    .then([](pplx::task<int> previous_task)
    {
        try
        {
            std::cout << previous_task.get() << std::endl;
        }
        catch (const std::exception& e)
        {
            std::cout << e.what() << std::endl;
        }
    }).get();

(One thing to notice – since the continuation we inserted now returns int (or actually pplx::task<int> – there are some pretty clever C++ tricks that allow returning just a value, or just throwing an exception, even though the .then() function ultimately returns a pplx::task<T> or pplx::task<void>) the task based continuation now has to take a parameter of the pplx::task<int> type instead of the pplx::task<void> type.) If you run the above code the result will be exactly the same as in the previous example. Why? When a task throws an exception all value based continuations are skipped until a task based continuation is encountered – it will be invoked and will have a chance to handle the exception. This is a big and very important difference between task based and value based continuations. It also makes a lot of sense – something bad happened and in a value based continuation you would have no way of knowing that it did or what it was since you have no access to the exception. There is also nothing to pass to it if the previous task was supposed to return something but threw instead. As a result, executing value based continuations as if nothing had happened would be plainly wrong.
If you have played a little bit with Casablanca or have seen some more advanced code that is using Casablanca you might have come across the pplx::task_from_exception() function. You might have been wondering why it is needed if you can just throw an exception. Typically tasks are executed on multiple threads and it is very common that an exception thrown on one thread is observed on a different thread. As a result it is impossible to just unwind the stack when trying to find an exception handler. Rather, the exception has to be saved (which will make the task faulted) and then re-thrown when the user calls .get() or .wait() to get the result. If you use the .then() function all this happens behind the scenes – you throw an exception from a continuation and the .then() function will catch it and turn it into a faulted task which will be passed to the next available task based continuation. However, consider the following function:

pplx::task<int> exception_test(bool should_throw)
{
    if (should_throw)
    {
        throw std::exception("bogus exception");
    }

    return pplx::task_from_result<int>(42);
}

If you pass true it will throw an exception, otherwise it will return a value. Note that I cannot just return 42; here because the return type of the function is pplx::task<int> and not int and there is no Casablanca magic involved which could turn my 42 into a task. Therefore I have to use pplx::task_from_result<int>() to return a completed task with the result. Now, let’s try to build a task based continuation that observes the exception we throw – something like this:

exception_test(true)
    .then([](pplx::task<int> previous_task)
    {
        try
        {
            previous_task.get();
        }
        catch (const std::exception& e)
        {
            std::cout << "exception: " << e.what() << std::endl;
        }
    }).get();

If you run this code it will crash. The reason is simple – we just synchronously throw from the exception_test function and no one is handling this exception. Note that we are not able to handle this exception in the continuation since it is never invoked – because there was no handler, the exception crashed the application before execution even got to the .then(). To fix this the exception_test function needs to be modified as follows:

pplx::task<int> exception_test(bool should_throw)
{
    if (should_throw)
    {
        return pplx::task_from_exception<int>(std::exception("bogus exception"));
    }

    return pplx::task_from_result<int>(42);
}

Now instead of throwing an exception we return a faulted task. This task is then passed to our task based continuation which can now handle the exception.
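
For completeness, the same task based continuation also handles the success path – here is a quick sketch of calling the fixed function with false:

exception_test(false)
    .then([](pplx::task<int> previous_task)
    {
        try
        {
            // no exception this time - prints "result: 42"
            std::cout << "result: " << previous_task.get() << std::endl;
        }
        catch (const std::exception& e)
        {
            std::cout << "exception: " << e.what() << std::endl;
        }
    }).get();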

That’s it for today. Next time we will look at cancellation.

C++ Async Development (not only for) for C# Developers Part III: Introduction to C++ async

Now that we know a bit about lambda functions in C++ we can finally take a look at C++ async programming with the C++ Rest SDK (a.k.a. cpprestsdk a.k.a. Casablanca). The C++ Rest SDK is cross platform but thanks to the NuGet support for native packages using it with Visual Studio is extremely easy. If you would like to use it on non-Windows platforms, in the majority of cases you will have to compile the code yourself (note to Raspberry Pi fans – I tried compiling it on my Raspberry Pi but unfortunately compilation failed due to a gcc internal compiler error). With Visual Studio you can create a new Win32 Console Application, right click on the project in the Solution Explorer and click the “Manage NuGet Packages” option. In the Manage NuGet Packages window type “cpprestsdk” in the search box and install the C++ REST SDK. This is it – you are now ready to write asynchronous programs in C++. If you prefer, you can also install the package from the Package Manager Console with the Install-Package cpprestsdk command.

Once the C++ Rest SDK is installed we can start playing with it. Let’s start with something really simple like displaying a textual progress bar. We will create a task that will loop 10 times. In each iteration it will sleep for a short period of time and then write an asterisk to the console. To show that it works asynchronously we will have a similar loop in the main thread where we will similarly wait and write a different character to the console. I expect the characters to be mixed which would prove that the progress task is actually running on a different thread. Here is the code:

#include "stdafx.h"
#include "pplx\pplxtasks.h"
#include <iostream>

int main()
{
    pplx::task<void>([]()
    {
        for (int i = 0; i < 10; i++)
        {
            pplx::wait(200);
            std::cout << "*";
        };

        std::cout << std::endl << "Completed" << std::endl;
    });

    for (int i = 0; i < 10; i++)
    {
        pplx::wait(300);
        std::cout << "#";
    }

    std::cout << std::endl;

    return 0;
}

And here is the result I got:

*#**#*#**#*#**#*
Completed
####
Press any key to continue . . .

From the output it appears that everything went as planned. Let’s take a closer look at what is really happening in the program. pplx::task<void> creates a new task that is scheduled to be run on a different thread. Once the task is created the loop writing # characters is executed. In the meantime the scheduled task is picked up and executed on a different thread. Note that the loop in the main thread will take longer to execute than the loop in the task. What would happen if the main thread did not live long enough – e.g. if we did not have a loop in the main thread that runs longer than the task? You can easily check this by decreasing the timeout but basically the main thread would exit and would terminate all other threads – including the one the task is running on – so the task would be killed. You can, however, block a thread and wait until a task is completed by using the .get() or the .wait() method. In general you want to avoid blocking threads but sometimes it can be helpful. (Note that this is in contrast to the managed world (e.g. C#) where the expectation is that apps using async features are async inside out and blocking oftentimes leads to bad things like deadlocks. The built-in support for async like the async/await keywords and exception handling in async methods helps tremendously to meet this expectation.) Here is a new version of the program from above which is now using the .get() method to block execution of the main function until the task completes:

int main()
{
    auto task = pplx::task<void>([]()
    {
        for (int i = 0; i < 10; i++)
        {
            pplx::wait(200);
            std::cout << "*";
        };

        std::cout << std::endl << "Completed" << std::endl;
    });

    std::cout << "waiting for task to complete" << std::endl;
    task.get();
    std::cout << "task completed" << std::endl;

    return 0;
}

The program should output this:

waiting for task to complete
**********
Completed
task completed
Press any key to continue . . .

The output shows that .get() did the job – once the .get() method was reached the main thread was blocked waiting for the task to complete and was unblocked when the task finished. This is great but can we take it to the next level – for instance, can we return a value from the task? This is actually quite easy – you just need to return the value from the task. In our case we can return how much time (in nanoseconds) our task took to execute. We will use types from the std::chrono namespace so don’t forget to add #include <chrono> – I am leaving includes out for brevity.

int main()
{
    auto task = pplx::task<std::chrono::nanoseconds>([]()
    {
        auto start = std::chrono::steady_clock::now();

        for (int i = 0; i < 10; i++)
        {
            pplx::wait(200);
            std::cout << "*";
        };
        
        std::cout << std::endl << "Completed" << std::endl;

        return std::chrono::steady_clock::now() - start;
    });

    std::cout << "waiting for task to complete" << std::endl;
    auto task_duration = task.get();
    std::cout << "task completed in " << task_duration.count() << " ns." << std::endl;

    return 0;
}

If you look closely at the code you will notice that I modified the type of the task – instead of being pplx::task<void> it is now pplx::task<std::chrono::nanoseconds> – and modified the lambda body so that it now returns a value. As a result the .get() method is no longer void – it now returns a value of the std::chrono::nanoseconds type which is the value we returned from our task. For completeness this is what I got on my screen when I ran this program:

waiting for task to complete
**********
Completed
task completed in 2003210000 ns.
Press any key to continue . . .

While being able to run a task asynchronously is powerful, oftentimes you want to run another task after a task has completed and have it use the result of the previous task. Both tasks should run asynchronously and should not require blocking the main thread to pass the result from one task to the other. For instance, imagine you are working with a service that returns a list of ids and names in one request but also can return details for a given id in a different request. If you want to get details for a given name you would need to first send a request to get the ids for names and then send another request to get the details for the id. From the perspective of the main thread you just want to say: “give me details for this name (and I don’t care how you do it)”. This can be achieved with task chaining. You chain tasks using the .then() method. In the simplest form it just takes the value returned by the previous task as the parameter. For example, in our case, if we wanted to get the result in milliseconds and not nanoseconds we could write a continuation that does the conversion (yes, there is no real benefit of doing such a simple conversion in a continuation, especially since it isn’t an asynchronous operation and can easily be done in the first continuation or in the main thread, but imagine you need to connect to a service that does the conversion for you) like this:

int main()
{
    auto task = pplx::task<std::chrono::nanoseconds>([]()
    {
        auto start = std::chrono::steady_clock::now();

        for (int i = 0; i < 10; i++)
        {
            pplx::wait(200);
            std::cout << "*";
        };

        std::cout << std::endl << "Completed" << std::endl;

        return std::chrono::steady_clock::now() - start;
    })
    .then([](std::chrono::nanoseconds duration)
    {
        std::cout << "task duration in ns: " << duration.count() << std::endl;
        return duration.count() / 1000000;
    });

    std::cout << "waiting for task to complete" << std::endl;
    auto task_duration = task.get();
    std::cout << "task completed in " << task_duration << " ms." << std::endl;

    return 0;
}

Running the program results in the following output:

waiting for task to complete
**********
Completed
task duration in ns: 2004372000
task completed in 2004 ms.
Press any key to continue . . .

That’s pretty much it for today. Next time we will look at different types of continuations, exception handling and possibly cancellation.

C++ Async Development (not only for) for C# Developers Part II: Lambda functions – the most common pitfalls

Warning: Some examples in this post belong to the “don’t do this at home” category.

In the first part of C++ Async Development (not only for) for C# Developers we looked at lambda functions in C++. They are cool but as you probably can guess there are some “gotchas” you need to be aware of to avoid undefined behavior and crashes. Like many problems in C++ these are related to object lifetimes where a variable is used after it has already been destroyed. For instance what do you think will be printed (if anything) if you run the following code:

std::function<int()> get_func()
{
    auto x = 42;

    return [&x]() { return x; };
}

int main()
{
    std::cout << get_func()() << std::endl;
    return 0;
}

?

On my box it prints -858993460 but it could print anything or it could just as well crash. I believe a crash would actually have been better because I would not have carried on with a random result. The reason why this happened is simple. The variable x was captured by reference meaning the code inside the lambda referred to the actual local variable. The lambda was returned from the get_func() function extending the lifetime of the closure. However the local variable x went out of scope when we left get_func() so the reference captured in the closure no longer referenced x but rather a random location in memory that had already been reclaimed and potentially reused. A variation of this problem can also be encountered when working with classes. Take a look at this code:

class widget
{
public:
    std::function<int()> get_func()
    {
        return [this]() { return m_x; };
    }

private:
    int m_x = 42;
};

std::function<int()> get_func()
{
    return widget{}.get_func();
}

int main()
{
    std::cout << get_func()() << std::endl;
    return 0;
}

It basically has the same flaw as the previous example. As you probably remember from the post about lambdas the this pointer is always captured by value. So, the lambda captures a copy of the this pointer to be able to access class members but by the time the lambda is actually invoked the object is gone so the pointer points to God knows what…
Now let’s try a little game. Find one difference between this class:

class widget
{
public:
    std::function<int()> get_func()
    {
        return [=]() { return m_x; };
    }

private:
    int m_x = 42;
};

and the class from the previous example. If you have not found it yet look at how we capture variables. Before, we captured the this pointer explicitly but now we don’t. So, we should be good, right? Not so fast. If you look at the lambda body you will see that we are still accessing the m_x variable. This is a class variable and class variables cannot be accessed without an instance of the class they are members of. So, what happened is that when we captured variables implicitly the compiler captured the this pointer because we used a class variable in the lambda body. Since the this pointer is always captured by value the new class has the same problem as the previous one. To make things even more interesting the result would be exactly the same if we captured variables implicitly by reference (e.g. if we used [&] to capture variables). This is because the compiler notices that we use class variables in our lambda and therefore the this pointer needs to be captured. However the this pointer is always captured by value (notice that [&this] won’t compile) so even though we used [&] to request capturing variables by reference the this pointer was still captured by value.
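
To see this in code, here is the same class written with [&] – it compiles but carries exactly the same flaw:

class widget
{
public:
    std::function<int()> get_func()
    {
        // [&] looks like capture by reference but using m_x forces the compiler
        // to capture the this pointer - and this is always captured by value
        return [&]() { return m_x; };
    }

private:
    int m_x = 42;
};
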
If you think that the scenarios shown above rarely happen in real life you might be right. However the “real life” is a bit different when doing things asynchronously. In this case lambdas represent tasks that are to be run sometime in the future. In general you have little or no control over how or when these tasks will be run (there are some advanced settings you can pass when scheduling a task which may give you some control – I have not investigated them yet). Basically, when a thread becomes available a scheduled task will be run on this thread. This means that the task (i.e. the lambda) will run completely independently of the code that scheduled it and there are no guarantees that the variables captured when creating the task will still be around when the task actually runs. As a result you can encounter scenarios similar to the ones I described above quite often. So, what to do then? Over the past few months I developed a few rules and patterns which are helpful in these situations.
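
Before we get to them, here is a contrived sketch (firmly in the “don’t do this at home” category; it uses the pplx::task type covered in the async parts of this series) of how easily this bug appears with asynchronously scheduled tasks:

pplx::task<int> schedule_work()
{
    auto local_value = 42;

    // local_value goes out of scope as soon as schedule_work() returns
    // but the task may run much later on another thread
    return pplx::task<int>([&local_value]()
    {
        return local_value; // undefined behavior - the reference dangles
    });
}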

Do not capture implicitly
This is probably a little controversial but I like to capture variables explicitly by listing them in the capture list. This prevents me from introducing a bug where I capture the this pointer implicitly and use it even though I don’t have a guarantee that it will be valid when the lambda is invoked. Unfortunately this rule has one drawback – sometimes after I refactor the code I forget to remove from the capture list a variable that is no longer used in the lambda. In general, if you understand when not to use the this pointer in your lambda you should be fine capturing implicitly.

Try to capture variables by value if possible/feasible (if you can’t guarantee the lifetime)
If you look at the first case we capture the local variable x by reference but it is not necessary. Had we captured it by value we would have avoided all the problems. For instance we could do this:

std::function<int()> get_func()
{
    auto x = 42;

    return [x]() { return x; };
}

Sometimes you may want to capture by reference only to avoid copying. It is a valid reason and it is fine as long as you can guarantee that the variable captured in the closure outlives the closure itself. Otherwise it is better to be slower and correct than faster but returning invalid results or crashing.
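
For instance, here is a sketch of a case where capturing by reference is safe because the closure never outlives the variable (includes <vector>, <numeric> and <iostream> left out for brevity):

void safe_capture_by_reference()
{
    std::vector<int> big_vector(1000000, 1);

    // safe: the lambda is created, invoked and discarded while big_vector
    // is still in scope so the captured reference never dangles
    auto sum = [&big_vector]()
    {
        return std::accumulate(big_vector.begin(), big_vector.end(), 0);
    }();

    std::cout << sum << std::endl; // prints 1000000
}
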
The same applies to capturing class variables. If you can capture them by value – do it. In the second case we could change the code as follows:

class widget
{
public:
    std::function<int()> get_func()
    {
        auto x = m_x;
        return [x]() { return x; };
    }

private:
    int m_x = 42;
};

Now instead of capturing the this pointer we copy the value of the class variable to a local variable and we capture it by value.
This works in simple cases where we don’t modify the variable’s value in the lambda and don’t expect the variable to be modified between the time it was captured and the time the lambda is invoked. Oftentimes, however, this is not enough – we want to be able to modify the value of the variable captured in the closure and be able to observe the new value. What do we do in these cases?

Use (smart) pointers
The problems outlined above were caused by automatic variables whose lifetimes are tied to their scope. Once they go out of scope they are destroyed and must no longer be used. However, if we create the variable on the heap we control the lifetime of this variable. Since this is no longer 1998 we will not use raw pointers but smart pointers. Actually, there is an even more important reason not to use raw pointers – since we allocated memory we should release it. In simple cases it may not be difficult. You just delete the variable in the lambda and you are done with it. More often than not it is not that simple – you may want to capture the same pointer in multiple closures, someone else may still want to use it after the lambda was invoked, you may want to invoke the lambda multiple times etc. In these cases the simplistic approach won’t work. Even if originally your code worked correctly it is very easy to inadvertently introduce issues like these later during maintenance. Using smart pointers makes life much easier and safer. So, yeah, we will use smart pointers. For local variables it is as easy as:

std::function<int()> get_func()
{
    auto x = std::make_shared<int>(42);

    return [x]() { return *x; };
}

Notice that we create a shared_ptr variable and capture it by value. When a shared_ptr is passed by value its internal ref count is incremented. This prevents the value the pointer points to from being deleted. When the pointer goes out of scope the ref count is decremented. Once the ref count reaches 0 the pointer deletes the variable it tracks. In our case we create a closure (the ref count is incremented) and return it to the caller. The caller may extend its lifetime (e.g. by returning it to its caller) or can just invoke it and let it go out of scope. One way or another the closure will eventually go out of scope and when this happens the ref count will be decremented and – if no one else is using the shared_ptr – the memory will be released. One thing worth noting is how using shared_ptrs plays nicely with the previous rule about capturing variables by value.
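
A hypothetical usage of the function above makes the lifetime extension visible:

int main()
{
    auto f = get_func(); // the closure holds a strong reference to the int

    // get_func() has returned and its locals are gone but the int lives on
    std::cout << f() << std::endl; // prints 42

    return 0;
} // f goes out of scope here, the ref count drops to 0, the int is released
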
We can also use shared_ptrs to capture the current instance of a class as a substitute for capturing the this pointer. It’s however a bit more involved than it was in the case of local variables. First, we cannot just do shared_ptr(this). Doing this could lead to creating multiple shared_ptr instances for a single object (e.g. when a function creating a shared_ptr this way is called multiple times) which would lead to invoking the destructor of the object tracked by the shared_ptr multiple times – which not only is undefined behavior but can also lead to dangling pointers. (Note that this is not specific to the this pointer but applies to any raw pointer.) The way around it is to derive the class from the enable_shared_from_this<T> type. Unfortunately classes derived from enable_shared_from_this<T> are not very safe and using them incorrectly may put you into the realm of undefined behavior very quickly. The biggest issue is that to be able to create a shared_ptr from the this pointer for a class derived from enable_shared_from_this<T> the instance must already be owned by a shared_ptr. The easiest way to enforce this is probably to not allow the user to create instances of the class directly by making all the constructors (including the ones created by the compiler) private and exposing a static factory method that creates a new instance of the class and returns a shared_ptr owning the newly created instance. This is how code capturing a shared_ptr tracking this would look:

class widget : public std::enable_shared_from_this<widget>
{
public:

    static std::shared_ptr<widget> create()
    {
        return std::shared_ptr<widget>(new widget);
    }

    std::function<int()> get_func()
    {
        auto w = shared_from_this();

        return [w]() { return w->m_x; };
    }

private:

    widget() 
    { }

    widget(const widget&) = delete;

    int m_x = 42;
};

std::function<int()> get_func()
{
    return widget::create()->get_func();
}

int main()
{
    std::cout << get_func()() << std::endl;
    return 0;
}

A word of precaution. One general thing you need to be aware of when using shared_ptrs is circular dependencies. If you create a circular dependency between two (or more) shared_ptrs their ref counts will never reach 0 because each shared_ptr is always being referenced by another shared_ptr. As a result the instances the shared_ptrs point to will never be released, causing a memory leak. If you cannot avoid a dependency like this you can replace one of the shared_ptrs with a weak_ptr which will break the cycle. weak_ptrs, similarly to shared_ptrs, can be passed by value so they are easy to use with lambda functions. If you have a weak_ptr and you need a shared_ptr you just call the weak_ptr::lock() function. Note that you need to check the value returned by the weak_ptr::lock() function since it will return an empty shared_ptr if the instance managed by the shared_ptr was deleted because all the strong references were removed in the meantime.
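
Here is a minimal sketch (with illustrative names; includes left out for brevity) of capturing a weak_ptr and promoting it with weak_ptr::lock() before use:

auto data = std::make_shared<int>(42);
std::weak_ptr<int> weak = data; // does not keep the int alive

auto l = [weak]() // weak_ptr captured by value - cheap and safe
{
    if (auto strong = weak.lock()) // empty shared_ptr if the int is gone
    {
        std::cout << *strong << std::endl;
    }
    else
    {
        std::cout << "the value is gone" << std::endl;
    }
};

l();          // prints 42
data.reset(); // drop the last strong reference
l();          // prints "the value is gone"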

This post was quite long and heavy but knowing this stuff will be very helpful later when we dive in the actual async programming which I think we will start looking into in the next part.

C++ Async Development (not only for) for C# Developers Part I: Lambda Functions

And now for something completely different…

I was recently tasked with implementing the SignalR C++ client. The beginnings were (are?) harder than I had imagined. I did some C++ development in the late nineties and early 2000s (yes, I am that old) but either I have forgotten almost everything since then or I had never really understood some of the C++ concepts (not sure which is worse). In any case the C++ landscape has changed dramatically in the last 15 or so years. Writing tests is a common practice, the standard library became really “standard” and C++ 11 is now available. There is also a great variety of third party libraries making the life of a C++ developer much easier. One such library is cpprestsdk codename Casablanca. Casablanca is a native, cross platform and open source library for asynchronous client-server communication. It contains a complete HTTP stack (client/server) with support for WebSockets and OAuth, support for JSON, and provides APIs for scheduling and running tasks asynchronously – similar to what is offered by C# async. As such cpprestsdk is a perfect fit for a native SignalR client where the client communicates over HTTP using JSON and takes full advantage of asynchrony.
I am currently a couple of months into my quest and I would like to share the knowledge about C++ asynchronous development with the cpprestsdk I have gained so far. My background is mostly C# and it took me some time to wrap my head around native async programming even though C# async programming is somewhat similar to C++ async programming (with cpprestsdk). Two things I wish I had had when I started were to know how to translate patterns used in C# async programming to C++ and what the most common pitfalls are. In this blog series I would like to close this gap. Even though some of the posts will talk about C# quite a lot I hope the posts will be useful not only for C# programmers but also for anyone interested in async C++ programming with cpprestsdk.

Preamble done – let’s get started.

Cpprestsdk relies heavily on C++ 11 – especially on the newly introduced lambda functions. Therefore before doing any actual asynchronous programming we need to take a look at and understand C++ lambda functions, especially since they are a bit different (and in some ways more powerful) than .NET lambda expressions.
The simplest C++ lambda function looks as follows:

auto l = [](){};

It does not take any parameters, does not return any value and does not have a body, so it doesn’t do anything. A C# equivalent would look like this:

Action l = () => {};

The different kinds of brackets one next to the other in the C++ lambda function may seem a bit peculiar (I saw some pretty heated comments from long time C++ users about how they feel about this syntax) but they do make sense. Let’s take a look at what they are and how they are used.

[]
Square brackets are used to define what variables should be captured in the closure and how they should be captured. You can capture zero (as in the example above) or more variables. If you want to capture more than one variable you provide a comma separated list of variables. Note that each variable can be specified only once. In general you can capture any variable you can access in the scope and – except for the this pointer – they can be captured either by reference or by value. When a variable is captured by reference you access the actual variable from the lambda function body. If a variable is captured by value the compiler creates a copy of the variable for you and this is what you access in your lambda. The this pointer is always captured by value. To capture a variable by value you just provide the variable name inside the square brackets. To capture a variable by reference you prepend the variable name with &. You can also capture variables implicitly – i.e. without enumerating the variables you want to capture – letting the compiler figure out what variables to capture by inspecting what variables are being used in the lambda function. You still need to tell it how you want to capture the variables though. To capture variables implicitly by value you use =. To capture variables implicitly by reference you use &. What you specify will be the default way of capturing variables which you can refine if you need to capture a particular variable differently. You do that just by adding the variable to the capture list. One small wrinkle when specifying variables explicitly in the capture list is that if a variable is not being used in the lambda you will not get any notification or warning. I assume that the compiler just gets rid of this variable (especially in optimized builds) but it would still be nice to have an indication when this happens. Here are some examples of capture lists with short explanations:

[] // no variables captured
[x, &y] // x captured by value, y captured by reference
[=] // variables in scope captured implicitly by value
[&] // variables in scope captured implicitly by reference
[=, &x] // variables in scope captured implicitly by value except for x which is captured by reference
[&, x] // variables in scope captured implicitly by reference except for x which is captured by value

[=, &] // wrong – variables can be captured either by value or by reference but not both
[=, x] // wrong – x is already implicitly captured by value
[&, &x] // wrong – x is already implicitly captured by reference
[x, &x] // wrong – x cannot be captured by reference and by value at the same time

How does this compare to C# lambda expressions? In C# all variables are implicitly captured by reference (so it is the equivalent of [&] used in C++ lambda functions) and you have no way of changing that. You can work around it by assigning the variable you want to capture to a local variable and capturing the local variable. As long as you don’t modify the local variable (e.g. you could create an artificial scope just for declaring the variable and defining the lambda expression – since the variable would not be visible out of the scope it could not be modified) you would get something similar to capturing by value. This was actually a way to work around a usability issue where sometimes if you closed over a loop variable you would get unexpected results (if you are using Re# you have probably seen the warning reading “Access to foreach variable in closure. May have different behaviour when compiled with different versions of compiler.” – this is it). You can read more about this here (btw. the behavior was changed in C# 5.0 and the results are now what most developers actually expect).

()
Parentheses are used to define the formal parameter list. There is nothing particularly fancy here – you need to follow the same rules as when you define parameters for regular functions.

Return type
The simplest possible lambda I showed above did not have this but sometimes you may need to declare the return type of the lambda function. In simple cases you don’t have to and the compiler should be able to deduce the return type for you based on what your function returns. However, in more complicated cases – e.g. your lambda function has multiple return statements – you will need to specify the return type explicitly. A lambda with an explicitly defined return type of int looks like this:

auto l = []() -> int { return 42; };

mutable
Again, something that was not needed for the “simplest possible C++ lambda” example but I think you will encounter it relatively quickly once you start capturing variables by value. By default all C++ lambda functions are const. This means that you can access variables captured by value but you cannot modify them. You also cannot call non-const functions on variables captured by value. Prepending the lambda body with the mutable keyword makes the above possible. You need to be careful though because this can get tricky. For instance what do you think the following function prints?

void lambda_by_value()
{
    auto x = 42;
    auto l = [x]() mutable
    {
        x++;
        std::cout << x << std::endl;
    };

    std::cout << x << std::endl;
    l();
    l();
    std::cout << x << std::endl;
}

If you guessed:

42
43
44
42

– congratulations – you were right and you can skip the next paragraph; otherwise read on.
Imagine your lambda was compiled to the following class (disclaimer: this is my mental model and things probably work a little bit differently and are far more complicated than this but I believe that this is what actually happens at the high level):

class lambda
{
public:
    lambda(int x) : x(x)
    {}

    void operator()()
    {
        x++;
        std::cout << x << std::endl;
    }

private:
    int x;
};

Since we capture x by value the ctor creates a copy of the passed parameter and stores it in the class variable called x. Overloading the function call operator (i.e. operator()) makes the object callable, making it similar to a function, but the object can still maintain state – which is perfect in our case because we need to be able to access the x variable. Note that the body of the operator() function is the same as the body of the lambda function above. Now let’s create a counterpart of the lambda_by_value() function:

void lambda_by_value_counterpart()
{
    auto x = 42;
    auto l = lambda(x);
    std::cout << x << std::endl;
    l();
    l();
    std::cout << x << std::endl;
}

As you can see the lambda_by_value_counterpart() function not only looks almost the same as the original lambda_by_value() (except that instead of defining the lambda inline we create the lambda class) but it also prints the same result. This shows that when you define a lambda the compiler creates a structure for you that maintains the state and is used across lambda calls. Now also the mutable keyword should make more sense. If, in our lambda class, the operator() function was actually const we would not have been able to modify the class variable x. So the function must not be const or the class variable x must be defined as (surprise!)… mutable int x;. (In the example I decided to make the function operator() non-const since it feels more correct).

auto vs. var
Since C++ lambda definitions contain full type specifications for parameters and the return type you can use auto and leave it to the compiler to deduce the type of the lambda variable. In fact I don’t even know how to write the type of the lambda variable explicitly (when I hover over the auto in Visual Studio it shows something like class lambda []void () mutable->void but this cannot be used as the type of the variable). In C# you cannot assign a lambda expression to var for several reasons. You can read up on why here.

That’s more or less what I have learned about C++ lambda functions in the past few weeks – apart from… the common pitfalls and mistakes, which I am planning to write about soon.