Lambdas under the hood

by | Mar 29, 2021 | C++ | 0 comments

Even though it’s been 10 years since C++11 was released there still seems to be a lot of misunderstanding around lambdas. I often hear them being described as “anonymous functions” or “like std::function, just built into the language”. And these may seem vaguely correct, at least semantically. But at the same time, there’s so much more to lambdas than that, that these statements are equally as wrong as they are correct. Let’s look at some of the basic properties of lambdas that make them a valuable addition to the language, and not just a convenience.

What is a lambda

Let’s just get this out of the way – the name lambda refers to lambda-calculus. But there’s no point in me citing Wikipedia or other sources that already provide plenty of theoretical information. I think it needed to be mentioned, so there. And now I’m going to move on to information relevant to C++.

A lambda in the context of C++ can be seen as syntactic sugar for defining function objects. There’s a lot of important detail missing from this statement, but let’s start here and build on it. First things first, let’s define a simple lambda to provide some context for the discussion.

auto add1 = [](int x) { return x + 1; };

std::cout << add1(41) << "\n";

The above snippet defines an add1 variable as an instance of a lambda that takes a single integer argument, adds 1 to it and returns the result. I’m assuming that you’re already familiar with the syntax, in case you’re not, here’s a quick explanation.

The square brackets declare lambda-captures, which I’m not using in this example, but I will explain later. The parenthesis declare the parameters the lambda accepts – here it takes a single integer argument. The braces define the lambda body. The syntax may get some getting used to if you’re new, but I’m sure you’ll manage to follow along.

Well, there goes that first definition I mentioned in the introductory paragraph – I just assigned a lambda to an object, thus giving it a name, sort of. But let’s get back on track. How would a function object providing the same functionality be defined? And what is a function object in the first place?

A function object is a callable – an instance of a type, struct or class, that defines a function call operator – operator(). This means that an equivalent to the add1 lambda could be defined like so:

struct Add1 {
    int operator()(int x) const
    {
        return x + 1;
    }
};

Add1 add1{};

std::cout << add1(41) << "\n";

This code leads to the same result as the lambda example. Moreover, it likely results in equivalent machine code being generated. Notice that compared to a lambda, here, I had to do two distinct steps – first I had to define a struct (type) that specifies what the function object does, and second, I had to instantiate that type. A lambda does all this in a single expression.

Did you notice anything about the call operator declaration? That’s right – it’s const. When defining member functions we need to explicitly declare that the function does not mutate the object’s state (i.e. make them const). Lambdas on the other hand are const by default – we’re not allowed to mutate their state, unless we specify that we’d like to be able to do so. Right now there’s no state to speak of, so let’s expand the example and see what this is about.

Lambda captures and state

Lambda-captures are declared and defined in the brackets that begin the lambda definition. Going back to the add1 example, a lambda capture could be used to define the value that should be added to the argument:

auto add_captured = [y=1](int x) { return x + y; };

std::cout << add_captured(41) << "\n";

The above code would be functionally equivalent to the original example. Also note that I’m using C++14 capture syntax here – the capture y is initialized with a literal value 1. In this context of course the capture doesn’t make much sense, but imagine it being initialized with an argument passed to some function and then passed along to an std algorithm, like so:

void foo(std::vector<int>& vals, int a)
{
    auto add_captured = [y=a](int x) { return x + y; };
    std::transform(vals.begin(), vals.end(), add_captured);
}

or some other example that calculates the captured value in an arbitrarily complex way, and then uses it. Anyway, back to the example.

A function object definition equivalent to the above lambda would look like so:

struct AddY {
    int operator()(int x) const
    {
        return x + y;
    }

    int y;
};

AddY add_y{1};
std::cout << add_y(41) << "\n";

The output once again would be equivalent. The lambda-capture is simply a member data field of the function object type. Now, what about the call operator’s constness? The obvious consequence is that we’re not allowed to change the value of y. For example, if we wanted to incrementally increase the added value, we’re not allowed to do so:

auto add_incremented = [y=1](int x) {
                            ++y;
                            return x + y;
                        };

std::cout << add_incremented(41) << "\n";

The compiler complains (the example is formatted differently for readability):

../lambdas.cpp: In lambda function:
../lambdas.cpp:7:45: error: increment of read-only variable ‘y’
    7 |     auto add_incremented = [y=1](int x) { ++y; return x + y; };
      |                                             ^

The same is true of course for the function object:

struct AddY {
    int operator()(int x) const
    {
        ++y;
        return x + y;
    }

    int y;
};
../lambdas.cpp: In member function ‘int AddY::operator()(int) const’:
../lambdas.cpp:7:37: error: increment of member ‘AddY::y’ in read-only object
    7 |     int operator()(int x) const { ++y; return x + y; }
      |                                     ^

If changing the object’s state is in fact our intent, we can easily make the compiler accept this code by removing the const in the function-object example.

struct AddY {
    int operator()(int x)   // no const
    {
        ++y;
        return x + y;
    }

    int y;
};

With the lambda, we need to explicitly state that we’d like to constness removed, this is done using the mutable keyword:

auto add_incremented = [y=1](int x) mutable {
                            ++y;
                            return x + y;
                        };
std::cout << add_incremented(10) << "\n";
          << add_incremented(20) << "\n";

After these changes, the compiler happily accepts both examples.
Note that with function-objects it’s easy to simply forget to specify the call operator as const, thus allowing accidental object mutation. Since lambdas are const by default, they protect us against such accidents. The addition of the mutable keyword also brings attention of the reader to the fact that the state is being changed inside the body of this lambda. With regular function objects it might be the case that the call operator isn’t const simply because the author neglected to mark it so. This is likely the case more often then we’d like to admitt.

Generic lambdas

So you now know that a basic lambda is just a function object. As you may or may not know C++14 introduced generic lambdas. That it, the auto keyword may be used to declare lambda parameters, thus making it accept arbitrary types, like so:

auto add_captured = [y=1](auto x) {
                        return x + y;
                    };
std::cout << add_captured(3.1415) << "\n";

With this definition the compiler will now deduce the type of x for us. How is this achieved? At least two contexts where type deduction takes place should come to mind – auto and templates. Both of these in fact use the same mechanism – template type deduction. This situation is no different – a generic lambda is a function object with a templetized function call operator:

struct AddY {
    template <typename T>
    T operator()(T x) const
    {
        return x + y;
    }

    int y;
};

Note that only the function call operator is templetized, not the entire function object – the capture type is known to the compiler at the time the captures are defined so there’s no need to make the function object a template.

Zero-cost abstraction

Herb Sutter defines zero-cost abstraction as “An abstraction that performs as good or better than could be reasonably written by hand”. I think lambdas are a perfect example. You can type all of this out manually, but lambdas achieve the same outcome with much less code, while also providing safer defaults (constness).

Given the information on how lambdas work under the hood it should be clear that they are an extremely lightweight abstraction. You should not be concerned about the object insantiation or call overhead – there is none for small types like this wihout a vtable. In fact as soon as the optimizer is enabled, even with the lowest settings, code like the one shown in examples is perfectly inlined and goes away entirely. That’s where the comparison of lambdas to std::function fails. Sure, the functionality is similar, however the cost of a lambda can be orders of magnitude smaller than using a std::function. For small function objects a std::function incurs a cost of at least one pointer dereference and a virtual function call. For large objects there’s an additional cost of a dynamic allocation on construction, and this depends on the standard library implementation – some may always allocate. A lambda will never implicitly allocate and most of the time (always?) is perfectly inlined. Note that the uses are not the same – you don’t get type erasure with a lambda, so they’re only a subset of the functionality std::function provides.

Once you’re accustomed to lambdas and additionally strive for const-correctness you may find yourself writing code akin to the following:

void foo(const std::vector<int>& xs)
{
    const Bar bar = [&xs](){
                        if (xs.empty())
                        {
                            return barFactory();
                        }
                        else
                        {
                            return Bar(xs.cbegin(), xs.cend());
                        }
                    }();    // Hey! Look here!
    
    // ...
}

The above uses an Immediately Invoked Function Expression (IIFE) to conditionally initialize a variable that’s used, but unchanged throughout the rest of the code. It’s not new and is now considered an idiom. That’s only one of many lambda uses that may not be obvious at first. I’d encourage you to look for lambda-use opportunities all the time. Careful though! Once you catch on it’s quite easy to abuse them. Food for thought – could you replace most (all?) non-member function definitions with lambdas?

Summary

C++11 begun a new era in C++ – it introduced a multitude of excellent features and lambdas are arguably one of the most important ones. What’s your relationship with C++ lambdas? Do you shy away from them? Do you limit their use only to standard algorithm predicates? Or are you already one of the lambda aficionados that gets excited whenever an opportunity for an unorthodox yet reasonable lambda use presents itself?

 

References

0 Comments

Submit a Comment

Your email address will not be published. Required fields are marked *

Share This