CMake Fundamentals Part 1

by Jeremi | Jan 4, 2021 | CMake | 3 comments

Why CMake

Over the past twenty years, CMake has managed to solidify its position as the de facto standard build system (generator) for C++. As is often the case, the leading technology is criticized for its shortcomings and even despised by some. I suspect that for those reasons some new competitors have been cropping up lately (we may look at some in the future), but it doesn’t seem like CMake is going anywhere anytime soon. It has been adopted on all major platforms and by many frameworks, and essentially all C++ developers are now expected to have at least a rudimentary knowledge of CMake.

In my experience it has always been critical to understand the reasoning behind what is being done: what are we actually trying to achieve? Without a logical explanation we’d be trying to memorize an arbitrary set of rules, and this is often all that the CMake examples available on the internet provide. In this series of posts I’ll go over some of the fundamentals of CMake, while also explaining the decision-making and looking at some of the implementation details that help in understanding the why. So let us do just that: review the basics of CMake and take a look at what happens underneath while we’re at it.

Hello CMake

Let’s start out with the simplest possible C++ program, that’s right, a “Hello World”. We will use it to show how to build executables.

#include <iostream>

int main()
{
    std::cout << "Hello, World!\n";
}

Before we move on to CMake, let us remind ourselves how this trivial program would be built using just our trusty compiler. Assuming that the file is named main.cpp, this could be done as follows:

$ c++ -o hello main.cpp

Simple enough. The -o flag specifies the output file name. This command will give us an executable called hello, which once executed will result in the expected output:

$ ./hello
Hello, World!

Writing CMakeLists

Let’s do the same using CMake.

Every CMake-based project is defined using at least one CMakeLists.txt, a somewhat ugly name, but required nonetheless. We’ll write it first and then explain what’s going on.

cmake_minimum_required(VERSION 3.19)
project(HelloWorld)

add_executable(hello main.cpp)

Three lines of code. This may seem way more complicated than just using the compiler, but things will move in favor of CMake as soon as we start adding more files to the project. Each of these lines is a command. That’s what CMake calls its built-in routines: not functions, not macros, but commands. It’s a minor detail, but worth pointing out, as it may help avoid some confusion when reading the official documentation. Let’s look at the lines one by one.

cmake_minimum_required(VERSION 3.19)

This should be the very first line of every top-level CMakeLists.txt. Without going into too much detail, it defines the minimum version of CMake required to build the project, meaning that we won’t be able to build this project with a CMake version older than the one specified. In real-world projects we would likely be constrained to some older version that’s available in the toolchain; however, if there’s a choice, it’s usually best to use the latest available version.

project(HelloWorld)

As one may expect, this line declares a CMake project named HelloWorld; it is also required. Calling this command defines the globally available variables PROJECT_NAME and CMAKE_PROJECT_NAME, both containing the specified name of the project. We won’t be using those for now.
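As a quick check, the two variables can be printed during the configuration step. This is a minimal sketch rather than something the project needs; the message(STATUS ...) lines are added purely for illustration.

```cmake
cmake_minimum_required(VERSION 3.19)
project(HelloWorld)

# Illustration only: print the variables defined by project().
# In a top-level CMakeLists.txt both hold the name passed to project().
message(STATUS "PROJECT_NAME:       ${PROJECT_NAME}")
message(STATUS "CMAKE_PROJECT_NAME: ${CMAKE_PROJECT_NAME}")

add_executable(hello main.cpp)
```

Running cmake should then report HelloWorld for both. The two only start to differ once subdirectories make their own project() calls: CMAKE_PROJECT_NAME keeps the top-level name, while PROJECT_NAME follows the most recent project() call.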

add_executable(hello main.cpp)

In broad terms, this command tells CMake to build an executable called hello using the source file main.cpp. More precisely it defines an executable target to be built using the specified source files.

Targets are the most important concept for understanding and using CMake well in real-world projects. Each target represents either an executable, a library, or a custom user-defined set of properties and/or commands. For now, you can think of them as objects encapsulating all the information required to build an executable or library, including dependencies on other targets.
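To make the idea of a target as an object carrying its own information a little more concrete, here is a small sketch. The C++17 requirement and the variable names hello_type and hello_sources are assumptions chosen for illustration; nothing else in this post depends on them.

```cmake
cmake_minimum_required(VERSION 3.19)
project(HelloWorld)

# Defines the target "hello"; the commands below refer to it by name.
add_executable(hello main.cpp)

# A requirement attached to this one target only (illustrative choice of standard).
target_compile_features(hello PRIVATE cxx_std_17)

# Properties stored on the target can be read back at configure time.
get_target_property(hello_type hello TYPE)
get_target_property(hello_sources hello SOURCES)
message(STATUS "hello: type=${hello_type}, sources=${hello_sources}")
```

During configuration this should report hello as an executable built from main.cpp. The point is that the information lives on the target itself rather than in global settings.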

Running CMake

CMake is executed from the command line (there are also GUI and curses-GUI front-ends, but we won’t discuss those). There are multiple steps to building a project with CMake – this is a side effect of CMake being a build-system generator, rather than a standalone build system. The stages are:

configuration step
build step

The configuration step processes the CMakeLists.txt files and potentially other CMake scripts, gathers information about all the declared targets, and generates the build files according to the selected (or detected) generator. On a Linux machine, this is most likely make or ninja. CMake is also capable of generating project files for IDEs like Visual Studio, Eclipse, Xcode, etc. Note that we’re simplifying a little yet again: generation is actually a step of its own, but for today we can pretend that it isn’t.

The build step is when the executables and/or libraries are actually compiled or, more generally, when the actions associated with each target are executed.

Let’s build our Hello World program (finally).

$ mkdir build && cd build
$ cmake ..

This is probably the most canonical way to build a CMake project on a Linux system. First, we create a build directory. This is the directory that will contain the build files and some other CMake-specific files. Always create a dedicated build directory. It can be a subdirectory of the project or be located outside of the project itself. CMake does not strictly forbid building inside the source directory, but doing so mixes generated files with the sources, so treat a dedicated directory as required.
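Some projects enforce the dedicated build directory from within CMakeLists.txt. The following is a sketch of that common idiom, not something this Hello World project needs; the wording of the error message is made up for illustration.

```cmake
cmake_minimum_required(VERSION 3.19)
project(HelloWorld)

# Stop early when the build directory is the source directory itself.
if(CMAKE_SOURCE_DIR STREQUAL CMAKE_BINARY_DIR)
  message(FATAL_ERROR
    "In-source builds are not supported. "
    "Create a separate build directory and run cmake from there.")
endif()

add_executable(hello main.cpp)
```

With this guard, running cmake . in the project root should abort with the error, while the mkdir build && cd build && cmake .. sequence works as before. Note that CMake may already have written a few files (such as CMakeCache.txt) into the source directory by the time the check fires, so they may need to be removed by hand.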

Then we execute CMake and point it to the directory where the main CMakeLists.txt of the project is located. A lot happens in this step: CMake detects the available compilers, checks that they work correctly, and detects and configures defaults for a multitude of variables. Finally, and most importantly, it generates the build system files that can be used to build the project.

We do exactly that by invoking make directly, which builds our executable.

$ make
$ ./hello
Hello, World!

We have reached one of our goals. However, we’ve set out to investigate how exactly CMake builds the executable. The information we’re after can be extracted both at the configuration step and at the build step. Let’s look at both alternatives.

Compile commands

CMake can generate a compile_commands.json file with all of the compilation commands used to build the project. The main use of this file is to configure code editors or various tools that require project-wide context. Today we’ll use it to see exactly how CMake builds our Hello World executable.

Generation of compile_commands.json is enabled by setting the CMake variable CMAKE_EXPORT_COMPILE_COMMANDS to TRUE. We will gloss over many details of CMake variables here and just note that they can be set either in the CMakeLists.txt file or specified on the command line. We will do the latter.

$ cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=TRUE .

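The variable is defined (-D) on the command line, and the trailing . points CMake at the build directory we are already in, which was configured in the previous step. CMake stores the value in its cache (CMakeCache.txt in the build directory), so it will remember the last specified value on subsequent runs.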
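For completeness, here is a sketch of the other option mentioned above, setting the variable in CMakeLists.txt instead of on the command line. It assumes the single-file version of the project shown earlier.

```cmake
cmake_minimum_required(VERSION 3.19)
project(HelloWorld)

# Ask CMake to write compile_commands.json into the build directory.
# Set before any targets are defined so that it applies to all of them.
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

add_executable(hello main.cpp)
```

The trade-off is mostly one of ownership: the command-line form is a per-developer choice stored in the cache, while the CMakeLists.txt form applies to everyone who builds the project. Either way, the file is only produced by the Makefile and Ninja generators.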

The build directory should now contain a compile_commands.json file. Among other information, it will contain a command field, with its value specifying the exact command used to build the executable.

"command": "/usr/bin/c++ -o CMakeFiles/hello.dir/main.cpp.o -c /home/user/hello_cmake/main.cpp",

Whoa, what happened there? That doesn’t look anything like the command we used to compile the executable. That’s because this command does something a little different. The -c flag tells the compiler to generate an object file: a compiled and assembled binary that still needs to be linked. In our trivial example, everything is located in a single file, but it doesn’t have to be. If the executable consisted of more than just one source file, an object file would be produced for each file (or more specifically, each translation unit), and only then would all the object files be linked into an executable. This saves a lot of time: after a change, only the affected translation units have to be recompiled, and an object file can in principle be reused when the same translation unit goes into multiple executables or libraries.

Let’s modify our example to see how that works in practice. We will change main.cpp and introduce two new files, hello_world.h and hello_world.cpp:

main.cpp:

#include "hello_world.h"

int main()
{
    say_hello();
}

hello_world.h:

#pragma once
void say_hello();

hello_world.cpp:

#include "hello_world.h"
#include <iostream>

void say_hello()
{
    std::cout << "Hello, World!\n";
}

CMakeLists.txt:

cmake_minimum_required(VERSION 3.19)
project(HelloWorld)

add_executable(hello main.cpp hello_world.cpp)
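As the list of source files grows, the same target can also be written by declaring it first and attaching the sources afterwards. This is an alternative sketch of the file above, not a change in behavior; which style to use is a matter of preference.

```cmake
cmake_minimum_required(VERSION 3.19)
project(HelloWorld)

# Declare the target first, then attach its source files.
add_executable(hello)
target_sources(hello PRIVATE
  main.cpp
  hello_world.cpp
)
```

Both forms define the same hello target with the same two translation units, so the rest of this walkthrough applies to either.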

After introducing these changes and regenerating the project, compile_commands.json now contains commands for both main.cpp and hello_world.cpp, producing object files for both translation units.

"command": "/usr/bin/c++ -o CMakeFiles/hello.dir/main.cpp.o -c /home/user/hello_cmake/main.cpp",
...
"command": "/usr/bin/c++ -o CMakeFiles/hello.dir/hello_world.cpp.o -c /home/user/hello_cmake/hello_world.cpp",

But still, what about linking? To inspect this step we will need to add another tool to our arsenal.

Verbose makefiles

Linking is done during the build step, which involves the generated build system files. Setting the variable CMAKE_VERBOSE_MAKEFILE makes the build process report detailed information about everything that’s happening:

$ cmake -DCMAKE_VERBOSE_MAKEFILE=TRUE .

Calling make now will produce output for every invoked command, including directory changes, compilation, and linking. This is rather verbose (as requested), so let’s limit the output to compiler calls:

$ make | grep c++

If you’re following along, make sure to replace c++ with the compiler that CMake detected on your platform; this could be g++, clang++, etc.

This gives us the following output:

/usr/bin/c++ -o CMakeFiles/hello.dir/main.cpp.o -c /home/user/hello_cmake/main.cpp
/usr/bin/c++ -o CMakeFiles/hello.dir/hello_world.cpp.o -c /home/user/hello_cmake/hello_world.cpp
/usr/bin/c++ CMakeFiles/hello.dir/main.cpp.o CMakeFiles/hello.dir/hello_world.cpp.o -o hello

Here the object files are linked with the help of the compiler into an executable called hello.

In general, the verbose makefiles will give us much more detailed information than compile_commands.json. This is a very helpful tool, especially when debugging linking issues.
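Like the compile commands setting, verbose output can also be switched on from CMakeLists.txt. The snippet below is a sketch based on the two-file version of the project; it is usually only worth keeping there temporarily, while debugging.

```cmake
cmake_minimum_required(VERSION 3.19)
project(HelloWorld)

# Make the generated Makefiles echo every command they run.
set(CMAKE_VERBOSE_MAKEFILE ON)

add_executable(hello main.cpp hello_world.cpp)
```

For a one-off verbose build that leaves the configuration untouched, make VERBOSE=1 or cmake --build . --verbose (available since CMake 3.14) are alternatives.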

Summary

In this post, we’ve looked at how to use CMake to compile a trivial C++ program. More importantly – we have learned how to use simple techniques to investigate what exactly happens underneath. Armed with this fundamental knowledge we’ll be able to look at more interesting examples in the future. In the next post we will begin discussing how to build libraries using CMake.

What do you think about this style of post? Is the walkthrough format, which explains the reasoning behind what’s being done, helpful? Or would something more structured, closer to a reference or a list of good practices without much explanation, be more valuable?

Continue the review of CMake fundamentals in Part 2.

3 Comments

soma tekumalla on April 25, 2023 at 07:13

Very nice – been breaking my head reading the manual and a few other samples from the web.

Dave Smith on January 3, 2024 at 23:59

Excellent so far, I’m already a CMake user, but I came here for the eventual explanation of GNUInstallDirs coming up in part 7. Thanks, Jeremi!

soma on August 22, 2024 at 22:17

Very helpful introduction to CMake. Continuing to read …