CPP ‐ Preprocessor - rFronteddu/general_wiki GitHub Wiki

Prior to compilation, each code (.cpp) file goes through a preprocessing phase. In this phase, a program called the preprocessor makes various changes to the text of the code file. The preprocessor does not actually modify the original code files in any way -- rather, all changes made by the preprocessor happen either temporarily in-memory or using temporary files. Historically, the preprocessor was a separate program from the compiler, but in modern compilers, the preprocessor may be built right into the compiler itself.

Most of what the preprocessor does is fairly uninteresting (e.g. it strips out comments and ensures each code file ends in a newline). However, it does have one very important role: it processes #include directives.

When the preprocessor has finished processing a code file, the result is called a translation unit. This translation unit is what is then compiled by the compiler.

The entire process of preprocessing, compiling, and linking is called translation.

When the preprocessor runs, it scans through the code file (from top to bottom), looking for preprocessor directives. Preprocessor directives (often just called directives) are instructions that start with a # symbol and end with a newline (NOT a semicolon). Note that the preprocessor does not understand C++ syntax; the directives have their own syntax instead. The final output of the preprocessor contains no directives: only the output of the processed directives is passed to the compiler. Despite the similar name, using directives (such as using namespace std;) are not preprocessor directives and are not processed by the preprocessor.

#include

When you #include a file, the preprocessor replaces the #include directive with the contents of the included file. The included contents are then preprocessed (which may result in additional #includes being preprocessed recursively), then the rest of the file is preprocessed.

Each translation unit typically consists of a single code (.cpp) file and all header files it #includes (applied recursively, since header files can #include other header files).

Macro defines

The #define directive can be used to create a macro. In C++, a macro is a rule that defines how input text is converted into replacement output text.

There are two basic types of macros:

  • function-like macros: act like functions, and serve a similar purpose. Their use is generally considered unsafe, and almost anything they can do can be done by a normal function.
  • object-like macros: can be defined in one of two ways. The top definition below has no substitution text, whereas the bottom one does. Because these are preprocessor directives (not statements), neither form ends with a semicolon:
#define IDENTIFIER
#define IDENTIFIER substitution_text

Macro identifiers follow the same naming rules as normal identifiers: they can use letters, numbers, and underscores, cannot start with a number, and should not start with an underscore. By convention, macro names are typically all uppercase, with words separated by underscores.

When the preprocessor encounters this directive, an association is made between the macro identifier and substitution_text. All further occurrences of the macro identifier (outside of use in other preprocessor commands) are replaced by the substitution_text.
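
As a minimal sketch of this substitution, the snippet below defines an object-like macro with substitution text and uses it in a function. The names MY_NAME and introduce() are hypothetical, chosen only for illustration.

```cpp
#include <string>

// MY_NAME is a hypothetical macro name used only for this illustration.
#define MY_NAME "Alex"

// Before compilation, the preprocessor rewrites the return statement to:
//     return std::string{"My name is: "} + "Alex";
// The compiler never sees the identifier MY_NAME.
std::string introduce()
{
    return std::string{"My name is: "} + MY_NAME;
}
```

Calling introduce() from main() would yield the string "My name is: Alex".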

Object-like macros with substitution text were used (in C) as a way to assign names to literals. This is no longer necessary, as better methods are available in C++. Avoid macros with substitution text unless no viable alternatives exist.

Object-like macros can also be defined without substitution text. Macros of this form work like you might expect: most further occurrences of the identifier are removed and replaced by nothing! Unlike object-like macros with substitution text, macros of this form are generally considered acceptable to use.
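
To make "replaced by nothing" concrete, here is a small sketch. The macro name EMPTY_MARKER and the function answer() are made up for this example.

```cpp
// EMPTY_MARKER is a hypothetical macro defined without substitution text.
#define EMPTY_MARKER

// After preprocessing, the statement below reads `return 42;`
// because every use of EMPTY_MARKER in ordinary code is replaced by nothing.
int answer()
{
    return EMPTY_MARKER 42;
}
```

In practice, macros of this form are rarely written into ordinary code like this. They are mostly used as flags that conditional compilation directives can test.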

Conditional Compilation

The conditional compilation preprocessor directives allow you to specify under what conditions something will or won’t compile. There are quite a few different conditional compilation directives, but we’ll only cover a few that are used the most often:

  • #ifdef: allows the preprocessor to check whether an identifier has been previously defined via #define. If so, the code between the #ifdef and the matching #endif is compiled. If not, the code is ignored.
  • #ifndef: the opposite of #ifdef, in that it allows you to check whether an identifier has NOT been #defined yet.
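
The two directives above can be sketched as follows. PRINT_JOE, PRINT_BOB, and whoIsPrinted() are hypothetical names used only for illustration.

```cpp
#include <string>

#define PRINT_JOE   // hypothetical flag macro, defined without substitution text

std::string whoIsPrinted()
{
    std::string out;

#ifdef PRINT_JOE
    out += "Joe ";      // compiled: PRINT_JOE was defined above
#endif

#ifdef PRINT_BOB
    out += "Bob ";      // ignored: PRINT_BOB was never defined
#endif

#ifndef PRINT_BOB
    out += "NotBob";    // compiled: PRINT_BOB has NOT been defined
#endif

    return out;
}
```

With only PRINT_JOE defined, whoIsPrinted() returns "Joe NotBob". The line that appends "Bob " never reaches the compiler.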

One more common use of conditional compilation involves using #if 0 to exclude a block of code from being compiled (as if it were inside a comment block).
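
A short sketch of this technique is shown below. The function name visibleParts() is an assumption made for the example.

```cpp
#include <string>

std::string visibleParts()
{
    std::string out = "before ";

#if 0   // everything up to the matching #endif is excluded from compilation
    out += "excluded ";
    /* a block comment inside the region is fine, since the whole region is skipped */
    out += "also excluded ";
#endif

    out += "after";
    return out;
}
```

Here visibleParts() returns "before after". Changing the #if 0 to #if 1 would bring the excluded lines back into the compiled code.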

Macro substitution within other preprocessor commands

In most cases, macro substitution does not occur when a macro identifier is used within another preprocessor command. There is at least one exception to this rule: most forms of #if and #elif do macro substitution within the preprocessor command.
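
The difference can be sketched with two hypothetical macros, FOO and LEVEL, and an illustrative function check():

```cpp
#define FOO        // object-like macro with no substitution text
#define LEVEL 2    // object-like macro with substitution text

int check()
{
    int result = 0;

#ifdef FOO         // FOO is NOT replaced by nothing here; this only asks "is FOO defined?"
    result += 1;
#endif

#if LEVEL > 1      // LEVEL IS substituted here, so the condition becomes 2 > 1
    result += 10;
#endif

    return result;
}
```

Both branches are compiled, so check() returns 11. If FOO were substituted inside the #ifdef, that directive would have no identifier left to test.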

Directives are resolved before compilation, from top to bottom on a file-by-file basis. The preprocessor doesn’t understand C++ concepts like functions, so a #define placed inside a function behaves exactly as if it had been written at that same point outside the function: it takes effect from that line onward, whether or not the function is ever called. To avoid confusion, you’ll generally want to #define identifiers outside of functions.
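
The following sketch illustrates that a #define ignores function boundaries. MY_FLAG and the three function names are hypothetical.

```cpp
int before()
{
#ifdef MY_FLAG
    return 1;
#else
    return 0;   // compiled: MY_FLAG has not been defined at this point in the file
#endif
}

void definesFlag()
{
#define MY_FLAG   // not scoped to this function; active from this line onward
}

int after()
{
#ifdef MY_FLAG
    return 1;   // compiled: MY_FLAG was defined on an earlier line
#else
    return 0;
#endif
}
```

Here before() returns 0 and after() returns 1, even though definesFlag() is never called. Only the position of the #define in the file matters.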

Because an #include directive replaces the #include directive with the content of the included file, an #include can copy directives from the included file into the current file. These directives will then be processed in order.

Once the preprocessor has finished, all defined identifiers from that file are discarded. This means that directives are only valid from the point of definition to the end of the file in which they are defined. Directives defined in one file do not have any impact on other files (unless they are #included into another file).