The Compilation Process - itzjac/cpplearning GitHub Wiki

Smallest program

The flow chart below describes the build process for the simplest and smallest C++/C program you can write.

Compilation

  1. We start the compilation process by creating a source file.

A C++ source file is a text file, commonly saved with one of the extensions cpp, cxx, cc, cp or C. We are going to create a main.cpp that contains the following code.

#include <iostream>

int main()
{
    return 0;
}
  2. Open a command-line terminal in your OS and compile the source file main.cpp.

The g++ command invokes the C++ compiler from GCC (the GNU Compiler Collection); cl is the Microsoft C/C++ compiler. A single invocation executes all the steps described in the flow chart sequentially: reading the source file, preprocessing, compiling to an object file, linking and generating the executable. By default g++ names the executable a.out, while cl names it main.exe.


LINUX
$g++ main.cpp
WINDOWS
>cl main.cpp

Dissecting compilation

The compilation process can be broken down with either cl or g++, showing the result of each compilation stage step by step.


  1. Preprocessor step (main.cpp -> preprocessmain.cpp).

This step is divided into three subtasks.

Task 1

Comment removal: comments are stripped out of the code.

Task 2

Macro expansion: every macro found in the source file is replaced with its definition.

Task 3

Header file inclusion: header files (commonly with the .h or .hpp extension, or with no extension for standard headers such as iostream) are inserted into cpp files, which in turn are passed to the compiler. Whenever an #include directive is encountered, the preprocessor inserts the contents of the header file into the cpp file. This is similar to macro expansion, but here an entire file is inserted into the code.


LINUX
$g++ -E main.cpp > preprocessmain.cpp
WINDOWS
>cl /P /Fipreprocessmain.cpp main.cpp

At this stage we are still producing a text file, but one that is ready to be processed by the compiler.
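As a hedged illustration of the three preprocessor tasks, consider the small file below. The file name preprocess_demo.cpp and the names SQUARE and square_of are made up for this sketch and are not part of main.cpp. Running it through g++ -E should produce text in which the comments are gone, the contents of the cstddef header are pasted in, and SQUARE(side) reads ((side) * (side)).

```cpp
// preprocess_demo.cpp (hypothetical file name, for illustration only)
#include <cstddef>   // header inclusion: the text of the header is pasted in here

// Macro expansion: every use of SQUARE(x) is replaced by its body
// before the compiler proper ever sees the code.
#define SQUARE(x) ((x) * (x))

// Comment removal: this comment does not appear in the g++ -E output.
std::size_t square_of(std::size_t side)
{
    return SQUARE(side);   // after preprocessing: return ((side) * (side));
}
```

The file has no main function on purpose; it is only meant to be inspected after preprocessing.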


  2. Compiler (preprocessmain.cpp -> main.obj)

In this step the compiler translates the preprocessed source into machine code, the binary instructions that the CPU executes. If the compiler is able to interpret the contents of the cpp file without any errors, an output file is produced. The output takes the form of an object file (.obj with cl, .o with g++). This file contains compiled code in a form that is very close to that of an executable, but it lacks fundamental mechanisms and formatting required for execution. With the -c option (/c for cl) the compiler stops after this step and writes the object file containing the compiled code. By default the object file is named after the input file; an explicit name such as main.obj can be given with -o (g++) or /Fo (cl).


LINUX
$g++ -c preprocessmain.cpp -o main.obj
WINDOWS
>cl /c /Fomain.obj preprocessmain.cpp

It is also possible to analyze the contents of the object file with a disassembler such as objdump (dumpbin on Windows), which translates the binary code of the obj into assembler mnemonics.


LINUX
$objdump -D main.obj > main.dump
WINDOWS
>dumpbin /DISASM main.obj

main.dump is a text file containing the assembly listing and can be opened with any text editor. On Windows, dumpbin prints its output to the console by default:

Microsoft (R) COFF/PE Dumper Version 14.29.30136.0
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file cpplearning.obj

File Type: COFF OBJECT

main:
  0000000000000000: 33 C0              xor         eax,eax
  0000000000000002: C3                 ret

  Summary

          28 .chks64
          98 .debug$S
          F4 .drectve
           1 .rdata
           3 .text$mn

This step doesn't produce any executable yet.
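A minimal sketch of why the compile step alone is not enough: the file below (answer.cpp and answer are hypothetical names) has no main function. Compiling it with g++ -c answer.cpp is expected to succeed and produce answer.o, while g++ answer.cpp without -c should stop at the link step with an error about an undefined reference to main.

```cpp
// answer.cpp (hypothetical file): a valid translation unit without main().
// It can be compiled to an object file, but it cannot be linked
// into an executable on its own.
int answer()
{
    return 42;
}
```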


  3. Linker (main.obj -> main.exe)

The last step in this process is to produce a fully functional executable from the obj file (a project will generally have dozens or hundreds of obj files). The linker tries to combine them into a single executable. If any problems are encountered, the linker generates error messages. These messages are similar to those produced by the compiler, although usually more convoluted, and linker errors tend to be less common than compilation errors. The linker is capable of producing different kinds of output: lib, dll or exe. For now, we will focus only on generating exe files.


LINUX
$g++ -o main.out main.obj
WINDOWS
>LINK main.obj /OUT:main.exe
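The linker's main job, connecting a call in one object file to a definition in another, can be sketched as follows. For brevity both parts are shown in a single file; in a real project each part would be its own .cpp file compiled to its own object file. All names here (twice, four) are illustrative and not part of the original example.

```cpp
// --- Part A: would live in its own .cpp file ---
// Declaration only: it tells the compiler that twice() exists somewhere.
int twice(int value);

// The compiler emits a call to the still unresolved symbol twice().
int four()
{
    return twice(2);
}

// --- Part B: would live in another .cpp file ---
// The definition. When the parts are separate object files,
// the linker is what connects the call in part A to this code.
int twice(int value)
{
    return value * 2;
}
```

If part B were missing at link time, the build would be expected to fail with an unresolved (undefined) reference to twice, even though part A compiles without errors.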

Switching Platform targets - 64 bit platform

Quite often you will need to compile for different platform targets and hardware. This requirement depends on your target audience or performance goals, but even more on how you distribute your source code. In an open source project, anyone who gets your code can compile it for whatever platform they want to run the program on, as long as a C++ compiler exists for that platform.

NOTE: the g++-multilib package is needed for the next steps; be sure to install it before continuing.

If your machine is capable of compiling for multiple platform targets, check your system and compiler documentation to find out which target is set up by default.

With g++, create a copy of main.cpp named mainx64.cpp and set the target platform to x64 with the compiler option -m64.

cl works differently: the compiler options do not change; what changes is how the VC environment is loaded. You need to switch the VC environment to x64 or x86 every time you change the cl platform target.


LINUX
$g++ -m64 mainx64.cpp
WINDOWS
>call "%VCINSTALLDIR%"\Auxiliary\Build\vcvarsall.bat x64
>cl main.cpp

The exe created by this step is an x64 binary and cannot be executed on x86 machines. Be aware that to execute a binary compiled for a given platform, the PC you are working with must support that platform. It is also possible to compile for platforms that you cannot run or test on the build machine. This is known as cross-compilation, and the set of tools that makes it possible is a cross-compilation **toolchain** in dev pipelines. For example, UE4/5 provides a Windows toolchain that compiles Linux servers from Windows. If your machine has a 32-bit architecture (x86), it cannot execute an x64 application. With an x64 architecture, in contrast, you can compile for and execute both x64 and x86 targets. The reason for this compatibility will be explained once pointers and memory have been introduced.

NOTE: on UNIX-like operating systems, use the file command to verify which platform target a binary was compiled for.

file cpplearningx64.o
cpplearningx64.o: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=3fc2a5a517f1fc2c50ec0d98bc0f6d9b582e6cd7, for GNU/Linux 3.2.0, not stripped

file cpplearningx32.o 
cpplearningx32.o: ELF 32-bit LSB pie executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, BuildID[sha1]=ef3e858f4a4dd40b6b7440cefd6cc4340850403b, for GNU/Linux 3.2.0, not stripped
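The target can also be checked from inside the program. The sketch below assumes that pointer width follows the platform target, which holds on common desktop platforms: a build with -m64 typically reports 64, and a 32-bit build (for example with the g++ option -m32) typically reports 32. pointer_bits is a hypothetical helper name chosen for this example.

```cpp
#include <climits>   // CHAR_BIT: number of bits in a byte

// Width of a data pointer, in bits, for the target this file was compiled for.
constexpr int pointer_bits()
{
    return static_cast<int>(sizeof(void*) * CHAR_BIT);
}
```

Printing the result of pointer_bits() from main, for example with std::cout, shows the value for the current build.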

HINT: on Linux, the installed g++ might only target the platform of the OS. To enable g++ to compile for multiple compatible platforms, install the multilib package.

$sudo apt install g++-multilib