Advanced C and CPP Compiling Notes - yszheda/wiki GitHub Wiki
- crt0 is the “plain vanilla” entry point, the first part of program code that gets executed under the control of the kernel.
- crt1 is the more modern startup routine with support for tasks to be completed before the main function gets executed and after the program terminates.
Regardless of which particular exec-type function is chosen, each of them ultimately makes a call to the `sys_execve` function, which starts the actual job of executing the program.
The immediate next step (which happens in the function `search_binary_handler` in `fs/exec.c`) is to identify the executable format. In addition to supporting the most recent ELF binary executable format, Linux provides backwards compatibility by supporting several other binary formats. If the ELF format is identified, the focus of action moves into the `load_elf_binary` function (file `fs/binfmt_elf.c`).
After loading the program (i.e. preparing the program blueprint and copying the necessary sections to memory for execution), the loader looks at the value of the `e_entry` field in the ELF header. This value contains the program memory address from which execution will start. Disassembling the executable binary file typically shows that the `e_entry` value is the first address of the code (`.text`) section. This program memory address typically denotes the origin of the `_start` function.
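As a minimal sketch of how the `e_entry` field can be inspected programmatically, the following C function reads the ELF header of a file and returns its entry address. The function name `read_elf_entry` is hypothetical, and the sketch assumes a Linux system with `<elf.h>` and a 64-bit ELF file.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads the ELF header of `path` and stores the
 * e_entry value in *entry. Returns 0 on success, -1 on failure.
 * Assumption: only 64-bit ELF files are handled. */
int read_elf_entry(const char *path, unsigned long long *entry)
{
    Elf64_Ehdr header;
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    size_t n = fread(&header, 1, sizeof header, fp);
    fclose(fp);

    if (n != sizeof header)
        return -1;                      /* file too short */
    if (memcmp(header.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                      /* not an ELF file */
    if (header.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;                      /* this sketch is 64-bit only */

    *entry = header.e_entry;
    return 0;
}
```

Calling `read_elf_entry("/proc/self/exe", &entry)` should yield the same value that `readelf -h` reports as the entry point address. For a position-independent executable the value is relative to the load base rather than a final runtime address.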
- Starts up the program’s threading.
- Calls the `_init()` function, which performs initializations required to be completed before the `main()` function starts. Through the `__attribute__ ((constructor))` keyword, the GCC compiler supports custom routines that you want completed before your program starts.
- Registers the `_fini()` and `_rtld_fini()` functions to be called to clean up after the program terminates. Typically, the action of `_fini()` is the inverse of the actions of the `_init()` function. Through the `__attribute__ ((destructor))` keyword, the GCC compiler supports custom routines that you want run after your program terminates.
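The two GCC attributes mentioned in the list can be illustrated with a minimal sketch. All names here (`before_main`, `after_main`, `startup_hook_ran`, `initialized`) are hypothetical and exist only to make the ordering observable.

```c
#include <stdio.h>

/* Hypothetical flag, set by the constructor so callers can observe it. */
static int initialized = 0;

/* Runs before main() starts. */
__attribute__((constructor))
static void before_main(void)
{
    initialized = 1;
    puts("constructor: runs before main()");
}

/* Runs after main() returns or exit() is called. */
__attribute__((destructor))
static void after_main(void)
{
    puts("destructor: runs after main() finishes");
}

/* Returns 1 if the constructor has already run. */
int startup_hook_ran(void)
{
    return initialized;
}
```

When this file is linked into a program, `startup_hook_ran()` should already return 1 at the first statement of `main()`, and the destructor message should appear only after `main()` has returned.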
Linking static libraries in Linux adheres to the following set of rules:
- Linking static libraries happens sequentially, one static library by one.
- The linker processes its inputs in the order they are passed to it (from the command line or through the makefile) and searches each static library only for symbols that are still undefined at that point, so a static library must be listed after the object files and libraries that depend on it.
- The linker searches each static library in detail, and of all the object files contained in the library it links in only those that contain symbols actually needed by the client binary.
- On 64-bit Linux, linking a static library into an executable does not differ from doing the same thing on 32-bit Linux.
- However, linking a static library into a shared library requires that the static library be built with either the `-fPIC` compiler flag (suggested by the compiler’s error printout) or the `-mcmodel=large` compiler flag. The `-fPIC` flag is not what decides whether a static or a dynamic library is created; the `-shared` linker flag is.
- SOLUTION 1: Provide a custom implementation of `_init()`, a standard method called immediately when the dynamic library is loaded, in which a class static method instantiates the object, thus forcing initialization by construction. Correspondingly, a custom implementation of `_fini()`, a standard method called immediately before the dynamic library is unloaded, may be provided to deallocate the object.
- SOLUTION 2: Replace direct access to such an object with a call to a custom function. That function contains a static instance of the C++ class and returns a reference to it. A variable declared static inside a function is constructed the first time the function is called, which ensures that initialization happens before the first actual access. The GNU compiler as well as the C++11 standard guarantee that this solution is thread safe.
- (affecting the whole body of code) `-fvisibility` compiler flag
- (affecting individual symbols only) `__attribute__ ((visibility("<default | hidden>")))`
- (affecting individual symbols or a group of symbols) `#pragma GCC visibility [push | pop]`
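The per-symbol and per-group mechanisms from the list can be sketched as follows. The function names (`api_add`, `internal_scale`, `helper_one`, `helper_two`) are hypothetical; in a real shared library only `api_add` would be expected to appear in the output of `nm -D`.

```c
/* Exported: visible to clients of the shared library. */
__attribute__((visibility("default")))
int api_add(int a, int b);

/* Hidden: callable from inside the library, but not exported. */
__attribute__((visibility("hidden")))
int internal_scale(int x)
{
    return 2 * x;
}

int api_add(int a, int b)
{
    return internal_scale(a) + internal_scale(b);
}

/* Every declaration between push and pop gets the pushed visibility. */
#pragma GCC visibility push(hidden)
int helper_one(void) { return 1; }
int helper_two(void) { return 2; }
#pragma GCC visibility pop
```

A common design choice is to compile the whole library with `-fvisibility=hidden` and mark only the public API with the `default` attribute, so that new symbols are private unless explicitly exported.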
Passing the `--no-undefined` flag to the `gcc` linker makes the build fail unless each and every symbol is resolved at build time. This way, the Linux default of tolerating unresolved symbols is effectively replaced by the Windows-like strict criteria.
$ gcc -shared -fPIC <source files> -l <libraries> -Wl,--no-undefined -o <shlib output filename>
$ gcc -shared <list of object files> -Wl,-soname,libfoo.so.1 -o libfoo.so.1.0.0
- By setting the `LD_PRELOAD` environment variable.
- Through the `/etc/ld.so.preload` file.
Nowadays, both `rpath` and `runpath` are available, but `runpath` is given higher regard in the runtime search priority list. Only in the absence of its younger sibling `runpath` (`DT_RUNPATH` field) does the `rpath` (`DT_RPATH` field) remain the search path information of the highest priority for the Linux loader. If, however, the `runpath` (`DT_RUNPATH`) field of the ELF binary is non-empty, the `rpath` is ignored.
$ gcc -Wl,-R/home/milan/projects/ -lmilanlibrary
$ export LD_RUN_PATH=/home/milan/projects:$LD_RUN_PATH
Finally, the `rpath` of the binary may be modified after the fact by running the `chrpath` utility program. One notable drawback of `chrpath` is that it can’t modify the `rpath` beyond its already existing string length. More precisely, `chrpath` can modify and delete/empty the `DT_RPATH` field, but cannot insert it or extend it to a longer string.
The way to examine the binary file for the value of the `DT_RPATH` field is to examine the binary’s dynamic section (for example by running `readelf -d` or `objdump -p`).
When the `rpath` (`DT_RPATH`) value is not set, the path supplied this way is used as the highest priority search path information.
# -Wl: prefix required when invoking linker indirectly, through gcc instead of directly invoking ld
# -R: run path linker flag
# --enable-new-dtags: both rpath and runpath set to the same string value
$ gcc -Wl,-R/home/milan/projects/ -Wl,--enable-new-dtags -lmilanlibrary
$ patchelf --set-rpath <one or more paths> <executable>
ldconfig Cache
Please notice that the path `/usr/local/lib` does not belong to this category.
When the `RUNPATH` field is specified (i.e. `DT_RUNPATH` is non-empty):
- `LD_LIBRARY_PATH`
- `runpath` (`DT_RUNPATH` field)
- `ld.so.cache`
- default library paths (`/lib` and `/usr/lib`)
In the absence of `RUNPATH` (i.e. `DT_RUNPATH` is an empty string):
- `RPATH` of the loaded binary, followed by the `RPATH` of the binary that loads it, all the way up to either the executable or the dynamic library that loads all of them
- `LD_LIBRARY_PATH`
- `ld.so.cache`
- default library paths (`/lib` and `/usr/lib`)
The Windows runtime dynamic library search priority schemes:
- Windows Store applications (Windows 8) follow a different set of rules than Windows Desktop applications.
- Whether the DLL of the same name is already loaded in the memory.
- Whether the DLL belongs to the group of known DLLs for the given version of Windows OS.
It is almost a no-brainer that the functions and variables declared `static` (in the C language sense, as in relevant only for the file in which they reside) are out of danger. Indeed, since only the nearby instructions need to access these symbols, all the accesses can be implemented by supplying the relative address offsets. Only the functions and variables whose symbols are exported by the dynamic library are guaranteed to suffer from the negative effects of the address translation: when the linker knows that a certain symbol is exported, it implements all the accesses via absolute addresses, and the address translation then renders such instructions unusable.
In general, the exchange of information between the linker and the loader happens through the specific `.rel.dyn` section inserted by the linker into the body of the binary.
The special twist in the solution is that the symbol addresses are kept in a so-called global offset table (GOT), for which the linker reserves a dedicated `.got` section. The distance between the `.text` section and the `.got` section is constant and known at link time.
// TODO
// TODO
The ultimate solution to the problem would be to host the singleton class in a dynamic library.
- A substantial change in provided runtime functionality, such as the complete elimination of a previously supported feature, substantial change of requirements for a feature to be supported, etc.
- Inability of the client binary to link against the dynamic library due to a changed ABI, such as removed functions or whole interfaces, changed exported function signatures, reordered structure or class layout, etc.
- Completely changed paradigms in maintaining the running process or changes requiring the major infrastructure changes.
The code changes that qualify for the increment of dynamic library minor version numbers typically do not impose recompiling/relinking of the client binaries, nor cause substantially changed runtime behavior.
Code changes that are mostly of internal scope, which neither cause any change in the ABI interface nor bring a substantial functionality change typically qualify for the “patch” status.
In this scheme, text files known as version scripts, which feature a fairly simple syntax, are passed to the linker during the linking stage. The linker records their contents in the ELF sections (`.gnu.version` and similar ones) specialized in carrying the symbol versioning information.
`simpleVersionScript`:
LIBSIMPLE_1.0 {
    global:
        first_function; second_function;
    local:
        *;
};
$ gcc -shared simple.o -Wl,--version-script,simpleVersionScript -o libsimple.so.1.0.0
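To make the effect of the version script concrete, here is a hypothetical `simple.c` that could be compiled into `simple.o`. The function names `first_function` and `second_function` come from the script above; their signatures, their bodies, and the `internal_helper` function are assumptions made purely for illustration.

```c
/* simple.c: hypothetical source for libsimple.so */

/* Not listed under "global:", so the "local: *;" rule would keep this
 * symbol out of the exported set even though it has external linkage. */
int internal_helper(int x)
{
    return x + 1;
}

/* Listed under "global:", exported with version LIBSIMPLE_1.0. */
int first_function(int x)
{
    return internal_helper(x);
}

/* Listed under "global:", exported with version LIBSIMPLE_1.0. */
int second_function(int x)
{
    return 2 * internal_helper(x);
}
```

After linking with the version script, `nm -D` on the resulting library would be expected to show `first_function` and `second_function` tagged with `LIBSIMPLE_1.0`, while `internal_helper` should not be listed.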
__asm__(".symver list_occupancy_1_0, list_occupancy@MYLIBVERSION_1.0");
unsigned long list_occupancy_1_0(struct List* pStart)
{
    // here we scan the list, and return the number of elements
    return nElements;
}
__asm__(".symver list_occupancy_2_0, list_occupancy@@MYLIBVERSION_2.0");
unsigned long list_occupancy_2_0(struct List* pStart)
{
    // here we scan the list, but now return the total number of bytes
    return nElements*sizeof(struct List);
}
$ objdump -p /path/to/program | grep NEEDED
$ readelf -d /path/to/program | grep NEEDED
# lists only the symbols in the dynamic section (i.e., exported/visible symbols of a shared library).
$ nm -D <path-to-binary>
# lists the symbols in demangled format
$ nm -C <path-to-binary>
# search for a symbol in a multitude of binaries located in the same folder
$ nm -A <library-folder-path>/* | grep symbol-name
# list the library’s undefined symbols
$ nm -u <path-to-binary>
# Parsing ELF Header
$ objdump -f <path-to-binary>
# list the available sections
$ objdump -h <path-to-binary>
# Listing All Symbols
# `nm <path-to-binary>`
$ objdump -t <path-to-binary>
# Listing Dynamic Symbols Only
# `nm -D <path-to-binary>`
$ objdump -T <path-to-binary>
# `nm -C <path-to-binary>`
$ objdump -t -C <path-to-binary>
# examines the dynamic section (useful for finding `DT_RPATH` and/or `DT_RUNPATH` settings)
$ objdump -p <path-to-binary>
# Examining Relocation Section
$ objdump -R <path-to-binary>
# Examining Data Section
$ objdump -s -j <section name> <path-to-binary>
# `nm <path-to-binary>`
$ readelf --symbols <path-to-binary>
# `nm -D <path-to-binary>`
$ readelf --dyn-syms <path-to-binary>
# examines the dynamic section (useful for finding `DT_RPATH` and/or `DT_RUNPATH` settings)
$ readelf -d <path-to-binary>
# Examining the Relocation Section
$ readelf -r <path-to-binary>
# Examining the Data Section
$ readelf -x <section name> <path-to-binary>
# Listing and Examining Segments
$ readelf --segments <path-to-binary>
# Detecting the Debug Build
$ readelf --debug-dump=line <binary file path>| wc -l
if [ -n "$(readelf --debug-dump=line "$1" 2>/dev/null)" ]; then echo "$1 is built for debug"; fi
If (and only if) the binary is built for debug (by passing the `-g -O0` compiler flags), `addr2line` can map an address to its source location:
$ addr2line -C -f -e /usr/mylibs/libxyz.so 0000d8cc6
# Creating the Static Library
$ ar -rcs <library name> <list of object files>
# Listing the Static Library Object Files
$ ar -t <library name>
# Deleting an Object File from the Static Library
$ ar -d <library name> <object file to remove>
# Adding the New Object File to the Static Library
$ ar -r <library name> <object file to append>
# Restoring the Order of Object Files
$ ar -m -b <object file before> <library name> <object file to move>
LD_DEBUG
file
$ readelf -h <path-of-binary> | grep Type
- EXEC (executable file)
- DYN (shared object file)
- REL (relocatable file)
$ objdump -f <path-of-binary>
- EXEC_P (executable file)
- DYNAMIC (shared object file)
- No type indicated, in the case of an object file
$ readelf -h <path-of-binary> | grep Entry
$ objdump -f <path-of-binary> | grep start
LD_DEBUG=files gdb <exe>
nm
readelf
objdump
$ readelf -S <path-to-binary>
$ objdump -t <path-to-binary>
$ readelf -d <path-to-binary>
$ objdump -p <path-to-binary>
`pic_or_ltr.sh`:
if readelf -d "$1" | grep TEXTREL > /dev/null; \
    then echo "library is LTR, built without the -fPIC flag"; \
    else echo "library was built with -fPIC flag"; fi
$ readelf -r <path-to-binary>
$ objdump -R <path-to-binary>
$ readelf -x <section name> <path-to-binary>
$ objdump -s -j <section name> <path-to-binary>
$ readelf --segments <path-to-binary>
$ objdump -p <path-to-binary>
$ objdump -d <path-to-binary>
$ objdump -d -M intel <path-to-binary>
$ objdump -d -M intel -S <path-to-binary>
$ objdump -d -S -M intel -j .plt <path-to-binary>
$ readelf --debug-dump=line <path-to-binary>
$ ldd <path-to-binary>
$ objdump -p /path/to/program | grep NEEDED
$ readelf -d /path/to/program | grep NEEDED
$ ldconfig -p
open()
mmap()
Whenever a shared library is mentioned, the few output lines below the `mmap()` call typically reveal the loading address.
LD_DEBUG=files
$ lsof -p `pgrep firefox`
dl_iterate_phdr()
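As a minimal sketch of how `dl_iterate_phdr()` can be used to see which shared objects a running process has loaded, the following C code walks the loaded objects and prints the load address and name of each. It assumes glibc on Linux; the names `print_object` and `list_loaded_objects` are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* dl_iterate_phdr() is a GNU extension in <link.h> */
#endif
#include <link.h>
#include <stdio.h>

/* Callback invoked once per loaded object; `data` points to a counter. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    (void)size;
    int *count = data;
    /* The main executable is typically reported with an empty name. */
    const char *name = (info->dlpi_name && info->dlpi_name[0])
                           ? info->dlpi_name : "(main executable)";
    printf("%#lx  %s\n", (unsigned long)info->dlpi_addr, name);
    ++*count;
    return 0;   /* a non-zero return value would stop the iteration */
}

/* Prints every loaded object and returns how many were visited. */
int list_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

The addresses printed by `list_loaded_objects()` should correspond to the load addresses visible through `LD_DEBUG=files` or in `/proc/<pid>/maps`.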