Handling External APIs with extapi.c - SVF-tools/SVF GitHub Wiki
Background
In C/C++, external APIs refer to a set of functions provided by a library or framework. These functions enable your program to interact with and utilize the functionality offered by that external entity. However, when C/C++ programs are compiled into LLVM IR, these external APIs are often only declared rather than defined. This poses challenges for static analyzers.
To handle these undefined external APIs, SVF introduces an "extapi.c" file that contains implementations of the external API functions. When SVF encounters an external API during its analysis, it extracts the corresponding function body from the extapi.c file. SVF also lets users define their own function bodies for external APIs, tailored to specific requirements or scenarios.
The functions in extapi.c are divided into two categories:
1. the functionality is implemented by the function body,
2. the functionality is implemented by SVF.
The functionality is implemented by the function body
The function implementations in extapi.c are intended for use when constructing the Program Assignment Graph (PAG) in SVF. For example,
char *asctime_r(const void *tm, char *buf)
{
return buf;
}
After asctime_r() executes, the pointer passed in as the parameter buf is handed back to the caller through the return value. The function body above captures this side-effect of asctime_r(), which eliminates the need to write the complete asctime_r() code.
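The same pattern can be applied to user-supplied summaries. As a minimal sketch, assume a hypothetical library function mylib_fill_name() (an illustrative name, not part of SVF or extapi.c) that writes into a caller-provided buffer and returns that buffer. Under that assumption, the stub only needs to reproduce the pointer flow from the parameter to the return value:

```c
/* Hypothetical external API, used only for illustration; it is not part of
 * SVF's extapi.c. The real library would format a name into `out`. */
char *mylib_fill_name(int id, char *out)
{
    (void)id;      /* the id does not influence any pointer flow */
    return out;    /* side-effect summary: the return value aliases `out` */
}
```

What the real function writes into the buffer is deliberately left out, because only the pointer relationship matters for this kind of summary.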
The functionality is implemented by SVF
Intuitively, the functionality of every function in extapi.c (the extapi module) should be implemented within its function body. However, certain external functions need specialized handling by SVF. Take the malloc() function for instance. It allocates an object and returns a pointer to it, and expressing this in an ordinary function body might not be feasible. As a result, we attach an annotation to the malloc() function:
__attribute__((annotate("ALLOC_RET")))
void *malloc(unsigned long size)
{
return NULL;
}
When SVF encounters the malloc() function, it allocates an object to the return value of malloc() based on the ALLOC_RET annotation.
Annotations Explanation
The description of methodProperties is as follows:
ALLOC_RET, // returns a ptr to a newly allocated object
ALLOC_ARGi // stores a pointer to an allocated object in *argi
REALLOC_RET, // returns a ptr to another allocated object
MEMSET, // memset() operations
MEMCPY, // memcpy() operations
OVERWRITE, // svf function overwrite app function
__attribute__((annotate("ALLOC_RET"))): The function's side-effect is allocating an object to its return value. For example,
__attribute__((annotate("ALLOC_RET")))
void *malloc(unsigned long size)
{
return NULL;
}
__attribute__((annotate("ALLOC_ARGi"))): The function's side-effect is allocating an object to its i-th parameter. For example,
__attribute__((annotate("ALLOC_ARG0")))
int db_create(void **dbp, void *dbenv, unsigned int flags)
{
return 0;
}
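Judging from db_create() above, where the allocated object is stored through the first parameter and the annotation is ALLOC_ARG0, the index i appears to be zero-based. Under that assumption, a sketch for a hypothetical function mylib_open() (an illustrative name, not part of extapi.c) that stores its allocation through the second parameter might look like this:

```c
/* Hypothetical external API, for illustration only.
 * Assumption: argument indices in ALLOC_ARGi start at 0, so the second
 * parameter `handle` corresponds to ALLOC_ARG1. */
__attribute__((annotate("ALLOC_ARG1")))
int mylib_open(const char *path, void **handle)
{
    /* The body stays trivial: the allocation is modeled from the annotation,
     * not from any code written here. */
    (void)path;
    (void)handle;
    return 0;
}
```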
__attribute__((annotate("REALLOC_RET"))): The function's side-effect is reallocating an object to its return value. For example,
__attribute__((annotate("REALLOC_RET")))
char *getcwd(char *buf, unsigned long size)
{
return NULL;
}
__attribute__((annotate("MEMCPY"))): The side-effect of this function is similar to that of void *memcpy(void *dest, const void *src, size_t n). For example,
__attribute__((annotate("MEMCPY")))
void *memmove(void *str1, const void *str2, unsigned long n)
{
return NULL;
}
__attribute__((annotate("MEMSET"))): The side-effect of this function is similar to that of void *memset(void *str, int c, size_t n). For example,
__attribute__((annotate("MEMSET")))
void llvm_memset_p0i8_i64(char* dst, char elem, int sz, int flag)
{
}
__attribute__((annotate("OVERWRITE"))): Use this annotation when a function definition in extapi.c should overwrite the definition of the function with the same name in the application. For example,
__attribute__((annotate("OVERWRITE")))
void foo(char* arg)
{
...
...
}
Function declaration and Function definition mapping relationship in SVF
Due to the introduction of extapi.c, SVF analysis involves two modules (assuming the user has only one file to analyze): the extapi module and the app module. The purpose of extapi.c is to map function declarations in the app module to the function definitions in the extapi module, enabling SVF to accurately handle the side-effects of functions that only have declarations in the app module.
To ensure the accurate mapping between function declarations and definitions in the extapi module and the app module, SVF maintains two maps: FunDeclToDefMap and FunDefToDeclsMap.
FunDeclToDefMap : Function declaration --> Function definition;
FunDefToDeclsMap: Function definition --> Function declarations;
The declaration and definition are abbreviated as:
extapi module: ExtDecl, ExtDef;
app module: AppDecl, AppDef;
The mapping table of function declaration
and definition
in the extapi module
and the app module
is as follows:
| ------- | ----------------- | --------------- | ----------------- | ----------- |
| | AppDef | AppDecl | ExtDef | ExtDecl |
| ------- | ----------------- | --------------- | ----------------- | ----------- |
| AppDef | X | FunDefToDeclsMap| FunDeclToDefMap | X |
| ------- | ----------------- | --------------- | ----------------- | ----------- |
| AppDecl | FunDeclToDefMap | X | FunDeclToDefMap | X |
| ------- | ----------------- | --------------- | ----------------- | ----------- |
| ExtDef | FunDefToDeclsMap | FunDefToDeclsMap| X | X |
| ------- | ----------------- | --------------- | ----------------- | ----------- |
| ExtDecl | FunDeclToDefMap | X | X | ExtFuncsVec |
| ------- | ----------------- | --------------- | ----------------- | ----------- |
The mapping code is located in the buildFunToFunMap() function within the LLVMModule.cpp file.
Once the accurate mapping between function declarations and definitions is obtained, SVF removes the functions and annotations in the extapi module that are not used in the app module, based on the functions recorded in FunDefToDeclsMap. In other words, extapi definitions in the extapi module that lack corresponding app declarations are removed.
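The two maps and the pruning rule above can be pictured with a toy model. The sketch below is not SVF's implementation: SVF's real maps live in its C++ code, and the Fun struct, its fields, the fixed capacity of 4, and build_fun_maps() are all hypothetical names chosen for illustration. It also assumes declarations and definitions are matched by function name.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only; every name here is hypothetical. */
typedef struct Fun {
    const char *name;
    int is_def;                  /* 1 = definition, 0 = declaration */
    const struct Fun *def;       /* models FunDeclToDefMap (declarations only) */
    const struct Fun *decls[4];  /* models FunDefToDeclsMap (definitions only) */
    size_t ndecls;
} Fun;

/* Link each declaration to the definition with the same name, in both
 * directions, mirroring decl -> def and def -> decls. */
void build_fun_maps(Fun *funs, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (funs[i].is_def)
            continue;
        for (size_t j = 0; j < n; ++j) {
            if (funs[j].is_def && strcmp(funs[i].name, funs[j].name) == 0) {
                funs[i].def = &funs[j];
                if (funs[j].ndecls < 4)
                    funs[j].decls[funs[j].ndecls++] = &funs[i];
                break;
            }
        }
    }
}
```

In this model, a definition whose ndecls stays at 0 plays the role of an extapi definition with no corresponding app declaration, which is the kind of entry the pruning step removes.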
Most of the decl->def and def->decl relationships in this table are clear. Below, we give explanations for two situations that can be easily confused.
1. Using self-defined function definitions in extapi module to overwrite the function definition in the app module.
When a user wants to use their own functions (annotated with __attribute__((annotate("OVERWRITE")))) in the extapi module to overwrite functions defined in the app module, the app function definition (AppDef) is changed to an app function declaration (AppDecl).
Subsequently, the app function declaration (AppDecl) and the extapi function definition (ExtDef) are placed into FunDeclToDefMap and FunDefToDeclsMap respectively.
That is, the two relationships
AppDef -> ExtDef
ExtDef -> AppDef
in the table above become
AppDecl -> ExtDef
ExtDef -> AppDecl
during actual execution.
The following is a simple example to facilitate understanding:
app function:
AppDef: char* foo(char *a, char *b)
{
return a;
}
extapi self-defined function:
ExtDef: __attribute__((annotate("OVERWRITE")))
char* foo(char *a, char *b)
{
return b;
}
When SVF handles the foo() function in the app module, the definition
AppDef: char* foo(char *a, char *b)
{
return a;
}
will be changed to a declaration
AppDecl: char* foo(char *a, char *b);
Then,
AppDecl: char* foo(char *a, char *b);
ExtDef: __attribute__((annotate("OVERWRITE")))
char* foo(char *a, char *b)
{
return b;
}
will be put into
FunDeclToDefMap: {key: AppDecl, value: ExtDef}
FunDefToDeclsMap: {key: ExtDef, value: AppDecl}
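To see what the overwrite means for a client of the analysis, consider a hypothetical caller in the app module (use_foo, x and y are illustrative names, not taken from SVF). Compiled natively, the returned pointer aliases the first argument. Once SVF substitutes the OVERWRITE definition, a pointer analysis would be expected to see it alias the second argument instead:

```c
/* App module: the original definition returns its first argument. */
char *foo(char *a, char *b)
{
    (void)b;
    return a;
}

/* Hypothetical caller, for illustration only. */
char *use_foo(char *x, char *y)
{
    /* Natively, p aliases x.
     * Under SVF with the OVERWRITE definition from extapi.c (return b),
     * the analysis is expected to treat p as aliasing y instead. */
    char *p = foo(x, y);
    return p;
}
```

Only the analysis result changes; the program's runtime behavior is unaffected, since the overwrite happens inside SVF's view of the modules.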
2. Keeping function declarations in extapi module
In principle, all functions in extapi.c have bodies (definitions). Some functions, however, have only declarations without definitions because SVF treats them specially, and handling them may require modifications to the SVF code. Currently, the declaration-only functions in extapi.c are used only by certain modules of SVF (e.g., the SSE (SVF Symbolic Execution) module). If you do not have such a requirement, this section can be skipped.
The ExtFuncsVec vector records these specialized function declarations (the ExtDecl -> ExtDecl relationship outlined in the table above).
For example,
app function:
AppDef: void foo()
{
call memcpy();
}
extapi function:
ExtDecl: declare sse_check_overflow();
ExtDef: void* memcpy()
{
sse_check_overflow();
}
sse_check_overflow() is used in the extapi function but not in the app function, and it is handled in SVF code. Therefore, sse_check_overflow is kept in ExtFuncsVec for special handling.
3. Introducing svf__main to support command line input
Some main functions need to accept command line parameters, which are passed through int argc and char** argv. Therefore, extapi.c also provides the svf__main() function, together with the declaration extern int main(int argc, char** argv);. The size of argc and argv can be specified in svf__main(), and they take part in the subsequent analysis as abstract symbols.
ExtDecl: declare main();
ExtDef: void svf__main()
{
main();
}
AppDef: int main() {
}