Handling External APIs with extapi.c

Background

In C/C++, external APIs refer to a set of functions provided by a library or framework. These functions enable your program to interact with and utilize the functionality offered by that external entity. However, when C/C++ programs are compiled into LLVM IR, these external APIs are often only declared rather than defined. This poses challenges for static analyzers.

To handle these undefined external APIs, SVF introduces an "extapi.c" file that contains implementations of the external API functions. When SVF encounters an external API during its analysis, it takes the corresponding function body from extapi.c. Users can also define their own function bodies for external APIs, tailored to specific requirements or scenarios.

The functions in extapi.c are divided into two categories:

1. The functionality is implemented by the function body.
2. The functionality is implemented by SVF.

The functionality is implemented by the function body

The function implementations in extapi.c are intended for use when constructing the Program Assignment Graph (PAG) in SVF. For example,

char *asctime_r(const void *tm, char *buf)
{
    return buf;
}

When asctime_r() returns, its return value is the pointer passed in as the parameter buf, so the caller's result aliases buf. This stub captures the pointer-related side-effect of asctime_r() without the need to write the complete asctime_r() code.
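The aliasing fact that the stub preserves can be checked natively with a small sketch. The names model_asctime_r and result_aliases_buffer are hypothetical and chosen here only to avoid clashing with the real libc symbol; they are not part of extapi.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the asctime_r() stub shown above, renamed so
 * it does not clash with the libc symbol when compiled natively. */
char *model_asctime_r(const void *tm, char *buf)
{
    (void)tm;     /* the time value carries no pointer-relevant effect */
    return buf;   /* the only fact kept: the result aliases buf */
}

/* Returns 1 when the pointer handed back is the caller's own buffer. */
int result_aliases_buffer(void)
{
    char storage[32];
    char *out = model_asctime_r(NULL, storage);
    return out == storage;
}
```

Calling result_aliases_buffer() from a test program is enough to confirm that the stub keeps the buf-to-return-value relation, which is the information a pointer analysis needs from this function.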

The functionality is implemented by SVF

Intuitively, the functionality of every function in extapi.c (the extapi module) should be implemented within its function body. However, certain external functions need special handling by SVF. Take malloc() as an example. Its side-effect is that the return value points to a newly allocated object, which cannot be expressed with ordinary code in a stub body. For such functions, we attach an annotation, as shown here for malloc():

__attribute__((annotate("ALLOC_RET")))
void *malloc(unsigned long size)
{
    return NULL;
} 

When SVF encounters the malloc() function, it allocates an object to the return value of malloc() based on the ALLOC_RET annotation.

Annotations Explanation

The annotations (method properties) are described as follows:

    ALLOC_RET,    // returns a ptr to a newly allocated object
    ALLOC_ARGi,   // stores a pointer to an allocated object in *argi
    REALLOC_RET,  // returns a ptr to another allocated object
    MEMSET,       // memset() operations
    MEMCPY,       // memcpy() operations
    OVERWRITE,    // svf function overwrite app function

__attribute__((annotate("ALLOC_RET"))): The function's side-effect is returning a pointer to a newly allocated object. For example,

__attribute__((annotate("ALLOC_RET")))
void *malloc(unsigned long size)
{
    return NULL;
}

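__attribute__((annotate("ALLOC_ARGi"))): The function's side-effect is storing a pointer to an allocated object in *argi, where i is the index of the parameter (ALLOC_ARG0 refers to the first parameter, as in the example). For example,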

__attribute__((annotate("ALLOC_ARG0")))
int db_create(void **dbp, void *dbenv, unsigned int flags)
{
    return 0;
}
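One way to read the ALLOC_ARG0 annotation is to spell out, in ordinary C, the effect it describes. The sketch below is only an illustration of that effect and not how SVF implements the annotation; model_alloc_arg0 and arg0_receives_fresh_object are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: the effect ALLOC_ARG0 stands for, written as
 * plain C. A fresh object becomes the pointee of argument 0. */
int model_alloc_arg0(void **out)
{
    *out = malloc(1);
    return *out != NULL ? 0 : -1;
}

/* Returns 1 when the caller's handle points to a new object afterwards. */
int arg0_receives_fresh_object(void)
{
    void *handle = NULL;
    int rc = model_alloc_arg0(&handle);
    int ok = (rc == 0 && handle != NULL);
    free(handle);   /* release the object created by the model */
    return ok;
}
```

With the annotation, the stub body in extapi.c can stay trivial (return 0) while the analysis still treats *dbp as pointing to a newly allocated object.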

__attribute__((annotate("REALLOC_RET"))): The function's side-effect is returning a pointer to another allocated object. For example,

__attribute__((annotate("REALLOC_RET")))
char *getcwd(char *buf, unsigned long size)
{
    return NULL;
}

__attribute__((annotate("MEMCPY"))): The side-effect of this function is similar to that of void *memcpy(void *dest, const void *src, size_t n). For example,

__attribute__((annotate("MEMCPY")))
void *memmove(void *str1, const void *str2, unsigned long n)
{
    return NULL;
}

__attribute__((annotate("MEMSET"))): The side-effect of this function is similar to that of void *memset(void *str, int c, size_t n). For example,

__attribute__((annotate("MEMSET")))
void llvm_memset_p0i8_i64(char* dst, char elem, int sz, int flag)
{
}

__attribute__((annotate("OVERWRITE"))): Use this annotation when a function definition in extapi.c should overwrite the function definition with the same name in the application. For example,

__attribute__((annotate("OVERWRITE")))
void foo(char* arg)
{
    ...
    ...
}

Function declaration and Function definition mapping relationship in SVF

With extapi.c in place, an SVF analysis involves two modules (assuming the user has only one file to analyze): the extapi module and the app module. extapi.c exists so that function declarations in the app module can be mapped to function definitions in the extapi module, which enables SVF to accurately handle the side-effects of functions that only have declarations in the app module.

To ensure the accurate mapping between function declarations and definitions in the extapi module and the app module, SVF maintains two maps: FunDeclToDefMap and FunDefToDeclsMap.

FunDeclToDefMap : Function declaration --> Function definition;
FunDefToDeclsMap: Function definition --> Function declarations;

The declarations and definitions are abbreviated as:

extapi module: ExtDecl, ExtDef;
app module:    AppDecl, AppDef;

The mapping table of function declaration and definition in the extapi module and the app module is as follows:

| ------- | ----------------- | --------------- | ----------------- | ----------- |
|         |      AppDef       |     AppDecl     |      ExtDef       |   ExtDecl   |
| ------- | ----------------- | --------------- | ----------------- | ----------- |
| AppDef  |        X          | FunDefToDeclsMap| FunDeclToDefMap   |      X      |
| ------- | ----------------- | --------------- | ----------------- | ----------- |
| AppDecl | FunDeclToDefMap   |        X        | FunDeclToDefMap   |      X      |
| ------- | ----------------- | --------------- | ----------------- | ----------- |
| ExtDef  | FunDefToDeclsMap  | FunDefToDeclsMap|        X          |      X      |
| ------- | ----------------- | --------------- | ----------------- | ----------- |
| ExtDecl | FunDeclToDefMap   |        X        |        X          | ExtFuncsVec |
| ------- | ----------------- | --------------- | ----------------- | ----------- |

The mapping code is located in the buildFunToFunMap() function in LLVMModule.cpp.

Once the mapping between function declarations and definitions is established, SVF removes the functions and annotations in the extapi module that are not used by the app module, based on the functions recorded in the FunDefToDeclsMap. In other words, definitions in the extapi module that lack corresponding app declarations are removed.
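The declaration-to-definition lookup and the pruning step can be pictured with a toy model. This is a sketch under stated assumptions and not SVF's code: the Fun type, the helper names, and the sample data are hypothetical, and matching by function name is assumed here for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical types, not SVF's data structures. */
typedef enum { APP, EXT } Module;

typedef struct {
    const char *name;
    Module module;
    int is_def;   /* 1 = has a body, 0 = declaration only */
} Fun;

/* Toy FunDeclToDefMap lookup: a definition in the other module with the
 * same name (name matching is an assumption of this sketch). */
const Fun *find_def_for_decl(const Fun *decl, const Fun *funs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (funs[i].is_def && funs[i].module != decl->module &&
            strcmp(funs[i].name, decl->name) == 0)
            return &funs[i];
    }
    return NULL;
}

/* Toy pruning test: a definition is kept only if some app declaration
 * maps to it, mirroring the def -> decls direction. */
int ext_def_is_used(const Fun *def, const Fun *funs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!funs[i].is_def && funs[i].module == APP &&
            find_def_for_decl(&funs[i], funs, n) == def)
            return 1;
    }
    return 0;
}

/* Sample: the app declares malloc; extapi defines malloc and getcwd. */
static const Fun sample[] = {
    { "malloc", APP, 0 },
    { "malloc", EXT, 1 },
    { "getcwd", EXT, 1 },
};
enum { SAMPLE_LEN = sizeof sample / sizeof sample[0] };
```

In this sample the app declaration of malloc resolves to the extapi definition, so that definition is kept, while the extapi definition of getcwd has no app declaration and would be removed.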

Most of the decl->def and def->decl relationships in this table are clear. Below, we explain the situations that are easily confused.

1. Using self-defined function definitions in extapi module to overwrite the function definition in the app module.

When a user wants a function defined in the extapi module (with __attribute__((annotate("OVERWRITE")))) to overwrite a function defined in the app module, the app function definition (AppDef) is changed to an app function declaration (AppDecl). The app function declaration (AppDecl) and the extapi function definition (ExtDef) are then placed into the FunDeclToDefMap and the FunDefToDeclsMap.

That is, two relationships

AppDef -> ExtDef
ExtDef -> AppDef

in the table above become

AppDecl -> ExtDef
ExtDef -> AppDecl

during actual execution.

The following is a simple example to facilitate understanding:

app function:

AppDef: char* foo(char *a, char *b)
        {
           return a;
        }

extapi self-defined function:

ExtDef: __attribute__((annotate("OVERWRITE")))
        char* foo(char *a, char *b)
        {
           return b;
        }

When SVF handles the foo() function in the App module, the definition of

AppDef: char* foo(char *a, char *b)
        {
           return a;
        }

will be changed to a declaration

AppDecl: char* foo(char *a, char *b);

Then,

AppDecl: char* foo(char *a, char *b);

ExtDef: __attribute__((annotate("OVERWRITE")))
        char* foo(char *a, char *b)
        {
           return b;
        }

will be put into

FunDeclToDefMap: {key: AppDecl, value: ExtDef}
FunDefToDeclsMap: {key: ExtDef, value: AppDecl}
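The demotion of AppDef to AppDecl can be sketched with a toy model. The Fun type and the functions apply_overwrite and overwrite_demo are hypothetical and only mimic the rule described above; they are not taken from SVF.

```c
#include <assert.h>
#include <string.h>

/* Toy model only: hypothetical types, not SVF's data structures. */
typedef enum { APP, EXT } Module;

typedef struct {
    const char *name;
    Module module;
    int is_def;      /* 1 = has a body, 0 = declaration only */
    int overwrite;   /* 1 = carries the OVERWRITE annotation */
} Fun;

/* If ext is an OVERWRITE definition with the same name as an app
 * definition, demote the app definition to a declaration.
 * Returns 1 when the demotion happened, 0 otherwise. */
int apply_overwrite(Fun *app, const Fun *ext)
{
    if (app->module == APP && app->is_def &&
        ext->module == EXT && ext->is_def && ext->overwrite &&
        strcmp(app->name, ext->name) == 0) {
        app->is_def = 0;   /* AppDef becomes AppDecl */
        return 1;
    }
    return 0;
}

/* Mirrors the foo() example: returns 1 when foo in the app module ends
 * up as a declaration after the rule is applied. */
int overwrite_demo(void)
{
    Fun app_foo = { "foo", APP, 1, 0 };
    Fun ext_foo = { "foo", EXT, 1, 1 };
    int demoted = apply_overwrite(&app_foo, &ext_foo);
    return demoted && app_foo.is_def == 0;
}
```

After the demotion, the pair (AppDecl, ExtDef) is what gets recorded in the two maps, and an extapi definition without the annotation leaves the app definition untouched.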

2. Keeping function declarations in extapi module

In principle, all functions in extapi.c have bodies (definitions). Some functions, however, are only declared, without definitions, because they receive special treatment from SVF and may require modifications to the SVF code for processing. Currently, declaration-only functions in extapi.c are used only by certain modules of SVF (e.g., the SSE (SVF Symbolic Execution) module). If the user does not have such a requirement, this section can be skipped. The ExtFuncsVec vector records these specialized function declarations (the ExtDecl -> ExtDecl relationship in the table above).

For example,

app function:

AppDef:  void foo()
         {
             call memcpy();
         }

extapi function:

ExtDecl: declare sse_check_overflow();

ExtDef:  void* memcpy()
         {
             sse_check_overflow();
         }

sse_check_overflow() is used in the extapi function but not in the app function, and it is handled in SVF code. Therefore, sse_check_overflow is kept in ExtFuncsVec for special handling.

3. Introducing svf__main to support command line input

Some main functions accept command line parameters through int argc and char **argv. To support them, extapi.c provides the svf__main() function together with the declaration extern int main(int argc, char** argv);. The sizes of argc and argv can be specified in svf__main(), and both take part in the subsequent analysis as abstract symbols.

ExtDecl: declare main();

ExtDef:  void svf__main()
         {
             main();
         }

AppDef:  int main()
         {
         }