Binary instrumentation - adava/DECAF-Selective GitHub Wiki

Binary instrumentation lets us add custom logic to a binary. For instance, say we want to check at runtime whether a program has a buffer overflow. We need to inspect the stack and see whether a previous frame was overwritten by a function. To do this, we check the stack once before the function starts running and once after it returns, and compare the frame in the two snapshots; a difference indicates that a buffer overflow occurred. Binary instrumentation allows us to add these two frame-checking snippets at the beginning and the end of any given function in an executable. QEMU makes binary instrumentation possible because it emulates the execution environment for a guest. Since DECAF is embedded in QEMU, it can instrument different parts of a binary during translation, i.e. custom code can be injected while QEMU translates (disassembles) the guest binary and writes the equivalent intermediate code.
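The before/after frame check can be sketched in plain C. This is a toy simulation under stated assumptions, not DECAF code: the stack is modeled as a small struct, and toy_stack_t, safe_copy, unsafe_copy and checked_call are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model of a stack region (hypothetical layout): the callee's local
 * buffer sits directly below data that belongs to the previous frame. */
typedef struct {
    uint8_t local_buf[8];     /* callee's local buffer */
    uint8_t caller_frame[8];  /* data owned by the previous frame */
} toy_stack_t;

typedef void (*toy_func_t)(toy_stack_t *stack, const uint8_t *input, size_t len);

/* Well-behaved callee: never writes past local_buf. */
static void safe_copy(toy_stack_t *stack, const uint8_t *input, size_t len) {
    if (len > sizeof stack->local_buf)
        len = sizeof stack->local_buf;
    memcpy(stack->local_buf, input, len);
}

/* Buggy callee: trusts len, so long inputs spill into caller_frame. */
static void unsafe_copy(toy_stack_t *stack, const uint8_t *input, size_t len) {
    assert(len <= sizeof *stack);  /* keep the toy overflow inside the model */
    memcpy((uint8_t *)stack, input, len);
}

/* Instrumented call: snapshot the previous frame before the call and
 * compare it after the call. Returns 1 if the frame changed, else 0. */
static int checked_call(toy_func_t fn, toy_stack_t *stack,
                        const uint8_t *input, size_t len) {
    uint8_t before[sizeof stack->caller_frame];
    memcpy(before, stack->caller_frame, sizeof before);  /* check before */
    fn(stack, input, len);                               /* original code */
    return memcmp(before, stack->caller_frame, sizeof before) != 0; /* check after */
}
```

Calling checked_call with a 12-byte input reports no change for safe_copy and a changed frame for unsafe_copy. A real instrumentation would read the guest stack through the emulator rather than a C struct.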

Code translation gives us enough knowledge for selective instrumentation. For instance, say we want to instrument the code only just before it makes a particular system call. To implement this, we inspect the binary during translation and embed our additional code whenever we observe the system call of interest. Another example is instrumenting the code when a particular interrupt is issued; again, at translation time we can check for it and insert our custom code wherever such an interrupt appears in the binary.
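The idea of deciding at translation time where to inject code can be illustrated with a toy translator. The instruction set, the intermediate representation and the translate function below are all hypothetical; they are not QEMU's or DECAF's actual data structures, only a minimal sketch of the technique.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical guest opcodes for a toy instruction set. */
typedef enum { OP_NOP, OP_ADD, OP_SYSCALL, OP_RET } guest_op_t;

typedef struct {
    guest_op_t op;
    int arg;  /* for OP_SYSCALL: the system call number */
} guest_insn_t;

/* Hypothetical intermediate ops: a translated guest instruction,
 * or an injected call to our custom hook. */
typedef enum { IR_GUEST, IR_CALL_HOOK } ir_kind_t;

typedef struct {
    ir_kind_t kind;
    guest_insn_t insn;  /* the guest instruction; for a hook, the one it precedes */
} ir_op_t;

/* Translate guest code to IR, inserting a hook before every system call
 * whose number equals watched_syscall. Returns the number of IR ops
 * written, or 0 if the output buffer is too small. */
static size_t translate(const guest_insn_t *code, size_t n,
                        int watched_syscall, ir_op_t *out, size_t cap) {
    size_t w = 0;
    assert(code != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Selective step: only the system call of interest gets a hook. */
        if (code[i].op == OP_SYSCALL && code[i].arg == watched_syscall) {
            if (w == cap)
                return 0;
            out[w].kind = IR_CALL_HOOK;
            out[w].insn = code[i];
            w++;
        }
        /* Every guest instruction is still translated as usual. */
        if (w == cap)
            return 0;
        out[w].kind = IR_GUEST;
        out[w].insn = code[i];
        w++;
    }
    return w;
}
```

Because the decision is made once per translated instruction, code that never reaches the watched system call carries no extra work at execution time.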

DECAF defines several callback types that allow selective instrumentation. The callback types that DECAF supports are listed below:

	typedef enum {
		DECAF_BLOCK_BEGIN_CB = 0,
		DECAF_BLOCK_END_CB,
		DECAF_INSN_BEGIN_CB,
		DECAF_INSN_END_CB,
		DECAF_MEM_READ_CB,
		DECAF_MEM_WRITE_CB,
		DECAF_EIP_CHECK_CB,
		DECAF_KEYSTROKE_CB,//keystroke event
		DECAF_NIC_REC_CB,
		DECAF_NIC_SEND_CB,
		DECAF_OPCODE_RANGE_CB,
		DECAF_TLB_EXEC_CB,
		DECAF_READ_TAINTMEM_CB,
		DECAF_WRITE_TAINTMEM_CB,
		DECAF_BLOCK_TRANS_CB,
		DECAF_LAST_CB, 
	} DECAF_callback_type_t;

Callbacks of each type are kept in the callback_list_heads list. Each callback in this list has the following structure:

	typedef struct callback_struct{
		int *enabled;
		//the following are used by the optimized callbacks
		//BlockBegin only uses from - to is ignored
		//blockend uses both from and to
		gva_t from;
		gva_t to;
		OCB_t ocb_type;
		DECAF_callback_func_t callback;
		LIST_ENTRY(callback_struct) link;
	}callback_struct_t;
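To make the "one list per callback type" layout concrete, here is a minimal, self-contained sketch. It is not DECAF's implementation: the sketch_ names are hypothetical, a plain next pointer stands in for LIST_ENTRY, and only the enabled and callback fields of the structure above are modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical subset of callback types; SKETCH_LAST_CB sizes the array. */
typedef enum {
    SKETCH_BLOCK_BEGIN_CB = 0,
    SKETCH_BLOCK_END_CB,
    SKETCH_LAST_CB
} sketch_cb_type_t;

typedef void (*sketch_cb_func_t)(void *params);

typedef struct sketch_callback {
    int *enabled;                  /* optional condition flag, may be NULL */
    sketch_cb_func_t callback;
    struct sketch_callback *next;  /* stands in for LIST_ENTRY */
} sketch_callback_t;

/* One list head per callback type. */
static sketch_callback_t *sketch_list_heads[SKETCH_LAST_CB];

/* Example callback that does nothing. */
static void sketch_noop(void *params) { (void)params; }

/* Push a new entry onto the list for its type. Returns the entry as a
 * handle, or NULL on failure. */
static sketch_callback_t *sketch_register(sketch_cb_type_t type,
                                          sketch_cb_func_t fn, int *cond) {
    assert((unsigned)type < SKETCH_LAST_CB);
    if (fn == NULL)
        return NULL;
    sketch_callback_t *cb = malloc(sizeof *cb);
    if (cb == NULL)
        return NULL;
    cb->enabled = cond;
    cb->callback = fn;
    cb->next = sketch_list_heads[type];
    sketch_list_heads[type] = cb;
    return cb;
}

/* Unlink and free an entry. Returns 0 on success, -1 if not found. */
static int sketch_unregister(sketch_cb_type_t type, sketch_callback_t *handle) {
    assert((unsigned)type < SKETCH_LAST_CB);
    for (sketch_callback_t **pp = &sketch_list_heads[type]; *pp != NULL;
         pp = &(*pp)->next) {
        if (*pp == handle) {
            *pp = handle->next;
            free(handle);
            return 0;
        }
    }
    return -1;
}

/* Count the entries registered for one type. */
static size_t sketch_count(sketch_cb_type_t type) {
    size_t n = 0;
    assert((unsigned)type < SKETCH_LAST_CB);
    for (sketch_callback_t *cb = sketch_list_heads[type]; cb != NULL; cb = cb->next)
        n++;
    return n;
}
```

Indexing the array by callback type means that looking up the callbacks for an event only touches the entries of that type.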

Callback mechanism implementation

In this section, we explain in more detail how DECAF implements the callback mechanism. Callbacks are custom functions defined by the user that are called according to their type. For instance, a callback of type DECAF_BLOCK_END_CB is called at the end of every block of code. Callbacks are registered when QEMU initializes. Figure 12 shows the trace to the do_load_plugin function during initialization. Execution from this function eventually reaches an init_plugin function, which is the init function of a custom plugin defined by a DECAF user. For instance, apitracer and callbacktests are two of the default plugins written by the DECAF authors. Inside the init_plugin function of these plugins, we can see a call to the VMI_register_callback function, which registers a callback. Afterwards, init_plugin returns a pointer to a plugin_interface_t structure, which is stored in the decaf_plugin global variable.

Figure 12. Trace to the load plugin

During execution, DECAF checks whether a callback is registered for a particular event (e.g. a TLB flush) or for a certain point in translation (e.g. before block translation) and, if so, injects a call to the callback. DECAF_is_callback_needed checks whether a callback is registered for a given callback type, and DECAF_invoke invokes the callback.
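The check-then-invoke pattern can be sketched as follows. This is an illustration only: the demo_ functions are hypothetical stand-ins and do not reproduce the signatures or behavior of DECAF_is_callback_needed or DECAF_invoke. In particular, treating a NULL condition pointer as "always enabled" is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of callback types. */
typedef enum { DEMO_TLB_EXEC_CB = 0, DEMO_BLOCK_BEGIN_CB, DEMO_LAST_CB } demo_cb_type_t;

typedef void (*demo_cb_func_t)(void *params);

typedef struct {
    demo_cb_func_t callback;  /* NULL marks a free slot */
    int *enabled;             /* assumption: NULL means always enabled */
} demo_slot_t;

#define DEMO_MAX_SLOTS 4
static demo_slot_t demo_slots[DEMO_LAST_CB][DEMO_MAX_SLOTS];

/* Register fn for type. Returns 0 on success, -1 if the table is full. */
static int demo_add(demo_cb_type_t type, demo_cb_func_t fn, int *cond) {
    assert((unsigned)type < DEMO_LAST_CB && fn != NULL);
    for (size_t i = 0; i < DEMO_MAX_SLOTS; i++) {
        if (demo_slots[type][i].callback == NULL) {
            demo_slots[type][i].callback = fn;
            demo_slots[type][i].enabled = cond;
            return 0;
        }
    }
    return -1;
}

/* Cheap check: is anything registered for this type? */
static int demo_is_callback_needed(demo_cb_type_t type) {
    assert((unsigned)type < DEMO_LAST_CB);
    for (size_t i = 0; i < DEMO_MAX_SLOTS; i++)
        if (demo_slots[type][i].callback != NULL)
            return 1;
    return 0;
}

/* Run every registered callback of this type whose condition flag
 * allows it. Returns the number of callbacks that ran. */
static int demo_invoke(demo_cb_type_t type, void *params) {
    int ran = 0;
    assert((unsigned)type < DEMO_LAST_CB);
    for (size_t i = 0; i < DEMO_MAX_SLOTS; i++) {
        const demo_slot_t *s = &demo_slots[type][i];
        if (s->callback != NULL && (s->enabled == NULL || *s->enabled)) {
            s->callback(params);
            ran++;
        }
    }
    return ran;
}

/* Example callback: increments the int that params points to. */
static void demo_count_event(void *params) { ++*(int *)params; }

/* Call site placed where the event occurs: skip all work when no
 * callback is registered, otherwise dispatch. */
static int demo_on_event(demo_cb_type_t type, void *params) {
    if (!demo_is_callback_needed(type))
        return 0;
    return demo_invoke(type, params);
}
```

Separating the cheap "is anything registered" check from the dispatch keeps the common case, where no plugin listens for an event, nearly free. The condition pointer lets a plugin switch a callback on and off without re-registering it.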

DECAF Internal callbacks

Much of DECAF's own functionality is implemented using the callback mechanism just explained. The registered callbacks are kept in callback_list_heads, and functions are registered using the following function:

	DECAF_Handle DECAF_register_callback(DECAF_callback_type_t cb_type, DECAF_callback_func_t cb_func, int *cb_cond);

Two of these internal callbacks and their purpose are:

  • Tainting-related callbacks: Two functions, read_taint_mem and write_taint_mem, are registered to log reads from and writes to tainted memory. read_taint_mem is called by helper_DECAF_invoke_mem_read_callback.
  • Keystroke callbacks: tracing_send_keystroke logs keystroke input. It is registered for DECAF_KEYSTROKE_CB and is called by DECAF_keystroke_place.