Add Metadata to LLVM Bitcode - ashish-gehani/SPADE GitHub Wiki
The steps below describe how to use an LLVM optimizer pass, named AddMetadata, to add metadata at function call-sites to specified parameters of the called function.
Use case
The following code snippet from the web server nweb reads the request data (i.e. the input to the web server) from the client into the buffer argument of the read system call:
...
void web(int fd, int hit)
{
    int j, file_fd, buflen, len;
    long i, ret;
    char * fstr;
    static char buffer[BUFSIZE+1]; /* static so zero filled */
    ret = read(fd,buffer,BUFSIZE); /* read Web request in one go */
...
Metadata can be added to the buffer argument of the read function call (above) using the following AddMetadata configuration file:
read, 2, input
The configuration file says to add the metadata input to the second argument of the read function call.
To add the metadata, the AddMetadata pass requires the LLVM bitcode of the nweb application. It iterates over all the instructions in the bitcode and adds the metadata described above to an instruction as LLVM Metadata. For each function call, metadata named call-site-metadata is attached to the call instruction. The value of the attached metadata is an MDTuple instance. The first element of this tuple is a unique identifier for the site of the function call. The remaining elements of the tuple are MDTuple instances, one for each argument (if metadata was added for the argument). The first element of an argument tuple is the index of the argument (as specified in the configuration file). The remaining elements of the argument tuple are the metadata descriptors (one or more) specified in the configuration file.
The output LLVM IR snippet of the above example looks as follows:
%14 = call i64 @read(i32 %12, i8* %13, i64 10), !call-site-metadata !2
!2 = !{!"0", !3}
!3 = !{!"2", !"input"}
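The nesting of the tuples described above can be modeled without LLVM. The following is a minimal standalone sketch, not the implementation of the pass: the types ArgumentMetadata and CallSiteMetadata and the function renderAsIR are hypothetical names introduced for illustration, and the metadata node numbering is supplied by the caller as an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one per-argument tuple: the index first, then the descriptors.
struct ArgumentMetadata {
    unsigned index;                        // argument index as written in the configuration file
    std::vector<std::string> descriptors;  // one or more descriptors, e.g. "input"
};

// Hypothetical model of the tuple attached under the name call-site-metadata.
struct CallSiteMetadata {
    unsigned callSiteId;                      // unique identifier of the call site
    std::vector<ArgumentMetadata> arguments;  // one entry per argument that received metadata
};

// Renders the model as textual metadata nodes. The outer tuple is numbered
// firstNodeId and each argument tuple takes the next number in sequence.
std::string renderAsIR(const CallSiteMetadata &site, unsigned firstNodeId) {
    std::string outer = "!" + std::to_string(firstNodeId) + " = !{!\"" +
                        std::to_string(site.callSiteId) + "\"";
    std::string inner;
    unsigned nodeId = firstNodeId;
    for (const ArgumentMetadata &arg : site.arguments) {
        assert(!arg.descriptors.empty() && "an argument tuple carries at least one descriptor");
        ++nodeId;
        outer += ", !" + std::to_string(nodeId);  // the outer tuple references the argument tuple
        inner += "!" + std::to_string(nodeId) + " = !{!\"" + std::to_string(arg.index) + "\"";
        for (const std::string &descriptor : arg.descriptors) {
            inner += ", !\"" + descriptor + "\"";
        }
        inner += "}\n";
    }
    return outer + "}\n" + inner;
}
```

For a call site with identifier 0 and the descriptor input on argument 2, renderAsIR(site, 2) yields the same two metadata lines as the IR snippet above.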
Requirements
- LLVM - Recommended release 10.0.0
- Clang - Recommended release 10.0.0
- CMake - Recommended release 3.13.4 or higher
On Ubuntu 20.04, the requirements can be installed with:
sudo apt-get install -y llvm-10 clang-10 cmake
Build the AddMetadata Pass From Source
- Clone the SPADE repository
- Execute the command: ./bin/build-add-metadata.sh /usr/lib/llvm-10. Make sure to update the argument /usr/lib/llvm-10 to your LLVM installation
- Upon successful build, the shared library for the pass is created in lib/libAddMetadata.so
Using AddMetadata Pass
The pass takes three arguments:
- -config: (Mandatory) The path to the input configuration file (format described below)
- -output: (Optional) File location to write the output of the pass to. If the value is stdout, the output is written to standard out
- -debug: (Optional) Print debug information; specifically, after each metadata addition, parse and print the metadata
Following is an example command:
$ opt -load lib/libAddMetadata.so -legacy-add-metadata -config input.config -output stdout bitcode.bc -o bitcode_with_metadata.bc
The command above reads the input configuration file from input.config and writes the output of the pass to standard out. The resulting bitcode is written to bitcode_with_metadata.bc.
File Formats
The following is a sample input configuration file, specified using -config:
# Each line contains 3 comma-separated values
# 1. The first value is the function name. The metadata will be added for all call-sites of this function
# 2. The second value is the argument index of the function call to which the metadata would be added
# 3. The third value is a descriptor of the metadata to identify the semantics of the metadata
# Comments start with '#' and must be at the beginning of the line
# The following tells the pass to add metadata with descriptor 'input' to the second parameter at all call-sites of the function 'read'
read, 2, input
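To make the line format concrete, here is a minimal standalone sketch of how one configuration line could be parsed. It is an illustration under stated assumptions, not the parser used by the pass: ConfigEntry, splitFields, isNumber, toUnsigned, and parseConfigLine are hypothetical names, and whitespace around each value is assumed to be insignificant.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one configuration line.
struct ConfigEntry {
    std::string function;    // name of the called function
    unsigned argumentIndex;  // index of the argument that receives the metadata
    std::string descriptor;  // descriptor identifying the semantics of the metadata
};

// Splits a line on commas and strips surrounding whitespace from each field.
static std::vector<std::string> splitFields(const std::string &line) {
    std::vector<std::string> fields;
    std::stringstream stream(line);
    std::string field;
    while (std::getline(stream, field, ',')) {
        const std::string::size_type first = field.find_first_not_of(" \t\r");
        const std::string::size_type last = field.find_last_not_of(" \t\r");
        fields.push_back(first == std::string::npos ? "" : field.substr(first, last - first + 1));
    }
    return fields;
}

// True for a non-empty run of at most nine decimal digits (always fits in unsigned).
static bool isNumber(const std::string &s) {
    return !s.empty() && s.size() <= 9 && s.find_first_not_of("0123456789") == std::string::npos;
}

static unsigned toUnsigned(const std::string &digits) {
    assert(isNumber(digits) && "caller must validate the field first");
    return static_cast<unsigned>(std::stoul(digits));
}

// Parses one line into 'entry'. Returns false for comments (a '#' in the first
// column), blank lines, and lines that do not hold exactly three valid values.
bool parseConfigLine(const std::string &line, ConfigEntry &entry) {
    if (line.empty() || line[0] == '#') return false;
    const std::vector<std::string> fields = splitFields(line);
    if (fields.size() != 3 || fields[0].empty() || fields[2].empty()) return false;
    if (!isNumber(fields[1])) return false;
    entry.function = fields[0];
    entry.argumentIndex = toUnsigned(fields[1]);
    entry.descriptor = fields[2];
    return true;
}
```

Calling parseConfigLine on the sample line read, 2, input fills the entry with the function read, the argument index 2, and the descriptor input, while the comment lines are rejected.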
Following is a sample output of the pass:
10, read, 2, input
The output above indicates that the descriptor input was added to the second parameter of the function read at its call-site, which is identified by the value 10.
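When the output is saved for later analysis, it can be convenient to index the records by call-site identifier. The sketch below is one possible way to do that, assuming every record has exactly the four comma-separated values shown in the sample; OutputRecord, splitFields, isNumber, toUnsigned, and groupByCallSite are hypothetical names and are not part of SPADE.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one output line of the pass.
struct OutputRecord {
    unsigned callSiteId;     // identifier of the call site
    std::string function;    // name of the called function
    unsigned argumentIndex;  // index of the annotated argument
    std::string descriptor;  // descriptor that was added
};

// Splits a line on commas and strips surrounding whitespace from each field.
static std::vector<std::string> splitFields(const std::string &line) {
    std::vector<std::string> fields;
    std::stringstream stream(line);
    std::string field;
    while (std::getline(stream, field, ',')) {
        const std::string::size_type first = field.find_first_not_of(" \t\r");
        const std::string::size_type last = field.find_last_not_of(" \t\r");
        fields.push_back(first == std::string::npos ? "" : field.substr(first, last - first + 1));
    }
    return fields;
}

// True for a non-empty run of at most nine decimal digits (always fits in unsigned).
static bool isNumber(const std::string &s) {
    return !s.empty() && s.size() <= 9 && s.find_first_not_of("0123456789") == std::string::npos;
}

static unsigned toUnsigned(const std::string &digits) {
    assert(isNumber(digits) && "caller must validate the field first");
    return static_cast<unsigned>(std::stoul(digits));
}

// Reads records from 'in' and groups them by call-site identifier.
// Lines that do not match the four-value layout are skipped.
std::map<unsigned, std::vector<OutputRecord> > groupByCallSite(std::istream &in) {
    std::map<unsigned, std::vector<OutputRecord> > bySite;
    std::string line;
    while (std::getline(in, line)) {
        const std::vector<std::string> f = splitFields(line);
        if (f.size() != 4 || !isNumber(f[0]) || !isNumber(f[2])) continue;
        OutputRecord record = {toUnsigned(f[0]), f[1], toUnsigned(f[2]), f[3]};
        bySite[record.callSiteId].push_back(record);
    }
    return bySite;
}
```

Feeding the sample line to groupByCallSite gives a map with a single key, 10, whose only record names the function read, the argument index 2, and the descriptor input.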
Extracting Metadata from LLVM Bitcode in an LLVM Optimizer Pass
The following code snippet shows how to extract the added metadata using a callback mechanism:
// The callback function, called once for each metadata description added for each parameter
static void extraction_metadata_callback(Instruction *instruction, StringRef *functionName, APInt *callSiteNumber, APInt *parameterIndex, StringRef *description) {
    // Do your work here
}

// The main function for an LLVM optimizer pass
bool MyOPTPass::runOnModule(Module &module) {
    // The required signature of the callback can be seen here
    void (*metadata_callback_func)(Instruction *instruction, StringRef *functionName, APInt *callSiteNumber, APInt *parameterIndex, StringRef *description);
    // Assign your callback function
    metadata_callback_func = &extraction_metadata_callback;
    // The 'extractAllMetadata' function checks for existing metadata on each instruction. If found, it calls the callback function.
    extractAllMetadata(module, metadata_callback_func);
    // Return false because this pass does not modify the module
    return false;
}
The implementation of extractAllMetadata can be found in AddMetadata.cpp.