Gklee Tutorial

SESA (Symbolic Execution with Static Analysis) and Flows

Welcome to the Gklee tutorial, where we will focus on the features specific to SESA, which includes a static analaysis pre-execution step which informs the Gklee execution step in its management of flows.

SESA is the latest Gklee version, which was published in Supercomputing 2014.

For technical details, please see our information page at:

http://formalverification.cs.utah.edu/GKLEE -- especially 'Practical Symbolic Checking of GPU Programs'

You are welcome to install Gklee on your machine, and it is very easy to do. However, it does take some time, as there is a lot of infrastructure that Gklee depends on. Luckily, we have something already set up for you. The AptLab system is a flexible virtual environment that you can access from the world-wide-web without having an account (though you are limited to a web-page console without an account).

So these are the steps that we will cover in this tutorial:

Loading the AptLab website
- Selecting the Gklee project (system image)
Analyzing the read_write_race_0 example program
- Review the source code
- Perform the static analysis phase (SESA)
- Run SESA annotated (byte) code through Gklee
- Read Gklee output
- Examine the generated flow graph

Load AptLab Project

This section will assist you in loading Gklee in the virtual environment AptLab. No account is required, but having one will give you more flexibility in running Gklee, such as having the ability to use your own ssh session from your favorite console.

The first step is to access the website: https://www.aptlab.net/

Once you are here, the site will ask you to supply your login credentials. If you don't have these, just click the button continue as guest.

Now you will be on the Start Experiment page:

A view of AptLab

At this point, click on the Change Profile button in order to select the Gklee system image. You will be presented with a list of profiles, and a search box. If you enter gklee in the box, you'll find an Ubuntu image for gklee. Click Select Profile to continue, and click through next to the Finalize section, where you complete with Finish.

After some time at the experiment preparation area, your experiment will be ready to run! The page will look like this:

Gklee Experiment Ready! in AptLab

You now have a console to do your Gklee experiments in.

The example program, read_write_race_0

You may access this example program from the Github page if you like, https://github.com/Geof23/GkleeTests/blob/master/read_write_race_0/read_write_race_0.cu or you can run Emacs inside of the AptLab website console: Gklee Experiment Ready! in AptLab

cd GkleeTests/read_write_race_0
emacs -nw read_write_race_0.cu

Now that you have the example either loaded in Emacs or your browser window, We may discuss features of this program.

As you may know, CUDA programs consist of code sections that run on either host or GPU (or both!). This is a fairly simple program that has a single function, or GPU kernel, device_global.

We will disregard the host code located in main, which consists of mostly memory operations (i.e. transfers to and from device memory).

This CUDA kernel performs three operations, in addition to two conditional expressions, producing three branches of execution. The operations are:

compute a global thread index
write index to global array
read from global array, using an offset of our index and then write to the global array, at our index

__global__ void device_global(unsigned int *input_array, int num_elements) {
  int my_index = blockIdx.x * blockDim.x + threadIdx.x;
    // stop out of bounds access
  if (my_index < num_elements) {

     if (my_index%2 == 1) {
       // even threads write their index to their array entry
       input_array[my_index] = my_index;
    } else {
      // odd threads copy their value from the next array entry
       input_array[my_index] = input_array[my_index+1];
    }
  }
}

Perform the Analysis

The next step is to perform the analysis with Gklee. This part consists of three steps: compilation of test to LLVM bytecode, doing the pre analysis (dataflow [static] analysis) SESA, followed by the actual dynamic verification inside the Gklee virtual machine.

Here are the actual commands to run. Make sure that your working directory is set to: .../GkleeTests/read_write_race_0:

gklee-nvcc read_write_race_0.cpp
sesa < read_write_race_0 > read_write_race_0.sesa
gklee --symbolic-config --race-prune read_write_race_0.sesa &> test_results

For users that are using their own installation of Gklee or logged into AptLab, you may construct an image of the generated flow graph. Gklee outputs a dot graph that can be converted to some type of an image. To make this viewable, we'll add one additional step (for this you need to be logged into AptLab or have 'Graphviz' installed on your system):

dot -T pdf -o flowGraph.pdf flowsGraph.dot

If you are using Gklee as an AptLab guest, we have generated a flowGraph.pdf for you that is available here:

https://github.com/Geof23/GkleeTests/blob/master/read_write_race_0/flowsGraph.pdf

Examining the results

If everything goes as expected, Gklee will detect a global race in the example program! This race is what is known as a porting race, because it is one that can sometimes be revealed when running on a different CUDA architecture (for instance, if there's a difference in warp sizes).

The output that indicates the problem:


Instruction Line: 21, In File: read_write_race_0.cpp, With Dir Path: /
users/xxx/GkleeTests/read_write_race_0
[File: /users/xxx/GkleeTests/read_write_race_0/read_write_race_0.cpp,
Line: 21, Inst:       input_array[my_index] = my_index;]
  store i32 %6, i32* %arrayidx, align 4, !dbg !989
  <W: 50630576, 4:1>
  [GKLEE]: Thread 1 : { <0, 0, 0>, <0, 0, 0> } and Thread 2 : { <0, 0, 0
  >, <1, 0, 0> } incur the (Actual) read-write race

You may read more about this in the paper we referred to earlier (http://www.cs.utah.edu/formal_verification/papers/SC14-GKLEE-SESA.pdf). Briefly, a Gklee flow is a data structure that represents all the threads in a branch of a CUDA barrier interval. It includes a SAT expression that represents the conditions that have been encountered.

Also, here is a part of the generated flow graph. As you can see from Gklee's text output:

[GKLEE]: Within the current Barrier Interval, 3 flows are used to repr
esent all threads!

Gklee used 3 flows to analyze this example. Our flow graph gives a visual of how flows are generated, and what they do. You can access the link above, but here is a partial snapshot:

partial flowGraph snapshot

The flow graph shows you where flows are merged and spawned, what conditional branches they handle, and what instructions are executed under them.

Conclusion

In this tutorial, we explored the basic operation of Gklee using its flow mechanism, which is enabled by the static analysis step (SESA). The test that we explored contains a data race, which Gklee correctly identified. We also took a look at how Gklee uses a concept called flows to efficiently and scalably model GPU threads.

flows_tutorial - Geof23/Gklee GitHub Wiki

Gklee Tutorial

SESA (Symbolic Execution with Static Analysis) and Flows

Load AptLab Project

The example program, read_write_race_0

Perform the Analysis

Examining the results

Conclusion

⚠️ GitHub.com Fallback ⚠️

flows_tutorial - Geof23/Gklee GitHub Wiki

Gklee Tutorial

SESA (Symbolic Execution with Static Analysis) and Flows

Load AptLab Project

The example program, read_write_race_0

Perform the Analysis

Examining the results

Conclusion

⚠️ **GitHub.com Fallback** ⚠️

⚠️ GitHub.com Fallback ⚠️