Introduction: Soot as a command line tool

Obtaining Soot

You can always download the latest release of Soot from the release page. Several downloads are offered there, but usually you only need the following:

  • soot-x.y.z.jar

Here we are using Soot 2.5.0. After downloading the file, we can give Soot a try:

$ java -cp soot-2.5.0.jar soot.Main

Soot version 2.5.0
Copyright (C) 1997-2010 Raja Vallee-Rai and others.
All rights reserved.
...

Bleeding-edge version (nightly build)

For the really brave among you, the Secure Software Engineering Group at Technische Universität Darmstadt provides a nightly build that is drawn directly from our GitHub repository. Usually the latest nightly build is the most stable version of Soot because we tend to test code before we commit it. However, this may not always be true. The Soot webpage tells you where to download the nightly builds.

Soot’s command line

Ok, so it seems to be working but what can we do with it now? Let’s have a look at the command line options:

$ java -cp soot-2.5.0.jar soot.Main --help

General Options:
  -coffi                       Use the good old Coffi front end for parsing 
                               Java bytecode (instead of using ASM). 
  -h -help                     Display help and exit 
  -pl -phase-list              Print list of available phases 
  -ph PHASE -phase-help PHASE  Print help for specified PHASE 
  -version                     Display version information and exit 
  -v -verbose                  Verbose mode 
 ...

The full list of command line options is always available in Soot’s online options documentation, and we encourage every Soot beginner to have a look at it.

Processing single files

In general, Soot processes a set of classes. These classes can come in one of three formats:

  • Java source code, i.e., .java files,
  • Java bytecode, i.e., .class files, and
  • Jimple source, i.e., .jimple files.

In case you don’t know yet, Jimple is Soot’s primary intermediate representation: a three-address code that is essentially a simplified version of Java requiring only around 15 different kinds of statements. You can instruct Soot to convert .java or .class files to .jimple files, or the other way around. You can even have Soot generate .jimple from .java, modify the .jimple with a normal text editor, and then convert your .jimple to .class, virtually hand-optimizing your program. But we are getting off-track here...
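The round trip described above could look roughly like the following sketch. It relies on Soot’s output-format option -f (J for Jimple; class files are the default output), which is not introduced in this text, and on the -cp, -pp, -src-prec and -d options explained later on this page. The wrapper name soot, the SOOT_RUN switch and the output directory name edited are made up for this illustration. By default the commands are only printed, so the snippet also works without Soot installed.

```shell
# Sketch only. Assumes soot-2.5.0.jar and a class A (A.class or A.java)
# in the current directory. SOOT_JAR, SOOT_RUN and soot() are illustrative
# helpers, not part of Soot.
SOOT_JAR="soot-2.5.0.jar"

# Print the Soot command line; execute it instead when SOOT_RUN=1.
soot() {
    if [ "${SOOT_RUN:-0}" = 1 ]; then
        java -cp "$SOOT_JAR" soot.Main "$@"
    else
        echo "java -cp $SOOT_JAR soot.Main $*"
    fi
}

# Step 1: class file or Java source -> Jimple (sootOutput/A.jimple).
soot -cp . -pp -f J A

# Step 2: after editing sootOutput/A.jimple by hand, Jimple -> class file.
# -src-prec jimple makes Soot prefer the .jimple file over A.class,
# and -d writes the result to a separate directory.
soot -cp sootOutput:. -pp -src-prec jimple -d edited A
```

Run the snippet with SOOT_RUN=1 set in the environment to actually invoke Soot.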

The principal way to have Soot process two classes A and B is just to add them to the command line, which makes them application classes:

$ ls *.java
A.java  B.java
$ java -cp soot-2.5.0.jar soot.Main A B
Soot started on Thu Aug 21 08:26:41 GMT-05:00 2014
Exception in thread "main" java.lang.RuntimeException: couldn't find class: A (is your soot-class-path set properly?)

Whoops, what went wrong there? Well, I omitted an important detail: Soot has its own classpath!

Soot’s classpath

Soot has its own classpath and will load files only from JAR files or directories on that path. By default, this path is empty, and therefore in the above example Soot does not “see” the classes A and B although they exist. So let’s just add the current directory “.”:

$ java -cp soot-2.5.0.jar soot.Main -cp . A B
Soot started on Thu Aug 21 08:32:13 GMT-05:00 2014
Exception in thread "main" java.lang.RuntimeException: couldn't find class: java.lang.Object (is your soot-class-path set properly?)

What’s wrong now? Apparently Soot was able to find A and B (at least it doesn’t complain about these any more) but now it’s missing java.lang.Object.

Why does Soot care about java.lang.Object anyway? In order to do anything meaningful with your program, Soot needs typing information. In particular, it needs to reconstruct types for local variables, and to do so it needs to know the complete type hierarchy of the classes you want to process.

Regarding the exception, there are three ways to resolve it:

  • add rt.jar to your classpath
  • add the -pp option, provided your CLASSPATH variable contains rt.jar or JAVA_HOME is set correctly
  • use the -allow-phantom-refs option (not recommended)

In the first option you add your JDK’s rt.jar to Soot’s classpath (not the JVM’s classpath!). This JAR file contains the class java.lang.Object:

$ java -cp soot-2.5.0.jar soot.Main -cp .:/home/user/ebodde/bin/sun-jdk1.6.0_05/jre/lib/rt.jar A B
Soot started on Thu Aug 21 08:42:09 GMT-05:00 2014
Transforming B...
Transforming A...
Writing to sootOutput/B.class
Writing to sootOutput/A.class
Soot finished on Thu Aug 21 08:42:12 GMT-05:00 2014
Soot has run for 0 min. 3 sec.

Eureka! This seems to have worked. Soot successfully processed the two .java files and placed the resulting .class files into the sootOutput folder. Note that in general, Soot will load all classes you name on the command line and all classes referenced by those classes.

Beware, though, a common mistake is the following:

$ java -cp soot-2.5.0.jar soot.Main -cp .:~/bin/sun-jdk1.6.0_05/jre/lib/rt.jar A B
Soot started on Thu Aug 21 08:43:43 GMT-05:00 2008
Exception in thread "main" java.lang.RuntimeException: couldn't find class: java.lang.Object (is your soot-class-path set properly?)

What went wrong? You used ~ because it points to your home directory, right? It does, but the shell expands ~ only at the beginning of a word. Here it appears in the middle of the argument, after “.:”, so the shell passes it on unchanged. Soot receives the raw ~ character and is currently unable to expand it into the path of your home directory. So always use absolute or relative paths in Soot’s classpath.
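You can watch the shell do this without involving Soot at all. The sketch below prints the classpath argument exactly as a program would receive it, once with ~ and once with $HOME; the helper name show_arg is made up for this illustration.

```shell
# Print one argument exactly as a program would receive it.
show_arg() {
    printf '%s\n' "$1"
}

# "~" is not at the start of the word, so it is passed through literally
# and Soot would look for a directory that is really named "~".
show_arg .:~/bin/sun-jdk1.6.0_05/jre/lib/rt.jar

# $HOME is an ordinary variable and is expanded anywhere in the word.
show_arg ".:$HOME/bin/sun-jdk1.6.0_05/jre/lib/rt.jar"
```

The first line of output still contains the ~ character, while the second contains the full path of your home directory, which is the form Soot needs.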

The second option is to use -pp:

$ java -cp soot-2.5.0.jar soot.Main -cp . -pp A B
Soot started on Thu Aug 21 08:47:42 GMT-05:00 2008
Transforming A...
Transforming B...
Writing to sootOutput/A.class
Writing to sootOutput/B.class
Soot finished on Thu Aug 21 08:47:46 GMT-05:00 2008
Soot has run for 0 min. 3 sec.

Wow, that was much easier than typing out this long classpath every time, wasn’t it? Exactly, and that’s why we added this option. -pp stands for “prepend path”, and it means that Soot automatically adds the following to its own classpath (in that order):

  • the contents of your current CLASSPATH variable,
  • ${JAVA_HOME}/lib/rt.jar, and
  • if you are in whole-program mode (i.e. the -w option is enabled; more to come) then it also adds ${JAVA_HOME}/lib/jce.jar
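Read literally, the list above amounts to a simple string concatenation. The following sketch models that description in shell; it is not Soot’s actual implementation. It assumes that the value given to -cp comes first (as the name “prepend” suggests), and the function name effective_soot_cp and the JDK location /opt/jdk are made up for this illustration.

```shell
# effective_soot_cp USER_CP [w]
# Prints the classpath that -pp is described as producing: the -cp value,
# then $CLASSPATH (if set), then rt.jar, and additionally jce.jar when the
# second argument is "w" (standing in for whole-program mode).
effective_soot_cp() {
    _cp="$1"
    if [ -n "${CLASSPATH:-}" ]; then
        _cp="$_cp:$CLASSPATH"
    fi
    _cp="$_cp:$JAVA_HOME/lib/rt.jar"
    if [ "${2:-}" = "w" ]; then
        _cp="$_cp:$JAVA_HOME/lib/jce.jar"
    fi
    echo "$_cp"
}

# Demo in a subshell with a hypothetical JDK location and no CLASSPATH.
# Prints: .:/opt/jdk/lib/rt.jar:/opt/jdk/lib/jce.jar
( JAVA_HOME=/opt/jdk; unset CLASSPATH; effective_soot_cp . w )
```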

The third way (not recommended) to make Soot sort of happy is the option -allow-phantom-refs:

$ java -cp soot-2.5.0.jar soot.Main -allow-phantom-refs -cp . A B
Soot started on Thu Aug 21 08:52:35 GMT-05:00 2008
Warning: java.lang.Short is a phantom class!
Warning: java.lang.Class is a phantom class!
Warning: java.lang.Character is a phantom class!
...
Transforming B...
Transforming A...
Writing to sootOutput/B.class
Writing to sootOutput/A.class
Soot finished on Thu Aug 21 08:52:37 GMT-05:00 2008
Soot has run for 0 min. 1 sec.

So what does that do? Basically this option tells Soot: “Well, I really don’t want to give you the classes you are missing (maybe because I just don’t have those classes) but please make a best effort even without them.” Soot creates a “phantom class” for each class that it cannot resolve and tells you about it. Note that this approach is very limited and in many cases does not lead to the results you need. Only use this option if you know what you are doing.

Important note for Windows users: Soot will treat drive letters correctly, but under Windows the separator between directory names in a path must be a backslash, not a forward slash.

Processing entire directories

You can also process entire directories or JAR files with Soot by using the -process-dir option:

$ java -cp soot-2.5.0.jar soot.Main -cp . -pp -process-dir .
Soot started on Thu Aug 21 09:01:12 GMT-05:00 2014
Transforming A...
Transforming B...
Writing to sootOutput/A.class
Writing to sootOutput/B.class
Soot finished on Thu Aug 21 09:01:15 GMT-05:00 2014
Soot has run for 0 min. 3 sec.

To process a JAR file, just use the same option but provide a path to a JAR instead of a directory. Nice, eh? Be careful, though: If you apply the very same command again to the very same folder you will run into a problem now:

$ java -cp soot-2.5.0.jar soot.Main -cp . -pp -process-dir .
Soot started on Thu Aug 21 09:02:29 GMT-05:00 2008
Exception in thread "main" java.lang.RuntimeException: Error: class A read in from a classfile in which sootOutput.A was expected.

What happened? As noted earlier, Soot places the generated .class files into the folder sootOutput, which resides in the current directory “.”. This time Soot therefore also processed the previously generated files, and it complained that a class named “A” resides at ./sootOutput/A and should therefore have the name sootOutput.A, i.e. be in the sootOutput package. So when using the -process-dir option, also use the -d option to redirect Soot’s output:

$ java -cp soot-2.5.0.jar soot.Main -cp . -pp -process-dir . -d /tmp/sootout
Soot started on Thu Aug 21 09:06:29 GMT-05:00 2008
Transforming A...
Transforming B...
Writing to /tmp/sootout/A.class
Writing to /tmp/sootout/B.class
Soot finished on Thu Aug 21 09:06:32 GMT-05:00 2008
Soot has run for 0 min. 2 sec.

This redirects Soot’s output to /tmp/sootout, which is not a sub-directory of the current directory. Voilà.
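If you script such runs, a small guard can catch this mistake before Soot starts. The sketch below checks whether an output directory lies inside the directory being processed. The function name out_inside_in is made up for this illustration, both directories must already exist, and the special case of processing the root directory “/” is not handled.

```shell
# out_inside_in IN_DIR OUT_DIR
# Returns 0 if OUT_DIR is IN_DIR itself or lies somewhere below it,
# 1 if it lies elsewhere, and 2 if one of the directories does not exist.
out_inside_in() {
    in_abs=$(cd "$1" && pwd -P) || return 2
    out_abs=$(cd "$2" && pwd -P) || return 2
    case "$out_abs/" in
        "$in_abs"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use in a wrapper script:
#   mkdir -p /tmp/sootout
#   if out_inside_in . /tmp/sootout; then
#       echo "choose a -d directory outside the processed directory" >&2
#   else
#       java -cp soot-2.5.0.jar soot.Main -cp . -pp -process-dir . -d /tmp/sootout
#   fi
```

The paths are resolved with pwd -P so that relative paths and symbolic links do not hide the nesting.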

Processing certain types of files (.class / .java / .jimple)

Assume you have a directory that contains both A.java and A.class and you invoke Soot as before. In this case Soot will load the definition of A from the file A.class. This may not always be what you want. The -src-prec option tells Soot which input type it should prefer over others. There are four options:

  1. c or class (default): favour class files as Soot source,
  2. only-class: use only class files as Soot source,
  3. J or jimple: favour Jimple files as Soot source, and
  4. java: favour Java files as Soot source.

So e.g. -src-prec java will load A.java in the above example.
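In a wrapper script it can be handy to reject a mistyped value before starting Soot. The following sketch accepts exactly the values listed above; the function name check_src_prec is made up for this illustration, and the final command is only printed rather than executed.

```shell
# check_src_prec VALUE
# Succeeds only for the -src-prec values listed in the text above.
check_src_prec() {
    case "$1" in
        c|class|only-class|J|jimple|java) return 0 ;;
        *) echo "unknown -src-prec value: $1" >&2; return 1 ;;
    esac
}

# Validate the value, then print the resulting Soot command line.
prec=java
if check_src_prec "$prec"; then
    echo "java -cp soot-2.5.0.jar soot.Main -cp . -pp -src-prec $prec A"
fi
```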

Application classes vs. library classes

Classes that Soot actually processes are called application classes. This is opposed to library classes, which Soot does not process but only uses for type resolution. Application classes are usually those explicitly stated on the command line or those classes that reside in a directory referred to via -process-dir.

When you use the -app option, however, Soot also processes all classes referenced by these classes. It will not, however, process any classes in the JDK, i.e. classes in one of the java.* and com.sun.* packages. If you wish to include those too, you have to use the special -i option, e.g. -i java. See the guide on command line options for this and other command line options.
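Put together, the two variants described above look as follows. This sketch only prints the command lines so that it works without Soot installed; the helper name soot_line is made up for this illustration, and soot-2.5.0.jar and class A are assumed to be in the current directory as in the earlier examples.

```shell
# Print a Soot command line with the classpath options used on this page.
soot_line() {
    echo "java -cp soot-2.5.0.jar soot.Main -cp . -pp $*"
}

# A and every class it references become application classes,
# except for classes from the JDK packages.
soot_line -app A

# Additionally treat classes from the java package as application classes.
soot_line -app -i java A
```

Paste a printed line into your shell to run it.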