Week 5 Extracting Code Knowledge with SootUp

In this week’s lab you will:

  1. Find SootUp's coordinates on Maven Central and add it to a project with Maven
  2. Set up SootUp in a Java project
  3. Use SootUp to create a view, retrieve classes and methods, and get a CFG (control flow graph)

Tutorials are one hour. Prompt design and API integration are covered in Weeks 6–7; this week focuses on extraction.

Prerequisites

  1. Java JDK 8+ and Maven (e.g. 3.6+)
  2. A Java project (e.g. Defects4J checkout or a small Maven project) with compiled classes or source
  3. (Optional) IntelliJ or another IDE for editing

Outline (1 hour)

Part  Activity                                 Time (guide)
1     Code instrumentation                     ~15 min
2     Maven + SootUp setup                     ~20 min
3     SootUp basics: view, class, method, CFG  ~25 min

Activity 1: Code instrumentation (~15 min)

Task 1.1: Understanding code instrumentation

Code instrumentation is the process of inserting additional code (called probes or hooks) into a program to observe its runtime behavior without changing its core logic. Coverage tools like JaCoCo and EvoSuite use instrumentation to track which lines, branches, or methods are executed during test runs.

Why it matters: When EvoSuite generates tests, it instruments the bytecode to measure coverage. Understanding instrumentation helps you see how coverage tools work “under the hood.”

Hands-on: Manual instrumentation example

  1. Create a simple Java class (e.g. Calculator.java):
    public class Calculator {
        public int add(int a, int b) {
            return a + b;
        }
           
        public int divide(int a, int b) {
            if (b == 0) {
                throw new IllegalArgumentException("Cannot divide by zero");
            }
            return a / b;
        }
    }
    
  2. Manually instrument it by adding simple print-based instrumentation:
    public class Calculator {
        public int add(int a, int b) {
            System.out.println("[Instrumentation] Enter add(" + a + ", " + b + ")");
            int result = a + b;
            System.out.println("[Instrumentation] add result = " + result);
            return result;
        }
           
        public int divide(int a, int b) {
            System.out.println("[Instrumentation] Enter divide(" + a + ", " + b + ")");
            if (b == 0) {
                System.out.println("[Instrumentation] divide: b == 0, throwing exception");
                throw new IllegalArgumentException("Cannot divide by zero");
            }
            System.out.println("[Instrumentation] divide: normal path, computing a / b");
            int result = a / b;
            System.out.println("[Instrumentation] divide result = " + result);
            return result;
        }
    }
    
  3. Write a test that calls add():
    import org.junit.Test;
    import static org.junit.Assert.assertEquals;
       
    public class CalculatorTest {
        @Test
        public void testAdd() {
            Calculator calc = new Calculator();
            assertEquals(5, calc.add(2, 3));
        }
    }
    
  4. Run the test and observe which instrumentation messages are printed to the console. Notice that messages from divide() do not appear yet, because no test calls it.
  • What happens if you add a test for divide() with b = 0? Which lines get covered?
  • How does this manual approach compare to what JaCoCo or EvoSuite does automatically?

Real tools (like JaCoCo) instrument bytecode automatically using libraries like ASM. They insert probes at the bytecode level, so you don’t need to modify source code. EvoSuite instruments classes to measure coverage while generating tests.
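To make the idea of a probe more concrete, here is a minimal hand-written sketch that swaps the print statements for a boolean array with one flag per probe location. The class name ProbedCalculator, the probe numbering, and the helper methods are all made up for this illustration; real tools generate something comparable at the bytecode level rather than in source, and their internals differ in detail.

```java
import java.util.Arrays;

/** Hand-written stand-in for bytecode-level probes (illustration only). */
class ProbedCalculator {
    // One flag per probe location; the numbering is an arbitrary choice for this sketch.
    static final String[] PROBE_NAMES = {
        "add: entry",
        "divide: entry",
        "divide: b == 0 branch",
        "divide: normal path"
    };
    static final boolean[] probes = new boolean[PROBE_NAMES.length];

    static int add(int a, int b) {
        probes[0] = true;
        return a + b;
    }

    static int divide(int a, int b) {
        probes[1] = true;
        if (b == 0) {
            probes[2] = true;
            throw new IllegalArgumentException("Cannot divide by zero");
        }
        probes[3] = true;
        return a / b;
    }

    /** Number of probes that have been hit so far. */
    static int coveredCount() {
        int hit = 0;
        for (boolean p : probes) {
            if (p) {
                hit++;
            }
        }
        return hit;
    }

    /** Clears all probes, e.g. between test runs. */
    static void reset() {
        Arrays.fill(probes, false);
    }

    public static void main(String[] args) {
        add(2, 3); // mirrors testAdd(): only the add probe fires
        for (int i = 0; i < probes.length; i++) {
            System.out.println((probes[i] ? "[x] " : "[ ] ") + PROBE_NAMES[i]);
        }
        System.out.println("Covered " + coveredCount() + " of " + probes.length + " probes");
    }
}
```

Unlike the print-based version, the flags can be read after the run to compute a coverage ratio, which is closer to what a coverage report shows.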


Activity 2: Maven and SootUp setup (~20 min)

Task 2.1: Maven and Maven Central

Maven manages dependencies and the build. Maven Central (searchable at search.maven.org, or through third-party indexes such as mvnrepository.com) is where you find the coordinates (groupId, artifactId, version) for libraries.
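As a small illustration of what the three coordinates identify, the sketch below parses a groupId:artifactId:version string and derives the relative path of the JAR in the standard Maven repository layout (the same layout used under ~/.m2/repository). The class MavenCoordinate and its methods are hypothetical helpers written for this handout, not part of Maven or SootUp.

```java
/** Hypothetical helper: splits a Maven coordinate and maps it to the repository layout. */
class MavenCoordinate {
    final String groupId;
    final String artifactId;
    final String version;

    MavenCoordinate(String groupId, String artifactId, String version) {
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.version = version;
    }

    /** Parses a string of the form "groupId:artifactId:version". */
    static MavenCoordinate parse(String gav) {
        String[] parts = gav.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected groupId:artifactId:version but got: " + gav);
        }
        return new MavenCoordinate(parts[0], parts[1], parts[2]);
    }

    /** Relative JAR path: dots in the groupId become directories, then artifactId/version/file. */
    String jarPath() {
        return groupId.replace('.', '/') + "/" + artifactId + "/" + version
                + "/" + artifactId + "-" + version + ".jar";
    }

    public static void main(String[] args) {
        MavenCoordinate core = MavenCoordinate.parse("org.soot-oss:sootup.core:2.0.0");
        System.out.println(core.jarPath());
        // prints org/soot-oss/sootup.core/2.0.0/sootup.core-2.0.0.jar
    }
}
```

Note that only the dots in the groupId turn into directory separators; the dots in the artifactId (sootup.core) stay as they are.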

  1. Create or open a Maven project with a pom.xml.
  2. Add the SootUp dependencies inside the <dependencies> element of pom.xml (adjust the version to match your course):
<dependency>
    <groupId>org.soot-oss</groupId>
    <artifactId>sootup.core</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>org.soot-oss</groupId>
    <artifactId>sootup.java.core</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>org.soot-oss</groupId>
    <artifactId>sootup.java.sourcecode.frontend</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>org.soot-oss</groupId>
    <artifactId>sootup.java.bytecode.frontend</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>org.soot-oss</groupId>
    <artifactId>sootup.jimple.frontend</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>org.soot-oss</groupId>
    <artifactId>sootup.apk.frontend</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>org.soot-oss</groupId>
    <artifactId>sootup.callgraph</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>org.soot-oss</groupId>
    <artifactId>sootup.analysis.intraprocedural</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>org.soot-oss</groupId>
    <artifactId>sootup.analysis.interprocedural</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>org.soot-oss</groupId>
    <artifactId>sootup.qilin</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>org.soot-oss</groupId>
    <artifactId>sootup.codepropertygraph</artifactId>
    <version>2.0.0</version>
</dependency>
  3. Run mvn dependency:resolve (or mvn compile) to download dependencies.

Use the version required by your course if different from 2.0.0. See SootUp Getting Started.

Task 2.2: Verify SootUp

Create a simple Java class to verify that SootUp is correctly loaded. If the following code compiles and runs without errors, the setup is successful.

import sootup.core.inputlocation.AnalysisInputLocation;
import sootup.core.model.SourceType;
import sootup.java.bytecode.frontend.inputlocation.PathBasedAnalysisInputLocation;
import sootup.java.core.views.JavaView;

import java.nio.file.Paths;

public class VerifySootUp {
    public static void main(String[] args) {
        AnalysisInputLocation inputLocation =
                PathBasedAnalysisInputLocation.create(Paths.get("target/classes"), SourceType.Application);
        JavaView view = new JavaView(inputLocation);
        System.out.println("SootUp loaded successfully! Classes available: " + view.getClasses().count());
    }
}

If you see SootUp loaded successfully! with a class count, the dependencies are working correctly.
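If the reported class count is 0, a likely cause is that target/classes is empty or missing because the project has not been compiled yet. The following sketch uses only the Java standard library (no SootUp) to count the .class files in a directory, so you can separate a build problem from a SootUp problem. The class name ClassesDirCheck and the default path are assumptions for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

/** Hypothetical troubleshooting helper: counts compiled classes without using SootUp. */
class ClassesDirCheck {

    /** Returns the number of .class files under dir, or 0 if dir is not a directory. */
    static long countClassFiles(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return 0;
        }
        // Files.walk visits the directory tree recursively; close the stream when done.
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".class"))
                    .count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the same directory the verification class points at.
        Path dir = Paths.get(args.length > 0 ? args[0] : "target/classes");
        System.out.println(dir.toAbsolutePath() + " contains " + countClassFiles(dir) + " .class file(s)");
    }
}
```

If this prints 0, run mvn compile first and then rerun VerifySootUp.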


Activity 3: SootUp basics: view, class, method, CFG (~25 min)

Task 3.1: Create a view

A view is the entry point for analysing classes in SootUp. You create one by specifying an input location that points to compiled bytecode (or source).

In SootUp 2.0.0, use PathBasedAnalysisInputLocation to point to a directory of .class files (e.g. target/classes), then create a JavaView from it.

import sootup.core.inputlocation.AnalysisInputLocation;
import sootup.core.model.SourceType;
import sootup.java.bytecode.frontend.inputlocation.PathBasedAnalysisInputLocation;
import sootup.java.core.JavaSootClass;
import sootup.java.core.views.JavaView;
import sootup.core.types.ClassType;

import java.nio.file.Paths;
import java.util.Optional;

public class SootUpDemo {

    public static void main(String[] args) {

        // ========== Step 1: Load bytecode and get the class ==========
        String classesDir = "YourProjectPath/target/classes";  // path to compiled .class files
        AnalysisInputLocation inputLocation =
                PathBasedAnalysisInputLocation.create(Paths.get(classesDir), SourceType.Application);
        JavaView view = new JavaView(inputLocation);

        // Get a class by its fully qualified name (FQN)
        ClassType classType = view.getIdentifierFactory().getClassType("your Class FQN Here"); // e.g., org.example.Calculator
        Optional<JavaSootClass> optionalClass = view.getClass(classType);

        if (!optionalClass.isPresent()) {
            System.out.println("Class not found: " + classType);
            return;
        }

        JavaSootClass sootClass = optionalClass.get();
        System.out.println("Successfully loaded class: " + sootClass.getName());
    }
}

Note that view.getClass() does not accept a raw string. You must first create a ClassType object via view.getIdentifierFactory().getClassType("package.ClassName"), then pass it to view.getClass(classType).
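When the program prints "Class not found", it helps to check that the FQN you passed matches a file on disk. For classes compiled by javac, an FQN such as org.example.Calculator corresponds to org/example/Calculator.class below the classes directory. The sketch below performs that mapping with the standard library only; FqnToPath and classFileFor are hypothetical names chosen for this example.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: maps a fully qualified class name to its expected .class file. */
class FqnToPath {

    /** org.example.Calculator becomes <classesDir>/org/example/Calculator.class */
    static Path classFileFor(Path classesDir, String fqn) {
        return classesDir.resolve(fqn.replace('.', '/') + ".class");
    }

    public static void main(String[] args) {
        Path classesDir = Paths.get("target/classes");
        String fqn = "org.example.Calculator"; // example FQN; replace with your own class
        Path expected = classFileFor(classesDir, fqn);
        System.out.println(expected + (Files.exists(expected) ? " exists" : " is missing"));
    }
}
```

If the file is missing, the usual suspects are a typo in the package name or a classes directory that points at the wrong project.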

Task 3.2: List all methods with name, params, return type

  1. From the view, get a class by its FQN (e.g. tutorial.Calculator) using the two-step process shown above: first create a ClassType, then call view.getClass(classType).
  2. From the class, get methods via sootClass.getMethods().
  3. For each method, print name, parameter types, return type, and its FQN (format: package.ClassName.methodName(paramType1,paramType2)).
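        // ========== Step 2: List all methods with name, params, return type ==========
        // Needs these imports at the top of the file: java.util.Set, sootup.core.model.SootMethod,
        // sootup.core.signatures.MethodSignature, sootup.core.types.Type
        System.out.println("========== Methods in " + sootClass.getName() + " ==========");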
        Set<? extends SootMethod> methods = sootClass.getMethods();

        for (SootMethod method : methods) {
            MethodSignature sig = method.getSignature();
            String methodName = sig.getName();
            Type returnType = sig.getType();
            java.util.List<Type> paramTypes = sig.getParameterTypes();

            String paramStr = paramTypes.stream()
                    .map(Type::toString)
                    .collect(java.util.stream.Collectors.joining(","));
            String fqn = sig.getDeclClassType() + "." + methodName + "(" + paramStr + ")";

            System.out.println("  Method:      " + methodName);
            System.out.println("  Return type: " + returnType);
            System.out.println("  Parameters:  " + paramTypes);
            System.out.println("  FQN:         " + fqn);
            System.out.println("  ---");
        }

Why is the signature (name + parameter types) important for overloaded methods?
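As a hint for this question, the sketch below uses plain Java reflection (not SootUp) on a small class with two overloads of add. The method name alone is ambiguous; adding the parameter types makes each entry unique. OverloadDemo and Adder are hypothetical classes written for this illustration.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

/** Illustration with java.lang.reflect: overloads share a name but not a signature. */
class OverloadDemo {

    /** Two methods named "add"; only the parameter types tell them apart. */
    static class Adder {
        public int add(int a, int b) {
            return a + b;
        }

        public double add(double a, double b) {
            return a + b;
        }
    }

    /** Builds "name(paramType1,paramType2)" for every non-synthetic method declared on the class. */
    static Set<String> signatures(Class<?> type) {
        Set<String> result = new TreeSet<>(); // sorted, so the output order is stable
        for (Method m : type.getDeclaredMethods()) {
            if (m.isSynthetic()) {
                continue; // skip compiler- or tool-generated methods
            }
            String params = Arrays.stream(m.getParameterTypes())
                    .map(Class::getTypeName)
                    .collect(Collectors.joining(","));
            result.add(m.getName() + "(" + params + ")");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(signatures(Adder.class));
        // prints [add(double,double), add(int,int)]
    }
}
```

A lookup keyed by name alone would collapse these two methods into one entry, which is why the name + parameter-type format from step 3 is used as the identifier.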

Task 3.3: Retrieve Jimple code representation

Jimple is SootUp’s typed three-address intermediate representation. Each statement has at most three operands, making the code easier to analyse than raw bytecode. Viewing the Jimple output helps you understand how Java source is lowered before analysis.
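To get a feel for three-address form before looking at real Jimple, the sketch below writes the same computation twice in ordinary Java: once as a nested expression and once with one operator per statement and explicit temporaries. This is an analogy only; it is not Jimple syntax, and the names t1, t2, t3 are invented for the example.

```java
/** Analogy in plain Java: a nested expression versus a three-address style rewrite. */
class ThreeAddressDemo {

    /** Ordinary Java: a single statement containing a nested expression. */
    static int direct(int a, int b) {
        return (a + b) * (a - b);
    }

    /** Same computation with one operator per statement, as in three-address code. */
    static int lowered(int a, int b) {
        int t1 = a + b;   // first sub-expression
        int t2 = a - b;   // second sub-expression
        int t3 = t1 * t2; // combine the two temporaries
        return t3;
    }

    public static void main(String[] args) {
        System.out.println(direct(5, 3));  // prints 16
        System.out.println(lowered(5, 3)); // prints 16
    }
}
```

Because every statement does one small thing, an analysis can reason about each step in isolation, which is the property the Jimple output relies on.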

For a method that has a body, call method.getBody() to retrieve its Jimple representation:

        // ========== Step 3: Jimple code representation for the "divide" method ==========
        System.out.println("========== Jimple Code for 'divide' ==========");
        for (SootMethod method : methods) {
            if (method.getSignature().getName().equals("divide")) {
                if (method.hasBody()) {
                    System.out.println(method.getBody());
                } else {
                    System.out.println("No body available for this method.");
                }
                break;
            }
        }

Task 3.4: Retrieve CFG for one method

For a method that has a body, use CfgCreator to build a control flow graph and export it in DOT format. The divide method is a good example because its if (b == 0) branch produces visible cfg_true / cfg_false edges.

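        // ========== Step 4: Retrieve CFG for the "divide" method ==========
        // CfgCreator and PropertyGraph come from the sootup.codepropertygraph dependency added in Task 2.1.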
        System.out.println("========== CFG for 'divide' ==========");
        for (SootMethod method : methods) {
            if (method.getSignature().getName().equals("divide")) {
                CfgCreator cfgCreator = new CfgCreator();
                PropertyGraph cfg = cfgCreator.createGraph(method);
                String dotGraph = cfg.toDotGraph();

                System.out.println(dotGraph);
                break;
            }
        }
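If you have not seen DOT before, the following sketch builds a three-node graph for divide by hand and prints it in DOT syntax using only the standard library. It is not SootUp output: the node texts, node ids, and the TinyCfgDot class are made up for illustration, and the graph SootUp produces will contain Jimple statements and more detail. Only the cfg_true / cfg_false edge labels are taken from the task description above.

```java
import java.util.Arrays;
import java.util.List;

/** Hand-built illustration of the DOT format; not generated by SootUp. */
class TinyCfgDot {

    /** A directed, labelled edge between two node indices. */
    static final class Edge {
        final int from;
        final int to;
        final String label;

        Edge(int from, int to, String label) {
            this.from = from;
            this.to = to;
            this.label = label;
        }
    }

    /** Renders nodes n0..nk and the given edges as a DOT digraph. */
    static String toDot(String graphName, List<String> nodes, List<Edge> edges) {
        StringBuilder sb = new StringBuilder();
        sb.append("digraph ").append(graphName).append(" {\n");
        for (int i = 0; i < nodes.size(); i++) {
            // Escape double quotes so the label stays a valid DOT string.
            String label = nodes.get(i).replace("\"", "\\\"");
            sb.append("  n").append(i).append(" [label=\"").append(label).append("\"];\n");
        }
        for (Edge e : edges) {
            sb.append("  n").append(e.from).append(" -> n").append(e.to)
              .append(" [label=\"").append(e.label).append("\"];\n");
        }
        return sb.append("}\n").toString();
    }

    /** Simplified CFG of divide: one branch node with a true and a false successor. */
    static String divideCfg() {
        List<String> nodes = Arrays.asList(
                "if b == 0",
                "throw IllegalArgumentException",
                "return a / b");
        List<Edge> edges = Arrays.asList(
                new Edge(0, 1, "cfg_true"),
                new Edge(0, 2, "cfg_false"));
        return toDot("divide", nodes, edges);
    }

    public static void main(String[] args) {
        System.out.print(divideCfg());
    }
}
```

Any DOT text, whether from this sketch or from cfg.toDotGraph(), can be saved to a file such as cfg.dot and rendered with Graphviz, for example dot -Tpng cfg.dot -o cfg.png.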

Learning objectives

By the end of this tutorial you should be able to:

  • Add SootUp via Maven and create a view from source or bytecode
  • Retrieve a class by FQN and list methods with signatures
  • Retrieve the Jimple intermediate representation for a method
  • Retrieve the CFG for a method
