Pysvf API - SVF-tools/Software-Security-Analysis GitHub Wiki

pysvf API Documentation

Global Variables

  • BIN_DIR: str
  • CURRENT_DIR: str
  • SVF_DIR: str
  • LLVM_DIR: str
  • EXTAPI_BC_PATH: str
  • Z3_DIR: str

Functions

run_tool

def run_tool(tool_name: str, args: List[str]) -> None
  • Description: Executes a specified tool with given arguments.

Classes

ICFGNode

Method Description
toString() Get the string representation of the ICFG node.
getId() Get the ID of the ICFG node.
getFun() Get the function that the ICFG node belongs to.
getBB() Get the basic block that the ICFG node belongs to.
getSVFStmts() Get the SVF statements associated with the ICFG node.
asFunEntry() Downcast to FunEntryICFGNode.
asFunExit() Downcast to FunExitICFGNode.
asCall() Downcast to CallICFGNode.
asRet() Downcast to RetICFGNode.
isFunEntry() Check if the ICFG node is a function entry node.
isFunExit() Check if the ICFG node is a function exit node.
isCall() Check if the ICFG node is a function call node.
isRet() Check if the ICFG node is a function return node.
getOutEdges() Get the out edges of the ICFG node.
getInEdges() Get the in edges of the ICFG node.
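
A common use of getOutEdges() together with ICFGEdge.getDstNode() is a worklist traversal of the ICFG. The sketch below relies only on getId(), getOutEdges() and getDstNode() as listed in the tables on this page. The helper name reachable_ids and the _Node / _Edge classes are hypothetical stand-ins that let the snippet run without pysvf installed; with pysvf you would pass a real ICFGNode instead.

```python
from collections import deque


def reachable_ids(entry):
    """Return the IDs of all nodes reachable from `entry`, in BFS order.

    Uses only getId(), getOutEdges() and getDstNode().
    """
    seen = {entry.getId()}
    order = []
    worklist = deque([entry])
    while worklist:
        node = worklist.popleft()
        order.append(node.getId())
        for edge in node.getOutEdges():
            succ = edge.getDstNode()
            if succ.getId() not in seen:
                seen.add(succ.getId())
                worklist.append(succ)
    return order


# --- Hypothetical stand-ins so the sketch runs without pysvf ---
class _Edge:
    def __init__(self, dst):
        self._dst = dst

    def getDstNode(self):
        return self._dst


class _Node:
    def __init__(self, node_id):
        self._id = node_id
        self._out = []

    def getId(self):
        return self._id

    def getOutEdges(self):
        return self._out

    def link(self, dst):  # demo-only helper, not a pysvf method
        self._out.append(_Edge(dst))


n0, n1, n2, n3 = (_Node(i) for i in range(4))
# n3 -> n1 forms a loop, so the `seen` set is needed for termination
for src, dst in [(n0, n1), (n0, n2), (n1, n3), (n2, n3), (n3, n1)]:
    src.link(dst)

bfs_order = reachable_ids(n0)
print(bfs_order)
```

Tracking visited nodes by ID rather than by object identity is a design choice made here for the sketch; it avoids depending on whether the bindings return the same Python wrapper object for the same underlying node.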

CallICFGNode

Method Description
getCaller() Get the caller function.
getCalledFunction() Get the called function.
getActualParms() Get the actual parameters of the call.
getArgument(idx) Get the argument of the call at the given index.
addActualParms(parm) Add an actual parameter to the call.
isVarArg() Check if the call is a vararg call.
isVirtualCall() Check if the call is a virtual call.
getRetICFGNode() Get the return node associated with this call.

RetICFGNode

Method Description
getActualRet() Get the actual return value.
addActualRet(ret) Add an actual return value.
getCallICFGNode() Get the call node associated with this return.

ICFGEdge

Method Description
toString() Get the string representation of the ICFG edge.
isCFGEdge() Check if the edge is a CFG edge.
isCallCFGEdge() Check if the edge is a call CFG edge.
isRetCFGEdge() Check if the edge is a return CFG edge.
isIntraCFGEdge() Check if the edge is an intra CFG edge.
getSrcNode() Get the source node of the edge.
getDstNode() Get the destination node of the edge.
asIntraCFGEdge() Downcast to IntraCFGEdge.
asCallCFGEdge() Downcast to CallCFGEdge.
asRetCFGEdge() Downcast to RetCFGEdge.

IntraCFGEdge

Method Description
getCondition() Returns the branch condition variable of the edge.
getSuccessorCondValue() Returns the actual condition value when the branch is executed.
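
The two accessors above are typically read together: the condition variable identifies which branch test guards the edge, and the successor value says which outcome of that test the edge corresponds to. The following is a minimal sketch under two assumptions that are not stated on this page: that getCondition() yields None for an unconditional edge, and that the condition is an SVFVar exposing getId(). The helper name branch_label and the _Var / _Edge classes are hypothetical stand-ins so the snippet runs without pysvf.

```python
def branch_label(edge):
    """Return (condition_var_id, successor_value) for a conditional
    intra-procedural edge, or None for any other edge."""
    if not edge.isIntraCFGEdge():
        return None
    intra = edge.asIntraCFGEdge()
    cond = intra.getCondition()
    if cond is None:  # assumption: unconditional edges carry no condition
        return None
    return cond.getId(), intra.getSuccessorCondValue()


# --- Hypothetical stand-ins so the sketch runs without pysvf ---
class _Var:
    def __init__(self, var_id):
        self._id = var_id

    def getId(self):
        return self._id


class _Edge:
    def __init__(self, intra, cond=None, value=None):
        self._intra, self._cond, self._value = intra, cond, value

    def isIntraCFGEdge(self):
        return self._intra

    def asIntraCFGEdge(self):
        return self

    def getCondition(self):
        return self._cond

    def getSuccessorCondValue(self):
        return self._value


true_branch = branch_label(_Edge(True, _Var(42), 1))  # conditional edge
fallthrough = branch_label(_Edge(True))               # no condition
call_edge = branch_label(_Edge(False))                # not an intra edge
```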

CallCFGEdge

Method Description
getCallPEs() Get the call parameter expressions.

RetCFGEdge

Method Description
getRetPE() Get the return parameter expression.

SVFVar

Method Description
getName() Get the name of the SVF variable.
getId() Get the ID of the SVF variable.
isPointer() Check if the SVF variable is a pointer.
isConstDataOrAggDataButNotNullPtr() Check if the SVF variable is const data or agg data but not a null pointer.
isIsolatedNode() Check if the SVF variable is an isolated node.
getValueName() Get the value name of the SVF variable.
getFunction() Get the function that the SVF variable belongs to.
ptrInUncalledFunction() Check if the pointer is in an uncalled function.
isConstDataOrAggData() Check if the SVF variable is const data or agg data.
toString() Get the string representation of the SVF variable.

SVFStmt

Method Description
toString() Get the string representation of the SVF statement.
getEdgeId() Get the ID of the SVF statement.
getICFGNode() Get the ICFG node that the SVF statement belongs to.
getValue() Get the value of the SVF statement.
getBB() Get the basic block that the SVF statement belongs to.
isAddrStmt() Check if the SVF statement is an address statement.
isCopyStmt() Check if the SVF statement is a copy statement.
isStoreStmt() Check if the SVF statement is a store statement.
isLoadStmt() Check if the SVF statement is a load statement.
isCallPE() Check if the SVF statement is a call PE.
isRetPE() Check if the SVF statement is a return PE.
isGepStmt() Check if the SVF statement is a GEP statement.
isPhiStmt() Check if the SVF statement is a phi statement.
isSelectStmt() Check if the SVF statement is a select statement.
isCmpStmt() Check if the SVF statement is a compare statement.
isBinaryOpStmt() Check if the SVF statement is a binary operation statement.
isUnaryOpStmt() Check if the SVF statement is a unary operation statement.
isBranchStmt() Check if the SVF statement is a branch statement.
asAddrStmt() Downcast the SVF statement to an address statement.
asCopyStmt() Downcast the SVF statement to a copy statement.
asStoreStmt() Downcast the SVF statement to a store statement.
asLoadStmt() Downcast the SVF statement to a load statement.
asCallPE() Downcast the SVF statement to a call PE.
asRetPE() Downcast the SVF statement to a return PE.
asGepStmt() Downcast the SVF statement to a GEP statement.
asPhiStmt() Downcast the SVF statement to a phi statement.
asSelectStmt() Downcast the SVF statement to a select statement.
asCmpStmt() Downcast the SVF statement to a compare statement.
asBinaryOpStmt() Downcast the SVF statement to a binary operation statement.
asUnaryOpStmt() Downcast the SVF statement to a unary operation statement.
asBranchStmt() Downcast the SVF statement to a branch statement.

AssignStmt

StoreStmt, LoadStmt, CopyStmt, AddrStmt, GepStmt, and RetPE inherit these two-operand accessors.

Method Description
getLHSVar() / getLHSVarID() Get the left-hand-side variable or its ID.
getRHSVar() / getRHSVarID() Get the right-hand-side variable or its ID.
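
The isXStmt() / asXStmt() pairs from the SVFStmt table combine naturally with these accessors: check the kind, downcast, then read the two operands. The sketch below is one way to do that for five of the AssignStmt subclasses; the helper name classify_assign and the _FakeStmt class are hypothetical and exist only so the snippet runs without pysvf.

```python
# Kinds handled by this sketch; each has isXStmt()/asXStmt() on SVFStmt.
_ASSIGN_KINDS = ("Addr", "Copy", "Store", "Load", "Gep")


def classify_assign(stmt):
    """Return (kind, lhs_id, rhs_id) for the kinds above, else None."""
    for kind in _ASSIGN_KINDS:
        if getattr(stmt, f"is{kind}Stmt")():
            sub = getattr(stmt, f"as{kind}Stmt")()  # downcast after the check
            return kind, sub.getLHSVarID(), sub.getRHSVarID()
    return None


# --- Hypothetical stand-in so the sketch runs without pysvf ---
class _FakeStmt:
    def __init__(self, kind, lhs, rhs):
        self.kind, self.lhs, self.rhs = kind, lhs, rhs

    def __getattr__(self, name):
        # Synthesizes isXStmt()/asXStmt() for the demo object only.
        if name.startswith("is") and name.endswith("Stmt"):
            return lambda: name[2:-4] == self.kind
        if name.startswith("as") and name.endswith("Stmt"):
            return lambda: self
        raise AttributeError(name)

    def getLHSVarID(self):
        return self.lhs

    def getRHSVarID(self):
        return self.rhs


store_info = classify_assign(_FakeStmt("Store", 5, 7))
binop_info = classify_assign(_FakeStmt("BinaryOp", 1, 2))
```

The table-driven loop is a convenience for the sketch; an explicit if / elif chain over isAddrStmt(), isCopyStmt() and so on is equivalent and may be easier to read in assignment code.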

MultiOpndStmt

PhiStmt, SelectStmt, CmpStmt, BinaryOPStmt, and CallPE inherit these result/operand accessors. Python uses Id in these method names (getResId, getOpVarId), not the C++ ID spelling.

Method Description
getRes() / getResId() Get the result variable or its ID.
getOpVar(index) / getOpVarId(index) Get the operand at position index, or its variable ID.
getOpndVars() Get all operand variables.
getOpVarNum() Get the number of operands.
__iter__ Iterate over operand variables.

CallPE

Call parameter edges model actual-to-formal parameter passing. The result is the callee formal parameter, and each operand is an actual argument at a call site.

Method Description
getRes() / getResId() Get the formal parameter variable or its ID.
getOpVar(index) / getOpVarId(index) Get the actual argument at position index, or its variable ID.
getOpCallICFGNode(index) Get the call-site ICFG node for the operand at position index.
getOpCallICFGNodes() Get all call-site ICFG nodes attached to this call parameter edge.
getFunEntryICFGNode() Get the callee function-entry ICFG node.

RetPE

Return parameter edges inherit AssignStmt: the RHS is the callee formal return value and the LHS is the caller-side actual return value.

Method Description
getLHSVar() / getLHSVarID() Get the caller-side actual return variable or its ID.
getRHSVar() / getRHSVarID() Get the callee formal return variable or its ID.
getCallSite() Get the call-site ICFG node.
getFunExitICFGNode() Get the callee function-exit ICFG node.

Additional Classes and Methods

CallGraphNode

Method Description
getFunction() Get the function of the call graph node.
getId() Get the ID of the call graph node.
getName() Get the name of the function.
getInEdges() Get incoming edges of this node.
getOutEdges() Get outgoing edges of this node.
isReachableFromProgEntry() Check if this function is reachable from program entry.
toString() Get string representation of this node.

CallGraphEdge

Method Description
getSrcNode() Get the source node of the call graph edge.
getDstNode() Get the destination node of the call graph edge.
getCallSiteID() Get the call site ID.
getSrcID() Get source node ID.
getDstID() Get destination node ID.
getDirectCalls() Get direct call ICFG nodes.
getIndirectCalls() Get indirect call ICFG nodes.
isDirectCallEdge() Check if this is a direct call edge.
isIndirectCallEdge() Check if this is an indirect call edge.
toString() Get string representation.
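
As an illustration of how CallGraphNode and CallGraphEdge fit together, the sketch below collects the names of a function's immediate callees, split by direct and indirect call edges. It uses only getOutEdges(), getDstNode(), getName(), isDirectCallEdge() and isIndirectCallEdge() from the two tables above. The helper name callees and the _CGNode / _CGEdge classes are hypothetical stand-ins so the snippet runs without pysvf.

```python
def callees(node):
    """Return (direct_names, indirect_names) for the callees of `node`."""
    direct, indirect = set(), set()
    for edge in node.getOutEdges():
        name = edge.getDstNode().getName()
        # The two checks are kept independent so the sketch makes no
        # assumption about whether they are mutually exclusive.
        if edge.isDirectCallEdge():
            direct.add(name)
        if edge.isIndirectCallEdge():
            indirect.add(name)
    return direct, indirect


# --- Hypothetical stand-ins so the sketch runs without pysvf ---
class _CGEdge:
    def __init__(self, dst, direct):
        self._dst, self._direct = dst, direct

    def getDstNode(self):
        return self._dst

    def isDirectCallEdge(self):
        return self._direct

    def isIndirectCallEdge(self):
        return not self._direct


class _CGNode:
    def __init__(self, name):
        self._name = name
        self.out = []

    def getName(self):
        return self._name

    def getOutEdges(self):
        return self.out


main, foo, bar = _CGNode("main"), _CGNode("foo"), _CGNode("bar")
main.out = [_CGEdge(foo, True), _CGEdge(bar, False), _CGEdge(foo, True)]
direct_callees, indirect_callees = callees(main)
```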

CallGraph

Method Description
getGNode(id: int) Get the call graph node by ID.
dump() Dump the call graph to console.
view() View the call graph (opens visualization).
getCallGraphNodeByFunObj(fun: "FunObjVar") Get the call graph node for the given function.
getCallGraphNodeById(id: int) Get the call graph node by ID.
getCallGraphNodeByName(name: str) Get the call graph node by function name.
getNodes() Get all nodes in this call graph.
isReachableBetweenFunctions(src: "SVFFunction", dst: "SVFFunction") Check if there's a path between two functions.

SVFIR

Method Description
getICFG() Get the ICFG of the SVFIR.
getCallSites() Get the call sites of the SVFIR.
getSVFVarNum() Get the number of SVFVars (PAG nodes).
getCallGraph() Get the call graph of the SVFIR.
getBaseObject(id: int) Get the base object with the given ID.
getGNode(id: int) Get the SVFVar with the given ID.
getGepObjVar(id: int, offset: int) Get the GEP object variable ID.
getNumOfFlattenElements(T: SVFType) Get the number of flattened elements.
getFlattenedElemIdx(T: SVFType, origId: int) Get the flattened element index.
getFunObjVar(fun_name: str) Get the function object variable with the given name.
getFunRet(fun: FunObjVar) Get the function return value.

ICFGWTOCycle

Method Description
head The head node of the cycle.
components The components of the cycle.

AbstractState

AbstractState is the per-ICFGNode value object: a map from variable IDs to AbstractValues plus an address-to-value map for memory. Sparse-aware GEP / load / store helpers that used to live here (and on the now-removed AbstractStateManager) have moved to AbstractInterpretation; see the next section.

Method Description
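state[var_id] / getVar(var_id) Get the AbstractValue associated with variable var_id. Here state is an AbstractState; the name as cannot be used for it in Python because as is a reserved word.
state[var_id] = val / setVar(var_id, val) Store val (an AbstractValue, IntervalValue, AddressValue or int) at variable var_id.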
store(addr, val) Store val at virtual address addr.
load(addr) Load the AbstractValue at virtual address addr.
inVarToValTable(var_id) Returns True if var_id has an interval value entry.
inVarToAddrsTable(var_id) Returns True if var_id has an address value entry.
inAddrToAddrsTable(id) Returns True if address id has an entry in the addr→addrs map.
getIDFromAddr(addr) Strip the virtual-memory mask off addr and return the underlying object ID.
joinWith(other) / meetWith(other) Join / meet this state with other (in-place).
widening(other) / narrowing(other) Return a new AbstractState that is the widening / narrowing of this state with other.
initObjVar(objVar) Initialize an object variable in the abstract state.
addToFreedAddrs(addr) / isFreedMem(addr) Track / query freed memory addresses.
printAbstractState() Print the abstract state to stdout.
clear() / bottom() / top() Reset / get bottom / get top.
equals(other) Structural equality.
clone() Return an independent copy.
static isVirtualMemAddress(val) / getVirtualMemAddress(idx) Address-mask helpers (same as on AddressValue).
static isNullMem(addr) / isBlackHoleObjAddr(addr) Test for special address values.
static isCmpBranchFeasible(svfir, cmpStmt, succ, as) / isSwitchBranchFeasible(svfir, var, succ, as) Branch-feasibility helpers used by the AE driver.

AbstractInterpretation

Wraps the per-ICFGNode trace and exposes the sparsity-aware GEP / load / store / def-use helpers that used to live on the (now-removed) AbstractStateManager. In Assignment-3 the student-defined AbstractExecution keeps a pointer to this instance under the field name svf_state_mgr.

Method Description
getAbsState(node) Return the AbstractState for node from the trace (creates if missing).
updateAbsState(node, state) Overwrite the trace entry for node with state.
getGepElementIndex(gep) Return the element index for a GepStmt.
getGepByteOffset(gep) Return the byte offset (IntervalValue) of a GepStmt.
getGepObjAddrs(pointer, offset) Return the GEP object addresses (AddressValue) for pointer (a ValVar) at offset.
loadValue(pointer, node) Load the AbstractValue pointed to by pointer at node.
storeValue(pointer, val, node) Store val through pointer at node.
getPointeeElement(var, node) Return the pointee element type for var (an ObjVar).
getAllocaInstByteSize(addr) Return the byte size of an alloca for an AddrStmt.
getTrace() Return the underlying dict-like trace (ICFGNode → AbstractState).
ai[node] / ai[node] = state Operator-style access into the trace (delegates to getAbsState / updateAbsState).
node in ai Convenience for "trace contains node".
analyse() / analyzeFromAllProgEntries() / runOnModule() Top-level driver entry points.

AbstractValue

Method Description
isInterval() / isAddr() Discriminator queries.
getInterval() Return the underlying IntervalValue.
getAddrs() Return the underlying AddressValue.
equals(other) Structural equality.
join_with(other) Joins the current value with another abstract value (AbstractValue, IntervalValue, or AddressValue).
meet_with(other) Meets the current value with another abstract value.
narrow_with(other) Narrows the current value with another abstract value.
widen_with(other) Widens the current value with another abstract value.
clone() Independent copy.

IntervalValue

Method Description
top() / bottom() (static) Return the top / bottom interval value.
isTop() / isBottom() Discriminators.
is_numeral() / is_zero() / is_int() / is_real() Numeric discriminators.
lb() / ub() Lower / upper bound (returns int).
getIntNumeral() / getRealNumeral() / getNumeral() Extract a concrete numeric value if the interval is a numeral.
containedWithin(other) / contain(other) / leq(other) / geq(other) Order queries.
join_with(other) / meet_with(other) / widen_with(other) / narrow_with(other) Lattice operations.
toString() Pretty-print.
Arithmetic / comparison operators (+, -, *, /, %, <, <=, ==, …) Pointwise interval arithmetic.
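
To make the lattice operations concrete, here is a small pure-Python model of interval join, meet and widening. It is a conceptual sketch only, not pysvf's implementation: the Interval class is hypothetical, it returns new values instead of updating in place, and it models bottom as None. The real IntervalValue may differ in these and other details.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interval:
    """Hypothetical stand-in for an interval [lo, hi]; not pysvf.IntervalValue."""
    lo: float
    hi: float

    def join(self, other: "Interval") -> "Interval":
        # Least upper bound: smallest interval covering both.
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))

    def meet(self, other: "Interval") -> Optional["Interval"]:
        # Greatest lower bound: the overlap, or None (bottom) if disjoint.
        lo, hi = max(self.lo, other.lo), min(self.hi, other.hi)
        return Interval(lo, hi) if lo <= hi else None

    def widen(self, other: "Interval") -> "Interval":
        # Standard interval widening: any bound that grew jumps to infinity.
        lo = self.lo if other.lo >= self.lo else -math.inf
        hi = self.hi if other.hi <= self.hi else math.inf
        return Interval(lo, hi)


a, b = Interval(0, 5), Interval(3, 10)
joined = a.join(b)                   # covers both inputs
met = a.meet(b)                      # overlap of the inputs
disjoint = Interval(0, 1).meet(Interval(2, 3))
widened = a.widen(Interval(0, 6))    # upper bound grew, so it is dropped
stable = a.widen(Interval(1, 4))     # nothing grew, so a is unchanged
```

Widening is what lets a loop analysis terminate: a bound that keeps increasing across iterations is pushed to infinity in one step rather than being enlarged indefinitely.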

AddressValue

A set of virtual addresses. Use this to represent the points-to set of a pointer in the abstract state.

Method Description
__iter__ / __len__ / __contains__(val) Iterate / count / membership.
insert(addr) Add an address; returns True if newly added.
contains(addr) / empty() / size() / isBottom() Inspection.
join_with(other) / meet_with(other) / hasIntersect(other) Set operations.
getVals() / setVals(vals) Direct access to the underlying set.
static getVirtualMemAddress(idx) / isVirtualMemAddress(val) Address-mask helpers.

Buffer-overflow helpers. Methods like addToGepObjOffsetFromBase, hasGepObjOffsetFromBase, getGepObjOffsetFromBase, updateGepObjOffsetFromBase, reportBufOverflow, handleMemcpy, getStrlen are not part of the pybind surface. They live on the AbstractExecutionHelper class in Assignment-3/Python/Assignment_3_Helper.py — see AE Python API for their signatures.

Worklist operations

In class AndersenPTA in Assignment_1_Helper.py

Members Meanings
isWorklistEmpty() Return True if the worklist is empty.
popFromWorklist() Remove a node identifier from the worklist and return it.
pushIntoWorklist(id: int) Push a node identifier onto the worklist.

Points-to Set Operations

The following operations can be used for every pointer analysis implementation (e.g., AndersenPTA).

In SVF, the points-to set of a pointer ptr, denoted pts(ptr), is the set of objects that ptr points to. Both ptr and the objects are represented by their identifiers (NodeID type).

Method Description
addPts(id: int, ptd: int) Add ptd into the points-to set of id; returns True if the points-to set of id is changed.
unionPts(id: int, ptd: int) Union the points-to set of ptd into that of id; returns True if the points-to set of id is changed after the union.
getPts(id: int) Return the points-to set of id.
dumpPts(id: int, pts: 'PointsTo') Print the points-to set pts of pointer id.
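
The return values of addPts and unionPts are what drive a worklist solver: a node is pushed again only when its points-to set actually changed. The sketch below models that contract with plain Python sets and propagates points-to sets along copy edges only. It is not the AndersenPTA class and not a full Andersen analysis: TinyPTA, propagate_copies and the copy_succs dictionary are hypothetical, and the FIFO worklist order is an assumption.

```python
from collections import defaultdict, deque


class TinyPTA:
    """Hypothetical dict-based model of the operations described above."""

    def __init__(self):
        self._pts = defaultdict(set)
        self._worklist = deque()

    def addPts(self, id, ptd):
        if ptd in self._pts[id]:
            return False
        self._pts[id].add(ptd)
        return True

    def unionPts(self, id, ptd):
        before = len(self._pts[id])
        self._pts[id] |= self._pts[ptd]
        return len(self._pts[id]) != before

    def getPts(self, id):
        return self._pts[id]

    def pushIntoWorklist(self, id):
        self._worklist.append(id)

    def popFromWorklist(self):
        return self._worklist.popleft()

    def isWorklistEmpty(self):
        return not self._worklist


def propagate_copies(pta, copy_succs):
    """Copy rule only: for each copy edge src -> dst, pts(dst) |= pts(src)."""
    while not pta.isWorklistEmpty():
        src = pta.popFromWorklist()
        for dst in copy_succs.get(src, ()):
            if pta.unionPts(dst, src):     # changed, so revisit dst
                pta.pushIntoWorklist(dst)


# p = &a; q = p; r = q   with IDs p=1, q=2, r=3, a=10
pta = TinyPTA()
if pta.addPts(1, 10):
    pta.pushIntoWorklist(1)
propagate_copies(pta, {1: [2], 2: [3]})
print(sorted(pta.getPts(3)))
```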

Alias Relations

Two pointers (SVFVars) are aliases if their points-to sets share at least one common object, as determined by a points-to analysis (e.g., AndersenPTA).

Python API Description
alias(node1_id, node2_id) Returns True if the two pointers node1_id and node2_id are aliases.
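
The definition above amounts to a set-intersection test. The following sketch states it over a plain dictionary of points-to sets; may_alias and the pts dictionary are hypothetical and are not the pysvf alias() binding, whose internals are not described on this page.

```python
def may_alias(pts, node1_id, node2_id):
    """True if the points-to sets of the two IDs share at least one object."""
    return bool(set(pts.get(node1_id, ())) & set(pts.get(node2_id, ())))


# Hypothetical points-to results: pointer ID -> set of object IDs
pts = {1: {10, 11}, 2: {11}, 3: {12}}

shares_object = may_alias(pts, 1, 2)   # both may point to object 11
no_overlap = may_alias(pts, 1, 3)      # disjoint points-to sets
```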

ConstraintGraph, ConstraintNode, ConstraintEdge

Method Description
consCG.getConstraintNode(node_id) Returns the ConstraintNode based on its identifier
consCG.hasEdge(src_node, dst_node, edge_kind) Returns True if the edge exists in the constraint graph, e.g., hasEdge(src, dst, ConstraintEdge.Copy)
node.getAddrInEdges() Returns all incoming address constraint edges (AddrCGEdge) of this node
node.getAddrOutEdges() Returns all outgoing address constraint edges (AddrCGEdge) of this node
node.getStoreInEdges() Returns all incoming store constraint edges (StoreCGEdge) of this node
node.getStoreOutEdges() Returns all outgoing store constraint edges (StoreCGEdge) of this node
node.getLoadInEdges() Returns all incoming load constraint edges (LoadCGEdge) of this node
node.getLoadOutEdges() Returns all outgoing load constraint edges (LoadCGEdge) of this node
node.getCopyInEdges() Returns all incoming copy constraint edges of this node
node.getCopyOutEdges() Returns all outgoing copy constraint edges of this node
node.getGepInEdges() Returns all incoming gep (field access) constraint edges of this node
node.getGepOutEdges() Returns all outgoing gep (field access) constraint edges of this node
edge.getSrcID() Returns the source node id of this edge
edge.getDstID() Returns the target node id of this edge
edge.getSrcNode() Returns the source node of this edge
edge.getDstNode() Returns the target node of this edge
ander_base.addCopyEdge(src_id, dst_id) Adds a copy constraint edge (CopyCGEdge) from a source to a target node; returns True if added successfully
consCG.addCopyCGEdge(src_id, dst_id) Adds a copy constraint edge (CopyCGEdge) from a source to a target node; returns True if added successfully
pysvf.NormalGepCGEdge A subclass of ConstraintEdge that represents a field access on a struct object. (Note: VariantGEPEdge, another subclass of GepCGEdge, models pointer arithmetic for field access in C. You don't need to handle VariantGEP in this assignment.)
edge.getConstantFieldIdx() Returns the field idx when accessing a struct field
consCG.getGepObjVar(obj_id, field_idx) Returns the field object given a field index

Iterating Constraint Nodes

To iterate through every constraint node on the graph:

# Iterate through all nodes in the constraint graph
for node in consCG.getNodes():
    # node is a ConstraintNode object
    # Process the node as needed
    pass

ConstraintGraph.getGepObjVar

  • Given an object o (identified by its node ID) and a NormalGepCGEdge gepEdge, the field object fldObj can be obtained with the following code sample:

    fldObj = consCG.getGepObjVar(o, gepEdge.getConstantFieldIdx())

ICFGNode.toString()

  • Get the string representation of the ICFG node

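Output Sample (the text below is the repr of a bound method object, which is what Python prints when a method is referenced without being called, i.e. without parentheses):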

<bound method PyCapsule.to_string of <pysvf.pysvf.FunExitICFGNode object at 0x7f4cd2f68a30>>

Downcasting to a Specific Subclass

When you need to downcast an object to a specific subclass, first check that the object is an instance of that subclass. The following example demonstrates this (assume that edge is an ICFGEdge):

Example in Python:

if edge.isCallCFGEdge():
    callEdge = edge.asCallCFGEdge()