Getting_Started:JavaScript_frontend - nimakarimipour/WALAWiki GitHub Wiki
We plan to improve this page with more details on how to get started with the JavaScript frontend. Feel free to make suggestions on the mailing list for what could be improved.
The JavaScript front end makes use of Rhino to parse JavaScript and create ASTs. The JavaScript front end consists of the projects below, along with the core CAst projects and the WALA projects.
* com.ibm.wala.cast.js (javadoc) (source)
  o The CAst-based JavaScript front end
* com.ibm.wala.cast.js.rhino
  o Parsing code specific to Rhino
* com.ibm.wala.cast.js.test (javadoc) (source)
  o Tests for the CAst-based JavaScript front end
* com.ibm.wala.cast.js.rhino.test
  o Test code specific to Rhino
For legal reasons, Rhino is not bundled with WALA: you will need to obtain version 1.7R3 of Rhino yourself and install it as js.jar in the lib/ directory of com.ibm.wala.cast.js.rhino. Running ant on com.ibm.wala.cast.js.rhino/build.xml should fetch the jar for you.
Similarly, run ant on com.ibm.wala.cast.js/build.xml to fetch the Jericho and nu.validator HTML parsers, which are also required dependencies.
The test data project also requires some steps to build. Once again for legal reasons, we do not include some test program source in the project, so you must obtain that separately. Run ant on com.ibm.wala.cast.js.test.data/build.xml (com.ibm.wala.cast.js.test/build.xml in 1.3.4) to fetch the additional test sources. Once you are set up, you should be able to run the JUnit tests by executing the com.ibm.wala.cast.js.rhino.test-JUnit launcher (see Run -> Run Configurations... -> JUnit in Eclipse).
You can use the following code to create a call graph for a stand-alone JavaScript file:
// use Rhino for parsing; change if you want to use a different parser
com.ibm.wala.cast.js.ipa.callgraph.JSCallGraphUtil.setTranslatorFactory(new CAstRhinoTranslatorFactory());
CallGraph CG = com.ibm.wala.cast.js.test.JSCallGraphBuilderUtil.makeScriptCG("dir", "file.js");
The call graph has a "fake" root node, and its successors are prologue.js, which contains models of built-in JS functions, and your JS files. For JavaScript executed in a browser, WALA additionally includes a preamble.js file modeling certain aspects of the DOM and its APIs. You can create a call graph for all the JavaScript loaded from some HTML file as follows:
com.ibm.wala.cast.js.ipa.callgraph.JSCallGraphUtil.setTranslatorFactory(new CAstRhinoTranslatorFactory());
CallGraph CG = com.ibm.wala.cast.js.test.JSCallGraphBuilderUtil.makeHTMLCG(url_of_html_file);
(You can get a URL for a local File f by calling f.toURI().toURL().)
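As a minimal sketch of the File-to-URL conversion mentioned above, using only the JDK: the class name LocalHtmlUrl, the helper toUrl, and the path "pages/my page.html" are hypothetical and chosen purely for illustration. The resulting URL is what you would pass as url_of_html_file.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

class LocalHtmlUrl {
    // Converts a local file path into a file: URL.
    static URL toUrl(String path) throws MalformedURLException {
        File f = new File(path);
        // Going through toURI() escapes characters such as spaces;
        // the deprecated File.toURL() does not.
        return f.toURI().toURL();
    }

    public static void main(String[] args) throws MalformedURLException {
        // Hypothetical path; the file does not need to exist for the conversion.
        URL url = toUrl("pages/my page.html");
        System.out.println(url);
    }
}
```

Relative paths are resolved against the current working directory, so the printed URL is always absolute.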
HTMLCGBuilder in the com.ibm.wala.cast.js.rhino.test project is a command-line driver for building a call graph for the JS code referenced from an HTML file. See the docs on the main method for command-line parameters. The code in HTMLCGBuilder.buildHTMLCG() may also be of interest for writing your own driver; in particular it shows how to enable the techniques described in the ECOOP'12 paper Correlation Tracking for Points-To Analysis of JavaScript.
For each JS file f, the call graph will contain a CGNode representing the top-level code in f, and functions declared within f are distinguished by having f as part of the WALA representation of the method name. You can find the top-level CGNode for your file as follows:
private CGNode getFunctionNode(CallGraph CG, String dir, String file) {
    TypeName type = TypeName.findOrCreate("L" + dir + "/" + file);
    if (CG != null) {
        Iterator<CGNode> iter = CG.iterator();
        CGNode node;
        while (iter.hasNext()) {
            node = iter.next();
            TypeName tempType = node.getMethod().getDeclaringClass().getName();
            if (tempType.equals(type)) {
                return node;
            }
        }
    }
    System.err.println("Can't find :" + dir + "/" + file);
    return null;
}
For further technical details on how WALA constructs call graphs for JavaScript, see CAst CallGraph Details.
Given a CallGraph CG, you can iterate over each method's IR instructions as follows:
for (CGNode node : CG) {
    // Get the IR of a CGNode
    IR ir = node.getIR();
    // Some nodes may have no IR; skip them
    if (ir == null) {
        continue;
    }
    // Get CFG from IR
    SSACFG cfg = ir.getControlFlowGraph();
    // Iterate over the Basic Blocks of CFG
    Iterator<ISSABasicBlock> cfgIt = cfg.iterator();
    while (cfgIt.hasNext()) {
        ISSABasicBlock ssaBb = cfgIt.next();
        // Iterate over SSA Instructions for a Basic Block
        Iterator<SSAInstruction> ssaIt = ssaBb.iterator();
        while (ssaIt.hasNext()) {
            SSAInstruction ssaInstr = ssaIt.next();
            // Print out the instruction
            System.out.println(ssaInstr);
        }
    }
}
You can also construct IRs for all methods without building a call graph. This technique is used, e.g., by the CorrelationFinder class. Here is some code, lifted from CorrelationFinder, for doing so, given a URL url for an HTML file (some imports have been elided):
// add in preamble.js for modeling of DOM APIs
JavaScriptLoader.addBootstrapFile(WebUtil.preamble);
Set<? extends SourceModule> script = WebUtil.extractScriptFromHTML(url);
SourceModule[] scripts = script.toArray(new SourceModule[script.size()]);
// translatorFactory is the translator factory to use, e.g. a CAstRhinoTranslatorFactory
WebPageLoaderFactory loaders = new WebPageLoaderFactory(translatorFactory);
CAstAnalysisScope scope = new CAstAnalysisScope(scripts, loaders, Collections.singleton(JavaScriptLoader.JS));
IClassHierarchy cha = ClassHierarchy.make(scope, loaders, JavaScriptLoader.JS);
// to bail out early in the case of parse errors
Util.checkForFrontEndErrors(cha);
IRFactory<IMethod> factory = AstIRFactory.makeDefaultFactory();
for (IClass klass : cha) {
    for (IMethod method : klass.getAllMethods()) {
        IR ir = factory.makeIR(method, Everywhere.EVERYWHERE, SSAOptions.defaultOptions());
        // do what is needed with the IR
    }
}
- Each object has an explicit prototype field which holds a pointer to its prototype. In the case of a chain of prototypes, the prototype object itself will have a prototype, and so on. The properties are initialized in the constructors of the different kinds of objects; this logic is generated in com.ibm.wala.cast.js.ipa.callgraph.JavaScriptConstructTargetSelector.
- In JavaScript, reads of properties follow the prototype chain but writes do not. Prototype chain lookup used to be modeled by generating a loop in the WALA IR, but this has changed in recent versions. Now, we generate a com.ibm.wala.cast.js.ssa.PrototypeLookup instruction for all property reads, and analyses are expected to model the prototype lookup semantics for this instruction (call graph construction already does so).
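The read/write asymmetry described above can be illustrated with a small toy model. This is only a sketch in plain Java and does not use any WALA API: the names PrototypeChainDemo and JsObject are hypothetical, and it deliberately ignores accessor properties (getters and setters). It shows the semantics an analysis would need to model for a prototype lookup: reads walk the chain of explicit prototype pointers, while writes always land on the receiver.

```java
import java.util.HashMap;
import java.util.Map;

class PrototypeChainDemo {
    /** Toy JavaScript-like object: own properties plus an explicit prototype pointer. */
    static final class JsObject {
        final Map<String, Object> own = new HashMap<>();
        final JsObject prototype; // null at the end of the chain

        JsObject(JsObject prototype) {
            this.prototype = prototype;
        }

        // Reads walk the prototype chain until the property is found.
        Object get(String name) {
            for (JsObject o = this; o != null; o = o.prototype) {
                if (o.own.containsKey(name)) {
                    return o.own.get(name);
                }
            }
            return null; // stands in for JavaScript's undefined
        }

        // Writes always go to the receiver itself, never to a prototype.
        void put(String name, Object value) {
            own.put(name, value);
        }
    }

    public static void main(String[] args) {
        JsObject base = new JsObject(null);
        base.put("greeting", "hello");
        JsObject derived = new JsObject(base);

        System.out.println(derived.get("greeting")); // prints "hello" (found via the prototype)
        derived.put("greeting", "hi");               // shadows the property; base is unchanged
        System.out.println(derived.get("greeting")); // prints "hi"
        System.out.println(base.get("greeting"));    // prints "hello"
    }
}
```

The loop in get corresponds to the lookup that older WALA versions expanded into an IR loop and that newer versions summarize in a single PrototypeLookup instruction.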