Accessor chains for type error messages - ntrrgc/jsvm GitHub Wiki
It's always desirable for error messages to have as many helpful details as possible. One area where JSVM does a bit of extra work to help developers is the type errors produced when interacting with JS objects from Java. This is done with data structures called accessor chains.
Given some code like this:
jsvm.getGlobalScope().get("pokemon").asObject().get("weight").asDouble()
These accessor chains convert errors like this:
me.ntrrgc.jsvm.JSTypeError: number was expected but undefined was found.
Into something more specific, like this:
me.ntrrgc.jsvm.JSTypeError: In pokemon.weight, number was expected but undefined was found.
This is especially helpful when inferring the failing access from the stack trace is tedious, ambiguous or unreliable.
It may be helpful to point out some aspects of accessor chains:
- They are present in every `JSObject`, and in every `JSValue` that has not been created by the user (e.g. with `JSValue.aString()`).
- They involve only one additional object creation per emission of `JSValue` (getting a `JSObject` without accessor chains needs 3 object creations). No strings are copied; instead, the strings used to access properties are left to live a bit longer. All the algorithms involved in their manipulation, until the time an error is thrown, revolve around simple O(1) pointer assignments.
- There is a maximum chain length of 8 accessors.
- They are implemented completely in Java code, with no additional help from the Duktape engine or any JNI tricks.
The following diagram shows the classes responsible for the maintenance of accessor chains.
An accessor chain is essentially a singly linked list where each node describes a step toward obtaining the `JSValue` the user received. Each step can be a property access (`PropertyAccessor` if the property is a string or `IndexAccessor` if the property is a number), a function call (`CallAccessor`) or an object construction with the `new` operator (`CallNewAccessor`). Each node has a `parentChain` reference that points to the chain this access is based on.
The root node of the chain is a different type of node: it extends the `AccessorChainRoot` base class, which has three subclasses depending on where the original object that the accessors operate on came from:
- `jsvm.getGlobalScope()` returns a `JSObject` whose accessor chain is a single `GlobalScopeChainRoot`.
- `jsvm.evaluate()` returns a `JSValue` whose accessor chain is a single `EvalExpressionChainRoot`. If the code evaluated is short enough (less than 64 Java characters), its string is referenced from the node. When an error is thrown, this node is resolved with the code evaluated, provided that it is found to be a simple expression (no newlines or semicolons). If the code was too long or not simple enough to fit comfortably in an error message, `[eval expression]` is shown instead.
- `jsvm.newObject()` and `jsvm.newObjectWithProto()` return a `JSObject` whose accessor chain is a single `ClassChainRoot` initialized with the guessed class name of the new object. This is extracted from reading `.constructor.name` in the new object. If the name could not be read, `[unknown object]` is used instead. Error objects produced by JS exceptions thrown with `JSError` also use `ClassChainRoot`.
These bullets cover all the cases of `JSObject` creation in JSVM.
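The rule for labelling evaluated code in error messages can be sketched as follows. This is only an illustration: `EvalLabel`, `describe` and the constant names are hypothetical, and the sketch folds the two steps described above (keeping a reference to short code when the root is created, and checking for a simple expression when the error is thrown) into a single function.

```java
// Hypothetical helper, not part of JSVM: decides how evaluated code is shown
// at the root of an accessor path.
final class EvalLabel {
    // Code must be shorter than this many Java chars to be shown verbatim.
    static final int MAX_CODE_LENGTH = 64;
    static final String PLACEHOLDER = "[eval expression]";

    static String describe(String code) {
        boolean shortEnough = code.length() < MAX_CODE_LENGTH;
        // "Simple expression" here means: no newlines and no semicolons.
        boolean simple = code.indexOf('\n') < 0 && code.indexOf(';') < 0;
        return (shortEnough && simple) ? code : PLACEHOLDER;
    }
}
```

Under these assumptions, `EvalLabel.describe("pokemon.weight")` returns the code unchanged, while `EvalLabel.describe("var a = 1; a")` returns the placeholder because of the semicolon.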
The following object diagram shows an example case where three chains are created (`global`, `global.pokemon` and `global.pokemon.weight`). Note that they always reuse the nodes from the parent chain, so each new access always creates only one object.
Also, each access preserves the parent chain losslessly, thanks to the nature of singly linked lists and the fact that every class used to implement property accessors is completely immutable: all their members are final and all their methods are pure.
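The node sharing described above can be illustrated with a minimal sketch. `ChainNode` is a hypothetical, simplified stand-in for the real accessor classes: it stores a plain label instead of distinguishing property, index and call accessors, and only aims to show that extending a chain allocates one immutable node and leaves the parent untouched.

```java
// Simplified stand-in for an accessor chain node; not the actual JSVM class.
final class ChainNode {
    final ChainNode parentChain; // null only for the root node
    final String label;          // e.g. "global", "pokemon", "weight"

    ChainNode(ChainNode parentChain, String label) {
        this.parentChain = parentChain;
        this.label = label;
    }

    // Extending a chain creates exactly one node that points at the
    // existing chain; nothing is copied and the parent is never modified.
    ChainNode access(String propertyName) {
        return new ChainNode(this, propertyName);
    }

    // Number of nodes reachable from this one by following parentChain.
    int length() {
        int count = 0;
        for (ChainNode node = this; node != null; node = node.parentChain) {
            count++;
        }
        return count;
    }
}
```

With `global = new ChainNode(null, "global")`, `pokemon = global.access("pokemon")` and `weight = pokemon.access("weight")`, the three chains share their nodes: `weight.parentChain` is the same object as `pokemon`, and `pokemon.parentChain` is the same object as `global`.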
`AccessorChain.getFullPath()` is the interface method responsible for building the accessor paths shown in the error messages. For the most part, it involves calling `getFullPath` on the parent chain (if it is not a root node) and joining it with the new access operation. There are special cases for when the parent is a `GlobalScopeChainRoot` or a `ClassChainRoot`.
The following table summarizes the possible join combinations. Each column is a traversing accessor and each row is a type of parent that is handled differently.
| | PropertyAccessor (`propertyName="prop"`) | IndexAccessor (`index=5`) | CallAccessor | CallNewAccessor |
|---|---|---|---|---|
| GlobalScopeChainRoot | `prop` | `global[5]` | `global()` | `new global()` |
| ClassChainRoot (`className="Class"`) | `Class:prop` | `Class:[5]` | `Class:()` | `new Class()` |
| Other chain type (`getFullPath()="expr"`) | `expr.prop` | `expr[5]` | `expr()` | `new (expr)()` |
Some combinations are pretty strange (e.g. why would you try to invoke the global object as a function?) but they are still possible accesses, even if they are doomed to fail.
The parentheses in the `CallNewAccessor` path are redundant and can be elided in some cases. Its implementation skips them if the chain is only formed by `PropertyAccessor`, `IndexAccessor` and `GlobalScopeChainRoot`, which is consistent with the definition of `MemberExpression` in the ES standard.
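The join rules and the parentheses elision can be sketched in a few small classes. All class names below (`Chain`, `GlobalRoot`, `ClassRoot`, `EvalRoot`, `PropertyAccess`, `IndexAccess`, `CallAccess`, `NewAccess`) and the `isPlainMemberChain` helper are simplified, hypothetical stand-ins chosen for this illustration; the actual JSVM classes may be organized differently.

```java
abstract class Chain {
    abstract String getFullPath();
    // True if the chain consists only of the global root plus property and
    // index accesses, so it can follow `new` without parentheses.
    abstract boolean isPlainMemberChain();
}

final class GlobalRoot extends Chain {
    @Override String getFullPath() { return "global"; }
    @Override boolean isPlainMemberChain() { return true; }
}

final class ClassRoot extends Chain {
    final String className;
    ClassRoot(String className) { this.className = className; }
    @Override String getFullPath() { return className; }
    @Override boolean isPlainMemberChain() { return false; }
}

// Stands for any root whose path is an arbitrary expression.
final class EvalRoot extends Chain {
    final String code;
    EvalRoot(String code) { this.code = code; }
    @Override String getFullPath() { return code; }
    @Override boolean isPlainMemberChain() { return false; }
}

final class PropertyAccess extends Chain {
    final Chain parentChain;
    final String propertyName;
    PropertyAccess(Chain parentChain, String propertyName) {
        this.parentChain = parentChain;
        this.propertyName = propertyName;
    }
    @Override String getFullPath() {
        // Properties of the global object are shown bare.
        if (parentChain instanceof GlobalRoot) return propertyName;
        String separator = parentChain instanceof ClassRoot ? ":" : ".";
        return parentChain.getFullPath() + separator + propertyName;
    }
    @Override boolean isPlainMemberChain() { return parentChain.isPlainMemberChain(); }
}

final class IndexAccess extends Chain {
    final Chain parentChain;
    final int index;
    IndexAccess(Chain parentChain, int index) {
        this.parentChain = parentChain;
        this.index = index;
    }
    @Override String getFullPath() {
        String separator = parentChain instanceof ClassRoot ? ":" : "";
        return parentChain.getFullPath() + separator + "[" + index + "]";
    }
    @Override boolean isPlainMemberChain() { return parentChain.isPlainMemberChain(); }
}

final class CallAccess extends Chain {
    final Chain parentChain;
    CallAccess(Chain parentChain) { this.parentChain = parentChain; }
    @Override String getFullPath() {
        String separator = parentChain instanceof ClassRoot ? ":" : "";
        return parentChain.getFullPath() + separator + "()";
    }
    @Override boolean isPlainMemberChain() { return false; }
}

final class NewAccess extends Chain {
    final Chain parentChain;
    NewAccess(Chain parentChain) { this.parentChain = parentChain; }
    @Override String getFullPath() {
        String path = parentChain.getFullPath();
        // A class root directly under `new` is shown bare, as in the table.
        boolean bare = parentChain instanceof ClassRoot || parentChain.isPlainMemberChain();
        return bare ? "new " + path + "()" : "new (" + path + ")()";
    }
    @Override boolean isPlainMemberChain() { return false; }
}
```

In this sketch, a chain built as `new PropertyAccess(new PropertyAccess(new GlobalRoot(), "pokemon"), "weight")` yields `pokemon.weight`, and wrapping it in `NewAccess` yields `new pokemon.weight()` without parentheses, since only the global root and property accesses are involved.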
`JSObject` instances are created from JNI code only. Since accessor chains receive no help from JNI code, newly created `JSObject` instances have no associated accessor chain. This is fixed with the `JSObject.lateInitAccessorChain()` method.
`JSValue` instances are also created from JNI code, though they may also be created explicitly by the user (e.g. when passing values to functions). `JSValue` also needs an accessor chain, since it is the sole class that actually does any type checking and the one that throws `JSTypeError`. Unlike in `JSObject`, however, the accessor chain is optional in `JSValue`, since instances created by the user with APIs like `JSValue.aString()` cannot have one.
For every case where the `JSValue` is created in response to an access, there is a call to `JSValue.lateInitAccessorChain()` that sets the `accessorChain` attribute of the `JSValue`, and also calls `JSObject.lateInitAccessorChain()` if the wrapped value is an object.
Each JSVM method that returns a new `JSValue` created from an object access or any `JSObject` must make sure that it has an initialized accessor chain. For instance:
```java
public JSValue evaluate(String code) {
    synchronized (this.lock) {
        return evaluateNative(code)
                .lateInitAccessorChain(new EvalExpressionChainRoot(code));
    }
}
```
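The late initialization pattern and the optional chain can be sketched with a toy value class. Everything here is hypothetical and simplified: `SketchValue` and `SketchTypeError` are stand-ins for `JSValue` and `JSTypeError`, the chain is reduced to an already resolved path string, and a Java `null` stands in for the JS `undefined` value.

```java
// Stand-in for JSTypeError in this sketch.
final class SketchTypeError extends RuntimeException {
    SketchTypeError(String message) { super(message); }
}

// Stand-in for JSValue: wraps a value and optionally knows how it was reached.
final class SketchValue {
    private final Object wrapped; // null stands in for JS undefined
    private String accessorPath;  // stays null for user-created values

    SketchValue(Object wrapped) { this.wrapped = wrapped; }

    // Called right after creation by the method that produced the value.
    // Returns this so it can be chained onto the creating call.
    SketchValue lateInitAccessorChain(String accessorPath) {
        this.accessorPath = accessorPath;
        return this;
    }

    double asDouble() {
        if (wrapped instanceof Number) {
            return ((Number) wrapped).doubleValue();
        }
        String found = wrapped == null ? "undefined"
                : wrapped instanceof String ? "string" : "object";
        // The path prefix is only added when a chain is available.
        String prefix = accessorPath == null ? "" : "In " + accessorPath + ", ";
        throw new SketchTypeError(prefix + "number was expected but " + found + " was found.");
    }
}
```

A value produced by an access, such as `new SketchValue(null).lateInitAccessorChain("pokemon.weight")`, fails with the path in the message, whereas a value created directly with `new SketchValue(null)` fails with the shorter message that has no path.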
`JSObject` has full canonicalization, which means that if the user performs two operations in JSVM that both return the same Duktape JS object, both operations return the same `JSObject`. That is, successive operations that work with the same JS object reuse the created `JSObject`.
This creates an issue with accessor chains, as the accessor chain of a `JSObject` can only refer to the last access it was returned with. For instance, take this passing unit test:
```java
@Test
public void testCanonicalizationIssue() throws Exception {
    jsvm.evaluate("var anObject = {};\n" +
            "var a = anObject;" +
            "var b = anObject;");
    JSObject a = jsvm.getGlobalScope().get("a").asObject();
    JSObject b = jsvm.getGlobalScope().get("b").asObject();
    assertSame(a, b);
    try {
        a.get("prop").asInt();
        fail();
    } catch (JSTypeError error) {
        assertEquals("In b.prop, number was expected but undefined was found.",
                error.getMessage());
    }
}
```
If `JSObject`s were not canonicalized, the accessor chain in the error message would read as `a.prop` instead. Unfortunately, it is impossible to have both full canonicalization and correct accessor chains when the same object is reached through different accesses.
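The trade-off can be reproduced in isolation with a small sketch. `CanonScope` and `CanonObject` are hypothetical names, JS objects are represented by plain integer handles, and the accessor chain is reduced to a path string; the point is only that a single canonical wrapper can remember just one access.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a canonicalized JSObject.
final class CanonObject {
    String accessorPath; // overwritten every time the object is returned
}

// Stand-in for a global scope that canonicalizes the wrappers it returns.
final class CanonScope {
    // One Java wrapper per JS object, keyed by an integer handle.
    private final Map<Integer, CanonObject> wrappers = new HashMap<>();
    private final Map<String, Integer> globals = new HashMap<>();

    void define(String name, int handle) {
        globals.put(name, handle);
    }

    // Assumes the name has been defined; a sketch without error handling.
    CanonObject get(String name) {
        Integer handle = globals.get(name);
        CanonObject object = wrappers.get(handle);
        if (object == null) {
            object = new CanonObject();
            wrappers.put(handle, object);
        }
        // Late initialization on the shared wrapper: the last access wins.
        object.accessorPath = name;
        return object;
    }
}
```

If `a` and `b` are both defined with the same handle, `scope.get("a")` and `scope.get("b")` return the same `CanonObject`, and after the second call its `accessorPath` is `b` even when it is used through the variable obtained from `a`.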