VM vs Interpreter: Understanding ThornLang's Dual Execution Model
ThornLang provides two distinct execution models: a tree-walking interpreter and a register-based bytecode VM. This guide explains the differences, trade-offs, and when to use each.
Table of Contents
- Overview
- Architecture Comparison
- Performance Characteristics
- Feature Parity
- When to Use Each Mode
- Technical Details
- Debugging Differences
- Future Roadmap
Overview
Tree-Walking Interpreter (Default)
java com.thorn.Thorn script.thorn
The interpreter directly traverses and evaluates the Abstract Syntax Tree (AST) without any intermediate compilation step.
Key Points:
- Direct AST evaluation
- No compilation overhead
- Simpler implementation
- Better error messages
- Easier to debug
Bytecode VM (--vm flag)
java com.thorn.Thorn script.thorn --vm
The VM first compiles the AST into bytecode instructions, then executes these instructions on a register-based virtual machine.
Key Points:
- Compilation to bytecode
- Register-based architecture
- Optimized instruction dispatch
- Better performance for loops
- More memory usage
Architecture Comparison
Tree-Walking Interpreter Flow
Source Code → Scanner → Parser → AST → Interpreter → Result
                                           ↑
                                    Direct evaluation
Bytecode VM Flow
Source Code → Scanner → Parser → AST → Compiler → Bytecode → VM → Result
                                                             ↑
                                                   Instruction execution
Code Example: How Each Executes
// Simple example
x = 10;
y = 20;
result = x + y;
print(result);
Interpreter Execution:
- Evaluates x = 10 by visiting the assignment node
- Evaluates y = 20 by visiting the assignment node
- Evaluates x + y by visiting the binary expression node
- Evaluates print(result) by visiting the call expression node
VM Execution:
- Compiles to bytecode:
LOAD_CONST R0, 10          ; Load 10 into register 0
STORE_GLOBAL "x", R0       ; Store R0 to global x
LOAD_CONST R0, 20          ; Load 20 into register 0
STORE_GLOBAL "y", R0       ; Store R0 to global y
LOAD_GLOBAL R0, "x"        ; Load x into R0
LOAD_GLOBAL R1, "y"        ; Load y into R1
ADD R2, R0, R1             ; Add R0 and R1, store in R2
STORE_GLOBAL "result", R2
LOAD_GLOBAL R0, "result"
PRINT R0
- Executes bytecode instructions sequentially
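The sequential execution step can be sketched as a tiny register machine. This is a minimal illustration, not ThornLang's actual VM: the `MiniRegisterVm` class, the `Instr` layout, and the choice to collect `PRINT` output in a list are all assumptions made for the example. It runs the ten instructions from the listing above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Minimal register VM sketch; the instruction layout is hypothetical. */
final class MiniRegisterVm {
    enum Op { LOAD_CONST, STORE_GLOBAL, LOAD_GLOBAL, ADD, PRINT }

    /** One instruction: opcode, up to three register indexes, and an optional constant or name. */
    static final class Instr {
        final Op op; final int a; final int b; final int c; final Object arg;
        Instr(Op op, int a, int b, int c, Object arg) {
            this.op = op; this.a = a; this.b = b; this.c = c; this.arg = arg;
        }
    }

    final double[] registers = new double[4];
    final Map<String, Double> globals = new HashMap<>();
    final List<String> output = new ArrayList<>();   // PRINT appends here instead of writing to stdout

    void run(List<Instr> program) {
        // Straight-line execution: this sketch has no jump instructions.
        for (int pc = 0; pc < program.size(); pc++) {
            Instr in = program.get(pc);
            switch (in.op) {
                case LOAD_CONST:   registers[in.a] = (Double) in.arg; break;
                case STORE_GLOBAL: globals.put((String) in.arg, registers[in.a]); break;
                case LOAD_GLOBAL:  registers[in.a] = globals.get((String) in.arg); break;
                case ADD:          registers[in.a] = registers[in.b] + registers[in.c]; break;
                case PRINT:        output.add(String.valueOf(registers[in.a])); break;
            }
        }
    }

    /** The bytecode listing for: x = 10; y = 20; result = x + y; print(result); */
    static List<Instr> sampleProgram() {
        return Arrays.asList(
            new Instr(Op.LOAD_CONST, 0, 0, 0, 10.0),
            new Instr(Op.STORE_GLOBAL, 0, 0, 0, "x"),
            new Instr(Op.LOAD_CONST, 0, 0, 0, 20.0),
            new Instr(Op.STORE_GLOBAL, 0, 0, 0, "y"),
            new Instr(Op.LOAD_GLOBAL, 0, 0, 0, "x"),
            new Instr(Op.LOAD_GLOBAL, 1, 0, 0, "y"),
            new Instr(Op.ADD, 2, 0, 1, null),
            new Instr(Op.STORE_GLOBAL, 2, 0, 0, "result"),
            new Instr(Op.LOAD_GLOBAL, 0, 0, 0, "result"),
            new Instr(Op.PRINT, 0, 0, 0, null));
    }
}
```

After `new MiniRegisterVm().run(MiniRegisterVm.sampleProgram())`, the global `result` holds 30.0. Note how register R0 is reused several times; only three registers are live at any point.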
Performance Characteristics
Startup Time
Mode | Startup Time | Best For |
---|---|---|
Interpreter | ~50ms | Short scripts, REPL |
VM | ~150ms | Long-running programs |
The VM takes longer to start because it must compile the AST to bytecode before it can execute anything.
Execution Speed
// Benchmark: Fibonacci
$ fib(n) {
    if (n <= 1) return n;
    return fib(n-1) + fib(n-2);
}
Operation | Interpreter | VM | VM Advantage |
---|---|---|---|
fib(30) | 125ms | 42ms | 3.0x faster |
fib(35) | 3200ms | 1100ms | 2.9x faster |
Loop 1M iterations | 890ms | 210ms | 4.2x faster |
Array operations | 340ms | 180ms | 1.9x faster |
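Combining the startup and speed figures gives a rough break-even point. The sketch below assumes the benchmark times exclude startup and that the VM's speedup is constant, neither of which this page confirms; `BreakEven` and `breakEvenMs` are names invented for the example. It solves `startupInterp + t = startupVm + t / speedup` for `t`, the interpreter's execution time.

```java
/** Rough break-even estimate between the two modes; assumes a constant VM speedup. */
final class BreakEven {
    /**
     * Returns the interpreter execution time (ms, excluding startup) at which
     * both modes finish at the same moment. Requires vmSpeedup > 1.
     */
    static double breakEvenMs(double startupInterpMs, double startupVmMs, double vmSpeedup) {
        return (startupVmMs - startupInterpMs) / (1.0 - 1.0 / vmSpeedup);
    }

    public static void main(String[] args) {
        // ~50ms vs ~150ms startup, ~3x speedup as in the fib rows
        System.out.println(breakEvenMs(50, 150, 3.0));
    }
}
```

With the figures above this comes out at roughly 150 ms of interpreter execution time. Under these assumptions, scripts that finish faster than that gain nothing from `--vm`.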
Memory Usage
Memory comparison for a typical program with 1000 functions and 10000 variables:
Metric | Interpreter | VM |
---|---|---|
Base memory | ~10MB | ~15MB |
Per function | ~2KB | ~5KB |
Per variable | ~100B | ~100B |
Peak usage | Lower | Higher |
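As a worked example, the per-item figures can be combined into a total for the 1000-function, 10000-variable program. This is a back-of-the-envelope sketch that assumes memory grows linearly and uses decimal units (1 KB = 1000 B); `MemoryEstimate` is a name made up for the illustration.

```java
/** Linear memory estimate built from the table's approximate figures. */
final class MemoryEstimate {
    static long estimateBytes(long baseBytes, long perFunctionBytes, long perVariableBytes,
                              int functions, int variables) {
        return baseBytes + perFunctionBytes * functions + perVariableBytes * variables;
    }

    public static void main(String[] args) {
        long interpreter = estimateBytes(10_000_000L, 2_000L, 100L, 1000, 10000);
        long vm = estimateBytes(15_000_000L, 5_000L, 100L, 1000, 10000);
        // Interpreter: 10 MB + 2 MB + 1 MB; VM: 15 MB + 5 MB + 1 MB
        System.out.println(interpreter / 1_000_000 + " MB vs " + vm / 1_000_000 + " MB");
    }
}
```

Under these assumptions the interpreter needs about 13 MB and the VM about 21 MB. The gap comes from the higher base cost and the per-function bytecode.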
Feature Parity
Both execution modes support all ThornLang features:
Feature | Interpreter | VM |
---|---|---|
Variables & Types | ✓ | ✓ |
Functions | ✓ | ✓ |
Classes | ✓ | ✓ |
Arrays | ✓ | ✓ |
Pattern Matching | ✓ | ✓ |
Modules | ✓ | ✓ |
Error Handling | ✓ | ✓ |
Built-in Functions | ✓ | ✓ |
Implementation Differences
While features work the same, implementations differ:
// Function calls
$ add(a, b) => a + b;
result = add(5, 3);
Interpreter:
- Creates new environment for function
- Binds parameters to arguments
- Evaluates function body AST
- Returns to caller environment
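The four interpreter steps can be sketched with a chained environment. This is an assumption about how such an interpreter is typically built, not ThornLang's actual classes: `Environment`, `define`, `get`, and `callAdd` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical scope chain: each call gets a fresh Environment that points at its parent. */
final class Environment {
    private final Environment enclosing;
    private final Map<String, Object> values = new HashMap<>();

    Environment(Environment enclosing) { this.enclosing = enclosing; }

    void define(String name, Object value) { values.put(name, value); }

    Object get(String name) {
        if (values.containsKey(name)) return values.get(name);
        if (enclosing != null) return enclosing.get(name);   // fall back to the outer scope
        throw new RuntimeException("Undefined variable '" + name + "'");
    }

    /** Mimics calling `$ add(a, b) => a + b;` the way a tree-walker would. */
    static Object callAdd(Environment caller, Object argA, Object argB) {
        Environment local = new Environment(caller);   // 1. new environment for the function
        local.define("a", argA);                       // 2. bind parameters to arguments
        local.define("b", argB);
        // 3. evaluate the body (here hard-coded instead of walking an AST)
        Object result = (Double) local.get("a") + (Double) local.get("b");
        return result;                                 // 4. local scope is dropped; caller's scope is current again
    }
}
```

Calling `Environment.callAdd(new Environment(null), 5.0, 3.0)` yields 8.0. Every call allocates a map, which is part of the per-call overhead the VM avoids.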
VM:
- Pushes new call frame
- Copies arguments to registers
- Jumps to function bytecode
- Returns via RETURN instruction
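The VM steps can be sketched the same way, with a frame stack in place of environments. The `Frame` layout, the register count, and the offsets below are illustrative assumptions; the real VM's frame format is not described on this page.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hand-simulated CALL/RETURN sequence for add(x, y) using call frames. */
final class CallFrameDemo {
    static final class Frame {
        final int returnPc;         // where the caller resumes
        final int resultRegister;   // caller register that receives the return value
        final double[] registers = new double[4];
        Frame(int returnPc, int resultRegister) {
            this.returnPc = returnPc;
            this.resultRegister = resultRegister;
        }
    }

    static final int FUNCTION_ENTRY = 100;   // hypothetical bytecode offset of add's body

    static double callAdd(double x, double y) {
        Deque<Frame> frames = new ArrayDeque<>();
        Frame caller = new Frame(-1, -1);     // top-level frame never returns
        caller.registers[0] = x;
        caller.registers[1] = y;
        frames.push(caller);
        int pc = 2;                           // hypothetical offset of the CALL instruction

        // CALL: push a new frame, copy arguments into its registers, jump to the body
        Frame callee = new Frame(pc + 1, 2);
        callee.registers[0] = caller.registers[0];
        callee.registers[1] = caller.registers[1];
        frames.push(callee);
        pc = FUNCTION_ENTRY;

        // Function body: ADD R2, R0, R1
        callee.registers[2] = callee.registers[0] + callee.registers[1];

        // RETURN R2: pop the frame, hand the value to the caller, restore pc
        Frame finished = frames.pop();
        Frame resumed = frames.peek();
        resumed.registers[finished.resultRegister] = finished.registers[2];
        pc = finished.returnPc;

        return resumed.registers[2];
    }
}
```

`CallFrameDemo.callAdd(5, 3)` returns 8.0. Arguments are copied by index into fixed-size arrays, so no name lookup happens during the call.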
When to Use Each Mode
Use the Interpreter (Default) When:
- Developing and Testing
    # Quick script testing
    java com.thorn.Thorn test.thorn
- Running Short Scripts
    // Simple automation script
    files = listFiles("*.txt");
    for (file in files) {
        process(file);
    }
- Using the REPL
    # Interactive development
    java com.thorn.Thorn
    thorn> x = 42
    thorn> print(x * 2)
- Better Error Messages Needed
    // Interpreter provides clearer stack traces
    $ buggyFunction() {
        return undefinedVar; // Clear error location
    }
Use the VM (--vm) When:
- Running Compute-Heavy Code
    // Mathematical computations
    $ mandelbrot(width, height, maxIter) {
        // Complex calculations benefit from VM
    }
- Processing Large Data Sets
    // Data processing with many iterations
    data = loadLargeDataset();
    for (record in data) {
        transformed = transform(record);
        results.push(transformed);
    }
- Long-Running Services
    // Server or daemon processes
    while (true) {
        request = waitForRequest();
        response = processRequest(request);
        sendResponse(response);
    }
- Recursive Algorithms
    // Deep recursion benefits from VM's call frame optimization
    $ quickSort(arr, low, high) {
        if (low < high) {
            pi = partition(arr, low, high);
            quickSort(arr, low, pi - 1);
            quickSort(arr, pi + 1, high);
        }
    }
Technical Details
Interpreter Internals
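// Simplified interpreter visit pattern
Object visitBinaryExpr(Expr.Binary expr) {
    Object left = evaluate(expr.left);
    Object right = evaluate(expr.right);
    switch (expr.operator.type) {
        case PLUS:
            // Assumes both operands are numbers; real code would type-check first
            return (Double) left + (Double) right;
        // ... other operators
        default:
            throw new RuntimeException("Unsupported operator: " + expr.operator.type);
    }
}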
Characteristics:
- Recursive visitor pattern
- Direct Java method calls
- Dynamic dispatch overhead
- Simple to understand and modify
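To make the recursive visitor pattern concrete, here is a self-contained sketch with a two-node AST. The node classes (`Literal`, `Binary`) and the `Evaluator` are simplified stand-ins invented for this example, not ThornLang's real `Expr` hierarchy.

```java
/** Minimal tree-walking evaluator using the visitor pattern. */
final class TreeWalkDemo {
    interface Expr { <R> R accept(Visitor<R> visitor); }

    interface Visitor<R> {
        R visitLiteral(Literal expr);
        R visitBinary(Binary expr);
    }

    static final class Literal implements Expr {
        final double value;
        Literal(double value) { this.value = value; }
        public <R> R accept(Visitor<R> visitor) { return visitor.visitLiteral(this); }
    }

    static final class Binary implements Expr {
        final Expr left; final char operator; final Expr right;
        Binary(Expr left, char operator, Expr right) {
            this.left = left; this.operator = operator; this.right = right;
        }
        public <R> R accept(Visitor<R> visitor) { return visitor.visitBinary(this); }
    }

    static final class Evaluator implements Visitor<Double> {
        public Double visitLiteral(Literal expr) { return expr.value; }

        public Double visitBinary(Binary expr) {
            // Each accept() is a virtual call that recurses into a child node.
            double left = expr.left.accept(this);
            double right = expr.right.accept(this);
            switch (expr.operator) {
                case '+': return left + right;
                case '*': return left * right;
                default: throw new IllegalArgumentException("Unsupported operator: " + expr.operator);
            }
        }
    }

    static double evaluate(Expr expr) { return expr.accept(new Evaluator()); }
}
```

Evaluating `new Binary(new Literal(10), '+', new Literal(20))` gives 30.0. The Java call stack mirrors the tree depth, which is where the dynamic dispatch overhead listed above comes from.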
VM Internals
// VM instruction dispatch loop
while (!halted) {
    int instruction = getCurrentInstruction();
    OpCode opcode = getOpcode(instruction);
    // A, B, and C are the register operands of the instruction (decoding omitted)
    switch (opcode) {
        case ADD_FAST:
            registers[A] = registers[B] + registers[C];
            break;
        // ... other instructions
    }
    pc++;
}
Characteristics:
- Flat instruction dispatch
- Register-based operations
- Minimized function call overhead
- Harder to debug
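The dispatch loop above reads each instruction as a single `int`. One common way to make that work is to pack the opcode and three register indexes into 8-bit fields. The layout below is an assumption chosen for illustration; this page does not document ThornLang's actual encoding, and the `ADD_FAST` value of 1 is arbitrary.

```java
/** Hypothetical 32-bit instruction layout: [opcode:8][A:8][B:8][C:8]. */
final class InstructionCodec {
    static final int ADD_FAST = 1;   // arbitrary opcode number for this sketch

    static int encode(int opcode, int a, int b, int c) {
        return (opcode & 0xFF) << 24 | (a & 0xFF) << 16 | (b & 0xFF) << 8 | (c & 0xFF);
    }

    static int opcode(int instruction) { return instruction >>> 24; }
    static int a(int instruction) { return (instruction >>> 16) & 0xFF; }
    static int b(int instruction) { return (instruction >>> 8) & 0xFF; }
    static int c(int instruction) { return instruction & 0xFF; }

    /** Decodes and executes one instruction against the given register file. */
    static void execute(int instruction, double[] registers) {
        switch (opcode(instruction)) {
            case ADD_FAST:
                registers[a(instruction)] = registers[b(instruction)] + registers[c(instruction)];
                break;
            default:
                throw new IllegalStateException("Unknown opcode: " + opcode(instruction));
        }
    }
}
```

Decoding is a few shifts and masks with no object allocation, which is what keeps a flat dispatch loop cheap compared with visiting AST nodes.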
Bytecode Instruction Set
The VM uses ~50 instructions including:
Category | Instructions |
---|---|
Load/Store | LOAD_CONST, LOAD_LOCAL, STORE_LOCAL, LOAD_GLOBAL |
Arithmetic | ADD, SUB, MUL, DIV, MOD, POW, NEG |
Fast Arithmetic | ADD_FAST, SUB_FAST, MUL_FAST, DIV_FAST |
Comparison | EQ, NE, LT, LE, GT, GE |
Control Flow | JUMP, JUMP_IF_FALSE, JUMP_IF_TRUE, CALL, RETURN |
Arrays | GET_INDEX, SET_INDEX, ARRAY_LENGTH, ARRAY_PUSH |
Objects | GET_PROPERTY, SET_PROPERTY, NEW_OBJECT |
Register Allocation
The VM uses a simple register allocation strategy:
// Source code
x = a + b * c;
// Bytecode (simplified)
MUL R0, b, c ; R0 = b * c
ADD R1, a, R0 ; R1 = a + R0
STORE x, R1 ; x = R1
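A simple strategy that produces exactly this listing is to hand out a fresh register for every intermediate result while walking the expression tree. The sketch below is an assumption about what "simple" means here; `SimpleRegisterAllocator` and its node classes are hypothetical, and instructions are emitted as text for readability.

```java
import java.util.ArrayList;
import java.util.List;

/** Naive allocation: one new register per binary operation, never reused. */
final class SimpleRegisterAllocator {
    static abstract class Node {}

    static final class Var extends Node {
        final String name;
        Var(String name) { this.name = name; }
    }

    static final class BinOp extends Node {
        final String opcode; final Node left; final Node right;
        BinOp(String opcode, Node left, Node right) {
            this.opcode = opcode; this.left = left; this.right = right;
        }
    }

    private final List<String> code = new ArrayList<>();
    private int nextRegister = 0;

    /** Emits code for the node and returns the operand (variable or register) holding its value. */
    private String compile(Node node) {
        if (node instanceof Var) return ((Var) node).name;
        BinOp op = (BinOp) node;
        String left = compile(op.left);      // operands are compiled first (post-order)
        String right = compile(op.right);
        String target = "R" + nextRegister++;
        code.add(op.opcode + " " + target + ", " + left + ", " + right);
        return target;
    }

    static List<String> compileAssignment(String name, Node value) {
        SimpleRegisterAllocator compiler = new SimpleRegisterAllocator();
        String result = compiler.compile(value);
        compiler.code.add("STORE " + name + ", " + result);
        return compiler.code;
    }
}
```

Compiling `x = a + b * c` as `ADD(a, MUL(b, c))` yields the three instructions shown above. Never reusing registers wastes slots, which is the kind of thing a better allocator would improve.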
Debugging Differences
Error Messages
Interpreter Error:
Runtime error: Undefined variable 'foo'
at myFunction (script.thorn:10)
at processData (script.thorn:25)
at main (script.thorn:40)
VM Error:
Runtime error: Undefined variable 'foo'
at bytecode offset 0x1A5
in function myFunction
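A bytecode offset can be translated back to a source line if the compiler records where each line's instructions begin. The sketch below shows one way to do that lookup; it is not an existing ThornLang feature, `LineTable` is a hypothetical class, and the offsets in the usage note are made up.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical line table: maps the first bytecode offset of each source line to that line. */
final class LineTable {
    private final TreeMap<Integer, Integer> firstOffsetToLine = new TreeMap<>();

    /** Records that instructions for sourceLine start at bytecodeOffset. */
    void mark(int bytecodeOffset, int sourceLine) {
        firstOffsetToLine.put(bytecodeOffset, sourceLine);
    }

    /** Returns the source line covering the offset, or -1 if the offset precedes all entries. */
    int lineFor(int bytecodeOffset) {
        Map.Entry<Integer, Integer> entry = firstOffsetToLine.floorEntry(bytecodeOffset);
        return entry == null ? -1 : entry.getValue();
    }
}
```

For instance, if line 10 were marked as starting at offset 0x1A0 and line 11 at 0x1B0, `lineFor(0x1A5)` would return 10, giving the VM error the same location the interpreter reports.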
Debugging Tools
Tool | Interpreter | VM |
---|---|---|
--ast flag | Shows AST structure | Shows AST structure |
Stack traces | Full source location | Bytecode offsets |
Step debugging | Possible (future) | More complex |
Performance profiling | Basic timing | Instruction counts |
Debug Mode Example
# View AST (works for both modes)
java com.thorn.Thorn script.thorn --ast
# Future: VM bytecode dump
# java com.thorn.Thorn script.thorn --vm --dump-bytecode
Future Roadmap
Planned Interpreter Improvements
- AST Optimization Pass
  - Constant folding
  - Dead code elimination
  - Common subexpression elimination
- Cached Property Access
  - Property lookup tables
  - Inline caching
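As an illustration of what an AST optimization pass does, here is a minimal constant-folding sketch. It is not the planned implementation: the node classes are hypothetical and only addition is handled.

```java
/** Constant folding over a toy AST: replaces Add(Num, Num) with a single Num. */
final class ConstantFolder {
    static abstract class Expr {}

    static final class Num extends Expr {
        final double value;
        Num(double value) { this.value = value; }
    }

    static final class Var extends Expr {
        final String name;
        Var(String name) { this.name = name; }
    }

    static final class Add extends Expr {
        final Expr left; final Expr right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    static Expr fold(Expr expr) {
        if (!(expr instanceof Add)) return expr;   // literals and variables are already minimal
        Add add = (Add) expr;
        Expr left = fold(add.left);                // fold children first so nested constants collapse
        Expr right = fold(add.right);
        if (left instanceof Num && right instanceof Num) {
            return new Num(((Num) left).value + ((Num) right).value);
        }
        return new Add(left, right);               // at least one side is unknown until runtime
    }
}
```

Folding `x + (1 + 2)` produces `x + 3`, so the interpreter visits one fewer node every time the expression runs.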
Planned VM Improvements
- Advanced Optimizations
  - Better register allocation
  - Instruction combining
  - Loop optimizations
- JIT Compilation
  - Hot path detection
  - Native code generation
  - Adaptive optimization
- Debugging Support
  - Source maps for bytecode
  - Bytecode disassembler
  - Step-through debugging
Unified Improvements
- Profiling Tools
    # Future profiling support
    java com.thorn.Thorn script.thorn --profile
- Optimization Levels
    # Future optimization flags
    java com.thorn.Thorn script.thorn --vm -O2
Decision Matrix
Use this matrix to decide which mode to use:
Criteria | Score | Interpreter | VM |
---|---|---|---|
Script runs < 1 second | High | ✓ | |
Script runs > 10 seconds | High | | ✓ |
Heavy computation | High | | ✓ |
Many function calls | Medium | | ✓ |
String manipulation | Low | ✓ | ✓ |
Development/debugging | High | ✓ | |
Production deployment | Medium | | ✓ |
REPL usage | High | ✓ | |
Examples
Example 1: Script Automation (Use Interpreter)
// file_renamer.thorn - Better with interpreter
import { fs } from "system";
files = fs.listDir(".");
for (file in files) {
    if (file.endsWith(".tmp")) {
        newName = file.replace(".tmp", ".bak");
        fs.rename(file, newName);
        print("Renamed: " + file + " -> " + newName);
    }
}
Example 2: Data Processing (Use VM)
// data_analyzer.thorn - Better with VM
$ analyze(dataset) {
    results = {};
    for (record in dataset) {
        category = record["category"];
        value = record["value"];
        if (results[category] == null) {
            results[category] = {
                "sum": 0,
                "count": 0,
                "min": value,
                "max": value
            };
        }
        stats = results[category];
        stats["sum"] += value;
        stats["count"] += 1;
        stats["min"] = min(stats["min"], value);
        stats["max"] = max(stats["max"], value);
    }
    return results;
}
// Process large dataset
data = loadCSV("large_dataset.csv"); // 1M+ records
results = analyze(data);
Summary
- Interpreter: Best for development, scripting, and I/O-bound tasks
- VM: Best for computation, long-running programs, and production
- Both: Support all ThornLang features equally
- Choose based on: Runtime duration, computation intensity, and use case
The dual execution model gives ThornLang flexibility to excel in both scripting and application development scenarios.
See Also
- Performance Guide - Detailed performance optimization
- Getting Started - How to use --vm flag
- Language Reference - Full language features