VM vs Interpreter: Understanding ThornLang's Dual Execution Model

ThornLang provides two distinct execution models: a tree-walking interpreter and a register-based bytecode VM. This guide explains the differences, trade-offs, and when to use each.

Overview

Tree-Walking Interpreter (Default)

java com.thorn.Thorn script.thorn

The interpreter directly traverses and evaluates the Abstract Syntax Tree (AST) without any intermediate compilation step.

Key Points:

  • Direct AST evaluation
  • No compilation overhead
  • Simpler implementation
  • Better error messages
  • Easier to debug

Bytecode VM (--vm flag)

java com.thorn.Thorn script.thorn --vm

The VM first compiles the AST into bytecode instructions, then executes these instructions on a register-based virtual machine.

Key Points:

  • Compilation to bytecode
  • Register-based architecture
  • Optimized instruction dispatch
  • Better performance for loops
  • Higher memory usage

Architecture Comparison

Tree-Walking Interpreter Flow

Source Code → Scanner → Parser → AST → Interpreter → Result
                                        ↑
                                   Direct evaluation

Bytecode VM Flow

Source Code → Scanner → Parser → AST → Compiler → Bytecode → VM → Result
                                                               ↑
                                                     Instruction execution

Code Example: How Each Executes

// Simple example
x = 10;
y = 20;
result = x + y;
print(result);

Interpreter Execution:

  1. Evaluates x = 10 by visiting the assignment node
  2. Evaluates y = 20 by visiting the assignment node
  3. Evaluates result = x + y by visiting the assignment node, which in turn visits the binary expression node for x + y
  4. Evaluates print(result) by visiting the call expression node
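
The steps above can be sketched as a tiny tree walker. This is a minimal illustration in Java, not JavaThorn's actual code: the class TreeWalkSketch and the node types (Num, Var, Add, Assign, Print) are hypothetical names, and values are limited to doubles to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mini tree-walker; class and node names are illustrative only.
class TreeWalkSketch {
    interface Node {}
    static final class Num implements Node {
        final double value;
        Num(double value) { this.value = value; }
    }
    static final class Var implements Node {
        final String name;
        Var(String name) { this.name = name; }
    }
    static final class Add implements Node {
        final Node left, right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
    }
    static final class Assign implements Node {
        final String name;
        final Node value;
        Assign(String name, Node value) { this.name = name; this.value = value; }
    }
    static final class Print implements Node {
        final Node argument;
        Print(Node argument) { this.argument = argument; }
    }

    final Map<String, Double> globals = new HashMap<>();
    final List<String> output = new ArrayList<>();

    // Evaluate a node by recursively evaluating its children; no bytecode is produced.
    double evaluate(Node node) {
        if (node instanceof Num) return ((Num) node).value;
        if (node instanceof Var) {
            String name = ((Var) node).name;
            Double value = globals.get(name);
            if (value == null) throw new RuntimeException("Undefined variable '" + name + "'");
            return value;
        }
        if (node instanceof Add) {
            Add add = (Add) node;
            return evaluate(add.left) + evaluate(add.right);
        }
        if (node instanceof Assign) {
            Assign assign = (Assign) node;
            double value = evaluate(assign.value);
            globals.put(assign.name, value);
            return value;
        }
        if (node instanceof Print) {
            double value = evaluate(((Print) node).argument);
            output.add(String.valueOf(value)); // collected instead of printed, for testability
            return value;
        }
        throw new IllegalArgumentException("Unknown node type: " + node);
    }

    // Hand-built AST for: x = 10; y = 20; result = x + y; print(result);
    static List<String> runExample() {
        TreeWalkSketch interpreter = new TreeWalkSketch();
        List<Node> program = List.of(
            new Assign("x", new Num(10)),
            new Assign("y", new Num(20)),
            new Assign("result", new Add(new Var("x"), new Var("y"))),
            new Print(new Var("result")));
        for (Node statement : program) interpreter.evaluate(statement);
        return interpreter.output;
    }

    public static void main(String[] args) {
        runExample().forEach(System.out::println);
    }
}
```

Each statement is one call to evaluate on the root of its subtree, which is why there is no separate compilation step to pay for.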

VM Execution:

  1. Compiles to bytecode:
    LOAD_CONST    R0, 10      ; Load 10 into register 0
    STORE_GLOBAL  "x", R0     ; Store R0 to global x
    LOAD_CONST    R0, 20      ; Load 20 into register 0
    STORE_GLOBAL  "y", R0     ; Store R0 to global y
    LOAD_GLOBAL   R0, "x"     ; Load x into R0
    LOAD_GLOBAL   R1, "y"     ; Load y into R1
    ADD           R2, R0, R1  ; Add R0 and R1, store in R2
    STORE_GLOBAL  "result", R2
    LOAD_GLOBAL   R0, "result"
    PRINT         R0
    
  2. Executes bytecode instructions sequentially
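
To make the second step concrete, here is a minimal register VM in Java that runs the same ten instructions. It is a sketch under assumptions: the Instr class, its factory methods, and the use of plain doubles for registers are hypothetical and chosen for brevity; only the opcode names come from the listing above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical register VM; only the opcode names are taken from the listing.
class RegisterVmSketch {
    enum Op { LOAD_CONST, STORE_GLOBAL, LOAD_GLOBAL, ADD, PRINT }

    static final class Instr {
        final Op op;
        final int a, b, c;      // register operands
        final String name;      // global name, when the opcode needs one
        final double constant;  // literal value, when the opcode needs one
        Instr(Op op, int a, int b, int c, String name, double constant) {
            this.op = op; this.a = a; this.b = b; this.c = c;
            this.name = name; this.constant = constant;
        }
    }

    static Instr loadConst(int reg, double k) { return new Instr(Op.LOAD_CONST, reg, 0, 0, null, k); }
    static Instr storeGlobal(String n, int reg) { return new Instr(Op.STORE_GLOBAL, reg, 0, 0, n, 0); }
    static Instr loadGlobal(int reg, String n) { return new Instr(Op.LOAD_GLOBAL, reg, 0, 0, n, 0); }
    static Instr add(int dst, int l, int r) { return new Instr(Op.ADD, dst, l, r, null, 0); }
    static Instr print(int reg) { return new Instr(Op.PRINT, reg, 0, 0, null, 0); }

    // Execute instructions one after another; returns what PRINT produced.
    static List<String> run(List<Instr> code) {
        double[] registers = new double[8];
        Map<String, Double> globals = new HashMap<>();
        List<String> output = new ArrayList<>();
        for (int pc = 0; pc < code.size(); pc++) {
            Instr in = code.get(pc);
            switch (in.op) {
                case LOAD_CONST:
                    registers[in.a] = in.constant;
                    break;
                case STORE_GLOBAL:
                    globals.put(in.name, registers[in.a]);
                    break;
                case LOAD_GLOBAL: {
                    Double value = globals.get(in.name);
                    if (value == null) throw new RuntimeException("Undefined variable '" + in.name + "'");
                    registers[in.a] = value;
                    break;
                }
                case ADD:
                    registers[in.a] = registers[in.b] + registers[in.c];
                    break;
                case PRINT:
                    output.add(String.valueOf(registers[in.a]));
                    break;
            }
        }
        return output;
    }

    // The bytecode listing from the text, instruction for instruction.
    static List<Instr> exampleProgram() {
        return List.of(
            loadConst(0, 10), storeGlobal("x", 0),
            loadConst(0, 20), storeGlobal("y", 0),
            loadGlobal(0, "x"), loadGlobal(1, "y"),
            add(2, 0, 1), storeGlobal("result", 2),
            loadGlobal(0, "result"), print(0));
    }

    public static void main(String[] args) {
        run(exampleProgram()).forEach(System.out::println);
    }
}
```

The loop body never recurses: every instruction is handled by one switch case, which is the flat dispatch that gives the VM its advantage in loops.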

Performance Characteristics

Startup Time

Mode         Startup Time  Best For
Interpreter  ~50ms         Short scripts, REPL
VM           ~150ms        Long-running programs

The VM's startup time is higher because it must compile the AST to bytecode before execution begins.

Execution Speed

// Benchmark: Fibonacci
$ fib(n) {
    if (n <= 1) return n;
    return fib(n-1) + fib(n-2);
}

Operation           Interpreter  VM      VM Advantage
fib(30)             125ms        42ms    3.0x faster
fib(35)             3200ms       1100ms  2.9x faster
Loop 1M iterations  890ms        210ms   4.2x faster
Array operations    340ms        180ms   1.9x faster
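
The startup and speed figures can be combined into a rough break-even estimate. The sketch below assumes that the benchmark timings exclude startup, that startup and execution time simply add, and that the speedup is uniform across the whole program; none of these is stated in the measurements, and the class and method names are hypothetical. It solves startupInterp + t = startupVm + t / speedup for t, the interpreter execution time at which both modes finish together.

```java
// Hypothetical helper for estimating when the VM starts to pay off.
class BreakEvenSketch {
    // Returns the interpreter execution time (ms, excluding startup) at which
    // total time is equal in both modes. Derived from:
    //   startupInterp + t = startupVm + t / speedup
    //   t = (startupVm - startupInterp) * speedup / (speedup - 1)
    static double breakEvenMs(double startupInterpMs, double startupVmMs, double speedup) {
        if (speedup <= 1.0) {
            throw new IllegalArgumentException("speedup must be greater than 1");
        }
        return (startupVmMs - startupInterpMs) * speedup / (speedup - 1.0);
    }

    public static void main(String[] args) {
        // ~50ms and ~150ms startup, ~3x speedup, as in the tables
        System.out.println(breakEvenMs(50, 150, 3.0));
    }
}
```

With roughly 50ms and 150ms startup and a 3x speedup, the break-even point is about 150ms of interpreter execution time. Scripts that finish faster than that are better off in the interpreter under these assumptions.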

Memory Usage

// Memory comparison for typical program
// Program with 1000 functions and 10000 variables

Metric        Interpreter  VM
Base memory   ~10MB        ~15MB
Per function  ~2KB         ~5KB
Per variable  ~100B        ~100B
Peak usage    Lower        Higher
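
As a worked example, the per-item figures can be applied to the program described above (1000 functions and 10000 variables). The sketch assumes the costs add linearly and uses decimal units (1KB = 1000 bytes); the class and method names are hypothetical.

```java
// Hypothetical estimator that applies the per-item figures from the table.
class MemoryEstimateSketch {
    static final long KB = 1_000;       // decimal units, an assumption
    static final long MB = 1_000_000;

    // Linear model: base + functions * perFunction + variables * perVariable
    static long estimateBytes(long baseBytes, long perFunctionBytes, long perVariableBytes,
                              int functions, int variables) {
        return baseBytes + perFunctionBytes * functions + perVariableBytes * variables;
    }

    static long interpreterBytes(int functions, int variables) {
        return estimateBytes(10 * MB, 2 * KB, 100, functions, variables);
    }

    static long vmBytes(int functions, int variables) {
        return estimateBytes(15 * MB, 5 * KB, 100, functions, variables);
    }

    public static void main(String[] args) {
        System.out.println("Interpreter: " + interpreterBytes(1000, 10000) / MB + " MB");
        System.out.println("VM:          " + vmBytes(1000, 10000) / MB + " MB");
    }
}
```

Under this model the interpreter needs about 10 + 2 + 1 = 13MB and the VM about 15 + 5 + 1 = 21MB, so the gap comes from the base cost and the per-function cost, not from variables.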

Feature Parity

Both execution modes support all ThornLang features:

Feature             Interpreter  VM
Variables & Types   Yes          Yes
Functions           Yes          Yes
Classes             Yes          Yes
Arrays              Yes          Yes
Pattern Matching    Yes          Yes
Modules             Yes          Yes
Error Handling      Yes          Yes
Built-in Functions  Yes          Yes

Implementation Differences

While features behave the same in both modes, their implementations differ:

// Function calls
$ add(a, b) => a + b;
result = add(5, 3);

Interpreter:

  • Creates new environment for function
  • Binds parameters to arguments
  • Evaluates function body AST
  • Returns to caller environment

VM:

  • Pushes new call frame
  • Copies arguments to registers
  • Jumps to function bytecode
  • Returns via RETURN instruction
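
The two call sequences above can be put side by side in a short Java sketch for add(5, 3). Everything here is hypothetical (the Environment and CallFrame classes, the three-register frame, the returnPc field); it only illustrates the difference between name lookup in a chained environment and indexed access in a call frame.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the two calling models for: $ add(a, b) => a + b;
class CallModelSketch {
    // Interpreter side: variables live in a map, with a link to the enclosing scope.
    static final class Environment {
        final Environment enclosing;
        final Map<String, Object> values = new HashMap<>();
        Environment(Environment enclosing) { this.enclosing = enclosing; }
        void define(String name, Object value) { values.put(name, value); }
        Object get(String name) {
            for (Environment env = this; env != null; env = env.enclosing) {
                if (env.values.containsKey(name)) return env.values.get(name);
            }
            throw new RuntimeException("Undefined variable '" + name + "'");
        }
    }

    static double callWithEnvironment(Environment caller, double a, double b) {
        Environment local = new Environment(caller); // create new environment for the function
        local.define("a", a);                        // bind parameters to arguments
        local.define("b", b);
        // evaluate the body "a + b" via name lookups; the local environment is dropped on return
        return (Double) local.get("a") + (Double) local.get("b");
    }

    // VM side: variables live in numbered registers inside a call frame.
    static final class CallFrame {
        final double[] registers;
        final int returnPc; // where the caller resumes after RETURN
        CallFrame(int registerCount, int returnPc) {
            this.registers = new double[registerCount];
            this.returnPc = returnPc;
        }
    }

    static double callWithFrame(Deque<CallFrame> callStack, int returnPc, double a, double b) {
        CallFrame frame = new CallFrame(3, returnPc);
        callStack.push(frame);                       // push new call frame
        frame.registers[0] = a;                      // copy arguments to registers
        frame.registers[1] = b;
        frame.registers[2] = frame.registers[0] + frame.registers[1]; // body: ADD R2, R0, R1
        double result = frame.registers[2];
        callStack.pop();                             // RETURN: discard the frame
        return result;
    }

    public static void main(String[] args) {
        System.out.println(callWithEnvironment(new Environment(null), 5, 3));
        System.out.println(callWithFrame(new ArrayDeque<>(), 0, 5, 3));
    }
}
```

The environment version hashes a string for every variable access, while the frame version indexes an array, which is one plausible reason call-heavy code runs faster on the VM.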

When to Use Each Mode

Use the Interpreter (Default) When:

  1. Developing and Testing

    # Quick script testing
    java com.thorn.Thorn test.thorn
    
  2. Running Short Scripts

    // Simple automation script
    files = listFiles("*.txt");
    for (file in files) {
        process(file);
    }
    
  3. Using the REPL

    # Interactive development
    java com.thorn.Thorn
    thorn> x = 42
    thorn> print(x * 2)
    
  4. Needing Better Error Messages

    // Interpreter provides clearer stack traces
    $ buggyFunction() {
        return undefinedVar;  // Clear error location
    }
    

Use the VM (--vm) When:

  1. Running Compute-Heavy Code

    // Mathematical computations
    $ mandelbrot(width, height, maxIter) {
        // Complex calculations benefit from VM
    }
    
  2. Processing Large Data Sets

    // Data processing with many iterations
    data = loadLargeDataset();
    for (record in data) {
        transformed = transform(record);
        results.push(transformed);
    }
    
  3. Long-Running Services

    // Server or daemon processes
    while (true) {
        request = waitForRequest();
        response = processRequest(request);
        sendResponse(response);
    }
    
  4. Recursive Algorithms

    // Deep recursion benefits from VM's call frame optimization
    $ quickSort(arr, low, high) {
        if (low < high) {
            pi = partition(arr, low, high);
            quickSort(arr, low, pi - 1);
            quickSort(arr, pi + 1, high);
        }
    }
    

Technical Details

Interpreter Internals

// Simplified interpreter visit pattern
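Object visitBinaryExpr(Expr.Binary expr) {
    // Evaluate both operands recursively before applying the operator
    Object left = evaluate(expr.left);
    Object right = evaluate(expr.right);
    
    switch (expr.operator.type) {
        case PLUS:
            // Simplified: assumes both operands are numbers
            return (Double)left + (Double)right;
        // ... other operators
        default:
            // Without a default branch the method would not compile (missing return)
            throw new RuntimeException("Unsupported operator: " + expr.operator.type);
    }
}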

Characteristics:

  • Recursive visitor pattern
  • Direct Java method calls
  • Dynamic dispatch overhead
  • Simple to understand and modify

VM Internals

// VM instruction dispatch loop (simplified)
while (!halted) {
    int instruction = getCurrentInstruction();
    OpCode opcode = getOpcode(instruction);
    
    // A, B, and C stand for the register indices decoded from the instruction
    switch (opcode) {
        case ADD_FAST:
            registers[A] = registers[B] + registers[C];
            break;
        // ... other instructions
    }
    pc++;  // advance to the next instruction
}

Characteristics:

  • Flat instruction dispatch
  • Register-based operations
  • Minimized function call overhead
  • Harder to debug
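
The dispatch loop above reads an int and extracts an opcode from it. One common way to make that work is to pack the opcode and three register indices into a single 32-bit word. The layout below (8 bits each for opcode, A, B, and C) and the opcode numbers are assumptions for illustration, not JavaThorn's documented encoding.

```java
// Hypothetical instruction encoding: [opcode:8][A:8][B:8][C:8]
class InstructionCodecSketch {
    static final int OP_HALT = 0;      // opcode numbers are made up for this sketch
    static final int OP_ADD_FAST = 1;

    static int encode(int opcode, int a, int b, int c) {
        return ((opcode & 0xFF) << 24) | ((a & 0xFF) << 16) | ((b & 0xFF) << 8) | (c & 0xFF);
    }

    // Unsigned shift so opcodes above 127 do not come back negative.
    static int opcode(int instruction) { return instruction >>> 24; }
    static int a(int instruction) { return (instruction >>> 16) & 0xFF; }
    static int b(int instruction) { return (instruction >>> 8) & 0xFF; }
    static int c(int instruction) { return instruction & 0xFF; }

    // Flat dispatch loop over encoded instructions.
    static void run(int[] code, double[] registers) {
        int pc = 0;
        boolean halted = false;
        while (!halted && pc < code.length) {
            int instruction = code[pc];
            switch (opcode(instruction)) {
                case OP_ADD_FAST:
                    registers[a(instruction)] =
                        registers[b(instruction)] + registers[c(instruction)];
                    break;
                case OP_HALT:
                    halted = true;
                    break;
                default:
                    throw new IllegalStateException("Unknown opcode " + opcode(instruction));
            }
            pc++;
        }
    }

    public static void main(String[] args) {
        double[] registers = {0.0, 1.5, 2.5};
        int[] code = {encode(OP_ADD_FAST, 0, 1, 2), encode(OP_HALT, 0, 0, 0)};
        run(code, registers);
        System.out.println(registers[0]);
    }
}
```

Packing operands into one word keeps the bytecode in a plain int array, so fetching an instruction is a single array read followed by a few shifts and masks.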

Bytecode Instruction Set

The VM uses roughly 50 instructions, including:

Category         Instructions
Load/Store       LOAD_CONST, LOAD_LOCAL, STORE_LOCAL, LOAD_GLOBAL
Arithmetic       ADD, SUB, MUL, DIV, MOD, POW, NEG
Fast Arithmetic  ADD_FAST, SUB_FAST, MUL_FAST, DIV_FAST
Comparison       EQ, NE, LT, LE, GT, GE
Control Flow     JUMP, JUMP_IF_FALSE, JUMP_IF_TRUE, CALL, RETURN
Arrays           GET_INDEX, SET_INDEX, ARRAY_LENGTH, ARRAY_PUSH
Objects          GET_PROPERTY, SET_PROPERTY, NEW_OBJECT

Register Allocation

The VM uses a simple register allocation strategy:

// Source code
x = a + b * c;

// Bytecode (simplified)
MUL     R0, b, c    ; R0 = b * c
ADD     R1, a, R0   ; R1 = a + R0
STORE   x, R1       ; x = R1
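
One way to produce the listing above is a post-order walk that hands out the next free register for every intermediate result. The Java sketch below assumes exactly that strategy and never frees a register; the text does not say what the real compiler does, and the class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical "next free register" allocator for simple expressions.
class RegisterAllocSketch {
    interface Expr {}
    static final class Name implements Expr {
        final String id;
        Name(String id) { this.id = id; }
    }
    static final class Binary implements Expr {
        final String op;
        final Expr left, right;
        Binary(String op, Expr left, Expr right) {
            this.op = op; this.left = left; this.right = right;
        }
    }

    private int nextRegister = 0;
    final List<String> code = new ArrayList<>();

    // Emits code for expr and returns the operand (name or register) holding its value.
    String compile(Expr expr) {
        if (expr instanceof Name) return ((Name) expr).id;
        Binary binary = (Binary) expr;
        String left = compile(binary.left);    // operands first (post-order)
        String right = compile(binary.right);
        String target = "R" + nextRegister++;  // then claim a fresh register for the result
        code.add(binary.op + " " + target + ", " + left + ", " + right);
        return target;
    }

    void compileAssignment(String variable, Expr value) {
        String source = compile(value);
        code.add("STORE " + variable + ", " + source);
    }

    // Compiles: x = a + b * c;
    static List<String> compileExample() {
        RegisterAllocSketch compiler = new RegisterAllocSketch();
        Expr product = new Binary("MUL", new Name("b"), new Name("c"));
        compiler.compileAssignment("x", new Binary("ADD", new Name("a"), product));
        return compiler.code;
    }

    public static void main(String[] args) {
        compileExample().forEach(System.out::println);
    }
}
```

Because b * c is nested inside the addition, it is compiled first and lands in R0, which matches the instruction order in the listing. A production allocator would also release registers once their values are no longer needed.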

Debugging Differences

Error Messages

Interpreter Error:

Runtime error: Undefined variable 'foo'
  at myFunction (script.thorn:10)
  at processData (script.thorn:25)
  at main (script.thorn:40)

VM Error:

Runtime error: Undefined variable 'foo'
  at bytecode offset 0x1A5
  in function myFunction

Debugging Tools

Tool                   Interpreter           VM
--ast flag             Shows AST structure   Shows AST structure
Stack traces           Full source location  Bytecode offsets
Step debugging         Possible (future)     More complex
Performance profiling  Basic timing          Instruction counts

Debug Mode Example

# View AST (works for both modes)
java com.thorn.Thorn script.thorn --ast

# Future: VM bytecode dump
# java com.thorn.Thorn script.thorn --vm --dump-bytecode

Future Roadmap

Planned Interpreter Improvements

  1. AST Optimization Pass

    • Constant folding
    • Dead code elimination
    • Common subexpression elimination
  2. Cached Property Access

    • Property lookup tables
    • Inline caching
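
Constant folding, the first item in the planned AST optimization pass, can be illustrated with a small Java sketch. This is not the planned implementation, only the general technique: the node classes are hypothetical, and division is deliberately left unfolded so the sketch does not have to decide how division by zero should behave.

```java
// Hypothetical constant-folding pass over a tiny expression AST.
class ConstantFoldingSketch {
    interface Expr {}
    static final class Num implements Expr {
        final double value;
        Num(double value) { this.value = value; }
    }
    static final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
    }
    static final class Binary implements Expr {
        final char op;
        final Expr left, right;
        Binary(char op, Expr left, Expr right) {
            this.op = op; this.left = left; this.right = right;
        }
    }

    // Returns an equivalent tree in which constant subexpressions are pre-computed.
    static Expr fold(Expr expr) {
        if (!(expr instanceof Binary)) return expr; // literals and variables stay as they are
        Binary binary = (Binary) expr;
        Expr left = fold(binary.left);              // fold children first
        Expr right = fold(binary.right);
        if (left instanceof Num && right instanceof Num) {
            double l = ((Num) left).value;
            double r = ((Num) right).value;
            switch (binary.op) {
                case '+': return new Num(l + r);
                case '-': return new Num(l - r);
                case '*': return new Num(l * r);
                default: break;                     // other operators are left unfolded
            }
        }
        return new Binary(binary.op, left, right);
    }

    public static void main(String[] args) {
        // 2 + 3 * 4 folds to a single literal
        Expr folded = fold(new Binary('+', new Num(2), new Binary('*', new Num(3), new Num(4))));
        System.out.println(((Num) folded).value);
    }
}
```

Folding once before execution means a tree walker no longer revisits the constant subtree on every loop iteration, which is where an interpreter stands to gain most from this pass.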

Planned VM Improvements

  1. Advanced Optimizations

    • Better register allocation
    • Instruction combining
    • Loop optimizations
  2. JIT Compilation

    • Hot path detection
    • Native code generation
    • Adaptive optimization
  3. Debugging Support

    • Source maps for bytecode
    • Bytecode disassembler
    • Step-through debugging
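
Source maps for bytecode would let the VM report a source line instead of a raw offset such as 0x1A5. A minimal sketch of the idea, assuming the compiler records the first bytecode offset generated for each source line, is shown below. The LineTableSketch class is hypothetical and the offsets in main are made-up sample data.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical line table mapping bytecode offsets back to source lines.
class LineTableSketch {
    // Key: first bytecode offset generated for a line. Value: that source line.
    private final TreeMap<Integer, Integer> startOffsetToLine = new TreeMap<>();

    // Called by the compiler when it starts emitting code for a new source line.
    void mark(int offset, int line) {
        startOffsetToLine.put(offset, line);
    }

    // Finds the entry with the greatest start offset that is <= the given offset.
    // Returns -1 when the offset precedes every recorded entry.
    int lineAt(int offset) {
        Map.Entry<Integer, Integer> entry = startOffsetToLine.floorEntry(offset);
        return entry == null ? -1 : entry.getValue();
    }

    public static void main(String[] args) {
        LineTableSketch table = new LineTableSketch();
        table.mark(0x000, 1);   // sample data, not real compiler output
        table.mark(0x1A0, 10);
        table.mark(0x1B0, 11);
        System.out.println("script.thorn:" + table.lineAt(0x1A5));
    }
}
```

Storing only the first offset per line keeps the table small, and the sorted map turns each lookup into a single floor search.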

Unified Improvements

  1. Profiling Tools

    # Future profiling support
    java com.thorn.Thorn script.thorn --profile
    
  2. Optimization Levels

    # Future optimization flags
    java com.thorn.Thorn script.thorn --vm -O2
    

Decision Matrix

Use this matrix to decide which mode to use:

Criteria                  Score   Interpreter  VM
Script runs < 1 second    High    Yes
Script runs > 10 seconds  High                 Yes
Heavy computation         High                 Yes
Many function calls       Medium               Yes
String manipulation       Low
Development/debugging     High    Yes
Production deployment     Medium               Yes
REPL usage                High    Yes

Examples

Example 1: Script Automation (Use Interpreter)

// file_renamer.thorn - Better with interpreter
import { fs } from "system";

files = fs.listDir(".");
for (file in files) {
    if (file.endsWith(".tmp")) {
        newName = file.replace(".tmp", ".bak");
        fs.rename(file, newName);
        print("Renamed: " + file + " -> " + newName);
    }
}

Example 2: Data Processing (Use VM)

// data_analyzer.thorn - Better with VM
$ analyze(dataset) {
    results = {};
    
    for (record in dataset) {
        category = record["category"];
        value = record["value"];
        
        if (results[category] == null) {
            results[category] = {
                "sum": 0,
                "count": 0,
                "min": value,
                "max": value
            };
        }
        
        stats = results[category];
        stats["sum"] += value;
        stats["count"] += 1;
        stats["min"] = min(stats["min"], value);
        stats["max"] = max(stats["max"], value);
    }
    
    return results;
}

// Process large dataset
data = loadCSV("large_dataset.csv");  // 1M+ records
results = analyze(data);

Summary

  • Interpreter: Best for development, scripting, and I/O-bound tasks
  • VM: Best for computation, long-running programs, and production
  • Both: Support all ThornLang features equally
  • Choose based on: Runtime duration, computation intensity, and use case

The dual execution model gives ThornLang the flexibility to serve both quick scripting and long-running application workloads.
