Architecture Overview - kennetholsenatm-gif/q_mini_wasm_v2 GitHub Wiki

# Architecture Overview
## Architecture Diagram

```mermaid
graph TB
    subgraph "Core Framework"
        T["Ternary State Space<br/>GF(3) Arithmetic"]
        S[Stabilizer Tableau<br/>Clifford Gates]
        M[MoE Router<br/>Tropical Geometry]
        F[Forward-Forward<br/>Learning]
    end
    subgraph "Runtime"
        O[Orchestrator<br/>Thread Pool]
        Y[SYCL Kernels<br/>GPU/CPU]
    end
    subgraph "External"
        W[WebAssembly<br/>Target]
        G[Go/DLL<br/>Plugin]
        C[Flash-CIM<br/>Interface]
    end
    subgraph "Input Processing"
        Q[Quantizer<br/>Absmean]
        SH[Shadow<br/>Clifford Hash]
    end
    subgraph "Output"
        EC[Error Correction<br/>Steane Code]
        ACT[Activations<br/>Trit Vectors]
    end
    Q --> SH
    SH --> M
    M --> T
    M --> S
    T --> F
    S --> F
    F --> EC
    EC --> ACT
    O --> Y
    Y --> T
    Y --> S
    F -.-> W
    F -.-> G
    F -.-> C
    style T fill:#e1f5fe
    style S fill:#f3e5f5
    style M fill:#fff3e0
    style F fill:#e8f5e8
    style O fill:#fce4ec
    style Y fill:#f1f8e9
```
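The diagram's "Ternary State Space" block performs arithmetic over GF(3). As a minimal sketch (not code from the repository; the `Trit` alias and function names here are illustrative), balanced-ternary values {-1, 0, +1} can be added and multiplied modulo 3 and mapped back to the balanced range:

```cpp
#include <cstdint>

// Illustrative sketch of balanced-ternary arithmetic over GF(3).
// Trits are stored as {-1, 0, +1}; results are computed mod 3 and
// mapped back into the balanced range.
using Trit = int8_t;

// Normalize an integer into {0,1,2}, then map 2 -> -1 (balanced form).
inline Trit to_balanced(int v) {
    v = ((v % 3) + 3) % 3;
    return static_cast<Trit>(v == 2 ? -1 : v);
}

inline Trit gf3_add(Trit a, Trit b) { return to_balanced(a + b); }
inline Trit gf3_mul(Trit a, Trit b) { return to_balanced(a * b); }
```

Note, for example, that 1 + 1 ≡ 2 ≡ -1 (mod 3), which is what keeps every intermediate value inside the three-state alphabet.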
## Data Flow

```mermaid
flowchart LR
    subgraph Input
        A[Continuous<br/>Input]
        B[Absmean<br/>Quantizer]
        C[Ternary<br/>Trits]
    end
    subgraph Processing
        D[Clifford<br/>Shadow]
        E[MoE<br/>Router]
        F[Expert<br/>Selection]
        G[Forward-Forward<br/>Inference]
    end
    subgraph Correction
        H[Steane<br/>Code]
        I[Error<br/>Detection]
        J[Error<br/>Correction]
    end
    subgraph Output
        K[Final<br/>Activations]
        L[Result<br/>Vector]
    end
    A -->|FP32 values| B
    B -->|"{+1,0,-1}"| C
    C -->|Trit vector| D
    D -->|Hash| E
    E -->|Top-K| F
    F -->|Selected experts| G
    G -->|Raw output| H
    H -->|Syndrome| I
    I -->|Corrections| J
    J -->|Corrected| K
    K -->|Trit vector| L
    style A fill:#ffebee
    style B fill:#fce4ec
    style C fill:#f3e5f5
    style D fill:#ede7f6
    style E fill:#e8eaf6
    style F fill:#e3f2fd
    style G fill:#e1f5fe
    style H fill:#e0f7fa
    style I fill:#e0f2f1
    style J fill:#e8f5e9
    style K fill:#f1f8e9
    style L fill:#f9fbe7
```
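The first edge of the flow maps FP32 values into {+1, 0, -1} via absmean quantization. A minimal sketch of that idea (not the repository's implementation; `absmean_quantize` and the `Trit` alias are illustrative): scale each value by the mean absolute value of the vector, then round and clamp into the ternary range.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Illustrative sketch of absmean ternary quantization: each FP32 value
// is divided by the mean absolute value of the vector, then rounded
// and clamped into {-1, 0, +1}.
using Trit = int8_t;

inline std::vector<Trit> absmean_quantize(const std::vector<float>& x) {
    float sum = 0.0f;
    for (float v : x) sum += std::fabs(v);
    const float scale = sum / static_cast<float>(x.size()) + 1e-8f;  // guard /0

    std::vector<Trit> out;
    out.reserve(x.size());
    for (float v : x) {
        int q = static_cast<int>(std::lround(v / scale));
        if (q > 1) q = 1;       // clamp into the ternary range
        if (q < -1) q = -1;
        out.push_back(static_cast<Trit>(q));
    }
    return out;
}
```

Dividing by the mean absolute value keeps the quantizer scale-invariant: multiplying the whole input by a constant leaves the trit vector unchanged.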
## Inference Sequence

```mermaid
sequenceDiagram
    participant User
    participant Orchestrator
    participant Quantizer
    participant Shadow
    participant Router
    participant Expert
    participant Learner
    participant Corrector
    User->>Orchestrator: submit_inference(input)
    Orchestrator->>Quantizer: quantize(input)
    Quantizer-->>Orchestrator: ternary_input
    Orchestrator->>Shadow: compute_hash(ternary_input)
    Shadow-->>Orchestrator: shadow_hash
    Orchestrator->>Router: route_topk(shadow_hash)
    Router-->>Orchestrator: selected_experts
    loop For each selected expert
        Orchestrator->>Expert: process(ternary_input)
        Expert->>Learner: forward_forward(input)
        Learner-->>Expert: activations
        Expert-->>Orchestrator: expert_output
    end
    Orchestrator->>Corrector: encode_and_correct(outputs)
    Corrector-->>Orchestrator: corrected_output
    Orchestrator-->>User: final_result
    Note over Orchestrator: Async execution with<br/>thread pool
    Note over Router: Tropical geometry<br/>Top-K selection
    Note over Learner: Local layer-wise<br/>no backpropagation
```
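The `route_topk` step above scores experts in the tropical (max-plus) semiring, where "multiplication" is ordinary addition and "addition" is max. A hedged sketch of one plausible formulation (the function names, expert representation, and scoring rule here are assumptions for illustration, not the repository's API):

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

// Illustrative sketch of tropical Top-K routing: an expert's score
// against a hash vector is the max-plus "inner product"
// max_i (hash[i] + weight[i]); the router keeps the K highest scorers.
inline float tropical_score(const std::vector<float>& hash,
                            const std::vector<float>& weights) {
    float best = -std::numeric_limits<float>::infinity();
    for (std::size_t i = 0; i < hash.size(); ++i)
        best = std::max(best, hash[i] + weights[i]);
    return best;
}

inline std::vector<std::size_t> route_topk(
        const std::vector<float>& hash,
        const std::vector<std::vector<float>>& experts,
        std::size_t k) {
    std::vector<std::size_t> idx(experts.size());
    std::iota(idx.begin(), idx.end(), 0);           // 0, 1, ..., n-1
    std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
        [&](std::size_t a, std::size_t b) {
            return tropical_score(hash, experts[a]) >
                   tropical_score(hash, experts[b]);
        });
    idx.resize(k);                                  // keep the Top-K indices
    return idx;
}
```

Because max-plus scoring uses only additions and comparisons, it avoids the multiplications of a conventional dot-product router, which suits low-energy edge targets.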
## System Design

q_mini_wasm_v2 is a modular C++17 framework organized around five core subsystems that work together to provide quantum-inspired, energy-efficient AI inference at the extreme edge.
```cpp
// Process input through selected experts
FFConfig ff_config{
    .num_layers = 2,
    .neurons_per_layer = 64,
    .learning_rate = 0.01
};
auto learner = std::make_unique<ForwardForwardLearner>(ff_config);
```
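Forward-Forward learning replaces backpropagation with a local, per-layer objective: each layer scores its own activations and compares the score to a threshold. A minimal sketch of that goodness measure (the sum-of-squares form follows Hinton's Forward-Forward formulation; the function names are illustrative, not the `ForwardForwardLearner` API):

```cpp
#include <vector>

// Illustrative sketch of the Forward-Forward "goodness" measure:
// each layer scores its own activations as a sum of squares, so no
// gradients need to flow backward through the network.
inline float goodness(const std::vector<float>& activations) {
    float g = 0.0f;
    for (float a : activations) g += a * a;
    return g;
}

// A layer treats its input as "positive" data when goodness exceeds
// the layer's threshold.
inline bool layer_accepts(const std::vector<float>& activations,
                          float threshold) {
    return goodness(activations) > threshold;
}
```

Because each layer's objective is local, layers can be trained and evaluated independently, which is what the sequence diagram's "no backpropagation" note refers to.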
### 5. Fault Tolerance
```cpp
// Apply Steane code for error correction
auto steane = std::make_unique<QutritSteaneCode>();
auto corrected_output = steane->encode_and_correct(output);
```
### 6. Output
```cpp
// Final ternary activations
std::vector<Trit> final_output = corrected_output;
```
## Memory Model

### Stack-Allocated Trit Vectors
All operations use stack-allocated trit vectors with deterministic, compile-time sizing. This eliminates heap allocation in the hot path, giving predictable memory usage and avoiding allocator overhead in latency-critical code.
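The idea can be sketched as a fixed-size wrapper over `std::array` (an illustrative sketch, not the repository's type; `TritVector` and `Trit` are assumed names):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Illustrative sketch of a stack-allocated trit vector: the size is a
// compile-time constant, so the whole vector lives on the stack and
// the inference hot path performs no heap allocation.
using Trit = int8_t;

template <std::size_t N>
struct TritVector {
    std::array<Trit, N> data{};   // zero-initialized, stack storage

    constexpr std::size_t size() const { return N; }
    Trit& operator[](std::size_t i) { return data[i]; }
    const Trit& operator[](std::size_t i) const { return data[i]; }
};
```

Fixing `N` at compile time also lets the compiler fully unroll or vectorize loops over the trits, which matters on WebAssembly and embedded targets.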
### SYCL Device Memory
When SYCL acceleration is enabled, data is transferred to device memory
for parallel processing: