Architecture SYCL Acceleration - kennetholsenatm-gif/q_mini_wasm_v2 GitHub Wiki

SYCL Acceleration

Overview

SYCL is a single-source C++ programming model that provides hardware-agnostic parallelism, so the same kernels can be accelerated on GPUs and multi-core CPUs.

Parallel Operations

| Operation | Parallelism Strategy |
|---|---|
| Tableau updates | Row-level parallelism |
| Modulo-3 arithmetic | Sub-group vectorization |
| MoE routing | Expert-level parallelism |
| Forward-Forward | Layer-level parallelism |

Kernel Structure

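// Assumes #include <sycl/sycl.hpp>, using namespace sycl, a sycl::queue named
// `queue`, and `n` equal to the number of tableau rows.
queue.submit([&](handler& h) {
    h.parallel_for(range<1>{n}, [=](id<1> i) {
        // Parallel tableau row update: work-item i handles row i
    });
});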
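
To make the skeleton above concrete, here is a minimal sketch of a row-parallel update combined with modulo-3 arithmetic. It is not taken from the project: the helper name `add_pivot_mod3`, the row-major `std::uint8_t` layout, and the idea that the CMake option `USE_SYCL` is forwarded to the compiler as a macro of the same name are all assumptions made for illustration. Without that macro the function falls back to a plain serial loop that computes the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

#ifdef USE_SYCL  // assumption: forwarded from the CMake option of the same name
#include <sycl/sycl.hpp>
#endif

// Hypothetical helper (not from the project): adds `pivot` to every row of a
// row-major `rows x cols` tableau, element-wise modulo 3.
void add_pivot_mod3(std::vector<std::uint8_t>& tableau,
                    const std::vector<std::uint8_t>& pivot,
                    std::size_t rows, std::size_t cols) {
    assert(tableau.size() == rows * cols && pivot.size() == cols);
    if (rows == 0 || cols == 0) return;  // nothing to update
#ifdef USE_SYCL
    sycl::queue q;  // the implementation picks a default device
    {
        sycl::buffer<std::uint8_t, 1> tab_buf(tableau.data(),
                                              sycl::range<1>(rows * cols));
        sycl::buffer<std::uint8_t, 1> piv_buf(pivot.data(),
                                              sycl::range<1>(cols));
        q.submit([&](sycl::handler& h) {
            sycl::accessor tab(tab_buf, h, sycl::read_write);
            sycl::accessor piv(piv_buf, h, sycl::read_only);
            // One work-item per row: rows are independent of each other,
            // so no synchronization is needed inside the kernel.
            h.parallel_for(sycl::range<1>(rows), [=](sycl::id<1> i) {
                for (std::size_t j = 0; j < cols; ++j) {
                    const std::size_t k = i[0] * cols + j;
                    tab[k] = static_cast<std::uint8_t>((tab[k] + piv[j]) % 3);
                }
            });
        });
    }  // buffers leave scope here: waits for the kernel, copies results back
#else
    // Serial fallback with identical semantics.
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            const std::size_t k = i * cols + j;
            tableau[k] = static_cast<std::uint8_t>((tableau[k] + pivot[j]) % 3);
        }
    }
#endif
}
```

Calling `add_pivot_mod3(t, p, 2, 3)` with `t = {0,1,2, 2,2,2}` and `p = {1,1,2}` leaves `t` as `{1,2,1, 0,0,1}`. The sketch assigns one work-item per row to mirror the row-level strategy listed above. It does not attempt the sub-group vectorization mentioned for modulo-3 arithmetic.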

Build Configuration

cmake -DUSE_SYCL=ON ..
cmake --build .

Building with SYCL enabled requires the Intel oneAPI toolkit or a standalone DPC++ toolchain.
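
One way to confirm that a SYCL-enabled build actually sees a device is to print the name of the device a default queue selects. The sketch below is illustrative only: `active_backend` is a hypothetical helper, and it again assumes that the `USE_SYCL` CMake option reaches the compiler as a preprocessor macro, which the project may or may not do.

```cpp
#include <string>

#ifdef USE_SYCL  // assumption: forwarded from the CMake option of the same name
#include <sycl/sycl.hpp>
#endif

// Hypothetical helper: reports which backend this build would run kernels on.
std::string active_backend() {
#ifdef USE_SYCL
    sycl::queue q;  // default-constructed queue, device chosen by the runtime
    return q.get_device().get_info<sycl::info::device::name>();
#else
    return "serial (built without USE_SYCL)";
#endif
}
```

Printing `active_backend()` at startup shows either the device name reported by the SYCL runtime or the fallback string, which makes a misconfigured toolchain easier to spot.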

See Also