SpikingNeuronArray - reecewayt/llm-assisted-design-portfolio GitHub Wiki

Simple Spiking Neuron Array Implementation in SystemVerilog

Project Overview

LLM workflows are becoming ubiquitous in hardware and software design. This project is my cohort's first coding challenge: to try vibe coding by replicating the Johns Hopkins paper. I used claude.ai in this experiment; development steps and challenges are reported below.

Components

Neuron model implementation
Fully connected network architecture
SPI interface for external communication

Implementation Details

Toolchain

Tool	Description
VS Code	Text Editor
Verilator	Open source Verilog/SystemVerilog simulator
Makefile	Used to automate compiling, simulating, and wave viewing
claude.ai	LLM Assistant of choice but any would work
gtkwave	Open source wave viewer

Neuron Model, Network Architecture, and SPI Interface

The neuron model (neuron.sv) implements a simplified leaky integrate-and-fire (LIF) model with parameterized:

Membrane potential accumulation
Leakage mechanism
Threshold-based firing
Refractory period
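
As a rough behavioral sketch of the four features listed above, the C++ model below could serve as a reference when reading waveforms from a testbench. It is not the project's neuron.sv: the struct names, the default values, and the constant subtractive leak are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical parameters; names and defaults are illustrative only.
struct LifParams {
    int32_t threshold = 100;      // fire when the potential reaches this value
    int32_t leak = 2;             // assumed: constant amount removed every cycle
    int32_t max_potential = 255;  // assumed: potential saturates like an 8-bit register
    int refractory_cycles = 3;    // cycles during which input is ignored after a spike
};

struct LifNeuron {
    int32_t potential = 0;
    int refractory_left = 0;

    // Advance the model by one clock cycle; returns true when the neuron spikes.
    bool step(int32_t current_in, const LifParams& p) {
        if (refractory_left > 0) {
            --refractory_left;  // refractory period: ignore input and count down
            return false;
        }
        // Integrate the input and apply the leak in a wide type, then clamp to
        // [0, max_potential] so the value can neither underflow nor overflow.
        int32_t next = potential + current_in - p.leak;
        potential = std::max<int32_t>(0, std::min<int32_t>(next, p.max_potential));

        if (potential >= p.threshold) {
            potential = 0;                          // reset after firing
            refractory_left = p.refractory_cycles;  // enter the refractory period
            return true;
        }
        return false;
    }
};
```

Calling step once per simulated clock with the same input current as the hardware gives a spike train to compare against. Because the leak and reset behavior here are assumed, any mismatch should be checked against the actual module before being treated as a bug.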

The network (network.sv) is a fully connected two-layer network; each layer instantiates three neurons.

The SPI interface (spi_interface.sv) provides the ability to interface with this model at the ASIC level (more on this later).

Testing: The project includes several testbenches:

neuron_tb.cpp: Tests individual neuron behavior
network_tb.cpp: Validates network-level functionality
standalone_spi_test.cpp: Verifies SPI communication protocol

💡See the Makefile for details on running these individual tests. Run make help for a complete list of targets.

Development Steps

This section gives a brief overview of the key prompts and steps I took to have Claude code up the program. Instead of fixing mistakes or errors myself, I generally relied on Claude to fix them, mostly to stick with the experiment and test Claude's ability to arrive at a solution.

To start, I used the following opening prompt, which is similar to the one in the original paper but includes more specifications up front:

Prompt: Can you write a SystemVerilog module for a spiking leaky integrate and fire neuron? Please include a multibit current input and a refractory period?

Claude.ai: I'll create a SystemVerilog module for a spiking leaky integrate and fire neuron with a multibit current input and refractory period.

Looking at the output code below, it's easy to see that Claude included erroneous inputs in the port list and generally wrong logic for updating the membrane potential. This was easily fixed with a few follow-up prompts and details on how to handle underflow/overflow. Below is the initial module; the finished module can be found here. This neuron module was verified with a testbench (also generated by Claude) and can be run with make run-neuron. Alternatively, you can view the simulation waveform in gtkwave with make wave-neuron.

Initial Module

module leaky_integrate_fire_neuron #(
    parameter int CURRENT_WIDTH = 8,           // Width of the current input
    parameter int POTENTIAL_WIDTH = 16,        // Width of membrane potential
    parameter int LEAK_FACTOR_WIDTH = 8,       // Width of the leak factor
    parameter int THRESHOLD_WIDTH = 16,        // Width of the threshold parameter
    parameter int REFRACTORY_COUNTER_WIDTH = 4 // Width of refractory counter (determines max refractory period)
) (
    input logic clk,                                  // Clock signal
    input logic rst_n,                                // Active low reset
    input logic [CURRENT_WIDTH-1:0] current_in,       // Input current
    input logic [LEAK_FACTOR_WIDTH-1:0] leak_factor,  // Leak factor (higher value = faster leak)
    input logic [THRESHOLD_WIDTH-1:0] threshold,      // Firing threshold
    input logic [REFRACTORY_COUNTER_WIDTH-1:0] refractory_period, // Number of cycles for refractory period
    output logic spike_out                            // Output spike
);

    // Internal registers
    logic [POTENTIAL_WIDTH-1:0] membrane_potential;
    logic [REFRACTORY_COUNTER_WIDTH-1:0] refractory_counter;
    logic in_refractory_period;

    // Calculate if neuron is in refractory period
    assign in_refractory_period = (refractory_counter > 0);

    always_ff @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            // Reset state
            membrane_potential <= '0;
            refractory_counter <= '0;
            spike_out <= 1'b0;
        end else begin
            // Default spike output is 0
            spike_out <= 1'b0;

            // Handle refractory period counter
            if (refractory_counter > 0) begin
                refractory_counter <= refractory_counter - 1'b1;
            end

            // Update membrane potential only when not in refractory period
            if (!in_refractory_period) begin
                // Calculate leak (simple multiplication and shift)
                // membrane_potential = membrane_potential - (membrane_potential * leak_factor / 2^LEAK_FACTOR_WIDTH)
                logic [POTENTIAL_WIDTH+LEAK_FACTOR_WIDTH-1:0] leak_amount;
                leak_amount = (membrane_potential * leak_factor) >> LEAK_FACTOR_WIDTH;

                // Add current and subtract leak
                if (current_in > leak_amount[POTENTIAL_WIDTH-1:0]) begin
                    membrane_potential <= membrane_potential + current_in - leak_amount[POTENTIAL_WIDTH-1:0];
                end else begin
                    membrane_potential <= membrane_potential - leak_amount[POTENTIAL_WIDTH-1:0] + current_in;
                end

                // Check if threshold is reached
                if (membrane_potential >= threshold) begin
                    // Generate spike
                    spike_out <= 1'b1;
                    // Reset membrane potential
                    membrane_potential <= '0;
                    // Enter refractory period
                    refractory_counter <= refractory_period;
                end
            end
        end
    end

endmodule
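
The underflow/overflow concern comes from fixed-width unsigned arithmetic: a sum assigned to a 16-bit register is truncated to 16 bits, so it wraps around instead of stopping at the maximum. The C++ sketch below imitates that with uint16_t and contrasts it with saturating versions. The function names are hypothetical, and this is one common way to guard the arithmetic, not necessarily what the finished module does.

```cpp
#include <cstdint>

// Plain 16-bit add: the result wraps modulo 2^16, like an unguarded
// sum assigned to a 16-bit register.
uint16_t wrapping_add(uint16_t a, uint16_t b) {
    return static_cast<uint16_t>(a + b);
}

// Saturating add: compute in a wider type, then clamp at 0xFFFF.
uint16_t saturating_add(uint16_t a, uint16_t b) {
    uint32_t sum = static_cast<uint32_t>(a) + b;
    return static_cast<uint16_t>(sum > 0xFFFFu ? 0xFFFFu : sum);
}

// Saturating subtract: clamp at zero instead of wrapping to a large value.
uint16_t saturating_sub(uint16_t a, uint16_t b) {
    return a > b ? static_cast<uint16_t>(a - b) : static_cast<uint16_t>(0);
}
```

For example, wrapping_add(65400, 255) yields 119, which would look like a sudden drop in membrane potential, while saturating_add(65400, 255) stays at 65535.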

With a working neuron, I followed suit with the paper's authors and asked for the network module. I included more details in the hope of getting close to a solution in one shot, but this proved mostly ineffective.

Prompt: Okay this all works, now that we have a working neuron model and its been verified. Can you create a new module that instantiates a network of neurons with 2 layers in a fully connected fashion. Please connect them with programmable synapses, which means each synapse can have a programmable weight. Therefore the inputs into layer 2 will be the spikes at layer one but each will be connected to layer 2 via a multiply and accumulate synapse. Where the multiply is the programmable weights.

Claude: I'll create a neural network module with two fully connected layers and programmable synapses.

Claude generally followed my description well and generated a separate synapse module, which I liked. Overall, though, it missed the mark in several ways. At this point, I'm wondering whether my prompts need to be better thought out, or whether providing fewer details would be better. The layers it created were wrong, but that was my own fault, as I omitted how many neurons to instantiate in each layer. This was easy to fix with minimal prompting. Another issue was the convoluted, and honestly confusing, combinational block it used for current accumulation. To fix this, I asked Claude to simplify the synapse module by using a single continuous assignment statement. It generated this final result for a [multiple input synapse](../blob/main/SpikingNeuronArray/src/multi_input_synapse.sv), which worked for me. Once the synapse connection was correct, it took one more prompt for it to fix and simplify the [network module](../blob/main/SpikingNeuronArray/src/network.sv).
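
To make the multiply-and-accumulate idea concrete, here is a small C++ sketch of what one multi-input synapse computes for a single layer-2 neuron. It is a behavioral illustration only: the function name, the 8-bit widths, and the saturation at 255 are assumptions, not a transcription of multi_input_synapse.sv.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kInputs = 3;  // three layer-1 neurons feed each layer-2 neuron

// Weighted sum of the incoming spikes for one layer-2 neuron.
// Assumed: 8-bit weights and an 8-bit output current that saturates at 255.
uint8_t synapse_current(const std::array<bool, kInputs>& spikes,
                        const std::array<uint8_t, kInputs>& weights) {
    unsigned sum = 0;
    for (std::size_t i = 0; i < kInputs; ++i) {
        // A spike is 0 or 1, so "multiply by the weight" reduces to a select.
        if (spikes[i]) {
            sum += weights[i];
        }
    }
    return static_cast<uint8_t>(sum > 255u ? 255u : sum);
}
```

Since each spike is a single bit, the whole computation is a sum of gated weights, which is why it can fit in a single expression rather than a nested loop with per-step overflow checks.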

Initial Synapse and Network

module synapse #(
    parameter int WEIGHT_WIDTH = 8,        // Width of the weight
    parameter int SPIKE_WIDTH = 1,         // Width of spike signal
    parameter int CURRENT_WIDTH = 8        // Width of output current
) (
    input logic clk,                                // Clock signal
    input logic rst_n,                              // Active low reset
    input logic [SPIKE_WIDTH-1:0] spike_in,         // Input spike
    input logic [WEIGHT_WIDTH-1:0] weight,          // Programmable weight
    output logic [CURRENT_WIDTH-1:0] current_out    // Output current
);

    // The synapse multiplies the spike by the weight to produce output current
    always_ff @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            current_out <= '0;
        end else begin
            // When spike_in is active, output the weight as current
            // Otherwise, output zero current
            current_out <= spike_in ? weight : '0;
        end
    end

endmodule

module neural_network #(
    parameter int LAYER1_NEURONS = 4,          // Number of neurons in layer 1
    parameter int LAYER2_NEURONS = 2,          // Number of neurons in layer 2
    parameter int CURRENT_WIDTH = 8,           // Width of current signals
    parameter int WEIGHT_WIDTH = 8,            // Width of synaptic weights
    parameter int POTENTIAL_WIDTH = 8,         // Width of membrane potential
    parameter int LEAK_FACTOR_WIDTH = 8,       // Width of the leak factor
    parameter int THRESHOLD_WIDTH = 8,         // Width of the threshold parameter
    parameter int REFRACTORY_COUNTER_WIDTH = 4 // Width of refractory counter
) (
    input logic clk,                                          // Clock signal
    input logic rst_n,                                        // Active low reset
    input logic [CURRENT_WIDTH-1:0] inputs[LAYER1_NEURONS],   // Input currents to first layer
    input logic [WEIGHT_WIDTH-1:0] weights[LAYER1_NEURONS][LAYER2_NEURONS], // Programmable weights
    output logic spikes_layer1[LAYER1_NEURONS],               // Spikes from layer 1
    output logic spikes_layer2[LAYER2_NEURONS]                // Spikes from layer 2
);

    // Declare intermediate signals
    logic [CURRENT_WIDTH-1:0] synapse_currents[LAYER1_NEURONS][LAYER2_NEURONS];
    logic [CURRENT_WIDTH-1:0] accumulated_currents[LAYER2_NEURONS];

    // Instantiate layer 1 neurons
    generate
        for (genvar i = 0; i < LAYER1_NEURONS; i++) begin : layer1_neurons
            neuron #(
                .CURRENT_WIDTH(CURRENT_WIDTH),
                .POTENTIAL_WIDTH(POTENTIAL_WIDTH),
                .LEAK_FACTOR_WIDTH(LEAK_FACTOR_WIDTH),
                .THRESHOLD_WIDTH(THRESHOLD_WIDTH),
                .REFRACTORY_COUNTER_WIDTH(REFRACTORY_COUNTER_WIDTH)
            ) neuron_inst (
                .clk(clk),
                .rst_n(rst_n),
                .current_in(inputs[i]),
                .spike_out(spikes_layer1[i])
            );
        end
    endgenerate

    // Instantiate synapses between layer 1 and layer 2
    generate
        for (genvar i = 0; i < LAYER1_NEURONS; i++) begin : layer1_synapses
            for (genvar j = 0; j < LAYER2_NEURONS; j++) begin : layer2_connections
                synapse #(
                    .WEIGHT_WIDTH(WEIGHT_WIDTH),
                    .SPIKE_WIDTH(1),
                    .CURRENT_WIDTH(CURRENT_WIDTH)
                ) synapse_inst (
                    .clk(clk),
                    .rst_n(rst_n),
                    .spike_in(spikes_layer1[i]),
                    .weight(weights[i][j]),
                    .current_out(synapse_currents[i][j])
                );
            end
        end
    endgenerate

    // Accumulate currents for each layer 2 neuron
    always_comb begin
        for (int j = 0; j < LAYER2_NEURONS; j++) begin
            accumulated_currents[j] = '0;
            for (int i = 0; i < LAYER1_NEURONS; i++) begin
                // Add with saturation to prevent overflow
                if (accumulated_currents[j] > ({CURRENT_WIDTH{1'b1}} - synapse_currents[i][j])) begin
                    accumulated_currents[j] = {CURRENT_WIDTH{1'b1}}; // Saturate at maximum
                end else begin
                    accumulated_currents[j] = accumulated_currents[j] + synapse_currents[i][j];
                end
            end
        end
    end

    // Instantiate layer 2 neurons
    generate
        for (genvar j = 0; j < LAYER2_NEURONS; j++) begin : layer2_neurons
            neuron #(
                .CURRENT_WIDTH(CURRENT_WIDTH),
                .POTENTIAL_WIDTH(POTENTIAL_WIDTH),
                .LEAK_FACTOR_WIDTH(LEAK_FACTOR_WIDTH),
                .THRESHOLD_WIDTH(THRESHOLD_WIDTH),
                .REFRACTORY_COUNTER_WIDTH(REFRACTORY_COUNTER_WIDTH)
            ) neuron_inst (
                .clk(clk),
                .rst_n(rst_n),
                .current_in(accumulated_currents[j]),
                .spike_out(spikes_layer2[j])
            );
        end
    endgenerate

endmodule

Lastly, the paper asks for an SPI module. My prompt was significantly different, but I think it exposes a valuable observation about LLMs.

Prompt: Okay now that our neuron and network modules are working I want you to create an SPI interface for this neuron array/network. So that this can be created in a simple ASIC. You will need to design how you want the interface to work. Please include a top module which instantiates the network and the spi interface. Connect all the necessary wires and ports.

Claude: I'll design an SPI interface for your spiking neural network ASIC. This will allow the network to be configured and controlled via a standard SPI connection.

There are numerous errors here. Most notably, the port list is completely wrong: it defines the inputs and outputs typical of an SPI bus, but also many that are not. With that said, I think the bad port list has more to do with my prompt; at this point in the exercise, I'm realizing that Claude, and probably other LLMs, are best suited to simplified tasks or functions. I later repeated this prompt but asked it to generate just a standard standalone SPI bus and got much better results. See the [spi_interface](../blob/main/SpikingNeuronArray/src/spi_interface.sv) module for these results.
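
For readers unfamiliar with what a "standard standalone SPI bus" has to do, the C++ sketch below models a byte-wide SPI peripheral as a pair of shift registers. It assumes mode 0 style sampling, MSB-first transfers, and 8-bit words, and it simplifies timing by shifting the transmit register on the same call that samples MOSI. All names are hypothetical; this is not the project's spi_interface.sv.

```cpp
#include <cstdint>

// Hypothetical byte-wide SPI peripheral model (MSB first, 8-bit words assumed).
struct SpiShiftModel {
    uint8_t rx_shift = 0;   // bits received so far
    uint8_t tx_shift = 0;   // byte being sent; its MSB is presented on MISO
    int bit_count = 0;      // bits shifted in the current byte
    uint8_t rx_byte = 0;    // last complete received byte
    bool rx_valid = false;  // true right after the edge that completes a byte

    void load_tx(uint8_t b) { tx_shift = b; }

    bool miso() const { return (tx_shift & 0x80) != 0; }

    // Call once per rising SCK edge while chip select is asserted.
    void rising_edge(bool mosi) {
        rx_valid = false;
        rx_shift = static_cast<uint8_t>((rx_shift << 1) | (mosi ? 1 : 0));
        tx_shift = static_cast<uint8_t>(tx_shift << 1);  // simplification: same edge
        if (++bit_count == 8) {
            rx_byte = rx_shift;
            rx_valid = true;
            bit_count = 0;
        }
    }

    // Chip select deasserted: abandon any partially received byte.
    void deselect() {
        bit_count = 0;
        rx_shift = 0;
        rx_valid = false;
    }
};

// Exchange one byte with the model, the way a controller would.
uint8_t transfer(SpiShiftModel& dev, uint8_t out) {
    uint8_t in = 0;
    for (int bit = 7; bit >= 0; --bit) {
        in = static_cast<uint8_t>((in << 1) | (dev.miso() ? 1 : 0));  // sample MISO
        dev.rising_edge(((out >> bit) & 1) != 0);                     // drive MOSI
    }
    return in;
}
```

The model needs only the four bus signals plus a byte and a valid flag on the internal side, which is roughly the scope that the narrower prompt asked for. Register maps and network control can then be layered on top in a separate module.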

Initial Interface

module spi_interface #(
    parameter int LAYER1_NEURONS = 3,          // Number of neurons in layer 1
    parameter int LAYER2_NEURONS = 3,          // Number of neurons in layer 2
    parameter int CURRENT_WIDTH = 8,           // Width of current signals
    parameter int WEIGHT_WIDTH = 8,            // Width of synaptic weights
    parameter int ADDR_WIDTH = 8,              // Address width
    parameter int DATA_WIDTH = 8               // Data width (same as CURRENT_WIDTH and WEIGHT_WIDTH)
) (
    // Clock and reset
    input logic clk,                           // System clock
    input logic rst_n,                         // Active low reset

    // SPI interface
    input  logic spi_sck,                      // SPI clock
    input  logic spi_cs_n,                     // SPI chip select (active low)
    input  logic spi_mosi,                     // SPI master out, slave in
    output logic spi_miso,                     // SPI master in, slave out

    // Interface to network
    output logic [CURRENT_WIDTH-1:0] inputs[LAYER1_NEURONS],                 // Input currents to layer 1
    output logic [WEIGHT_WIDTH-1:0] weights[LAYER2_NEURONS][LAYER1_NEURONS], // Synapse weights
    input  logic spikes_layer2[LAYER2_NEURONS],                              // Output spikes from layer 2

    // Control signals
    output logic network_enable,                // Enable signal for the network
    output logic network_reset_n                // Reset signal for the network
);

    // Internal registers
    logic [ADDR_WIDTH-1:0] spi_addr;          // Address register
    logic [DATA_WIDTH-1:0] spi_data_in;       // Data input register
    logic [DATA_WIDTH-1:0] spi_data_out;      // Data output register

    // SPI state machine
    enum logic [2:0] {
        IDLE,       // Idle state
        CMD,        // Command/address receiving
        ADDR,       // Address receiving
        WRITE_DATA, // Data receiving for write
        READ_DATA   // Data sending for read
    } spi_state;

    // SPI bit counter
    logic [3:0] bit_counter;

    // Memory mapped registers
    // 0x00-0x0F: Control registers
    // - 0x00: Control register (bit 0: network_enable, bit 1: network_reset_n)
    // 0x10-0x1F: Reserved
    // 0x20-0x2F: Input currents (3 neurons)
    // 0x30-0xFF: Weights (3x3 = 9 weights)

    logic [DATA_WIDTH-1:0] control_reg;

    // Command types
    localparam logic READ_CMD = 1'b0;
    localparam logic WRITE_CMD = 1'b1;

    // Current command
    logic cmd_type;

    // SPI receiver logic
    always_ff @(posedge spi_sck or negedge rst_n) begin
        if (!rst_n) begin
            spi_state <= IDLE;
            bit_counter <= '0;
            spi_addr <= '0;
            spi_data_in <= '0;
            cmd_type <= READ_CMD;
        end else if (!spi_cs_n) begin
            case (spi_state)
                IDLE: begin
                    // Start receiving command/address
                    spi_data_in <= {spi_data_in[DATA_WIDTH-2:0], spi_mosi};
                    bit_counter <= bit_counter + 1'b1;

                    // After receiving 8 bits, move to address state
                    if (bit_counter == 7) begin
                        bit_counter <= '0;
                        spi_state <= CMD;
                    end
                end

                CMD: begin
                    // Extract command type (MSB)
                    cmd_type <= spi_data_in[7];

                    // Move to appropriate state based on command
                    spi_state <= ADDR;
                    spi_addr <= spi_data_in[ADDR_WIDTH-1:0];
                    bit_counter <= '0;
                    spi_data_in <= '0;

                    if (spi_data_in[7] == WRITE_CMD) begin
                        spi_state <= WRITE_DATA;
                    end else begin
                        spi_state <= READ_DATA;
                        // Preload output data based on address
                        spi_data_out <= get_register_value(spi_data_in[ADDR_WIDTH-1:0]);
                    end
                end

                WRITE_DATA: begin
                    // Shift in data bits
                    spi_data_in <= {spi_data_in[DATA_WIDTH-2:0], spi_mosi};
                    bit_counter <= bit_counter + 1'b1;

                    // After receiving 8 bits, process the write
                    if (bit_counter == 7) begin
                        bit_counter <= '0;
                        set_register_value(spi_addr, {spi_data_in[DATA_WIDTH-2:0], spi_mosi});
                        spi_state <= IDLE;
                    end
                end

                READ_DATA: begin
                    // Shift out data bits
                    spi_data_out <= {spi_data_out[DATA_WIDTH-2:0], 1'b0};
                    bit_counter <= bit_counter + 1'b1;

                    // After sending 8 bits, go back to idle
                    if (bit_counter == 7) begin
                        bit_counter <= '0;
                        spi_state <= IDLE;
                    end
                end

                default: spi_state <= IDLE;
            endcase
        end else begin
            // When CS is high, reset the state machine
            spi_state <= IDLE;
            bit_counter <= '0;
        end
    end

    // SPI transmitter logic (MISO)
    assign spi_miso = (spi_state == READ_DATA) ? spi_data_out[DATA_WIDTH-1] : 1'b0;

    // Map control register to control signals
    assign network_enable = control_reg[0];
    assign network_reset_n = control_reg[1];

    // Function to get register value based on address
    function logic [DATA_WIDTH-1:0] get_register_value(logic [ADDR_WIDTH-1:0] addr);
        logic [DATA_WIDTH-1:0] result;

        if (addr == 8'h00) begin
            // Control register
            result = control_reg;
        end else if (addr >= 8'h20 && addr < 8'h20 + LAYER1_NEURONS) begin
            // Input currents
            result = inputs[addr - 8'h20];
        end else if (addr >= 8'h30 && addr < 8'h30 + (LAYER2_NEURONS * LAYER1_NEURONS)) begin
            // Weights - calculate indices
            logic [7:0] offset = addr - 8'h30;
            logic [7:0] row = offset / LAYER1_NEURONS;
            logic [7:0] col = offset % LAYER1_NEURONS;

            result = weights[row][col];
        end else if (addr >= 8'hF0 && addr < 8'hF0 + LAYER2_NEURONS) begin
            // Read spike outputs (one bit per neuron, packed into bytes)
            if (addr == 8'hF0) begin
                // First byte contains spikes for neurons 0-7
                result = {
                    spikes_layer2[0],
                    spikes_layer2[1],
                    spikes_layer2[2],
                    1'b0, 1'b0, 1'b0, 1'b0, 1'b0
                };
            end else begin
                result = 8'h00;  // For addresses beyond our neuron count
            end
        end else begin
            // Reserved addresses return 0
            result = 8'h00;
        end

        return result;
    endfunction

    // Procedure to set register value based on address
    function void set_register_value(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] data);
        if (addr == 8'h00) begin
            // Control register
            control_reg <= data;
        end else if (addr >= 8'h20 && addr < 8'h20 + LAYER1_NEURONS) begin
            // Input currents
            inputs[addr - 8'h20] <= data;
        end else if (addr >= 8'h30 && addr < 8'h30 + (LAYER2_NEURONS * LAYER1_NEURONS)) begin
            // Weights - calculate indices
            logic [7:0] offset = addr - 8'h30;
            logic [7:0] row = offset / LAYER1_NEURONS;
            logic [7:0] col = offset % LAYER1_NEURONS;

            weights[row][col] <= data;
        end
        // Note: Spike outputs are read-only, so no write handling needed
    endfunction

    // Initialize registers
    initial begin
        control_reg = 8'h00;  // Network disabled and in reset by default

        // Initialize inputs and weights to zero
        for (int i = 0; i < LAYER1_NEURONS; i++) begin
            inputs[i] = '0;
            for (int j = 0; j < LAYER2_NEURONS; j++) begin
                weights[j][i] = '0;
            end
        end
    end

endmodule

Summary of Results

My key takeaway from this exercise is that LLMs offer a quick and efficient way to get a project going and to build early prototypes. Notably, Claude was able to get my toolchain and workspace set up without error within only a few minutes. As for coding, Claude generally did well at the beginning of this exercise, but as the project's complexity grew, it made needless mistakes that I don't think most programmers would make. It especially struggled once I asked it to create an ASIC with an SPI interface to the neuromorphic array. I think this highlights the key bottleneck of LLMs today, which is their limited ability to reason through a problem. While writing this, I became curious about why this is, so I asked Claude.

Prompt: Why do LLMs like yourself struggle with system level tasks. Especially in software or hardware design?

Claude: Large Language Models like me struggle with system-level tasks in software and hardware design for several fundamental reasons

Limited causal reasoning - System design requires understanding complex cause-effect relationships across different components. While I can describe individual components well, I struggle to model how they interact in a complete dynamic system over time...

I then asked a follow up to get more of the technical details behind this:

Prompt: From a more technical perspective and less personified way, why do LLMs struggle with these types of tasks?

Claude: From a technical perspective, LLMs struggle with system-level tasks in software and hardware design due to several fundamental limitations:

1. Training objective mismatch: LLMs are trained to predict tokens in a sequence based on statistical patterns in text data, not to solve complex system design problems that require optimization across multiple interdependent variables and constraints.

2. Representational limitations: The dense vector representations used in transformer architectures don't naturally encode hierarchical structures or causal relationships crucial for system design. System-level understanding requires modeling components at different levels of abstraction simultaneously.