xtensa - modrpc/info GitHub Wiki


Core Architecture

  • 24-bit instructions (to reduce code-size) which perform 32-bit operations

Registers

AR: General Registers

  • AR ("address" registers) (cf. coprocessor registers, which serve as "data" registers)
  • If the windowed register option is configured, the address register file is extended and a mapping from virtual to physical registers is used
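
The virtual-to-physical mapping can be sketched in C. This is a minimal model, assuming a 64-entry physical AR file (`NAREG`) and a window base that advances in units of four registers; the function names are made up for illustration and are not part of any Xtensa header.

```c
#include <assert.h>

/* Assumed configuration: 64 physical AR registers. */
#define NAREG 64u

/* Map the visible register a<n> (n = 0..15) to a physical AR index.
 * window_base counts in units of 4 registers and wraps around the file. */
unsigned ar_physical_index(unsigned window_base, unsigned n)
{
    assert(n < 16u);                   /* only a0..a15 are visible */
    assert(window_base < NAREG / 4u);  /* valid window base range */
    return (window_base * 4u + n) % NAREG;
}

/* Rotate the window forward by `incr` groups of 4 registers,
 * wrapping at the end of the physical file. */
unsigned rotate_window(unsigned window_base, unsigned incr)
{
    assert(window_base < NAREG / 4u);
    return (window_base + incr) % (NAREG / 4u);
}
```

With this model, rotating the window by two groups makes the caller's a8 and the callee's a0 name the same physical register, which is how arguments are passed without copying.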

SAR: Shifts and Shift Amount Register

Tools

Linux/Xtensa

Xtensa on QEMU

  • Xtensa on QEMU
  • Prerequisites
    • Install libffi (assuming no root privilege)
      • wget https://sourceware.org/pub/libffi/libffi-3.2.1.tar.gz
      • ./configure --prefix=<mypath>
      • make install
    • Install GLIB
      • wget -c http://ftp.acc.umu.se/pub/GNOME/sources/glib/2.36/glib-2.36.3.tar.xz
      • # allows configure to find libffi
      • setenv PKG_CONFIG_PATH <mypath>/lib/libffi/pkgconfig
      • ./configure --prefix=<mypath>
      • make install
    • Install Automake, Autoconf
    • Install OpenSSL (twice; once for .so; once for .h files)
      • ./config --prefix=/usr/local --openssldir=/usr/local/ssl
      • make && make install
      • ./config shared --prefix=/usr/local --openssldir=/usr/local/ssl
      • make clean
      • make && make install

Cross Compilation Toolchain

TIE (Tensilica Instruction Extension) Language

Overview

  • Create new instructions to increase processor performance and efficiency
  • Exploit data-level parallelism
    • create SIMD registers and operations and perform same operation across multiple elements of vector word
  • Exploit instruction-level parallelism
    • create multi-operation instructions using FLIX (Xtensa Flexible Length Instruction eXtensions) -- a multi-issue VLIW with variable slot widths
  • Increase data bandwidth
    • Ports (GPIO) for point-to-point direct connections without flow control
    • Queue (FIFO) for point-to-point data transfers with flow control
    • Memory lookup interfaces for connecting to arbitrary-width memories or RTL blocks for low-latency data transfers
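
As an illustration of the data-level parallelism bullet above, here is a plain-C model of what a SIMD add does: the same operation is applied to every element of a vector word. The lane layout (four unsigned 8-bit lanes in a 32-bit word) and the name `add4x8` are assumptions for this sketch; a real TIE instruction would do this in hardware in a single operation.

```c
#include <stdint.h>

/* Add four unsigned 8-bit lanes packed into 32-bit words.
 * Each lane wraps on overflow and never carries into its neighbour. */
uint32_t add4x8(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (unsigned lane = 0; lane < 4; lane++) {
        unsigned shift = lane * 8u;
        uint32_t sum = ((a >> shift) & 0xFFu) + ((b >> shift) & 0xFFu);
        result |= (sum & 0xFFu) << shift;  /* keep only this lane's 8 bits */
    }
    return result;
}
```

The loop is only there to describe the semantics; the point of a SIMD extension is that all lanes are computed at once.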

Extending Xtensa Instruction Pipeline

Adding I/O Interfaces to Xtensa

Ports

  • Ports (GPIOs) are wires which directly connect two Xtensa processors or connect an Xtensa processor to external RTL.
  • Up to 1024 wires wide, allowing wide data types to be transferred using single load/store operations.
  • Mostly used to transfer control and status information
  • No handshake mechanism
  • Two TIE Port types
    • Export State: State made visible to the external world
    • Import Wire: External wires made visible to the data path of an Xtensa processor

Queue Interfaces

  • Queue interfaces enable simple connectivity from an Xtensa processor to external synchronous, FIFO devices.
  • Two Queue Types
    • Input Queue interface
    • Output Queue interface

Output queue interface

Three ports are created for every output queue interface. For example, for MY_OQ:

queue MY_OQ 32 out
operation MY_PushQ
   {in AR qdata}
   {out MY_OQ} {
   assign MY_OQ = qdata ;
 }
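  • Ports created:
    • TIE_MY_OQ_Full: input; asserted by the external FIFO when it cannot accept more data
    • TIE_MY_OQ_PushReq: output; asserted by the processor to request a push
    • TIE_MY_OQ[31:0]: output; the 32-bit data word to push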
  • How to use Queue in C
#include <xtensa/tie/outqueue.h>
int data = 12;
MY_PushQ(data);
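
To see how the three ports interact, here is a small software model of the external FIFO side of the handshake. It is only a sketch: `ext_fifo`, `ext_fifo_cycle`, and the depth of 4 are hypothetical names and values, not part of the generated TIE header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXT_FIFO_DEPTH 4u  /* hypothetical depth of the external FIFO */

typedef struct {
    uint32_t slots[EXT_FIFO_DEPTH];
    unsigned count;        /* number of words currently stored */
} ext_fifo;

/* Models the level driven onto TIE_MY_OQ_Full. */
bool ext_fifo_full(const ext_fifo *f)
{
    assert(f->count <= EXT_FIFO_DEPTH);
    return f->count == EXT_FIFO_DEPTH;
}

/* One clock cycle at the FIFO: a word is transferred only when the
 * processor asserts PushReq and the FIFO is not full. Returns true if
 * the word was accepted; false models "no request" or a stalled push. */
bool ext_fifo_cycle(ext_fifo *f, bool push_req, uint32_t data)
{
    if (!push_req || ext_fifo_full(f))
        return false;
    f->slots[f->count++] = data;
    return true;
}
```

In this model a `MY_PushQ(data)` call corresponds to driving `push_req` high with `data` on the bus until `ext_fifo_cycle` returns true.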

Lookup Interfaces

  • For directly connecting to external memories (ROM/RAM) for performing table lookups or for connecting special-purpose hardware with fixed latencies (e.g. other RTL)
    • e.g. strobing

Xtensa Inter-Processor Communications in SOCs

Processor Buses

  • Bus: shared-access HW mechanism allowing one or more processors to communicate with slave memories and I/O blocks
    • in simple designs, slave is accessible only from ONE bus (processor which owns the bus owns the slave)
    • in bus-based, multiprocessor systems, different processors must arbitrate for the bus
  • processors and slaves have a range of bus-transfer requirements or traffic patterns

Bus Design Tradeoffs

  • Bus width and clock rate
  • Arbitration
    • Round-robin arbitration: fair but long latency for time-critical requests
    • Strict-priority arbitration: min latency for critical requests
    • Reserved-bandwidth arbitration: min guaranteed bandwidth over a time interval
  • Transfer types:
    • fixed-block transfers:
    • variable-block transfers:
    • split transactions: decomposition of bus requests (usually a read) into two transfers:
      • one to convey an address from the master to the slave
      • the other to return a response data block from the slave to the master
    • atomic transactions:
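
The difference between the first two arbitration schemes above can be sketched in C. The request encoding (bit i set means master i is requesting), the four-master limit, and the function names are assumptions made for this example.

```c
#include <assert.h>

#define NUM_MASTERS 4

/* Strict priority: the lowest-numbered requesting master always wins,
 * so master 0 sees minimum latency but can starve the others.
 * Returns the granted master, or -1 if nobody is requesting. */
int arbitrate_priority(unsigned requests)
{
    for (int i = 0; i < NUM_MASTERS; i++)
        if (requests & (1u << i))
            return i;
    return -1;
}

/* Round robin: the search starts just after the previous winner, so every
 * requester is served within NUM_MASTERS grants, at the cost of a longer
 * worst-case wait for urgent requests. Pass last_grant = -1 initially. */
int arbitrate_round_robin(unsigned requests, int last_grant)
{
    assert(last_grant >= -1 && last_grant < NUM_MASTERS);
    for (int step = 1; step <= NUM_MASTERS; step++) {
        int i = (last_grant + step) % NUM_MASTERS;
        if (requests & (1u << i))
            return i;
    }
    return -1;
}
```

With masters 1 and 3 both requesting continuously, the priority arbiter grants master 1 every time, while the round-robin arbiter alternates between 1 and 3.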

Bus Implementation with Xtensa

Global memory access over a system bus

Local memory of another processor accessed over a system bus

Multi-ported memory directly connected to processors

Direct connections to local memories

Direct Connections to Xtensa Processor Ports, Queues, and Lookups

Using Ports (GPIO)

Ports: Interrupt-driven Handshake

Using a DMA Controller to Manage Data Transfers over a System Bus

Using Queues (FIFO Interfaces)

  • Buffer streaming data between two points -- input/output queues are read/written as registers.
  • Data rate: each queue can transfer up to 1 Kb (1024 bits) per cycle.
  • Can be used to boost processor I/O transfer rates
  • Queues connect a producer and a consumer that run at different speeds by buffering data between them

  • FIFO queues can be mapped into address spaces of the processors.
  • Store to the address of Queue's tail causes a PUSH
  • Load from the address of Queue's head causes a POP
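
The memory-mapped behaviour above can be modelled with a small ring buffer behind an address decoder. This is a sketch only: the addresses `QUEUE_TAIL_ADDR` and `QUEUE_HEAD_ADDR`, the depth, and the `bus_store`/`bus_load` helpers are hypothetical and stand in for whatever the SoC's bus fabric provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_DEPTH      8u       /* hypothetical queue depth */
#define QUEUE_TAIL_ADDR 0x1000u  /* hypothetical address: stores push here */
#define QUEUE_HEAD_ADDR 0x1004u  /* hypothetical address: loads pop here */

typedef struct {
    uint32_t buf[FIFO_DEPTH];
    unsigned head;   /* index of the oldest word */
    unsigned count;  /* number of words stored */
} mm_fifo;

/* A store to the tail address pushes one word.
 * Returns false for any other address or when the queue is full. */
bool bus_store(mm_fifo *f, uint32_t addr, uint32_t value)
{
    if (addr != QUEUE_TAIL_ADDR || f->count == FIFO_DEPTH)
        return false;
    f->buf[(f->head + f->count) % FIFO_DEPTH] = value;
    f->count++;
    return true;
}

/* A load from the head address pops the oldest word into *value.
 * Returns false for any other address or when the queue is empty. */
bool bus_load(mm_fifo *f, uint32_t addr, uint32_t *value)
{
    assert(value != NULL);
    if (addr != QUEUE_HEAD_ADDR || f->count == 0)
        return false;
    *value = f->buf[f->head];
    f->head = (f->head + 1u) % FIFO_DEPTH;
    f->count--;
    return true;
}
```

A producer that only issues stores and a consumer that only issues loads can then run at different rates, with the buffer absorbing the difference until it fills or drains.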

Using Lookup Interfaces
