577 SPI-CPU

  • Author: MS College of Engineering
  • Description: A 4-bit CPU using the SPI Flash RAM from the QSPI PMOD to load programs.
  • Clock: 30000000 Hz

Info - TinyTapeout SPI Microcoded CPU

How it Works

This project implements a compact, 4-bit microcoded CPU designed for the TinyTapeout (GF180MCU) platform. Rather than using limited on-chip area for program storage, the CPU fetches and executes its instructions directly from an external SPI RAM (such as a physical 23LC512 memory chip or an RP2040 microcontroller emulating it).

Hardware Architecture

The design is split into three main functional blocks:

  1. TinyTapeout Wrapper (tt_um_spi_cpu_top): Handles top-level ASIC pins and bridges them to the internal logic.
  2. SPI Fetch & CPU Wrapper (spi_wrap): Manages the 12-bit Program Counter (PC), an FSM to decode instructions fetched over SPI, and a byte-wide SPI master engine (spi_read_byte).
  3. Execution Unit Datapath (ExecutionUnit): Based on the Aeolus CPU Core topology, it coordinates an 8-bit Accumulator (ACC), a 4-bit Register File (Registers A, B, and O), an 8-bit shift register with an overflow flag (SF), and a 4-bit slice ALU.

Instruction Fetch & Execution Pipeline

To optimize memory bandwidth, every byte fetched from the SPI RAM packs two 4-bit micro-operations:

  • opcode1 = spi_data[3:0] (Executed first)
  • opcode2 = spi_data[7:4] (Executed second)
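
The nibble packing above can be sketched in a few lines of Python, for example when preparing a program image for the SPI RAM. The helper names `split_opcodes` and `pack_opcodes` are illustrative and do not come from the RTL.

```python
def split_opcodes(spi_data: int) -> tuple[int, int]:
    """Split one fetched byte into (opcode1, opcode2)."""
    if not 0 <= spi_data <= 0xFF:
        raise ValueError("spi_data must fit in 8 bits")
    opcode1 = spi_data & 0x0F         # spi_data[3:0], executed first
    opcode2 = (spi_data >> 4) & 0x0F  # spi_data[7:4], executed second
    return opcode1, opcode2


def pack_opcodes(opcode1: int, opcode2: int) -> int:
    """Inverse of split_opcodes: build the byte stored in SPI RAM."""
    if not (0 <= opcode1 <= 0xF and 0 <= opcode2 <= 0xF):
        raise ValueError("opcodes must fit in 4 bits")
    return (opcode2 << 4) | opcode1
```

Note that the first micro-operation to execute sits in the low nibble, so a byte written as 0xA5 runs opcode 0x5 before opcode 0xA.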

The spi_wrap controller cycles through a sequential Finite State Machine (FSM):

  • S_FETCH_START: Triggers a memory read when the SPI engine is idle.
  • S_FETCH_WAIT_OPCODE: Waits for the transaction to finish and latches the instruction byte.
  • S_EXECUTE_1: Sets the execution bus to opcode1 and pulses cpu_start.
  • S_EXECUTE_2: Sets the execution bus to opcode2, pulses cpu_start, increments the PC, and loops back to fetch the next pair.
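
The state sequence listed above can be modelled behaviourally as follows. This is a simplified sketch, not the RTL: the SPI transaction is collapsed into a single step, `trace_opcodes` is a hypothetical helper name, and wrapping the PC at 12 bits is an assumption based on the PC width stated in the architecture description.

```python
from enum import Enum, auto


class State(Enum):
    S_FETCH_START = auto()
    S_FETCH_WAIT_OPCODE = auto()
    S_EXECUTE_1 = auto()
    S_EXECUTE_2 = auto()


PC_MASK = 0xFFF  # assumed wrap for a 12-bit program counter


def trace_opcodes(memory: bytes, num_bytes: int, pc: int = 0) -> list[tuple[int, int]]:
    """Return (pc, opcode) pairs in the order the FSM would issue them."""
    issued = []
    state = State.S_FETCH_START
    latched = 0
    fetched = 0
    while fetched < num_bytes or state is not State.S_FETCH_START:
        if state is State.S_FETCH_START:
            # Trigger a memory read (SPI engine assumed idle).
            state = State.S_FETCH_WAIT_OPCODE
        elif state is State.S_FETCH_WAIT_OPCODE:
            # Transaction finished: latch the instruction byte.
            latched = memory[pc]
            fetched += 1
            state = State.S_EXECUTE_1
        elif state is State.S_EXECUTE_1:
            issued.append((pc, latched & 0x0F))         # opcode1
            state = State.S_EXECUTE_2
        else:
            issued.append((pc, (latched >> 4) & 0x0F))  # opcode2
            pc = (pc + 1) & PC_MASK
            state = State.S_FETCH_START
    return issued
```

For a two-byte image `bytes([0x21, 0x43])`, the model issues opcodes 1, 2, 3, 4 in that order, two per program counter value.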

The underlying spi_read_byte module executes a standard 23LC512 READ (0x03) command sequence, transmitting {8'h03, 16'b0, pc} MSB-first over MOSI before shifting in the payload.
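
As a rough illustration of the MSB-first framing, the sketch below builds the MOSI bit sequence for a 23LC512 READ as the datasheet defines it: an 8-bit instruction (0x03) followed by a 16-bit address. It assumes the program counter is zero-extended to fill the address field; the exact concatenation widths used inside spi_read_byte may differ, and `read_command_bits` is a hypothetical helper name.

```python
def read_command_bits(address: int) -> list[int]:
    """MOSI bits, MSB first, for a 23LC512 READ of the given 16-bit address."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    frame = (0x03 << 16) | address  # 8-bit instruction + 16-bit address
    return [(frame >> i) & 1 for i in range(23, -1, -1)]
```

After these 24 bits have been clocked out, the memory drives the addressed byte on MISO over the next eight clock cycles.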

The Microprogram (Firmware)

As a proof-of-concept hardware demonstration, the pre-loaded microcode implements a 4×4-bit to 8-bit software binary multiplier using a shift-and-add algorithm:

  • ui_in[7:4] = Operand A (4-bit)
  • ui_in[3:0] = Operand B (4-bit)
  • uo_out[7:0] = Product Output ($A \times B$)

Conditional instructions such as SNZA and SNZS test the shift register flag, so the microcode adds into the accumulator only on the steps where the shift-and-add algorithm requires it. The multiplication is therefore handled in microcode rather than by dedicated hardware branching logic.
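
For reference, a generic shift-and-add multiplier looks like the Python sketch below. It is the textbook algorithm, not a transcription of the microcode: register allocation and the actual instruction sequence are not reproduced, and `shift_add_multiply` is an illustrative name. The `if` statement plays the role that the conditional skip instructions play on the CPU.

```python
def shift_add_multiply(a: int, b: int) -> int:
    """Multiply two 4-bit operands into an 8-bit product by shift-and-add."""
    if not (0 <= a <= 0xF and 0 <= b <= 0xF):
        raise ValueError("operands must fit in 4 bits")
    acc = 0           # 8-bit accumulator
    multiplicand = a  # shifted left once per step
    multiplier = b    # shifted right once per step
    for _ in range(4):
        if multiplier & 1:
            # Add only when the current multiplier bit is set.
            acc = (acc + multiplicand) & 0xFF
        multiplicand <<= 1
        multiplier >>= 1
    return acc
```

The largest product, 15 × 15 = 225, fits in 8 bits, so the accumulator never overflows for valid operands.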


How to Test

The design is verified with cocotb running on Icarus Verilog (iverilog).

Dependencies

Ensure your environment includes Python 3.11+ and the proper HDL toolchain packages:

pip install cocotb
sudo apt install iverilog

IO

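| # | Input     | Output     | Bidirectional |
|---|-----------|------------|---------------|
| 0 | DATA_A[0] | OUT_REG[0] | SPI_CS_N      |
| 1 | DATA_A[1] | OUT_REG[1] | SPI_MOSI      |
| 2 | DATA_A[2] | OUT_REG[2] | SPI_MISO      |
| 3 | DATA_A[3] | OUT_REG[3] | SPI_SCK       |
| 4 | DATA_B[0] | OUT_REG[4] |               |
| 5 | DATA_B[1] | OUT_REG[5] |               |
| 6 | DATA_B[2] | OUT_REG[6] |               |
| 7 | DATA_B[3] | OUT_REG[7] | CPU_VALID     |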
