96 Asynchronous-AER Spike Router (4-phase REQ/ACK, 16-entry routing table, GF180)

  • Author: Prof. Santhosh Sivasubramani, IIT Delhi
  • Description: Synchronous GF180mcuD implementation of a 4-input / 4-output AER (address-event-representation) spike router. The time-multiplexed AER input and output buses each carry one event at a time as a 2-bit channel tag plus a 6-bit address. Each incoming event is captured through a double-flop-synchronised 4-phase bundled-data REQ/ACK handshake, looked up in a 16-entry {dst_ch, dst_addr} routing table (or passed through unchanged in bypass mode), queued in a 4-deep FIFO, and emitted on the output port with a second 4-phase REQ/ACK handshake. Events dropped on FIFO overflow are counted in a 16-bit saturating drop counter with a sticky overflow_ever bit. Per-input 8-bit saturating event counters and the IN_LAST / OUT_LAST debug latches are readable over SPI. The signoff clock is 25 MHz.
  • Clock: 25000000 Hz (25 MHz)

How it works

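A 4-input / 4-output AER (Address-Event-Representation) spike router implemented entirely in GF180mcuD digital logic. The external AER handshakes are 4-phase bundled-data REQ/ACK and are asynchronous at the pins; the incoming handshake signals are double-flop-synchronised into a single clock domain, which keeps the internal logic fully synchronous and confines static timing analysis (STA) to one clock.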

Event flow

 (upstream sender)                               (downstream receiver)
   ui_in[7:6] = src_ch      +-------------+       uo_out[7:6] = dst_ch
   ui_in[5:0] = src_addr    |             |       uo_out[5:0] = dst_addr
   uio[4]     = in_req  --> |   ROUTER    | -->   uio[6]      = out_req
   uio[5]     = in_ack  <-- |             | <--   uio[7]      = out_ack
                            +------+------+
                                   |
                                SPI slave
                            (uio[0..3] = cs/mosi/miso/sck)

Internal pipeline per event:

 IS_IDLE   --(in_req rising, global_en=1)   --> IS_LOOKUP
 IS_LOOKUP --(compute route_out)            --> IS_PUSH
 IS_PUSH   --(push FIFO, or drop++)         --> IS_ACK
 IS_ACK    --(drive in_ack=1, wait in_req=0)--> IS_IDLE
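
The input-side sequence above can be sketched as a small cycle-based Python model. This is an illustrative behavioural sketch, not the RTL: the names `IS`, `InputFsm` and `step`, the one-state-per-clock timing, and the way a rising edge is detected are all assumptions. Only the state names and transition conditions come from the list above.

```python
from enum import Enum, auto


class IS(Enum):
    IDLE = auto()
    LOOKUP = auto()
    PUSH = auto()
    ACK = auto()


class InputFsm:
    """Hypothetical cycle-based model of the input-side handshake FSM."""

    def __init__(self):
        self.state = IS.IDLE
        self.in_ack = 0
        self.prev_req = 0  # previous synchronised in_req, used for edge detection

    def step(self, in_req, global_en=1):
        """Advance one clock; in_req is the already-synchronised request level."""
        rising = in_req == 1 and self.prev_req == 0
        self.prev_req = in_req
        if self.state is IS.IDLE:
            if rising and global_en:
                self.state = IS.LOOKUP
        elif self.state is IS.LOOKUP:
            self.state = IS.PUSH       # route_out would be computed here
        elif self.state is IS.PUSH:
            self.state = IS.ACK        # FIFO push (or drop++) would happen here
            self.in_ack = 1            # assumption: in_ack rises on entry to IS_ACK
        elif self.state is IS.ACK:
            if in_req == 0:            # sender has released REQ
                self.in_ack = 0
                self.state = IS.IDLE
        return self.state


fsm = InputFsm()
trace = [fsm.step(req) for req in (1, 1, 1, 1, 0)]
print([s.name for s in trace])
```

In this model the sender holds in_req high for four clocks and then releases it; the FSM walks LOOKUP, PUSH, ACK, waits in ACK, and returns to IDLE once in_req is low.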

Output side drains the FIFO with a symmetric 4-phase handshake:

 OS_IDLE     --(fifo non-empty)--> OS_DRIVE     (load uo_out, raise out_req)
 OS_DRIVE    --(out_ack=1)     --> OS_WAIT_REL  (drop out_req, pop FIFO)
 OS_WAIT_REL --(out_ack=0)     --> OS_IDLE
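
The output-side drain can be modelled the same way. The sketch below is a hypothetical behavioural model (the class `OutputFsm`, its `step` method and the use of a `deque` as the FIFO are assumptions); the three states and their actions follow the list above.

```python
from collections import deque


class OutputFsm:
    """Hypothetical cycle-based model of the output-side drain FSM."""

    def __init__(self, fifo):
        self.fifo = fifo          # deque of 8-bit payloads {dst_ch, dst_addr}
        self.state = "OS_IDLE"
        self.uo_out = 0
        self.out_req = 0

    def step(self, out_ack):
        """Advance one clock; out_ack is the already-synchronised ACK level."""
        if self.state == "OS_IDLE":
            if self.fifo:
                self.uo_out = self.fifo[0]   # load payload, then raise REQ
                self.out_req = 1
                self.state = "OS_DRIVE"
        elif self.state == "OS_DRIVE":
            if out_ack:
                self.out_req = 0             # drop REQ and pop the FIFO head
                self.fifo.popleft()
                self.state = "OS_WAIT_REL"
        elif self.state == "OS_WAIT_REL":
            if not out_ack:                  # receiver released ACK
                self.state = "OS_IDLE"
        return self.state


fifo = deque([0x6A, 0x15])
out = OutputFsm(fifo)
for ack in (0, 1, 0):
    print(out.step(ack), hex(out.uo_out), out.out_req)
```

One event is consumed per full REQ/ACK cycle, so a slow receiver directly limits the drain rate of the FIFO.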

Routing table

The routing table has 16 entries of 8 bits each: {dst_ch[1:0], dst_addr[5:0]}. It is indexed by the low 4 bits of the source address (src_addr[3:0]); src_ch and src_addr[5:4] do not take part in the lookup. This is a deliberate design simplification: 4 source channels × 64 addresses form a 256-entry space, but only 16 routing slots exist on-chip. All events whose source addresses share the same low nibble therefore use the same routing slot; for example, src_addr 0x05, 0x15, 0x25 and 0x35 on any channel all select ROUTE[5]. This is enough for most AER test workloads (neuron cluster → neuron cluster remapping) without blowing the area budget.

With CTRL.bypass = 1 the table is skipped and the output payload is exactly the input payload ({src_ch, src_addr}).
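
A minimal Python sketch of the lookup, assuming the 8-bit payload is packed as {ch[1:0], addr[5:0]} as on the ui_in / uo_out pins. The function name `route` and the list-based table are illustrative, not part of the design.

```python
def route(payload, table, bypass=False):
    """Return the 8-bit output payload for an 8-bit input {src_ch, src_addr}."""
    if bypass:
        return payload & 0xFF            # CTRL.bypass = 1: pass through unchanged
    src_addr = payload & 0x3F            # low 6 bits are the source address
    return table[src_addr & 0x0F] & 0xFF  # index = src_addr[3:0]


table = [0x00] * 16
table[5] = (1 << 6) | 0x2A               # ROUTE[5] = {dst_ch=1, dst_addr=0x2A}

event = (3 << 6) | 0x25                  # {src_ch=3, src_addr=0x25}
print(hex(route(event, table)))          # routed through ROUTE[5]
print(hex(route(event, table, bypass=True)))
```

Because only the low nibble is used, `route((0 << 6) | 0x15, table)` selects the same ROUTE[5] entry as the event above.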

Back-pressure — drops and drop counter

The output FIFO is 4 events deep. If a new event arrives while the FIFO is full, it is dropped:

  • drop_cnt (16-bit saturating, regs 0x02..0x03) increments.
  • STATUS.overflow_ever (sticky bit) is latched.
  • The per-input evt_cnt counter is still incremented (the event was captured; it just didn't fit).

drop_cnt and the evt_cnt counters are cleared by writing 1 to CTRL.clear_drop and CTRL.clear_evt respectively. Both bits are self-clearing 1-cycle pulses.
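
The drop behaviour can be sketched as follows. This is a hypothetical model (class and method names are made up for illustration); it covers only what the text states: a 4-deep FIFO, a 16-bit saturating drop counter, a sticky overflow flag, and 8-bit saturating per-input counters that increment even for dropped events. The effect of the clear bits on overflow_ever is not described above, so it is left out.

```python
from collections import deque

FIFO_DEPTH = 4
DROP_MAX = 0xFFFF   # 16-bit saturating
EVT_MAX = 0xFF      # 8-bit saturating


class BackpressureModel:
    """Illustrative model of FIFO overflow accounting."""

    def __init__(self):
        self.fifo = deque()
        self.drop_cnt = 0
        self.overflow_ever = False
        self.evt_cnt = [0, 0, 0, 0]

    def push(self, src_ch, routed_payload):
        """Capture one event; return True if queued, False if dropped."""
        # The event counter increments whether or not the event fits.
        self.evt_cnt[src_ch] = min(self.evt_cnt[src_ch] + 1, EVT_MAX)
        if len(self.fifo) >= FIFO_DEPTH:
            self.drop_cnt = min(self.drop_cnt + 1, DROP_MAX)
            self.overflow_ever = True    # sticky
            return False
        self.fifo.append(routed_payload)
        return True


m = BackpressureModel()
results = [m.push(0, 0x6A) for _ in range(6)]   # no draining in between
print(results, m.drop_cnt, m.overflow_ever, m.evt_cnt)
```

With six back-to-back events and no drain, the first four are queued and the last two are dropped, while evt_cnt[0] still counts all six.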

Register map

Addr        Name           Description
0x00        CTRL           {4'd0, clear_drop, clear_evt, bypass, global_en}
0x01        STATUS         {fifo_full, fifo_empty, overflow_ever, out_busy, in_busy, fifo_cnt[2:0]}
0x02        DROP_LO        drop counter low byte
0x03        DROP_HI        drop counter high byte
0x04..0x07  EVT_CNT[0..3]  8-bit saturating per-input event counters
0x08        IN_LAST        {src_ch, src_addr} of most recent captured event
0x09        OUT_LAST       {dst_ch, dst_addr} of most recent emitted event
0x10..0x1F  ROUTE[0..15]   8-bit {dst_ch, dst_addr}; index = src_addr[3:0]
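
For host software, the bit fields can be packed and unpacked as in the sketch below. It assumes the braces in the register map are Verilog-style concatenations, so the leftmost field occupies the most significant bits; the helper names are hypothetical.

```python
def encode_ctrl(global_en=0, bypass=0, clear_evt=0, clear_drop=0):
    """Pack CTRL = {4'd0, clear_drop, clear_evt, bypass, global_en}."""
    return (clear_drop << 3) | (clear_evt << 2) | (bypass << 1) | global_en


def decode_status(value):
    """Unpack STATUS = {fifo_full, fifo_empty, overflow_ever,
    out_busy, in_busy, fifo_cnt[2:0]}."""
    return {
        "fifo_full": (value >> 7) & 1,
        "fifo_empty": (value >> 6) & 1,
        "overflow_ever": (value >> 5) & 1,
        "out_busy": (value >> 4) & 1,
        "in_busy": (value >> 3) & 1,
        "fifo_cnt": value & 0x7,
    }


def drop_count(drop_lo, drop_hi):
    """Combine DROP_LO (0x02) and DROP_HI (0x03) into the 16-bit count."""
    return ((drop_hi & 0xFF) << 8) | (drop_lo & 0xFF)


print(hex(encode_ctrl(global_en=1)))
print(decode_status(0x84))
print(drop_count(0x34, 0x12))
```

Under that bit-order assumption, `encode_ctrl(global_en=1)` gives 0x01 (enabled, route mode) and `encode_ctrl(global_en=1, bypass=1)` gives 0x03.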

Pinout

  • ui_in[5:0]: source address (6 bits)
  • ui_in[7:6]: source channel tag (2 bits, selects which of 4 inputs)
  • uo_out[5:0]: destination address (6 bits)
  • uo_out[7:6]: destination channel tag (2 bits)
  • uio[0]: spi_cs_n (in)
  • uio[1]: spi_mosi (in)
  • uio[2]: spi_miso (out)
  • uio[3]: spi_sck (in)
  • uio[4]: in_req (in, REQ from upstream sender)
  • uio[5]: in_ack (out, ACK to upstream sender)
  • uio[6]: out_req (out, REQ to downstream receiver)
  • uio[7]: out_ack (in, ACK from downstream receiver)

How to test

# Register addresses, taken from the register map
R_CTRL       = 0x00
R_ROUTE_BASE = 0x10
# spi_write(addr, data) is the host's SPI register-write helper (not shown here)

# 1. Enable, route mode
spi_write(R_CTRL, 0x01)              # global_en=1, bypass=0
# 2. Program a route: src_addr[3:0]=5 -> {dst_ch=1, dst_addr=0x2A}
spi_write(R_ROUTE_BASE + 5, 0x6A)    # (1<<6)|0x2A
# 3. Host 4-phase send on input port:
#    drive {src_ch=3, src_addr=0x25} on ui_in (0xE5), then raise in_req on uio[4];
#    when chip asserts in_ack on uio[5], drop in_req; when in_ack drops, done
# 4. Consume on output port:
#    wait for chip to raise out_req on uio[6]; read uo_out = 0x6A
#    (src_addr 0x25 has low nibble 5, so ROUTE[5] is used);
#    raise out_ack on uio[7]; when chip drops out_req, drop out_ack
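
Step 3 above can be written as a host-side helper. The sketch below is only an illustration: the `port` object with `ui_in`, `in_req`, `in_ack` attributes and a `tick()` method is a hypothetical stand-in for whatever pin access the test bench provides, and `StubPort` is a trivial fake receiver included so the example runs on its own.

```python
def _wait(port, condition, max_polls):
    """Poll until condition() holds, advancing the (hypothetical) port clock."""
    for _ in range(max_polls):
        if condition():
            return
        port.tick()
    raise TimeoutError("handshake stalled")


def aer_send(port, src_ch, src_addr, max_polls=1000):
    """4-phase bundled-data send: data, REQ up, ACK up, REQ down, ACK down."""
    port.ui_in = ((src_ch & 0x3) << 6) | (src_addr & 0x3F)  # data before REQ
    port.in_req = 1
    _wait(port, lambda: port.in_ack == 1, max_polls)
    port.in_req = 0
    _wait(port, lambda: port.in_ack == 0, max_polls)


class StubPort:
    """Fake receiver for trying out aer_send; not a model of the chip."""

    def __init__(self):
        self.ui_in = 0
        self.in_req = 0
        self.in_ack = 0
        self.captured = []

    def tick(self):
        if self.in_req and not self.in_ack:
            self.captured.append(self.ui_in)   # latch payload, raise ACK
            self.in_ack = 1
        elif not self.in_req and self.in_ack:
            self.in_ack = 0                    # release ACK after REQ drops


port = StubPort()
aer_send(port, src_ch=3, src_addr=0x25)
print([hex(v) for v in port.captured])
```

The timeout in `_wait` is a design choice for the host side: with global_en = 0 the router does not accept events, so an unbounded wait on in_ack would hang the test.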

External hardware

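Any AER-compliant source (silicon retina, spiking sensor emulator, FPGA event generator) and any AER-compliant sink. Upstream and downstream both use a 4-phase bundled-data REQ/ACK protocol. The handshake inputs (in_req from the sender and out_ack from the receiver) are double-flop-synchronised at the input boundary, so the router tolerates arbitrary handshake timing relative to its clock on either side. As in any bundled-data protocol, the sender must hold the payload stable from before REQ rises until ACK is seen.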
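
The effect of double-flop synchronisation on handshake latency can be illustrated with a small model. This is a generic two-flop synchroniser sketch, not the project's RTL; the class name is hypothetical, and the 25 MHz figure is the signoff clock quoted in the design summary.

```python
class TwoFlopSync:
    """Generic two-stage synchroniser: output lags the input by two clock edges."""

    def __init__(self):
        self.ff1 = 0
        self.ff2 = 0

    def clock(self, async_in):
        """Apply one clock edge and return the synchronised value."""
        self.ff2, self.ff1 = self.ff1, async_in
        return self.ff2


CLOCK_HZ = 25_000_000
period_ns = 1e9 / CLOCK_HZ            # 40 ns per clock at 25 MHz

sync = TwoFlopSync()
seen = [sync.clock(level) for level in (1, 1, 1, 0, 0, 0)]
print(seen, period_ns)
```

In this model a change on in_req or out_ack becomes visible to the FSMs on the second clock edge after it occurs, so each handshake phase costs at least about two clock periods on top of the FSM's own cycles.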

IO

#  Input                                Output                               Bidirectional
0  src_addr[0] (AER in, LSB)            dst_addr[0] (AER out, LSB)           spi_cs_n (in)
1  src_addr[1]                          dst_addr[1]                          spi_mosi (in)
2  src_addr[2]                          dst_addr[2]                          spi_miso (out)
3  src_addr[3]                          dst_addr[3]                          spi_sck (in)
4  src_addr[4]                          dst_addr[4]                          in_req (AER in, REQ from upstream sender)
5  src_addr[5] (MSB of 6-bit src addr)  dst_addr[5] (MSB of 6-bit dst addr)  in_ack (AER in, ACK to upstream sender)
6  src_ch[0] (AER in channel tag LSB)   dst_ch[0] (AER out channel tag LSB)  out_req (AER out, REQ to downstream receiver)
7  src_ch[1] (AER in channel tag MSB)   dst_ch[1] (AER out channel tag MSB)  out_ack (AER out, ACK from downstream receiver)
