864 TinyTensorCore

864 : TinyTensorCore

Design render

How it works

TinyTensorCore is a hardware accelerator for 3×3 integer matrix multiplication, designed to fit in a 1×2 TinyTapeout tile. The core takes two 3×3 matrices of 5-bit signed operands as input and produces a 3×3 matrix of 12-bit signed results.

The design is composed of four modules:

  • tt_um_tinytensorcore — top-level wrapper that maps the TinyTapeout pin convention (ui_in, uo_out, uio_*) to a 10-bit instruction bus and a 12-bit result output.
  • tensor_core_controller — instruction decoder and burst-load FSM. Sequences three opcodes: NOP (0b000), TENSOR_CORE (0b001, triggers the matmul), and BURST (0b010, streams 9 operand pairs in or 9 results out over consecutive cycles).
  • tensor_core_register_file — holds two 3×3 matrices of 5-bit signed values (matrix A and matrix B). Loaded one operand pair per cycle during a burst.
  • tensor_core — combinational MAC array that computes the full 3×3 × 3×3 matmul in a single cycle.

The instruction format is 10 bits wide:

  • instr[2:0] — opcode
  • instr[9:5] — operand A (matrix A element, 5-bit signed) during burst loads
  • instr[4:0] — operand B (matrix B element, 5-bit signed) during burst loads (its low 3 bits double as the opcode field)

The 8-bit ui_in carries instr[7:0], and uio_in[5:4] carries instr[9:8]. Results stream out as 12-bit values over uo_out (low 8 bits) and uio_out[3:0] (high 4 bits, including sign).

How to test

A typical test sequence:

  1. Hold rst_n low for several cycles, then release.
  2. Issue a BURST opcode (0b010) for one cycle to start a load burst.
  3. For the next 9 cycles, present operand pairs on the instruction bus — matrix A element on instr[9:5], matrix B element on instr[4:0]. Hold low 3 bits at 0b000 (NOP) so the FSM doesn't restart.
  4. Wait a few idle cycles, then issue TENSOR_CORE opcode (0b001) for one cycle to compute the matmul.
  5. Issue another BURST opcode to stream the 9 result values out over the next 9 cycles. Each result appears as {uio_out[3:0], uo_out[7:0]} — a 12-bit signed value.

Comprehensive RTL vs gate-level verification was performed using Cadence Xcelium with a 26-test suite (identity, all-ones, sequential, max values, zero matrix, diagonal, and 20 random matrices). RTL and gate-level outputs were bit-exact identical.

External hardware

No external hardware required.

IO

#InputOutputBidirectional
0instr[0]result[0]result[8]
1instr[1]result[1]result[9]
2instr[2]result[2]result[10]
3instr[3]result[3]result[11] (sign)
4instr[4]result[4]instr[8]
5instr[5]result[5]instr[9]
6instr[6]result[6]
7instr[7]result[7]

Chip location

Controller Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Analog Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Analog Mux Mux Mux Mux Analog Mux Mux Mux Mux Mux Mux tt_um_chip_rom (Chip ROM) tt_um_factory_test (Tiny Tapeout Factory Test) tt_um_oscillating_bones (Oscillating Bones) tt_um_wokwi_457142813149930497 (TinyTapeOut workshop) tt_um_wokwi_457311688017142785 (tiny tapeout test gates) tt_um_bfcpu (bfCPU) tt_um_rebeccargb_universal_decoder (Universal Binary to Segment Decoder) tt_um_rebeccargb_hardware_utf8 (Hardware UTF Encoder/Decoder) tt_um_rebeccargb_intercal_alu (INTERCAL ALU) tt_um_rebeccargb_vga_pride (VGA Pride) tt_um_wokwi_457215959798165505 (4-bit N frequency divider) tt_um_ppu_aebarthyi (simple_ppu) tt_um_riscyv02 (RISCY-V02) tt_um_pong (Pong) tt_um_LH_TapeoutMultiplier (tt_um_LH_TapeoutMultiplier) tt_um_wokwi_457571222315471873 (7 Seg C) tt_um_wokwi_457571216758012929 (Mikes Second) tt_um_wokwi_457569452934172673 (FirstTinyTapeoutWokwiProject) tt_um_wokwi_457571159626309633 (Tiny Tapeout V1) tt_um_wokwi_457577038845586433 (TinyTapeOut) tt_um_wokwi_457571280506256385 (Tiny tapeouts test gates) tt_um_wokwi_457576742418338817 (calculator) tt_um_wokwi_457571219715001345 (Ami's TT Logic Gates) tt_um_wokwi_457571067547656193 (Mikes First Wokwi design) tt_um_wokwi_457571453314827265 (Tiny tapeout one hot to seven segment display 1-8) tt_um_wokwi_457572218833202177 (4bit adder and hex converter) tt_um_wokwi_463741407580251137 (Lady's First Tapeout) tt_um_wokwi_457571138696714241 (jdisplayer) tt_um_wokwi_457570687900145665 (Tiny Tapeout Test Gates) tt_um_wokwi_457577511431565313 (Tiny Tapeout Test Gates) tt_um_wokwi_457571262875481089 (Tiny Tapeout) tt_um_wokwi_457571417674762241 (TamTries Tiny Tapeout) tt_um_wokwi_457571366985520129 (georgies wokwi design) tt_um_wokwi_457571701752981505 (WilfTT) tt_um_wokwi_457572875733692417 (First WOKWI Design) tt_um_wokwi_457571571887847425 (tiny tapeout gate test) tt_um_wokwi_457571352249873409 (First Wokwi design) tt_um_wokwi_457571405919170561 (Namo's first tapeout) tt_um_wokwi_457571339952163841 (OR Gate with NAND) tt_um_wokwi_457571188658258945 (Abishag's first Wokwi Design) tt_um_wokwi_457571426719781889 (Tiny) tt_um_wokwi_457571268900604929 (tiny tape GDS) tt_um_wokwi_457571949070179329 (Tom Haley Tiny Tape Out Design ) tt_um_alex_ha_192 (alex_ha_192) tt_um_wokwi_457577241913913345 (tiny tapeout test gates ) tt_um_wokwi_457571297367365633 (First Wokwi Attempt) tt_um_wokwi_457571363309211649 (idk yet) tt_um_wokwi_457571305740256257 (Work In progress title) tt_um_wokwi_457579594627462145 (TinyTapeoutProjectDefne) tt_um_wokwi_457571274041781249 (Tiny Tapeout Workshop by Kirsty Tan) tt_um_wokwi_457571233499594753 (Tiny Tapeout Workshop) tt_um_wokwi_457570205537212417 (Tiny Tapeout Test Project) tt_um_ojas_sharma_imperial_ttcpu (ttcpu 4-bit RISC microprocessor) tt_um_wokwi_457571271419289601 (chip one) tt_um_wokwi_457573490746716161 (Name Serial Printer) tt_um_wokwi_457569507958215681 (Tiny tapeout proj) tt_um_wokwi_457577929607958529 (Random 1st Attempt) tt_um_wokwi_457571438667259905 (PD+PFD+FreqDiv) tt_um_wokwi_457571602706552833 (Joe's first Wokwi design) tt_um_wokwi_457571471659666433 (Nicolas' first Wokwi design) tt_um_wokwi_457571148733696001 (Tiny Tapeout Workshop 1) tt_um_wokwi_457572520479222785 (Tiny Tapeout: Buenos días Mundo! ) tt_um_wokwi_457571494688497665 (First Chip) tt_um_wokwi_457571341266031617 (D-Type Flip Flop) tt_um_wokwi_457581344351934465 (WOKWI) tt_um_wokwi_457571462196267009 (Tiny Tapeout) tt_um_wokwi_457571359410603009 (TinyTapeout) tt_um_Terdoo_Osu (Spiking Pattern Recognition Core) tt_um_wokwi_457571319408448513 (Mani TinyTapeout) tt_um_wokwi_457571298662360065 (Tiny Tapeout Test Gates) tt_um_wokwi_457573015156590593 (Lil tapeout) tt_um_wokwi_457576363047649281 (Inverter) tt_um_wokwi_457571216488527873 (Tiny Tapeout Template Copy Paul 1) tt_um_wokwi_457571472208072705 (Tiny Tapeout Test design) tt_um_wokwi_457571381968631809 (Tiny tapeout test) tt_um_wokwi_457571314694049793 (Tiny Tapeout Test) tt_um_wokwi_457571368009979905 (Tiny Tapeout Test Gates) tt_um_wokwi_457571389542502401 (First thing) tt_um_wokwi_457570267471381505 (Tiny Tapeout) tt_um_wokwi_457571563051492353 (CS First Wokwi design) tt_um_wokwi_457577392775721985 ( tiny Tapeout Test Gate) tt_um_wokwi_457570279596067841 (Tiny Tapeout Workshop - AJJ) tt_um_wokwi_457571180646081537 (Alins Password) tt_um_wokwi_457572360568198145 (Tiny Tapeout) tt_um_wokwi_457571270578328577 (Tiny tapeout workshop) tt_um_wokwi_457581625098771457 (Tiny Tapeout First Test Run) tt_um_wokwi_442342513281875969 (First Design) tt_um_wokwi_457581848269362177 (Tiny Tapeout Brainf*ck?) tt_um_sap_alexanderholden (sap1) tt_um_wokwi_457571752214675457 (3bit_ALU) tt_um_wokwi_457571542558115841 (Tiny Tapeout") tt_um_wokwi_457573095390500865 (Tiny Tapeout Workshop Counter) tt_um_wokwi_457571511812802561 (Akash's first Wokwi design) tt_um_wokwi_457577563633889281 (Tiny Tapeouts gate tests) tt_um_wokwi_457576950671858689 (Hymns_GDS) tt_um_wokwi_457571371384299521 (Digital digit display circuit - TINYTAPEOUT) tt_um_rowantylerr_RC_TDC (Resistor Capacitor TDC) tt_um_wokwi_463662181299058689 (2 bit ALU) tt_um_chinghey (Hey FlexCompute-130) tt_um_8b10 (serdes8b10) tt_um_rom_vga_screensaver (VGA Screensaver with embedded bitmap ROM) tt_um_mayamelon_top (Tiny PI Controller) tt_um_JAIMEPRYOR0_VGA_YAY (VGA_YAY) tt_um_2048_vga_game (2048 sliding tile puzzle game (VGA)) tt_um_mng2_2ncos (A Tale of Two NCOs) tt_um_shimmydee_checkers (One-tile ADC) tt_um_urish_simon (Simon Says memory game) tt_um_dheeeraaj_sine_chirp_beacon (DDS Sine Chirp Beacon) tt_um_nicholas_ls194a (Universal Shift Register (SN74LS194A compatible)) tt_um_BellaB05_Hearts (Pink Hearts) tt_um_scottshuynh_ad_astra (ASIC Ad Astra) tt_um_liamolucko_vga (VGA demo) tt_um_lledoux_s3fdp_seqcomb (S3FDP Seq+Comb Stream Core) tt_um_5482582_cat_vga (Cat VGA) tt_um_vga_example_directional_toggle (Directional toggle of VGA playground example) tt_um_jimbok_ro_puf (Ring Oscillator PUF) tt_um_xxsahanaxx_hwsec_glitch (Hardware Security Glitching Attack) tt_um_NguyenHuuHenry_vga_project (VGA_Project) tt_um_irfantekin_analog (tt_um_irfantekin_analog) tt_um_chicagojones_sky26a_trng (Sky26a Advanced TRNG) tt_um_yen (YEN) tt_um_pedometer (Ultra Low Power Pedometer ASIC) tt_um_analog_atenfyr1 (Configurable Self-biasing Miller-compensated OTA) tt_um_aes_sbox (Formally-Verified Constant-Time AES S-Box) tt_um_tcpu_alienflip (tcpu) tt_um_nebula (Sierpinski Fractal Starfield) tt_um_zenith_tx26 (Zenith TX26) tt_um_odgrip_demoscene_ttsky26a (My first demoscene) tt_um_vighnesh_sawant_plane (Plane with a banner) tt_um_glyph_mode_hd (Glyph Mode HD) tt_um_TSARKA_TinyQV (TinyQV Wishbone SoC) tt_um_SimpleCounter (Simple Counter) tt_um_cfar_nobuzzer (CFAR Detector without Buzzer) tt_um_present (Present) tt_um_top (Approximate Logic Unit) tt_um_goose (OIIA-goose) tt_um_riscv_core (Tiny RISC-V) tt_um_dac_test3v3 (Design and Implementation of R-2R Ladder DAC for GPR Application) tt_um_tadc_its (Time Domain ADC) tt_um_algofoogle_vga_matrix_dac (Analog VGA CSDAC experiments) tt_um_jyblue1001_pll (Analog-PLL) tt_um_axi4lite2x2_top (AXI4-Lite 2M-2S Interconnect) tt_um_systolic_top (4x4 Systolic Matrix MAC Accelerator) tt_um_goose_game (Goose Game) tt_um_rongbin99_happyredmapleleaf_audio_chip (Audio Wave Generator Chip) tt_um_fp_id (FinSec-1: AS-68M Fingerprint Verification ASIC) tt_um_game_of_life (Demoscene: Game of Life) tt_um_ds_missile_command (Missile Command) tt_um_cmos_inverter (Reactive Plasma: CMOS Inverter) tt_um_nightplumeaki_tinypipcore (tinypipcore) tt_um_immrudul_w7khan (Mrudul and Wahhaj Demoscene F2025) tt_um_sohamgovande_transformer (Transformer) tt_um_isa084_uart_servo (UART Positioning PWM Interface) tt_um_wokwi_461265571826974721 (Bias Correction Filter) tt_um_8_bit_cpu (8-bit CPU) tt_um_richad (ADPPLS) tt_um_algofoogle_dottee (DOTTEE VGA demo) tt_um_sar_fms (SAR FSM) tt_um_kolontsov_journey (Journey) tt_um_fft_adityaamehra (64 Sample FFT ASIC) tt_um_lambda_clock (Lambda Clock) tt_um_ece298A_analog (ECE298A analog tile) tt_um_toivoh_demo (Orion Iron Ion [TTSKY26a demo competition]) tt_um_kilian_interference (Wave Lattice) tt_um_fabulous_sky_26a (Tiny FABulous FPGA) tt_um_Rats2012_WobblyBits (WobblyBits - A probabilistic computing chip) tt_um_rebelmike_asic_odyssey (2026: An ASIC Odyssey) tt_um_huyatieo_tinyqv_speck (Speck-V SoC) tt_um_mosbius (mini mosbius) tt_um_remedy_cpu (FFD16 cpu 16-bit) tt_um_vga_ocarina (Ocarina on VGA) tt_um_TinyGPU_v3 (Tiniest GPU V3) tt_um_santhosh_ring_osc (Ring Oscillator PVT Sensor & TRNG) tt_um_santhosh_xbar_ctrl (Memristive Crossbar Peripheral Controller) tt_um_santhosh_stdp_ctrl (Digital STDP Learning Controller) tt_um_santhosh_stoch_neuron (LFSR-Based Stochastic Neuron) tt_um_anweiteck_ldo (1V-LDO) tt_um_sriaxi4lite_top (Axi4_Lite) tt_um_bch_code_15_7_2 (Bose-Chaudhuri-Hocquenghem Code) tt_um_mastensg_ttsky26a_demo (Luz) tt_um_pakesson_vga_rocket (VGA Rocket) tt_um_adpll (ADPLL - All-Digital Phase-Locked Loop) tt_um_Bingyao_FCOTA (Self biased Single Ended Folded Cascoded OTA) tt_um_spacewar_top (Spacewar) tt_um_microlane_demo (microlane demo project) tt_um_NE567Mixer28 (ECG Front End) tt_um_wakita_mux8onehot_cap (Mux8onehot Pulldown Mosfet) tt_um_johshoff_metaballs (Metaballs v2) tt_um_tomvdsch_cyclonerunner (CycloneRunner) tt_um_lowprocess_wildcamping (PicoMIPS CPU) tt_um_canvas (Tiny Canvas) tt_um_snrlxd1068_MACs (Linear and Logarithmic MACs) tt_um_pakesson_simon64_128 (SIMON64/128) tt_um_AmitChen1415 (Tiny Blackjack) tt_um_ole_moller_double_dabble_SV (double_dabble_SV) tt_um_toivoh_demo_1tile (Single tile demo [TTSKY26a demo competition]) tt_um_shiho_space_invaders (Tiny Space Invaders) tt_um_analog_RO (Analog RO) tt_um_electron65_vga (VGA Clock Demo) tt_um_wokwi_457571266840151041 (3-Bit ALU) tt_um_katomata (Katomata - 1D Cellular Automata) tt_um_shimomi_analog (analog circuit) tt_um_toivoh_demo_4tile (Four tile demo [TTSKY26a demo competition]) tt_um_IEEE_open_silicon_FOSSEE (Ring oscillator VCO and Differential Amplifier) tt_um_lm_chip_top (Project Long Man: A Delay-Insensitive Interconnect) tt_um_AlephNaNsea_space_time_waves_and_filaments (Space-Time Waves and Filaments) tt_um_spacelizard_apu (Spacelizard APU) tt_um_wokwi_457569490272926721 (Letter S) tt_um_mau_top_4b (SIMD2 Math Accelerator Unit) tt_um_maze (Maze) tt_um_demoscenettsky (Algorithmic Pattern Generator) tt_um_wokwi_457572141968369665 (Arran's tinytapeout project) tt_um_maxluppe_ttsky26a_analog (Standard Digital Logic Cells Analog Comparator) tt_um_grammartile (GrammarTile) tt_um_bubble_sort (IEEE Bubble Sort Engine) tt_um_ahmed_nematallah_12_bit_adc (12-bit ADC) tt_um_bad_ode_plotter_vga (Bad VGA ODE Plotter) tt_um_wokwi_463706339714973697 (Demo 4-bit ALU 74181 variant) tt_um_wokwi_457569853853115393 (Jasper Tiny Tape Out Workshop) tt_um_wokwi_457560507752701953 (Osian Tiny Tapeout) tt_um_wokwi_457571501325987841 (Rola_Tiny Tapeout Template Workshop4Mar26) tt_um_wokwi_457571903121572865 (TT-wokwi-template) tt_um_wokwi_463380823859050497 (My_Name_on_7_Seg_display) tt_um_wokwi_457569584731832321 (Tiny Tapeout 9 Template Copy) tt_um_wokwi_457571826952995841 (Tiny Tapeout Novomorphic Design 1) tt_um_wokwi_457571349142937601 (Tiny Tapeout Secret First Letter Code) tt_um_wokwi_457571261877235713 (Tiny Tapeout Test) tt_um_wokwi_457582867322921985 (Tiny Tapeout Test GDS) tt_um_wokwi_457571135132600321 (Tiny Tapeout Test Gates) tt_um_wokwi_457571331577181185 (Tinytapeout_IA) tt_um_wokwi_457576779101727745 (tiny tapeout test gates) tt_um_wokwi_457571577702202369 (tj wowki) tt_um_wokwi_457572953060951041 (wokwi) tt_um_pettit_galton (Tiny Galton) tt_um_fountaincoder_top_abc (ABC Temporal Coincidence Detector) tt_um_prime_quine (Prime Quine) tt_um_ghtag_trinity_gf16 (Trinity GF16 Dot Product Accelerator) tt_um_LFSR (Configurable Galois LFSR) tt_um_Acrazt05_titan_proccesing_unit (Titan Proccesing Unit (TPU)) tt_um_essen (Digital) tt_um_alu_bns (6-bit Multi-Functional ALU) tt_um_gerardvt_spade_poc (Interactive XOR Plasma (Spade HDL)) tt_um_gerardvt_clash_poc (Interactive Triangle-Wave Plasma (Clash HDL)) tt_um_jackthoene_frogger (Frogger) tt_um_wokwi_463698873100105729 (IEEE Open Silicon 2026: UTB Logic Trivia Challenge: 8-bit Digital Lock) tt_um_wokwi_463666635153364993 (IEEE - Hex Counter and Logic Gate Validator) tt_um_ChristmasTree_MaligayangPasko (ChristmasTree_MaligayangPasko) tt_um_wokwi_463711763041599489 (IEEE Open Silicon 2026: UTB UART Transmitter basic) tt_um_tinytensorcore (TinyTensorCore) tt_um_uwasic_crypto (UWASIC Crypto) tt_um_topadi (time) tt_um_siliconimist (Siliconimist Demoscene) tt_um_neutern_0 (tt_um_neutern_0) tt_um_htfab_hsxo (HSXO) tt_um_madech_8bit_processor_vga (8-Bit Processor with VGA) tt_um_vga_clock (VGA clock) tt_um_usu_AXIS_MVMul (AXI-Stream Matrix Vector Multiplier) tt_um_weird_numbers (Weird Numbers) tt_um_bovi_cable_tester (Cable Tester) tt_um_libokuohai_asap_cpu_v2 (ASAP CPU v2) tt_um_LinusSkucas_pio (Tiny PIO) tt_um_thomas_ep_sensor (EP Sensor v7 (symmetric in-place thicken, Zhao-compliant)) tt_um_rakhanaufm_truerandom (Current-Starved Ring Oscillator Based True Random Number Generator) tt_um_parakeet (parakeet) tt_um_mcml_vco (MCML experiments) tt_um_tpu ( Tensor Processing Unit) tt_um_strasti (8-Bit ALU) tt_um_zed_analog (Analog design) tt_um_axi4lite_top (Axi4_Lite) tt_um_c4m_spsram_direct (TTSKY-SPSRAM-direct) tt_um_Onchip_Folded_Cascode_N_with_Bias (Folded Cascode N Type with Bias from Onchip Research Group) tt_um_htfab_hybrid (Telephone hybrid) tt_um_ilamparuthi_cfar (CFAR Radar Detector) tt_um_pakesson_glitcher (Glitcher) tt_um_advaittej_stopwatch (V-SPACE Demo: Command & Control Chronograph) tt_um_william_pll (Smartcard PLL Clock Generator) tt_um_Melody_Generator_JLANordhal (Melody Generator based on Markov Chains) tt_um_d_monteiro (Neuromorphic Processor (SNN)) tt_um_jacob_kebaso_4bit_cpu (Nibble - 4-bit CPU) tt_um_signal_detector (Signal_Detection_Processor) tt_um_catalinlazar_tinycore8 (TinyCore8) tt_um_chidam_secengine (Tiny Secure Telemetry Engine) tt_um_urish_usb_cdc (USB CDC (Serial) Device) tt_um_josenbm (9-Channel Frequency Counter with I2C + SPI DAC & ADC) tt_um_shalindra_vga_rings (Variable Speed and Colour Select VGA Rings) tt_um_dinukuk_MYVGA_GLIDER (DKTT01 - VGA Glider) tt_um_fibonacci_JoaoBortolace (Fibonacci Counter) tt_um_wokwi_461639934990157825 (4 bit unlock (IEEE)) tt_um_ctw_ldo (LDO Regulator Skywater 130nm)