519 BSD Convolution Adder Tree

How it works

SD format (two bits per digit):

00 => -1
01 => 0
10 => 0
11 => 1
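As a small illustration of the encoding above, the following Python sketch maps each two-bit code to its signed digit and evaluates a digit vector. The names `SD_DECODE` and `bsd_value` are hypothetical helpers, not part of the design. Note that the table is equivalent to digit = b1 + b0 - 1, which is why both 01 and 10 represent 0.

```python
# Two-bit code -> signed digit, copied from the SD format table.
SD_DECODE = {0b00: -1, 0b01: 0, 0b10: 0, 0b11: 1}


def bsd_value(digits):
    """Value of a BSD number given as signed digits, least significant first."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


# BSD is redundant: 7 can be written as 0111 or as 100(-1), i.e. 8 - 1.
plain = [1, 1, 1, 0]        # 1 + 2 + 4
redundant = [-1, 0, 0, 1]   # -1 + 8
print(bsd_value(plain), bsd_value(redundant))  # 7 7
```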

The design reads eight values from registers R0 to R7 and adds them in BSD (binary signed-digit) format. Each value is shifted one digit further to the left than the previous one, so the result is: Y = R0 · 1 + R1 · 2 + R2 · 4 + R3 · 8 + R4 · 16 + R5 · 32 + R6 · 64 + R7 · 128
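A reference model of this weighted sum is useful for checking the chip's output. The sketch below computes Y with ordinary integer arithmetic; `weighted_sum` is a hypothetical helper name, and the input list is the example used in the test section of this page.

```python
def weighted_sum(regs):
    """Y = sum(R_i * 2**i) for the eight register values R0..R7."""
    if len(regs) != 8:
        raise ValueError("expected exactly 8 register values")
    return sum(r << i for i, r in enumerate(regs))


print(weighted_sum([2, 4, 8, 16, 8, 4, 2, 0]))  # 554
```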

Every addition is performed in BSD format, either as BSD + TC => BSD (TC: two's complement) or as BSD + BSD => BSD.

The output is given in BSD format.
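To see why a two's-complement operand can take part in a BSD addition, note that a TC number is already a signed-digit number: every bit keeps its positive weight except the sign bit, which has negative weight. The sketch below shows this standard identity only. It is not a description of the adder cells in this design, and `tc_to_bsd_digits` and `bsd_value` are hypothetical helper names.

```python
def tc_to_bsd_digits(value, width=8):
    """Signed digits (LSB first) of a `width`-bit two's-complement number."""
    if not -(1 << (width - 1)) <= value < (1 << (width - 1)):
        raise ValueError("value does not fit in the given width")
    bits = [(value >> i) & 1 for i in range(width)]
    # Every bit has weight +2**i except the sign bit, whose weight is negative.
    return bits[:-1] + [-bits[-1]]


def bsd_value(digits):
    """Value of a BSD number given as signed digits, least significant first."""
    return sum(d << i for i, d in enumerate(digits))


print(tc_to_bsd_digits(-3))             # [1, 0, 1, 1, 1, 1, 1, -1]
print(bsd_value(tc_to_bsd_digits(-3)))  # -3
```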

How to test

Because Tiny Tapeout provides only 8 dedicated input bits and 8 dedicated output bits, the register inputs and result output are multiplexed.

Control signals on uio_in:

uio_in[7] => write enable
uio_in[2:0] => register select for R0 to R7
uio_in[4:3] => output chunk select
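The bit fields above can be packed into a single control byte. The following is a minimal sketch for a test script; `make_uio_in` is a hypothetical helper name, and bits 5 and 6 are left at 0 because no function is listed for them.

```python
def make_uio_in(write_enable, reg_sel=0, chunk_sel=0):
    """Pack write enable (bit 7), chunk select (bits 4:3) and register select (bits 2:0)."""
    if not 0 <= reg_sel <= 7:
        raise ValueError("reg_sel must be 0..7")
    if not 0 <= chunk_sel <= 3:
        raise ValueError("chunk_sel must be 0..3")
    return (int(bool(write_enable)) << 7) | (chunk_sel << 3) | reg_sel


print(f"{make_uio_in(True, reg_sel=5):08b}")     # 10000101
print(f"{make_uio_in(False, chunk_sel=2):08b}")  # 00010000
```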

During writing, uio_in[7] must be set to 1. This enables writing to the selected register.

uio_in[7] = 1 => write enabled
uio_in[7] = 0 => write disabled

After all registers have been loaded, uio_in[7] must be set back to 0. This prevents accidental overwriting of registers during readout.

Writing registers:

To write a value into a register:

1. Put the input value on ui_in[7:0].
2. Set uio_in[7] = 1.
3. Set uio_in[2:0] to the target register index.
4. Apply one clock cycle.

Register selection:

uio_in[2:0] = 000 => R0
uio_in[2:0] = 001 => R1
uio_in[2:0] = 010 => R2
uio_in[2:0] = 011 => R3
uio_in[2:0] = 100 => R4
uio_in[2:0] = 101 => R5
uio_in[2:0] = 110 => R6
uio_in[2:0] = 111 => R7

Example input values:

[2, 4, 8, 16, 8, 4, 2, 0]

Write sequence:

ui_in = 00000010
uio_in = 10000000 // write enabled, select R0
clock

ui_in = 00000100
uio_in = 10000001 // write enabled, select R1
clock

ui_in = 00001000
uio_in = 10000010 // write enabled, select R2
clock

ui_in = 00010000
uio_in = 10000011 // write enabled, select R3
clock

ui_in = 00001000
uio_in = 10000100 // write enabled, select R4
clock

ui_in = 00000100
uio_in = 10000101 // write enabled, select R5
clock

ui_in = 00000010
uio_in = 10000110 // write enabled, select R6
clock

ui_in = 00000000
uio_in = 10000111 // write enabled, select R7
clock
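The eight writes above follow a regular pattern, so a test script can generate them instead of listing them by hand. The sketch below only produces the stimulus values; driving the pins and pulsing the clock depends on the test environment and is not shown. `write_sequence` is a hypothetical helper name.

```python
def write_sequence(values):
    """Return one (ui_in, uio_in) pair per clock cycle to load R0..R7."""
    if len(values) != 8:
        raise ValueError("expected exactly 8 register values")
    steps = []
    for index, value in enumerate(values):
        if not 0 <= value <= 0xFF:
            raise ValueError("each value must fit in 8 bits")
        # Bit 7 = write enable, bits 2:0 = register index.
        steps.append((value, 0x80 | index))
    return steps


for ui_in, uio_in in write_sequence([2, 4, 8, 16, 8, 4, 2, 0]):
    print(f"ui_in = {ui_in:08b}  uio_in = {uio_in:08b}")
```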

Writing done:

uio_in = 00000000 // write disabled
clock

Reading the output

The result is 26 bits long, so it has to be multiplexed through uo_out[7:0] in 8-bit chunks.

The output chunk is selected using uio_in[4:3]:

uio_in[4:3] = 00 => first 8 bits on uo_out[7:0]: o0
uio_in[4:3] = 01 => second 8 bits on uo_out[7:0]: o1
uio_in[4:3] = 10 => third 8 bits on uo_out[7:0]: o2
uio_in[4:3] = 11 => fourth 8 bits on uo_out[7:0]: o3

Read sequence:

uio_in = 00000000
uo_out now contains o0

uio_in = 00001000
uo_out now contains o1

uio_in = 00010000
uo_out now contains o2

uio_in = 00011000
uo_out now contains o3

The full BSD result is reconstructed as:

[o3 | o2 | o1 | o0]
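The readout steps can be sketched in Python as follows. Assembling the word follows the [o3 | o2 | o1 | o0] order given above. The digit layout inside the word is NOT specified on this page: the decoder below assumes that the 26 bits hold 13 two-bit digits, with digit i in bits [2i+1:2i] and the least significant digit first. Treat that layout, and all function names, as assumptions to verify against the actual design.

```python
# Two-bit code -> signed digit, from the SD format table.
SD_DECODE = {0b00: -1, 0b01: 0, 0b10: 0, 0b11: 1}


def chunk_select_values():
    """uio_in values that select o0..o3 (write disabled, chunk index in bits 4:3)."""
    return [chunk << 3 for chunk in range(4)]


def assemble_word(o0, o1, o2, o3):
    """Concatenate the four chunks as [o3 | o2 | o1 | o0]."""
    return (o3 << 24) | (o2 << 16) | (o1 << 8) | o0


def decode_bsd_word(word, num_digits=13):
    """Integer value of the BSD word.

    ASSUMPTION: digit i occupies bits [2i+1:2i], least significant digit first.
    """
    total = 0
    for i in range(num_digits):
        pair = (word >> (2 * i)) & 0b11
        total += SD_DECODE[pair] << i
    return total


print([f"{v:08b}" for v in chunk_select_values()])
```

With real hardware, the decoded value can be compared against Y = R0 · 1 + R1 · 2 + ... + R7 · 128 computed from the loaded register values.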

Real convolution usage

For a real 3x3 image convolution, the values loaded into registers R0 to R7 should be the preadded weight sums of the 3x3 filter kernel.

The circuit itself does not contain the full ROM. Instead, the ROM lookup can be simulated or calculated externally. The external ROM implements the lookup table for the fixed 3x3 filter kernel (for example, a Gaussian kernel).

For each bit position of the 8-bit pixel values, one bit from each of the nine pixels is used to form a 9-bit ROM address:

[p1, p2, p3]
[p4, p5, p6] => 8 x 9-bit address => 512-entry ROM => 8 preadded values => load into R0 to R7 => BSD output of the complete convolution
[p7, p8, p9]
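The external lookup described above can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the author's tooling: the unnormalized Gaussian weights, the mapping of pixel p(k+1) to address bit k, and the names `build_rom` and `register_values` are all illustrative choices. Each ROM entry holds the sum of the kernel weights whose address bit is set, so the shifted sum of the eight lookups equals the ordinary convolution sum.

```python
# ASSUMPTION: example 3x3 Gaussian weights for p1..p9, not normalized.
GAUSSIAN = [1, 2, 1, 2, 4, 2, 1, 2, 1]


def build_rom(weights):
    """512-entry table: entry `addr` is the sum of weights whose address bit is set."""
    return [
        sum(w for k, w in enumerate(weights) if (addr >> k) & 1)
        for addr in range(1 << len(weights))
    ]


def register_values(pixels, rom):
    """One ROM lookup per pixel bit position; lookup b is the value for register Rb."""
    regs = []
    for bit in range(8):
        addr = 0
        for k, p in enumerate(pixels):
            # ASSUMPTION: pixel p(k+1) drives address bit k.
            addr |= ((p >> bit) & 1) << k
        regs.append(rom[addr])
    return regs


pixels = [10, 20, 30, 40, 50, 60, 70, 80, 90]  # p1..p9, 8-bit values
rom = build_rom(GAUSSIAN)
regs = register_values(pixels, rom)
y = sum(r << i for i, r in enumerate(regs))    # what the adder tree computes
print(regs, y)
```

For these pixels, `y` matches the direct sum of weight times pixel over the 3x3 window, which is the check to run before loading `regs` into R0 to R7.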

IO

| # | Input | Output | Bidirectional |
|---|-------|--------|---------------|
| 0 | input value bit 0 | BSD output chunk bit 0 | register select bit 0 |
| 1 | input value bit 1 | BSD output chunk bit 1 | register select bit 1 |
| 2 | input value bit 2 | BSD output chunk bit 2 | register select bit 2 |
| 3 | input value bit 3 | BSD output chunk bit 3 | output chunk select bit 0 |
| 4 | input value bit 4 | BSD output chunk bit 4 | output chunk select bit 1 |
| 5 | input value bit 5 | BSD output chunk bit 5 | |
| 6 | input value bit 6 | BSD output chunk bit 6 | |
| 7 | input value bit 7 | BSD output chunk bit 7 | write enable |
