749 Custom ALU

749 : Custom ALU

Design render
  • Author: Venkatesh A Nayak, Sathyan E N,Prathap C, Sahana M S , Keerthana V,Shylashree N, RV College of Engineering, Bengaluru
  • Description: A Custom ALU/NPU supporting arithmetic, logic, and neural-inspired operations including rotation, priority encoding, gray code, majority, parity error detection and weighted computations
  • GitHub repository
  • Open in 3D viewer
  • Clock: 50000000 Hz

Credits

We gratefully acknowledge the Center of Excellence (CoE) in Integrated Circuits and Systems (ICAS) and the Department of Electronics and Communication Engineering (ECE) for providing the necessary resources and guidance. Special thanks to Dr. K R Usha Rani (Associate Dean - PG), Dr. H V Ravish Aradhya (HOD-ECE), Dr. K. S. Geetha (Vice Principal) and Dr. K. N. Subramanya (Principal) for their constant encouragement and support in facilitating this Tiny Tapeout. SKY25A submission

How it works

This module implements a Custom Arithmetic Logic Unit (ALU) with two operating modes: standard ALU mode and Neural Processing Unit (NPU) mode.

In ALU mode, the design performs:

  • Standard arithmetic operations: addition, subtraction, multiplication, division
  • Bitwise logical operations: AND, OR, XOR, NOT
  • Bit rotation (left and right)
  • Priority encoder
  • Gray code conversion
  • Majority function logic
  • Comparisons (greater than, equality)

In NPU mode, it performs simple neural-inspired computations including:

  • Weighted sum with error correction
  • ReLU-like activation
  • Scaled dot product
  • Max/min selection
  • Thresholding
  • Inverted XOR
  • Sigmoid-like output
  • Binary classification
  • Bitwise weighted sum
  • Quadratic neuron function

The operation is selected using a 4-bit opcode and mode bit provided via uio_in.

How to test

  • The 8-bit input signal ui_in is divided into:

    • A = ui_in[3:0] (lower 4 bits)
    • B = ui_in[7:4] (upper 4 bits)
  • Operation is selected via:

    • Opcode = uio_in[3:0] (4-bit opcode for selecting ALU/NPU function)
    • Mode = uio_in[4]
      • 0: ALU mode
      • 1: NPU mode
  • The 8-bit output uo_out is structured as:

    • uo_out[3:0]: ALU/NPU result
    • uo_out[4]: Error flag (e.g., divide by zero)
    • uo_out[5]: Sign flag (MSB of result)
    • uo_out[6]: Carry flag (for addition/subtraction)
    • uo_out[7]: Zero flag (result == 0)

Internal Architecture

Functional Overview

The tt_um_customalu is a 4-bit custom Arithmetic Logic Unit (ALU) with two operating modes:

  • ALU Mode (Mode = 0): Performs standard arithmetic and logic operations.
  • NPU Mode (Mode = 1): Emulates neural processing operations (e.g., thresholding, ReLU, dot product).

Inputs are sampled synchronously on the rising clock edge. Operations are executed based on the selected Opcode and Mode.


On every rising edge of clk:

  • A = ui_in[3:0] (Operand A)

  • B = ui_in[7:4] (Operand B)

  • Opcode = uio_in[3:0] (Operation Selector)

  • Mode = uio_in[4] (0: ALU Mode, 1: NPU Mode) On the next clk cycle:

  • Depending on the Mode, the ALU executes one of 32 possible operations (16 ALU, 16 NPU).

  • The result is stored in ALU_Result.

  • Flags such as Zero, Carry, Sign, and Error are updated accordingly.


Operation Modes

ALU Mode (Mode_reg == 0)

Includes basic and advanced operations like:

  • Arithmetic: ADD, SUB, MUL, DIV
  • Logic: AND, OR, XOR, NOT
  • Bitwise: ROTATE, GRAY CODE
  • Comparison: GREATER, EQUAL
  • Custom: PRIORITY ENCODER, MAJORITY, weighted function

NPU Mode (Mode_reg == 1)

Implements lightweight neural functions:

  • Activation: ReLU, Threshold, Sigmoid-like
  • Neuron computations: Base neuron, Scaled dot, Noisy neuron, Quadratic neuron
  • Classification & logic: Equality, Max, Min, Binary classification, Inverted XOR

Reset Behavior

When rst_n is low (active low reset):

  • All internal registers (A_reg, B_reg, Opcode_reg, Mode_reg, ALU_Result) are cleared to 0.
  • All flags (Zero, Carry, Sign, Error) are reset to 0.

This ensures safe and deterministic startup and operation.

Testing can be done by providing various combinations of ui_in and uio_in, then observing the result and status flags on uo_out.

Opcode Table

ALU Mode (Mode_reg == 0)

Opcode (bin) Opcode (hex) Operation Description
0000 0x0 ADD A + B, with carry, zero, and sign flags
0001 0x1 SUB A - B, with carry, zero, and sign flags
0010 0x2 MUL A * B, lower 4 bits, with zero and sign flags
0011 0x3 DIV A / B, handles divide-by-zero with error flag
0100 0x4 ROTL Rotate A left by 1 bit
0101 0x5 ROTR Rotate A right by 1 bit
0110 0x6 PRIORITY ENCODER Encodes position of most significant set bit in A
0111 0x7 GRAY CODE Converts A to Gray code
1000 0x8 MAJORITY FUNCTION `(A & B)
1001 0x9 WEIGHTED FUNC ((A * 3) + 2) % 17
1010 0xA AND Bitwise AND of A and B
1011 0xB OR Bitwise OR of A and B
1100 0xC NOT Bitwise NOT of A
1101 0xD XOR Bitwise XOR of A and B
1110 0xE GREATER THAN 1 if A > B, else 0
1111 0xF EQUALITY 1 if A == B, else 0

NPU Mode (Mode_reg == 1)

Opcode (bin) Opcode (hex) Operation Description
0000 0x0 BASE NEURON ((A * 3) + 2) % 17
0001 0x1 RELU(A - B) A - B if A > B, else 0
0010 0x2 SCALED DOT (A * B) >> 2
0011 0x3 MAX Maximum of A and B
0100 0x4 MIN Minimum of A and B
0101 0x5 THRESHOLD 0xF if A > 4, else 0x0
0110 0x6 INVERTED XOR Bitwise ~(A ^ B)
0111 0x7 NOISY NEURON (A + B + 3) % 10
1000 0x8 EQUALITY NEURON 0xF if A == B, else 0x0
1001 0x9 NO-OP / NULL Always returns 0x0
1010 0xA SIGN BIT AGREEMENT 1 if A[3] ^ B[3] is 1, else 0
1011 0xB SIGMOID-LIKE Returns 0x8 if 2 < A < 13, else 0x0
1100 0xC FIRE ON DIFFERENCE 0xF if A > B and (A - B) > 2, else 0x0
1101 0xD BITWISE WEIGHTED SUM Weighted sum: A[3]*4 + A[2]*3 + A[1]*2 + A[0]*1
1110 0xE QUADRATIC NEURON (A * A + B) % 16
1111 0xF BINARY CLASSIFICATION 1 if A > B, else 0

External hardware

List external hardware used in your project (e.g. PMOD, LED display, etc), if any

IO

#InputOutputBidirectional
0a[0]result[0]opcode[0]
1a[1]result[1]opcode[1]
2a[2]result[2]opcode[2]
3a[3]result[3]opcode[3]
4b[0]Error flagMode bit for ALU/NPU
5b[1]Sign flag
6b[2]Carry flag
7b[3]Zero flag

Chip location

Controller Mux Mux Mux Mux Mux Mux Mux Mux Mux Analog Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Analog Mux Mux Mux Mux Analog Mux Mux Mux Mux Mux Mux tt_um_chip_rom (Chip ROM) tt_um_factory_test (Tiny Tapeout Factory Test) tt_um_oscillating_bones (Oscillating Bones) tt_um_rebelmike_incrementer (Incrementer) tt_um_rebeccargb_tt09ball_gdsart (TT09Ball GDS Art) tt_um_tt_tinyQV (TinyQV 'Asteroids' - Crowdsourced Risc-V SoC) tt_um_DalinEM_asic_1 (ASIC) tt_um_urish_simon (Simon Says memory game) tt_um_rburt16_bias_generator (Bias Generator) tt_um_librelane3_test (Tiny Tapeout LibreLane 3 Test) tt_um_10_vga_crossyroad (Crossyroad) tt_um_rebeccargb_universal_decoder (Universal Binary to Segment Decoder) tt_um_rebeccargb_hardware_utf8 (Hardware UTF Encoder/Decoder) tt_um_rebeccargb_intercal_alu (INTERCAL ALU) tt_um_rebeccargb_dipped (Densely Packed Decimal) tt_um_rebeccargb_styler (Styler) tt_um_rebeccargb_vga_timing_experiments (VGA Timing Experiments) tt_um_rebeccargb_colorbars (Color Bars) tt_um_rebeccargb_vga_pride (VGA Pride) tt_um_cw_vref (Current-Mode Bandgap Reference) tt_um_tinytapeout_logo_screensaver (VGA Screensaver with Tiny Tapeout Logo) tt_um_rburt16_opamp_3stage (OpAmp 3stage) tt_um_gamepad_pmod_demo (Gamepad Pmod Demo) tt_um_micro_tiles_container (Micro tile container) tt_um_virantha_enigma (Enigma - 52-bit Key Length) tt_um_jamesrosssharp_1bitam (1bit_am_sdr) tt_um_jamesrosssharp_tiny1bitam (Tiny 1-bit AM Radio) tt_um_MichaelBell_rle_vga (RLE Video Player) tt_um_MichaelBell_mandelbrot (VGA Mandelbrot) tt_um_murmann_group (Decimation Filter for Incremental and Regular Delta-Sigma Modulators) tt_um_betz_morse_keyer (Morse Code Keyer) tt_um_urish_giant_ringosc (Giant Ring Oscillator (3853 inverters)) tt_um_tiny_pll (Tiny PLL) tt_um_tc503_countdown_timer (Countdown Timer) tt_um_richardgonzalez_ped_traff_light (Pedestrian Traffic Light) tt_um_analog_factory_test (TT08 Analog Factory Test) tt_um_alexandercoabad_mixedsignal (mixedsignal) tt_um_tgrillz_sixSidedDie (Six Sided Die) tt_um_mattvenn_analog_ring_osc (Ring Oscillators) tt_um_vga_clock (VGA clock) tt_um_mattvenn_r2r_dac_3v3 (Analog 8 bit 3.3v R2R DAC) tt_um_mattvenn_spi_test (SPI test) tt_um_quarren42_demoscene_top (asic design is my passion) tt_um_micro_tiles_container_group2 (Micro tile container (group 2)) tt_um_z2a_rgb_mixer (RGB Mixer demo) tt_um_frequency_counter (Frequency counter) tt_um_urish_sic1 (SIC-1 8-bit SUBLEQ Single Instruction Computer) tt_um_tobi_mckellar_top (Capacitive Touch Sensor) tt_um_log_afpm (16-bit Logarithmic Approximate Floating Point Multiplier) tt_um_uwasic_dinogame (UW ASIC - Optimized Dino) tt_um_ece298a_8_bit_cpu_top (8-Bit CPU) tt_um_tqv_peripheral_harness (Rotary Encoder Peripheral) tt_um_led_matrix_driver (SPI LED Matrix Driver) tt_um_2048_vga_game (2048 sliding tile puzzle game (VGA)) tt_um_mac (MAC) tt_um_dpmunit (DPM_Unit) tt_um_nitelich_riscyjr (RISCY Jr.) tt_um_nitelich_conway (Conway's GoL) tt_um_pwen (Pulse Width Encoder) tt_um_mcs4_cpu (MCS-4 4004 CPU) tt_um_mbist (Design of SRAM BIST) tt_um_weighted_majority (Weighted Majority Voter / Trend Detector) tt_um_brandonramos_VGA_Pong_with_NES_Controllers (VGA Pong with NES Controllers) tt_um_brandonramos_opamp_ladder (2-bit Flash ADC) tt_um_NE567Mixer28 (OTA folded cascode) tt_um_acidonitroso_programmable_threshold_voltage_sensor (Programmable threshold voltage sensor) tt_um_DAC1 (tt_um_DAC1) tt_um_trivium_stream_processor (Trivium Stream Cipher) tt_um_analog_example (Digital OTA) tt_um_sortaALUAriaMitra (Sorta 4-Bit ALU) tt_um_RoyTr16 (Connect Four VGA) tt_um_jnw_wulffern (JNW-TEMP) tt_um_serdes (Secure SERDES with Integrated FIR Filtering) tt_um_limpix31_r0 (VGA Human Reaction Meter) tt_um_torurstrom_async_lock (Asynchronous Locking Unit) tt_um_galaguna_PostSys (Post's Machine CPU Based) tt_um_edwintorok (Rounding error) tt_um_td4 (tt-td04) tt_um_snn (Reward implemented Spiking Neural Network) tt_um_matrag_chirp_top (Tiny Tapeout Chirp Modulator) tt_um_sha256_processor_dvirdc (SHA-256 Processor) tt_um_pchri03_levenshtein (Fuzzy Search Engine) tt_um_AriaMitraClock (12 Hour Clock (with AM and PM)) tt_um_swangust (posit8_add) tt_um_DelosReyesJordan_HDL (Reaction Time Test) tt_um_upalermo_simple_analog_circuit (Simple Analog Circuit) tt_um_swangust2 (posit8_mul) tt_um_thexeno_rgbw_controller (RGBW Color Processor) tt_um_top_layer (Spike Detection and Classification System) tt_um_Alida_DutyCycleMeter (Duty Cycle Meter) tt_um_dco (Digitally Controlled Oscillator) tt_um_8bitalu (8-bit Pipelined ALU) tt_um_resfuzzy (resfuzzy) tt_um_javibajocero_top (MarcoPolo) tt_um_Scimia_oscillator_tester (Oscillator tester) tt_um_ag_priority_encoder_parity_checker (Priority Encoder with Parity Checker) tt_um_tnt_mosbius (tnt's variant of SKY130 mini-MOSbius) tt_um_program_counter_top_level (Test Design 1) tt_um_subdiduntil2_mixed_signal_classifier (Mixed-signal Classifier) tt_um_dac_test3v3 (Analog 8 bit 3.3v R2R DAC) tt_um_LPCAS_TP1 ( LPCAS_TP1 ) tt_um_regfield (Register Field) tt_um_delaychain (Delay Chain) tt_um_tdctest_container (Micro tile container) tt_um_spacewar (Spacewar) tt_um_Enhanced_pll (Enhance PLL) tt_um_romless_cordic_engine (ROM-less Cordic Engine) tt_um_ev_motor_control (PLC Based Electric Vehicle Motor Control System) tt_um_plc_prg (PLC-PRG) tt_um_kishorenetheti_tt16_mips (8-bit MIPS Single Cycle Processor) tt_um_snn_core (Adaptive Leaky Integrate-and-Fire spiking neuron core for edge AI) tt_um_myprocessor (8-bit Custom Processor) tt_um_sjsu (SJSU vga demo) tt_um_vedic_4x4 (Vedic 4x4 Multiplier) tt_um_braun_mult (8x8 Braun Array Multiplier) tt_um_r2r_dac (4-bit R2R DAC) tt_um_stochastic_integrator_tt9_CL123abc (Stochastic Integrator) tt_um_uart (UART Controller with FIFO and Interrupts) tt_um_lfsr_stevej (Linear Feedback Shift Register) tt_um_FFT_engine (FFT Engine) tt_um_tpu (Tiny Tapeout Tensor Processing Unit) tt_um_tt_tinyQVb (TinyQV 'Berzerk' - Crowdsourced Risc-V SoC) tt_um_IZ_RG_22 (IZ_RG_22) tt_um_32_bit_fp_ALU_S_M (32-bit floating point ALU) tt_um_AriaMitraGames (Games (Tic Tac Toe and Rock Paper Scissors)) tt_um_sc_bipolar_qif_neuron (Stochastic Computing based QIF model neuron) tt_um_mac_spst_tiny (Low Power and Enhanced Speed Multiplier, Accumulator with SPST Adder) tt_um_kb2ghz_xalu (4-bit minicomputer ALU) tt_um_emmersonv_tiq_adc (3 Bit TIQ ADC) tt_um_simonsays (Simon Says) tt_um_BNN (8-bit Binary Neural Network) tt_um_anweiteck_2stageCMOSOpAmp (2 Stage CMOS Op Amp) tt_um_6502 (Simplified 6502 Processor) tt_um_swangust3 (posit8_div) tt_um_jonathan_thing_vga (VGA-Video-Player) tt_um_wokwi_412635532198550529 (ttsky-pettit-wokproc-trainer) tt_um_vga_hello_world (VGA HELLO WORLD) tt_um_jyblue1001_pll (Analog PLL) tt_um_BryanKuang_mac_peripheral (8-bit Multiply-Accumulate (MAC) with 2-Cycle Serial Interface) tt_um_rebeccargb_tt09ball_screensaver (TT09Ball VGA Screensaver) tt_um_openfpga22 (Open FGPA 2x2 design) tt_um_andyshor_demux (Demux) tt_um_flash_raid_controller (SPI flash raid controller) tt_um_jonnor_pdm_microphone (PDM microphone) tt_um_digital_playground (Sky130 Digital Playground) tt_um_mod6_counter (Mod-6 Counter) tt_um_BMSCE_T2 (Choreo8) tt_um_Richard28277 (4-bit ALU) tt_um_shuangyu_top (Calculator) tt_um_wokwi_441382314812372993 (Sumador/restador de 4 bits) tt_um_TensorFlowE (TensorFlowE) tt_um_wokwi_441378095886546945 (7SDSC) tt_um_wokwi_440004235377529857 (Latched 4-bits adder) tt_um_dlmiles_tqvph_i2c (TinyQV I2C Controller Device) tt_um_markgarnold_pdp8 (Serial PDP8) tt_um_wokwi_441564414591667201 (tt-parity-detector) tt_um_vga_glyph_mode (VGA Glyph Mode) tt_um_toivoh_pwl_synth (PiecewiseOrionSynth Deluxe) tt_um_minirisc (MiniRISC-FSM) tt_um_wokwi_438920793944579073 (Multiple digital design structures) tt_um_sleepwell (Sleep Well) tt_um_lcd_controller_Andres078 (LCD_controller) tt_um_SummerTT_HDL (SJSU Summer Project: Game of Life) tt_um_chrishtet_LIF (Leaky Integrate and Fire Neuron) tt_um_diff (ttsky25_EpitaXC) tt_um_htfab_split_flops (Split Flops) tt_um_alu_4bit_wrapper (4-bit ALU with Flags) tt_um_tnt_rf_test (TTSKY25A Register File Test) tt_um_mosbius (mini mosbius) tt_um_robot_controller_top_module (AR Chip) tt_um_flummer_ltc (Linear Timecode (LTC) generator) tt_um_stress_sensor (Tiny_Tapeout_2025_three_sensors) tt_um_krisjdev_manchester_baby (Manchester Baby) tt_um_mbikovitsky_audio_player (Simple audio player) tt_um_wokwi_414123795172381697 (TinySnake) tt_um_vga_example (Jabulani Ball VGA Demo ) tt_um_stochastic_addmultiply_CL123abc (Stochastic Multiplier, Adder and Self-Multiplier) tt_um_nvious_graphics (nVious Graphics) tt_um_pe_simonbju (pe) tt_um_mikael (TinyTestOut) tt_um_brent_kung (brent-kung_4) tt_um_7FM_ShadyPong (ShadyPong) tt_um_algofoogle_vga_matrix_dac (Analog VGA CSDAC experiments) tt_um_tv_b_gone (TV-B-Gone) tt_um_sjsu_vga_music (SJSU Fight Song) tt_um_fsm_haz (FSM based RISC-V Pipeline Hazard Resolver) tt_um_dma (DMA controller) tt_um_3v_inverter_SiliconeGuide (Analog Double Inverter) tt_um_rejunity_lgn_mnist (LGN hand-written digit classifier (MNIST, 16x16 pixels)) tt_um_gray_sobel (Gray scale and Sobel filter for Edge Detection) tt_um_Xgamer1999_LIF (Demonstration of Leaky integrate and Fire neuron SJSU) tt_um_dac12 (12 bit DAC) tt_um_voting_machine (Digital Voting Machine) tt_um_updown_counter (8bit_up-down_counter) tt_um_openram_top (Single Port OpenRAM Testchip) tt_um_customalu (Custom ALU) tt_um_assaify_mssf_pll (24 MHz MSSF PLL) tt_um_Maj_opamp (2-Stage OpAmp Design) tt_um_wokwi_442131619043064833 (Encoder 7 segments display) tt_um_wokwi_441835796137492481 (TESVG Binary Counter and shif register ) tt_um_combo_haz (Combinational Logic Based RISC-V Pipeline Hazard Resolver) tt_um_tx_fsm (Design and Functional Verification of Error-Correcting FIFO Buffer with SECDED and ARQ ) tt_um_will_keen_solitaire (solitaire) tt_um_rom_vga_screensaver (VGA Screensaver with embedded bitmap ROM) tt_um_13hihi31_tdc (Time to Digital Converter) tt_um_dteal_awg (Arbitrary Waveform Generator) tt_um_LIF_neuron (AFM_LIF) tt_um_rebelmike_register (Circulating register test) tt_um_MichaelBell_hs_mul (8b10b decoder and multiplier) tt_um_SNPU (random_latch) tt_um_rejunity_atari2600 (Atari 2600) tt_um_bit_serial_cpu_top (16-bit bit-serial CPU) tt_um_semaforo (semaforo) tt_um_bleeptrack_cc1 (Cross stitch Creatures #1) tt_um_bleeptrack_cc2 (Cross stitch Creatures #2) tt_um_bleeptrack_cc3 (Cross stitch Creatures #3) tt_um_bleeptrack_cc4 (Cross stitch Creatures #4) tt_um_bitty (Bitty) tt_um_spi2ws2811x16 (spi2ws2811x8) tt_um_uart_spi (UART and SPI Communication blocks with loopback) tt_um_urish_charge_pump (Dickson Charge Pump) tt_um_adc_dac_tern_alu (adc_dac_BCT_addr_ALU_STI) tt_um_sky1 (GD Sky Processor) tt_um_fifo (ASYNCHRONOUS FIFO) tt_um_TT16 (Asynchronous FIFO) tt_um_axi4lite_top (Axi4_Lite) tt_um_TT06_pwm (PWM Generator) tt_um_hack_cpu (HACK CPU) tt_um_marxkar_jtag (JTAG CONTROLLER) tt_um_cache_controller (Simple Cache Controller) tt_um_stopwatchtop (Stopwatch with 7-seg Display) tt_um_adpll (all-digital pll) tt_um_tnt_rom_test (TT09 SKY130 ROM Test) tt_um_tnt_rom_nolvt_test (TT09 SKY130 ROM Test (no LVT variant)) tt_um_wokwi_414120207283716097 (fulladder) tt_um_kianV_rv32ima_uLinux_SoC (KianV uLinux SoC) tt_um_tv_b_gone_rom (TV-B-Gone-EU) Available Available Available Available Available Available Available Available Available Available Available Available Available