University of Connecticut
ECE 3401 / CSE 3302: Digital Systems Design
Spring 2023
Programming Assignment 1:
Design of Cascaded ALU Accumulators
Due February 13, 2023 (Monday) @ 11:59 PM on HuskyCT
1 Introduction
This programming assignment requires you to design a system with two ALUs that perform
arithmetic operations on 32-bit operands in a cascaded design. Each operand comes either
from an external 32-bit input or from feedback from one of two registers. The design must
provide the following functionality:
• The following ALU operations:
    - 32-bit addition
    - 32-bit multiplication
    - 32-bit shift-logical-right
    - 32-bit compare-if-equal
• Logic to control arithmetic operations, based on a select input.
• Logic to control ALU and 64-bit register inputs using multiplexers.
• Verification of the design using a test bench.

2 Description
The figure below shows the architecture of the system:
[Figure: Overall Design of PA1]
The ALUs operate continuously on the external inputs and on the values stored in the
registers. X[31:0] and Y[31:0] are the external input signals to the system, and each is
32 bits wide. The operations of the arithmetic logic units ALU1 and ALU2 are controlled by
the alu1_sel and alu2_sel signals, as described in Table 1 below:
Table 1: ALU Operation Selection

alu1_sel / alu2_sel    Operation
00                     addition
01                     multiplication
10                     logical right shift
11                     compare-if-equal

The r1_en and r2_en signals enable the 64-bit registers R1 and R2. When its enable is ‘1’,
a register propagates the value from D to Q on the rising edge of clk. Otherwise, the
register holds the old value of Q.
The register reset signal, reset, synchronously clears the values of R1 and R2. When reset
is ‘1’, the values held by R1 and R2 are set to 0 on the rising edge of clk. When reset is
‘0’, the registers operate normally. The reset signal should remain ‘0’ except when you need
to clear the outputs of the registers. Initialize the register outputs to 0 by asserting
reset at the beginning of simulation.
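The register behavior described above can be sketched as a small Python reference model, which may be handy for predicting waveform values by hand. This is not the VHDL you are asked to write: the class name `Register64` and the method name `rising_edge` are hypothetical, and giving reset priority over the enable is an assumption, since the handout does not say which signal wins when both are ‘1’.

```python
MASK64 = (1 << 64) - 1  # R1 and R2 are 64 bits wide


class Register64:
    """Behavioral sketch of one 64-bit register with enable and synchronous reset."""

    def __init__(self):
        # The model starts at 0 for convenience; in simulation the design
        # relies on asserting reset to reach this state.
        self.q = 0

    def rising_edge(self, d, en, reset):
        """Return the new Q after one rising edge of clk."""
        if reset == 1:
            self.q = 0              # assumption: reset takes priority over enable
        elif en == 1:
            self.q = d & MASK64     # enabled: D propagates to Q
        # en == 0 and reset == 0: Q holds its old value
        return self.q


r = Register64()
print(r.rising_edge(42, en=1, reset=0))  # 42 (loaded)
print(r.rising_edge(7, en=0, reset=0))   # 42 (held)
print(r.rising_edge(7, en=1, reset=1))   # 0  (cleared)
```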
There are also three MUX control signals: m1_sel, m2_sel, and m3_sel. When a select signal
is ‘0’, the MUX propagates its input A to the output; otherwise, it propagates input B.
M1 chooses between the input Y and the output Z1 as operand A of ALU2, while M2 and M3 each
select a feedback signal, Z1 or Z2, as operand B of ALU2 and ALU1, respectively.
Each ALU is expected to perform unsigned integer add, multiply, logical right shift, and
compare-if-equal operations. We do not require structural VHDL for these operations. You
should use behavioral operators and the appropriate IEEE libraries for your design. One
design option is to use a constrained unsigned type for the input operands (e.g.,
X: in unsigned (31 downto 0); for the X operand) and to use the IEEE.numeric_bit package
(IEEE.numeric_bit.all) to perform the add and multiply operations. The comparison can be
performed via library functions, or simply with the = operator. Note that for the right
shift operation, input B is the value to be shifted right by a single bit, and input A is
unused. All non-shift operations use both operands A and B.
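As a cross-check for expected values, the four ALU operations can be modeled in a few lines of Python. This is only a sketch under stated assumptions: the function name `alu` is hypothetical, the result is assumed to be zero-extended to the 64-bit register width, and the compare-if-equal result is assumed to be encoded as 1 (equal) or 0 (not equal), which the handout does not specify.

```python
MASK32 = (1 << 32) - 1  # operand width
MASK64 = (1 << 64) - 1  # assumed result width (matches the 64-bit registers)


def alu(a, b, sel):
    """Behavioral sketch of one ALU; sel follows Table 1 (00, 01, 10, 11)."""
    a &= MASK32
    b &= MASK32
    if sel == 0b00:
        result = a + b                   # unsigned addition
    elif sel == 0b01:
        result = a * b                   # unsigned multiplication
    elif sel == 0b10:
        result = b >> 1                  # B shifted right by one bit; A unused
    elif sel == 0b11:
        result = 1 if a == b else 0      # assumed encoding of compare-if-equal
    else:
        raise ValueError("sel must be a 2-bit value")
    return result & MASK64


print(alu(5, 6, 0b00))    # 11
print(alu(5, 6, 0b01))    # 30
print(alu(123, 8, 0b10))  # 4
print(alu(7, 7, 0b11))    # 1
```

Because both operands are at most 32 bits wide, neither the sum nor the product can exceed 64 bits in this model.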
The register output values, Z1 and Z2, are 64-bit buses. As mentioned above, the values
of the multiplexer select signals determine which outputs are used as the ALU1 and ALU2
feedback inputs. Only the lower 32 bits of each register output are connected to the inputs
of the multiplexers (i.e., any ALU input signal coming from a register output must be
truncated to match the 32-bit width of the ALU inputs). Keep in mind that we do not consider
overflow in the design of our digital system, so these conditions may be ignored.
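The truncation and multiplexer selection can be illustrated with a short Python sketch. The helper names `low32` and `mux2` are hypothetical; the select convention (‘0’ picks input A, otherwise input B) follows the description of the MUX control signals.

```python
MASK32 = (1 << 32) - 1


def low32(z):
    """Keep only the lower 32 bits of a 64-bit register output."""
    return z & MASK32


def mux2(sel, a, b):
    """2-to-1 multiplexer: sel = 0 selects input A, otherwise input B."""
    return a if sel == 0 else b


z1 = 0x0000000100000005           # example 64-bit register value
y = 9                             # example external input
print(low32(z1))                  # 5: the upper 32 bits are dropped
print(mux2(0, y, low32(z1)))      # 9: input A selected
print(mux2(1, y, low32(z1)))      # 5: input B selected
```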
You will use separate VHDL modules for i) the register bank (dff.vhd), ii) the ALU (alu.vhd),
and iii) the overall PA1 module (pa1.vhd). You are given top-level modules with entity
instantiations, and you are expected to write the architecture for each module. In the pa1
module specifically (pa1.vhd), you are given the architecture declaration, signal
definitions, and entity instantiations, and you are required to fill in the logic for each
signal and the port maps for each of these entity instantiations. Note that entity
instantiation is an alternative way to instantiate modules. You can read more about entity
instantiation and how it works compared to component instantiation here. It is recommended
that you use explicit port connection definitions in your port maps, which are used
exclusively in the examples from the hyperlink provided above. You will be graded on the
design of these modules and their functionality.

3 Test Bench
A good design includes a test bench. Complete the following simulations using the test
benches testbench1.vhd and testbench2.vhd. The first test bench is provided and described in
section 3.1. The second test bench, in section 3.2, gives you a high-level function that you
need to implement. You should use testbench1.vhd as a reference when writing your own code
for the second test bench.
3.1 Calculating 3D Distance with a Test Bench
In the provided example test bench (testbench1.vhd), we use the digital design to find a 3D
point’s (squared) distance from the origin. Given a point’s coordinates, (x, y, z), and the
origin, (0, 0, 0), we calculate the squared distance between them using the following
equation:
$$r^2 = x^2 + y^2 + z^2 \tag{1}$$
When you are finished writing each module (dff.vhd, alu.vhd, and pa1.vhd), simulate the
design using testbench1.vhd. To select the current simulation, right click “testbench1” and
select “set as top” (it should then appear in bold). By default, the test bench uses the
coordinate (x, y, z) = (5, 6, 7) to test the functionality. After testing and confirming the
result, feel free to change the values of the defined constants (lines 30, 31, and 32 in
testbench1.vhd) to (x, y, z) = (a, b, c) to test your code with alternate coordinates. The
test bench data flow for generic coordinates (x, y, z) = (a, b, c) is as follows:
• Reset the system so R1 and R2 are zero. Set the inputs X ← a, Y ← b. Set up
R1 ← X + R1 and R2 ← Y + R1. Wait one clock cycle (R1 = a and R2 = b).
• Set up R1 ← X · R1 and R2 ← Y · R2. Wait one clock cycle (R1 = a^2 and R2 = b^2).
• Set input X ← 0. Set up R1 ← X · R1 and R2 ← R1 + R2. Wait one clock cycle
(R1 = 0 and R2 = a^2 + b^2).
• Set input X ← c. Hold R2 output by disabling the register. Wait one clock cycle
(R1 = c and R2 = a^2 + b^2).
• Set up R1 ← R1 · X. Wait one clock cycle (R1 = c^2 and R2 = a^2 + b^2).
• Set up R2 ← R1 + R2. Enable R2 so it can update. Wait one clock cycle
(R2 = a^2 + b^2 + c^2). Done!
Keep in mind that the R1 output is mapped to Z1 and R2 is mapped to Z2, so these
are the signals you should observe in your simulation. For in-depth details on how this
is implemented using the designed system, read through testbench1.vhd. Comments are
provided for you to follow the test bench flow as a tutorial.
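To predict the Z1 and Z2 values you should see at each clock edge, the six steps above can be replayed on a small Python model of the datapath. This is a sketch under assumptions, not a transcription of testbench1.vhd: the names `PA1Model`, `clock`, and `squared_distance` are hypothetical; the MUX input order (M1: ‘0’ selects Y and ‘1’ selects Z1; M2 and M3: ‘0’ selects Z1 and ‘1’ selects Z2) is inferred from the text because the figure is not reproduced here; loading c in step 4 is assumed to use R1 ← X + R1 with R1 = 0; and R1 is assumed to be disabled in the final step.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1
ADD, MUL, SHR, EQ = 0b00, 0b01, 0b10, 0b11  # Table 1 encodings


def alu(a, b, sel):
    """ALU sketch: 32-bit unsigned operands, result kept to 64 bits."""
    a, b = a & MASK32, b & MASK32
    if sel == ADD:
        return (a + b) & MASK64
    if sel == MUL:
        return (a * b) & MASK64
    if sel == SHR:
        return b >> 1
    return 1 if a == b else 0  # assumed compare-if-equal encoding


class PA1Model:
    """Cycle-level sketch of the two-ALU, two-register datapath."""

    def __init__(self):
        self.z1 = 0  # output of R1
        self.z2 = 0  # output of R2

    def clock(self, x, y, alu1_sel=ADD, alu2_sel=ADD, m1_sel=0, m2_sel=0,
              m3_sel=0, r1_en=1, r2_en=1, reset=0):
        """Apply the control signals and advance one rising edge of clk."""
        if reset:
            self.z1 = self.z2 = 0
            return
        fb1, fb2 = self.z1 & MASK32, self.z2 & MASK32  # truncated feedback
        alu2_a = y if m1_sel == 0 else fb1      # M1 (assumed input order)
        alu2_b = fb1 if m2_sel == 0 else fb2    # M2 (assumed input order)
        alu1_b = fb1 if m3_sel == 0 else fb2    # M3 (assumed input order)
        d1 = alu(x, alu1_b, alu1_sel)
        d2 = alu(alu2_a, alu2_b, alu2_sel)
        if r1_en:
            self.z1 = d1
        if r2_en:
            self.z2 = d2


def squared_distance(a, b, c):
    dut = PA1Model()
    dut.clock(0, 0, reset=1)                                # clear R1 and R2
    dut.clock(a, b, alu1_sel=ADD, alu2_sel=ADD)             # R1 = a, R2 = b
    dut.clock(a, b, alu1_sel=MUL, alu2_sel=MUL, m2_sel=1)   # R1 = a^2, R2 = b^2
    dut.clock(0, b, alu1_sel=MUL, alu2_sel=ADD,
              m1_sel=1, m2_sel=1)                           # R1 = 0, R2 = a^2 + b^2
    dut.clock(c, b, alu1_sel=ADD, r2_en=0)                  # R1 = c, R2 held
    dut.clock(c, b, alu1_sel=MUL, r2_en=0)                  # R1 = c^2, R2 held
    dut.clock(c, b, alu2_sel=ADD, m1_sel=1, m2_sel=1,
              r1_en=0)                                      # R2 = a^2 + b^2 + c^2
    return dut.z2


print(squared_distance(5, 6, 7))  # 110
```

For the default coordinate (5, 6, 7), the model ends with Z2 = 25 + 36 + 49 = 110, which is the value to look for in the waveform.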

3.2 Series Calculation using a Test Bench
For a certain calculation, you need to compute the result of the following function:

$$f(n) = \sum_{k=1}^{n} k(k+1)(k+2) \tag{2}$$

However, you realize you can calculate the function much more efficiently by using the
following series equivalence:

$$\sum_{k=1}^{n} k(k+1)(k+2) = \frac{n(n+1)(n+2)(n+3)}{4} \tag{3}$$
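A quick numerical check of the series equivalence, with the closed form written as n(n + 1)(n + 2)(n + 3)/4, can be done in Python. The function names `f_direct` and `f_closed` are hypothetical and exist only for this check.

```python
def f_direct(n):
    """Sum k(k + 1)(k + 2) term by term for k = 1..n."""
    return sum(k * (k + 1) * (k + 2) for k in range(1, n + 1))


def f_closed(n):
    """Closed form n(n + 1)(n + 2)(n + 3) / 4."""
    # Among four consecutive integers, two are even and one of those is a
    # multiple of 4, so the product is divisible by 4 and the integer
    # division below is exact.
    return n * (n + 1) * (n + 2) * (n + 3) // 4


print(f_direct(5), f_closed(5))  # 420 420
```

For n = 5 both forms give 420 (6 + 24 + 60 + 120 + 210 = 420 and 5 · 6 · 7 · 8 / 4 = 1680 / 4 = 420). Note that dividing an unsigned value by 4 is equivalent to two one-bit logical right shifts.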

You now want to use your digital design to efficiently compute f(n) using the skeleton
test bench testbench2.vhd, but due to resource constraints, you are given the following
restriction: input X is hardwired to 1 and input Y is hardwired to n. That is, in the test
bench, the input assignments X ← 1 and Y ← n are held constant for the duration of the test
bench, and they may not be modified.
Using equation (3) and your completed digital design, compute f(5) in as few clock cycles
as possible, using the skeleton test bench testbench2.vhd to write your code. Although
you are only responsible for showing the f(5) simulation, you are free to try different
input values of n by modifying the constant assignment at line 30. Remember that you
cannot modify X and Y after the initial X ← 1 and Y ← n assignments, but all other control
signals (MUX select, ALU select, register enable, and reset) may be changed at any point,
unrestricted. (Hint: sketch out the data flow required for the calculation first, and write
out what values are accumulated at each clock cycle. Then, translate it to test bench
control signals to get the desired behavior.)
Your result should be in Z2 (the output of R2) at the end of your computation.
It will help to first read through the testbench1.vhd code and comments to understand the
PA1 test bench flow.

4 Deliverables
Please submit the following report saved as a single PDF:
1. Your code for each module and your second testbench. You can copy and paste the
code into a Word document; make sure to clearly label each code block.
2. Submit screenshots of the following:
• Your output waveform of testbench1.vhd using (x, y, z) = (5, 6, 7). Show the
waveform from 0 ns to 120 ns. The clk, reset, control, input, and output signals
should all be clearly visible. The bit vectors X, Y, Z1, and Z2 should be displayed in
unsigned decimal radix (check the “adding signals to waveform” guide if you need
help formatting or changing the radix).
• Your output waveform of testbench2.vhd computing f(5). Show the waveform
from 0 ns to 150 ns. If it takes longer than 150 ns to compute your output, show
enough of the waveform to clearly show the result (multiple screenshots are fine).
Format all the waveform signals the same way as for the previous test bench.