I attended the [Logic Design Laboratory] class from August 2022 to December 2022, supervised by Professor IlYong Chun.
In this class, I learned how to use Verilog HDL and took part in the final-term project: 2D Convolution Module Design.
The overall module consists of a Controller, a Memory, a Computation module, and a Display module. It was a team project, and my part was the design of the computation module.
The goal of the computation module is to perform 2D convolution between a 4x4 input and a 3x3 filter, producing a 2x2 output, in three configurations: a single PE (Processing Element), a 3x3 SA (Systolic Array), and a 2x2 SA. The performance of the three cases is then compared. Details are below.
[Click Here ↓ ] to see the code (a detailed explanation is below)
[Figure 1] 2D Convolution Operation
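As a software reference for the operation in [Figure 1], here is a minimal Python sketch of a 4x4 input convolved with a 3x3 filter into a 2x2 output. It assumes no padding, a stride of 1, and no kernel flip (the sliding-window form common in hardware accelerators). The function name and the sample values are illustrative only and are not taken from the Verilog design.

```python
def conv2d_valid(inp, flt):
    """Slide flt over inp (no padding, stride 1, no kernel flip)."""
    n, k = len(inp), len(flt)
    out_size = n - k + 1  # 4 - 3 + 1 = 2 for this project
    out = [[0] * out_size for _ in range(out_size)]
    for i in range(out_size):
        for j in range(out_size):
            # Each output is the sum of 9 element-wise products.
            out[i][j] = sum(
                inp[i + r][j + c] * flt[r][c]
                for r in range(k)
                for c in range(k)
            )
    return out


# Illustrative values (assumed, not from the project).
inp = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
flt = [[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]]

result = conv2d_valid(inp, flt)
print(result)  # [[18, 21], [30, 33]]
```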
[Figure 2] Overall Convolution Module
A PE must be able to perform multiplication and accumulation. A single PE module cannot store four outputs at once, so it cannot compute the four outputs in parallel and has to produce them one at a time, which results in poor performance.
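The sequential behaviour described above can be sketched with a small Python model. This is a behavioural assumption, not the Verilog PE itself: the `PE` class, its `clear`/`mac` methods, and the step counter are hypothetical names, and bit widths are not modelled. The counter tallies multiply-accumulate steps in this model only. It is not the clock count reported in [Table 1].

```python
class PE:
    """Hypothetical behavioural model of one PE: a multiply-accumulate unit."""

    def __init__(self):
        self.acc = 0

    def clear(self):
        self.acc = 0

    def mac(self, a, b):
        # Multiply the two operands and add the product to the accumulator.
        self.acc += a * b


def single_pe_conv(inp, flt):
    """Compute every output with one PE, one output after another."""
    pe = PE()
    k = len(flt)
    out_size = len(inp) - k + 1
    outputs, mac_steps = [], 0
    for i in range(out_size):
        for j in range(out_size):
            pe.clear()  # the accumulator holds a single output at a time
            for r in range(k):
                for c in range(k):
                    pe.mac(inp[i + r][j + c], flt[r][c])
                    mac_steps += 1
            outputs.append(pe.acc)
    return outputs, mac_steps


# Illustrative values (assumed): input holds 0..15, filter is all ones.
inp = [[4 * i + j for j in range(4)] for i in range(4)]
flt = [[1] * 3 for _ in range(3)]

outputs, mac_steps = single_pe_conv(inp, flt)
print(outputs, mac_steps)  # [45, 54, 81, 90] 36
```

In this model the four outputs cost 4 x 9 = 36 multiply-accumulate steps because nothing runs in parallel.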
[Click here!] to see the Verilog HDL code for the Processing Element
Each element of the arrays has 8-bit precision.
[Click here!] to see the Verilog HDL code for the Single PE
I designed the SA with an output-stationary dataflow: the input and filter values are aligned to the vertical and horizontal inputs of the SA, as shown in [Figure 3]. "A of C11" denotes the blue part of the 4x4 input array, which is used to compute C11, as shown in [Figure 4].
[Figure 3] 3x3 SA in output-stationary dataflow
[Figure 4] A of C11
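To illustrate the output-stationary idea in general terms, the following Python sketch simulates an array in which every PE keeps its own accumulating output while operands flow to the right and downward. It uses a generic matrix product with skewed input streams as the example. It is not the exact stream layout of [Figure 3], and the function name, register arrays, and sample matrices are assumptions made for illustration.

```python
def systolic_os_matmul(A, B):
    """Cycle-level sketch of an output-stationary array computing A x B.

    PE (i, j) accumulates output (i, j). Row i of A enters from the left
    delayed by i cycles; column j of B enters from the top delayed by j cycles.
    """
    rows, depth, cols = len(A), len(B), len(B[0])
    acc = [[0] * cols for _ in range(rows)]    # stationary outputs
    a_reg = [[0] * cols for _ in range(rows)]  # operands moving right
    b_reg = [[0] * cols for _ in range(rows)]  # operands moving down
    cycles = depth + rows + cols - 2
    for t in range(cycles):
        new_a = [[0] * cols for _ in range(rows)]
        new_b = [[0] * cols for _ in range(rows)]
        for i in range(rows):
            for j in range(cols):
                if j == 0:
                    k = t - i  # skewed stream entering row i
                    a = A[i][k] if 0 <= k < depth else 0
                else:
                    a = a_reg[i][j - 1]  # value latched by the left neighbour
                if i == 0:
                    k = t - j  # skewed stream entering column j
                    b = B[k][j] if 0 <= k < depth else 0
                else:
                    b = b_reg[i - 1][j]  # value latched by the upper neighbour
                acc[i][j] += a * b
                new_a[i][j], new_b[i][j] = a, b
        a_reg, b_reg = new_a, new_b
    return acc, cycles


# Illustrative 3x3 operands (assumed values).
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[1, 0, 2], [0, 1, 0], [3, 0, 1]]

C, cycles = systolic_os_matmul(A, B)
print(C)       # [[10, 2, 5], [22, 5, 14], [34, 8, 23]]
print(cycles)  # 7
```

The cycle count of this model (stream length plus the skew across rows and columns) only describes the sketch and should not be read as the measured result of the Verilog design.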
[Click here!] to see the Verilog HDL code for the 3x3 SA
The 3x3 SA has 9 PEs, so nine storage locations are available for the four outputs, which allows fast computation without overlapping rows and columns. The 2x2 SA, however, has only 4 PEs, and its four storage locations must be used directly for the four outputs, so I inserted many 0s into the vertical/horizontal input strings, as shown in [Figure 6], to keep the results from being corrupted. Because of these longer input strings, the 2x2 SA performs worse than the 3x3 SA.
[Figure 5] 2x2 SA
[Figure 6] Vertical/Horizontal input strings
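One way to see why zero insertion lengthens the strings is the hypothetical schedule below. It is an assumption for illustration, not the actual strings of [Figure 6]. It models PE (i, j) of a 2x2 output-stationary array as accumulating the dot product of horizontal string i and vertical string j, with the input skew omitted. Each horizontal string carries two 3x3 windows back to back, and each vertical string carries the filter in one time slot and zeros in the other, so every PE picks up only the products that belong to its own output.

```python
def patch(inp, i, j, k=3):
    """Flatten the k x k window of inp whose top-left corner is (i, j)."""
    return [inp[i + r][j + c] for r in range(k) for c in range(k)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


# Illustrative values (assumed, not from the project).
inp = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
flt = [[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]]

f = [value for row in flt for value in row]
zeros = [0] * len(f)

# Horizontal string i: the windows for the two outputs in output row i.
h = [patch(inp, 0, 0) + patch(inp, 0, 1),
     patch(inp, 1, 0) + patch(inp, 1, 1)]
# Vertical string j: the filter appears only in the time slot of column j.
v = [f + zeros,
     zeros + f]

# PE (i, j) accumulates dot(h[i], v[j]); the zeros cancel unwanted products.
out = [[dot(h[i], v[j]) for j in range(2)] for i in range(2)]
string_length = len(h[0])

print(out)            # [[46, 52], [70, 76]]
print(string_length)  # 18
```

In this sketch each string is 18 values long instead of 9, so half of the multiplications are by zero. That is the cost of reusing only four accumulators in this particular schedule.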
[Click here!] to see the Verilog HDL code for the 2x2 SA
As explained above, the single PE showed the worst performance, while the 3x3 SA showed the best.
[Table 1] Clock cycles needed for 2D convolution