Projects‎ > ‎

Sudoku ASIC

I took Paul Franzon's ECE 520 ASIC Design class in 2007 and designed a Sudoku ASIC capable of solving 300,000 puzzles per second.  Before designing the chip in Verilog, my classmate and I carefully considered different approaches to this problem, since our grade depended on both speed and chip area (and getting the project done on time).  I wrote algorithms in C to study their performance and ensure they could solve the sample puzzles.  Since the ASIC didn't have onboard RAM, and external RAM had a high access/area cost, we needed an algorithm that minimized use of external RAM.  Our two final candidate algorithms were a stack-based algorithm and a parallel row/column/box scanning algorithm.  The stack algorithm was elegant and required very little chip area, but suffered a large external RAM access penalty.  We finally settled on a parallel scanning algorithm.  The final chip design was quite elegant, with three scanning engines and a scoreboard to keep track of progress toward a solution.  We finished the project ahead of schedule and received a top score for our solution.
I think the main lessons I learned from this exercise were (a) keep it simple and (b) do a lot of design/research up-front.   We really concentrated on the simplest possible algorithms we could find, since it was no use having a highly-optimized algorithm that we could not possibly implement within the allotted schedule.  We found that that time spent talking about the different algorithms, doing napkin sketches, and simulating in C was well-worth the effort in the long run.  It seemed like we were burning a lot of time up-front, but once we started coding Verilog, our work went very smoothly, and we spent very little time tweaking our simulation once we got it running.