Remark: This page will detail some of the projects I have done for school.
Date: Winter 2012
Instructor: Michael Franz
Using C#, I implemented an optimizing compiler for a simple programming language. The language's features include global variables, local variables, arrays, functions, while loops, and if statements. The compiler's code generator outputs a native program for the DLX processor. The techniques used for optimizing and generating machine code are listed and briefly explained below:
Static Single Assignment (SSA) Form - a representation in which each assignment defines a new, separate "variable," so every "variable" is assigned exactly once.
Phi Functions - a device used at the point where branches join to resolve the values assigned on each incoming branch. (For example, a variable can be assigned one value within the if-block of an if-statement and a different value within the else-block.)
Dominator Tree - a structure for a given program that shows which instructions are always executed before a particular later instruction. Used for Copy Propagation and CSE (see below).
Copy Propagation - the process of replacing references to a copied "variable" in later instructions with the "variable" it was copied from, which gets rid of redundant aliases.
Common Sub-Expression Elimination (CSE) - the process of eliminating redundant instructions (i.e., instructions with the same operation and the same operands).
Interference Graphs & Graph Coloring - used to allocate registers to clusters of "variables." I used a heuristic for graph coloring in which the weight of a cluster x is f(x) = ((#lines used) / (live range)) + sum(nesting level of each instruction that has x as an operand). The weights of the clusters determine the order of graph coloring (i.e., register allocation).
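To make the weighting heuristic concrete, here is a minimal sketch in Python (not the C# used in the project, and not the project's actual code). The names `Cluster`, `weight`, and `color` are hypothetical. The sketch assumes that heavier clusters are colored first and that a cluster with no free register is marked as spilled with `None`; the description above only says that the weight determines the order.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """A cluster of SSA "variables" that should share one register (hypothetical type)."""
    name: str
    lines_used: int                 # number of lines on which the cluster is used
    live_range: int                 # length of the live range, in lines
    nesting_levels: list = field(default_factory=list)  # nesting level of each instruction using it


def weight(c):
    """f(x) = (#lines used / live range) + sum of nesting levels."""
    return c.lines_used / c.live_range + sum(c.nesting_levels)


def color(clusters, interference, num_registers):
    """Greedy coloring in descending weight order (the ordering direction is an assumption).

    interference maps a cluster name to the set of names it interferes with.
    Returns a dict from cluster name to a register index, or None for a spill.
    """
    assignment = {}
    for c in sorted(clusters, key=weight, reverse=True):
        # Registers already held by interfering clusters that have been colored.
        taken = {assignment[n] for n in interference.get(c.name, ()) if n in assignment}
        free = [r for r in range(num_registers) if r not in taken]
        assignment[c.name] = free[0] if free else None
    return assignment


clusters = [
    Cluster("a", lines_used=4, live_range=8, nesting_levels=[1, 1, 2]),
    Cluster("b", lines_used=2, live_range=4, nesting_levels=[0]),
    Cluster("c", lines_used=3, live_range=6, nesting_levels=[1, 1]),
]
interference = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
registers = color(clusters, interference, num_registers=2)
```

With three mutually interfering clusters and only two registers, the lightest cluster is the one left without a register, which is the usual motivation for ordering by weight.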
Tools Used:
aiSee3 (similar to VCG) - a tool used to visualize graphs. I used this tool to display the control flow graph (CFG) of the optimized intermediate code with its basic blocks, the dominator tree, and the post-dominator tree; it was particularly helpful for debugging.
Download: Source Code
Visualization of Some Compilation Stages:
Date: Fall 2011
Instructor: Charless Fowlkes
The following projects were written in MATLAB.
K-Means Clustering
USPS MailBox Detector
Method: AdaBoost Cascade using color and brightness features.
Download: Source Code
Citations:
Viola, Paul and Jones, Michael. “Robust Real-time Object Detection.” Second International Workshop on Statistical and Computational Theories of Vision – Modeling, Learning, Computing, and Sampling, 13 July 2001.
“AdaBoost.” en.Wikipedia.org. Wikipedia, n.d. Web. 22 Nov. 2011.
Date: Fall 2011
Instructor: Mike Carey
My partner Arthur Valadares and I implemented some core components of a database management system in C/C++: the Paged File Manager, Record Manager, Indexing Component, and Query Engine. Heap files were used to store data.
Paged File Manager - the lowest-level component, which manages the I/O operations that read pages of data from, or write them to, a system file.
Record Manager - we implemented support for variable-sized data (with the assumption that a tuple would be smaller than a page). Additionally, we implemented a catalog to maintain metadata pertaining to tables and their attributes. A statically sized directory of pages, structured as a complete tree, keeps track of free pages and their remaining free space.
Indexing Component - we used a B+ tree data structure for index files.
Query Engine - for queries, we implemented support for conditional filtering, projection, table scans, (tree) index scans, nested loop joins, index nested loop joins, and GRACE hash joins.
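As an illustration of the last join method in the list, here is a minimal in-memory sketch of a GRACE hash join in Python (not the project's C/C++ code). The function name `grace_hash_join`, its parameters, and the dictionary row format are hypothetical. In a real system each partition would be written to disk and read back; plain lists stand in for those partition files here.

```python
from collections import defaultdict


def grace_hash_join(left, right, left_key, right_key, num_partitions=4):
    """Equi-join two lists of dict rows on left[left_key] == right[right_key]."""

    def partition(rows, key):
        # Phase 1: split the input into buckets by hashing the join key.
        # Both inputs use the same hash, so matching rows share a bucket index.
        parts = [[] for _ in range(num_partitions)]
        for row in rows:
            parts[hash(row[key]) % num_partitions].append(row)
        return parts

    left_parts = partition(left, left_key)
    right_parts = partition(right, right_key)

    # Phase 2: for each pair of buckets, build a hash table on the left
    # bucket and probe it with every row of the right bucket.
    result = []
    for left_part, right_part in zip(left_parts, right_parts):
        table = defaultdict(list)
        for l_row in left_part:
            table[l_row[left_key]].append(l_row)
        for r_row in right_part:
            for l_row in table.get(r_row[right_key], ()):
                # Assumes the two inputs use distinct column names.
                result.append({**l_row, **r_row})
    return result


owners = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
items = [
    {"owner": 1, "item": "x"},
    {"owner": 3, "item": "y"},
    {"owner": 3, "item": "z"},
    {"owner": 4, "item": "w"},
]
joined = grace_hash_join(owners, items, "id", "owner")
```

Because only one pair of partitions is joined at a time, the hash table built in the second phase only has to hold one partition of the left input rather than the whole table. The order of the output rows depends on how the keys hash, so callers should not rely on it.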
Tools: Visual Studio 2010, Git
Download: Source Code