EECS 573

Term Research Project

The term project is a research project of your choice, related to the material covered in class. It could explore new solutions in architecture, focusing on either new techniques or new methodologies. It could also apply topics discussed in class to a different area of research. Or it could optimize an application for a particular architecture. You may develop the project in groups of 2, 3, or 4 people. (PhD students pursuing their own research in the class may make a special request to propose a single-person project.)

Please read this page completely! The document also includes a list of suggested projects, if you are looking for ideas. One-page project proposals are due (in class and via email to the professor and GSI) on Monday, October 3.

Deadlines

Project list available: Sept 21
Project proposal due: Oct 3, in class, also email to professor and GSI
Proposal and checkpoint meeting: shortly after Oct 3, watch for a class announcement
Project presentations: Week of December 5 (a 4-hour extended meeting with refreshments)
Final report and deliverables due: December 9 (by end-of-day) via email to professor and GSI

Project proposal

A project proposal specifies the project title, the members of the project team, and the objectives of your project. It briefly describes how you are going to achieve these objectives and what your deliverables will be (that is, the things that you will supply to the professor: typically a report, code, and instructions on how to build and run the code). The proposal should also specify the relevance and novel contributions of your approach and include references to related work in this area. Finally, you should include a development timeline with the milestones to be completed by the first checkpoint meeting. 3-6 paragraphs (max 1 page) will suffice.

Checkpoint meeting

Each group will have a checkpoint meeting to review project status and challenges. All team members must attend the checkpoint meeting with the professor, and the group must provide a 1-page status report summarizing achievements, unresolved issues, and the remaining timeline for project completion.

Project presentation

A 15-minute presentation at the end of the semester on the problem you were trying to solve, motivations, objectives and the results of your work. You should also include some discussion on interesting aspects of your work, problems you encountered and interpretation of your results.

Project report

A final report document on your project of length MAX 4 pages. The deliverables also include the software you developed and/or the verification or hardware code that was part of your project. The final report document must be submitted electronically in PDF format.

IMPORTANT POINT: please include a section detailing "Group Dynamics", which specifically states which work was done by each team member.

Grading

The project is worth 40% of your overall grade in EECS 573.

Project Ideas

New: Port your favorite application to the VIP-Bench Benchmark Suite. VIP-Bench is composed of benchmarks that utilize secret computation, a form of computation that is performed directly on encrypted data. Secret computation is an architectural technique to support data privacy and stop data breaches.
New: Optimize the complexity of a data-oblivious algorithm that utilizes secret computation. Secret computation prevents algorithms from making decisions based on the data they are processing, which can increase their complexity. For example, sorting algorithms cannot terminate early because the algorithm cannot tell when the data finishes sorting early. Take an existing algorithm, devise a data-oblivious version that reduces its complexity, and measure its performance advantage using the VIP-Bench evaluation facilities. More details are available upon request.
New: Develop a specialized compute element for an image-based classification application (e.g., handwriting recognition) and evaluate using an architecture simulator.
New: Design an energy-saving technique for random forests and evaluate the power savings using a simulator that provides power results.
New: Create a new graph data management technique for accelerating dynamic graph processing and evaluate on a memory system simulator.
New: Devise and evaluate a way to leverage processing-in-memory (PIM) to increase the performance of standard graph processing.
New: Build a compute-in-cache design to accelerate the performance of General Matrix Multiply (GeMM) and evaluate the performance gain vs power overhead.
New: Take an explainable AI model, propose changes to an architecture which accelerates execution for a particular, interesting application of the model, and evaluate the expected performance gain vs area overhead.
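
To make the data-oblivious constraint in the sorting example above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not VIP-Bench code: the function names are hypothetical, and a plain 0/1 integer stands in for what would be an encrypted comparison result under secret computation. The point is that the sequence of operations depends only on the input length, never on the values.

```python
def oblivious_swap(a, b):
    """Return (smaller, larger) using arithmetic selection instead of a branch."""
    # Assumption: under secret computation this flag would be encrypted;
    # a plain 0/1 integer stands in for it here.
    swap = int(a > b)
    lo = swap * b + (1 - swap) * a
    hi = swap * a + (1 - swap) * b
    return lo, hi


def oblivious_bubble_sort(values):
    """Bubble sort with a fixed compare-and-swap schedule and no early exit."""
    data = list(values)
    n = len(data)
    steps = 0
    for i in range(n - 1):
        for j in range(n - 1 - i):
            # Every pair is always processed, even if the data is already sorted.
            data[j], data[j + 1] = oblivious_swap(data[j], data[j + 1])
            steps += 1
    return data, steps
```

For an input of length n this sketch always performs n(n-1)/2 compare-and-swap steps, whether the input arrives sorted or reversed. That fixed cost is the overhead a project on this topic would try to reduce.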

Accelerate a core AI algorithm using the RoCC extension interface on the RISC-V rocket core, provide access to the extension with GCC extensions, and measure the cycle time benefits compared to a version of the program without hardware extensions.
Add an encryption capability to the RISC-V pipeline (examples include homomorphic encryption, RSA exponentiation, a digital-currency-compatible hash generator, etc.), provide access to the extension with GCC extensions, and measure the cycle time benefits compared to a version of the program without hardware extensions.
Add a trusted execution environment capability (TEE) to RISC-V, implement it with Chisel in the Rocket core, provide access to the extension with GCC extensions, and measure the cycle time benefits compared to a version of the program without hardware extensions.
Build a fuzzer for RISC-V integrated accelerators that can check for correctness and security issues; work with another project this semester in EECS 573 to help them validate their extensions with your fuzzing tool.
Exploit the existence of partially tagged BTBs to subvert compiler-based Spectre/Meltdown protections; details are available on request.
Add strongly authenticated pointers to RISC-V, similar to those found in ARM cores, and try to add your own spin on the technology, like storing the PAC in metadata memory or increasing the size of the MAC to ensure no false positives or pointer forging.
Add trusted boot support to RISC-V: the design must be able to attest the code that is being loaded into memory before it is allowed to execute. As a bonus, give an interface to the hardware attestation mechanism that can attest all of the software in the system; this will stop the jailbreaking technologies that work to side-step trusted boot mechanisms implemented in software.
Add an authenticated register file (ARF) to RISC-V: the ARF contains values that are known to come from an authenticated source, likely inserted into the ARF using digitally signed write commands; explore how the ARF can be used to eliminate the need for the as-yet experimental PUF circuits.
Design a very simple computer (e.g., LC2K) that possesses tamper-resistant hardware, such that if the hardware is changed in any way, at design or manufacturing time, those changes can be detected or thwarted at run-time.
System-in-Package (SiP) technology - building chips by bonding "chiplets" to a silicon interposer - can help reduce custom chip design costs and improve performance in various ways. Another interesting possibility is that, by building chips in pieces, it might be possible to integrate various disparate technologies (quantum computing, carbon nanotubes, TFETs, etc.) on a single chip, where previously it would have been impossible to build these all into a single monolithic die. Consider applications for which this integration would be useful, and then design/evaluate such a SiP.
The A2 analog malicious hardware attack was successful because it broke standard security/design paradigms to target a different layer of the computing stack, going analog to bypass digital-centric defences. Can we likewise defend more effectively against A2-style attacks by going to different layers of the stack? Devise an approach...
Two recent papers, "Flipping Bits in Memory Without Accessing Them" and "ANVIL: Software-based protection against next generation rowhammer attacks", have shown serious security concerns due to rowhammer attacks, where repeated accesses to a row of memory cause bit flips in adjacent rows. These attacks used the x86 CLFLUSH instruction to flush specific cache lines, allowing high-locality rowhammer accesses to bypass the on-chip caches and reach the DRAM. The attack was not shown on ARM processors because they don’t support an unprivileged CLFLUSH instruction. Perhaps you can employ non-cacheable reads and/or a memory barrier instruction such as DMB to bypass the cache and demonstrate that rowhammer attacks are possible on ARM processors.
Even though execution platforms are comprehensively verified down to the machine code and ISA level, caches are mostly excluded and are considered transparent to program behavior. Recent work, "Cache Storage Channels: Alias-Driven Attacks and Verified Countermeasure", has shown that deliberately violating programming guidelines and breaking the coherency of the memory system can induce a cache storage side channel on ARMv7 processors. By accessing the same physical address through virtual aliases with mismatched cacheability attributes, as well as executing self-modifying code without flushing the instruction cache, this work demonstrated attacks ranging from extracting secret keys from victim processes to subverting the integrity properties of an ARMv7 hypervisor. Intel’s x86 reference manual states that memory type aliases using page tables and the page attribute table (PAT) may lead to undefined operations that can result in a system failure. It also explicitly states that accesses using the (non-cacheable) WC memory type may not check the caches. In this project, try leveraging this fact to enable these attacks on Intel x86 processors, and propose a solution to stop them.
Employ (micro)architectural perturbations in a new and unique way; for example, one might be able to use perturbations to make an early misprediction "prediction", which could significantly reduce the cost of a branch misprediction.
Design a hardware-based shadow stack scheme that keeps a copy of call/return addresses in a randomly placed location in the physical address space, far from wandering loads and stores that want to redirect control flow to injected data; use the shadow stack to verify return instructions. Could you adapt this technique to indirect jumps (e.g., v-tables, switch statements, the GOT, etc.)?
A recent paper, "Profiling a warehouse-scale computer", showed that about 5% of Google's entire server capacity is spent "pickling" (or "protobufs"), that is, serializing data structures so they can be efficiently stored and moved throughout a distributed system; devise an accelerator that can significantly reduce the cost of these operations.
Designing NVM memory attacks: as NVM memory is more tightly integrated, it becomes possible to create memory access patterns with the express purpose of wearing out the storage. Design an attack and devise ways to protect against it.
Spectre attacks: defeat all of the Spectre countermeasures that have been proposed by companies affected by the attack (hint: all can be defeated with clever manipulation of the microarchitecture), then devise protections that cannot be easily defeated.
Devise an architecture with information protection domains, such that information can be tagged as public, protected, or private (similar to the protection domains provided by some programming languages). Show that the implementation enforces the movement of data only between compatible protection domains, and show how the technology can prevent security attacks.
Security bug: design a subtle bug for one of the processors that you designed or that we can provide to you (picojava, opensparc, verisimple, mips lite, a few 470 final project designs). The bug should be sufficiently subtle that it does not affect the normal system functionality. However, it should enable a backdoor for a security attack. Develop at least one security attack made possible by your bug. This is probably a very challenging project.
X-point memory: design an architecture that integrates X-point memory into the memory hierarchy (rather than as a disk, as most architectures are proposing to do), determine how to deal with the durability issues (perhaps with an appropriate application), and determine the benefits.
The introspective core: design an instruction set extension that can provide a new level of analysis into the operation of a processor; for example, it acts as a logic analyzer, or defect detector, or some other analyzer that would seem surprising/strange to implement in software.
Domain-specific processor designs: design a programmable processor for an important program domain with tight performance/power/cost constraints, e.g.,
- soft radio
- audio codec processor
- video codec processor
- RF signal processor
- Natural I/O (speech, writing, gestures) processor
- Vision processor
- Recognition algorithms
- Data mining algorithms
- Data synthesis algorithms
- Etc...
Architectural mechanisms for fine-grain protection: add support to the microarchitecture to support byte-grain named protection domains; demonstrate the performance of these mechanisms, and show how they can be used (e.g., program/module protection, debugging, etc...)
Devise an application-specific architecture for quickly executing a dynamically typed interpreted language (e.g., Python, JavaScript, ActionScript, etc.), in particular for use in speeding up browser processing. As a bonus, make this core as tiny as possible to aid in making strong security claims about its performance.
Graceful degradation: devise an architecture with performance that degrades gracefully with defects. Don't worry about how to detect the defects; simply demonstrate that if they can be detected, performance can be held in proportion to the number of working components.
Software-Based Resiliency: Design a software program that can tolerate transient faults using only software-based techniques. Use research projects like DieHard (http://plasma.cs.umass.edu/emery/diehard) for inspiration on how to accomplish this goal. Using asynchronous interrupts to inject random transient faults into your program, explore the cost-benefit trade-offs of a software-based resiliency technique.
Devise a cache design that is resistant to side channel attacks: it must not be possible to determine the actions of the program from viewing the cache miss stream. It must be a novel approach or build substantially on previous techniques.
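
As a starting point for thinking about the software-based resiliency idea above, here is a minimal Python sketch of one well-known software-only technique: redundant execution with a majority vote. It is an illustration under stated assumptions, not the DieHard approach. All function names are hypothetical, the checksum is a toy workload, and faults are injected synchronously as single bit flips in a result, which is a simplification of the asynchronous-interrupt injection the project calls for.

```python
import random


def flip_random_bit(value, rng, width=32):
    """Simulate a transient fault by flipping one random bit of an integer."""
    return value ^ (1 << rng.randrange(width))


def majority(a, b, c):
    """Bitwise majority vote over three redundant results."""
    return (a & b) | (a & c) | (b & c)


def checksum(data):
    """Toy workload (an assumption for illustration): 32-bit additive checksum."""
    total = 0
    for x in data:
        total = (total + x) & 0xFFFFFFFF
    return total


def resilient_checksum(data, rng, fault_rate=0.0):
    """Run the workload three times, possibly corrupting each result, then vote."""
    results = []
    for _ in range(3):
        r = checksum(data)
        if rng.random() < fault_rate:
            r = flip_random_bit(r, rng)  # injected transient fault
        results.append(r)
    return majority(*results)
```

The vote masks any single corrupted copy, and it also masks multiple corrupted copies as long as no two flips land on the same bit position. The cost is roughly three times the execution time of the unprotected workload, which is the kind of cost-benefit trade-off the project asks you to measure.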

Propose Your Own Project: Come up with your own project idea and pitch it to the professor. Be mindful, however, that your proposal must focus on an architectural design or analysis technique, and it must also detail the infrastructure to be utilized and the experiments and measurements to be performed within the timeframe of this semester's EECS 573 course. Good luck!
