Cache Architecture Exploration for Memory-Intensive Kernels
Evaluating cache hierarchy, associativity, and block-size trade-offs using Intel PIN and a Python-based cache simulator  

Project Overview
This project analyzed cache behavior for memory-intensive 2D tensor workloads by implementing and profiling three kernels: scatter, gather, and convolution. Intel PIN captured each kernel's memory-access trace, and a Python-based cache simulator evaluated how different cache configurations affect hit rate, miss rate, and average memory access time (AMAT). The experiments varied cache hierarchy depth, L2 associativity, and L2 block size to compare how each parameter shifts these metrics across the three access patterns.
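
To make the metrics concrete, the following is a minimal sketch of how a trace-driven simulator can compute miss rate and AMAT. It is not the project's actual simulator: the class name `SetAssociativeCache`, the LRU replacement policy, and the helper functions `amat` and `two_level_amat` are assumptions chosen for illustration. AMAT here uses the standard textbook definition, hit time plus miss rate times miss penalty, applied recursively for a two-level hierarchy.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Hypothetical set-associative cache model with LRU replacement."""

    def __init__(self, size_bytes, block_bytes, ways):
        self.block_bytes = block_bytes
        self.ways = ways
        # Number of sets = total blocks / associativity.
        self.num_sets = size_bytes // (block_bytes * ways)
        # Each set maps tag -> True, ordered from least to most recently used.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Simulate one access; return True on a hit, False on a miss."""
        block = address // self.block_bytes
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(lines) >= self.ways:
            lines.popitem(last=False)  # evict the least recently used line
        lines[tag] = True
        return False

    def miss_rate(self):
        total = self.hits + self.misses
        return self.misses / total if total else 0.0


def amat(hit_time, miss_rate, miss_penalty):
    """AMAT = hit time + miss rate * miss penalty (all times in cycles)."""
    return hit_time + miss_rate * miss_penalty


def two_level_amat(l1_hit, l1_miss_rate, l2_hit, l2_local_miss_rate, mem_latency):
    """The L1 miss penalty is itself the AMAT of the L2 level."""
    return amat(l1_hit, l1_miss_rate, amat(l2_hit, l2_local_miss_rate, mem_latency))
```

Feeding the addresses of a captured trace to `access` one at a time and then reading `miss_rate()` gives the per-level miss rate that `two_level_amat` expects. Changing `ways` or `block_bytes` on the L2 instance corresponds to the associativity and block-size sweeps described above; the latencies passed to the AMAT helpers are placeholders to be chosen by the user.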