Heterogeneous Computing for Signal and Data Processing (with GPU and parallel computing)

(Previously Signal Processing and Communications on Mobile Multicore Processors)

Course number: EECS E4750      

Target Audience: 

    Students interested in acquiring software and systems design skills in parallel computing for graphics processing units (GPUs) and heterogeneous computing infrastructures. These skills are relevant to applications in the data processing, deep learning, signal processing, and communications industries over the next decade, and are applicable to research projects of CU faculty.

  --> Course description link.

  --> Project proposals (for companies, collaborators, alumni): please enter your project ideas here.

Bulletin Description:
  • Methods for deploying signal and data processing algorithms on contemporary general purpose graphics processing units (GPGPUs) and heterogeneous computing infrastructures. Using programming languages such as OpenCL and CUDA for computational speedup in audio, image and video processing and computational data analysis. Significant design project.
Course: Open to Columbia students.
  • Fall 2017: EECS E4750.
  • Fall 2016: Promoted to EECS course.
  • Fall 2015: Electrical Engineering E4750 section 001 :: SP & COMM ON MOBILE MULTI PROC  ::  R 1:10pm-3:40pm
  • Fall 2014: Electrical Engineering E4750 section 001 :: SP & COMM ON MOBILE MULTI PROC  ::  R 1:10pm-3:40pm

Applications of Parallel Computing

Heterogeneous Parallel Computing (HPC)

Parallel software development in OpenCL, CUDA, Apple Metal, Vulkan, and other standards.

  • Motivating examples from imaging, audio, multimedia
  • Cross section of mobile processor architectures: NVIDIA, AMD, Intel
  • General Purpose Processors, Graphics Processing Units (GPUs), DSPs
  • ARM architecture
  • Parallel programming concepts for mobile platforms
  • CUDA and OpenCL languages
  • Tools: development environments, code development, profiling
  • Standards: Khronos OpenGL, WebGL, HSA
  • Parallel programming examples
    • Signal processing
    • Image and video processing
    • Communications processing, protocols
    • Data analysis
    • Neural networks and deep learning
  • Power Considerations
Selected Details
  • Portability and Scalability in HPC
  • Data Parallelism and Threads
  • Memory Allocation and Data Movement
  • Kernel-Based Parallel Programming
  • Matrix-Matrix Multiplications
  • Thread Scheduling
  • Tiled Processing
  • Control Divergence
  • Memory Bandwidth and Coalescing
  • Convolution and Tiled Convolution
  • Reduction Kernels
  • Atomic Operations
  • Histogram Kernel
  • Applications: Deep Learning, Imaging, Video, ...
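
As a small illustration of two of the topics listed above (matrix-matrix multiplication and tiled processing), the following is a minimal CPU-side sketch in plain Python, not course material. The function name `matmul_tiled` and the `tile` parameter are hypothetical names chosen for this example. It only shows the loop structure of tiling; in an actual CUDA or OpenCL kernel, each output tile would typically be handled by one thread block or work-group, with the input tiles staged in shared or local memory.

```python
def matmul_tiled(a, b, tile=2):
    """Multiply a (n x k) by b (k x m), visiting the data one tile at a time.

    Illustrative sketch only: plain Python lists, no GPU involved.
    """
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")

    c = [[0.0] * m for _ in range(n)]

    # Outer two loops walk over tiles of the output matrix.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            # Step along the shared dimension one tile at a time, so each
            # pass only touches a tile-sized block of a and of b.
            for p0 in range(0, k, tile):
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(matmul_tiled(a, b, tile=2))
```

The `min(...)` bounds handle matrices whose sizes are not multiples of the tile size, which is the same boundary check a GPU kernel needs when the grid overshoots the matrix edge.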
Project Suggestions for implementation in OpenCL
  • Image processing
  • Audio processing
  • Machine learning
  • Deep Learning algorithm parallelization
  • Optimization of communication networks
  • Optimization of energy networks
  • Medical applications
  • Graphics
  • Video processing
  • Visualization
  • Financial applications
Books, Tools and Resources
  • BOOKS:
    • David Kirk and Wen-mei Hwu, "Programming Massively Parallel Processors: A Hands-on Approach," 3rd Edition, Elsevier. eBook ISBN: 9780128119877, Paperback ISBN: 9780128119860 (https://www.elsevier.com/books/programming-massively-parallel-processors/kirk/978-0-12-811986-0)
    • Previous edition: D. Kirk and W. Hwu, "Programming Massively Parallel Processors: A Hands-on Approach," 2nd Edition, Morgan Kaufmann (Elsevier). ISBN-13: 978-0124159921, ISBN-10: 0124159923 (http://www.elsevier.com/books/programming-massively-parallel-processors/kirk/978-0-12-415992-1)
    • Ravishekhar Banger and Koushik Bhattacharyya, "OpenCL Programming by Example," Packt Publishing (December 23, 2013). ISBN-10: 1849692343, ISBN-13: 9781849692342
  • Parallel machines: Server with NVIDIA Tesla K40 + NVIDIA Quadro K5000s + Mobile: Jetson TK1 + Intel Xeon E5-1620v
  • AMD OpenCL SDK (software development kit)
2014 Fall Projects
  • Low Rank Matrix Recovery Using Principal Component Analysis
  • Acceleration of Genetic Algorithms and Image Pattern Recognition of fMRI Fingerprint
  • Parallel Implementations of Detection Algorithms for MIMO Systems on the GPU
  • Harnessing GPU for solving Options Pricing problems in Financial Engineering
  • Topics Extraction with GPU Acceleration (machine learning)
  • Parallel Decoding of Space Time Codes on GPU
  • Image processing using parallel computing and PyOpenCL (Night vision)
2015 Fall Projects
  • 3D Image Reconstruction (Stereo Vision Based Depth Perception & 3D Spatial Reconstruction)
  • Accelerate the Analysis of EEG Signal Based on Nonlinear Feature Extraction and Classification by Parallel Algorithm
  • Fast VOIP MOS (Mean Opinion Score) calculation
  • GPU Acceleration for Neural Network based Handwritten Digits Recognition
  • Image Matching Accelerator based on SIFT
  • Image blending
  • Image Stitching
  • Local Linear Embedding using OpenCL
  • Performance of Linear Equalization in Narrowband Channels
  • Parallel Computing on SAR Image Processing
  • Canny Edge and Boundary Detection using OpenCL
  • Parallel HEVC Video Compression Using OpenCL
  • Speaker Recognition
2016 Fall Projects
  • 3D Voxel De-blurring
  • Laplacian Approximation on GPU
  • Basic object recognition
  • Camera Localization
  • Dark channel haze removal
  • Disparity Map Calculation by GPU
  • FPGA Implementation of PyOpenCL/OpenCL
  • First Principles MPI Simulator
  • Fingerprints recognition for security
  • GPU acceleration for SPH (Smoothed Particle Hydrodynamics)
  • K-means Clustering Acceleration on GPU
  • Kinect Color and Depth Image Alignment
  • GPU-based Monte Carlo simulation of light transport for optical fiber probe geometries
  • Object Tracking Based on Video Analysis
  • Parallel Computing in Traffic Sign Detection
  • Real-time medical image processing empowered by parallel computing
  • Parallel simulated annealing
  • Recommendation Algorithms using deep learning
  • Real-time image de-hazing
  • Smart Cluster Construction
2017 Fall Projects
  • Project proposals (for companies, collaborators, alumni): please enter your project ideas here.