recommended
David Luebke and Greg Humphreys. 2007. How GPUs work. IEEE Computer, 40, 2, 96-100. (ncsu)
Cook, R. L. (1984). Shade trees. ACM SIGGRAPH Computer Graphics, 18(3), 223-231. (ncsu)
Akenine-Möller, T., Haines, E., & Hoffman, N. (2018). The graphics processing unit. Chapter 3 in Real-Time Rendering. AK Peters/CRC Press. (ncsu)
Wyman, C., & Marrs, A. (2019). Introduction to DirectX Raytracing. Ray Tracing Gems: High-Quality and Real-Time Rendering with DXR and Other APIs, 21-47. (pdf)
optional
Michael Garland and David B. Kirk. 2010. Understanding throughput-oriented architectures. Commun. ACM 53, 11 (November 2010), 58-66. (ncsu)
Kayvon Fatahalian and Mike Houston. 2008. A closer look at GPUs. Commun. ACM, 51, 10, 50-57.
Cliff Woolley. 2005. GPU program optimization. Chapter 35 in M. Pharr & R. Fernando (eds.), GPU Gems 2, Addison Wesley.
Cem Cebenoyan. 2004. Graphics pipeline performance. Chapter 28 in R. Fernando (ed.), GPU Gems, Addison Wesley.
ETH's Computer Vision and Geometry Group's lecture on GPGPU optimization
AnandTech's piece on DirectX 11
John Owens & David Luebke's Intro to Parallel Programming Udacity course.
Bill Dally. (2018). Future of AI at NVIDIA. Practical AI Podcast, 15.
Bill Dally. (2018). Where AI goes next. AI Podcast, 62.
manual
Unity: performance, optimization, optimization2
videos