Chasing Speed:
How We Measure, Break, and Redefine Performance
In this topic, we learn how to measure what “fast” really means.
We study how to quantify the benefits of parallelism using speedup, efficiency, and cost.
We see why even a million processors cannot make a slow sequential part vanish, and how researchers turn this limitation into opportunity through better scaling laws and smarter algorithms.
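The handout develops these ideas formally; as a rough preview in common textbook notation (a sketch only, with f standing for the serial fraction of the work), the core quantities and the two laws look like this:

\[
S(p) = \frac{T_1}{T_p}, \qquad E(p) = \frac{S(p)}{p}, \qquad C(p) = p\,T_p,
\]

where \(T_1\) is the best sequential running time and \(T_p\) the running time on \(p\) processors. Amdahl's Law bounds the speedup of a fixed-size problem:

\[
S_{\text{Amdahl}}(p) = \frac{1}{f + \frac{1-f}{p}} \le \frac{1}{f},
\]

so a program that is even 5% serial can never run more than 20 times faster, no matter how many processors it is given. Gustafson's Law instead lets the problem grow with the machine, giving the scaled speedup

\[
S_{\text{Gustafson}}(p) = f + (1-f)\,p.
\]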
Finally, we learn how to measure performance in practice—with profiling tools, real data, and real-world case studies.
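To make the measurement side concrete, here is a minimal, illustrative C sketch (assuming an OpenMP-capable compiler; the work() function and the problem size are invented for this example, not taken from the handout) that times a serial and a parallel run of the same computation and reports speedup and efficiency:

/* speedup_demo.c — illustrative sketch of measuring speedup and efficiency
 * with wall-clock timing. Compile with, e.g.:
 *   gcc -O2 -fopenmp speedup_demo.c -o speedup_demo
 */
#include <stdio.h>
#include <omp.h>

#define N 50000000  /* illustrative problem size */

static double work(long n, int nthreads)
{
    double sum = 0.0;
    /* Sum a simple series; the reduction combines the per-thread partial sums. */
    #pragma omp parallel for num_threads(nthreads) reduction(+:sum)
    for (long i = 1; i <= n; i++)
        sum += 1.0 / (double)i;
    return sum;
}

int main(void)
{
    /* Serial baseline: T(1). */
    double t0 = omp_get_wtime();
    double s1 = work(N, 1);
    double t_serial = omp_get_wtime() - t0;

    /* Parallel run on all available threads: T(p). */
    int p = omp_get_max_threads();
    t0 = omp_get_wtime();
    double sp = work(N, p);
    double t_parallel = omp_get_wtime() - t0;

    double speedup    = t_serial / t_parallel;   /* S(p) = T(1)/T(p) */
    double efficiency = speedup / p;             /* E(p) = S(p)/p    */

    printf("result check: %.6f vs %.6f\n", s1, sp);
    printf("p = %d  T(1) = %.3f s  T(p) = %.3f s  S = %.2f  E = %.2f\n",
           p, t_serial, t_parallel, speedup, efficiency);
    return 0;
}

Runs of this kind typically show efficiency falling below 1 as threads are added, which is precisely the gap that profiling tools and the later discussion of load balance and granularity help to explain.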
Define and apply key performance metrics such as speedup, efficiency, and cost.
Explain Amdahl’s Law and Gustafson’s Law, and compare how they view scalability.
Evaluate scalability and cost-optimality using isoefficiency analysis and practical profiling data.
Why can’t adding more processors always make a program infinitely faster?
What different stories do Amdahl’s and Gustafson’s laws tell about scaling?
Why might a “cost-optimal” algorithm still fail to deliver good results on real machines?
Speedup, Efficiency, and Cost
Defining Performance: When Fast Is Not Always Efficient
Amdahl’s and Gustafson’s Laws: Two Ways to See Limits
Scalability Metrics
Isoefficiency and Cost-Optimality: Keeping Pace with Growth
Load Balance and Granularity: Sharing the Burden Wisely
Measuring Performance in Practice
Instrumentation and Profiling: Seeing Where Time Goes
Case Studies of Scaling: When Numbers Meet Reality
Current Lecture Handout
Chasing Speed: How We Measure, Break, and Redefine Performance, rev 2023*
Note: Links marked with an asterisk (*) lead to materials accessible only to members of the University community. Please log in with your official University account to view them.
References
Amdahl, G. M. (1967). Validity of the single processor approach to achieving large scale computing capabilities. Proceedings of the AFIPS Spring Joint Computer Conference, 30, 483–485. https://doi.org/10.1145/1465482.1465560
Baker, C., Chaudhuri, A., & Kale, L. V. (2021). Dynamic load balancing in Charm++ for exascale applications. Concurrency and Computation: Practice and Experience, 33(21), e6379. https://doi.org/10.1002/cpe.6379
Dongarra, J., et al. (2022). Report on the Frontier exascale system. International Journal of High Performance Computing Applications, 36(4), 435–451. https://doi.org/10.1177/10943420221110772
Gustafson, J. L. (1988). Reevaluating Amdahl’s law. Communications of the ACM, 31(5), 532–533.
Hummel, F., et al. (2020). Strong and weak scaling of molecular dynamics simulations on GPUs. Computer Physics Communications, 255, 107263. https://doi.org/10.1016/j.cpc.2020.107263
Jouppi, N. P., et al. (2020). A domain-specific supercomputer for training deep neural networks. Communications of the ACM, 63(7), 67–78. https://doi.org/10.1145/3360307
Malony, A. D., et al. (2011). Performance analysis tools for HPC applications. Concurrency and Computation: Practice and Experience, 23(3), 271–290. https://doi.org/10.1002/cpe.1640
Michalakes, J., et al. (2015). WRF model scaling analysis on modern supercomputers. Journal of Computational Physics, 295, 103–115. https://doi.org/10.1016/j.jcp.2015.03.048
Palmer, T. N., et al. (2022). High-resolution climate modeling in the exascale era. Nature Climate Change, 12(3), 198–207. https://doi.org/10.1038/s41558-022-01290-2
Shalf, J., et al. (2020). HPC cost and energy efficiency analysis of NASA’s Pleiades supercomputer. IEEE Computer, 53(8), 58–67. https://doi.org/10.1109/MC.2020.2993853
Access Note: Published research articles and books are linked to their respective sources. Some materials are freely accessible within the University network or when logged in with official University credentials. Others will be provided to enrolled students through the class learning management system (LMS).