Final Presentation

Outline

    • Background

      • H.264 video codec & encoding

        • Interframe encoding

          • MVs + reference → prediction; current block = prediction + residual

        • Motion Estimation

          • It's parallelizable!

          • 1920×1088 (1080p HD video)

            • = 8,160 Macroblocks

          • Where did the block go?

            • Search range: ±16 pixels is typical (sometimes ±32)

              • (2×16+1)^2 = 1,089 candidate positions

          • Per position: SAD over a 16×16 block (256 absolute differences)

          • Full search (exhaustive)

            • 8,160 × 1,089 × 256 ≈ 2.3 billion absolute differences per frame

            • Usually (e.g. in x264) not done exhaustively, but still a lot of work per frame
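The workload figures above can be checked with a short sketch. This is a minimal pure-Python illustration, not the presentation's code: the names `sad` and `full_search`, the small synthetic frame, and the choice to skip candidates that fall outside the reference frame are all assumptions made for the example.

```python
import random

# Workload arithmetic from the outline (1080p, 16x16 macroblocks, range 16).
WIDTH, HEIGHT, MB = 1920, 1088, 16
SEARCH_RANGE = 16

macroblocks = (WIDTH // MB) * (HEIGHT // MB)      # 120 * 68
positions = (2 * SEARCH_RANGE + 1) ** 2           # 33 * 33
abs_diffs_per_frame = macroblocks * positions * MB * MB


def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Exhaustive search: return (best_sad, (dx, dy)) over the window."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that would read outside the reference frame.
            if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best


# Toy data (hypothetical): the current frame is the reference frame shifted,
# so the block at (10, 10) in `cur` sits at (7, 12) in `ref`.
rng = random.Random(0)
H = W = 24
ref = [[rng.randrange(256) for _ in range(W)] for _ in range(H)]
cur = [[ref[(y + 2) % H][(x - 3) % W] for x in range(W)] for y in range(H)]

best_cost, best_mv = full_search(cur, ref, bx=10, by=10, n=4, search_range=4)
```

Every macroblock's search here is independent of every other, which is what makes plain SAD-based full search easy to parallelize.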

      • Previous attempts

        • On CPU

          • x264: highly optimized

        • in CUDA (Jae)

          • Wei-Nien Chen; Hsueh-Ming Hang, "H.264/AVC motion estimation implementation on Compute Unified Device Architecture (CUDA)," Multimedia and Expo, 2008 IEEE International Conference on, pp. 697-700, June 23-26, 2008.

          • S Ryoo, CI Rodrigues, SS Baghsorkhi, SS Stone, DB. "Optimization Principles and Application Performance Evaluation of a Multithreaded GPU Using CUDA" 2008.

          • MVp (predicted motion vector) problem

            • MVp is needed for cost calculations, but is derived from neighboring blocks' MVs

            • Quality vs. speed
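To make the MVp dependency concrete, here is a hedged sketch of how a predicted motion vector typically enters the cost. It assumes the common H.264 case where MVp is the component-wise median of the left, top, and top-right neighbors' vectors, and it approximates the rate term with signed Exp-Golomb code lengths. The function names and the lambda value are hypothetical, and the special cases (unavailable neighbors, non-square partitions) are left out.

```python
def median_mvp(left, top, top_right):
    """Component-wise median of three neighboring motion vectors."""
    xs = sorted(v[0] for v in (left, top, top_right))
    ys = sorted(v[1] for v in (left, top, top_right))
    return (xs[1], ys[1])


def se_bits(v):
    """Length in bits of the signed Exp-Golomb code for integer v."""
    code_num = 2 * abs(v) - (1 if v > 0 else 0)
    return 2 * (code_num + 1).bit_length() - 1


def mv_cost(sad_value, mv, mvp, lam):
    """Distortion plus lambda-weighted bits for coding (mv - mvp).

    `lam` is a hypothetical Lagrange multiplier chosen for illustration.
    """
    bits = se_bits(mv[0] - mvp[0]) + se_bits(mv[1] - mvp[1])
    return sad_value + lam * bits


# The predictor needs three already-decided neighbors.
mvp = median_mvp(left=(1, 5), top=(3, -2), top_right=(2, 0))
cost_at_predictor = mv_cost(100, (2, 0), mvp, lam=4)
cost_one_step_away = mv_cost(100, (3, 0), mvp, lam=4)
```

Because `median_mvp` takes the final vectors of neighboring blocks as input, a block's cost cannot be evaluated until those neighbors are finished. Dropping the rate term removes the dependency but changes which vector wins, which is the quality vs. speed trade-off.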

    • Our project

      • Resolve the MVp dependency so that ME can run in parallel across macroblocks

      • Hierarchical (pyramid)

        • provides an estimate for the MVp
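One way the pyramid idea can be sketched: search a downsampled frame first, then scale the coarse vector up and use it as the estimate at the next finer level. This is an illustrative pure-Python sketch under stated assumptions (2×2 averaging, two levels, the coarse vector used as the search center), not the project's CUDA implementation. All function names are hypothetical.

```python
import random


def downsample(frame):
    """Halve resolution by averaging each 2x2 pixel group (integer mean)."""
    h, w = len(frame) // 2, len(frame[0]) // 2
    return [[(frame[2 * y][2 * x] + frame[2 * y][2 * x + 1] +
              frame[2 * y + 1][2 * x] + frame[2 * y + 1][2 * x + 1]) // 4
             for x in range(w)] for y in range(h)]


def build_pyramid(frame, levels):
    """Return [full-res, half-res, ...] with `levels` entries."""
    pyramid = [frame]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def block_search(cur, ref, bx, by, n, center, radius):
    """Best (dx, dy) by SAD within `radius` of `center`; candidates that
    fall outside the reference frame are skipped."""
    height, width = len(ref), len(ref[0])
    best_cost, best_mv = None, center
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
                continue
            cost = sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
                       for y in range(n) for x in range(n))
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv


def hierarchical_mv(cur, ref, bx, by, n, levels, radius):
    """Coarse-to-fine search: the vector found at each level, doubled,
    becomes the estimate (search center) at the next finer level."""
    cur_pyr, ref_pyr = build_pyramid(cur, levels), build_pyramid(ref, levels)
    mv = (0, 0)
    for level in range(levels - 1, -1, -1):
        scale = 2 ** level
        mv = block_search(cur_pyr[level], ref_pyr[level],
                          bx // scale, by // scale, max(n // scale, 1),
                          mv, radius)
        if level > 0:
            mv = (2 * mv[0], 2 * mv[1])
    return mv


# Toy data (hypothetical): `cur` is `ref` displaced by (4, -2).
rng = random.Random(1)
H = W = 32
ref = [[rng.randrange(256) for _ in range(W)] for _ in range(H)]
cur = [[ref[(y - 2) % H][(x + 4) % W] for x in range(W)] for y in range(H)]

mv = hierarchical_mv(cur, ref, bx=12, by=12, n=8, levels=2, radius=2)
```

In this toy case two 5×5 searches recover a displacement of (4, -2), which a single ±2 search at full resolution could not reach. The estimate for each block comes from the coarser level of the same block rather than from its neighbors, so no inter-block ordering is needed.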

      • Alternative: wavefront (requires working out inter-block dependencies)
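A wavefront schedule can be sketched as follows. The sketch assumes each macroblock depends on its left, top-left, top, and top-right neighbors (the neighborhood used by the usual median predictor); under that assumption, blocks with the same value of x + 2y are mutually independent. The function name is hypothetical.

```python
def wavefront_schedule(mb_cols, mb_rows):
    """Group macroblock coordinates (x, y) into waves.

    Blocks in the same wave share the index x + 2*y and can be processed
    in parallel; waves must be processed in increasing order.
    """
    waves = [[] for _ in range(mb_cols + 2 * (mb_rows - 1))]
    for y in range(mb_rows):
        for x in range(mb_cols):
            waves[x + 2 * y].append((x, y))
    return waves


# 1920x1088 frame: 120 x 68 macroblocks.
waves = wavefront_schedule(120, 68)
num_waves = len(waves)
max_parallel = max(len(w) for w in waves)
```

For a 1080p frame this gives 254 sequential waves with at most 60 macroblocks in flight at once, far less parallelism than the 8,160 independent blocks available when the dependency is removed entirely.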

      • CUDA implementation overview

        • thread organization

        • memory organization

    • Testing framework (Lawrence)

      • Python framework for testing ME algorithms (replaces the earlier MATLAB portion)

        • examples of C extensions for gold-standard code

        • examples of PyCUDA code

        • side by side or overlay comparisons
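A comparison between a gold-standard motion vector field and one under test might look like the sketch below. It is not the framework's actual code: the function names, the nested-list field layout (rows of `(dx, dy)` tuples), and the two summary metrics are assumptions chosen for illustration.

```python
def compare_mv_fields(gold, test):
    """Return (fraction of identical vectors, mean L1 distance per vector)."""
    pairs = [(g, t) for g_row, t_row in zip(gold, test)
             for g, t in zip(g_row, t_row)]
    matches = sum(1 for g, t in pairs if g == t)
    l1 = sum(abs(g[0] - t[0]) + abs(g[1] - t[1]) for g, t in pairs)
    return matches / len(pairs), l1 / len(pairs)


def side_by_side(gold, test):
    """One text line per macroblock row: gold vectors, a bar, test vectors."""
    def fmt(row):
        return " ".join("(%+d,%+d)" % mv for mv in row)
    return [fmt(g) + "  |  " + fmt(t) for g, t in zip(gold, test)]


# Hypothetical 2x2 fields that differ in one vector.
gold = [[(0, 0), (1, -1)], [(2, 0), (0, 3)]]
test = [[(0, 0), (1, 0)], [(2, 0), (0, 3)]]

match_fraction, mean_l1 = compare_mv_fields(gold, test)
lines = side_by_side(gold, test)
```

A numeric summary like this complements a visual overlay: it gives a single figure to track while tuning the parallel search against the reference.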

    • Results (?)

      • motion estimation speedup

      • entire encoder speedup

      • video encoding/decoding demo

    • Conclusions

      • future extensions

      • CUDA experiences (prescription for future improvement of language, architecture, tools, programming model, etc.)

    • Acknowledgement

      • Dark_Shikari (x264 dev)