Unit 2.1.2 Outline Week 2

  • 2.1 Opening Remarks

    • 2.1.1 Launch

    • 2.1.2 Outline Week 2

    • 2.1.3 What you will learn

  • 2.2 Blocked Matrix-Matrix Multiplication

    • 2.2.1 Basic idea

    • 2.2.2 Haven't we seen this before?

  • 2.3 Blocking for Registers

    • 2.3.1 A simple model of memory and registers

    • 2.3.2 Simple blocking for registers

    • 2.3.3 Streaming \(A_{i,p}\) and \(B_{p,j}\)

    • 2.3.4 Combining loops

    • 2.3.5 Alternative view

  • 2.4 Optimizing the Micro-kernel

    • 2.4.1 Vector registers and instructions

    • 2.4.2 Implementing the micro-kernel with vector instructions

    • 2.4.3 Details

    • 2.4.4 More options

    • 2.4.5 Optimally amortizing data movement

  • 2.5 Enrichments

    • 2.5.1 Lower bound on data movement

  • 2.6 Wrap Up

    • 2.6.1 Additional exercises

    • 2.6.2 Summary