Unit 3.4.2 Avoiding repeated memory allocations (may become important in Week 4)
ΒΆIn the final implementation that incorporates packing, at various levels workspace is allocated and deallocated multiple times. This can be avoided by allocating workspace immediately upon entering LoopFive (or in MyGemm). Alternatively, the space could be "statically allocated" as a global array of appropriate size. In "the real world" this is a bit dangerous: If an application calls multiple instances of matrix-matrix multiplication in parallel (e.g., using OpenMP about which we will learn next week), then these different instances may end up overwriting each other's data as they pack, leading to something called a "race condition." But for the sake of this course, it is a reasonable first step.