Strip Mining
Getting good performance from a multiprocessor machine (i.e., a speedup close to n on n processors) is a difficult problem.
For some matrix computations, analysis of the loops and array indexes may show that different loop iterations touch disjoint parts of the array. In that case ``strips'' of the array can be sent to different processors, and each processor can work on its own strip in parallel with the others.
This technique is effective for a significant minority (perhaps 25%) of important matrix computations.
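
As a rough illustration of the idea, the sketch below splits the rows of a matrix into contiguous strips and hands each strip to a separate worker. The function names (strip_bounds, scale_strip, scale_matrix_in_strips) and the choice of a simple scaling loop are assumptions made for this example; they do not come from any particular compiler or library. The loop qualifies because iteration i reads and writes only row i, so no strip depends on another. Note that in CPython, threads show the partitioning but generally will not give a real speedup for pure-Python arithmetic because of the global interpreter lock.

```python
from concurrent.futures import ThreadPoolExecutor


def strip_bounds(n_rows, n_workers):
    """Split rows 0..n_rows-1 into at most n_workers contiguous strips.

    Returns a list of (start, stop) pairs with stop exclusive.
    """
    base, extra = divmod(n_rows, n_workers)
    bounds = []
    start = 0
    for w in range(n_workers):
        # The first `extra` strips get one additional row each.
        size = base + (1 if w < extra else 0)
        if size == 0:
            continue  # more workers than rows: skip empty strips
        bounds.append((start, start + size))
        start += size
    return bounds


def scale_strip(matrix, start, stop, factor):
    """Scale rows start..stop-1 in place.

    Each iteration touches only row i, so strips are independent.
    """
    for i in range(start, stop):
        row = matrix[i]
        for j in range(len(row)):
            row[j] *= factor


def scale_matrix_in_strips(matrix, factor, n_workers=4):
    """Hypothetical driver: one strip of rows per worker."""
    bounds = strip_bounds(len(matrix), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [
            pool.submit(scale_strip, matrix, lo, hi, factor)
            for lo, hi in bounds
        ]
        for f in futures:
            f.result()  # re-raise any exception from a worker
    return matrix
```

For example, strip_bounds(5, 3) divides five rows into the strips (0, 2), (2, 4) and (4, 5). A loop in which row i depended on row i-1 could not be split this way without further analysis, which is one reason the technique applies only to some computations.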