Reducing Wasted Speculation
Abstract:
Modern microprocessors achieve high performance through aggressive speculation. However, large amounts of energy and potential performance are lost by speculating fruitlessly. The two most important speculation techniques are caches and speculative execution.
Caches hold a subset of the blocks from the high-latency main memory, speculating that quick access to these blocks will benefit the program. Unfortunately, most blocks in the last-level cache will not be referenced again before they are removed from the cache. These dead blocks waste time and energy as they reduce the effective capacity of the cache.
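As a rough illustration of the dead-block idea above (not the technique presented in the talk), the sketch below simulates a small fully associative LRU cache and counts how many evicted blocks were never referenced between fill and eviction. The function name, the trace, and the capacity are hypothetical and chosen only for the example.

```python
from collections import OrderedDict


def simulate_lru(trace, capacity):
    """Simulate a fully associative LRU cache over a trace of block addresses.

    Returns (hits, misses, evictions, dead_evictions), where dead_evictions
    counts evicted blocks that were never hit after being filled.
    """
    cache = OrderedDict()  # block address -> hits since fill; order tracks recency
    hits = misses = evictions = dead_evictions = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache[block] += 1
            cache.move_to_end(block)  # mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                _, reuse = cache.popitem(last=False)  # evict least recently used
                evictions += 1
                if reuse == 0:
                    dead_evictions += 1  # block was dead for its whole residency
            cache[block] = 0
    return hits, misses, evictions, dead_evictions


# Hypothetical trace: block 0 is reused once, everything else streams through.
trace = [0, 1, 0, 2, 3, 4, 0, 5, 6, 7]
hits, misses, evictions, dead = simulate_lru(trace, capacity=2)
print(f"{dead} of {evictions} evicted blocks were never reused")
```

In this toy trace most evicted blocks occupy the cache without ever being hit, which is the wasted capacity the abstract refers to.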
Speculative execution mitigates pipeline control hazards by predicting the outcome of branches, allowing subsequent instructions to be fetched and executed down the predicted path. Many instructions will be wrongly executed before an incorrect prediction is discovered, again wasting time and energy.
This talk discusses novel techniques for reclaiming lost performance and energy by reducing the speculation wasted in caches and speculative execution. It will also cover ongoing projects and future research directions.
Biography:
Daniel A. Jimenez is an Associate Professor in the Department of Computer Science at The University of Texas at San Antonio. He is currently on leave at the Barcelona Supercomputing Center. His research focuses on microarchitecture and low-level compiler optimizations. From 2002 through 2007, Daniel was an Assistant Professor in the Department of Computer Science at Rutgers. In 2005 Daniel took sabbatical leave at the Technical University of Catalonia (UPC) in Barcelona, Catalonia, Spain. In 2008 he was promoted to Associate Professor with tenure at Rutgers. Daniel earned his B.S. (1992) and M.S. (1994) in Computer Science at The University of Texas at San Antonio and his Ph.D. (2002) in Computer Sciences at The University of Texas at Austin. He is an NSF CAREER award recipient, an ACM Senior Member, and General Chair of the 2011 HPCA conference.
Sangyeun Cho
University of Pittsburgh
StimulusCache and MorphCache
Abstract:
This talk will focus on our recent research on multicore L2 cache design. StimulusCache is motivated by the fact that future processors can suffer more frequent hard faults. Processor vendors already resort to "core disabling" to combat the low yield problem caused by hard faults. With core disabling, faulty processor cores are taken off-line so that the chip can be salvaged. However, conventional core disabling ignores the yield disparity between a compute core dominated by random logic and the associated L2 cache, which has a more regular structure. StimulusCache decouples cores from their private L2 caches to expose "excess caches" after core disabling so that other healthy cores can put them to use.
MorphCache is designed to efficiently support heterogeneous workloads (e.g., virtual machines of different users) on many cores. Its novel architectural techniques are flexible private cache capacity allocation, distance-aware placement, and efficient broadcasting. Although these techniques work cooperatively, we found exclusive capacity allocation with chain links to be the most important.
Biography:
Sangyeun Cho received the BS degree in computer engineering from Seoul National University in 1994 and the PhD degree in computer science from the University of Minnesota in 2002. In 1999, he joined the System LSI Division of Samsung Electronics Co., Giheung, Korea, and contributed to the development of Samsung's flagship embedded processor core family CalmRISC(TM). He was a lead architect of CalmRISC-32, a 32-bit microprocessor core, and designed its memory hierarchy including caches, DMA, and stream buffers. Since 2004, he has been with the Computer Science Department at the University of Pittsburgh, where he is currently an associate professor. His research interests are in the area of computer architecture and embedded systems with particular focus on performance, power and reliability aspects of memory and storage hierarchy design for next-generation multicore systems. Sangyeun is the happy father of Seyun (b. December 2009).