Compiler Managed Micro-cache Bypassing for High Performance EPIC Processors
Authors:
Youfeng Wu, Ryan Rakvic, Li-Ling Chen, Jesse Fang
Microprocessor Research Labs
2200 Mission College Blvd
Santa Clara, CA 95052-8119
Intel Corporation
Chyi-Chang Miao, George Chrysos
Massachusetts Microprocessor Design Center
334 South Street
Shrewsbury, MA 01545-4112
Intel Corporation
Abstract:
Advanced microprocessors have pushed clock rates well beyond the
gigahertz boundary. For such high-performance microprocessors, a small and
fast data micro cache (ucache) is important to overall performance, and
proper management of it via load bypassing has a significant performance
impact. In this paper, we propose and evaluate a hardware-software
collaborative technique to manage ucache bypassing for EPIC processors. The
hardware supports ucache bypassing with a flag in the load instruction
format, and the compiler employs static analysis and profiling to identify
loads that should bypass the ucache. The collaborative method achieves a
significant improvement in performance for the SpecInt2000 benchmarks. On
average, about 40%, 30%, 24%, and 22% of load references are identified to
bypass 256B, 1K, 4K, and 8K ucaches, respectively. For the same four sizes,
bypassing reduces ucache miss rates by 39%, 32%, 28%, and 26%, and the number
of pipeline stalls from loads to their uses by 13%, 9%, 6%, and 5%. Meanwhile, the
L1 and L2 cache misses remain largely unchanged. For the 256B ucache,
bypassing improves overall performance on average by 5%.
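
The effect of the bypass flag can be illustrated with a toy simulation. The
sketch below is not the authors' simulator or compiler analysis. It assumes a
fully associative LRU ucache with 32-byte lines, assumes a flagged load skips
the ucache entirely (no probe, no allocation), and uses a made-up trace in
which eight reused lines compete with a stream of lines touched once. The
names simulate, make_trace, and LINE_SIZE are hypothetical.

```python
from collections import OrderedDict

LINE_SIZE = 32  # bytes per line; an illustrative value, not taken from the paper


def simulate(trace, ucache_bytes, honor_bypass):
    """Return (ucache_misses, ucache_accesses) for a list of (address, bypass) loads.

    Assumptions: the ucache is fully associative with LRU replacement, and a
    load flagged to bypass neither probes nor allocates in the ucache.
    """
    capacity = ucache_bytes // LINE_SIZE
    lines = OrderedDict()  # line number -> None, ordered from LRU to MRU
    misses = accesses = 0
    for address, bypass in trace:
        if honor_bypass and bypass:
            continue  # served by the L1 cache; the ucache is left untouched
        accesses += 1
        line = address // LINE_SIZE
        if line in lines:
            lines.move_to_end(line)  # hit: mark as most recently used
        else:
            misses += 1
            lines[line] = None
            if len(lines) > capacity:
                lines.popitem(last=False)  # evict the least recently used line
    return misses, accesses


def make_trace(iterations=100):
    """Hypothetical loop: 8 hot lines reused each iteration plus 8 streaming loads."""
    trace = []
    stream = 0x10000  # streaming region, disjoint from the hot lines
    for _ in range(iterations):
        for i in range(8):
            trace.append((i * LINE_SIZE, False))  # reused data: keep in ucache
        for _ in range(8):
            trace.append((stream, True))  # touched once: flagged to bypass
            stream += LINE_SIZE
    return trace


if __name__ == "__main__":
    trace = make_trace()
    print("flag ignored:", simulate(trace, 256, honor_bypass=False))
    print("flag honored:", simulate(trace, 256, honor_bypass=True))
```

In this contrived trace, ignoring the flag lets the streaming loads evict the
reused lines on every iteration, while honoring it keeps the reused lines
resident in the 256B ucache. Real workloads are far less clear-cut, so the
sketch says nothing about the magnitudes reported above.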
Web Site:
www.intel.com/research/mrl/