Three-dimensional memory vectorization for multimedia applications
Authors:
Jesus Corbal, Departament d'Arquitectura de Computadors, UPC, Barcelona
Roger Espasa, Departament d'Arquitectura de Computadors, UPC, Barcelona
Mateo Valero, Departament d'Arquitectura de Computadors, UPC, Barcelona
Abstract:
Vector processors have good performance, cost and adaptability
when targeting multimedia applications.
However, for a significant number of media programs,
conventional memory configurations fail to deliver enough memory
references per cycle to feed the SIMD functional units. This paper
addresses that memory bandwidth problem.
We propose a novel mechanism suitable for 2-dimensional vector
architectures and targeted at providing high effective bandwidth for
SIMD memory instructions. The basis of this mechanism is the
extension of the scope of vectorization at the memory level,
so that 3-dimensional memory patterns can be fetched into a
second-level register file.
By fetching long blocks of data and by reusing 2-dimensional
memory streams at this second-level register file, we obtain a
significant increase in the effective memory bandwidth. As side
benefits, the new 3-dimensional load instructions provide high
robustness to memory latency and a significant reduction in
cache activity, thus reducing power and energy requirements.
At the cost of 50% more area than a regular SIMD register
file, we have measured an average speed-up of 13% and a potential
for power savings in the L2 cache of 30%.
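
The idea of reusing overlapping 2-dimensional streams can be illustrated with a small address-counting sketch. It is not the mechanism evaluated in the paper. It is a toy model under stated assumptions: a 2-dimensional stream is taken to be a few rows of contiguous elements separated by a row stride, and a 3-dimensional pattern is taken to be several such streams whose base addresses are a fixed distance apart. The function names, the parameters, and the 8x8 block size are all hypothetical.

```python
# Toy model only: names, parameters and sizes are illustrative assumptions,
# not the instruction semantics defined in the paper.

def addresses_2d(base, row_len, rows, row_stride):
    """Addresses touched by one 2D stream: `rows` rows of `row_len`
    contiguous elements, with consecutive rows `row_stride` apart."""
    return [base + r * row_stride + c
            for r in range(rows)
            for c in range(row_len)]


def addresses_3d(base, row_len, rows, row_stride, count, stream_stride):
    """A 3D pattern modeled as `count` 2D streams whose base addresses
    are `stream_stride` elements apart."""
    return [addresses_2d(base + k * stream_stride, row_len, rows, row_stride)
            for k in range(count)]


def reuse_factor(streams):
    """Element references divided by distinct elements. A value above 1
    means the streams overlap, so data fetched once could serve several
    2D streams."""
    total = sum(len(s) for s in streams)
    distinct = len({a for s in streams for a in s})
    return total / distinct


if __name__ == "__main__":
    # Eight 8x8 blocks in a 64-element-wide image, each shifted by one element.
    overlapping = addresses_3d(0, 8, 8, 64, count=8, stream_stride=1)
    # Eight 8x8 blocks placed side by side, with no shared elements.
    disjoint = addresses_3d(0, 8, 8, 64, count=8, stream_stride=8)
    print(reuse_factor(overlapping), reuse_factor(disjoint))
```

In this model, a reuse factor above 1 marks the access patterns where fetching the enclosing region once and serving the individual 2D streams from a second-level register file would save memory references. Patterns without overlap gain nothing from reuse.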
Web Site:
http://research.ac.upc.es/hpc/