Open Research Compiler (ORC) for the Itanium Processor Family
In conjunction with MICRO-34
Austin, Texas
December 1, 2001
The Explicitly Parallel Instruction Computing (EPIC) architecture,
exemplified by the Itanium Processor Family (IPF), has opened an
exciting arena for compiler and architecture research. To enable
and encourage research activities on IPF, Intel Corp., in collaboration
with the Chinese Academy of Sciences, plans to deliver an open-source IPF
compiler, the Open Research Compiler (ORC), around Q4 of 2001.
The goal of ORC is to provide a flexible and robust compiler
infrastructure to the research community so that researchers can
focus their efforts on solving their critical issues without spending
substantial resources on infrastructure work. ORC is based on an
advanced, product-quality open source compiler, Pro64, which includes
many optimization components, such as inter-procedural analysis and
optimizations, loop-nest optimizations, and machine-independent and
-dependent optimizations.
ORC has added a number of novel research infrastructure features and
state-of-the-art IPF optimizations. The infrastructure features include
a region-based optimization framework, a parameterized machine model, and
support for a rich set of profiling feedback. The newly designed
IPF-specific optimizations include control and data speculation with
recovery code generation, if-conversion, predicate analysis to support
predicate-aware data flow analysis, and instruction scheduling integrated
with template selection. ORC combined with an IPF performance simulator,
such as one produced by the open source Liberty Simulator Environment
from Princeton University, provides a complete framework for studying
various architecture and compilation issues on EPIC. This tutorial will
be the first public forum to introduce the technical features in ORC.
Planned outline of the tutorial:
1. Overview of ORC
- Objectives of ORC
- Research compiler infrastructure for IPF
- Design goals of ORC
- Requirements and priority
- Overview of ORC/Pro64
- Features and state of the compiler
- Different components, such as interprocedural analysis (IPA),
loop-nest optimizer (LNO), global optimizer (WOPT),
code generation (CG), etc.
- Intermediate representations in ORC/Pro64
- List of functionality in the first release
- Research infrastructure features
- IPF-specific optimizations
2. Infrastructure features to enable compiler research
- Region-based optimizations
- Definition and structures of regions
- Properties and features of the regions
- Region formation
- Framework
- Optimization phases driven under this framework
- Comparison with other region-based work
- Parameterized machine model
- Reading architectural and micro-architectural parameters via the IPF
Knobsfile API
- Centralized machine parameters
- How to change machine parameters for architectural study
- Rich set of profiling
- New edge frequency profiling support in CG
- Coexistence with the current Pro64 edge profiling
- Various instrumentation points in CG
- Value profiling
- Support for extending to other kinds of profiling
3. Advanced optimizations for features on Itanium Processor Family (IPF)
- If-conversion and predicate analysis
- Simple and effective if-conversion
- Leveraging hyperblock formation
- Generation of parallel compares
- Analysis of predicate relations
- Predicate analysis to be used for predicate-aware data flow analysis
- Control and data speculation with recovery code generation
- Control and data speculation
- Advanced speculation
- Combined control and data speculation
- Speculation feeding to speculation
- Recovery code generation
- Register allocation in recovery blocks
- Instruction scheduling
- Global and local instruction scheduling
- DAG-based
- Integrated with selection of templates and issue ports
- Scheduling partially ready instructions
- Generating compensation code
4. Case study
- Parameterized machine model
- Changing the number of bundles issued per cycle
- Changing the dispersal rules
- Instruction scheduling and bundling
- Performance analysis
5. Release and future plan
- Planned features for the second release and beyond
- IPF simulator
- Other tools
- Inviting research to ORC
- thread-level parallelism
- instruction-level parallelism
- co-design of architecture and compiler
- power management
- optimization for memory hierarchy
- co-design of static and dynamic compilation
- program analysis
- Release support
- how to download and install
- documentation
- how to report problems
- how to contribute to the source code
- User groups
- how to provide feedback
- mail alias
Tutorial Organizers:
Roy Ju (MRL, Intel)
Sun Chan (MRL, Intel)
Chengyong Wu (ICT, CAS)
Bios:
Roy D.C. Ju is a senior researcher in the Programming System Lab in the
Microprocessor Research Labs, Intel Corp. He is currently the compiler
architect of an IA-64 open source research compiler, which aims at providing
an infrastructure for compiler and architecture research on IA-64 to the
research and open source communities. His primary research interests
include compiler optimization, optimization for memory hierarchy, software
power management, program analysis, computer architecture, and parallel
processing. Prior to joining Intel, he was with the Hewlett-Packard
Company from 1994 to 1999, where he was a project lead in designing and
developing an optimizing compiler for IA-64. He worked at IBM from 1992
to 1994 on developing a then state-of-the-art Fortran 90 optimizing
compiler.
He received his B.S. degree in Electrical Engineering from National Taiwan
University in 1984. He received his M.S. and Ph.D. degrees in Electrical
and Computer Engineering from the University of Texas at Austin in 1988
and 1992, respectively. He currently holds 6 U.S. patents and has published
more than 30 journal and conference papers in various areas, including
array language optimizations, compilation for instruction-level parallelism,
cache optimization, and coarse-grained parallelization. He has served on
the program committees of a number of conferences, such as MICRO-33 and
PLDI 2001.
Sun C. Chan is the manager of the open source research compiler project in
the Microprocessor Research Labs, Intel Corp. His primary research interests
include large-scale software engineering, compiler scalar optimization in
both the global and inter-procedural areas, and instruction-level
parallelism. Prior to joining Intel, he was with SGI, where he was a manager
of the global optimizer and interprocedural optimizer. He is also the
coordinator and architect of the open source Pro64 compiler. Before SGI, he
was a project lead at MIPS working on global scheduling and optimization of
dynamic shared objects. He received his M.S. degree in Computer Science from
Purdue University in 1981. He holds 10 U.S. patents with several more
pending and has published journal and conference papers in various areas,
including inter-procedural analysis, global optimization, and
instruction-level parallelism. He is also interested in engaging university
researchers in compiler and architecture research.
Chengyong Wu received the B.S. degree in Mathematics from Fudan University,
Shanghai, P. R. China, in 1991; the M.S. degree in Computer Engineering from
the Beijing University of Aeronautics and Astronautics, Beijing, P. R.
China, in 1996; and the Ph.D. degree in Computer Sciences from the Institute
of Computing Technology, Beijing, P. R. China, in 2000. Since March 2000, he
has been with the Advanced Compiler Technology Lab at the Division of
Computer Systems, Institute of Computing Technology, Chinese Academy of
Sciences. His research interests include instruction-level parallelism,
optimization, and common compiler infrastructure. He is currently working on
a project to develop an IPF open source research compiler.