Instructor: Ramakrishna Upadrasta (U. Ramakrishna)

Email: Ramakrishna AT iith DOT ac DOT in

Office:

The objective of this course is to learn basic and advanced compiler optimization techniques, either traditional or modern in their scope, or scalar-variable based or loop-optimization based in their application or machine independent or dependent in their variety.

The initial part of the course would be devoted to a collection of traditional and classic compiler analyses and optimizations that are primarily based on control flow and data flow analyses. This will be followed by studying more high-level optimizations that are based on the static single assignment intermediate representation as well as low-level optimizations like register allocation, instruction scheduling and software pipelining.

The latter part of this course would be devoted to a model named polyhedral compilation where for-loops can be transformed to run efficiently on advanced architectures like multi-core or GPU using rational and integer linear programming techniques. Here, the focus would be on basics of the three phase process of dependence analysis, affine scheduling and code generation.

11-March-2014 | Organization, Logistics and Introduction |

12-March-2014 | Introduction to Compiler Optimizations Global Sub-Expression Elimination, Copy Propagation, Loop Invariant Code-Motion, Strength Reduction, Redundancy Elimination, Unrolling, Inlining, Tail Recursion Removal, Vectorization, Loop Interchange, Loop Blocking Intro to Optimization by Prof. YN Srikant, IISc (NPTEL) |

14-March-2014 | Control Flow Analysis(CFA) Dominators, Back Edges, Natural Loops, Interval analysis, T1-T2 analysis, Reducibility (using back edges, Interval analysis and T1-T2 analysis). CFA by Prof. YN Srikant, IISc (NPTEL) Dominators by Prof. Keshav Pingali (UT Austin) |

19-March-2014 & 21-March-2014 |
Data Flow Analysis (DFA) DFA schema, Four classic DFA problems (Reaching Definitions, Available Expressions, Live Variables, Very Busy Expressions), DU and UD chains DFA by Prof. YN Srikant, IISc (NPTEL) DFA by Prof. Uday Khedker (IITB) |

28-March-2014 | DFA and Optimizations Optimizations using DFA: Global Common Subexpression Elimination, Copy Propagation, Loop Invariant Code Motion, Induction variables. DFA by Prof. YN Srikant, IISc (NPTEL) DFA optimizations by Prof. YN Srikant, IISc (NPTEL) DFA Bit Vector Frameworks by Prof. Uday Khedker (IITB) |

2nd-April-2014 | SSA Form: Introduction and Construction Introduction and Definition, Join sets and &Phi nodes Dominators and Dominance Frontiers, Minimal SSA phi placement (Cytron et al) and variable renaming SSA introduction by Prof. YN Srikant, IISc (NPTEL) Dominators by Prof. Keshav Pingali (UT Austin) SSA construction and optimizations by Prof. Keshav Pingali (UT Austin) |

4th-April-2014 & 9th-April-2014 |
SSA optimizations: Constant Propagation (CP) Constant propagation optimization, The CP lattice, Transfer function and meet operators, Classification of CP algorithms (the three older algorithms and Wegman-Zadeck algorithm), Wegman-Zadeck's Conditional Constant Propagation SSA Constant Propagation by Prof. YN Srikant, IISc (NPTEL) SSA construction and optimizations by Prof. Keshav Pingali (UT Austin) |

11th-April-2014 & 16th-April-2014 |
Register Allocation Hardness of problem: exact vs. heuristic solutions Chaitin's algorithm, Constructing the Interference Graph (IG) Kempe's heuristic for coloring, Recent results on chordal graphs Register Allocation by Prof. YN Srikant, IISc (NPTEL) Register Allocation by Prof. Keshav Pingali (UT Austin) Register Allocation from CMU |

16th-April-2014 | Instruction Scheduling and Software Pipelining Overview of Insruction Scheduling and Software Pipelining Instruction Scheduling by Prof. YN Srikant, IISc (NPTEL) Software Pipelining by Prof. YN Srikant, IISc (NPTEL) |

TBD | Loop Optimizations |

TBD | Polyhedral Compilation, Dependence Analysis |

TBD | Polyhedral Scheduling (Feautrier) |

**SSA and its Construction:**Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck. 1991. Efficiently computing static single assignment form and the control dependence graph. ACM Trans. Program. Lang. Syst. 13, 4 (October 1991), 451-490. ACM-DL

**Constant Propagation:**Mark N. Wegman and F. Kenneth Zadeck. 1991. Constant propagation with conditional branches. ACM Trans. Program. Lang. Syst. 13, 2 (April 1991), 181-210. ACM-DL PDF of a textbook chapter

- .....

**Reducibility and Control Flow Analysis:**Allen, Kildall, Hecht, Kam, Ullman**Dominators:**Thomas Lengauer and Robert Endre Tarjan. 1979. A fast algorithm for finding dominators in a flowgraph. ACM Trans. Program. Lang. Syst. 1, 1 (January 1979), 121-141. ACM-DL**Reducibility and Exponentiality:**Larry Carter, Jeanne Ferrante, and Clark Thomborson. 2003. Folklore confirmed: reducible flow graphs are exponentially larger. SIGPLAN Not. 38, 1 (January 2003), 106-114. ACM-DL**Program Dependence Graph and Control Dependence:**Jeanne Ferrante, Karl J. Ottenstein, and Joe D. Warren. 1987. The program dependence graph and its use in optimization. ACM Trans. Program. Lang. Syst. 9, 3 (July 1987), 319-349. ACM-DL**SSA form and DJ Graphs:**Vugranam C. Sreedhar and Guang R. Gao. 1995. A linear time algorithm for placing &Phi-nodes. In Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages (POPL '95). ACM, New York, NY, USA, 62-73. ACM-DL**DJ Graphs and Loops:**Vugranam C. Sreedhar, Guang R. Gao, and Yong-Fong Lee. 1996. Identifying loops using DJ graphs. ACM Trans. Program. Lang. Syst. 18, 6 (November 1996), 649-658. ACM-DL**Sreedhar's PhD thesis:**Vugranam C. Sreedhar. 1996. Efficient Program Analysis Using DJ Graphs. Ph.D. Dissertation. McGill University, Montreal, Que., Canada, Canada. UMI Order No. GAXNN-08158. PDF link**Chapters from SSA Book:**Rastello et al: Book-Link**Sparse Evaluations:**G. Ramalingam. 2002. On sparse evaluation representations. Theor. Comput. Sci. 277, 1-2 (April 2002), 119-147. ACM-DL Science-Direct**Control Dependence:**Keshav Pingali and Gianfranco Bilardi. 1997. Optimal control dependence computation and the Roman chariots problem. ACM Trans. Program. Lang. Syst. 19, 3 (May 1997), 462-491. ACM-DL**Compiler Phases and Machine Learning:**Keith D. Cooper, Devika Subramanian, and Linda Torczon. 2002. Adaptive Optimizing Compilers for the 21st Century. J. Supercomput. 23, 1 (August 2002), 7-22. ACM-DL, PS file from Rice (This paper is subsumed by the Handbook chapter "Machine Learning and Compiler Optimizations" by P. J. Joseph et. al.)**Optimization Sequences and Machine Learning:**Suresh Purini and Lakshya Jain. 2013. Finding good optimization sequences covering program space. ACM Trans. Archit. Code Optim. 9, 4, Article 56 (January 2013), 23 pages. ACM-DL**Machine Learning Models for Compiler Optimizations:**Kapil Vaswani, Matthew J. Thazhuthaveetil, Y. N. Srikant, and P. J. Joseph. 2007. Microarchitecture Sensitive Empirical Models for Compiler Optimizations. In Proceedings of the International Symposium on Code Generation and Optimization (CGO '07). ACM-DL (This paper is subsumed by the Handbook chapter "Machine Learning and Compiler Optimizations" by P. J. Joseph et. al.)**Code Generator Generator:**Christopher W. Fraser, David R. Hanson, and Todd A. Proebsting. 1992. Engineering a simple, efficient code-generator generator. ACM Lett. Program. Lang. Syst. 1, 3 (September 1992), 213-226. ACM-DL. Ref Ch#6 of Muchnick.**Chaitin's NP-Completeness:**Florent Bouchez, Alain Darte, and Fabrice Rastello. Register Allocation: What does Chaitin's NP-Completeness Proof Really Prove?. Technical report RR2006-13, LIP, ENS-Lyon, France, March 2006. PDF of Tech Report**Register Allocation and Chordal Graphs:**Fernando Magno QuintãPereira, Jens Palsberg. Register Allocation via Coloring of Chordal Graphs. The Third Asian Symposium on Programming Languages and Systems. 315 - 329. Spring. 2005. PDF of paper**Coalescing by Graph Coloring:**Sebastian Hack and Gerhard Goos. 2008. Copy coalescing by graph recoloring. In Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation (PLDI '08). ACM, New York, NY, USA, 227-237. ACM Link**Register Allocation for SSA-form:**Sebastian Hack, Daniel Grund, Gerhard Goos: Register Allocation for Programs in SSA-Form. CC 2006: 247-262 Springer**Sebastian Hack's thesis:**Register Allocation for Programs in SSA Form PDF file**Omega Test:**William Pugh. 1991. The Omega test: a fast and practical integer programming algorithm for dependence analysis. In Proceedings of the 1991 ACM/IEEE conference on Supercomputing (Supercomputing '91). ACM, New York, NY, USA, 4-13. ACM-DL**Parallelization of Do Loops:**Leslie Lamport. 1974. The parallel execution of DO loops. Commun. ACM 17, 2 (February 1974), 83-93. ACM-DL**Lamport is this year's Turing Award Winner!!**- .....

**Dependence Abstractions:**Yi-Qing Yang, Corinne Ancourt, and Françs Irigoin. 1995. Minimal data dependence abstractions for loop transformations: extended version. Int. J. Parallel Program. 23, 4 (August 1995), 359-388. IJPP-95, LCPC-94, PS.Z from ENSMP (Also see entry by Irigoin in Encyclopedia of Parallel Computing.)**Polyhedral Compilation:**Sanjay V. Rajopadhye: Dependence Analysis and Parallelizing Transformations. The Compiler Design Handbook 2002: 329-372 (Handbook#1) PDF**Loop Transformations:**Alain Darte**Instruction Scheduling:**R. Govindarajan: Instruction Scheduling. The Compiler Design Handbook, 2nd ed. 2007. (Also ref Ch.#17 of Muchnick.)**Software Pipelining:**Vicki-Allan: CSUR, Handbook#1. ACM**Software Pipelining:**Hongbo Rong, R. Govindarajan: Advances in Software Pipelining. The Compiler Design Handbook, 2nd ed. 2007**Chapters from Muchnick::**Alias analysis (Ch.#10), Interprocedural analysis (Ch.#19), Optimizations for Memory Hierarchy (Ch.#20), Case Studies of Compilers (Ch.#21)**Encyclopedia entries (some examples) :**Banerjee's Dependence Test (Banerjee), Code Generation (Bastoul), Dependence Abstractions (Irigoin), Dependences (Feautrier), Loop Nest Parallelization (Banerjee), Modulo Scheduling and Loop Pipelining (Nicolau), ..., Omega Test (Wonnacott), Parallelization, Automatic (Padua), ..., Tiling (Irigoin), Trace Scheduling (Freudenberger) Springer**Machine Learning and Compiler Optimizations:**Statistical and Machine Learning Techniques in Compiler Design, P. J. Joseph, Matthew J. Thazhuthaveetil, Y. N. Srikant, Kapil Vaswani, Chapter 8 in The Compiler Design Handbook: Optimizations and Machine Code Generation, CRC Press, 2nd Edition, December 2007.- .....

**Halide: Image processing DSL and compiler:**Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Fré Durand, and Saman Amarasinghe. 2013. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. In Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation (PLDI '13). ACM, New York, NY, USA, 519-530. ACM-DL of PLDI-13, PDF link of pldi-13, Related SIGGRAPH-12 paper, Halide page at MIT**R-Stream Compiler:**Encyclopedia Entry**Julia:**Julia Lang, Julia Publications**Galois:**Keshav Pingali, Donald Nguyen, Milind Kulkarni, Martin Burtscher, M. Amber Hassaan, Rashid Kaleem, Tsung-Hsien Lee, Andrew Lenharth, Roman Manevich, Mario Méez-Lojo, Dimitrios Prountzos, and Xin Sui. 2011. The tao of parallelism in algorithms. SIGPLAN Not. 46, 6 (June 2011), 12-25. ACM-DL PDF from UT Austin**Benchmarks:**A study of Compiler Benchmarks: SPEC, NAS, HPC, PolyBench etc.- .....

Activity | Weight |
---|---|

Class Participation | 20% |

End Term Exam (Open Book) | 40% |

Paper Presentation | 40% |