Subsections

Organization

Credits

The project was leaded by David Bednárek13.1. Following students participated in the project directly:

Zdenek Dvorák
13.2 
Zdenek implemented the thread safe profiling code, added infrastructure for high level branch predictor algorithms.

Major part of his work is related to loop optimizations. He implemented new data structure to hold natural loop tree and modified natural loop discovery code originally contributed by Michael Hayes to produce it. On the top of it he implemented loop unswitching, loop unrolling and loop peeling algorithms, one of the most successful new optimizations in our project.

He also made numerous contributions to the control flow graph handling code, most importantly he changed the basic block representation from linearly ordered array to double linked list avoiding quadratical complexity of many important algorithms in new CFG aware code. This required many changes of code in compiler that uses basic blocks.

He also found and fixed many latent bugs in the compiler and our new improved CFG manipulation code.

Jan Hubicka
13.3 
Jan had special role in project by being the only participant with previous knowledge of GCC internals and its development model. In early stages of project he helped other participants to familiarize with project by solving relatively simple problems and has been concentrating on adding infrastructure necessary for CFG and profile manipulation. He also added new branch prediction heuristics and a pass to estimate profile.

Predating the official start of project, he implemented first prototype of working basic block profiler, extended branch prediction pass and modified few passes to use the information. Since the information was not maintained only register allocation used the information. He also implemented bare bones of future cfg-cleanup pass.

In later stages of project he implemented few new optimization passes (tracer, webizer, register coalescing), but continued to concentrate on GCC cleanups and infrastructural changes and later on mid-level RTL implementation. He also reviewed majority of contributions of other participants before integrating to the source tree.

Pavel Nejedlý
13.4 
Major part of Pavel's work is concentrated on the basic block profiler. He implemented new, robust and extensible file format to hold profiles. His format allows overall information to be summarized which is used to identify better hot spots in the program and to do code placement decisions as well as user errors (such as using outdated profiles) to be detected and reported. In future his work will allow more sophisticated profile merging algorithm, as new format stores each train run separately instead of merging them all at once.

Pavel has also reimplemented the interface between profile instrumentation generated by compiler and the libgcc runtime in cleaner and more extensible way.

Last his major contribution is new data structure to hold dominance information that allows fast dynamical updates and is necessary for Zdenek's new loop optimizer code.

Josef Zlomek
13.5 
In early stages Josef has found and fixed major problem of Michael's natural loop discovery code. Later he has implemented and tuned the new basic block reordering algorithm based on software trace cache. His implementation is in many way superior to the one described by original article. He also improved and fixed the cfglayout module he used for his work.

Most important contribution is probably the new variable tracking pass fixing the aged defect of GCC debugging information output code allowing us to enable more optimizations at default level.

Josef has also reviewed lots of code done by Jan Hubicka and provided useful comments and fixed important bugs.

Following people has contributed to our project:

Andreas Jaeger
 
Andreas set up the SPEC2000 tester to do daily benchmarking of our development tree which allowed us to easily tune new optimizations on nontrivial benchmark suite. He also set up an extended weekly tester which provided feedback for our branch prediction heuristics, and kindly benchmarked results of our work that are presented in this documentation.
Richard Henderson
 
Richard, one of the most active maintainers of GCC, has reviewed and approved for inclusion to the mainline the essential parts of our work. He also provided number of important ideas. Without his help our project would not be possible at all. He also helped us to redesign scope tree maintenance code in cfglayout.c.
Jeff Law
 
Jeff pushed us to implement midRTL.
Daniel Berlin
 
Daniel implemented and tested Dwarf2 output routine based on Josef's variable tracking pass. He also tested and fixed our version of compiler on powerPC and added improvements to the loop unrolling debug output code.

Documentation

Our documentation has been written by all participants of the project each describing his portion of work. The printed section contains introduction part and experimental results. The later section describing data structures and interfaces has been committed for inclusion in the official GCC manual [1].

Major portion of documentation however is in the sources. This is dictated by GNU Coding Standards [1] and maintainers of GCC project. Since GCC contains a lots of aged and complicated code, good documentation is essential.

Each function should have description of purpose of its arguments and return value, in addition to commenting all nontrivial steps in the function body. Each of our newly added modules contains summary of algorithm in the introductory comment.

Sources

Since our project is integrated to the GCC source tree, we provide short overview of locations of our changes. We have implemented from scratch following modules:

cfgcleanup.c, cfgloop.c, cfgloopanal.c, tracer.c, web.c, coalesce.c, loop-new.c unroll-new.c, unroll-new.c var-tracking.c, bb-reorder.c13.6, midrtl.c, et-forest.c, et-forest.h, predict.def and analyze_branches script.

We made major changes to following modules:

flow.c
We split out the CFG handling code into cfg.c, cfganal.c, cfgrtl.c, cfgloop.c and importantly changed (de facto reimplemented) all the contents. We also added support for easy dynamic updating of liveness information.
emit-rtl.c
We modified all routines to update the flow graph.
toplev.c
We reorganized the rest_of_compilation function.
cfglayout.c
We rewrote the code to handle syntactic scope nest tree and changed all interfaces. We added functionality for code duplication and reorganized the way how the insn chain is put together.
gcse.c
GCSE now maintains liveness information and uses our code hoisting infrastructure code to avoid limitations of previous implementation.
df.c
Since we are the first users of Michael Hayes' data flow module, we had to fix bugs and some design defects. We also cleaned up the code.
gen*.c
All these functions had to be updated to handle midRTL.
recog.c
The most of mid level RTL code is present here.
optabs.c
We had to modify all RTL generation routines to produce mid level RTL code.
ifcvt.c
contains double test optimization pass.
predict.c
contains new branch predictor implementation.
We basically rewrote this module from scratch.
profile.c, final.c, libgcc.c
contain updated profiler implementation and changed interfaces.
reg-stack.c,reg-class.c,local.c,global.c
Updated to use the profile.
stmt.c
contains high level branch predictors.

Because of the nature of project, we had to touch almost all other back-end modules in GCC as well.

Timeline

Exact progress of the project is described in ChangeLog, ChangeLog.6 and ChangeLog.cfg files in the GCC subdirectory, but here we point out some dates we consider especially important for our project.
Jul 28
First prototype of new branch predictor and profiler integrated to GCC + modification of register allocator. (Jan Hubicka)
Sep 10
First version of cfgcleanup integrated. (Jan Hubicka)
Sep 25
Large reorganization of CFG related modules. flow.c breakup (Jan Hubicka)
Nov 12
CFG efforts moved from mainline to cfg-branch due to destabilization.
Nov 12
Natural loop discovery code finally works. (Josef Zlomek)
Nov 12
Cross-jumping and jump threading merged to mainline. (Jan Hubicka)
Nov 13
Old jump optimization pass is finally dead. (Jan Hubicka)
Nov 15
High level branch prediction integrated. (Zdenek Dvorák)
Nov 17
CFG layout duplication code and preliminary tracer implementation. (Jan Hubicka)
Nov 26
Double test combining pass. (Jan Hubicka)
Dec 11
First version of software trace cache implementation. (Josef Zlomek)
Dec 12
Tracer now produces better code.
Dec 13
Major of our changes are now in the mainline tree. Thanks to Richard Henderson.
Dec 15
GCC mainline feature freeze.
Jan 1
New profiler file format. (Pavel Nejedlý)
Jan 2
Thread safe profiling integrated. (Zdenek Dvorák)
Jan 9
Software trace cache now produces better code (new loop rotation code). (Josef Zlomek)
Jan 16
Code hoisting infrastructure and GCSE revamp. (Jan Hubicka)
Jan 21
Midlevel RTL prototype. (Jan Hubicka)
Jan 23
Variable tracking code. (Jan Hubicka)
Jan 24
New natural loop code. (Zdenek Dvorák)
Feb 13
Finished integration of most of cleanups, fixes and infrastructural updates to the mainline. Thanks to Richard Henderson.
Feb 15
Final mainline freeze.
Feb 19
Major cfglayout cleanups and fixes. (Jan Hubicka)
Feb 21
Loop code updates integrated. (Zdenek Dvorák)
Feb 25
Loop unswitching code. (Zdenek Dvorák)
Mar 18
Loop unrolling and peeling code. (Zdenek Dvorák)
Mar 25
New debug output code. (Daniel Berlin, Josef Zlomek)
Apr 1
Feature freeze.
Apr 3
PowerPC and Sparc bootstrapped.
Apr 9
Fast dominance tree updating code integrated. (Pavel Nejedlý)

Jan Hubicka 2003-05-04