Automatic generation of fast optimizing code generators
Summary (2 min read)
Introduction
- This paper describes a system that accepts compact specifications of an intermediate code and target machine and produces program code for an integrated code generator and peephole optimizer.
- The code generators are produced as follows.
- The programmer also prepares a machine description for a retargetable peephole optimizer [2] .
Representation
- Both the training and production code generators accept the same input: an "abstract syntax dag" built by the front end.
- The front end has propagated types and folded them into the opcodes (e.g. the I prefix flags integer opcodes) so that the back end need not understand the front end's type system, which is typically more complex than the back end's.
- On the VAX, for example, the subtree rooted at the ISUB above is ultimately replaced with the instruction subl3 _c,r3,r4, and the rest of the tree is replaced with clrl _up+4*7[r4].
- The compiler has not yet accommodated full C, but the size of the table may be estimated.
- The bindings for the pattern variables %0 and %1 are never stored in this node because they are available (after register assignment) in the children's vars fields.
Specifying the Code Generator
- Here are a few lines from the specification that defines the intermediate code and the naive VAX code generator.
- Opcodes GLOBAL and moval _%0,r%1 are leaves, and the remaining opcodes above are binary.
- The presence of a second number indicates that a register must be allocated to hold the target instruction's result.
- If the intermediate code uses a constant field (in the examples above, GLOBAL needs the name of a global variable and ILT needs a label number), the front end stores it in the appropriate pattern variable.
- The automatically generated code generators do the rest.
The Training Code Generator
- Initially, the code generator uses only those opcodes that appeared in the specification of the naive code generator, so the initial opcode list holds exactly the two columns from the specification.
- This case analysis takes the form of an if-then-else chain that may edit the dag and jump off to the case that handles the new opcode.
- The goto L37 above is really omitted.
- This results in redundant assignments to the opcode field when rewrite re-encounters a multiply-referenced node that has been previously traversed and rewritten, but moving the assignment saves more than it sacrifices.
- These arrays are needed only by the register allocator and output routine, which need to know where to store register names and how many children to traverse.
The Peephole Optimizer and Trace
- The training routine combine is a retargetable peephole optimizer.
- It then searches the machine description for an instruction with this combined effect.
- If the value produced by an instruction is used several times, its cost is divided equally between its users.
- A full review of this technique is beyond the scope of this paper, but Reference 2 elaborates.
- The last line above reports that the result register of the new instruction is to be bound to %1.
The Production Code Generator
- To produce the production system, the code generator generator accepts the trace above and the specification of the naive code generator.
- It produces an optimizing code generator that is like the naive one presented above, except the opcode list is extended to include all the new instruction variants generated during training, optimizing case analysis is inserted at the head of each case that handles a target instruction, and the call on combine is omitted.
- It uses b->vars[0] because %1 is the first pattern variable of b that requires local storage.
- If no optimization applies, control falls off the chain of ifs into code that updates a->op and returns.
- Case analysis like that above could be generated without training on a testbed.
Discussion
- Two emerging compilers use the techniques above.
- One uses a modified pcc as a front end and has largely complete back ends for the VAX and the MC68020.
- The interface between its front end and generated code generators is somewhat less efficient than that shown above.
- At present, this compiler runs in about 55% of the time taken by pcc.
- In a typical run, rewrite currently takes less than 1% of the time taken by pcc.