As we have already discussed, the performance advantage of avoiding expensive method invocations is only half of the inlining performance story. The other half is cross-call optimization. Cross-call optimizations allow the compiler to perform source- and machine-level optimizations on a method based on a more expansive contextual view of its invocation. These optimizations generally take the form of doing things at compile time to avoid the necessity of doing them at run time; for example, simple things like converting
x = 90.0;
... // nothing that changes x’s value
y = sin(x);
into
x = 90.0;
y = 1.0; // sin(90) = 1, taking x in degrees
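The same folding becomes possible across a call boundary once the callee's body is visible at the call site. The following is a minimal sketch, not a statement of what any particular compiler does: the helper degreesToRadians and the constant kPi are hypothetical names, and constexpr is used only so that the compile-time evaluation can be checked with static_assert. An optimizer may perform the same folding on a plain inline function, but it is not required to.

```cpp
// Hypothetical helper and constant; both names are illustrative only.
constexpr double kPi = 3.14159265358979323846;

constexpr double degreesToRadians(double degrees) {
    return degrees * kPi / 180.0;
}

double quarterTurn() {
    // Because the body of degreesToRadians is visible here, the call can be
    // replaced by the constant it evaluates to; no call happens at run time.
    constexpr double radians = degreesToRadians(90.0);
    static_assert(radians > 1.5707 && radians < 1.5708,
                  "value was computed at compile time");
    return radians;
}
```

If degreesToRadians lived in a separately compiled translation unit, the compiler would see only a declaration at the call site and would have to emit a real call.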
Why Not Inlining?
If inlining is so good, why don’t we just inline everything? This simple question has a lot of complicated answers. Let’s start with an inlining scenario. Suppose we inline a method that, when compiled, contains 550 machine instructions. Further suppose that 50 of those instructions are associated with call prologue and epilogue (method invocation overhead). If our hypothetical method is statically invoked a dozen times (called from a dozen different locations within a program), we have just increased our program size by 5,450 instructions: (550 instructions per inlining - 50 instructions of invocation overhead) * 12, less the 550 instructions of the single out-of-line version that is no longer needed. In exchange, we have improved the execution performance of each inlined invocation of the method by only about 10%. (Assume that the method has 50 cycles of call overhead and requires 500 cycles to execute. This is pure conjecture; some methods with 500 machine code instructions may have an average execution time of 10 cycles and others may require millions of cycles.) Thus we have roughly a 10x increase in the code size of this one method with only a marginal per-invocation performance improvement. Code expansions of this magnitude, when extrapolated across all the inlineable methods within a program, will have huge negative secondary performance effects, such as cache misses and page faults, that will dwarf any supposed primary gains. Put differently, an overaggressively inlined program will execute fewer instructions but take longer doing so. Thus, one reason why all methods are not inlined is that the code expansion that inlining creates may not be tolerable.
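The arithmetic in this scenario can be written out as a small sketch. The function names inlineGrowth and cyclesSaved are hypothetical, and cyclesSaved assumes that the 500-cycle figure already includes the 50 cycles of call overhead, which is one reading of the numbers given here.

```cpp
// Net growth, in instructions, when every call site receives its own copy
// of the method body (minus call overhead) and the single out-of-line copy
// is no longer needed.
constexpr long inlineGrowth(long bodySize, long callOverhead, long callSites) {
    return (bodySize - callOverhead) * callSites - bodySize;
}

// Fraction of the cycles of one invocation that inlining removes.
// Assumption: cyclesPerCall already includes overheadCycles.
constexpr double cyclesSaved(double overheadCycles, double cyclesPerCall) {
    return overheadCycles / cyclesPerCall;
}

static_assert(inlineGrowth(550, 50, 12) == 5450,
              "matches the 5,450-instruction figure in the text");
```

With the figures used in the scenario, inlineGrowth(550, 50, 12) gives 5,450 instructions of growth, while cyclesSaved(50, 500) gives 0.10, the 10% per-invocation improvement.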
Another reason for not inlining everything is that some methods cannot be inlined; for example, recursive methods cannot be inlined. Suppose some method A calls itself. Any attempt to inline A would result in an infinite loop as the compiler continually attempted to insert A into A. (There are actually some cases in which a very clever compiler could inline a recursive function, particularly if the variable that controls the recursion is passed in as a literal.) Thus, recursive methods generally cannot be inlined (though later we will discuss some mechanisms to achieve a similar net effect). Methods that are indirectly recursive can sometimes be inlined. For example, if A calls B and B calls A, then B, although indirectly recursive, can be inlined.
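The indirectly recursive case can be illustrated by hand. In the sketch below, isEven plays the role of A and isOdd the role of B; both functions are hypothetical examples, and isEvenWithOddInlined shows what A might look like after B's body is substituted at the call site. It is written out manually and is not the output of any particular compiler.

```cpp
// Mutually recursive pair: A (isEven) calls B (isOdd), and B calls A.
bool isOdd(unsigned n);

bool isEven(unsigned n) {
    return n == 0 ? true : isOdd(n - 1);
}

bool isOdd(unsigned n) {
    return n == 0 ? false : isEven(n - 1);
}

// A with B's body substituted for the call to B. The result is directly
// recursive, so the expansion stops after this single substitution.
bool isEvenWithOddInlined(unsigned n) {
    if (n == 0) return true;
    const unsigned m = n - 1;  // the argument that was passed to B
    return m == 0 ? false : isEvenWithOddInlined(m - 1);
}
```

Inlining B removes one of the two calls in each round trip, but the remaining self-call in A still cannot be expanded without end.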
Reasons for not inlining:
- Code expansion.
- Some forms of inline functions cause problems for some compilers.
- Recursive inlining may cause problems.
- Recompilation: any change to the code of an inlined function forces every caller to be recompiled.
Singleton methods (methods invoked from only one place in a program) are good candidates for inlining.
Trivials are small methods, generally with fewer than four simple source-level statements, that compile into ten or fewer assembly language instructions. These methods are so small that they do not provide any possibility for significant code expansion. Inlining them is strongly recommended.
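Accessors are a typical example of a trivial method. The sketch below uses a hypothetical Point class and a hypothetical helper manhattanLength; whether a given compiler actually expands these calls in place depends on its settings, since inline is a request rather than a guarantee.

```cpp
// Hypothetical class used only to illustrate trivial methods.
class Point {
public:
    Point(int x, int y) : x_(x), y_(y) {}

    // Member functions defined inside the class body are implicitly inline.
    int x() const { return x_; }
    int y() const { return y_; }

private:
    int x_;
    int y_;
};

// Defined outside a class, so the inline keyword is spelled out.
inline int manhattanLength(const Point& p) {
    const int ax = p.x() < 0 ? -p.x() : p.x();
    const int ay = p.y() < 0 ? -p.y() : p.y();
    return ax + ay;
}
```

Each accessor body is a single load, which is likely smaller than the call sequence it would replace, so inlining it costs essentially nothing in code size.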
Inlining may backfire, and overly aggressive inlining will almost certainly do so. Inlining can increase code size, and larger code suffers a higher rate of cache misses and page faults than smaller code.