Programmers may have different views on C++ performance depending on their respective experiences. But there are a few basic principles that we all agree on:
- I/O is expensive.
- Function call overhead is a factor so we should inline short, frequently called functions.
- Copying objects is expensive. Prefer pass-by-reference over pass-by-value.
Consider the following code sample:
As you can tell, addOne() doesn’t do much, which is exactly the point of a baseline. We are trying to isolate the performance factors one at a time. Our main() function invoked addOne() a million times and measured execution time:
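The baseline described above could look roughly like the following sketch. The helper name `timeAddOneMs`, the `volatile` sink, and the use of `std::chrono` are assumptions made for illustration; they are not the original measurement code.

```cpp
#include <chrono>

// Baseline: no tracing at all.
int addOne(int x)
{
    return x + 1;
}

// Hypothetical helper: calls addOne() `iterations` times and returns the
// elapsed wall-clock time in milliseconds.
double timeAddOneMs(int iterations)
{
    volatile int sink = 0;  // stops the compiler from discarding the loop
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        sink = addOne(i);
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `timeAddOneMs(1000000)` from `main()` and printing the result gives the kind of measurement discussed here. Absolute numbers depend on the machine, the compiler, and the optimization level.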
Next, we added a Trace object to addOne() and measured again to evaluate the performance delta. This is Version 1.
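A minimal sketch of what Version 1 might look like is shown here. The `traceIsActive` flag, the member name `theFunctionName`, and the exact messages are assumptions; only the overall shape (a local string in addOne() handed to a Trace object that keeps its own string member) follows from the description in this article.

```cpp
#include <iostream>
#include <string>

class Trace {
public:
    explicit Trace(const std::string& name);
    ~Trace();
    static bool traceIsActive;      // hypothetical global on/off switch
private:
    std::string theFunctionName;    // copied on every construction
};

bool Trace::traceIsActive = false;

Trace::Trace(const std::string& name) : theFunctionName(name)
{
    if (traceIsActive) {
        std::cout << "Enter function " << name << '\n';
    }
}

Trace::~Trace()
{
    if (traceIsActive) {
        std::cout << "Exit function " << theFunctionName << '\n';
    }
}

int addOne(int x)
{
    std::string name = "addOne";    // built on every call, even with tracing off
    Trace t(name);
    return x + 1;
}
```

Even with `traceIsActive` set to false, every call to addOne() still constructs and destroys two strings and one Trace object.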
The cost of the for loop has skyrocketed from 55 ms to 3,500 ms. In other words, the speed of addOne has plummeted by a factor of more than 60! See graph below:
This is due to the following overhead operations in Version 1:
- Create the string name local to addOne (at the start).
- Invoke the Trace constructor (at the start).
- The Trace constructor invokes the string constructor to create the member string (at the start).
- Destroy the string name (at the end).
- Invoke the Trace destructor (at the end).
- The Trace destructor invokes the string destructor for the member string (at the end).
The performance recovery plan was to eliminate objects and computations whose values get dropped when tracing is off. We started with the string argument created by addOne() and passed to the Trace constructor. We changed the function name argument from a string object to a plain char pointer.
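That change could be sketched as follows. As before, `traceIsActive` and `theFunctionName` are hypothetical names. We also assume the Trace object still keeps a string member at this stage, so one of the two string constructions per call remains.

```cpp
#include <iostream>
#include <string>

class Trace {
public:
    explicit Trace(const char* name);   // was: const std::string&
    ~Trace();
    static bool traceIsActive;          // hypothetical global on/off switch
private:
    std::string theFunctionName;        // still constructed on every call
};

bool Trace::traceIsActive = false;

Trace::Trace(const char* name) : theFunctionName(name)
{
    if (traceIsActive) {
        std::cout << "Enter function " << name << '\n';
    }
}

Trace::~Trace()
{
    if (traceIsActive) {
        std::cout << "Exit function " << theFunctionName << '\n';
    }
}

int addOne(int x)
{
    Trace t("addOne");   // no local std::string is created any more
    return x + 1;
}
```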
Now the execution time dropped from 3,500 ms to 2,500 ms. See figure below:
- Object definitions trigger silent execution in the form of object constructors and destructors. We call it “silent execution” as opposed to “silent overhead” because object construction and destruction are not usually overhead.
- Just because we pass an object by reference does not guarantee good performance. Avoiding object copy helps, but it would be helpful if we didn’t have to construct and destroy the object in the first place.
- Don’t waste effort on computations whose results are not likely to be used.
- Don’t aim for the world record in design flexibility. All you need is a design that’s sufficiently flexible for the problem domain. A char pointer can sometimes do the simple jobs just as well as a string, and more efficiently.
- Inline. Eliminate the function call overhead that comes with small, frequently invoked functions.