
Improve C++ Compile-Time and Run-Time Performance

February 21st, 2008

Filed under: C++ — Kai @ 6:51 pm

Organizing your code for short compile times and, more importantly, good run-time performance is still a major concern in software development.

Making an application deliver results faster doesn’t always mean you have to redesign all of the code, and it doesn’t necessarily depend on using extremely fast libraries. Often things can be improved much more easily. In many cases what matters is not what you’re using but how you’re using it.

I’d like to show you a number of techniques for improving performance in C++. None of these “tricks” is difficult or really new. They’re just common practices that can make your application a little faster.

  • To achieve the fastest build times, compile without debug info and without optimization.
  • The GCC option -g produces debugging information in the operating system’s native format. You don’t have to produce it on every build; it’s only necessary when you debug the project.
    -O3 enables GCC’s most aggressive speed optimizations (-Os is the option that optimizes for size), and both lengthen compilation. If you are focused on compile time, you’d better use the default -O0.

  • What libraries to use?
  • Most C++ libraries are designed for good performance over a wide range of uses. The C++ standard library gives you a significant number of good algorithms and services, so don’t write every algorithm yourself. Sometimes, however, the standard library doesn’t provide a perfect fit for your project. Before reimplementing an algorithm yourself, have a look at other libraries such as Boost or the Apache C++ Standard Library. For graphics or threading in particular there’s a whole bunch of libraries.

  • Include files
  • Reducing the number of included files by organizing the project better can make compilation more efficient.

    Most programmers add something like this to their headers to avoid including a header more than once.

    #ifndef MYCLASS_H
      #define MYCLASS_H
      class MyClass 
      { 
       ... 
      };
    #endif

    If the file has not been read before, the class is defined; otherwise the file is essentially empty. We can avoid even the unnecessary read of the file by testing the guard outside the include as well.

    #ifndef MYCLASS_H
      #include "myclass.h"
    #endif

    This technique is most effective when the compiler is hitting the limits of available main memory and as a result the file cache is ineffective.

  • Usage of inline functions
  • Inline function expansion can significantly improve run-time performance, but the cost is increased compilation time (and often a larger binary). That means you have to know what matters more to you: compile time or run time.

    In a previous post I described inlining in detail:
    Inlining - on February 10th, 2008

  • Bit-fields
  • Both C and C++ allow integer members to be stored into memory spaces smaller than the compiler would ordinarily allow.

    Convert boolean and small integer values into bit-fields, and then place these fields adjacent to each other. This technique can substantially reduce data size, but it is only worth doing if your application uses too much memory.

    Bit-fields can also help when reading external file formats, for example non-standard formats that store 9-bit integers or similar.

    Bit-fields are usually declared in structures:

    struct A
    {
     unsigned int f1: 4;
     unsigned int f2: 4; 
    };

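    Bit-fields are most commonly used to store a set of boolean flags compactly. Keep in mind that they are a somewhat tricky form of space optimization: their layout is implementation-defined, and reading or writing one usually costs extra masking and shifting instructions.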
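As a rough sketch of the flag use case, the two structs below hold the same information, once as ordinary members and once as bit-fields. The names WindowFlagsPlain, WindowFlagsPacked and make_packed_flags are made up for this illustration, and the actual sizes depend on the compiler, so treat the size comparison as typical rather than guaranteed.

```cpp
#include <cassert>

// Hypothetical flag set stored as ordinary members: at least one byte each.
struct WindowFlagsPlain {
    bool visible;
    bool focused;
    bool resizable;
    bool modal;
    unsigned char border_width;  // assumed to stay in the range 0..15
};

// The same data as bit-fields: four 1-bit flags plus one 4-bit integer.
// How the bits are laid out is implementation-defined.
struct WindowFlagsPacked {
    unsigned int visible      : 1;
    unsigned int focused      : 1;
    unsigned int resizable    : 1;
    unsigned int modal        : 1;
    unsigned int border_width : 4;
};

// Fills a packed record and checks that the values read back unchanged.
inline WindowFlagsPacked make_packed_flags() {
    WindowFlagsPacked f = WindowFlagsPacked();  // all fields start at zero
    f.visible = 1;
    f.modal = 1;
    f.border_width = 9;
    assert(f.visible == 1 && f.focused == 0);
    assert(f.border_width == 9);
    return f;
}
```

On common compilers sizeof(WindowFlagsPacked) is no larger than sizeof(WindowFlagsPlain), and the gap grows with the number of flags.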

  • Converter methods
  • A common operation is casting a pointer to a base class into a pointer to a class derived from it.
    dynamic_cast is very general, and consequently more expensive than most specific needs warrant; a C-style cast is cheap but unchecked.

    car* pCar = (car*)pVehicle;
    car* cCar = dynamic_cast<car*>( pVehicle );

    You can achieve a lot of useful functionality by providing dynamic converter methods:

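    car* pCar = pVehicle->to_car();

    class car;  // forward declaration, needed for the return type below

    class vehicle
    {
    public:
        virtual ~vehicle() {}
        virtual car* to_car() { return 0; }  // not a car by default
    };
    class car : public vehicle
    {
    public:
        virtual car* to_car() { return this; }  // a car converts to itself
    };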

    In most cases this is not just clearer, it’s also faster.

  • Default operators
  • Use the default operators. If a class definition does not declare a parameterless constructor, a copy constructor, a copy assignment operator, or a destructor, the compiler will implicitly declare them.

    When the compiler builds a default operator, it knows a great deal about the work that needs to be done and can produce very good code. This code is often much faster than user-written code because the compiler can take advantage of assembly-level facilities.

    Default operators are inline functions, so do not use default operators when inline functions are inappropriate.
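One way to see the difference is to ask the compiler whether it still considers a class trivially copyable. This is only a sketch: PointImplicit, PointManual and demo_copies are hypothetical names, and std::is_trivially_copyable needs a C++11 compiler, which is newer than the compilers this post was written for.

```cpp
#include <cassert>
#include <type_traits>

// Relies on the implicitly declared copy constructor and copy assignment.
struct PointImplicit {
    double x;
    double y;
};

// Hand-written copy operations that do exactly the same member-wise work.
struct PointManual {
    double x;
    double y;
    PointManual(double x_, double y_) : x(x_), y(y_) {}
    PointManual(const PointManual& other) : x(other.x), y(other.y) {}
    PointManual& operator=(const PointManual& other) {
        x = other.x;
        y = other.y;
        return *this;
    }
};

// With the defaults the compiler knows a plain memory copy is enough.
static_assert(std::is_trivially_copyable<PointImplicit>::value,
              "implicit copy operations are trivial");
// User-provided copy operations take that knowledge away.
static_assert(!std::is_trivially_copyable<PointManual>::value,
              "user-provided copy operations are not trivial");

// Both classes copy correctly; only the generated code may differ.
inline void demo_copies() {
    PointImplicit a = {1.0, 2.0};
    PointImplicit b = a;
    assert(b.x == 1.0 && b.y == 2.0);

    PointManual c(3.0, 4.0);
    PointManual d = c;
    assert(d.x == 3.0 && d.y == 4.0);
}
```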

  • Passing reference parameters
  • Most programmers, myself included, often pass references to functions because it’s said to be faster than a call by value. That sounds logical, because a call by value needs a copy, which requires storage.

    However, value parameters may be more efficient, and even when they are not directly more efficient, the compiler knows that a value parameter cannot be aliased, and so can better optimize access to the parameter.
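The aliasing problem can be made visible with a small sketch. The functions scale_by_ref and scale_by_value are invented for this example; they differ only in how factor is passed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// By reference: 'factor' may refer to an element of 'values', so it has to
// be re-read from memory on every iteration.
inline void scale_by_ref(std::vector<int>& values, const int& factor) {
    for (std::size_t i = 0; i < values.size(); ++i)
        values[i] *= factor;
}

// By value: 'factor' is a private copy that nothing else can change, so it
// can stay in a register for the whole loop.
inline void scale_by_value(std::vector<int>& values, int factor) {
    for (std::size_t i = 0; i < values.size(); ++i)
        values[i] *= factor;
}

// Passing the first element as the factor shows the two versions diverge.
inline void demo_aliasing() {
    std::vector<int> a(3, 2);   // {2, 2, 2}
    std::vector<int> b(3, 2);   // {2, 2, 2}
    scale_by_ref(a, a[0]);      // factor changes after the first iteration
    scale_by_value(b, b[0]);    // factor stays 2
    assert(a[1] == 8);
    assert(b[1] == 4);
}
```

The by-reference version is not wrong, it just has to allow for the aliased call, and that is exactly what limits the optimizer.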

  • Temporary objects can be avoided
  • Temporary objects cost run-time performance because each one has to be constructed, copied and destroyed, and needs memory in the meantime. To avoid creating lots of them, consider the following:

    T x = a + b;

    This produces a temporary object of class T for the sum of a and b and then passes it into x.

    T x(a); x += b;

    The second form doesn’t need a temporary object.
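For completeness, here is a sketch of a class for which both forms compile. Vec2 is a hypothetical stand-in for T, and operator+ is written in terms of operator+=, which is a common convention but not the only option. Current compilers can often elide the extra object in the first form, so measure before rewriting code for this reason alone.

```cpp
#include <cassert>

// Hypothetical small value type standing in for the T of the text.
struct Vec2 {
    double x;
    double y;
    Vec2(double x_, double y_) : x(x_), y(y_) {}
    // Adds in place: no new object is created.
    Vec2& operator+=(const Vec2& rhs) {
        x += rhs.x;
        y += rhs.y;
        return *this;
    }
};

// operator+ has to build a new object to hold the sum.
inline Vec2 operator+(const Vec2& lhs, const Vec2& rhs) {
    Vec2 sum(lhs);
    sum += rhs;
    return sum;
}

// Form 1: the result of a + b is handed over to x.
inline Vec2 add_with_plus(const Vec2& a, const Vec2& b) {
    Vec2 x = a + b;
    return x;
}

// Form 2: x starts as a copy of a and is then modified in place.
inline Vec2 add_in_place(const Vec2& a, const Vec2& b) {
    Vec2 x(a);
    x += b;
    return x;
}

// Both forms must produce the same value.
inline void demo_temporaries() {
    Vec2 p = add_with_plus(Vec2(1.0, 2.0), Vec2(3.0, 4.0));
    Vec2 q = add_in_place(Vec2(1.0, 2.0), Vec2(3.0, 4.0));
    assert(p.x == q.x && p.y == q.y);
}
```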

  • Write member variables into local variables
  • Accessing member variables is a common operation in C++ member functions.
    Because members are read and written through the this pointer, the compiler must often load them from memory.

    The compiler frequently cannot prove that a member has not been changed through some other pointer or reference, which forces the code to reload the member every time it’s needed. If you copy the member into a local variable at the top of the function, it is read just once, and the local variable can be used in the rest of the function. If the function modifies the value, write the local back to the member before returning.

    That way you can avoid unnecessary memory reloads.

    Another advantage that improves performance is that local values can reside in registers, especially values of primitive types.
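A minimal sketch of the idea, using a made-up Accumulator class: both member functions compute the same result, but the second one reads the member once and works on a local copy.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class used only to illustrate the technique.
class Accumulator {
public:
    explicit Accumulator(int step) : step_(step) {}

    // Reads step_ through 'this' inside the loop. Because 'out' might point
    // at the same storage, the compiler may have to reload step_ each time.
    void fill_direct(int* out, std::size_t n) const {
        for (std::size_t i = 0; i < n; ++i)
            out[i] = static_cast<int>(i) * step_;
    }

    // Copies the member into a local first. Nothing can alias the local,
    // so it can be kept in a register for the whole loop.
    void fill_cached(int* out, std::size_t n) const {
        const int step = step_;
        for (std::size_t i = 0; i < n; ++i)
            out[i] = static_cast<int>(i) * step;
    }

private:
    int step_;
};

// Both variants must fill the buffers identically.
inline void demo_member_cache() {
    int a[4];
    int b[4];
    Accumulator acc(3);
    acc.fill_direct(a, 4);
    acc.fill_cached(b, 4);
    for (int i = 0; i < 4; ++i)
        assert(a[i] == b[i]);
}
```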

  • Too many function calls
  • You often see things like this:

    if(App->MyClass->GetInstance())
    {
        App->MyClass->GetInstance()->GetAll()->Select(...);
        App->MyClass->GetInstance()->GetAll()->SelectTop();
    }

    You don’t have to do that. Often it’s simply the result of copying and pasting lines, as everybody does lots of times.

    A more readable solution, which also avoids the repeated calls and memory accesses, is:

    MyClass* inst = App->MyClass->GetInstance();
    if(inst)
    {
        inst->GetAll()->Select(...);
        inst->GetAll()->SelectTop();
    }

  • const does not improve run-time performance
  • Many programmers use const reference parameters for functions with the intent to inform the compiler that the parameter is read-only. The const keyword says that the storage may not be modified through the given name.

    What it does not say is that the storage cannot be modified through some other name. Only variables directly declared as const are really constant.

    As a way of improving run-time performance it is basically ineffective. It does, however, catch errors during programming.
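The point about “some other name” can be shown in a few lines. The function bump_and_read is hypothetical; it receives the same kind of const reference parameter described above, plus a second, non-const reference.

```cpp
#include <cassert>

// 'limit' is read-only through this name, but 'counter' may refer to the
// very same int, so the value behind 'limit' can still change.
inline int bump_and_read(const int& limit, int& counter) {
    counter += 1;
    return limit;  // has to be read after the write above
}

// Calls the function once with distinct objects and once with aliases.
inline void demo_const_alias() {
    int a = 5;
    int b = 0;
    assert(bump_and_read(a, b) == 5);  // distinct objects: limit unchanged

    int v = 5;
    assert(bump_and_read(v, v) == 6);  // same object: the "const" value moved
}
```

Because the second call is legal C++, the compiler cannot treat limit as a constant inside the function.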

  • Deallocating memory
  • Once unnecessary allocations are eliminated, the next most effective technique for improving performance is to deallocate memory when it is no longer needed. This is best accomplished with explicit calls to delete.

    However, applications may lose control of their memory, and a conservative garbage collector can sometimes be a critical tool. Some compilers ship one: Sun’s C++ compiler, for example, provides a conservative garbage collector through the option -library=gc.

  • Loop optimizations
  • The simplest optimization is to remove short loops that can just as well be written out by hand (loop unrolling). Consider this:

    for(int i=0; i<4; i++)
        array[i] = i;

    This is logically the same as

    array[0] = 0;
    array[1] = 1;
    array[2] = 2;
    array[3] = 3;

    Of course it’s pointless to do this in just one or two places in your project. But if you have thirty or more such places, it might give you back some performance, keeping in mind that an optimizing compiler often unrolls such loops on its own.

    Do as little checking inside a loop as possible. Try to pass a loop only the filtered data objects, those that really need to be processed.
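The filtering advice could look roughly like the sketch below. All names (is_valid, sum_checked, keep_valid, sum_all) are invented for the example, std::copy_if requires C++11, and filtering up front mainly pays off when the filtered data is traversed more than once.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical validity rule: only non-negative values count.
inline bool is_valid(int v) { return v >= 0; }

// Variant 1: the check runs inside the loop on every pass over the data.
inline long sum_checked(const std::vector<int>& data) {
    long sum = 0;
    for (std::size_t i = 0; i < data.size(); ++i)
        if (is_valid(data[i]))
            sum += data[i];
    return sum;
}

// Variant 2, step 1: filter once, outside the hot loop.
inline std::vector<int> keep_valid(const std::vector<int>& data) {
    std::vector<int> out;
    std::copy_if(data.begin(), data.end(), std::back_inserter(out), is_valid);
    return out;
}

// Variant 2, step 2: the loop itself no longer contains any check.
inline long sum_all(const std::vector<int>& data) {
    long sum = 0;
    for (std::size_t i = 0; i < data.size(); ++i)
        sum += data[i];
    return sum;
}

// Both variants must agree on the result.
inline void demo_filtering() {
    std::vector<int> data;
    data.push_back(3);
    data.push_back(-1);
    data.push_back(4);
    assert(sum_checked(data) == sum_all(keep_valid(data)));
}
```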

  • Different compiler versions
  • It’s interesting to know that GCC 2.95.3 compiles code faster than the version 3 compilers.
    This shouldn’t be the determining factor, because using the newest (or at least a newer) version of GCC is always recommended. Downgrading to an older version should only be necessary when the newer one no longer compiles an application.

Delay optimization as long as possible, and don’t do it at all if you can avoid it. Optimizing too early or too often is not a good approach to engineering. Better to have a program that runs than a fast program that crashes. It’s better to think your concept through twice before starting to program than to correct half of the project afterwards.

2 Comments »

  1. Hey. Good article about increasing build times.

    I think there are different ways to achieve this. Some guy from IO Interactive told me something about to write a bunch of files including all the .h or something like that. But I dont remember exactly how it works.

    Anyway, good article ;)

    Comment by Jose Antonio — September 28, 2011 @ 8:54 am

  2. Very nice article!!

    It’s just missing one little thing ‘Pimpl’.

    Comment by M.S. Babaei — October 31, 2011 @ 11:22 pm

Powered by WordPress, Content and Design by Kai Bellmann