
post Typecasting: Inside the compiler

August 24th, 2010

Filed under: C++ — Kai @ 12:21 pm

I was unsure how Type Casting happens without loss of data inside the compiler.

For example:

 int i = 10;
 UINT k = (UINT) i;

 float fl = 10.123;
 UINT  ufl = (UINT) fl; // data loss here?

 char *p = "bla blub";
 unsigned char *up = (unsigned char *) p;

So I was wondering: how does the compiler handle this kind of typecasting?

Well, first of all I should say that a cast is an explicit request to convert a value of one type to a value of another type. A cast to a non-reference type always produces a new object, which is a temporary returned by the cast operator. Casting to a reference type, however, does not create a new object: the object referenced by the value is reinterpreted as a reference of a different type.

Now to your question. Note that there are two major types of conversions:

  • Promotions: These can be thought of as casting from a possibly narrower type to a wider type. Casting from char to int, short to int, and float to double are all promotions.
  • Conversions: These allow casting from long to int, int to unsigned int and so forth. They can in principle cause loss of information. There are rules for what happens if you assign -1 to an object of unsigned type, for example. In some cases, a wrong conversion can result in undefined behavior. If you assign a double larger than what a float can store to a float, the behavior is not defined.

Let’s have a look at the casts:

int i = 10;
unsigned int k = (unsigned int) i; // :1

float fl = 10.123;
unsigned int  ufl = (unsigned int) fl; // :2

char *p = "bla blub";
unsigned char *up = (unsigned char *) p; // :3

1. This cast causes a conversion to happen. No data is lost, since an unsigned int is guaranteed to be able to store 10. If the integer were negative, the value would wrap around: the result is the value reduced modulo 2^n, where n is the number of bits in an unsigned int, so -1 becomes the maximal value of an unsigned int (see 4.7/2).

2. The value 10.123 is truncated to 10. Here the cast obviously does cause a loss of information. As 10 fits into an unsigned int, the behavior is defined.

3. This actually requires more attention. First, there is a deprecated conversion from a string literal to char*, but let’s ignore that here. More importantly, what happens if you cast to a pointer to an unsigned type? Actually, the result of that is unspecified per 5.2.10/7 (note that the semantics of that cast are the same as using reinterpret_cast in this case, since that is the only C++ cast able to do that):

A pointer to an object can be explicitly converted to a pointer to an object of different type. Except that converting an rvalue of type “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value, the result of such a pointer conversion is unspecified.

So you are only safe to use the pointer after you cast back to char * again.
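The three cases can be tried out with a small sketch like the following. The function names are made up for this illustration, and the range check in the second function is an assumption added to keep the conversion inside the defined range.

```cpp
#include <cassert>
#include <climits>

// Case 1: int -> unsigned int. Non-negative values are preserved;
// negative values are reduced modulo 2^n (n = bits in unsigned int).
unsigned int to_unsigned(int value) {
    return static_cast<unsigned int>(value);
}

// Case 2: float -> unsigned int. The fractional part is discarded.
// The behavior is only defined if the truncated value fits, so this
// sketch guards the input (hypothetical limits, chosen conservatively).
unsigned int truncate_to_unsigned(float value) {
    assert(value >= 0.0f && value < 65535.0f);
    return static_cast<unsigned int>(value);
}

// Case 3: char* -> unsigned char* -> char*. Converting back to the
// original type yields the original pointer value.
bool round_trip_preserves_pointer(const char* p) {
    const unsigned char* up = reinterpret_cast<const unsigned char*>(p);
    const char* back = reinterpret_cast<const char*>(up);
    return back == p;
}
```

Here to_unsigned(-1) yields UINT_MAX and truncate_to_unsigned(10.123f) yields 10.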

post Empty blocks

October 27th, 2008

Filed under: C++ — Kai @ 10:54 am

A tricky loop I’d like to show you.

First of all a quick explanation of the two functions I use.

  • log2 computes the base-2 logarithm of x.
  • ceil rounds up: it returns the smallest integral value that is not less than x.

/* compute ceil(log2(x)); i, n and x are unsigned; x is not 0 */
/* starting from x-1 rounds up; starting from x>>1 would give floor(log2(x)) */
for( i = x-1, n = 0; i != 0; i >>= 1, n++ ) {
}

In C we can have loops that have nothing in their body. You should use braces around an empty body. This allows you to expand the body if necessary. Further it gives us a visual clue that something out-of-the-normal is going on.
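Wrapped into a function, the empty-bodied loop could look like the sketch below. The name ceil_log2 is made up for this example. Note that the loop starts from x - 1 so that the result is rounded up; starting from x >> 1 would count the bits differently and give floor(log2(x)) instead.

```cpp
#include <cassert>

// Returns ceil(log2(x)) for x >= 1 by counting how many times
// x - 1 can be shifted right before it becomes zero.
unsigned int ceil_log2(unsigned int x) {
    assert(x != 0);  // log2(0) is not defined
    unsigned int i, n;
    for (i = x - 1, n = 0; i != 0; i >>= 1, n++) {
        // intentionally empty: all the work happens in the loop header
    }
    return n;
}
```

For example, ceil_log2(8) is 3 and ceil_log2(9) is 4.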

post Making a program survive

October 15th, 2008

Filed under: C++ — Kai @ 7:49 am

VERIFY can be used for things that should never fail, though you may want to make sure you can provide better error recovery if the error can actually cause a crash in a production system.

The C language provides a macro, called assert, that is used to verify conditions that must be true at any point of the program. These include preconditions, postconditions and invariants, all of which are explained in introductory programming courses. Whenever an assertion is violated, the program is abruptly stopped because there is most likely a bug in its code.

Many programmers put ASSERT macros liberally throughout their code. This is usually a good idea. The nice thing about the ASSERT macro is that using it costs you nothing in the release version because the macro has an empty body. Simplistically, you can imagine the definition of the ASSERT macro as being

#ifdef _DEBUG
#define ASSERT(x) if( (x) == 0) report_assert_failure()
#else
#define ASSERT(x)
#endif

(The actual definition is more complex, but the details don’t matter here). This works fine when you are doing something like

ASSERT(whatever != NULL);

which is pretty simple, and omitting the computation of the test from the release version doesn’t hurt. But some people will write things like

ASSERT( (whatever = somefunction() ) != NULL);

which is going to fail utterly in the release version because the assignment is never done, because there is no code generated (we will defer the discussion of embedded assignments being fundamentally evil to some other essay yet to be written).

That’s what VERIFY is for. Imagine the definitions of VERIFY as being

#ifdef _DEBUG
#define VERIFY(x) if( (x) == 0) report_assert_failure()
#else
#define VERIFY(x) (x)
#endif

Note this is a very different definition. What is dropped out in the release version is the if-test, but the code is still executed.

You always have to keep in mind that in the release version of MFC, VERIFY evaluates the expression but does not print or interrupt the program. For example, if the expression is a function call, the call will be made.
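The difference can be made visible with a small sketch. The two macros below are simplified stand-ins for the release-mode definitions (the names SKETCH_RELEASE_ASSERT and SKETCH_RELEASE_VERIFY are made up here and are not the MFC macros), and somefunction is a dummy that counts how often it is called.

```cpp
// Simplified release-mode stand-ins (hypothetical names, not the MFC macros).
#define SKETCH_RELEASE_ASSERT(x) ((void)0)    // expression is dropped entirely
#define SKETCH_RELEASE_VERIFY(x) ((void)(x))  // expression is still evaluated

static int call_count = 0;

// Dummy function with a visible side effect: it counts its calls.
int* somefunction() {
    static int value = 42;
    ++call_count;
    return &value;
}

// With the ASSERT stand-in the embedded assignment never happens.
int calls_with_release_assert() {
    call_count = 0;
    int* whatever = nullptr;
    SKETCH_RELEASE_ASSERT((whatever = somefunction()) != nullptr);
    (void)whatever;     // still nullptr at this point
    return call_count;  // somefunction was never called
}

// With the VERIFY stand-in the call and the assignment are both made.
int calls_with_release_verify() {
    call_count = 0;
    int* whatever = nullptr;
    SKETCH_RELEASE_VERIFY((whatever = somefunction()) != nullptr);
    return (whatever != nullptr) ? call_count : -1;
}
```

calls_with_release_assert() returns 0, while calls_with_release_verify() returns 1.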

Finally, here is a compile-time relative of VERIFY, the verify macro (code lines taken from the tar-1.16 project):

/* Verify requirement R at compile-time, as an integer constant expression.
   Return 1.  */
 
# ifdef __cplusplus
template <int w>
  struct verify_type__ { unsigned int verify_error_if_negative_size__: w; };
#  define verify_true(R) \
     (!!sizeof (verify_type__<(R) ? 1 : -1>))
# else
#  define verify_true(R) \
     (!!sizeof \
      (struct { unsigned int verify_error_if_negative_size__: (R) ? 1 : -1; }))
# endif
 
/* Verify requirement R at compile-time, as a declaration without a
   trailing ';'.  */
 
# define verify(R) extern int (* verify_function__ (void)) [verify_true (R)]
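To see the trick in isolation, here is a reduced sketch of the C++ branch with made-up names (verify_type_sketch, VERIFY_TRUE_SKETCH). A true requirement gives a bit-field of width 1, a false one gives width -1, which the compiler rejects. Since C++11 the language offers static_assert for the same purpose.

```cpp
// Reduced sketch of the C++ branch above (hypothetical names).
template <int w>
struct verify_type_sketch {
    unsigned int verify_error_if_negative_size : w;
};

#define VERIFY_TRUE_SKETCH(R) (!!sizeof(verify_type_sketch<(R) ? 1 : -1>))

// Compiles: the requirement holds, so the bit-field width is 1.
int char_is_one_byte() {
    return VERIFY_TRUE_SKETCH(sizeof(char) == 1);
}

// Would NOT compile: the requirement fails, so the bit-field width is -1.
// int broken() { return VERIFY_TRUE_SKETCH(sizeof(char) == 2); }

// The C++11 way of expressing the same compile-time check.
static_assert(sizeof(char) == 1, "char must be exactly one byte");
```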

post Tricky Floating Point Numbers

September 27th, 2008

Filed under: C++ — Kai @ 6:34 pm

Everyone knows that floating point numbers have finite precision as well as finite range, but this limitation can show up in unexpected ways. For instance, you may find the output of the following lines of code surprising.

float f = 16777216;
cout.precision(10); // the default of 6 significant digits would print 1.67772e+07
cout << f << " " << f+1 << endl;

Against expectations this code prints the value 16777216 twice.

What happened? According to the IEEE specification for floating point arithmetic, a float type is 32 bits wide. Twenty-four bits of precision are available for the significand (what used to be called the mantissa, or also the coefficient): 23 of them are stored and one leading bit is implicit. The remaining bits hold the sign and the exponent. The number 16777216 is 2^24, and so the float variable f has no precision left to represent f+1.
A similar phenomenon happens at 2^53 if f is of type double, because a 64-bit double devotes 53 bits to the significand.

The following code prints 0 rather than 1.

double x = 9007199254740992.0; // 2^53
cout << ((x+1) - x) << endl;

We can also run out of precision when adding small numbers to moderate-sized numbers. For example, the following code prints “Sorry!” because DBL_EPSILON (defined in float.h) is the gap between 1 and the next larger representable double, so adding only half of it to 1 rounds back to 1.

double x = 1.0;
double y = x + 0.5 * DBL_EPSILON;
if (x == y)
    cout << "Sorry!" << endl;

Similarly, the constant FLT_EPSILON is the gap between 1 and the next larger representable value when using float types.
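The effects described in this post can be collected into a few small checks. This is only a sketch: the function names are made up, and it assumes IEEE 754 floats and doubles with the default round-to-nearest mode. The volatile variables are there to make sure every intermediate result is really rounded to its declared type.

```cpp
#include <cfloat>
#include <limits>

// Number of significand bits of float (24 on IEEE 754 platforms).
int float_significand_bits() {
    return std::numeric_limits<float>::digits;
}

// 2^24 + 1 is not representable as a float, so the sum rounds back to 2^24.
bool float_absorbs_one() {
    volatile float f = 16777216.0f;  // 2^24
    volatile float g = f + 1.0f;
    return g == f;
}

// The same happens for double at 2^53.
bool double_absorbs_one() {
    volatile double x = 9007199254740992.0;  // 2^53
    volatile double y = x + 1.0;
    return y == x;
}

// Half of DBL_EPSILON added to 1 rounds back to 1 ...
bool half_epsilon_vanishes() {
    volatile double x = 1.0;
    volatile double y = x + 0.5 * DBL_EPSILON;
    return y == x;
}

// ... while DBL_EPSILON itself reaches the next representable double.
bool full_epsilon_survives() {
    volatile double x = 1.0;
    volatile double y = x + DBL_EPSILON;
    return y != x;
}
```

Under the stated assumptions all four boolean functions return true.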

post Incredible C++ Snippet - How it works

September 24th, 2008

Filed under: C++ — Kai @ 8:00 pm

I think it’s time to disclose the secret of that piece of code. First of all, it’s not as tricky as you might have expected. You just need some basic knowledge about how local variables are laid out in memory.

Most programmers who look at those few lines would probably say that a beginner didn’t pay attention here and that the program will simply crash with an access violation or something similar.

For everybody who didn’t try out the snippet themselves:
If y is declared before x in that line, the loop becomes an endless loop. If they are declared the other way round, the program seems to work correctly.

First of all, I have to clarify that even if we declare y before x in that line, x is allocated first in memory on my compiler: the declaration line is processed from right to left. None of this is guaranteed by the language; the layout of local variables is up to the compiler.
The second thing that is important to know is that the stack grows from top to bottom, that is from higher to lower addresses (as on typical x86 systems).

To explain this better, I changed the order of x and y so that they appear in the order in which they are allocated:

int y,x;
int feld[5];

On the stack, y gets its 4 bytes first, and after that x gets the same. Then, right next to those two, the array gets 5 times 4 bytes (20 bytes).

[Figure: stack layout just after allocation]

After running the loop 5 times (from 0 up to 4), the array is filled with values. Up to here everything works as it should.

[Figure: array filled, everything correct so far]

But the loop wrongly runs one more time (<=5), and that's why the next value is written into the space that was allocated for x (as you can see in the picture below). In fact x gets overwritten...

[Figure: x is overwritten]

The endless loop in the other case has the same cause: there y sits right next to the array, so the extra iteration writes 1 into y every time and the loop starts over.
This is no black magic, but it is not legal C++ either: writing past the end of an array is undefined behavior according to the C++ standard, and the compiler is not required to warn about it. It’s just a bad programmer error that happens often and usually cannot be found in a few minutes.

It should just remind you to be careful with your array bounds.
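One way to catch this kind of mistake is a bounds-checked container. The sketch below is not part of the original snippet: it uses std::array and its at() member, which throws std::out_of_range instead of silently writing into a neighbouring variable. The function name count_in_bounds_writes is made up for this example.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Tries to write 1 into feld[0] .. feld[attempts - 1] and returns
// how many writes actually landed inside the array.
int count_in_bounds_writes(std::size_t attempts) {
    std::array<int, 5> feld{};
    int written = 0;
    for (std::size_t y = 0; y < attempts; ++y) {
        try {
            feld.at(y) = 1;  // at() checks the index before writing
            ++written;
        } catch (const std::out_of_range&) {
            break;           // index 5 and beyond end up here
        }
    }
    return written;
}
```

Asking for 6 writes, as the faulty loop does, results in only 5 successful ones; the sixth is rejected instead of overwriting x or y.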

post Incredible C++ Snippet

September 24th, 2008

Filed under: C++ — Kai @ 4:31 pm

Please have a look at this code snippet and try to tell me what you’re expecting it to print out.

#include <iostream>
using namespace std;
 
int main()
{
	int x,y;
	int feld[5];
 
	x=6;
	for(y=0;y<=5;y++)
	{
		feld[y] = 1;
		cout << y << endl;
	}
	cout << x << endl;
 
	return 0;
}

Note that the array size is 5 and the for loop runs up to and including 5.

If you can’t work out why it behaves the way it does, it might help to know that changing the order of declaration of x and y produces a totally different output.
I will give you the solution to that mystery in a few days…

Have fun ;-)

Powered by WordPress, Content and Design by Kai Bellmann