float 1.1 == double 1.1 ???????
Hi everyone,
#include <stdio.h>

int main(void)
{
    float x = 1.1;
    double y = 1.1;

    if (x == y)
        printf("True\n");
    else
        printf("False\n\n");

    if (((double)1.1) == ((float)1.1))
        printf("True\n");
    else
        printf("False\n");

    return 0;
}
For the first comparison, the compiler says...
>>>warning C4305: 'initializing' : truncation from 'const double' to 'float'<<<
The first evaluates to false, the second to true.
Why?
bye
# 4 Re: float 1.1 == double 1.1 ???????
Computers use binary, and there is no exact binary representation for 1.1, so the computer stores an approximation of it.
As a matter of fact, 1.1 is a non-terminating, repeating fraction when converted to binary.
This is also the reason you should not compare floats or doubles for exact equality. Two values that are mathematically equal are often stored as slightly different approximations.
Regards,
Paul McKenzie
OK. So it should be fine to compare a double with a double, or a float with another float.
Now,
if a double is converted to binary and a float is converted to binary, how do the two differ? Let's say both are 1.1 in decimal at first. Is there a difference in the way they are converted to binary?
Thanks in advance
# 5 Re: float 1.1 == double 1.1 ???????
All right, here's the detailed answer. :)
You've discovered why one should not test floating point numbers for equality in the way your program is attempting to do. There's an article on this issue in the FAQ here (http://www.dev-archive.com/forum/showthread.php?t=323835). The basic idea is that, while integers are stored as their precise value in binary format, a floating point number is usually stored as the best approximation that space allows. Most decimal fractions cannot be represented exactly in binary anyway, as NMTop40 pointed out. To use your example, 1.1 in binary looks like:
1.00011001100110011001100...
That's 1 + (1/16) + (1/32) + (1/256) + (1/512) + ... and so on. You can't exactly represent 1 + (1/10) like this, not with only a finite number of bits to work with.
Here's a modified version of your program, designed to show the binary representations of the floats and doubles that you're working with:
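#include <stdio.h>
#include <string.h>

/* Print the 32 bits of a float as 8 hex digits. */
void ShowFloatHex(const float *f)
{
    unsigned int bits;
    /* Copy the bytes out; reading a float through an unsigned int*
       breaks C's aliasing rules. */
    memcpy(&bits, f, sizeof bits);
    printf("%08X\n", bits);
}

/* Print the 64 bits of a double as 16 hex digits, high word first. */
void ShowDoubleHex(const double *d)
{
    unsigned int words[2];
    memcpy(words, d, sizeof words);
    /* On a little-endian machine words[1] holds the high half.
       %08X keeps leading zeros that plain %X would drop. */
    printf("%08X%08X\n", words[1], words[0]);
}

int main(void)
{
    float fOriginal = 1.1f;
    double dOriginal = 1.1;
    double dNew = (double)fOriginal;   /* float widened to double */
    float fNew = (float)dOriginal;     /* double rounded to float */

    ShowFloatHex(&fOriginal);
    ShowFloatHex(&fNew);
    printf("\n");

    ShowDoubleHex(&dOriginal);
    ShowDoubleHex(&dNew);
    printf("\n");

    if (fOriginal == dOriginal)
        printf("True\n");
    else
        printf("False\n\n");

    if (((double)1.1) == ((float)1.1))
        printf("True\n");
    else
        printf("False\n");

    return 0;
}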
Note that my functions for showing the hex representation of floats and doubles assume that sizeof(float) == 4 and sizeof(unsigned int) == 4, that sizeof(double) == 8, and that we're on a little-endian machine.
Now, fOriginal and dOriginal are your x and y, respectively. The variable fNew is dOriginal converted to a float, and the variable dNew is fOriginal converted to a double. This is done so you can see what happens to these variables when the conversions are made. Running the program produces this:
3F8CCCCD
3F8CCCCD
3FF199999999999A
3FF19999A0000000
The first set of values is the original float, followed by the original double after being converted to a float. The double had greater precision, and that extra precision was sacrificed in order to bring it down to a float, and so the two values are the same.
The second set of values is the original double, followed by the original float after being converted to a double. You can see that the float does not gain extra precision by being cast to a double, because the original float was just an approximation, so the computer has no way to know what it was an approximation of to begin with. Thus the extra bits available for the mantissa are simply set to zero, and the two values are different.
From these results, you can see what happens when you compare a float to a double using the == operator: the float is converted to a double in order to make a direct comparison (because converting the double to a float instead would sacrifice some of the double's precision). This is not guesswork on the compiler's part; C's usual arithmetic conversions require it. That's why you get False for the first result.
As for why you get True for the second result: since you're comparing two constants, my compiler doesn't even bother storing (float)1.1 and (double)1.1 in memory anywhere. It evaluates the whole comparison at compile time, treats the two sides as the same value (even though their binary representations would be different), and hard-codes the condition as true. Don't rely on this. It is a quirk of how this compiler folds constants; a compiler that rounds (float)1.1 to float precision before comparing, as the cast is supposed to do, prints False here, just as in the first test. Here are the relevant lines from the disassembly when I ran this program through the debugger:
34: if(((double)1.1)==((float)1.1))
0040EB8C mov ecx,1
0040EB91 test ecx,ecx
0040EB93 je main+0C4h (0040eba4)
Even if you don't understand assembly language, you should be able to pick up on what's happening here. Instead of comparing two values that are stored in memory, the program loads the value 1 into ECX, then tests ECX against itself to see whether it is zero. It never is, so the jump to the else branch is never taken and the condition is always true. The compiler settled the outcome at compile time; the conversion of (float)1.1 and (double)1.1 to binary, and the subsequent loss of precision, never take place at run time.
I hope that this is clear, but if you have any additional questions, feel free to ask. If you don't understand the way floating point numbers are represented, or have trouble with the binary number system, please see the relevant entries in the FAQ.