Introduction
Software developers deal with two primary kinds of data every day: strings and numbers. When a value is said to be not-a-number, some developers assume that it must be a string. In fact, it is not!
When we deal with algorithms that perform extensive numerical computation on floating point numbers, there are situations that produce a result which cannot be called a number!
This article gives the reader a brief overview of the numerical concepts NaN, IND, INF and DEN, using C++ code samples.
The downloadable sample project was compiled with VS 2012. You can compile it in any version of Visual C++ starting from VC 6.0.
Background
I was involved in a project which required porting some Matlab image processing algorithms to C++. The Matlab algorithms were computationally intensive. They had to deal with many floating point operations, each of which was repeated several thousand times until a specific condition was met.
One of the most challenging requirements of the project was to produce floating point output in C++ that exactly matches the output of Matlab. A mismatch in even one digit of the fractional part will produce an output that is quite different from the Matlab output.
During verification of the ported C++ code, it was found that at some point, certain double variables were holding strange numbers such as "1.#QNAN00000000000" and "-1.#IND000000000000". How did it happen?
Matlab has a rich set of quite easy-to-use utility functions and operators. One such statement is given below.
Pixels(isnan(Pixels)) = 0 ;
The above Matlab statement finds every position in the array named Pixels whose value is not-a-number and assigns 0 to it.
Since the C++ output had to be produced in haste, one such statement was missed during the porting from Matlab to C++, and the result was obvious: I was confronted with "strange" numbers such as "1.#QNAN00000000000".
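For comparison, the Matlab statement above could be ported to C++ roughly as follows. This is only a sketch, assuming a C++11 compiler and that the pixels are stored in a std::vector<double>; the function name ReplaceNaNWithZero is hypothetical and is not taken from the original project.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper: replaces every NaN in the buffer with 0.0,
// mirroring Pixels(isnan(Pixels)) = 0 in Matlab.
void ReplaceNaNWithZero(std::vector<double>& pixels)
{
    std::replace_if(pixels.begin(), pixels.end(),
                    [](double value) { return std::isnan(value); },
                    0.0);
}
```

Calling ReplaceNaNWithZero(pixels) after each processing step that may produce a NaN keeps the invalid values from spreading into later calculations.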
1. Concept of NaN
NaN means Not A Number.
When a computer performs extensive numerical calculations, the result of an operation may be such that it cannot be treated as a number!
As an example, consider the code below.
double dSQRTValue = sqrt( -1.00 );
double dResult = -dSQRTValue;
Here the variable dResult will hold a NaN. So a NaN represents a numeric quantity that cannot be treated as a valid quantity.
How can such a value be represented? Usually we designate 0 or -1 to mark an invalid entry in a float or double variable. That idea will not work here because 0 and -1 are valid numbers.
This is where the NaN representation becomes important.
A. Representation of NaN
I would like to present both the non-standard and the standard representation of NaN. The non-standard representation is given purely out of academic interest.
I. Non Standard Representation
Define an array of size 2 of type unsigned long. This assumes a 32-bit long and a little-endian machine, as is the case on Windows.
const unsigned long lnNAN[2] = {0x00000000, 0x7ff80000};
Now, cast it to a double value!
const double NOT_A_NUMBER = *( double* )lnNAN;
Now, the constant variable NOT_A_NUMBER contains a NaN.
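The pointer cast above depends on the size of long and on byte order. A slightly more portable way to build the same value, sketched here under the assumption that double is a 64-bit IEEE 754 type, is to copy a single 64-bit pattern with memcpy. The names DoubleFromBits and NOT_A_NUMBER_BITS are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: builds a double from a raw 64-bit pattern.
// Assumes double is a 64-bit IEEE 754 type, as it is on Visual C++.
double DoubleFromBits(std::uint64_t bits)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "double must be 64 bits wide");
    double value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}

// Same bit pattern as lnNAN above, written as one 64-bit constant.
const double NOT_A_NUMBER_BITS = DoubleFromBits(0x7FF8000000000000ULL);
```

Like the array version, this is for illustration only; the standard representation in the next section is the one to use in real code.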
II. Standard Representation
The <limits> header file defines a template class named numeric_limits, which has the following function for getting a NaN.
const double STD_NOT_A_NUMBERD = std::numeric_limits<double>::quiet_NaN();
B. What Does a NaN Look Like?
In the MS Visual C++ debugger, a NaN is displayed as 1.#QNAN00000000000. We will get the same string representation with functions such as sprintf() and stream classes such as stringstream.
C. Comparison of NaN
Here again, I would like to give both the non-standard and the standard way of comparison. As said above, the non-standard way is given just for academic purposes.
I. Non Standard Comparison
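// dQNan is the double variable being tested.
// Only a value with exactly the same bit pattern as NOT_A_NUMBER matches.
bool bNaN = false;
if( 0 == memcmp( &NOT_A_NUMBER, &dQNan, sizeof(double)))
{
    bNaN = true;
}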
II. Standard Comparison
The header file <float.h> contains the declaration of the function _isnan() for checking whether a number is NaN or not.
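On compilers that support C++11, the same check is also available in a portable form as std::isnan() from <cmath>. The wrapper below is a minimal sketch; the name IsNotANumber is hypothetical.

```cpp
#include <cmath>

// Hypothetical wrapper around the C++11 standard check.
// Returns true for any NaN, whatever its exact bit pattern.
bool IsNotANumber(double value)
{
    return std::isnan(value);
}
```

Unlike the memcmp() approach, this does not depend on one particular bit pattern.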
D. Properties of NaN
In this section, certain properties of NaN in numerical operations are explained. This is not a complete set of such properties.
I. Equality Check Returns False
A NaN has the property that a comparison for equality always returns false, even when the NaN is compared with itself.
if( dResult == dResult )
{
    // Never reached when dResult holds a NaN.
    int nNumber = 0;
}
II. Any Calculation with a NaN Returns a NaN
Suppose the variable dResult contains a NaN.
dResult += 1234;
After the above operation, the variable dResult will still hold a NaN.
2. Concept of IND
IND means Indeterminate Number.
An IND is closely related to a NaN; as shown later, Visual C++ in fact treats it as one. There are situations in computation whose result cannot be determined by the FPU (Floating Point Unit). In such cases the result is set to an indeterminate number.
Consider the code below.
double dInfinity = std::numeric_limits<double>::infinity();
double dIND = ( dInfinity / dInfinity );
After the division operation, the variable dIND will hold an IND. Another example is given below.
double dZero = 0.00;
double dIND1 = ( dZero / dZero );
After the division operation, the variable dIND1 will hold an IND.
These examples are given only for demonstration. There can be other situations in which an arithmetic expression produces an indeterminate value.
A. Representation of IND
I. Non Standard Representation
Define an unsigned long array of size 2.
const unsigned long lnIND[2] = {0x00000000, 0xfff80000};
Now, cast it to a double value.
const double AN_INDETERMINATE = *( double* )lnIND;
Please note that the constant lnIND contains a different value from the corresponding NaN representation: the sign bit is set.
II. Standard Representation
I couldn't find any function that provides the standard representation of an IND number. This may be due to the fact that C++ (Microsoft) treats an IND as a NaN. This point is evident from the fact that the function _isnan() returns true (a non-zero value) when an IND is given as input.
B. What Does an IND Look Like?
In the debugger, an IND is displayed as -1.#IND000000000000. We will get the same string representation with functions such as sprintf() and stream classes such as stringstream. There can be both negative and positive representations of an IND value. String representations such as 1.#IND000000000000 are specific to Windows/Microsoft.
The concept and the internal representation (i.e. the IEEE floating point format) are the same across platforms and environments, but the user-level keyword/string will differ.
C. Comparison of IND
I. Non Standard Method
bool bIND = false;
if( 0 == memcmp( &AN_INDETERMINATE, &dIND, sizeof(double)))
{
bIND = true;
}
II. Standard Method
So far, I could not find any standard function.
One tricky solution (on the Windows platform) is to take the string representation of the double value and then check for the presence of the substring '#IND'.
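An alternative that avoids string formatting is to compare the raw bits of the value with the IND pattern used in the non-standard representation above. This is only a sketch, assuming a 64-bit IEEE 754 double; the names BitsOf and IsIndefinite are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the raw 64-bit pattern of a double.
std::uint64_t BitsOf(double value)
{
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}

// True only for the exact pattern used by lnIND above
// (sign bit set, all exponent bits set, top mantissa bit set).
bool IsIndefinite(double value)
{
    return BitsOf(value) == 0xFFF8000000000000ULL;
}
```

Just like the memcmp() method, this recognizes only that single bit pattern, so it is no more standard than the other approaches in this section.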
D. Properties of IND
I. Equality Check Returns False
An IND has the important property that a comparison for equality always returns false. Consider the code below.
if( dIND == dIND )
{
    // Never reached when dIND holds an IND.
    int a = 0;
}
II. Any Calculation with an IND Returns an IND or NaN
dIND += 1234;
dIND += -dIND;
3. Concept of INF
INF means Infinity.
An arithmetic operation results in an infinity when the magnitude of the result is too large to be held in the corresponding data type. The result is then said to have overflowed, and it is marked as INF. Dividing a non-zero number by zero also produces an INF.
As an example, consider the code below.
double dZero = 0.00;
double dINF = 1/dZero;
Here the variable dINF will hold an infinity.
These examples are given just for understanding. There can be other situations in which the result of an expression produces an INF value.
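The overflow case can be sketched as follows: doubling the largest finite double leaves no room for the result, so it becomes an INF. The function name OverflowToInfinity is hypothetical, and the volatile qualifier is there only to keep the compiler from evaluating the expression at compile time.

```cpp
#include <limits>

// Hypothetical demo: the product is too large for a double,
// so the FPU returns positive infinity.
double OverflowToInfinity()
{
    volatile double largest = std::numeric_limits<double>::max();
    return largest * 2.0;
}
```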
A. Representation of INF
I. Non Standard Representation
Define an unsigned long array of size 2.
const unsigned long lnINF[2] = {0x00000000, 0x7ff00000};
Now, cast it to a double value.
const double AN_INFINITY_POSITIVE = *( double* )lnINF;
II. Standard Representation
The <limits> header file defines the following function for getting an INF value.
const double STD_AN_INFINITY_POSITIVE = std::numeric_limits<double>::infinity();
There are both positive and negative infinities; the above function returns a positive infinity. Negative infinity can be obtained as below.
const double AN_INFINITY_NEGATIVE = -AN_INFINITY_POSITIVE;
B. What Does an INF Look Like?
In the debugger, a positive INF is displayed as 1.#INF000000000000. We will get the same string representation with functions such as sprintf() and stream classes such as stringstream. String representations such as 1.#INF000000000000 are specific to Windows/Microsoft.
The concept and the internal representation (i.e. the IEEE floating point format) are the same across platforms and environments, but the user-level keyword/string will differ.
C. Comparison of INF
I. Non Standard Method
bool bINF = false;
if( 0 == memcmp( &AN_INFINITY_POSITIVE, &dINF, sizeof(double)) ||
0 == memcmp( &AN_INFINITY_NEGATIVE, &dINF, sizeof(double)))
{
bINF = true;
}
II. Standard Method
The <float.h> header file declares the function _finite(), which returns zero when a number is an INF (or a NaN) and a non-zero value otherwise. There are other standard methods too.
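One such method, on compilers that support C++11, is the pair std::isinf() and std::isfinite() from <cmath>. The wrappers below are a minimal sketch with hypothetical names.

```cpp
#include <cmath>

// Hypothetical wrapper: true for +INF and -INF only.
bool IsInfinite(double value)
{
    return std::isinf(value);
}

// Hypothetical wrapper: false for INF and for NaN, true otherwise.
bool IsOrdinaryNumber(double value)
{
    return std::isfinite(value);
}
```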
D. Properties of INF
I. Equality Check Returns True
An INF has the property that a comparison for equality with an INF of the same sign always returns true. That is:
if( dINF == dINF )
{
int a = 0;
}
if( -dINF == -dINF )
{
int a = 0;
}
II. Some Calculations with an INF Return an IND or NaN
dINF += -dINF;
dINF += NOT_A_NUMBER;
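Not every calculation with an INF ends in an IND or NaN, though. The sketch below contrasts a case where the INF survives with two cases where the result is undefined. The struct and function names are hypothetical, and volatile is used only to force the arithmetic to happen at run time.

```cpp
#include <limits>

// Hypothetical container for the three results.
struct InfinityResults
{
    double plusFinite;   // INF + 1234
    double minusItself;  // INF - INF
    double timesZero;    // INF * 0
};

InfinityResults ComputeWithInfinity()
{
    volatile double inf = std::numeric_limits<double>::infinity();
    volatile double zero = 0.0;

    InfinityResults results;
    results.plusFinite = inf + 1234.0;  // still INF
    results.minusItself = inf - inf;    // undefined: a NaN (typically shown as IND on Visual C++)
    results.timesZero = inf * zero;     // undefined: a NaN
    return results;
}
```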
4. Concept of DEN
DEN means Denormalized. It is also known as Subnormal.
All of us know that there are infinitely many rational numbers between 0 and 1. Have you ever thought about how many of them a computer can store?
Since a computer is a finite machine, there are limitations on the floating point numbers it can represent.
We know that the float and double data types use the IEEE 754 floating point representation. Besides a sign bit, this representation has two parts: the mantissa and the exponent.
Suppose an arithmetic operation results in a number that is very close to zero but NOT zero. If its magnitude is below the smallest normalized value (about 2.2E-308 for a double), it cannot be stored in the normal form. In this case, the number is stored as a denormalized number, with fewer significant digits.
As an example, consider the code below.
double dDenTest = 0.01E-305;
dDenTest /= 10;
A. Representation of DEN
I. Non Standard Representation
Define an unsigned long array of size 2.
const unsigned long lnDEN[2] = {0x00000001, 0x00000000};
Now, cast it to a double value.
const double A_DENORMAL = *( double* )lnDEN;
II. Standard Representation
The <limits> header file defines the following function for getting a DEN value.
double dDEN = std::numeric_limits<double>::denorm_min();
B. What Does a DEN Look Like?
[Figure: a DEN value as displayed in the VS debugger.] We will get the same string representation with functions such as sprintf() and stream classes such as stringstream. The string representation is specific to Windows/Microsoft.
The concept and the internal representation (i.e. the IEEE 754 floating point format) are the same across platforms and environments, but the user-level keyword/string will differ.
C. Comparison of DEN
I. Non Standard Method
bool bDEN = false;
if( 0 == memcmp( &A_DENORMAL, &dDEN, sizeof(double)))
{
bDEN = true;
}
II. Standard Method
// Any non-zero value smaller than the smallest normalized double is a DEN.
if ( dDEN != 0 && fabs ( dDEN ) < std::numeric_limits<double>::min())
{
bDEN = true;
}
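On compilers that support C++11, the classification can also be left to std::fpclassify() from <cmath>, which reports denormalized values as FP_SUBNORMAL. A minimal sketch, with the hypothetical name IsDenormal:

```cpp
#include <cmath>

// Hypothetical wrapper: true only for non-zero values that are
// too small to be stored in normalized form.
bool IsDenormal(double value)
{
    return std::fpclassify(value) == FP_SUBNORMAL;
}
```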
D. Properties of DEN
I. Equality Check is Same as Numeric Comparison
if( dDEN == dDEN )
{
int a = 0;
}
II. Any Calculation with a DEN is Same as Normal Calculation
dDenTest = 0.01E-305;
dDenTest /= 10;
dDenTest *= 10;
Points of Interest
All the non-standard representations of NaN, IND, INF and DEN are given purely out of academic interest. They simply show how these values can be represented in memory. Note that these are NOT the only ways of representing a NaN, IND, INF or DEN in memory. There can be other representations. For more information, refer to the IEEE floating point representation.
NaN, IND, INF and DEN are all graceful ways of telling the user that something has gone out of bounds. They give the floating point unit (FPU), otherwise called the math coprocessor, a way out when it can no longer represent a floating point value.
Some useful references are given below.