IEEE floating-point exceptions in C++

2013-07-25 09:23
This page will answer the following questions.

My program just printed out 1.#IND or 1.#INF (on Windows) or nan or inf (on Linux). What happened?
How can I tell if a number is really a number and not a NaN or an infinity?
How can I find out more details at runtime about kinds of NaNs and infinities?
Do you have any sample code to show how this works?
Where can I learn more?

These questions have to do with floating point exceptions. If you get some strange non-numeric output where you're expecting a number, you've either exceeded the finite limits of floating point arithmetic or you've asked for some result that is undefined.

To keep things simple, I'll stick to working with the double floating point type. Similar remarks hold for float types.

Debugging 1.#IND, 1.#INF, nan, and inf

If your operation would generate a larger positive number than could be stored in a double, the operation will return 1.#INF on Windows or inf on Linux. Similarly your code will return -1.#INF or -inf if the result would be a negative number too large to store in a double. Dividing a positive number by zero produces a positive infinity and dividing a negative number by zero produces a negative infinity. Example code at the end of this page will demonstrate some operations that produce infinities.

Some operations don't make mathematical sense, such as taking the square root of a negative number. (Yes, this operation makes sense in the context of complex numbers, but a double represents a real number and so there is no double to represent the result.) The same is true for logarithms of negative numbers. Both sqrt(-1.0) and log(-1.0) would return a NaN, the generic term for a "number" that is "not a number". Windows displays a NaN as -1.#IND ("IND" for "indeterminate") while Linux displays nan. Other operations that would return a NaN include 0/0, 0*∞, and ∞/∞. See the sample code below for examples.
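The cases described above can also be told apart at runtime through the IEEE exception flags exposed by the C++11 header <cfenv>. The following is only a minimal sketch: RaisedFlags, Divide, Multiply, and SquareRoot are hypothetical helper names, and the sketch assumes the platform actually maintains the flags and that the compiler is not run with options such as -ffast-math. The volatile variables are there to keep the arithmetic from being evaluated at compile time.

```cpp
#include <cfenv>
#include <cfloat>
#include <cmath>
#include <string>

// Hypothetical helper: performs op(a, b) at runtime and reports which
// IEEE exception flags the operation raised.
std::string RaisedFlags(double (*op)(double, double), double a, double b)
{
    std::feclearexcept(FE_ALL_EXCEPT);   // start from a clean slate

    // volatile forces the operands to be read, and the result to be
    // stored, at runtime between the clear and the tests below.
    volatile double x = a;
    volatile double y = b;
    volatile double result = op(x, y);
    (void)result;

    std::string s;
    if (std::fetestexcept(FE_DIVBYZERO)) s += "divide-by-zero ";
    if (std::fetestexcept(FE_INVALID))   s += "invalid ";
    if (std::fetestexcept(FE_OVERFLOW))  s += "overflow ";
    return s;
}

// Hypothetical operations to feed to RaisedFlags.
double Divide(double a, double b)   { return a / b; }
double Multiply(double a, double b) { return a * b; }
double SquareRoot(double a, double) { return std::sqrt(a); }
```

Under those assumptions, RaisedFlags(Divide, 1.0, 0.0) should report divide-by-zero, RaisedFlags(Multiply, DBL_MAX, 2.0) should report overflow, and RaisedFlags(Divide, 0.0, 0.0) or RaisedFlags(SquareRoot, -1.0, 0.0) should report invalid.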

In short, if you get 1.#INF or inf, look for overflow or division by zero. If you get 1.#IND or nan, look for illegal operations. Maybe you simply have a bug. If it's more subtle and you have something that is difficult to compute, see Avoiding Overflow, Underflow, and Loss of Precision. That article gives tricks for computing results that have intermediate steps overflow if computed directly.
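As one illustration of an intermediate step overflowing even though the final answer fits in a double, consider the length of the vector (x, y). This example is not taken from the article mentioned above; it is a common textbook rescaling, shown here as a sketch. NaiveNorm and ScaledNorm are hypothetical names, and the sketch does not try to handle NaN or infinite inputs.

```cpp
#include <cmath>

// Direct formula: x * x overflows to infinity once |x| exceeds roughly
// 1.3e154, even though the true length may be representable.
double NaiveNorm(double x, double y)
{
    return std::sqrt(x * x + y * y);
}

// Rescaled formula: factor out the larger magnitude so that the value
// being squared is at most 1.
double ScaledNorm(double x, double y)
{
    const double a = std::fabs(x);
    const double b = std::fabs(y);
    const double hi = (a > b) ? a : b;
    const double lo = (a > b) ? b : a;
    if (hi == 0.0)
        return 0.0;               // both inputs are zero
    const double r = lo / hi;     // 0 <= r <= 1
    return hi * std::sqrt(1.0 + r * r);
}
```

With x = y = 1e200, NaiveNorm returns an infinity while ScaledNorm returns a finite value near 1.414e200. The C++11 standard library function std::hypot is meant for the same job and is probably the better choice in real code.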

Testing for NaNs and infinities

Next suppose you want to test whether a number is an infinity or a NaN. For example, you may want to write to a log file or print a debug message when a numerical result goes bad, or you may want to execute some sort of alternate logic in your code. There are simple, portable ways to get summary information and more complicated, less portable ways to get more information.

First, the simple solution. If you want to test whether a double variable contains a valid number, you can check whether x == x. This looks like it should always be true, but it's not! Ordinary numbers always equal themselves, but NaNs do not. I've used this trick on Windows, Linux, and Mac OSX. If you ever use this trick, put big bold comments around your code so that some well-meaning person won't come behind you and delete what he or she thinks is useless code. Better yet, put the test in a well-documented function in a library that has controlled access. The following function will test whether x is a (possibly infinite) number.

bool IsNumber(double x)
{
// This looks like it should always be true,
// but it's false if x is a NaN.
return (x == x);
}
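For comparison, here is a small sketch that puts the self-comparison test next to std::isnan, which C++11 provides in <cmath>. The function names IsNumberSelfCompare and IsNumberStd are hypothetical and chosen only for this illustration. One caveat, stated as an assumption about typical compilers rather than a guarantee: aggressive options such as -ffast-math allow the optimizer to assume NaNs never occur, in which case either test may be optimized away.

```cpp
#include <cmath>

// Same idea as IsNumber in the text: a NaN is the only value that
// compares unequal to itself.
bool IsNumberSelfCompare(double x)
{
    return x == x;
}

// Alternative using the standard library (C++11 and later).
bool IsNumberStd(double x)
{
    return !std::isnan(x);
}
```

Both functions should return false for a NaN and true for ordinary numbers and for infinities.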


To test whether a variable contains a finite number (i.e. not a NaN and not an infinity), you can use code like the following.

bool IsFiniteNumber(double x)
{
return (x <= DBL_MAX && x >= -DBL_MAX);
}


Here DBL_MAX is a constant defined in float.h as the largest double that can be represented. Comparisons with NaNs always fail, even when comparing to themselves, and so the test above will fail for a NaN. If x is not a NaN but is infinite, one of the two tests will fail depending on whether it is a positive infinity or negative infinity.
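The reasoning above can be checked against std::isfinite, which C++11 provides in <cmath>. The sketch below repeats the range test under the hypothetical name IsFiniteByRange so that the block stands alone, and adds a hypothetical helper AgreesWithStd that reports whether the two approaches give the same answer for a given value.

```cpp
#include <cfloat>
#include <cmath>

// Range test from the text: fails for NaN (every comparison with a NaN
// is false) and for either infinity (one of the two bounds is violated).
bool IsFiniteByRange(double x)
{
    return x <= DBL_MAX && x >= -DBL_MAX;
}

// Hypothetical check: does the range test agree with std::isfinite?
bool AgreesWithStd(double x)
{
    return IsFiniteByRange(x) == std::isfinite(x);
}
```

The two tests should agree on NaNs, on both infinities, and on boundary values such as DBL_MAX and -DBL_MAX.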

Getting more information programmatically

To get more detail about the type of a floating point number, there is a function _fpclass on Windows and a corresponding function fp_class_d on Linux. I have not been able to get the corresponding Linux code to work and so I'll stick to what I've tested and just talk about Windows from here on out.

The Windows function _fpclass returns one of the following values:

_FPCLASS_SNAN   // signaling NaN
_FPCLASS_QNAN   // quiet NaN
_FPCLASS_NINF   // negative infinity
_FPCLASS_NN     // negative normal
_FPCLASS_ND     // negative denormal
_FPCLASS_NZ     // -0
_FPCLASS_PZ     // +0
_FPCLASS_PD     // positive denormal
_FPCLASS_PN     // positive normal
_FPCLASS_PINF   // positive infinity
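On compilers that support C++11, most of these categories can be approximated portably with std::fpclassify and std::signbit from <cmath>. The sketch below is an assumption-laden stand-in, not a drop-in replacement for _fpclass: PortableFPClass is a hypothetical name, and std::fpclassify does not distinguish signaling NaNs from quiet NaNs, so both are reported simply as "NaN".

```cpp
#include <cmath>
#include <string>

// Hypothetical portable classifier built on the standard library.
// FP_NAN, FP_INFINITE, FP_ZERO, FP_SUBNORMAL and FP_NORMAL are the
// categories defined by <cmath>; the sign comes from std::signbit.
std::string PortableFPClass(double x)
{
    const bool neg = std::signbit(x);
    switch (std::fpclassify(x))
    {
    case FP_NAN:       return "NaN";
    case FP_INFINITE:  return neg ? "Negative infinity" : "Positive infinity";
    case FP_ZERO:      return neg ? "Negative zero"     : "Positive zero";
    case FP_SUBNORMAL: return neg ? "Negative denormal" : "Positive denormal";
    case FP_NORMAL:    return neg ? "Negative normal"   : "Positive normal";
    default:           return "Unknown";   // implementation-defined category
    }
}
```

For example, PortableFPClass(-0.0) should give "Negative zero", and a value such as 1e-310, which lies below the smallest normalized double, should give "Positive denormal".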


The following code illustrates which kinds of operations result in which kinds of numbers. To port this code to Linux, the FPClass function would need to use fp_class_d and its corresponding constants.

#include <cfloat>
#include <cmath>
#include <iostream>
#include <sstream>
#include <string>

using namespace std;

string FPClass(double x)
{
int i = _fpclass(x);
string s;
switch (i)
{
case _FPCLASS_SNAN: s = "Signaling NaN";                break;
case _FPCLASS_QNAN: s = "Quiet NaN";                    break;
case _FPCLASS_NINF: s = "Negative infinity (-INF)";     break;
case _FPCLASS_NN:   s = "Negative normalized non-zero"; break;
case _FPCLASS_ND:   s = "Negative denormalized";        break;
case _FPCLASS_NZ:   s = "Negative zero (-0)";           break;
case _FPCLASS_PZ:   s = "Positive 0 (+0)";              break;
case _FPCLASS_PD:   s = "Positive denormalized";        break;
case _FPCLASS_PN:   s = "Positive normalized non-zero"; break;
case _FPCLASS_PINF: s = "Positive infinity (+INF)";     break;
}
return s;
}

string HexDump(double x)
{
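// Views the 8 bytes of x as two 32-bit words. This assumes that
// unsigned long is 32 bits wide, which holds on Windows.
// On a little-endian machine the low word is printed first.
unsigned long* pu;
pu = (unsigned long*)&x;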
ostringstream os;
os << hex << pu[0] << " " << pu[1];
return os.str();
}

// ----------------------------------------------------------------------------
int main()
{
double x, y, z;

cout << "Testing z = 1/0\n";
// cannot set x = 1/0 directly or would produce compile error.
x = 1.0; y = 0; z = x/y;
cout << "z = " << x/y << "\n";
cout << HexDump(z) << " _fpclass(z) = " << FPClass(z) << "\n";

cout << "\nTesting z = -1/0\n";
x = -1.0; y = 0; z = x/y;
cout << "z = " << x/y << "\n";
cout << HexDump(z) << " _fpclass(z) = " << FPClass(z) << "\n";

cout << "\nTesting z = sqrt(-1)\n";
x = -1.0;
z = sqrt(x);
cout << "z = " << z << "\n";
cout << HexDump(z) << " _fpclass(z) = " << FPClass(z) << "\n";

cout << "\nTesting z = log(-1)\n";
x = -1.0;
z = log(x);
cout << "z = " << z << "\n";
cout << HexDump(z) << " _fpclass(z) = " << FPClass(z) << "\n";

cout << "\nTesting overflow\n";
z = DBL_MAX;
cout << "z = DBL_MAX = " << z;
z *= 2.0;
cout << "; 2z = " << z << "\n";
cout << HexDump(z) << " _fpclass(z) = " << FPClass(z) << "\n";

cout << "\nTesting denormalized underflow\n";
z = DBL_MIN;
cout << "z = DBL_MIN = " << z << "\n";
cout << HexDump(z) << " _fpclass(z) = " << FPClass(z) << "\n";
z /= pow(2.0, 52);
cout << "z = DBL_MIN / 2^52= " << z << "\n";
cout << HexDump(z) << " _fpclass(z) = " << FPClass(z) << "\n";
z /= 2;
cout << "z = DBL_MIN / 2^53= " << z << "\n";
cout << HexDump(z) << " _fpclass(z) = " << FPClass(z) << "\n";

cout << "\nTesting z = +infinity + -infinity\n";
x = 1.0; y = 0.0; x /= y; y = -x;
z = x + y; // the sum was never assigned to z, so a stale value was printed
cout << x << " + " << y << " = " << z << "\n";
cout << HexDump(z) << " _fpclass(z) = " << FPClass(z) << "\n";

cout << "\nTesting z = 0 * infinity\n";
x = 1.0; y = 0.0; x /= y; z = 0.0*x;
cout << "x = " << x << "; z = 0*x = " << z << "\n";
cout << HexDump(z) << " _fpclass(z) = " << FPClass(z) << "\n";

cout << "\nTesting 0/0\n";
x = 0.0; y = 0.0; z = x/y;
cout << "z = 0/0 = " << z << "\n";
cout << HexDump(z) << " _fpclass(z) = " << FPClass(z) << "\n";

cout << "\nTesting z = infinity/infinity\n";
x = 1.0; y = 0.0; x /= y; y = x; z = x/y;
cout << "x = " << x << "; z = x/x = " << z << "\n";
cout << HexDump(z) << " _fpclass(z) = " << FPClass(z) << "\n";

cout << "\nTesting x fmod 0\n";
x = 1.0; y = 0.0; z = fmod(x, y);
cout << "fmod(" << x << ", " << y << ") = " << z << "\n";
cout << HexDump(z) << " _fpclass(z) = " << FPClass(z) << "\n";

cout << "\nTesting infinity fmod x\n";
y = 1.0; x = 0.0; y /= x; z = fmod(y, x);
cout << "fmod(" << y << ", " << x << ") = " << z << "\n";
cout << HexDump(z) << " _fpclass(z) = " << FPClass(z) << "\n";

cout << "\nGetting cout to print QNAN\n";
unsigned long nan[2]={0xffffffff, 0x7fffffff};
z = *( double* )nan;
cout << "z = " << z << "\n";
cout << HexDump(z) << " _fpclass(z) = " << FPClass(z) << "\n";

return 0;
}
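The HexDump function and the QNAN construction above rely on unsigned long being 32 bits wide and on pointer casts between unrelated types. A more portable sketch, assuming only that double is a 64-bit IEEE 754 type, copies the bits with std::memcpy into a std::uint64_t. All four function names below are hypothetical.

```cpp
#include <cstdint>
#include <cstring>
#include <iomanip>
#include <sstream>
#include <string>

// Copies the raw bits of a double into a 64-bit integer.
std::uint64_t BitsOf(double x)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch expects a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// Inverse of BitsOf: builds a double from a raw 64-bit pattern.
double FromBits(std::uint64_t bits)
{
    double x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}

// Prints all 64 bits as 16 hex digits, most significant digit first.
std::string PortableHexDump(double x)
{
    std::ostringstream os;
    os << std::hex << std::setfill('0') << std::setw(16) << BitsOf(x);
    return os.str();
}

// In the IEEE 754 layout, an 11-bit exponent field of all ones marks
// an infinity or a NaN.
bool HasAllOnesExponent(double x)
{
    return ((BitsOf(x) >> 52) & 0x7ff) == 0x7ff;
}
```

With this layout, PortableHexDump(1.0) gives 3ff0000000000000, and FromBits(0x7ff0000000000000) is positive infinity.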


To learn more

For a brief explanation of numerical limits and how floating point numbers are laid out in memory, see Anatomy of a floating point number.

For much more detail regarding exceptions and IEEE arithmetic in general, see What every computer scientist should know about floating-point arithmetic.

From John D. Cook's blog