Bug in Excel's (not VBA's) MOD function

#21 October 8th, 2003, 11:09 PM

In message , Jerry W. Lewis
writes
I got it by experimentation, and then saw that Arvi had also determined
the same limit. I find it interesting that MS makes no attempt to
explain such an unusual limit.

The mantissa of a long floating point number has 54 bits significance
and 2^27 is the value with exactly half that number of bits.

If you choose a particularly stupid and naive way of computing x MOD y
then things can go wrong when the denominator *and* quotient exceed the
limit 2^27. However, they only tested the quotient value x/y 2^27.

The dodgy way to do it in floating point is frac(x/y)*y

But no one in their right mind would ever implement mod this way. And
anyway most high level languages have a correct mod library function.

An additional unusual limit that applies, is that MOD returns #NUM!
regardless of the quotient if the first argument exceeds
2.68873542664192E14 = 2^20+2^19+2^18+2^17+2^15+2^12+2^8+2^6
which is within the range of exact DP representation of whole numbers
by more than an order of magnitude.

???? I don't understand. 2.688E14 is a shade under 2^48

And I don't see any such odd limit in XL2k. It appears to work more or
less OK here for larger values up to around 2^54 of the numerator x and
denominator y. (provided that x/y 2^27)

I'd believe it does go wrong for some specific large values though. It's
always hard to predict the behaviour of flawed algorithms
experimentally.

Jerry

Dana DeLouis wrote:

Did you get 134217728 from the following Microsoft kb article?
XL: MOD() Function Returns #NUM! Error Value
http://support.microsoft.com/default...b;en-us;119083

Since they have documented it I guess we can expect a fix in about 2014.
(based on the latency time for the recent fixes to the statistics bugs)

Regards,
--
Martin Brown

#22 October 14th, 2003, 02:45 PM

"J.E. McGimpsey" wrote in message ...
In article ,
"Harlan*Grove" wrote:

"J.E. McGimpsey" wrote...
I get #NUM! in XL01 and XLv.X.

So will Microsoft consider this a bug or a feature?

Given their big "mea culpa" on statistical functions,

http://support.microsoft.com/?kbid=828888

I'm hopeful.

Sadly, I'm less hopeful. It is undoubtedly a step in the right
direction but not everything in
http://support.microsoft.com/?kbid=828888 is correct, nor are all the
improvements discussed in it always improvements!

For example, it contains the following paragraph.

"In summary, users of earlier versions of Excel should not be troubled
in practice by numeric deficiencies because input values that cause
#NUM! results are extreme. Additionally, you can be assured that if
you receive a result other than an error message, the result is
accurate. Round off error is not a problem here."

Unfortunately this is not correct. A counter-example is
=NEGBINOMDIST(512,512,0.5) which gives 0 when the answer should be
approx 0.0125. Note the parameters are not "big" nor is the answer
"small" by anyone's standards (i.e. there can be no "extreme case"
excuses).

Additionally in the section "Continuous Distribution Inverse
Functions" the suggestion is that the only real problem was the lack
of a good binary search process. In fact, in EXCEL 2000, neither
gammadist nor betadist works even for moderately large parameters,
which is a pretty big hindrance for function inversion!
=GAMMADIST(7000,7000,1,TRUE) gives #NUM!,
=BETADIST(0.5,30000,30000,0,1) also gives #NUM!

I do not wish to unduly criticise the new code added for
BINOMDIST,HYPGEOMDIST & POISSON. In most cases it is an improvement,
albeit it will be very slow when large parameter values are supplied
to the functions. However, for cases such as =POISSON(126,230,TRUE)
the relative error will go from 3e-14 to 0.5. Worse still, cases such
as =POISSON(125,230,TRUE) and =POISSON(125,230,FALSE) will both return
the value 0 which is completely inaccurate. Admittedly, the values are
small in these cases, =POISSON(126,230,TRUE) should deliver approx
5e-14. However, unbelievable though it may seem to Microsoft, there
are wierdos about (self included) who want accurate calculations of
probabilities smaller than 5e-14, who want accurate calculations of
probability functions involving parameters bigger than 30000 and who
want these calculations done lots of times which means they have to be
reasonably efficient!

Ian Smith

#23 October 14th, 2003, 05:36 PM

"Ian Smith" wrote in message
om...
"J.E. McGimpsey" wrote in message
...
In article ,
"Harlan Grove" wrote:

"J.E. McGimpsey" wrote...
I get #NUM! in XL01 and XLv.X.

So will Microsoft consider this a bug or a feature?

Given their big "mea culpa" on statistical functions,

http://support.microsoft.com/?kbid=828888

I'm hopeful.

Sadly, I'm less hopeful. It is undoubtedly a step in the right
direction but not everything in
http://support.microsoft.com/?kbid=828888 is correct, nor are all the
improvements discussed in it always improvements!

For example, it contains the following paragraph.

"In summary, users of earlier versions of Excel should not be troubled
in practice by numeric deficiencies because input values that cause
#NUM! results are extreme. Additionally, you can be assured that if
you receive a result other than an error message, the result is
accurate. Round off error is not a problem here."

Unfortunately this is not correct. A counter-example is
=NEGBINOMDIST(512,512,0.5) which gives 0 when the answer should be
approx 0.0125. Note the parameters are not "big" nor is the answer
"small" by anyone's standards (i.e. there can be no "extreme case"
excuses).

You are certainly correct! NEGBINOMDIST does not give an error message in
Excel XP (2002, I suppose) either and returns zero. The formula seems to
fail for numbers larger than 511. I suppose that is because of the immense
numerical values of the combinations involved but I wonder why Microsoft
does not use their own suggestion?

NEGBINOMDIST(number_s, number_f, probability_s) =

BINOMDIST(number_s, number_f + number_s - 1, probability_s, false) *
probability_s.

This actually gives an apparently correct answer in Excel 2002.

--
James V. Silverton
Potomac, Maryland, USA

#24 October 14th, 2003, 09:04 PM

"James Silverton" wrote...
"Ian Smith" wrote in message
...
Unfortunately this is not correct. A counter-example is
=NEGBINOMDIST(512,512,0.5) which gives 0 when the answer should be
approx 0.0125. Note the parameters are not "big" nor is the answer
"small" by anyone's standards (i.e. there can be no "extreme case"
excuses).

You are certainly correct! NEGBINOMDIST does not give an error message in
Excel XP (2002, I suppose) either and returns zero. The formula seems to
fail for numbers larger than 511. I suppose that is because of the immense
numerical values of the combinations involved but I wonder why Microsoft
does not use their own suggestion?

Immense numerical values! If one uses a bone-headed algorithm for calculating (n
choose k) or p^a * (1-p)^b, then the calculations can blow up. However, outside
Redmond the following approach might be used.

NEGBINOMDIST(512,512,0.5)

is defined as

COMBIN(1023,511) * 0.5^512 * (1 - 0.5)^512

which can be rewritten as

EXP(LN(COMBIN(1023,511)) + LN(0.5)*512 + LN(0.5)*512)

which in this case simplifies to

EXP(GAMMALN(1024) - 2*GAMMALN(512) + 1024*LN(0.5))

which reduces to

EXP(705.397794316793 - 709.782712893384)

which evaluates to

0.0124639029464894

The zero result proves that Microsoft's Excel programmers don't know when to use
logarithms to avoid overflow/underflow (it's the 0.5^1024 that Excel can't
handle by direct calculation, not the COMBIN value). Pathetic.

--
Never attach files.
Snip unnecessary quoted text.
Never multipost (though crossposting is usually OK).
Don't change subject lines because it corrupts Google newsgroup archives.

#25 October 14th, 2003, 10:36 PM

"Harlan Grove" wrote in message
...
"James Silverton" wrote...
"Ian Smith" wrote in message
..
Unfortunately this is not correct. A counter-example is
=NEGBINOMDIST(512,512,0.5) which gives 0 when the answer should be
approx 0.0125. Note the parameters are not "big" nor is the answer
"small" by anyone's standards (i.e. there can be no "extreme case"
excuses).

You are certainly correct! NEGBINOMDIST does not give an error message in
Excel XP (2002, I suppose) either and returns zero. The formula seems to
fail for numbers larger than 511. I suppose that is because of the
immense
numerical values of the combinations involved but I wonder why Microsoft
does not use their own suggestion?

Immense numerical values! If one uses a bone-headed algorithm for
calculating (n
choose k) or p^a * (1-p)^b, then the calculations can blow up. However,
outside
Redmond the following approach might be used.

NEGBINOMDIST(512,512,0.5)

is defined as

COMBIN(1023,511) * 0.5^512 * (1 - 0.5)^512

which can be rewritten as

EXP(LN(COMBIN(1023,511)) + LN(0.5)*512 + LN(0.5)*512)

which in this case simplifies to

EXP(GAMMALN(1024) - 2*GAMMALN(512) + 1024*LN(0.5))

which reduces to

EXP(705.397794316793 - 709.782712893384)

which evaluates to

0.0124639029464894

The zero result proves that Microsoft's Excel programmers don't know when
to use
logarithms to avoid overflow/underflow (it's the 0.5^1024 that Excel can't
handle by direct calculation, not the COMBIN value). Pathetic.

Thanks again! Bone-headed indeed but your formula looks good. I suppose
Microsoft might consider employing a few mathematicians in the Excel section
(g).

Jim.

#26 October 15th, 2003, 05:21 PM

Just to share some thoughts. I may be wrong, but it doesn't look to me that
the following two terms can be combined into - 2*GAMMALN(512) .
I "think" the f term needs to be f+1 in the GammaLn. Here are my thoughts,
but I may be wrong.

Sub Demo()
Dim f, s, p, ans
f = 512
s = 512
p = 1 / 2

With WorksheetFunction
ans = f * Log(1 - p) + s * Log(p) - .GammaLn(s) - .GammaLn(f + 1) +
..GammaLn(f + s)
ans = Exp(ans)
End With
Debug.Print FormatNumber(ans, 16)
End Sub

returns: 0.0124639029469358

Unfortunately, the program Mathematica shows this to be accurate to only 11
sig. digits. :(

--
Dana DeLouis
Using Windows XP & Office XP
= = = = = = = = = = = = = = = = =

"Harlan Grove" wrote in message
...
"James Silverton" wrote...
"Ian Smith" wrote in message
..
Unfortunately this is not correct. A counter-example is
=NEGBINOMDIST(512,512,0.5) which gives 0 when the answer should be
approx 0.0125. Note the parameters are not "big" nor is the answer
"small" by anyone's standards (i.e. there can be no "extreme case"
excuses).

You are certainly correct! NEGBINOMDIST does not give an error message in
Excel XP (2002, I suppose) either and returns zero. The formula seems to
fail for numbers larger than 511. I suppose that is because of the
immense
numerical values of the combinations involved but I wonder why Microsoft
does not use their own suggestion?

Immense numerical values! If one uses a bone-headed algorithm for
calculating (n
choose k) or p^a * (1-p)^b, then the calculations can blow up. However,
outside
Redmond the following approach might be used.

NEGBINOMDIST(512,512,0.5)

is defined as

COMBIN(1023,511) * 0.5^512 * (1 - 0.5)^512

which can be rewritten as

EXP(LN(COMBIN(1023,511)) + LN(0.5)*512 + LN(0.5)*512)

which in this case simplifies to

EXP(GAMMALN(1024) - 2*GAMMALN(512) + 1024*LN(0.5))

which reduces to

EXP(705.397794316793 - 709.782712893384)

which evaluates to

0.0124639029464894

The zero result proves that Microsoft's Excel programmers don't know when
to use
logarithms to avoid overflow/underflow (it's the 0.5^1024 that Excel can't
handle by direct calculation, not the COMBIN value). Pathetic.

#27 October 15th, 2003, 06:44 PM

"Dana DeLouis" wrote...
Just to share some thoughts. I may be wrong, but it doesn't look to me that
the following two terms can be combined into - 2*GAMMALN(512) .
...

You're right. Mental typo on my part. I used LN(COMBIN(1023,511)) in my actual
calculations, but miswrote this as GAMMALN(1024)-2*GAMMALN(512). It should have
been GAMMALN(1024)-GAMMALN(512)-GAMMALN(513). However it remains the case that
NEGBINOMDIST(512,512,0.5) doesn't need to choke.

Unfortunately, the program Mathematica shows this to be accurate to only 11
sig. digits. :(

FWIW, Mathematica and Maple differ in the results of their respective logarithm
of complete gamma functions at the 14th digit, so *prove* that Mathematica is
more accurate than Maple. Good luck finding standard mathematical tables for
this function beyond 10 significant digits.

Excel's GAMMALN function only agrees with Mathematica's and Maple's equivalents
to single precision (7 decimal digits), so there's some considerable loss of
precision using Excel's GAMMALN for something like this. Probably better to use

SUM(LN(ROW(512:1023)))-SUM(LN(ROW(2:512)))

assuming that Excel's LN is more precise (which it should be if it's a straight
pass through to the FPU's IEEE logarithm operator, but Microsoft has likely
"improved" upon IEEE).

--
Never attach files.
Snip unnecessary quoted text.
Never multipost (though crossposting is usually OK).
Don't change subject lines because it corrupts Google newsgroup archives.

#28 October 15th, 2003, 11:24 PM

Hi Harlan. I think it's a good catch on your discussion of Excel failing on
the 0.5^1024. It appears that value is apr. 5.563 E-309, which is just a
"hair" outside of Excel's limits.(2.229E-308). Maybe Excel rounds to zero,
and returned a zero final answer. Who knows? I agree with you that Excel
should not choke on this. :)

Just for discussion, the "other two programs" differing at the 14 digit does
not sound right to me, but of course, I can't tell. I don't have Maple. By
chance, were you testing with the inexact number of 0.5 so that it operated
at machine precision? I was testing at the more exact value of 1/2, thereby
using full arbitrary precision. From small testing in the past, I find that
Excel's GammaLn function to be a little more accurate than 7 digits, with a
usual range of about 9-11 digits.
Then there are the following two examples:
=GAMMALN(1)
=GAMMALN(2)

-4.1716E-11
-8.57678E-11

The Gamma of both 1 & 2 should be 1. And of course, the Log of 1 is zero.
I believe that Excel's GammaLn function should return an exact zero for both
examples. Excel is of course using an approximating function.

Here are just three different ways to calculate this in Mathematica. All
three methods agree out to 40 digits. I don't know why both programs would
give different answers. I'm not positive, but I would think they would
agree here. The Binomial example below is pretty exact. It's a 307 digit
integer divided by a 309 digit integer. (LogGamma here is similar to
Excel's GammaLn). Note that I used p of 1/2, and not 0.5.

data = {f - 512, s - 512, p - 1/2};
(1)
N[PDF[NegativeBinomialDistribution[512, 1/2], 512], 40]

0.012463902946489771856117316100129344083043651786 931274117`40.

(2)
N[Binomial[f + s - 1, s - 1]*p^s*(1 - p)^f /. data, 40]

0.012463902946489771856117316100129344083043651786 931274117`40.

(3)
N[Exp[f*Log[1 - p] + s*Log[p] - LogGamma[1 + f] - LogGamma[s] + LogGamma[f +
s] /. data], 40]

0.012463902946489771856117316100129344083043651786 931274117`40.

Well anyway, interesting discussion. :)
--
Dana DeLouis
Using Windows XP & Office XP
= = = = = = = = = = = = = = = = =

"Harlan Grove" wrote in message
...
"Dana DeLouis" wrote...
Just to share some thoughts. I may be wrong, but it doesn't look to me
that
the following two terms can be combined into - 2*GAMMALN(512) .
..

You're right. Mental typo on my part. I used LN(COMBIN(1023,511)) in my
actual
calculations, but miswrote this as GAMMALN(1024)-2*GAMMALN(512). It should
have
been GAMMALN(1024)-GAMMALN(512)-GAMMALN(513). However it remains the case
that
NEGBINOMDIST(512,512,0.5) doesn't need to choke.

Unfortunately, the program Mathematica shows this to be accurate to only
11
sig. digits. :(

FWIW, Mathematica and Maple differ in the results of their respective
logarithm
of complete gamma functions at the 14th digit, so *prove* that Mathematica
is
more accurate than Maple. Good luck finding standard mathematical tables
for
this function beyond 10 significant digits.

Excel's GAMMALN function only agrees with Mathematica's and Maple's
equivalents
to single precision (7 decimal digits), so there's some considerable loss
of
precision using Excel's GAMMALN for something like this. Probably better
to use

SUM(LN(ROW(512:1023)))-SUM(LN(ROW(2:512)))

assuming that Excel's LN is more precise (which it should be if it's a
straight
pass through to the FPU's IEEE logarithm operator, but Microsoft has
likely
"improved" upon IEEE).

--
Never attach files.
Snip unnecessary quoted text.
Never multipost (though crossposting is usually OK).
Don't change subject lines because it corrupts Google newsgroup archives.

#29 October 16th, 2003, 12:47 AM

"Dana DeLouis" wrote...
...
Just for discussion, the "other two programs" differing at the 14 digit does
not sound right to me, but of course, I can't tell. I don't have Maple.

I was one of a few regulars in this ng who log of complete gamma functions from
several packages last Spring. The results may be found in a zipped XLS file at

ftp://members.aol.com/hrlngrv/gammaln8.zip

By chance, were you testing with the inexact number of 0.5 so that it operated
at machine precision? I was testing at the more exact value of 1/2, thereby
using full arbitrary precision. . . .

Appologies if this is a joke that I'm too ironly-impaired to recognize.

???!!!

Kindly show me any Intel-based PC and/or any software for such a machine that
uses IEEE binary floating point that treats 0.5 any differently than 1/2.
There's no difference whatsoever. Both have the identical internal binary
representation. Here's a small C program I just used to test this. Compiled with
the LCC compiler that comes with GNU's Small Eiffel language package for
Windows.

/* begin foo.c */
#include stdio.h

int main() {
double d;
unsigned char *pc;

pc = (unsigned char *) &d;

d = 0.5;
printf("%g\n%02hhX %02hhX %02hhX %02hhX %02hhX %02hhX %02hhX %02hhX\n\n", \
d, pc[0], pc[1], pc[2], pc[3], pc[4], pc[5], pc[6], pc[7]);

d = 1.0 / 2.0;
printf("%g\n%02hhX %02hhX %02hhX %02hhX %02hhX %02hhX %02hhX %02hhX\n\n", \
d, pc[0], pc[1], pc[2], pc[3], pc[4], pc[5], pc[6], pc[7]);

return 0;
}
/* end foo.c */

And it's little endian output.

foo
0.5
00 00 00 00 00 00 E0 3F

0.5
00 00 00 00 00 00 E0 3F

Or perhaps you're aware of some subtle Excel functionality that renders 1/2
correctly but 0.5 as something else.

From small testing in the past, I find that Excel's GammaLn function to be a
little more accurate than 7 digits, with a usual range of about 9-11 digits.

Half the time 8 or 9 bits, the other half the time 7 bits. See the comparison
worksheet in the Zip file the url for which appears above. Since single
precision gives more than exactly 7 decimal digits of precision, it's not
surprising that it does better than 7 decimal digits some of the time.

Then there are the following two examples:
=GAMMALN(1)
=GAMMALN(2)

-4.1716E-11
-8.57678E-11

The Gamma of both 1 & 2 should be 1. And of course, the Log of 1 is zero.
I believe that Excel's GammaLn function should return an exact zero for both
examples. Excel is of course using an approximating function.
...

Since the complete gamma function is defined using a nonanalytic definite
integral, all methods of calculating it or its logarithm must be approximate.

Excel's approximation function is a rather poor one. What's new?! Why should
Excel's GAMMALN function be any higher quality than it's MOD function? Or its
NEGBINOMDIST function? Or its POISSON function? And so on . . .

FWLIW, the workbook in the linked file includes a VBA translation of a Perl
function based on FORTRAN code from http://www.netlib.org/specfun/gamma . Very
good accuracy. Pity no one at Microsoft seems to know about netlib. Instead, we
all get to enjoy - firsthand - the numerical teething pains of the Excel
developers. How long must they go on?

--
Never attach files.
Snip unnecessary quoted text.
Never multipost (though crossposting is usually OK).
Don't change subject lines because it corrupts Google newsgroup archives.

#30 October 16th, 2003, 04:09 AM

Thanks Harlan. I understand now. I appreciate the feedback. Thanks for
link also. That's interesting code. I'll probably be studying it for
awhile.
I recognized the const D1 of 0.57721... as the EulerGamma constant.

That part of 0.5 was only a guess as to why the two programs were giving
different answers. It wasn't meant for Excel. Sorry. Excel can't do
arbitrary precision. I see that you were testing both programs at machine
precision. (similar to double precision).

What I was thinking at the time was the following. As you know, Excel can't
do this, but the other programs treat the following two list as completely
different. The elements are different, and the algorithms used on them are
different.

exact = {f - 512, s - 512, p - 1/2};
approx = {f - 512., s - 512., p - 0.5};

For example...
Precision[1/2]
Infinity

Precision[0.5]
MachinePrecision

Here are the same equations, but with the different numbers from above,

InputForm[N[Binomial[f + s - 1, s - 1]*p^s*(1 - p)^f /. exact]]
0.012463902946489773

InputForm[N[Binomial[f + s - 1, s - 1]*p^s*(1 - p)^f /. approx]]
0.012463902946493622

As you can see, the last 5 digits are different. That's what I was trying
to point out as a "possible" explanation for the differences you observed.
However, I now see how it was tested, so it doesn't apply. Thanks. :)

--
Dana DeLouis
Using Windows XP & Office XP
= = = = = = = = = = = = = = = = =

snip

Thread Tools
Show Printable Version Email this Page
Display Modes
Linear Mode Switch to Hybrid Mode Switch to Threaded Mode