how i calculate pearson correlation coefficient for excel graph

**James Silverton** · September 18th, 2006, 03:05 PM posted to microsoft.public.excel.charting

"Jerry W. Lewis" wrote in message
...
The numerical problems with the pre-2003 Pearson algorithm are
the same as
with the pre-2003 StDev, Rsq, Slope, etc. They have been
discussed in the
statistical literature for over 40 years and in these
newsgroups for over 10.
You might find
http://groups.google.com/group/micro...a03470e7a1c650
to be useful.

I did the numerical testing because, as I mentioned, the
information available in HELP was rather obtuse. Thank you for a
most enlightening reply!

I must admit that in the days when I relied on statistical
tests for hypothesis testing, I did not use functions from
general programs like Excel, nor indeed use Pearson's test.
These days, my data is such that trends are visible even from
graphical display. As will be obvious, I'm a user rather than a
theorist but it's always comforting to know the limitations of
calculation formulae:-)

Have the statistical formulae in Excel 2003 on been rigorously
tested? As you remarked in the reference, it is amazing how long
Microsoft kept using unstable calculation techniques

--
James Silverton
Potomac, Maryland

**Jerry W. Lewis** · September 18th, 2006, 04:04 PM posted to microsoft.public.excel.charting

"James Silverton" wrote:

... Thank you for a
most enlightening reply!

You're welcome, glad it helped.

Have the statistical formulae in Excel 2003 on been rigorously
tested? As you remarked in the reference, it is amazing how long
Microsoft kept using unstable calculation techniques

Univariate statistics functions (StDev, StDevP, Var, VarP) worked fine out
of the box. Bivariate Statistics functions (Slope, Intercept, SteYX, etc)
could produce incorrect results with non-numeric or empty cells in the data
range; that was fixed in a March 2004 patch (as was a bug in the new algorith
for RAND).

When LINEST 2003 estimates a parameter to be exactly zero, the estimate
should be distrusted unless confirmed by alternate calculations; this is
fixed in 2007 beta.

Probability calculations (other than the much improved 2003 NORMDIST and
NORMINV) remain inadequate. ...INV functions do a better job of inverting
....DIST functions, but continuous distribution functions (...DIST) continue
to have limited accuracy and continue to become totally inaccurate in at
least one tail. Algorithms were changed for discrete distributions
functions, but they introduce new numerical problems that are still not fixed
in 2007 beta. The gold standard for probability calculations remains Ian
Smith's library
http://members.aol.com/iandjmsmith/examples.xls
which is actually better (both accuracy and working range) than any other
double precision implementation that I am aware of, (including dedicated
statistics packages and math libraries).

Jerry

Thread Tools
Show Printable Version Email this Page
Display Modes
Linear Mode Switch to Hybrid Mode Switch to Threaded Mode