CORRELATION( recset, valuex, valuey [ , expression ] [, KEYED ] [, UNORDERED | ORDERED( bool ) ] [, STABLE | UNSTABLE ] [, PARALLEL [ ( numthreads ) ] ] [, ALGORITHM( name ) ] )
recset | The set of records to process. This may be the name of a dataset or a record set derived from some filter condition, or any expression that results in a derived record set. This also may be the GROUP keyword to indicate operating on the elements in each group, when used in a RECORD structure to generate crosstab statistics. |
valuex | A numeric field or expression. |
valuey | A numeric field or expression. |
expression | Optional. A logical expression indicating which records to include in the calculation. Valid only when the recset parameter is the keyword GROUP. |
KEYED | Optional. Specifies the activity is part of an index read operation, which allows the optimizer to generate optimal code for the operation. |
UNORDERED | Optional. Specifies the output record order is not significant. |
ORDERED | Specifies the significance of the output record order. |
bool | When False, specifies the output record order is not significant. When True, specifies the default output record order. |
STABLE | Optional. Specifies the input record order is significant. |
UNSTABLE | Optional. Specifies the input record order is not significant. |
PARALLEL | Optional. Try to evaluate this activity in parallel. |
numthreads | Optional. Try to evaluate this activity using numthreads threads. |
ALGORITHM | Optional. Override the algorithm used for this activity. |
name | The algorithm to use for this activity. Must be from the list of supported algorithms for the SORT function's STABLE and UNSTABLE options. |
Return: | CORRELATION returns a single REAL value. |
The CORRELATION function returns Pearson's product-moment correlation coefficient between valuex and valuey.
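For reference, the coefficient can be written in terms of the population variance and covariance, which is the form the consistency checks in the example below rely on (rc compared against varxy/SQRT(varx*vary)). The symbols n, x_i, and y_i are notation introduced here for illustration: n is the number of records included and x_i, y_i are the values of valuex and valuey for record i. This is a sketch of the standard definition, not a statement about how the function is implemented internally.

```latex
\begin{aligned}
\operatorname{var}(x)   &= \frac{1}{n}\left(\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\Bigl(\sum_{i=1}^{n} x_i\Bigr)^{2}\right) \\
\operatorname{cov}(x,y) &= \frac{1}{n}\left(\sum_{i=1}^{n} x_i y_i - \frac{1}{n}\sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i\right) \\
r &= \frac{\operatorname{cov}(x,y)}{\sqrt{\operatorname{var}(x)\,\operatorname{var}(y)}}
\end{aligned}
```

The value of r lies between -1 and 1. It is mathematically undefined when either variance is zero.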
Example:
pointRec := { REAL x, REAL y };
analyze(ds) := MACRO
#uniquename(stats)
%stats% := TABLE(ds, { c := COUNT(GROUP),
sx := SUM(GROUP, x),
sy := SUM(GROUP, y),
sxx := SUM(GROUP, x * x),
sxy := SUM(GROUP, x * y),
syy := SUM(GROUP, y * y),
varx := VARIANCE(GROUP, x),
vary := VARIANCE(GROUP, y),
varxy := COVARIANCE(GROUP, x, y),
rc := CORRELATION(GROUP, x, y) });
OUTPUT(%stats%);
// The following differences should all be zero (apart from floating-point rounding)
OUTPUT(%stats%, { varx - (sxx-sx*sx/c)/c,
vary - (syy-sy*sy/c)/c,
varxy - (sxy-sx*sy/c)/c,
rc - (varxy/SQRT(varx*vary)) });
OUTPUT(%stats%, { 'bestFit: y=' +
(STRING)((sy-sx*varxy/varx)/c) +
' + ' +
(STRING)(varxy/varx)+'x' });
ENDMACRO;
ds1 := DATASET([{1,1},{2,2},{3,3},{4,4},{5,5},{6,6}], pointRec);
ds2 := DATASET([ {1.93896e+009, 2.04482e+009},
{1.77971e+009, 8.54858e+008},
{2.96181e+009, 1.24848e+009},
{2.7744e+009, 1.26357e+009},
{1.14416e+009, 4.3429e+008},
{3.38728e+009, 1.30238e+009},
{3.19538e+009, 1.71177e+009} ], pointRec);
ds3 := DATASET([ {1, 1.00039},
{2, 2.07702},
{3, 2.86158},
{4, 3.87114},
{5, 5.12417},
{6, 6.20283} ], pointRec);
analyze(ds1);
analyze(ds2);
analyze(ds3);
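The bestFit output in the macro above appears to be the ordinary least-squares line through the points, built from the same statistics. As a sketch, writing b for the slope and a for the intercept (both names are introduced here for illustration), and using n for the field c:

```latex
\begin{aligned}
b &= \frac{\operatorname{cov}(x,y)}{\operatorname{var}(x)} \\
a &= \frac{1}{n}\left(\sum_{i=1}^{n} y_i - b \sum_{i=1}^{n} x_i\right) = \bar{y} - b\,\bar{x} \\
y &\approx a + b\,x
\end{aligned}
```

For ds1, where every point lies on y = x, the covariance equals the variance of x, so b = 1 and a = 0, and the correlation coefficient is 1 (subject to floating-point rounding in the computed values).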
See Also: VARIANCE, COVARIANCE