
Coefficient of Multiple Determination and the Partial-Correlation Coefficient

THE COEFFICIENT OF MULTIPLE DETERMINATION

     The coefficient of multiple determination, R2, is defined as the proportion of the total variation in Y "explained" by the multiple regression of Y on X1 and X2, and it can be calculated by

$$R^2 = \frac{\text{Explained variation}}{\text{Total variation}} = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2}$$

The overall significance of the regression can then be tested with the ratio of the explained to the unexplained variance:

$$F_{k-1,\,n-k} = \frac{\text{Explained variance}}{\text{Unexplained variance}} = \frac{R^2/(k-1)}{(1-R^2)/(n-k)}$$

where k is the number of estimated parameters.
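As a concrete illustration, here is a minimal Python sketch that computes R2 from the explained and total variation and then the F ratio from the formula above. The dataset and all variable names are made up for illustration.

```python
import numpy as np

# Hypothetical data: n = 6 observations, two regressors X1 and X2.
y  = np.array([10.0, 12.0, 15.0, 14.0, 18.0, 21.0])
x1 = np.array([ 2.0,  3.0,  5.0,  4.0,  6.0,  8.0])
x2 = np.array([ 1.0,  2.0,  2.0,  3.0,  4.0,  5.0])

n, k = len(y), 3                              # k counts b0, b1, b2
X = np.column_stack([np.ones(n), x1, x2])

b, *_ = np.linalg.lstsq(X, y, rcond=None)     # OLS estimates
y_hat = X @ b

explained = np.sum((y_hat - y.mean()) ** 2)   # explained variation
total     = np.sum((y - y.mean()) ** 2)       # total variation
R2 = explained / total

F = (R2 / (k - 1)) / ((1 - R2) / (n - k))     # F with k-1 and n-k df
print(f"R^2 = {R2:.4f},  F({k-1},{n-k}) = {F:.2f}")
```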

 

Example

Consider the calculated F ratio, or F statistic, for the case of a simple regression and for a multiple regression with n = 15 and k = 3. For a simple regression,

$$F_{1,\,n-2} = \frac{R^2/1}{(1-R^2)/(n-2)}$$

where the subscripts of F denote the number of degrees of freedom in the numerator and denominator, respectively. In this simple-regression case, $F_{1,\,n-2} = t_{n-2}^2$ for the same level of significance. For the multiple regression with n = 15 and k = 3,

$$F_{2,\,12} = \frac{R^2/2}{(1-R^2)/12}$$
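The F = t2 identity in the simple-regression case is easy to verify numerically. Below is a small Python check; the data are made up, but the identity itself is exact.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9])
n = len(x)

X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b
s2 = e @ e / (n - 2)                          # residual variance

R2 = 1 - (e @ e) / np.sum((y - y.mean()) ** 2)
F = (R2 / 1) / ((1 - R2) / (n - 2))           # F with 1 and n-2 df

var_b1 = s2 / np.sum((x - x.mean()) ** 2)     # Var(b1) in simple regression
t = b[1] / np.sqrt(var_b1)                    # t statistic on the slope

print(f"F = {F:.4f},  t^2 = {t**2:.4f}")      # the two agree exactly
```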

 

It is possible for the calculated F statistic to be "large" and yet for none of the estimated parameters to be statistically significant. This can occur when the independent variables are highly correlated with each other. The F test is often of limited usefulness because it is likely to reject the null hypothesis regardless of whether or not the model explains "a great deal" of the variation in Y.
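This situation is easy to reproduce by simulation. In the sketch below (synthetic data; every number is an assumption chosen for illustration), X2 is a near copy of X1: the regression as a whole fits well, so F is large, while neither slope is individually significant.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 15, 3
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)      # X2 nearly duplicates X1
y  = 2.0 + 1.0 * x1 + 1.0 * x2 + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x1, x2])
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
s2 = e @ e / (n - k)

R2 = 1 - (e @ e) / np.sum((y - y.mean()) ** 2)
F  = (R2 / (k - 1)) / ((1 - R2) / (n - k))

se = np.sqrt(s2 * np.diag(XtX_inv))           # standard errors of b0, b1, b2
t  = b / se
print(f"F({k-1},{n-k}) = {F:.1f}")            # typically very large here
print("t statistics:", np.round(t, 2))        # t on b1 and b2 often below 2
```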

 

Since the inclusion of additional independent or explanatory variables is likely to increase the explained variation, $\sum (\hat{Y}_i - \bar{Y})^2$, for the same total variation, $\sum (Y_i - \bar{Y})^2$, R2 increases. To take into consideration the reduction in the degrees of freedom as additional independent or explanatory variables are added, the adjusted R2, or $\bar{R}^2$, is computed:

$$\bar{R}^2 = 1 - (1 - R^2)\,\frac{n-1}{n-k}$$

where n = the number of observations

         k = the number of parameters estimated
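In code the adjustment is a one-liner. The sketch below (function name and example numbers are my own) shows how a higher R2 from adding regressors can still yield a lower adjusted value.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - k)."""
    return 1 - (1 - r2) * (n - 1) / (n - k)

# Adding two regressors raises R^2 from 0.90 to 0.91, yet the penalty for
# the lost degrees of freedom lowers the adjusted value.
print(adjusted_r2(0.90, n=15, k=3))   # ~0.8833
print(adjusted_r2(0.91, n=15, k=5))   # ~0.8740
```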

PARTIAL-CORRELATION COEFFICIENT

     The partial-correlation coefficient measures the net correlation between the dependent variable and one independent variable after excluding the common influence of (i.e., holding constant) the other independent variables in the model. For example, $r_{YX_1 \cdot X_2}$ is the partial correlation between Y and X1 after removing the influence of X2 from both Y and X1:

$$r_{YX_1 \cdot X_2} = \frac{r_{YX_1} - r_{YX_2}\,r_{X_1X_2}}{\sqrt{1 - r_{YX_2}^2}\,\sqrt{1 - r_{X_1X_2}^2}}$$

where $r_{YX_1}$ = the simple-correlation coefficient between Y and X1, and $r_{YX_2}$ and $r_{X_1X_2}$ are analogously defined. Partial-correlation coefficients range in value from -1 to +1 (as do simple-correlation coefficients), have the sign of the corresponding estimated parameter, and are used to determine the relative importance of the different explanatory variables in a multiple regression.
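The formula can be checked against the residual-based definition of a partial correlation: regress X2 out of both Y and X1, then correlate the residuals. A minimal Python sketch (all data synthetic and illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)
y  = 1.0 * x1 + 0.8 * x2 + rng.normal(size=n)

r = lambda a, b: np.corrcoef(a, b)[0, 1]
r_yx1, r_yx2, r_x12 = r(y, x1), r(y, x2), r(x1, x2)

# The formula from the text.
partial = (r_yx1 - r_yx2 * r_x12) / np.sqrt((1 - r_yx2**2) * (1 - r_x12**2))

def residuals(a, b):
    """Residuals of a simple regression of a on b (with an intercept)."""
    B = np.column_stack([np.ones(len(b)), b])
    coef, *_ = np.linalg.lstsq(B, a, rcond=None)
    return a - B @ coef

check = r(residuals(y, x2), residuals(x1, x2))
print(f"formula: {partial:.4f},  residual-based: {check:.4f}")  # they agree
```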

For example, $r_{YX_1 \cdot X_2} = -1$ refers to the case where there is an exact or perfect negative linear relationship between Y and X1 after removing the common influence of X2 from both Y and X1. Similarly, $r_{YX_1 \cdot X_2} = 1$ indicates a perfect positive linear net relationship between Y and X1, and $r_{YX_1 \cdot X_2} = 0$ indicates no linear relationship between Y and X1 when the common influence of X2 has been removed from both Y and X1; in that case, X1 can be omitted from the regression.

The sign of a partial-correlation coefficient is the same as that of the corresponding estimated parameter. For example, for the estimated regression equation $\hat{Y} = \hat{b}_0 + \hat{b}_1 X_1 + \hat{b}_2 X_2$, $r_{YX_1 \cdot X_2}$ has the same sign as $\hat{b}_1$ and $r_{YX_2 \cdot X_1}$ has the same sign as $\hat{b}_2$.

Partial-correlation coefficients are used in multiple regression analysis to determine the relative importance of each explanatory variable in the model. The independent variable with the highest partial-correlation coefficient with respect to the dependent variable contributes most to the explanatory power of the model and is entered first in a stepwise multiple regression analysis (see the sketch below). It should be noted, however, that partial-correlation coefficients give an ordinal, not a cardinal, measure of net correlation, and that the partial-correlation coefficients between the dependent variable and all the independent variables in the model need not sum to 1.
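As a sketch of that first stepwise step (hypothetical data; the helper names partial_corr and residuals are my own), the candidate regressor with the largest absolute partial correlation with Y would be entered first:

```python
import numpy as np

def residuals(a, b):
    """Residuals of a on b, intercept included."""
    B = np.column_stack([np.ones(len(b)), b])
    coef, *_ = np.linalg.lstsq(B, a, rcond=None)
    return a - B @ coef

def partial_corr(y, x, control):
    """Correlation of y and x after removing the influence of `control`."""
    ry, rx = residuals(y, control), residuals(x, control)
    return np.corrcoef(ry, rx)[0, 1]

rng = np.random.default_rng(2)
n = 40
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 2.0 * x1 + 0.3 * x2 + rng.normal(size=n)

# Partial correlation of each regressor with Y, holding the other constant.
scores = {"X1": partial_corr(y, x1, x2), "X2": partial_corr(y, x2, x1)}
ranked = sorted(scores, key=lambda v: abs(scores[v]), reverse=True)
print(scores, "-> enter first:", ranked[0])   # X1 typically ranks first here
```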

 

Copyright © 2002 Evgenia Vogiatzi