ALSM5e pp. 116--119, 421--431; ALSM4e pp. 112--115, 400--409.
STATA reference manual [R] regression diagnostics, [R] regress
χ²_BP = (SSR*/2) / (SSE/n)²

where
SSR* is the regression sum of squares of the regression of e² on the X_k
SSE is the error sum of squares of the regression of Y on the X_k

When n is sufficiently large and σ² is constant, χ²_BP is approximately distributed as chi-square with p-1 df. Large values of χ²_BP lead to the conclusion that σ² is not constant.
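A minimal sketch of the computation (Python/numpy; the names breusch_pagan, X, and y are illustrative, with X taken to be the n×p design matrix including the intercept column):

import numpy as np

def breusch_pagan(X, y):
    # chi2_BP = (SSR*/2) / (SSE/n)^2, with p-1 df
    n, p = X.shape
    b, *_ = np.linalg.lstsq(X, y, rcond=None)    # OLS fit of Y on the X's
    e = y - X @ b                                # residuals
    sse = np.sum(e ** 2)                         # SSE from the regression of Y on the X's
    e2 = e ** 2
    g, *_ = np.linalg.lstsq(X, e2, rcond=None)   # regression of e^2 on the same X's
    ssr_star = np.sum((X @ g - e2.mean()) ** 2)  # SSR*: regression SS of e^2 on the X's
    chi2_bp = (ssr_star / 2) / (sse / n) ** 2
    return chi2_bp, p - 1                        # statistic and its df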
When the error variances are unequal, the variance-covariance matrix of the error terms is

σ²{ε} = | σ_1²    0    ...    0   |
        |  0     σ_2²  ...    0   |
        | ...    ...   ...   ...  |
        |  0      0    ...   σ_n² |
The weighted least squares (WLS) criterion is

Q_W = Σ_{i=1}^{n} w_i (Y_i - β_0 - β_1 X_i1 - ... - β_{p-1} X_i,p-1)²

where the weights w_i = 1/σ_i² are inversely proportional to the σ_i²; thus WLS gives less weight to observations with large error variance, and vice versa.
In matrix form,

W = | w_1    0    ...    0  |
    |  0    w_2   ...    0  |
    | ...   ...   ...   ... |
    |  0     0    ...   w_n |
(X'WX) b_W = X'WY    (normal equations)

Likewise one can show that

b_W = (X'WX)⁻¹ X'WY
σ²{b_W} = σ²(X'WX)⁻¹

which is estimated by

s²{b_W} = MSE_W (X'WX)⁻¹,  where  MSE_W = Σ w_i (Y_i - ^Y_i)² / (n - p)
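A minimal numpy sketch of these formulas (illustrative names; X is the n×p design matrix, y the response, and w the length-n vector of weights, assumed known):

import numpy as np

def wls_fit(X, y, w):
    # b_W = (X'WX)^-1 X'WY with W = diag(w)
    XtW = X.T * w                                # same as X' @ diag(w)
    XtWX = XtW @ X
    b_w = np.linalg.solve(XtWX, XtW @ y)
    n, p = X.shape
    resid = y - X @ b_w
    mse_w = np.sum(w * resid ** 2) / (n - p)     # MSE_W = sum w_i (Y_i - ^Y_i)^2 / (n - p)
    cov_b_w = mse_w * np.linalg.inv(XtWX)        # s^2{b_W} = MSE_W (X'WX)^-1
    return b_w, cov_b_w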
The WLS estimates can also be obtained by applying OLS to the data transformed by the "square root" W^{1/2} of W, where W^{1/2} contains the square roots of the w_i on the diagonal and zeros elsewhere:

((W^{1/2}X)'(W^{1/2}X))⁻¹ (W^{1/2}X)'(W^{1/2}Y)
  = (X'W^{1/2}W^{1/2}X)⁻¹ (X'W^{1/2}W^{1/2}Y)
  = (X'WX)⁻¹ (X'WY) = b_W

Thus one can obtain b_W by multiplying Y and X by the square roots of the weights and applying OLS to the transformed data.
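A quick numerical check of this equivalence on simulated data (a sketch; all names are illustrative):

import numpy as np

rng = np.random.default_rng(0)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=n)])
sigma = 0.5 + np.abs(X[:, 1])                      # non-constant error standard deviations
y = X @ np.array([1.0, 2.0]) + sigma * rng.normal(size=n)
w = 1.0 / sigma ** 2                               # weights w_i = 1/sigma_i^2

W = np.diag(w)
b_w = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)    # direct WLS

sw = np.sqrt(w)                                    # diagonal of W^{1/2}
b_t, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)  # OLS on transformed data

print(np.allclose(b_w, b_t))                       # True: both routes give b_W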
More generally, consider the model

Y = Xβ + ε,  where E{ε} = 0 and E{εε'} = Ω is a positive definite matrix.

Under this model the OLS estimator

b = (X'X)⁻¹X'Y

has variance matrix

σ²{b} = (X'X)⁻¹ X'ΩX (X'X)⁻¹

When the errors are homoscedastic, Ω = σ²I and the expression for σ²{b} reduces to the usual

σ²{b} = σ²(X'X)⁻¹

OLSCM denotes the usual OLS covariance matrix of estimates:

OLSCM = s²{b} = MSE (X'X)⁻¹   (where MSE = Σ e_i²/(n - p))

When Ω is known, one can instead use

b_GLS = (X'Ω⁻¹X)⁻¹ X'Ω⁻¹Y

where b_GLS is termed the generalized least squares (GLS) estimator, and

b_EGLS = (X'^Ω⁻¹X)⁻¹ X'^Ω⁻¹Y

where ^Ω denotes the estimated matrix Ω and b_EGLS is termed the estimated generalized least squares (EGLS) or feasible generalized least squares (FGLS) estimator.
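As a sketch, assuming a diagonal estimate of Ω is available (omega_hat_diag below is a hypothetical length-n array of estimated error variances):

import numpy as np

def egls_fit(X, y, omega_hat_diag):
    # b_EGLS = (X' ^Omega^-1 X)^-1 X' ^Omega^-1 Y with ^Omega = diag(omega_hat_diag)
    w = 1.0 / omega_hat_diag          # inverse of the diagonal estimate
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ y)

With a diagonal ^Ω this is just WLS with weights 1/^Ω_ii; a non-diagonal ^Ω would require working with ^Ω⁻¹ explicitly.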
Alternatively, one can retain the OLS estimator b and estimate σ²{b} consistently under heteroscedasticity. Since the residuals have mean zero, each diagonal element of Ω can be estimated from the corresponding squared residual:

^Ω_ii = (e_i - 0)²/1 = e_i²,  i.e.,  ^Ω = diag{e_i²}

This leads to the HCCM (heteroscedasticity consistent covariance matrix), obtained by substituting ^Ω for Ω in the expression for σ²{b} above. Commonly used versions are:
HC1 = (n/(n-p)) (X'X)⁻¹ X' diag{e_i²} X (X'X)⁻¹

where n/(n-p) is a degrees of freedom correction factor that becomes negligible for large samples.

HC2 = (X'X)⁻¹ X' diag{e_i²/(1 - h_ii)} X (X'X)⁻¹

where h_ii is the ith diagonal element of the hat matrix H = X(X'X)⁻¹X'. HC2 is obtained in STATA using the hc2 option (e.g., regress y x1 x2, hc2).

HC3 = (X'X)⁻¹ X' diag{e_i²/(1 - h_ii)²} X (X'X)⁻¹

HC3 is obtained in STATA using the hc3 option (e.g., regress y x1 x2, hc3).
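A numpy sketch of these estimators (hypothetical function; kind selects the variant):

import numpy as np

def hccm(X, y, kind="HC3"):
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y                          # OLS estimates
    e = y - X @ b                                  # residuals
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)    # leverages h_ii
    if kind == "HC1":
        d = (n / (n - p)) * e ** 2
    elif kind == "HC2":
        d = e ** 2 / (1 - h)
    elif kind == "HC3":
        d = e ** 2 / (1 - h) ** 2
    else:
        raise ValueError(kind)
    meat = (X * d[:, None]).T @ X                  # X' diag{d} X
    return XtX_inv @ meat @ XtX_inv                # (X'X)^-1 X' diag{d} X (X'X)^-1

Robust t statistics for individual coefficients divide each element of b by the square root of the corresponding diagonal element of the chosen HCCM, which is the kind of test discussed in the recommendations below.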
"1. If there is an a priori reason to suspect that there is heteroscedasticity, HCCM-based tests should be used.""Given the relative costs of correcting for heteroscedasticity using HC3 when there is homoscedasticity and using OLSCM tests when there is heteroscedasticity, we recommend that HC3-based tests should be used routinely for testing individual coefficients in the linear regression model."
"2. For samples less than 250, HC3 should be used; when samples are 500 or larger, other versions of the HCCM can also be used. The superiority of HC3 over HC2 lies in its better properties when testing coefficients that are most strongly affected by heteroscedasticity."
"3. The decision to correct for heteroscedasticity should not be based on the results of a screening test for heteroscedasticity."