STAT 541: Sandwich Estimator
Example: Car data.
- heteroskedasticity = fan shaped residuals
- usual estimator is "consistent." (Like who cares?)
- SEs are wrong! (Now this is important.)
- Hypothesis tests are wrong, CIs are wrong
- Even Bonferroni doesn't work!
- Suppose X mostly equals zero and sometimes equals 1
- Suppose Y = iid N(0,1)
- slope estimate is roughly Y at (X = 1), since the average of Y at (X = 0) is close to zero
- Suppose it is heteroskedastic: small variance at zero, large at one
- high probability of looking significant, having incorrect SEs, and
making bad predictions.
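The claim above can be checked with a small simulation. The sketch below is only an illustration under assumed settings (200 observations, 5 of them at X = 1, standard deviation 1 at X = 0 and 5 at X = 1); none of these numbers come from the car data, and the function names are made up for this example. It draws Y with no dependence on X, fits LS, and counts how often the usual t-statistic exceeds 1.96.

```python
import math
import random


def ols_slope_and_se(x, y):
    """Simple regression of y on x; returns the slope and its usual LS SE."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)  # pooled estimate: assumes one common sigma
    return slope, math.sqrt(s2 / sxx)


def false_positive_rate(n=200, n_ones=5, sd_at_zero=1.0, sd_at_one=5.0,
                        reps=1000, seed=0):
    """Fraction of null data sets in which the slope looks significant."""
    rng = random.Random(seed)
    x = [1.0] * n_ones + [0.0] * (n - n_ones)
    sd = [sd_at_one] * n_ones + [sd_at_zero] * (n - n_ones)
    rejections = 0
    for _ in range(reps):
        y = [rng.gauss(0.0, s) for s in sd]  # true slope is zero
        slope, se = ols_slope_and_se(x, y)
        if abs(slope / se) > 1.96:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    print("equal variances:  ", false_positive_rate(sd_at_one=1.0))
    print("unequal variances:", false_positive_rate(sd_at_one=5.0))
```

With equal variances the rate should sit near the nominal 5%. With the larger variance at X = 1 it should be far higher, because the pooled variance estimate is dominated by the many quiet points at X = 0.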
First solution: weighted least squares
- suppose Y_i = X_i beta + sigma_i Z_i
instead of Y_i = X_i beta + sigma Z_i
- Then Y_i/sigma_i = (X_i/sigma_i) beta + Z_i
- But this is homoskedastic, so usual LS on the rescaled data works and we are done
- Where do the sigma_i's come from?
- theory hopefully
- estimation possibly. E.g. fit a model to Y_i^2, or to
(Y_i - Y-hat_i)^2. Then use the predictions from this
model to weight the regression.
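The rescaling argument and the "estimate the sigmas" step can be sketched as follows. This is a minimal illustration for a single predictor, not a general implementation: the function names, the linear model for the squared residuals, and the floor that keeps predicted variances positive are all assumptions made for the example.

```python
import math
import random


def wls_fit(x, y, sigma):
    """Weighted LS of y on x with weights 1/sigma_i^2.

    Same as ordinary LS on the rescaled data y_i/sigma_i, 1/sigma_i,
    x_i/sigma_i. Returns (intercept, slope, slope SE).
    """
    w = [1.0 / s ** 2 for s in sigma]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sum(wi * (xi - xbar) * (yi - ybar)
                for wi, xi, yi in zip(w, x, y)) / sxx
    intercept = ybar - slope * xbar
    # Residual variance of the rescaled model; near 1 if sigma is right.
    rss = sum(wi * (yi - intercept - slope * xi) ** 2
              for wi, xi, yi in zip(w, x, y))
    s2 = rss / (len(x) - 2)
    return intercept, slope, math.sqrt(s2 / sxx)


def estimate_sigma(x, y, floor=0.05):
    """Estimate sigma_i by regressing squared LS residuals on x."""
    ones = [1.0] * len(x)
    a, b, _ = wls_fit(x, y, ones)          # unweighted first pass
    sq = [(yi - a - b * xi) ** 2 for xi, yi in zip(x, y)]
    c, d, _ = wls_fit(x, sq, ones)         # assumed model: E[e^2] = c + d x
    # Assumed guard: keep predicted variances above a fraction of the mean.
    lo = max(floor * sum(sq) / len(sq), 1e-12)
    return [math.sqrt(max(c + d * xi, lo)) for xi in x]


if __name__ == "__main__":
    rng = random.Random(1)
    x = [1.0] * 20 + [0.0] * 180           # hypothetical data, not the car data
    true_sigma = [5.0 if xi == 1.0 else 1.0 for xi in x]
    y = [rng.gauss(0.0, s) for s in true_sigma]
    print("LS   (slope, SE):", wls_fit(x, y, [1.0] * len(x))[1:])
    print("WLS, known sigma:", wls_fit(x, y, true_sigma)[1:])
    print("WLS, estimated:  ", wls_fit(x, y, estimate_sigma(x, y))[1:])
```

In this two-level example the slope itself does not change, since it is a difference of group means under any weights. What changes is the SE, which now reflects the large variance at X = 1.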
Second solution: Sandwich estimator.
(White 1980, Long and Ervin 2000)
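- Use the usual LS estimator for the regression of Y on X
- beta-hat = (X'X)^{-1} X'Y
- So var(beta-hat) = (X'X)^{-1} X' var(Y) X (X'X)^{-1}
- Estimate var(Y) by the diagonal matrix of squared residuals
- Called the sandwich estimator since the variance of Y is sandwiched
between the two inverses.
- Consistent for the true variance of beta-hat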
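(Foster and Stine 2001)
- Suppose you fit first, then compute the variance from the residuals
- Oops! If only one point has X = 1, the fit goes right through it, so the residual and the variance estimate at (X = 1) are zero
- Better is to compute the variance, THEN fit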
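The sandwich formula and the order-of-operations point can be sketched for a single predictor. The code below is an illustration under assumptions: the five data points are invented, the function names are hypothetical, and "compute variance, THEN fit" is read here as using squared deviations from the no-slope model (just the mean of Y) as the variance estimates, which is one interpretation of the bullet above.

```python
import math


def fit_line(x, y):
    """Ordinary LS of y on x; returns (intercept, slope, residuals)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, resid


def sandwich_slope_se(x, var_y):
    """Square root of the slope entry of (X'X)^{-1} X' diag(var_y) X (X'X)^{-1}.

    For a line with an intercept this entry reduces to
    sum((x_i - xbar)^2 * var_y_i) / Sxx^2.
    """
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    meat = sum((xi - xbar) ** 2 * vi for xi, vi in zip(x, var_y))
    return math.sqrt(meat / sxx ** 2)


if __name__ == "__main__":
    # Invented data: a single observation at X = 1, four at X = 0.
    x = [1.0, 0.0, 0.0, 0.0, 0.0]
    y = [10.0, 1.0, -1.0, 2.0, -2.0]

    # Fit, then compute variance: squared residuals from the fitted line.
    _, slope, resid = fit_line(x, y)
    se_after = sandwich_slope_se(x, [e ** 2 for e in resid])

    # Compute variance, then fit: squared deviations from the mean of y.
    ybar = sum(y) / len(y)
    se_before = sandwich_slope_se(x, [(yi - ybar) ** 2 for yi in y])

    print("residual at X = 1:    ", resid[0])
    print("slope:                ", slope)
    print("SE, fit then variance:", se_after)
    print("SE, variance then fit:", se_before)
```

Fitting first leaves a zero residual at the lone X = 1 point, so the SE only sees the quiet points at X = 0. Computing the variance before fitting keeps the large deviation at X = 1 in the estimate, which gives a much larger SE.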
Third solution: Use both!
- First rescale by doing weighted least squares
- Then use the sandwich estimator on the resulting Y's
- If your weights are wrong, the sandwich SEs should still be valid.
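A small simulation can illustrate the combined approach. The sketch below assumes a single predictor, an invented variance pattern (sigma_i = 0.5 + 2 x_i), and deliberately wrong working weights; the function names are hypothetical. It compares the average sandwich SE of the weighted slope with the slope's actual spread across simulated data sets.

```python
import math
import random


def wls_sandwich(x, y, w):
    """Weighted LS slope with a sandwich SE.

    w are working weights (ideally 1/sigma_i^2, but they may be wrong).
    The slope equals sum(c_i * y_i) with c_i = w_i (x_i - xbar_w) / Sxx_w,
    so its variance is estimated by sum(c_i^2 * e_i^2).
    """
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    c = [wi * (xi - xbar) / sxx for wi, xi in zip(w, x)]
    slope = sum(ci * yi for ci, yi in zip(c, y))
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    se = math.sqrt(sum((ci * ei) ** 2 for ci, ei in zip(c, resid)))
    return slope, se


def compare_se(n=200, reps=500, seed=0):
    """Return (actual SD of the slope, average sandwich SE) over reps data sets."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    true_sigma = [0.5 + 2.0 * xi for xi in x]       # assumed truth
    wrong_w = [1.0 / (1.0 + xi) ** 2 for xi in x]   # deliberately misspecified
    slopes, ses = [], []
    for _ in range(reps):
        y = [1.0 + 2.0 * xi + s * rng.gauss(0.0, 1.0)
             for xi, s in zip(x, true_sigma)]
        b, se = wls_sandwich(x, y, wrong_w)
        slopes.append(b)
        ses.append(se)
    mean_b = sum(slopes) / reps
    actual_sd = math.sqrt(sum((b - mean_b) ** 2 for b in slopes) / (reps - 1))
    return actual_sd, sum(ses) / reps


if __name__ == "__main__":
    actual_sd, mean_se = compare_se()
    print("actual SD of slope: ", actual_sd)
    print("average sandwich SE:", mean_se)
```

The two numbers should be close even though the weights are wrong. Better weights would make both smaller, which is the reason to do the weighting step at all.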
Foster, D. P. and Stine, R. A. (2001), "Variable selection in data
mining: Building a predictive model for bankruptcy."
Long, J. S. and Ervin, L. H. (2000), "Using heteroscedasticity consistent
standard errors in the linear regression model," The American
Statistician, 54, 217 - 224.
White, H. (1980), "A heteroskedasticity-consistent covariance matrix estimator
and a direct test for heteroskedasticity," Econometrica, 48, 817 - 838.
Last modified: Tue Feb 20 08:44:47 2001