********************(c) Jan Kabatek, Universiteit van Tilburg ***********************
* This is a collinearity example for Econometrics 1 for research masters. *
* *
* *
* Version 8.11.2013, j.kabatek@uvt.nl, K606 *
*************************************************************************************
*preliminary commands
clear all //clear any data which are still in memory
set more off //disable "more" command
set obs 1000 //number of observations in the simulated dataset
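*fixing the random-number seed makes the simulated draws reproducible across runs.
*NOTE: the seed is an addition to the original script and the value 12345 is an
*arbitrary assumption; any other integer serves the same purpose.
set seed 12345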
*simulating data
gen x1 = rnormal(1,0.5)
gen x2 = rnormal(0.1,0.1)
gen e = rnormal(0,1)
*DGP: x1 has a positive effect on y, x2 has a negative effect.
gen y = 1 + 5*x1 - 3*x2 + e
*introducing collinearity: the second regressor now contains x1.
replace x2 = x2 + x1
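*a quick check of how strong the induced collinearity is (an illustrative addition,
*not part of the original script). Given the simulation parameters, the population
*correlation is 0.25/sqrt(0.25*0.26), i.e. roughly 0.98; the sample value will differ
*slightly from run to run.
corr x1 x2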
reg y x1 x2
*by Frisch-Waugh-Lovell, the coefficient of x2 is identified from the variation in x2
*that remains after partialling out x1 and the constant: (x2'*M_1i*y)/(x2'*M_1i*x2).
*The coefficient of x1 is inflated (about 8 instead of 5) because the new x2 contains
*x1: y = 1 + 5*x1 - 3*(x2 - x1) + e = 1 + 8*x1 - 3*x2 + e.
*The predictions however remain unbiased (exogeneity holds).
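*a sketch of two follow-up checks (illustrative additions, not part of the original
*script; the variable name x2_res is a hypothetical choice).
*1) variance inflation factors quantify how much the collinearity inflates the
*   variance of each coefficient estimate (must be run right after the regression):
estat vif
*2) partialling-out by hand: regress x2 on x1 (and the constant), keep the residuals,
*   and regress y on them. The slope should coincide with the coefficient of x2 in
*   the full regression above; the standard errors differ slightly because the
*   degrees of freedom are not the same.
reg x2 x1
predict x2_res, residuals
reg y x2_res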
corr y x2
*in the regression, the coefficient of x2 is correctly negative. The raw correlation
*between y and x2 is positive, however: the negative DGP effect of x2 is overridden by
*the positive effect of x1 on y, combined with the strong collinearity between x1 and x2.
*Moral of the story: do not interpret beta coefficients in terms of correlations,
*unless you are running a simple linear regression (a constant and one regressor x).
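*a sketch of the simple-regression case mentioned above (an illustrative addition;
*the scalar names rho, sd_y and sd_x2 are hypothetical). With a constant and a single
*regressor, the OLS slope equals corr(y,x2)*sd(y)/sd(x2), so it always has the same
*sign as the correlation.
quietly corr y x2
scalar rho = r(rho)
quietly summarize y
scalar sd_y = r(sd)
quietly summarize x2
scalar sd_x2 = r(sd)
reg y x2
display "corr*sd(y)/sd(x2) = " rho*sd_y/sd_x2 "   slope from reg = " _b[x2]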