Examining Residuals from Regression Analysis Using Stata
Using Stata to examine the residuals from a regression analysis is a relatively straightforward task. It requires several steps, but working through them gives the analyst a much better feel for the analysis than a canned package does.
We will use the Presidential Approval data set from last year's final for this analysis.
We will follow these steps:
- Obtain the data
- Examine the data
- Print the results
- Plot the data
- Run the regression analysis
- Save the residuals
- Examine the residuals
- Correct for the problem
The data set is easy to get (using Internet Explorer; Netscape needs to be properly configured before it can be used):
Presidential Approval 1949-1985 (Sorry, it is an old data set!)
First, you always need to look at your data. Since this is a time series data set, we need to look at plots of the data as well. [Note: Stata code lines are in bold italics.]
summ approval unemrate realgnp milforce, d
These results are written to the results window. Examine them there.
If your results look acceptable (you got the correct output), open the log file using the log icon/button at the top. Rerun the command with the log file open. Then, if desired, print the log file. The results window can also be printed directly.
Note: You can open and close the log file. When you do, you can choose between appending the next run to the log, or overwriting the entire log.
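The same logging steps can also be typed as commands instead of using the button. This is only a sketch: the file name residuals.log is a hypothetical choice, and the text option is used here so the log is saved as plain text.

```stata
* Open a plain-text log; append adds this run to the end of an existing log
* (residuals.log is a hypothetical file name)
log using residuals.log, text append

* Rerun the command so its output is captured in the log
summarize approval unemrate realgnp milforce, detail

* Close the log before printing it
log close

* To overwrite the entire log instead of appending, open it with:
* log using residuals.log, text replace
```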
Plotting the data
Plot the data series in question.
graph twoway line approval year
or simply
line approval year
line unemrate year
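Because this is a time series, the data can also be declared as such and plotted with the time-series plotting command. The sketch below assumes there is exactly one observation per year; tsset will refuse to run if year values repeat.

```stata
* Declare year as the time variable (assumes one observation per year)
tsset year

* tsline plots each series against the declared time variable
tsline approval
tsline unemrate
```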
The format of the regression command is straightforward: regress dependentVar independentVars
regress approval unemrate
Some supplementary statistics are also available. For example,
regress approval unemrate, b
produces betas (standardized regression coefficients).
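A standardized coefficient is the ordinary slope multiplied by the ratio of the standard deviation of the independent variable to that of the dependent variable. The following sketch recomputes it by hand for comparison with the Beta column; the scalar names b1, sx, and sy are illustrative only.

```stata
* Fit the model and keep the unstandardized slope
quietly regress approval unemrate
scalar b1 = _b[unemrate]

* Standard deviations over the estimation sample only
quietly summarize unemrate if e(sample)
scalar sx = r(sd)
quietly summarize approval if e(sample)
scalar sy = r(sd)

* beta = b * sd(x) / sd(y)
display "standardized coefficient = " b1*sx/sy
```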
After the regress statement, use the predict statement.
predict res, r
This will save the residuals from the analysis as a new variable res.
We can also save the predicted values of the dependent variable by
predict yhat
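Each residual is simply the observed value minus the predicted value. As a quick check, assuming res and yhat were created by the commands above, the residuals can be rebuilt by hand; res_check is a hypothetical variable name.

```stata
* Residual = observed approval minus fitted approval
generate double res_check = approval - yhat

* The two variables should agree up to rounding,
* and OLS residuals should have a mean of essentially zero
summarize res res_check
```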
First, simply look at the residuals
summ res, d
plot res year
Then perform statistical tests for skewness and kurtosis
sktest res
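Graphs and an additional test can complement sktest when judging normality. This sketch assumes the residuals were saved as res, as above.

```stata
* Histogram of the residuals with a normal density overlaid
histogram res, normal

* Quantile-normal plot: points near the reference line suggest normality
qnorm res

* Shapiro-Wilk test as another check on normality
swilk res
```

In each of these tests the null hypothesis is that the residuals are normally distributed, so a small p-value is evidence against normality.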
Look at some other residual plots and try to figure out how well the model works.
plot res yhat
plot res approval
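Stata also has graph-window equivalents of these plots, and with time series data it is worth checking the residuals for serial correlation as well. The sketch below is one possible approach, not a required part of the procedure; it assumes regress was the most recent estimation command and that there is one observation per year.

```stata
* Residuals against fitted values, with a reference line at zero
rvfplot, yline(0)

* Scatterplot version of the residuals against the saved fitted values
scatter res yhat

* Durbin-Watson statistic for first-order serial correlation
* (requires the data to be declared as a time series)
tsset year
estat dwatson
```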
Look at another data set! State Crime Rates
How to correct for non-normal errors
Transform the data
Bootstrap it!
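A minimal sketch of both remedies follows. The choice of a log transformation of unemrate, the new variable name ln_unemrate, and the number of replications and the seed are all assumptions made for illustration.

```stata
* Transform: compare candidate transformations graphically,
* then refit with the logged variable (ln_unemrate is a hypothetical name)
gladder unemrate
generate ln_unemrate = ln(unemrate)
regress approval ln_unemrate

* Bootstrap: resample observations to get standard errors that do not
* rely on normally distributed errors (reps and seed chosen arbitrarily)
bootstrap, reps(1000) seed(12345): regress approval unemrate
```

Note that this form of the bootstrap resamples observations as if they were independent, which is a strong assumption for time series data.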