Examining Residuals from Regression Analysis Using Stata

PS602

Using Stata to examine the residuals from a regression analysis is relatively straightforward. It takes a number of steps, but working through them gives the analyst a much better feel for the analysis than a canned package does.

We will use the Presidential Approval data set from last year's final for this analysis.

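The steps we will follow are:

  1. Obtain the data
  2. Examine the data
  3. Run the regression analysis
  4. Save the residuals
  5. Examine the residuals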


Obtain the data

The data set is easy to get (using Internet Explorer; Netscape needs to be properly configured before it can be used).

Presidential Approval 1949-1985 (Sorry it is an old data set!)
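
Once the file is on your machine, it has to be loaded into Stata. The lines below are only a sketch: they assume the data were saved as a Stata data set in the current working directory, and the file name approval.dta is a placeholder, not the actual name of the course file.

```stata
* approval.dta is a hypothetical file name; substitute the name you saved
use "approval.dta", clear

* list the variables, their storage types and labels
describe

* look at the first few observations
list in 1/5
```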


Examine the data

First, you always need to look at your data. Since this is a time-series data set, we need to look at plots of the data as well. [Note: Stata code lines are in bold italics.]

  1. Simple descriptive statistics: Obtain means, standard deviations, and the number of observations. Check whether each variable looks roughly normally distributed. Does the data need to be N(mean,var)?

summ approval unemrate realgnp milforce, d

These results are written to the results window. Examine them there.
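
The detailed summary also leaves its numbers behind in r(), so they can be picked up individually. As a rough sketch (using approval only as an example), the lines below display the pieces most relevant to the normality question. A normal distribution has skewness 0 and kurtosis 3 on the scale Stata reports.

```stata
* run the detailed summary without printing the full table
quietly summarize approval, detail

* the stored results are available until the next r-class command
display "N = " r(N) "   mean = " r(mean) "   sd = " r(sd)
display "skewness = " r(skewness) "   kurtosis = " r(kurtosis)
```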


Printing results

If your results look acceptable (you got the correct output), open the log file. See the log icon/button at the top. Rerun the command with the log file open. Then, if desired, print the log file. The results window can also be printed directly.

Note: You can open and close the log file. When you do, you can choose between appending the next run to the log, or overwriting the entire log.
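
The same thing can be done from the command line rather than the button. This is a minimal sketch; the file name residuals.log is a made-up example. A name ending in .log gives a plain-text log that prints cleanly.

```stata
* open a new log, overwriting any existing file of that name
log using residuals.log, replace
summarize approval unemrate realgnp milforce, detail
log close

* later: reopen the same file and add to the end of it
log using residuals.log, append
summarize approval, detail
log close
```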

Plotting the data
Plot the data series in question.

  1. Plot Approval

    graph twoway line approval year

    or simply

    line approval year

  2. Unemployment

line unemrate year
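
It can also help to see both series on one graph. The sketch below assumes there is exactly one observation per year, so that year can be declared as the time variable; if the data are quarterly or monthly, tsset will complain about repeated time values and a different time variable is needed.

```stata
* declare the data to be a time series indexed by year
tsset year

* time-series line plot; the time variable is taken from tsset
tsline approval

* both series together, with unemployment on a second y axis
twoway (line approval year) (line unemrate year, yaxis(2))
```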

Run the Regression analysis

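The format is straightforward: regress depvar indepvars, where depvar is the dependent variable and indepvars is the list of independent variables. Stata commands are case sensitive, so regress must be typed in lowercase.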

regress approval unemrate

Some supplementary statistics are also available

regress approval unemrate, beta

produces betas (standardized regression coefficients).
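
To see where a beta comes from, it can be rebuilt by hand: it is the ordinary slope multiplied by the standard deviation of the independent variable and divided by the standard deviation of the dependent variable. The scalar names below (b_unem, sd_x, sd_y) are made up for this illustration.

```stata
* refit the model quietly and keep the unstandardized slope
quietly regress approval unemrate
scalar b_unem = _b[unemrate]

* standard deviations over the observations used in the regression
quietly summarize unemrate if e(sample)
scalar sd_x = r(sd)
quietly summarize approval if e(sample)
scalar sd_y = r(sd)

* standardized coefficient = slope * sd(x) / sd(y)
display "standardized beta = " b_unem * sd_x / sd_y

* with a single regressor this equals the simple correlation
correlate approval unemrate
```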


Save the Residuals

After the regress statement, use the predict statement.

predict res, r

This will save the residuals from the analysis as a new variable res.

We can also save the predicted values of the dependent variable by

predict yhat
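
A quick check that these two variables are what we think they are: the residual should be the observed value minus the predicted value, and with a constant in the model the residuals should average to zero. The variable names res_check and diff below are invented for this sketch.

```stata
* rebuild the residual by hand from the saved predictions
generate double res_check = approval - yhat

* the difference should be zero apart from rounding in storage
generate double diff = res - res_check
summarize diff

* the mean of the residuals should be essentially zero
summarize res
```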

Examine the Residuals

First, simply look at the residuals

summ res, d

plot res year

Then perform statistical tests for skewness and kurtosis

sktest res
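
A picture often says more than the test alone. As a sketch of two common graphical checks, the lines below draw a histogram of the residuals with a normal curve laid over it, and a normal quantile plot. If the residuals are close to normal, the points in the quantile plot should fall near the straight line.

```stata
* histogram of the residuals with a normal density overlaid
histogram res, normal

* quantiles of the residuals against quantiles of a normal
qnorm res
```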

Look at some other residual plots and try to figure out how well the model works.

plot res yhat

plot res approval
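
Stata also has graphics versions of these plots. The sketch below assumes the regress command above is still the most recent estimation, since rvfplot and rvpplot work from the last model fitted. Keep in mind that residuals are correlated with the dependent variable by construction, so an upward drift in the plot against approval is expected and is not by itself a sign of trouble.

```stata
* residuals against fitted values, with a reference line at zero
rvfplot, yline(0)

* residuals against the independent variable
rvpplot unemrate, yline(0)

* the same idea using the saved variables directly
scatter res yhat, yline(0)
scatter res approval, yline(0)
```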

Look at another data set! State Crime Rates

How to correct for non-normal errors

Transform the data

Bootstrap it!
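
A rough sketch of both ideas follows. The choice of a log transformation of approval is only an illustration, not a recommendation for this data set, and the names ln_approval and res_ln are made up. The bootstrap prefix syntax shown requires Stata 9 or later, and the number of replications and the seed are arbitrary. Note also that a simple bootstrap resamples observations as if they were independent, which is a strong assumption for time-series data.

```stata
* transformation: refit with the log of the dependent variable
generate ln_approval = ln(approval)
regress ln_approval unemrate
predict res_ln, residuals
sktest res_ln

* bootstrap: resample observations to get standard errors
* that do not rely on normally distributed errors
bootstrap, reps(1000) seed(12345): regress approval unemrate
```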