An Introduction to STATA

Stata is a fairly straightforward statistical package that offers a wide range of statistical procedures. It combines features of graphical user interface (GUI) software with the more traditional (read: legacy) command- or syntax-based approach. Both approaches have their utility, and Stata is flexible enough to let the user rely on either fairly comfortably.

Stata also offers useful online support on its website.

This latter document makes a good tutorial.

This web page provides a few basics:

First, find and start Stata (in other words, either find Stata in the Start > Programs hierarchy or double-click the Stata icon).

This brings up the Stata "desktop."

The desktop has several components

In addition, Stata has some other windows you will need to use:

The Menu Bar

Like most Windows programs, Stata provides a set of menu items in the bar at the top of the window. Run through them, looking at the options available to you. Note especially the text label that appears if you pause the mouse over a button on the bar.

The Command Window

You run Stata by typing commands in the Command Window. The first command you enter will likely be one to open a data set. For instance, to open the class data set on presidential approval, type

use "C:\WWW\DUVAL\ps601\Notes\Stata\presapp.dta", clear

Note that this command can also be executed with the File > Open menu item in the Menu Bar.

Also examine the Stata Results, Variables, and Review Windows after executing this command.
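For readers without access to presapp.dta, the same steps can be tried on a data set that ships with Stata. The sketch below is only an illustration: it assumes the bundled auto data (loaded with sysuse) stands in for the class data set, and the variables price and mpg belong to that bundled file, not to presapp.dta.

```stata
* Load the example data that ships with Stata (a stand-in for presapp.dta)
sysuse auto, clear

* The clear option discards any data already in memory before loading.
* describe lists variable names, storage types, and labels.
describe

* summarize reports observations, mean, std. dev., min, and max.
summarize price mpg
```

After running these lines, the Variables window should list the variables in the file and the Review window should show the three commands.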

The Stata Results Window

The Variables Window

The Stata Variables window simply lists all the variables in the Stata data set that is currently open. Variable names can be added to the Command window, or to GUI procedure screens, by double-clicking variable names in this window.

The Review Window

The Stata Review Window lists all of the Stata commands that have been executed since Stata opened. These can be repeated by double-clicking them and then clicking into the Command Window and hitting Enter.

Some Useful commands:

There is a basic subset of commands that are useful in working with Stata. In general, they fall into four groups:

File commands
  • use
  • insheet
Data manipulation
  • generate
  • replace
  • sort
Statistical commands
  • summ
  • regress
  • dwstat
  • prais
Misc utilities
  • sw

There is a data set available for exploring these commands, and several commands listed below have examples that use it. Try each command and see how Stata responds. The data set is: presapp.dta (Stata data set)

Type Command Description
File Commands  

  use  Opens a Stata data set

e.g. use e:\www\duval\ps401\notes\spreadsheets\presapp.dta


  insheet  Imports a delimited text file saved from a spreadsheet

e.g. insheet using e:\www\duval\ps401\notes\spreadsheets\presapp.txt

  describe  Lists and describes the variables in the data file.

  tsset  Declares a data set to be a time series data set

tsset time

time is the variable that uniquely indicates temporal order (year, or t, etc.; not day, week, month, or quarter).
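The file commands above can be tried without the course files. The following is a minimal sketch under stated assumptions: it uses Stata's bundled auto data, writes a scratch file with the hypothetical name auto_copy.txt to the current working directory, and builds a made-up yearly series purely to demonstrate tsset.

```stata
* Round trip through a delimited text file using the bundled auto data
sysuse auto, clear
outsheet using auto_copy.txt, replace
insheet using auto_copy.txt, clear
describe

* Build a small artificial yearly series and declare it as time series data
clear
set obs 20
generate year = 1980 + _n
tsset year
```

Newer Stata releases also document import delimited and export delimited for the same purpose; insheet and outsheet are kept here to match the commands named in this page.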

Data manipulation  
  sort  Sorts the data

  generate  Generates a new variable

gen x3 = x1 + x2


  replace  Replaces an already existing variable with new information

replace x3 = x1 + x2

replaces the earlier version of x3
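To see how generate, replace, and sort interact, it can help to run them on a tiny artificial data set. The sketch below assumes nothing about presapp.dta; the five observations and the formulas for x1 and x2 are invented for illustration.

```stata
* Create five empty observations and fill in two variables
clear
set obs 5
generate x1 = _n
generate x2 = 10 - 2*_n

* generate creates x3; replace then overwrites its contents
generate x3 = x1 + x2
replace x3 = x1 * x2

* sort orders the observations by ascending x2
sort x2
list x1 x2 x3
```

Note that generate fails if x3 already exists, which is why a second assignment to the same variable has to use replace.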

Statistical commands  

  corr  Produces a Pearson correlation matrix

corr y x1-x5

produces the matrix for all six variables


  regress  Regression analysis

reg y x1-x3

regresses y on x1, x2, and x3

  dwstat  Calculates a Durbin-Watson statistic for the previous regression model

  prais  Runs a correction for autocorrelation

prais y x1 x2, corc

the corc option specifies the Cochrane-Orcutt method
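The statistical commands above are typically run in sequence on time series data. The sketch below is one possible walk-through on simulated data, not the course data: the seed, sample size, and coefficients are arbitrary assumptions, rnormal() assumes a reasonably recent Stata release, and estat dwatson is used because it is the form current Stata documentation lists for the Durbin-Watson statistic after regress (the older dwstat name is the one used in this page).

```stata
* Simulate a small yearly data set (all values are made up)
clear
set obs 60
set seed 12345
generate year = 1940 + _n
tsset year
generate x1 = rnormal()
generate x2 = rnormal()
generate y = 1 + 0.5*x1 - 0.3*x2 + rnormal()

* Correlation matrix, then ordinary least squares
corr y x1 x2
regress y x1 x2

* Durbin-Watson statistic for the regression just fitted
estat dwatson

* Re-estimate with the Cochrane-Orcutt correction
prais y x1 x2, corc
```

Because the simulated errors are independent, the Durbin-Watson statistic should be near 2 and the prais estimates should resemble the regress estimates; with real autocorrelated data the two sets of results can differ noticeably.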

Miscellaneous Utilities  

  sw  Stepwise: wraps a command in a stepwise iteration loop; only the final step appears to be available.

sw reg y x1 x2 x3, pr(.2) pe(.05)

runs a stepwise regression with p<.05 needed for inclusion and p>.20 needed for removal
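In newer Stata releases, stepwise estimation is documented as a prefix command, with the options placed before the colon. The sketch below assumes such a release and uses the bundled auto data; the choice of price, mpg, weight, and length is only for illustration.

```stata
* Stepwise selection written in prefix form
* pe() is the p-value required for entry, pr() the p-value for removal
sysuse auto, clear
stepwise, pr(.2) pe(.05): regress price mpg weight length
```

The results left in memory are those of the final model, so follow-up commands such as predict apply to the last step only.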

Stata commands are executed by typing them in the Stata Command window and then pressing Enter. For instance, to describe the data in the open data set, type

describe

which produces a listing of the variables, with their storage types and labels, in the Stata Results window.

Useful Statistical Procedures