PS 100
PS100: Empirical Political Analysis
Instructions for Homework Assignment #1

Acquiring data from the Web

These instructions presume that the reader is using Netscape© 2.0 or higher, Microsoft Windows© 3.1 or Windows95© operating systems and Quattro Pro 6.0© for the spreadsheet. If you are using alternate software, the specific instructions will vary, but conceptually the process is the same.

First, find your data source. Try one of the following:

Next, block the data with the cursor so that you can cut-and-paste the data you desire. Using the Edit menu item on Netscape to Copy the data (This puts a copy of the blocked portion of the Web page into Windows Clipboard).

Now minimize Netscape and open your spreadsheet. After the new spreadsheet has opened, click the pointer on a cell, and then use the Paste button to copy the data into the spreadsheet. If this has gone well, you should see your data appear in the spreadsheet. If the data is there, you can now close Netscape.

Note that your data now appears as a single column of labels. It will likely spread over several columns in width, but all the information in contained in one cell per row. It is also likely that the alignment for the columns has been altered. This can be adjusted in one of two ways: (1) Strip out all column totals and subtotals, and (2) if still needed, set the spreadsheet font to

       Courier 10.
With luck, your data's column alignment is still preserved. Without it, you need to add spaces so that the data may be parsed. If you are using the Percapita Income example, this will preserve the column formating needed for parsing.

Parsing the Data

Data that is cut-and-pasted from a Web page into a spreadsheet is not yet ready for analysis because it is stored as ASCII text (or labels) in the spreadsheet and generally all values are stored in one column. These text labels need to be converted to labels by the process of parsing. To Parse the data:

  1. Move your blinking cursor to the top left cell of your data set. In all likelihood this cell contains all of the data for each column, and thus it must be parsed, or separated into columns of numbers rather than a single column of labels. (in this example, I have cut out all totals except for the US and removed most descriptive information. In addition, I pasted my data in with a blank column on the left, so that my 50 state data begins in cell B9, and ends in cell B60. Depending on how you copied thae data, your spreadsheet may differ slightly. Simply ensure that you know where your data is (i.e. columns b9..b60)
  2. Click the Data/Parse Menu item
  3. Click the Create button in the Parse dialog box. This will add a line that looks something like this directly above the data.
    
    
    	******L>>>>>*L>>>>>*L>>>>***V>>>>***V>>>>>***V>>>>>****V>>>>>
    This is the parsing edit line. The L's indicate columns of Labels, the V's indicate the beginning of variables, the >'s indicate how wide the data columns will be, and the *'s indicate empty space between the columns. For this example, we need to alter the edit line so that we have only one column of labels for the state names. Blank spaces between the state names will not affect this, but note that the >'s must extend as far as the longest state name. The modified edit line should look like this for the example.
    	***L>>>>>>>>>>>>>>>>>>>>>**V>>>>>***V>>>>>***V>>>>>****V>>>>>
  4. Using the cursor or input line, ensure that the Input Block contains the address of the columns for the data. (In the example, this will be B9..B61. - Note that the edit line adds a new line to your data and this must be included in the Input Block.)
  5. Specify the place for the parsed data to be written. Make sure that you do not overwrite anything of value. (In my case I will use cell J10. It is out in the blank area to the right of the first line of data.)
  6. Click OK
    • If all has gone well, your data is now stored as a set of values beginning in the Output Block cell, rather than a single long row label. You may delete the column of labels if desired.
    • If this didn't work, retrace your steps, and try to see where your example differs from mine.
    • If this still does not work, ask someone for help!
This basic procedures shows how data acquisition can be greatly simplified by the use of the World Wide Web. As the Web grows in scope, complexity, and sophistication more data that is even easier to obtain will become available. I recommend that you use this basic technique to obtain the data for the rest of your assignments.