PS 400

PS400: Introductory Quantitative Methods
Instructions for Homework Assignment #1
Acquiring data from the Web

 
These instructions presume that the reader is using Netscape© 2.0 or higher, Windows95© or NT© operating systems and Corel's Quattro Pro 7.0© for the spreadsheet. If you are using alternate software, the specific instructions will vary, but conceptually the process is the same.

 First, find your data source. Try one of the following:

Next, block the data with the cursor so that you can cut-and-paste the data you desire. Using the Edit menu item on Netscape to Copy the data (This puts a copy of the blocked portion of the Web page into Windows Clipboard).

 Now minimize Netscape and open your spreadsheet. After the new spreadsheet has opened, click the pointer on a cell, and then use the Paste button to copy the data into the spreadsheet. If this has gone well, you should see your data appear in the spreadsheet. If the data is there, you can now close Netscape.

Note that your data now appears as a single column of labels. It will likely spread over several columns in width, but all the information in contained in one cell per row. It is also likely that the alignment for the columns has been altered. This can be adjusted in one of two ways: (1) Strip out all column totals and subtotals, and (2) if still needed, set the spreadsheet font to

       Courier 10.

With luck, your data's column alignment is still preserved. Without it, you need to add spaces so that the data may be parsed. If you are using the Percapita Income example, this will preserve the column formating needed for parsing.

 Parsing the Data - Instructions

Instructions for Quattro Pro 6

Data that is cut-and-pasted from a Web page into a spreadsheet is not yet ready for analysis because it is stored as ASCII text (or labels) in the spreadsheet and generally all values are stored in one column. These text labels need to be converted to labels by the process of parsing. To Parse the data:

  1. Move your blinking cursor to the top left cell of your data set. In all likelihood this cell contains all of the data for each column, and thus it must be parsed, or separated into columns of numbers rather than a single column of labels. (in this example, I have cut out all totals except for the US and removed most descriptive information. In addition, I pasted my data in with a blank column on the left, so that my 50 state data begins in cell B9, and ends in cell B60. Depending on how you copied thae data, your spreadsheet may differ slightly. Simply ensure that you know where your data is (i.e. columns b9..b60)
  2. Click the Data/Parse Menu item
  3. Click the Create button in the Parse dialog box. This will add a line that looks something like this directly above the data.
            ******L>>>>>*L>>>>>*L>>>>***V>>>>***V>>>>>***V>>>>>****V>>>>>
    This is the parsing edit line. The L's indicate columns of Labels, the V's indicate the beginning of variables, the >'s indicate how wide the data columns will be, and the *'s indicate empty space between the columns. For this example, we need to alter the edit line so that we have only one column of labels for the state names. Blank spaces between the state names will not affect this, but note that the >'s must extend as far as the longest state name. The modified edit line should look like this for the example.
            ***L>>>>>>>>>>>>>>>>>>>>>**V>>>>>***V>>>>>***V>>>>>****V>>>>>
  4. Using the cursor or input line, ensure that the Input Block contains the address of the columns for the data. (In the example, this will be B9..B61. - Note that the edit line adds a new line to your data and this must be included in the Input Block.)
  5. Specify the place for the parsed data to be written. Make sure that you do not overwrite anything of value. (In my case I will use cell J10. It is out in the blank area to the right of the first line of data.)
  6. Click OK

This basic procedures shows how data acquisition can be greatly simplified by the use of the World Wide Web. As the Web grows in scope, complexity, and sophistication more data that is even easier to obtain will become available. I recommend that you use this basic technique to obtain the data for the rest of your assignments.

Instructions for Quattro Pro 7

The newer versions of Quattro Pro simplify this process greatly.

    Click Data > Quick Columns

    You get the following dialog box

    Then specify:

 The parsed data are now in the columns specified in the Destination Box and the raw data are still in the original column (if the destination block was a different column).

Instructions for Quattro Pro 8

The newer versions of Quattro Pro simplify this process greatly.

    Click Tools > Data Tools > Quick Columns

You get the following dialog box

    Then specify:

 The parsed data are now in the columns specified in the Destination Box and the raw data are still in the original column (if the destination block was a different column).

Instructions for Excel

Excell does all this quite easily.

Open up Excel

Click Data > Get External Data > New Web Query

When Netscape or IE open up, find the table you want, and copy and pat the URL back into the Get Data Dialogue Box.

Click the Only the Tables and None Radio buttons. Click OK.

Indicate where you want the data

The Data is now in Excel, probaly all in the 1st column as text.

Delete unnecessary rows

Click Data > Text to Columns > Fixed Width > Next

Check columns and click next.

With luck, it is all there. Double check for problems.