1 Starting Up

The first thing we will do is launch SPSS. The university machines have SPSS 24. Find the relevant SPSS icon and click (or double-click) to launch SPSS. If you see a popup window that asks about Unicode encoding, go ahead and click Use Unicode Encoding.

2 Opening a Dataset

2.1 SPSS Format

Data saved in SPSS format have their own file extensions: .sav and .por. If you have an SPSS data file, you can open it and launch SPSS at the same time by double-clicking the data file. Let’s do it now with hs1.sav.

2.2 Excel files

Go to File > Open > Data… and, in the Open Data dialog box, set Files of type to Excel, select the file you want, and then click Open. If a follow-up dialogue box asks about variable names and your variable names are in the first row of data, select the Read variable names from the first row of data check box.

2.3 csv or txt or dat files

So long as your csv or txt or dat file has column delimiters (tabs, commas, spaces, etc.) that indicate where one variable finishes and another variable starts, you should have little trouble. If your file lacks this information you will have to manually split the variables (and this is a tedious process). Let’s assume we have a clean csv/txt file to work with.

Click File > Read Text Data and you will see the Open Data window. By default SPSS will have selected “Text (*.txt, *.dat, *.csv)”. Locate your file, click the file to select it, and then click OK.

Follow the prompts as they appear in the Text Import Wizard dialogue box. If all goes well and you see the data in SPSS, be sure to save the file as an SPSS file before you end your session. Otherwise you will have to start from scratch the next time around.
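The role of the delimiter described in this section can be illustrated outside SPSS. What follows is only a minimal sketch in Python, not what the Text Import Wizard does internally; the sample data and the function name read_delimited are made up for the illustration.

```python
import csv
import io

def read_delimited(text, delimiter=","):
    """Split delimited text into variable names and a list of records."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    # The first row supplies the variable names; the rest are observations.
    return reader.fieldnames, list(reader)

# Hypothetical sample data: the same three variables, comma- and tab-delimited.
sample_csv = "id,gender,age\n1,f,22\n2,m,71\n"
sample_tab = "id\tgender\tage\n1\tf\t22\n2\tm\t71\n"

names, rows = read_delimited(sample_csv)
names_tab, rows_tab = read_delimited(sample_tab, delimiter="\t")

print(names)
print(rows[0])
```

Both versions yield the same variables and records because the delimiter tells the reader where one variable ends and the next begins. Note that every value arrives as text, which is why the variable Type still has to be checked after an import.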

3 Variable names, value labels, missing values, descriptions

You will see two tabs in the SPSS data window, one is the Data View and the other is the Variable View.

3.1 Variable View

This is the view you will work with when cleaning your data.

3.1.1 Name

Click (or double-click depending upon your computer system) on a variable’s name and you can edit this name.

3.1.2 Type, Width, Decimal Places

Click on Type and you can specify whether this variable should be treated as a numeric variable or a string variable. Other variable types are listed as well but you will rarely use these. Note that you can also specify how “wide” the variable is, and how many decimal places should be displayed.

3.1.3 Variable Labels

Here you can enter a description of the variable.

3.1.4 Value Labels

Click Values and you will see a dialogue box pop up. Here you can map a numeric value to its corresponding label. For example, say I have saved a student’s status (Freshman, Sophomore, etc.) as 1 if Freshman, 2 if Sophomore, and so on; I can now enter the value labels. Once I do this, every table or chart I create will show the actual labels rather than cryptic numeric values of 1, 2, etc.
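Conceptually, value labels are a lookup table from codes to text. Here is a small Python sketch of that idea. The codes 3 and 4 and their labels are assumptions added for the illustration, since the text names only Freshman and Sophomore.

```python
# Hypothetical lookup table; only codes 1 and 2 come from the text above.
STATUS_LABELS = {1: "Freshman", 2: "Sophomore", 3: "Junior", 4: "Senior"}

def apply_labels(codes, labels):
    """Return the label for each code, falling back to the raw code as text."""
    return [labels.get(code, str(code)) for code in codes]

status_codes = [1, 2, 2, 4, 1]  # made-up observations
labeled = apply_labels(status_codes, STATUS_LABELS)
print(labeled)
```

The stored data stay numeric. Only the display changes, which is why a code with no label still shows up as a bare number.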

3.1.5 Missing Values

If you have missing data, this allows you to tell SPSS how to distinguish between valid observations and observations that should be treated as having missing information on this particular variable. The most common missing value you will see is a dot (.), but some organizations use numbers such as 9, 999, 9999, -9, -99, or -9999 to flag missing values.
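To see why declaring these flags matters, consider this short Python sketch. The ages and the choice of -9 and 999 as missing codes are assumptions made up for the illustration.

```python
from statistics import mean

MISSING_CODES = {-9, 999}  # assumed flags for this illustration

def drop_missing(values, missing_codes):
    """Replace flagged codes (and blanks stored as None) with None."""
    return [None if v is None or v in missing_codes else v for v in values]

raw_ages = [22, 999, 35, -9, 71, None]  # made-up data
cleaned = drop_missing(raw_ages, MISSING_CODES)
valid = [v for v in cleaned if v is not None]

# Treating the flags as real ages distorts the mean badly.
naive_mean = mean(v for v in raw_ages if v is not None)
valid_mean = mean(valid)

print(cleaned)
print(naive_mean, valid_mean)
```

If the flags are left undeclared, the value 999 is averaged in as though someone were 999 years old.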

3.1.6 Measure

This column gives you a drop-down box that will allow you to specify the measurement level (Nominal, Ordinal, or Scale) for each variable.

The above options will be the ones you use most often so become familiar with them.

4 Creating/Modifying Variables

The Transform menu will allow you to perform several operations. The ones you will use most often will be recoding some variable into another, binning (i.e., grouping a numeric variable), or computing some value for a variable.

4.1 Recoding a variable

If you look at the gender variable in the Agresti and Finlay data, you will see values of f for females and m for males. Ideally we would save this information with numeric codes that are then labeled. For example, we would like to have 1 = Male and 2 = Female. Let us create a new variable, called sex, with this mapping.

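To do so, select Recode into Different Variables from the Transform menu and choose the input Variable (the one whose values you will use to create the new variable). Now set a name and label for the Output variable and click Change. Then click Old and New Values… and, for each old value, enter the old value and its new value and click Add: m becomes 1 and f becomes 2. Click Continue and then OK to create sex.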

Check the Variable View, click Values and now map 1 to Male and 2 to Female.
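The recode-then-label logic can be sketched in a few lines of Python. This illustrates the mapping only and is not SPSS syntax; the sample values, including the stray "x", are made up.

```python
RECODE_MAP = {"m": 1, "f": 2}          # old value -> new value
SEX_LABELS = {1: "Male", 2: "Female"}  # value labels for the new variable

def recode(values, mapping):
    """Return new codes; values absent from the mapping become None (missing)."""
    return [mapping.get(v) for v in values]

gender = ["f", "m", "m", "f", "x"]  # hypothetical data; "x" is an unexpected entry
sex = recode(gender, RECODE_MAP)
sex_labeled = [SEX_LABELS.get(code, "missing") for code in sex]

print(sex)
print(sex_labeled)
```

Keeping the mapping and the labels as two separate tables mirrors the two steps above: first recode the values, then label them in the Variable View. Any old value you forget to map ends up missing in the new variable, so it is worth running a frequency table on the result.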

4.2 Visual Binning

If you need to create groups out of a numeric variable, age, for example, Visual Binning will do this for you quite easily. Let us group age into specific categories. Start by clicking on Visual Binning and then select the age variable. You will see various attributes of age, including a histogram. The youngest person is 22 and the oldest is 71. Let us see what happens if we create age groups that run as follows: 20 - 30, 30 - 40, 40 - 50, 50+.

Click on Make Cutpoints… and specify where you want the first cutpoint, the number of cutpoints you want, and then the width you want. Once you do this and click OK, you’ll see how the original variable will be grouped. If you like the result, go ahead and save this new variable as grouped_age and click OK. Check the Variable View and you have your new variable. Now go in and create the labels for grouped_age.
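Binning by cutpoints can be sketched as follows in Python. Two assumptions are made here: the cutpoints are 30, 40, and 50 (first cutpoint 30, width 10), and each cutpoint is an upper endpoint that belongs to the lower group, so a 30-year-old falls in the first group. The group labels are made up for the illustration; check the endpoint setting in your own Visual Binning dialogue.

```python
from bisect import bisect_left

CUTPOINTS = [30, 40, 50]  # assumed: first cutpoint 30, width 10
GROUP_LABELS = {1: "30 or younger", 2: "31 - 40", 3: "41 - 50", 4: "51+"}

def bin_value(value, cutpoints):
    """Return the 1-based group number; a value equal to a cutpoint stays in the lower group."""
    return bisect_left(cutpoints, value) + 1

ages = [22, 30, 31, 40, 50, 51, 71]  # made-up ages within the 22 to 71 range
grouped_age = [bin_value(a, CUTPOINTS) for a in ages]

print(grouped_age)
print([GROUP_LABELS[g] for g in grouped_age])
```

Writing the groups this way makes it clear that ranges such as 20 - 30 and 30 - 40 cannot both contain 30. Someone has to decide which side the boundary value falls on.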

5 Descriptive Graphics

SPSS will allow you to create graphs in different ways. A good starting point is to use the Chart Builder under the Graphs menu. The first thing you will see is a warning message telling you to be sure to have set the measurement levels correctly for your data. Measurement levels determine what sort of graphic can be used for a variable; hence this warning. If all is well with your data, click OK.

The resulting dialogue box has two panes, and we’ll start with the lower pane. Here you see the chart Gallery that allows you to select the type of graph you want to build.

5.1 Bar Charts

Select Bar and then the type of bar chart you want. For now we’ll go with the default bar (the first one you see). Drag the selected bar chart into the upper pane. Now drag the new sex variable you created to the x-axis. You have your basic bar chart.

Note that the y-axis uses the frequency counts by default. You can change this to percentages since they are a lot easier for folks to interpret and make it clear which group dominates, which one has the smallest presence, etc. You can customize the chart by double-clicking the resulting graph to open various edit functions.

5.2 Pie Charts

A similar sequence will apply to pie charts, and you can customize these as well.

5.3 Histograms

Select Histogram instead and use the high school GPA variable. The second dialogue box, Element Properties, which opens to the left of the Chart Builder box, will allow you to superimpose the Normal curve, change the bars to whiskers, etc. At minimum, superimpose the Normal curve. Close this secondary dialogue box and then click OK.

5.4 Scatterplots

If you choose Scatter/Dot you can build scatterplots with one numeric variable on the x-axis and the second numeric variable on the y-axis. Several customization options are available here as well.

5.5 Boxplots

Boxplots can be built with a single numeric variable or by seeing how a numeric variable’s distribution differs between groups flagged by another variable. These graphs can be customized as well.

6 Descriptive Tabulations

You will typically have three basic types of tables: (a) frequency tables for a single variable, (b) cross-tabulations where you have two (and rarely three) variables, and (c) a table of summary statistics (mean, median, variance, etc.). In SPSS, you will find options for tables under the Analyze menu.

6.1 Frequency Tables

Go to Analyze, select Descriptive Statistics, and then Frequencies. Select the sex variable we created. On the right-hand side of the dialogue box you will see various sub-options. The only ones we want to tweak for now will be Charts… and Format…. If you want to generate a bar chart along with the frequency table, you can do so here. Likewise, you can organize the rows of the table by ascending/descending values of the variable, or by ascending/descending frequency counts.
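For readers who want to see what a frequency table computes, here is a minimal Python sketch. The data are made up, and the function name and its by_count option are hypothetical stand-ins for the ordering choices under Format….

```python
from collections import Counter

def frequency_table(values, by_count=False):
    """Return (value, count, percent) rows, ordered by value or by descending count."""
    counts = Counter(values)
    total = sum(counts.values())
    if by_count:
        items = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    else:
        items = sorted(counts.items())
    return [(value, n, round(100 * n / total, 1)) for value, n in items]

sex = ["Female", "Male", "Female", "Female", "Male"]  # made-up data
table = frequency_table(sex)

for value, n, pct in table:
    print(f"{value:<8}{n:>4}{pct:>8.1f}")
```

The percent column is just each count divided by the total, which is also what a bar chart shows when its y-axis is switched from counts to percentages.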

6.2 Grouped Frequency Tables

These tables are useful if you are using a grouped version of a numeric variable (grouped_age, for example) and can be constructed similarly to how the preceding frequency tables were constructed.

6.3 Cross-tabulations

Here we have two nominal/ordinal variables, for example grouped_age by the student’s sex. These tables can be generated via the Crosstabs… option under Descriptive Statistics. Select sex for Row(s) and grouped_age for Column(s). Make sure you select Display clustered bar charts before you click OK.
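A cross-tabulation simply counts how often each combination of row and column values occurs. The Python sketch below shows the idea with made-up data; the function name crosstab is hypothetical and nothing here is SPSS output.

```python
from collections import Counter

def crosstab(rows, cols):
    """Count each (row value, column value) pair; absent combinations count as 0."""
    cells = Counter(zip(rows, cols))
    row_levels = sorted(set(rows))
    col_levels = sorted(set(cols))
    return {r: {c: cells[(r, c)] for c in col_levels} for r in row_levels}

# Made-up data: one entry per student.
sex = ["Female", "Male", "Female", "Male", "Female"]
grouped_age = [1, 1, 2, 4, 2]

table = crosstab(sex, grouped_age)
for row_value, row_cells in table.items():
    print(row_value, row_cells)
```

Each inner dictionary is one row of the table. A clustered bar chart draws the same cell counts as side-by-side bars.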
