Much as we can test hypotheses that revolve around a continuous (scale) variable, so can we test hypotheses about proportions – where the variables happen to be nominal or ordinal. For example, is there a difference in the proportion of students who attend private schools versus public schools? Do equal proportions of Washington DC’s residents commute to work by car, walking, biking, or by public transportation? Were men just as likely to survive the Titanic sinking as women?
These tests are easy to run in SPSS; how to run each one depends on the testing situation, as shown below.
Here you have a single proportion, such as the proportion of passengers who survived the Titanic. In SPSS, this test can be run with the Non-parametric Tests >>> One Sample… option under Analyze.
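If you want to sanity-check the SPSS output outside the GUI, here is a minimal sketch of the equivalent one-sample proportion (exact binomial) test in Python, assuming scipy is installed; the Titanic counts are made up purely for illustration.

```python
from scipy.stats import binomtest

# Hypothetical counts, for illustration only: 342 survivors out of 891 passengers
survivors, total = 342, 891

# H0: the true survival proportion is 0.5
result = binomtest(survivors, n=total, p=0.5, alternative="two-sided")
print(f"observed proportion = {survivors / total:.3f}")
print(f"p-value = {result.pvalue:.4f}")
```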
Here, we have two groups. Maybe we want to test whether survival rates were equal for men versus women, or for adults versus children. These tests can be run via Analyze >>> Descriptive Statistics >>> Crosstabs… If both variables are nominal with two categories each, use lambda. If both are nominal but one has more than two categories, use Cramér's V. If each variable has five or more categories, use the Contingency Coefficient.
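As a rough cross-check of the Crosstabs output, here is a minimal sketch in Python (scipy assumed available) that builds a sex-by-survival table and computes Cramér's V; the counts are invented for illustration.

```python
import numpy as np
from scipy.stats import chi2_contingency
from scipy.stats.contingency import association

# Hypothetical 2x2 table: rows = sex (female, male), columns = (survived, died)
table = np.array([[233,  81],
                  [109, 468]])

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4f}")

# Cramér's V summarizes the strength of the association
print(f"Cramér's V = {association(table, method='cramer'):.3f}")
```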
If both are ordinal variables, then you want to use Kendall's \(\tau_{b}\).
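In Python, scipy's kendalltau computes the \(\tau_{b}\) variant by default; here is a minimal sketch with made-up ordinal codes.

```python
from scipy.stats import kendalltau

# Hypothetical ordinal codes, e.g. satisfaction (1-5) and income bracket (1-4)
satisfaction = [1, 2, 2, 3, 3, 4, 4, 5, 5, 5]
income       = [1, 1, 2, 2, 3, 2, 3, 3, 4, 4]

tau_b, p = kendalltau(satisfaction, income)  # tau-b adjusts for ties in both variables
print(f"Kendall's tau-b = {tau_b:.3f}, p = {p:.4f}")
```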
If we have a single variable with more than two categories, then we can test our hypotheses about its distribution via the Non-parametric Tests >>> One Sample… option under Analyze.
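The same idea outside SPSS is a chi-square goodness-of-fit test; here is a minimal sketch in Python, with invented counts for the commuting-mode example.

```python
from scipy.stats import chisquare

# Hypothetical observed counts of commuters by mode: car, walk, bike, public transit
observed = [410, 160, 70, 360]

# H0: all four modes are used in equal proportions
# (chisquare assumes uniform expected counts by default)
chi2, p = chisquare(observed)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```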
Here we have two categorical variables, each with two or more categories. Using the Crosstabs… function we can run the \(\chi^2\) test. Note that if any cell has expected frequencies \(< 5\), then Fisher's Exact test must be used. See a wonderful little example of how the test works here. Note also that people commonly think Fisher's Exact test can only be used with \(2 \times 2\) tables, but there is no such restriction underlying the test per se; it is just that large tables are computationally very demanding.
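For a small \(2 \times 2\) table, the equivalent check in Python is scipy's fisher_exact; here is a minimal sketch with invented counts.

```python
from scipy.stats import fisher_exact

# Hypothetical 2x2 table with small expected counts: rows = group A/B, columns = (yes, no)
table = [[3, 9],
         [7, 2]]

odds_ratio, p = fisher_exact(table, alternative="two-sided")
print(f"odds ratio = {odds_ratio:.2f}, p = {p:.4f}")
```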
Note that it makes sense to use the following statistics only if the \(H_0\) of no relationship between the two variables has been rejected. This is because these are measures of association that tell you how strong or weak the relationship between \(x\) and \(y\) is.
If you have Ordinal variables, and don’t care about speaking in terms of dependent versus independent variables, then:
If you are keen on speaking in terms of the dependent variable (must be the row variable) and the independent variable (must be the column variable), and these are Ordinal variables, then:
| Estimate | Interpretation |
|---|---|
| +0.70 or higher | Very strong positive relationship |
| +0.50 to +0.69 | Substantial positive relationship |
| +0.30 to +0.49 | Moderate positive relationship |
| +0.10 to +0.29 | Low positive relationship |
| +0.01 to +0.09 | Negligible positive relationship |
| 0.00 | No relationship |
| -0.01 to -0.09 | Negligible negative relationship |
| -0.10 to -0.29 | Low negative relationship |
| -0.30 to -0.49 | Moderate negative relationship |
| -0.50 to -0.69 | Substantial negative relationship |
| -0.70 or lower | Very strong negative relationship |
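The rule of thumb in the table is easy to apply programmatically; here is a small sketch in Python (the function name is hypothetical).

```python
def interpret_association(estimate: float) -> str:
    """Label an association estimate using the rule of thumb tabulated above."""
    strength = abs(estimate)
    if strength == 0:
        return "No relationship"
    direction = "positive" if estimate > 0 else "negative"
    if strength >= 0.70:
        label = "Very strong"
    elif strength >= 0.50:
        label = "Substantial"
    elif strength >= 0.30:
        label = "Moderate"
    elif strength >= 0.10:
        label = "Low"
    else:
        label = "Negligible"
    return f"{label} {direction} relationship"

print(interpret_association(0.42))   # Moderate positive relationship
print(interpret_association(-0.08))  # Negligible negative relationship
```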