Much as we can test hypotheses that revolve around a continuous (scale) variable, so can we test hypotheses about proportions – where the variables happen to be nominal or ordinal. For example, is there a difference in the proportion of students who attend private schools versus public schools? Do equal proportions of Washington DC’s residents commute to work by car, walking, biking, or by public transportation? Were men just as likely to survive the Titanic sinking as women?
These tests are easy to run in SPSS; how to run each one depends on the testing situation, as shown below.
Here you have a single proportion, such as the proportion of passengers who survived the Titanic. In SPSS, this test can be run with the Non-parametric Tests >>> One Sample… option under Analyze.
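If you want to sanity-check the SPSS output outside the GUI, here is a minimal sketch of the equivalent one-sample proportion (exact binomial) test in Python, assuming scipy is installed; the Titanic counts are made up purely for illustration.

```python
from scipy.stats import binomtest

# Hypothetical counts, for illustration only: 342 survivors out of 891 passengers
survivors, total = 342, 891

# H0: the true survival proportion is 0.5
result = binomtest(survivors, n=total, p=0.5, alternative="two-sided")
print(f"observed proportion = {survivors / total:.3f}")
print(f"p-value = {result.pvalue:.4f}")
```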
Here, we have two groups. Maybe we want to test whether survival rates were equal for men versus women, or for adults versus children. These tests can be run via Analyze >>> Descriptive Statistics >>> Crosstabs… If both variables are nominal with two categories each, use lambda. If both are nominal but one has more than two categories, use Cramér's V. If each variable has five or more categories, use the Contingency Coefficient.
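As a rough cross-check of the Crosstabs output, here is a minimal sketch in Python (scipy assumed available) that builds a sex-by-survival table and computes Cramér's V; the counts are invented for illustration.

```python
import numpy as np
from scipy.stats import chi2_contingency
from scipy.stats.contingency import association

# Hypothetical 2x2 table: rows = sex (female, male), columns = (survived, died)
table = np.array([[233,  81],
                  [109, 468]])

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4f}")

# Cramér's V summarizes the strength of the association
print(f"Cramér's V = {association(table, method='cramer'):.3f}")
```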
If both are ordinal variables, then you want to use Kendall's \(\tau_{b}\).
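In Python, scipy's kendalltau computes the \(\tau_{b}\) variant by default; here is a minimal sketch with made-up ordinal codes.

```python
from scipy.stats import kendalltau

# Hypothetical ordinal codes, e.g. satisfaction (1-5) and income bracket (1-4)
satisfaction = [1, 2, 2, 3, 3, 4, 4, 5, 5, 5]
income       = [1, 1, 2, 2, 3, 2, 3, 3, 4, 4]

tau_b, p = kendalltau(satisfaction, income)  # tau-b adjusts for ties in both variables
print(f"Kendall's tau-b = {tau_b:.3f}, p = {p:.4f}")
```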
If we have a single variable with more than two categories, then we can test our hypotheses about its distribution via the Non-parametric Tests >>> One Sample… option under Analyze.
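The same idea outside SPSS is a chi-square goodness-of-fit test; here is a minimal sketch in Python, with invented counts for the commuting-mode example.

```python
from scipy.stats import chisquare

# Hypothetical observed counts of commuters by mode: car, walk, bike, public transit
observed = [410, 160, 70, 360]

# H0: all four modes are used in equal proportions
# (chisquare assumes uniform expected counts by default)
chi2, p = chisquare(observed)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```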
Here we have two categorical variables, each with two or more categories. Using the Crosstabs… function we can run the \(\chi^2\) test. Note that if any cell has expected frequencies \(< 5\), then Fisher's Exact test must be used. See a wonderful little example of how the test works here. Note also that people commonly think Fisher's Exact test can only be used with \(2 \times 2\) tables, but there is no such restriction underlying the test per se; it is just that large tables are computationally very demanding.
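For a small \(2 \times 2\) table, the equivalent check in Python is scipy's fisher_exact; here is a minimal sketch with invented counts.

```python
from scipy.stats import fisher_exact

# Hypothetical 2x2 table with small expected counts: rows = group A/B, columns = (yes, no)
table = [[3, 9],
         [7, 2]]

odds_ratio, p = fisher_exact(table, alternative="two-sided")
print(f"odds ratio = {odds_ratio:.2f}, p = {p:.4f}")
```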
Note that it makes sense to use the following statistics only if the \(H_0\) of no relationship between the two variables has been rejected. This is because these are measures of association that tell you how strong or weak the relationship between \(x\) and \(y\) is.
If you have Ordinal variables, and don’t care about speaking in terms of dependent versus independent variables, then:
If you are keen on speaking in terms of the dependent variable (must be the row variable) and the independent variable (must be the column variable), and these are Ordinal variables, then:
| Estimate | Interpretation |
|---|---|
| +0.70 or higher | Very strong positive relationship |
| +0.50 to +0.69 | Substantial positive relationship |
| +0.30 to +0.49 | Moderate positive relationship |
| +0.10 to +0.29 | Low positive relationship |
| +0.01 to +0.09 | Negligible positive relationship |
| 0.00 | No relationship |
| -0.01 to -0.09 | Negligible negative relationship |
| -0.10 to -0.29 | Low negative relationship |
| -0.30 to -0.49 | Moderate negative relationship |
| -0.50 to -0.69 | Substantial negative relationship |
| -0.70 or lower | Very strong negative relationship |
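The rule of thumb in the table is easy to apply programmatically; here is a small sketch in Python (the function name is hypothetical).

```python
def interpret_association(estimate: float) -> str:
    """Label an association estimate using the rule of thumb tabulated above."""
    strength = abs(estimate)
    if strength == 0:
        return "No relationship"
    direction = "positive" if estimate > 0 else "negative"
    if strength >= 0.70:
        label = "Very strong"
    elif strength >= 0.50:
        label = "Substantial"
    elif strength >= 0.30:
        label = "Moderate"
    elif strength >= 0.10:
        label = "Low"
    else:
        label = "Negligible"
    return f"{label} {direction} relationship"

print(interpret_association(0.42))   # Moderate positive relationship
print(interpret_association(-0.08))  # Negligible negative relationship
```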