Quite often the outcome of interest (i.e., our dependent variable) is an ordinal response. The most typical examples are survey questions where the respondent is asked, for example, “How concerned are you about global warming?” and responds with one of the following options – “not at all”, “somewhat”, “a great deal”. Note that these responses have a hierarchy of concern that runs from the lowest level (not at all concerned) to the highest (a great deal).

Of course, the number of ordinal response categories need not be three; it could be four, five, six, seven, or more. In the past, analysts treated these responses as interval-level measures. Specifically, say the lowest level was coded 0 and the highest level was coded 2, so that \(y = 0, 1, 2\), with \(0\) representing not at all concerned, \(1\) somewhat concerned, and \(2\) a great deal of concern. A linear regression model assumes that the difference between 0 and 1 is the same as that between 1 and 2, but here the numbers represent categories, not numeric measurements! If we ignore this important quality of the outcome variable and proceed with the usual linear regression model, we end up with biased and misleading results.

The Latent Variable Formulation of the Ordinal Model

Once more we assume an underlying latent variable that ranges from \(-\infty\) to \(+\infty\) and can be modeled as a function of some independent variables:

\[y^{*}_{i} = x_{i}\beta + e_{i}\]

Assume the outcome of interest is subjects’ responses to the question: ‘A working mother can establish just as warm and secure a relationship with her children as a mother who does not work.’ Response categories are:

  • Strongly Disagree (SD)
  • Disagree (D)
  • Agree (A)
  • Strongly Agree (SA)

Assume too that the latent variable underlying these responses is the subjects’ degree of support for this statement about working mothers. We can then write the ordinal model as follows:

\[y=\begin{cases} 1 = SD & \text{if } \left(\tau_{0} = -\infty \right) \leq y^{*}_{i} < \tau_{1} \\ 2 = D & \text{if } \tau_{1} \leq y^{*}_{i} < \tau_{2} \\ 3 = A & \text{if } \tau_{2} \leq y^{*}_{i} < \tau_{3} \\ 4 = SA & \text{if } \tau_{3} \leq y^{*}_{i} < \left( \tau_{4} = +\infty \right) \end{cases}\]

In simple terms, this means we observe a particular ordinal response depending upon which segment of the latent distribution an individual falls in, and we see the next ‘higher’ response only once the individual crosses the relevant threshold (also referred to as a cutpoint). The goal, then, is to estimate these cutpoints together with the effects of the independent variables available to us.
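
To make this concrete, here is a minimal simulation of the latent-variable story; the coefficient of 0.8 and the cutpoints are made up purely for illustration:

# simulate a latent y* = x*beta + e with logistic errors, then slice it at the cutpoints
set.seed(42)
x = rnorm(1000)                        # a single made-up covariate
ystar = 0.8 * x + rlogis(1000)         # latent 'support'; beta = 0.8 is arbitrary
tau = c(-1.5, 0, 1.5)                  # hypothetical cutpoints tau_1, tau_2, tau_3
y = cut(ystar, breaks = c(-Inf, tau, Inf), labels = c("SD", "D", "A", "SA"), right = FALSE)
table(y)                               # the ordinal responses we would actually observe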

Once again, just as with the binary logit and probit models, here too we need to assume something about the distribution (and hence the variance) of the residuals. The logit and the probit are the two commonly employed options, with the choice between them depending upon model fit, convenience, and our ultimate goal (for example, if we are also modeling some sort of second stage in the outcome process, the probit is easier to work with).
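
Recall that the latent scale has no natural units, so each link fixes the residual variance by convention:

\[\text{logit: } Var(e_{i}) = \pi^{2}/3 \approx 3.29 \qquad \text{probit: } Var(e_{i}) = 1\]

This is why, as you will see below, logit coefficients tend to be roughly 1.6 to 1.8 times the size of their probit counterparts even though the two models tell essentially the same substantive story.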

Proportional-Odds

The basic model assumes proportional odds – that a unit change in any independent variable has the same impact on moving from Strongly Disagree to Disagree as it does on moving from Disagree to Agree, and again on moving from Agree to Strongly Agree. This is a strong assumption built into the model, and it is often tested: any independent variable may well have different impacts at different response levels. Whether it does has to be verified empirically, and there are tests for checking whether the data meet or violate the proportional odds assumption. We’ll look at such a test below, but note that these tests have been criticized for “having a tendency to reject the null hypothesis (that the sets of coefficients are the same), and hence, indicate that the parallel slopes assumption does not hold, in cases where the assumption does hold”.
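
In symbols, combining the latent formulation above with logistic errors, the proportional-odds (cumulative logit) model is

\[\ln\left[\frac{\Pr(y_{i} > m)}{\Pr(y_{i} \leq m)}\right] = x_{i}\beta - \tau_{m}, \qquad m = 1, \ldots, M-1\]

with a single \(\beta\) common to all \(M-1\) equations – only the cutpoints \(\tau_{m}\) change. The non-proportional alternative lets the coefficients vary by response level (\(\beta_{m}\)), which is exactly what the tests discussed below compare.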

Model Goodness-of-Fit

Diagnostics for these models are essentially impossible to run, so don’t go looking for anything along the lines of what we have used thus far. The only checks seem to be on the maximum likelihood estimation itself – whether the model converges, and how quickly and how reliably.

There are some statistics we can continue to use. These include, for example, the pseudo-\(R^2\), the AIC, the BIC, and the usual likelihood-ratio tests to see if including the independent variables improves upon the model without any independent variables. We can also carry out likelihood-ratio tests of nested models.

An Example: The GSS Survey Data

Be sure to install the rms package (e.g., run install.packages("rms") once) before you run the code below.

library(rms)
library(foreign)
ordwarm2 =  read.dta("http://www.ats.ucla.edu/stat/stata/examples/long/ordwarm2.dta")
names(ordwarm2)
##  [1] "warm"    "yr89"    "male"    "white"   "age"     "ed"      "prst"   
##  [8] "warmlt2" "warmlt3" "warmlt4"
summary(ordwarm2)
##  warm       yr89         male           white           age       
##  SD:297   1977:1379   Women:1227   White   : 283   Min.   :18.00  
##  D :723   1989: 914   Men  :1066   Not Whit:2010   1st Qu.:31.00  
##  A :856                                            Median :42.00  
##  SA:417                                            Mean   :44.94  
##                                                    3rd Qu.:58.00  
##                                                    Max.   :89.00  
##        ed             prst         warmlt2     warmlt3       warmlt4    
##  Min.   : 0.00   Min.   :12.00   D,A,SA:1996   A,SA:1273   SA    : 417  
##  1st Qu.:11.00   1st Qu.:30.00   SD    : 297   D,SD:1020   A,D,SD:1876  
##  Median :12.00   Median :37.00                                          
##  Mean   :12.22   Mean   :39.59                                          
##  3rd Qu.:14.00   3rd Qu.:50.00                                          
##  Max.   :20.00   Max.   :82.00

These are data from the General Social Survey (1977 and 1989) and include the following variables:

  • warm - 1 (SD); 2 (D); 3 (A); 4 (SA)
  • yr89 - survey year: 1 = 1989; 0 = 1977
  • male - 1 = male; 0 = female
  • white - 1 = white; 0 = other
  • age - age in years
  • ed - education in years
  • prst - occupational prestige
  • … the other variables are the original dependent variable recoded into binary groupings

Fitting Ordinal Logit and Probit Models

We’ll start with the ordinal logit model, estimating the following specification:

\[warm = yr89 + male + white + age + ed + prst\]

ol.1 = orm(warm ~ yr89 + male + white + age + ed + prst, data = ordwarm2)
ol.1
## 
## Logistic (Proportional Odds) Ordinal Regression Model
## 
## orm(formula = warm ~ yr89 + male + white + age + ed + prst, data = ordwarm2)
## Frequencies of Responses
## 
##  SD   D   A  SA 
## 297 723 856 417 
## 
##                       Model Likelihood          Discrimination          Rank Discrim.    
##                          Ratio Test                 Indexes                Indexes       
## Obs          2293    LR chi2     301.72    R2                  0.133    rho     0.349    
## Unique Y        4    d.f.             6    g                   0.785                     
## Median Y        3    Pr(> chi2) <0.0001    gr                  2.193                     
## max |deriv| 0.002    Score chi2  301.80    |Pr(Y>=median)-0.5| 0.139                     
##                      Pr(> chi2) <0.0001                                                  
## 
##                Coef    S.E.   Wald Z Pr(>|Z|)
## y>=D            2.4654 0.2389 10.32  <0.0001 
## y>=A            0.6309 0.2333  2.70  0.0068  
## y>=SA          -1.2619 0.2340 -5.39  <0.0001 
## yr89=1989       0.5239 0.0799  6.56  <0.0001 
## male=Men       -0.7333 0.0785 -9.34  <0.0001 
## white=Not Whit -0.3912 0.1184 -3.30  0.0010  
## age            -0.0217 0.0025 -8.78  <0.0001 
## ed              0.0672 0.0160  4.20  <0.0001 
## prst            0.0061 0.0033  1.84  0.0652
op.1 = orm(warm ~ yr89 + male + white + age + ed + prst, data = ordwarm2, family=probit)
op.1
## 
## Probit Ordinal Regression Model
## 
## orm(formula = warm ~ yr89 + male + white + age + ed + prst, data = ordwarm2, 
##     family = probit)
## Frequencies of Responses
## 
##  SD   D   A  SA 
## 297 723 856 417 
## 
##                       Model Likelihood          Discrimination          Rank Discrim.    
##                          Ratio Test                 Indexes                Indexes       
## Obs          2293    LR chi2     294.32    R2                  0.130    rho     0.349    
## Unique Y        4    d.f.             6    g                   0.451                     
## Median Y        3    Pr(> chi2) <0.0001    gr                  1.571                     
## max |deriv| 4e-06    Score chi2  289.84    |Pr(Y>=median)-0.5| 0.131                     
##                      Pr(> chi2) <0.0001                                                  
## 
##                Coef    S.E.   Wald Z Pr(>|Z|)
## y>=D            1.4286 0.1388 10.29  <0.0001 
## y>=A            0.3606 0.1369  2.63  0.0085  
## y>=SA          -0.7682 0.1371 -5.60  <0.0001 
## yr89=1989       0.3188 0.0469  6.80  <0.0001 
## male=Men       -0.4170 0.0455 -9.16  <0.0001 
## white=Not Whit -0.2265 0.0695 -3.26  0.0011  
## age            -0.0122 0.0014 -8.47  <0.0001 
## ed              0.0387 0.0093  4.15  <0.0001 
## prst            0.0033 0.0019  1.71  0.0881
AIC(ol.1); BIC(ol.1)
## [1] 5707.825
## [1] 5759.463
AIC(op.1); BIC(op.1)
## [1] 5715.222
## [1] 5766.861
exp(coef(ol.1))
##           y>=D           y>=A          y>=SA      yr89=1989       male=Men 
##     11.7676866      1.8793063      0.2831293      1.6886031      0.4803221 
## white=Not Whit            age             ed           prst 
##      0.6762727      0.9785675      1.0694801      1.0060912

Note the following about the estimates generated:

  • There is no threshold reported for \(y \geq SD\) because every response is at least Strongly Disagree (that probability is 1); hence you see intercepts only for \(y \geq D\), \(y \geq A\), and \(y \geq SA\)

  • Virtually everything except prst is statistically significant, and this is true in both the logit and the probit specifications.

  • The pseudo-\(R^2\) is 0.133 for the logit and 0.130 for the probit specifications, respectively.

  • The Pr(> chi2) is the p-value of the likelihood-ratio test comparing the model with all the independent variables you specified against a model with no independent variables. A statistically significant p-value means the model is an improvement over the NULL model (i.e., the model with no independent variables); a likelihood-ratio comparison of nested models works the same way and is sketched just after this list.

  • The AIC and the BIC both suggest the logit specification fits slightly better than the probit specification.

  • The odds-ratios are proportional odds-ratios: they indicate, for example, that the odds of Men giving a response above any given category (say, Agree or Strongly Agree rather than Disagree or Strongly Disagree) are 0.48 times those of Women, holding all else constant. Likewise, the odds of a Not-White respondent giving a more supportive response are 0.68 times those of a White respondent, holding all else constant.

  • Continuous variables are trickier to interpret via odds-ratios: the odds-ratio is per one-unit change (e.g., per year of age), and the implied change in the response probabilities is not constant across the range of the independent variable, much as was the case in the binary logit and probit models. This is where predicted probabilities come in handy (see below).

  • There is no easy way to calculate alternative pseudo-\(R^2\) measures or the percent correctly predicted with the rms package. Other packages – polr and VGAM – make these calculations easier (the percent correctly predicted is computed via polr further below), but the rms package is better overall. Moreover, there is disagreement about the value of these two statistics for assessing model fit.

  • Interpretation of the estimates is complicated, but one useful way to interpret them is to calculate predicted probabilities for specific profiles of individuals, and to plot these probabilities as well. Note: visreg will be of no use here since it is not designed to work with these models.
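
As an aside, the nested-model likelihood-ratio test mentioned above is straightforward with rms. A minimal sketch, fitting a hypothetical reduced model (ol.0) that drops prst and comparing it against ol.1:

ol.0 = orm(warm ~ yr89 + male + white + age + ed, data = ordwarm2)  # hypothetical reduced model, dropping prst
lrtest(ol.0, ol.1)  # rms's likelihood-ratio test for nested model fits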

Calculating Predicted Probabilities

As usual, we can start by creating profiles of two individuals who are similar on all dimensions except one independent variable. Let us look at the difference between men and women.

summary(ordwarm2)
##  warm       yr89         male           white           age       
##  SD:297   1977:1379   Women:1227   White   : 283   Min.   :18.00  
##  D :723   1989: 914   Men  :1066   Not Whit:2010   1st Qu.:31.00  
##  A :856                                            Median :42.00  
##  SA:417                                            Mean   :44.94  
##                                                    3rd Qu.:58.00  
##                                                    Max.   :89.00  
##        ed             prst         warmlt2     warmlt3       warmlt4    
##  Min.   : 0.00   Min.   :12.00   D,A,SA:1996   A,SA:1273   SA    : 417  
##  1st Qu.:11.00   1st Qu.:30.00   SD    : 297   D,SD:1020   A,D,SD:1876  
##  Median :12.00   Median :37.00                                          
##  Mean   :12.22   Mean   :39.59                                          
##  3rd Qu.:14.00   3rd Qu.:50.00                                          
##  Max.   :20.00   Max.   :82.00
newdata.a = data.frame(yr89="1989", male="Men", white="Not Whit", age=44.94, ed=12.22, prst=39.59)

newdata.b = data.frame(yr89="1989", male="Women", white="Not Whit", age=44.94, ed=12.22, prst=39.59)

yhat.a = predict(ol.1, newdata.a, type="fitted.ind")
yhat.b = predict(ol.1, newdata.b, type="fitted.ind")
yhat.a; yhat.b
##   warm=SD    warm=D    warm=A   warm=SA 
## 0.1242893 0.3462526 0.3845099 0.1449482
##    warm=SD     warm=D     warm=A    warm=SA 
## 0.06382115 0.23534594 0.43996977 0.26086314

Remember the question wording: ‘A working mother can establish just as warm and secure a relationship with her children as a mother who does not work.’

So, holding all else constant, women were more likely than men to Agree or Strongly Agree with the statement, while men were more likely to Disagree or Strongly Disagree.
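
The gender gap can also be expressed as a discrete change – simply the difference between the two vectors of predicted probabilities:

yhat.b - yhat.a  # positive entries: categories women are more likely than men to choose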

Was this any different in 1977?

newdata.c = data.frame(yr89="1977", male="Men", white="Not Whit", age=44.94, ed=12.22, prst=39.59)

newdata.d = data.frame(yr89="1977", male="Women", white="Not Whit", age=44.94, ed=12.22, prst=39.59)

yhat.c = predict(ol.1, newdata.c, type="fitted.ind")
yhat.d = predict(ol.1, newdata.d, type="fitted.ind")
yhat.c; yhat.d
##    warm=SD     warm=D     warm=A    warm=SA 
## 0.19332910 0.40678313 0.30865602 0.09123174
##   warm=SD    warm=D    warm=A   warm=SA 
## 0.1032318 0.3156499 0.4082435 0.1728747

There are some differences in the absolute probabilities of a particular response, but the general gap between Men and Women stayed roughly the same between 1977 and 1989.

We could also ask how a woman’s age influenced her response in, say, 1977.

newdata.e = data.frame(yr89="1977", male="Women", white="Not Whit", age=seq(18, 89, by=1), ed=12.22, prst=39.59)

yhat.e = predict(ol.1, newdata.e, type="fitted.ind")

yhat.f = predict(ol.1, newdata.e, type="fitted")

Note the difference between type=“fitted.ind” and type=“fitted”: the former returns the predicted probability of each individual response category (e.g., \(\Pr(warm = A)\)), while the latter returns cumulative probabilities of the form \(\Pr(warm \geq m)\). In many ways fitted.ind is more readily understood by most audiences.
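
The two are mechanically related: each category probability is a successive difference of the cumulative probabilities. A quick sanity check (a sketch; the column positions assume the ordering SD, D, A, SA shown earlier):

# Pr(warm = A) should equal Pr(warm >= A) - Pr(warm >= SA)
all.equal(unname(yhat.e[, 3]), unname(yhat.f[, 2] - yhat.f[, 3]))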

Plotting Predicted Probabilities

Let us now plot the predicted probabilities from newdata.e:

# use a common ylim so the overlaid curves share one scale
plot(newdata.e$age, yhat.e[, 1], col="red", type="l", ylim=c(0, 0.5), xlab="Age (in years)", ylab="Predicted Probability of Specific Response")
par(new=TRUE)
plot(newdata.e$age, yhat.e[, 2], col="blue", type="l", ylim=c(0, 0.5), yaxt="n", ylab="", xaxt="n", xlab="")
par(new=TRUE)
plot(newdata.e$age, yhat.e[, 3], col="darkgreen", type="l", ylim=c(0, 0.5), yaxt="n", ylab="", xaxt="n", xlab="")
par(new=TRUE)
plot(newdata.e$age, yhat.e[, 4], col="goldenrod", type="l", ylim=c(0, 0.5), yaxt="n", ylab="", xaxt="n", xlab="")
text(30, 0.10, "SD", cex=0.75)
text(80, 0.45, "D", cex=0.75)
text(20, 0.47, "A", cex=0.75)
text(80, 0.12, "SA", cex=0.75)

newdata.f = data.frame(yr89="1989", male="Women", white="Not Whit", age=seq(18, 89, by=1), ed=12.22, prst=39.59)

yhat.g = predict(ol.1, newdata.f, type="fitted.ind")

yhat.h = predict(ol.1, newdata.f, type="fitted")

yhat.g = data.frame(yhat.g)
yhat.e = data.frame(yhat.e)
par(mfrow=c(1,2))

plot(newdata.f$age, yhat.g$warm.A, type="l", col="red", xlab="Age (in Years)", ylab="Probability of a Specific Response", main="GSS 1989", ylim=c(0, 0.5))
par(new=TRUE)
plot(newdata.f$age, yhat.g$warm.SA, type="l", col="green", xaxt="n", yaxt="n", xlab="", ylab="", ylim=c(0, 0.5))
par(new=TRUE)
plot(newdata.f$age, yhat.g$warm.D, type="l", col="purple", xaxt="n", yaxt="n", xlab="", ylab="", ylim=c(0, 0.5))
par(new=TRUE)
plot(newdata.f$age, yhat.g$warm.SD, type="l", col="black", xaxt="n", yaxt="n", xlab="", ylab="", ylim=c(0, 0.5))
text(80, 0.40, "A", cex=0.75)
text(20, 0.30, "SA", cex=0.75)
text(30, 0.15, "D", cex=0.75)
text(60, 0.10, "SD", cex=0.75)


plot(newdata.e$age, yhat.e$warm.A, type="l", col="red", xlab="Age (in Years)", ylab="Probability of a Specific Response", main="GSS 1977", ylim=c(0, 0.5))
par(new=TRUE)
plot(newdata.e$age, yhat.e$warm.SA, type="l", col="green", xaxt="n", yaxt="n", xlab="", ylab="", ylim=c(0, 0.5))
par(new=TRUE)
plot(newdata.e$age, yhat.e$warm.D, type="l", col="purple", xaxt="n", yaxt="n", xlab="", ylab="", ylim=c(0, 0.5))
par(new=TRUE)
plot(newdata.e$age, yhat.e$warm.SD, type="l", col="black", xaxt="n", yaxt="n", xlab="", ylab="", ylim=c(0, 0.5))
text(20, 0.47, "A", cex=0.75)
text(80, 0.12, "SA", cex=0.75)
text(80, 0.45, "D", cex=0.75)
text(30, 0.10, "SD", cex=0.75)

Testing the Proportional Odds Assumption

The VGAM package comes in very handy here. The first thing we have to do is specify warm as an ordered factor (otherwise VGAM won’t work). To do this we create a second data-frame and specify warm as an ordered factor there. We can then fit two models, one assuming proportional odds and the other assuming non-proportional odds, and use a likelihood-ratio test to compare the two. If the \(p\)-value is significant, we reject the proportional odds assumption. In many ways the model summaries themselves will also suggest whether the assumption holds, but these are tricky models, so one should not reject (or fail to reject) the proportional odds assumption without looking into the details of the assumption and the debate surrounding how best to test it.

library(VGAM)

ordwarm2a = ordwarm2
ordwarm2a$warm = ordered(ordwarm2a$warm, levels = c("SD", "D", "A", "SA"))

vglmP  = vglm(warm ~ yr89 + male + white + age + ed + prst, data = ordwarm2a, family=cumulative(parallel=TRUE,  reverse=TRUE))

vglmNP  = vglm(warm ~ yr89 + male + white + age + ed + prst, data = ordwarm2a, family=cumulative(parallel=FALSE,  reverse=TRUE))

lrtest(vglmP, vglmNP)
## Likelihood ratio test
## 
## Model 1: warm ~ yr89 + male + white + age + ed + prst
## Model 2: warm ~ yr89 + male + white + age + ed + prst
##    #Df  LogLik  Df  Chisq Pr(>Chisq)    
## 1 6870 -2844.9                          
## 2 6858 -2820.3 -12 49.203  1.928e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Here the likelihood-ratio test is highly significant (\(\chi^2 = 49.2\) on 12 degrees of freedom, \(p < 0.001\)), so these data appear to violate the proportional odds assumption – subject, of course, to the earlier caveat that such tests tend to over-reject.
summary(vglmP)
## 
## Call:
## vglm(formula = warm ~ yr89 + male + white + age + ed + prst, 
##     family = cumulative(parallel = TRUE, reverse = TRUE), data = ordwarm2a)
## 
## Pearson residuals:
##                   Min      1Q  Median      3Q   Max
## logit(P[Y>=2]) -4.844  0.1531  0.2244  0.5373 1.109
## logit(P[Y>=3]) -2.872 -0.9299  0.3015  0.8573 2.525
## logit(P[Y>=4]) -1.421 -0.6085 -0.2736 -0.1611 4.939
## 
## Coefficients:
##                Estimate Std. Error z value Pr(>|z|)    
## (Intercept):1  2.465357   0.239499  10.294  < 2e-16 ***
## (Intercept):2  0.630899   0.233775   2.699  0.00696 ** 
## (Intercept):3 -1.261857   0.234797  -5.374 7.69e-08 ***
## yr891989       0.523906   0.080407   6.516 7.24e-11 ***
## maleMen       -0.733295   0.078526  -9.338  < 2e-16 ***
## whiteNot Whit -0.391161   0.118675  -3.296  0.00098 ***
## age           -0.021665   0.002488  -8.710  < 2e-16 ***
## ed             0.067173   0.015952   4.211 2.54e-05 ***
## prst           0.006073   0.003290   1.846  0.06495 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Number of linear predictors:  3 
## 
## Names of linear predictors: 
## logit(P[Y>=2]), logit(P[Y>=3]), logit(P[Y>=4])
## 
## Dispersion Parameter for cumulative family:   1
## 
## Residual deviance: 5689.825 on 6870 degrees of freedom
## 
## Log-likelihood: -2844.912 on 6870 degrees of freedom
## 
## Number of iterations: 4 
## 
## Exponentiated coefficients:
##      yr891989       maleMen whiteNot Whit           age            ed 
##     1.6886107     0.4803236     0.6762709     0.9785675     1.0694806 
##          prst 
##     1.0060911
summary(vglmNP)
## 
## Call:
## vglm(formula = warm ~ yr89 + male + white + age + ed + prst, 
##     family = cumulative(parallel = FALSE, reverse = TRUE), data = ordwarm2a)
## 
## Pearson residuals:
##                   Min      1Q  Median      3Q   Max
## logit(P[Y>=2]) -4.765  0.1514  0.2249  0.4749 1.260
## logit(P[Y>=3]) -2.932 -0.8985  0.2957  0.8366 2.609
## logit(P[Y>=4]) -1.305 -0.5451 -0.2681 -0.1570 4.891
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept):1    1.856953   0.390774   4.752 2.01e-06 ***
## (Intercept):2    0.719810   0.266485   2.701  0.00691 ** 
## (Intercept):3   -1.002225   0.343655  -2.916  0.00354 ** 
## yr891989:1       0.955744   0.153174   6.240 4.39e-10 ***
## yr891989:2       0.536372   0.092002   5.830 5.54e-09 ***
## yr891989:3       0.331219   0.113159   2.927  0.00342 ** 
## maleMen:1       -0.300983   0.127710  -2.357  0.01843 *  
## maleMen:2       -0.717994   0.089070  -8.061 7.57e-16 ***
## maleMen:3       -1.085617   0.121810  -8.912  < 2e-16 ***
## whiteNot Whit:1 -0.528730   0.226553  -2.334  0.01961 *  
## whiteNot Whit:2 -0.349234   0.139250  -2.508  0.01214 *  
## whiteNot Whit:3 -0.377537   0.156729  -2.409  0.01600 *  
## age:1           -0.016349   0.004013  -4.074 4.62e-05 ***
## age:2           -0.024976   0.002832  -8.820  < 2e-16 ***
## age:3           -0.018690   0.003734  -5.005 5.59e-07 ***
## ed:1             0.103247   0.024987   4.132 3.59e-05 ***
## ed:2             0.055869   0.018252   3.061  0.00221 ** 
## ed:3             0.056685   0.025138   2.255  0.02414 *  
## prst:1          -0.001691   0.005594  -0.302  0.76243    
## prst:2           0.009848   0.003778   2.606  0.00915 ** 
## prst:3           0.004922   0.004799   1.026  0.30506    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Number of linear predictors:  3 
## 
## Names of linear predictors: 
## logit(P[Y>=2]), logit(P[Y>=3]), logit(P[Y>=4])
## 
## Dispersion Parameter for cumulative family:   1
## 
## Residual deviance: 5640.622 on 6858 degrees of freedom
## 
## Log-likelihood: -2820.311 on 6858 degrees of freedom
## 
## Number of iterations: 6 
## 
## Exponentiated coefficients:
##      yr891989:1      yr891989:2      yr891989:3       maleMen:1 
##       2.6006059       1.7097922       1.3926643       0.7400903 
##       maleMen:2       maleMen:3 whiteNot Whit:1 whiteNot Whit:2 
##       0.4877299       0.3376933       0.5893530       0.7052281 
## whiteNot Whit:3           age:1           age:2           age:3 
##       0.6855475       0.9837842       0.9753329       0.9814834 
##            ed:1            ed:2            ed:3          prst:1 
##       1.1087656       1.0574594       1.0583225       0.9983103 
##          prst:2          prst:3 
##       1.0098962       1.0049346

Note that the estimates above need to be exponentiated to be interpreted as odds-ratios; the summaries print these at the bottom under “Exponentiated coefficients”.

You can also examine the assumption descriptively, by computing the observed log-odds of each cumulative response (\(Y \geq m\)) within levels of every predictor. If proportional odds holds, the differences between predictor levels should be roughly similar across the Y>=2, Y>=3, and Y>=4 columns:

sf = function(y) {
  # observed log-odds (logits) of each cumulative response level
  c('Y>=1' = qlogis(mean(y >= 1)),  # always Inf: every response is at least 1
    'Y>=2' = qlogis(mean(y >= 2)),
    'Y>=3' = qlogis(mean(y >= 3)),
    'Y>=4' = qlogis(mean(y >= 4)))
}
s = summary(as.numeric(warm) ~ yr89 + male + white + age + ed + prst, data = ordwarm2a, fun=sf)
s
## as.numeric(warm)    N=2293
## 
## +-------+--------+----+----+--------+-----------+---------+
## |       |        |N   |Y>=1|Y>=2    |Y>=3       |Y>=4     |
## +-------+--------+----+----+--------+-----------+---------+
## |yr89   |1977    |1379|Inf |1.577580|-0.01885480|-1.678001|
## |       |1989    | 914|Inf |2.637886| 0.59937902|-1.272566|
## +-------+--------+----+----+--------+-----------+---------+
## |male   |Women   |1227|Inf |2.057622| 0.50431094|-1.110602|
## |       |Men     |1066|Inf |1.748649|-0.09387751|-2.132227|
## +-------+--------+----+----+--------+-----------+---------+
## |white  |White   | 283|Inf |2.334084| 0.49765516|-1.131870|
## |       |Not Whit|2010|Inf |1.854688| 0.18359858|-1.563574|
## +-------+--------+----+----+--------+-----------+---------+
## |age    |[18,32) | 614|Inf |2.537218| 0.74241823|-1.344077|
## |       |[32,43) | 542|Inf |2.141098| 0.52857757|-1.133355|
## |       |[43,59) | 598|Inf |1.679281| 0.02675745|-1.523615|
## |       |[59,89] | 539|Inf |1.467098|-0.42560522|-2.258490|
## +-------+--------+----+----+--------+-----------+---------+
## |ed     |[ 0,12) | 670|Inf |1.452922|-0.27029033|-1.863218|
## |       |12      | 782|Inf |1.831099| 0.18982628|-1.612510|
## |       |[13,15) | 362|Inf |2.518294| 0.66427920|-1.212371|
## |       |[15,20] | 479|Inf |2.540477| 0.66199201|-1.163699|
## +-------+--------+----+----+--------+-----------+---------+
## |prst   |[12,31) | 585|Inf |1.717940|-0.03077166|-1.665879|
## |       |[31,38) | 568|Inf |1.852639| 0.12693071|-1.683064|
## |       |[38,51) | 731|Inf |1.913475| 0.28369324|-1.440174|
## |       |[51,82] | 409|Inf |2.307976| 0.62065168|-1.195516|
## +-------+--------+----+----+--------+-----------+---------+
## |Overall|        |2293|Inf |1.905168| 0.22157369|-1.503811|
## +-------+--------+----+----+--------+-----------+---------+

Fitting via polr in the MASS package

These models can also be fit via a variety of other packages, and polr in the MASS package is useful at times. For example, if I want to test the proportional odds assumption and also put my overall results in a stargazer table, the rms fits won’t work with stargazer but polr fits will. So I show you a quick example below:

library(MASS)

modL = polr(warm ~ yr89 + male + white + age + ed + prst, data = ordwarm2a, method="logistic", Hess=TRUE)
summary(modL)
## Call:
## polr(formula = warm ~ yr89 + male + white + age + ed + prst, 
##     data = ordwarm2a, Hess = TRUE, method = "logistic")
## 
## Coefficients:
##                   Value Std. Error t value
## yr891989       0.523912   0.079899   6.557
## maleMen       -0.733309   0.078483  -9.344
## whiteNot Whit -0.391140   0.118381  -3.304
## age           -0.021666   0.002469  -8.777
## ed             0.067176   0.015975   4.205
## prst           0.006072   0.003293   1.844
## 
## Intercepts:
##      Value    Std. Error t value 
## SD|D  -2.4654   0.2389   -10.3188
## D|A   -0.6309   0.2333    -2.7042
## A|SA   1.2618   0.2340     5.3919
## 
## Residual Deviance: 5689.825 
## AIC: 5707.825
modP = polr(warm ~ yr89 + male + white + age + ed + prst, data = ordwarm2a, method="probit", Hess=TRUE)
summary(modP)
## Call:
## polr(formula = warm ~ yr89 + male + white + age + ed + prst, 
##     data = ordwarm2a, Hess = TRUE, method = "probit")
## 
## Coefficients:
##                   Value Std. Error t value
## yr891989       0.318813   0.046852   6.805
## maleMen       -0.417028   0.045546  -9.156
## whiteNot Whit -0.226500   0.069478  -3.260
## age           -0.012221   0.001443  -8.471
## ed             0.038724   0.009324   4.153
## prst           0.003283   0.001925   1.705
## 
## Intercepts:
##      Value    Std. Error t value 
## SD|D  -1.4286   0.1388   -10.2942
## D|A   -0.3606   0.1369    -2.6333
## A|SA   0.7682   0.1371     5.6047
## 
## Residual Deviance: 5697.222 
## AIC: 5715.222

Predictions from polr() are straightforward, and predicted probabilities can then be plotted (as usual).

newdata.e = data.frame(yr89="1977", male="Women", white="Not Whit", age=seq(18, 89, by=1), ed=12.22, prst=39.59)

yhat.modL = predict(modL, newdata=newdata.e, type="probs")
head(yhat.modL)
##           SD         D         A        SA
## 1 0.06034093 0.2264391 0.4406661 0.2725539
## 2 0.06158115 0.2296507 0.4404886 0.2682795
## 3 0.06284515 0.2328791 0.4402280 0.2640478
## 4 0.06413332 0.2361231 0.4398845 0.2598591
## 5 0.06544605 0.2393821 0.4394581 0.2557138
## 6 0.06678374 0.2426548 0.4389493 0.2516121
yhat.modP = predict(modP, newdata=newdata.e, type="probs")
head(yhat.modP)
##           SD         D         A        SA
## 1 0.05645347 0.2460382 0.4270659 0.2704424
## 2 0.05785475 0.2489154 0.4268165 0.2664134
## 3 0.05928323 0.2517919 0.4265099 0.2624149
## 4 0.06073922 0.2546669 0.4261464 0.2584474
## 5 0.06222303 0.2575396 0.4257261 0.2545113
## 6 0.06373496 0.2604092 0.4252491 0.2506067
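
Earlier we noted that the percent correctly predicted is not easy to get from rms; with polr it is a one-liner, since predict() returns the modal (most probable) category by default. A minimal sketch:

mean(predict(modL) == ordwarm2a$warm)  # share of observed responses matching the modal predicted category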
library(stargazer)
stargazer(modL, modP, type="html")
                     Dependent variable: warm
                 -------------------------------------
                 ordered logistic      ordered probit
                       (1)                  (2)
-----------------------------------------------------
yr891989             0.524***             0.319***
                    (0.080)              (0.047)
maleMen             -0.733***            -0.417***
                    (0.078)              (0.046)
whiteNot Whit       -0.391***            -0.226***
                    (0.118)              (0.069)
age                 -0.022***            -0.012***
                    (0.002)              (0.001)
ed                   0.067***             0.039***
                    (0.016)              (0.009)
prst                 0.006*               0.003*
                    (0.003)              (0.002)
-----------------------------------------------------
Observations         2,293                2,293
-----------------------------------------------------
Note: *p<0.1; **p<0.05; ***p<0.01

Other Examples: The Autos Data

load("~/Box Sync/MPA6020/Data/autos.RData")

The autos data-frame has 74 observations on the following variables:

  • make - the make of the car
  • model - model number
  • price - price
  • mpg - miles per gallon
  • rep78 - reputation in 1978 (five ordinal categories)
  • rep77 - reputation in 1977 (five ordinal categories)
  • hdroom - headroom in inches
  • rseat - legroom of rear seats in inches
  • trunk - trunk space in cubic feet
  • weight - curb weight in lbs
  • length - vehicle length in inches
  • turn - turning circle in feet
  • displ - engine displacement in cubic inches
  • gratio - gear-ratio
  • foreign - foreign versus domestic

Let us fit a small model with rep78 as the outcome of interest. We’ll use only a couple of independent variables – price, mpg and foreign.
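
A minimal sketch of what that might look like, assuming the loaded data-frame is named autos and coercing rep78 to an ordered factor first:

autos$rep78 = ordered(autos$rep78)  # ensure an ordered outcome
auto.ol = orm(rep78 ~ price + mpg + foreign, data = autos)
auto.ol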

Other Examples: The NHANES II Data

The NHANES – the National Health and Nutrition Examination Survey – data have a variable called health that will be used as the ordinal response of interest. Some possible independent variables are sex, race, age, and region.

load("~/Box Sync/MPA6020/Data/nhanes2f.RData")