Home > Numerical Examples
ML FITTING of MPH and HLP MODELS using MPH.FIT (version 3.0): NUMERICAL EXAMPLES
Author: Joseph B. Lang, Department of Statistics and Actuarial Science, University of Iowa, Iowa City, IA 52242 USA (September 27, 2002, last updated: April 15, 2005, February 7, 2007, May 22, 2009)
The numerical examples below were fitted using mph.fit, VERSION 3.0. Please read about the changes from version 2.0 in the header of the file mph.fit.
For another collection of more basic examples, click here.
Primary References:
Lang, J.B (2004). "Multinomial-Poisson Homogeneous Models for Contingency Tables," Annals of Statistics, 32, 340-383 .
Lang, J.B. (2005). "Homogeneous Linear Predictor Models for Contingency Tables," JASA, 100, 121-134
See also, other on-line documentation starting at www.stat.uiowa.edu/~jblang/mph.fitting/
Numerical Examples:
EXAMPLE 1. Mean Response Model (colds in children)
EXAMPLE 2. Dispersion Linear Trend Model (Danish marital status)
EXAMPLE 3. Marginal Homogeneity Model (eye grade)
EXAMPLE 4. Loglinear Models under Different Sampling Plans (bike helmet use)
EXAMPLE 5. Loglinear Models of Independence (bike helmet use)
EXAMPLE 6. Logit Models under Different Sampling Plans (bike helmet use)
EXAMPLE 7. Sparse Table Example (fine tuning the fitting algorithm)
EXAMPLE 8. Given Marginal Distributions and Odds Ratios, Compute Joint Distribution in a Two-Way Table.
EXAMPLE 9. Model Constraint: ROC Area Equal to a Constant
EXAMPLE 10. Marginal Homogeneity of Multidimensional Contingency Tables
EXAMPLE 11. Contingency Tables with Given Marginals.
EXAMPLE 12. Testing for (Conditional) Pairwise Independence
EXAMPLE 13. Kappa Regression using Direct and Indirect Smoothing Models.
EXAMPLE 14. Marginal Cumulative Probit Model (in Conjunction with Loglinear Association Model)
EXAMPLE 15. Marginal Cumulative Logit and Linear-by-Linear Loglinear Association Model (Generalized Loglinear and Indirect Smoothing Model Specification)
EXAMPLE 1. ML fit of mean-response models
Table 13.1 Colds in Children, source: Stokes, Davis, and Koch
(2000), "Categorical Data Analysis Using the SAS System."
Periods with Colds
Sex Residence 0 1 2 Total
---- --------- ------------------- -----
F Rural 45 64 71 180
F Urban 80 104 116 300
M Rural 84 124 82 290
M Urban 106 117 87 310
Models fitted below:
Sampling Distribution: Product Multinomial
Conditional on row totals,
(yij1, yij2, yij3) <- (Yij1, Yij2, Yij3) indep ~ mult(nij, pij1, pij2, pij3), i, j = 1,2.
Here, n11 = 180, n12 = 300, n21 = 290, n22 = 310 and the table probabilities are interpreted as pijk = P(Periods = k|SEX=i, RES=j).
Systematic Components (Linear Predictor Models):
Saturated mean-response model:
Mij = b(0) + b(SEX)i + b(RES)j + b(SEX*RES)ij,
No-interaction mean-response model:
Mij = b(0) + b(SEX)i + b(RES)j.
Here, Mij = 0*pij1 + 1*pij2 + 2*pij3 = mean number of periods for SEX=i, RES=j.
These mean-response models have the generic form L(p) = Xb. These are examples of 0-order HLP models.
d <- scan(what=list(sex="",residence="",periods=0,count=0))
F R 0 45
F R 1 64
F R 2 71
F U 0 80
F U 1 104
F U 2 116
M R 0 84
M R 1 124
M R 2 82
M U 0 106
M U 1 117
M U 2 87
d <- data.frame(d)
y <- d$count
sex.residence <- paste(d$sex,d$residence) #[Strata are determined by (SEX, RESIDENCE) values]
L.fct <- function(p) {
p <- matrix(p,4,3,byrow=T)
mean1 <- sum( 0:2*p[1,] )
mean2 <- sum( 0:2*p[2,] )
mean3 <- sum( 0:2*p[3,] )
mean4 <- sum( 0:2*p[4,] )
L <- matrix(c(mean1,mean2,mean3,mean4))
rownames(L) <- c("mean.FR","mean.FU","mean.MR","mean.MU")
L
}
d0 <- subset(d,periods==0)
# SATURATED MEAN-RESPONSE MODEL....
a1 <- mph.fit(y, link=L.fct,X=model.matrix(~sex*residence,d0),strata=sex.residence)
| REMARK: Don't forget to input the strata variable, as the strata determine how the table probabilities are defined. |
mph.fit, version 3.0, 5/1/09 , running...
Convergence criteria have been met.
Time Elapsed: 0.03 seconds , Iterations: 1
mph.summary(a1)
MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 0 ): Gsq = 0
Pearson's Score Stat (df= 0 ): Xsq = 0
Generalized Wald Stat (df= 0 ): Wsq = 0
Adj Resids: 0 0 ... 0 0 , Number |Adj Resid| > 2: 0
SAMPLING PLAN INFORMATION...
Number of strata: 4
Strata identifiers: F R, F U, M R, M U
Strata with fixed sample sizes: all
Observed strata sample sizes: 180, 300, 290, 310
LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA) Z-ratio p-value
(Intercept) 1.14444444 0.05885860 19.4439633 0.00000000
sexM -0.15134100 0.07374287 -2.0522796 0.04014250
residenceU -0.02444444 0.07479380 -0.3268245 0.74380065
sexM:residenceU -0.02994933 0.09779569 -0.3062438 0.75941899
OBS LINK ML LINK StdErr(L) LINK RESID
mean.FR 1.1444444 1.1444444 0.05885860 0
mean.FU 1.1200000 1.1200000 0.04614952 0
mean.MR 0.9931034 0.9931034 0.04442608 0
mean.MU 0.9387097 0.9387097 0.04467893 0
CONVERGENCE INFORMATION...
Original counts used.
iterations = 1 , time elapsed = 0.03
norm.diff = 0 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 6.7408e-14 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.
FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
# NO-INTERACTION MEAN-RESPONSE MODEL...
b <- mph.fit(y, link=L.fct,X=model.matrix(~sex+residence,d0),strata=sex.residence)
mph.summary(b,T,T)
MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 1 ): Gsq = 0.09383 (p = 0.75936 )
Pearson's Score Stat (df= 1 ): Xsq = 0.09386 (p = 0.75933 )
Generalized Wald Stat (df= 1 ): Wsq = 0.09379 (p = 0.75942 )
Adj Resids: -0.306 -0.306 ... 0.306 0.306 , Number |Adj Resid| > 2: 0
| REMARK: The no-interaction mean-response model fits well, Xsq = 0.09386 (df=1, p-value=0.75933) and all the adjusted cell-specific residuals are between -0.306 and +0.306. |
SAMPLING PLAN INFORMATION...
Number of strata: 4
Strata identifiers: F R, F U, M R, M U
Strata with fixed sample sizes: all
Observed strata sample sizes: 180, 300, 290, 310
LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA) Z-ratio p-value
(Intercept) 1.15526154 0.04695725 24.6024112 0.0000000000
sexM -0.16834269 0.04842903 -3.4760696 0.0005088201
residenceU -0.04194803 0.04817685 -0.8707093 0.3839129012
OBS LINK ML LINK StdErr(L) LINK RESID
mean.FR 1.1444444 1.1552615 0.04695725 -0.3063624
mean.FU 1.1200000 1.1133135 0.04070944 0.3063624
mean.MR 0.9931034 0.9869188 0.03957190 0.3063624
mean.MU 0.9387097 0.9449708 0.03975182 -0.3063624
CELL-SPECIFIC STATISTICS...
strata OBS FV StdErr.FV PROB StdErr.PROB ADJUSTED.RESIDS
y1 F R 45 44.11273 4.991427 0.2450707 0.02773015 0.3063624
y2 F R 64 63.82746 6.393533 0.3545970 0.03551963 0.3063624
y3 F R 71 72.05981 5.589724 0.4003323 0.03105402 -0.3063624
y4 F U 80 80.94135 7.047109 0.2698045 0.02349036 -0.3063624
y5 F U 104 104.12325 8.235446 0.3470775 0.02745149 -0.3063624
y6 F U 116 114.93540 7.669822 0.3831180 0.02556607 0.3063624
y7 M R 84 84.90553 7.163143 0.2927777 0.02470049 -0.3063624
y8 M R 124 123.98247 8.424577 0.4275258 0.02905027 0.3063624
y9 M R 82 81.11200 7.072743 0.2796965 0.02438877 0.3063624
y10 M U 106 104.99696 7.662589 0.3386999 0.02471803 0.3063624
y11 M U 117 117.06512 8.533036 0.3776294 0.02752592 -0.3063624
y12 M U 87 87.93791 7.322569 0.2836707 0.02362119 -0.3063624
CONVERGENCE INFORMATION...
Original counts used.
iterations = 4 , time elapsed = 0.05
norm.diff = 2.15676e-07 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 9.74239e-09 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.
MODEL INFORMATION...
Linear Predictor Model Link Function L.fct (L.mean= FALSE ):
function(m) {
p <- m*c(1/Z%*%t(Z)%*%m)
as.matrix(L.input.fct(p))
}
<environment: 0x0157e07c>
Link information as originally input, L.input.fct:
function(p) {
p <- matrix(p,4,3,byrow=T)
m1 <- sum( 0:2*p[1,] )
m2 <- sum( 0:2*p[2,] )
m3 <- sum( 0:2*p[3,] )
m4 <- sum( 0:2*p[4,] )
L <- matrix(c(m1,m2,m3,m4))
rownames(L) <- c("mean.FR","mean.FU","mean.MR","mean.MU")
L
}
Derivative of Transpose Link Function derLt.fct:
[1] "Numerical derivatives used."
Linear Predictor Model Design Matrix X:
(Intercept) sexM residenceU
1 1 0 0
4 1 0 1
7 1 1 0
10 1 1 1
attr(,"assign")
[1] 0 1 2
attr(,"contrasts")
attr(,"contrasts")$sex
[1] "contr.treatment"
attr(,"contrasts")$residence
[1] "contr.treatment"
U = Orthogonal Complement of X:
[,1]
1 -1.773473
4 1.773473
7 1.773473
10 -1.773473
Constraint Function h.fct (h.mean= FALSE ):
function(m) {
t(U)%*%L.fct(m)
}
<environment: 0x0157e07c>
Constraint information as originally input, h.input.fct:
NULL
Derivative of Transpose Constraint Function derht.fct:
[1] "Numerical derivatives used."
Population (Strata) Matrix Z:
F R F U M R M U
1 1 0 0 0
2 1 0 0 0
3 1 0 0 0
4 0 1 0 0
5 0 1 0 0
6 0 1 0 0
7 0 0 1 0
8 0 0 1 0
9 0 0 1 0
10 0 0 0 1
11 0 0 0 1
12 0 0 0 1
attr(,"assign")
[1] 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$A
[1] "contr.treatment"
Sampling Constraint Matrix ZF:
F R F U M R M U
1 1 0 0 0
2 1 0 0 0
3 1 0 0 0
4 0 1 0 0
5 0 1 0 0
6 0 1 0 0
7 0 0 1 0
8 0 0 1 0
9 0 0 1 0
10 0 0 0 1
11 0 0 0 1
12 0 0 0 1
attr(,"assign")
[1] 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$A
[1] "contr.treatment"
Fixed Sample Sizes n:
OBS
F R 180
F U 300
M R 290
M U 310
strata:
strata
[1,] "F R"
[2,] "F R"
[3,] "F R"
[4,] "F U"
[5,] "F U"
[6,] "F U"
[7,] "M R"
[8,] "M R"
[9,] "M R"
[10,] "M U"
[11,] "M U"
[12,] "M U"
fixed.strata:
[1] "all"
FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
EXAMPLE 2. Dispersion Linear Trend Model.
Marital Status of Danes. Source: Lloyd, C. (1999), "Statistical Analysis of Categorical Data," p. 72.
Age Single Married Divorced Total
17-21 17 1 0 18
21-25 16 8 0 24
25-30 8 17 1 26
30-40 6 22 4 32
40-50 5 21 6 32
50-60 3 17 8 28
60-70 2 8 6 16
70+ 1 3 5 9
Candidate Models:
(yi1, yi2, yi3) <- (Yi1, Yi2, Yi3) indep ~ mult(ni, pi1, pi2, pi3), i=1,...,8.
Here, n1 = 18,..., n8 = 9, and the table probabilities are defined as pij = P(Marital=j|Age Stratum= i).
The table probabilities are modeled as:
Linear-in-Age Model: Gi = b(0) + b(AGE)*xi or Saturated Model: Gi = bi,
where Gi = 1 - (pi12 + pi22 + pi32) = Gini dispersion for age category i and x = (x1,...x8) = (19,23,27,35,45,55,65,80) is the vector of AGE scores [midpoints of the age intervals].
d <- scan(what=list(Age="",Marital = "",Freq=0))
17-21 Single 17
17-21 Married 1
17-21 Divorced 0
21-25 Single 16
21-25 Married 8
21-25 Divorced 0
25-30 Single 8
25-30 Married 17
25-30 Divorced 1
30-40 Single 6
30-40 Married 22
30-40 Divorced 4
40-50 Single 5
40-50 Married 21
40-50 Divorced 6
50-60 Single 3
50-60 Married 17
50-60 Divorced 8
60-70 Single 2
60-70 Married 8
60-70 Divorced 6
70+ Single 1
70+ Married 3
70+ Divorced 5
d <- data.frame(d)
y <- d$Freq
strata <- d$Age
L.fct <- function(p) {
p <- matrix(p,8,3,byrow=T)
L <- c()
for (i in 1:8) {
Gi <- 1- sum(p[i,]*p[i,])
L <- rbind(L, Gi)
}
rownames(L) <- c("G.17","G.21","G.25","G.30","G.40","G.50","G.60","G.70+")
L
}
d1 <- subset(d, Marital=="Single")
Agescore <- scan()
19 23 27 35 45 55 65 80
d1 <- data.frame(d1,Agescore)
d1
Age Marital Freq Agescore
1 17-21 Single 17 19
2 21-25 Single 16 23
3 25-30 Single 8 27
4 30-40 Single 6 35
5 40-50 Single 5 45
6 50-60 Single 3 55
7 60-70 Single 2 65
8 70+ Single 1 80
a <- mph.fit(y,link=L.fct,X=model.matrix(~Agescore,d1),strata=d$Age)
| REMARK: Don't forget to input the strata variable, as the strata determine how the table probabilities are defined. |
mph.summary(a,T)
MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 6 ): Gsq = 6.61522 (p = 0.3579 )
Pearson's Score Stat (df= 6 ): Xsq = 4.96987 (p = 0.54768 )
Generalized Wald Stat (df= 6 ): Wsq = 10.09688 (p = 0.12063 )
Adj Resids: -2.144 -2.144 ... 0.873 2.144 , Number |Adj Resid| > 2: 3
SAMPLING PLAN INFORMATION...
Number of strata: 8
Strata identifiers: 17-21, 21-25, 25-30, 30-40, 40-50, 50-60, 60-70, 70+
Strata with fixed sample sizes: all
Observed strata sample sizes: 18, 24, 26, 32, 32, 28, 16, 9
LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA) Z-ratio p-value
(Intercept) 0.331544223 0.072438461 4.576909 4.718973e-06
Agescore 0.003625855 0.001398731 2.592245 9.535181e-03
OBS LINK ML LINK StdErr(L) LINK RESID
G.17 0.1049383 0.4004355 0.04914697 -2.7342474
G.21 0.4444444 0.4149389 0.04471273 0.4734820
G.25 0.4763314 0.4294423 0.04056645 0.6290227
G.30 0.4765625 0.4584491 0.03356001 0.2198400
G.40 0.5097656 0.4947077 0.02879638 0.1907121
G.50 0.5382653 0.5309662 0.03038880 0.1102546
G.60 0.5937500 0.5672248 0.03753687 0.4409538
G.70+ 0.5679012 0.6216126 0.05358163 -1.0999550
CELL-SPECIFIC STATISTICS...
strata OBS FV StdErr.FV PROB StdErr.PROB ADJUSTED.RESIDS
y1 17-21 17 1.342063e+01 7.917228e-01 7.455907e-01 4.398460e-02 2.143869e+00
y2 17-21 1 3.642503e+00 1.177341e+00 2.023613e-01 6.540785e-02 -2.143869e+00
y3 17-21 0 9.368645e-01 8.349467e-01 5.204803e-02 4.638593e-02 -2.143869e+00
y4 21-25 16 1.694951e+01 1.300864e+00 7.062294e-01 5.420266e-02 -5.237165e-01
y5 21-25 8 7.050495e+00 1.300864e+00 2.937706e-01 5.420266e-02 5.237165e-01
y6 21-25 0 1.015483e-38 1.007712e-19 4.231178e-40 4.198798e-21 -1.015483e-38
y7 25-30 8 6.835339e+00 1.495050e+00 2.628976e-01 5.750194e-02 6.956249e-01
y8 25-30 17 1.839519e+01 1.165229e+00 7.075074e-01 4.481652e-02 -6.956249e-01
y9 25-30 1 7.694700e-01 7.980423e-01 2.959500e-02 3.069394e-02 6.956249e-01
y10 30-40 6 5.696848e+00 1.693282e+00 1.780265e-01 5.291507e-02 2.249922e-01
y11 30-40 22 2.253683e+01 9.857207e-01 7.042760e-01 3.080377e-02 -2.249922e-01
y12 30-40 4 3.766318e+00 1.498098e+00 1.176974e-01 4.681558e-02 2.249922e-01
y13 40-50 5 4.768420e+00 1.627600e+00 1.490131e-01 5.086250e-02 1.951100e-01
y14 40-50 21 2.148669e+01 9.149380e-01 6.714590e-01 2.859181e-02 -1.951100e-01
y15 40-50 6 5.744893e+00 1.733194e+00 1.795279e-01 5.416232e-02 1.951100e-01
y16 50-60 3 2.894242e+00 1.305963e+00 1.033658e-01 4.664152e-02 1.121326e-01
y17 50-60 17 1.725375e+01 1.225208e+00 6.162052e-01 4.375743e-02 -1.121326e-01
y18 50-60 8 7.852012e+00 1.976947e+00 2.804290e-01 7.060526e-02 1.121326e-01
y19 60-70 2 1.607440e+00 8.987154e-01 1.004650e-01 5.616971e-02 4.913694e-01
y20 60-70 8 8.718400e+00 1.352845e+00 5.449000e-01 8.455284e-02 -4.913694e-01
y21 60-70 6 5.674160e+00 1.795040e+00 3.546350e-01 1.121900e-01 4.913694e-01
y22 70+ 1 1.577521e+00 9.291630e-01 1.752801e-01 1.032403e-01 -8.729608e-01
y23 70+ 3 3.157069e+00 1.420296e+00 3.507855e-01 1.578107e-01 -8.729608e-01
y24 70+ 5 4.265410e+00 1.239264e+00 4.739345e-01 1.376960e-01 8.729608e-01
CONVERGENCE INFORMATION...
Original counts used.
iterations = 65 , time elapsed = 0.89
norm.diff = 1.27049 = dist between last and second last iterates.
Did NOT meet norm diff convergence criterion [1e-06]!
norm.score = 6.42349e-08 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.
FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
| REMARK: The fact that norm.diff does not converge to 0 usually means that the ML estimates do not exist owing to 0 counts; indeed, for this example, a look at the fitted values indicates that the iterate estimates approach 0 for the 6th cell. The resulting estimates can be regarded as "extended" ML estimates, which allow for fitted values equal to 0, the boundary value. For the saturated model fitted below, the print.iter=T option is used. We can monitor the iterations and see how the norm.diff criterion is not met. |
#SATURATED MODEL...
a.sat <- mph.fit(y,link=L.fct,X=model.matrix(~Age,d1),print.iter=T, norm.score.conv=1e-9, strata=d$Age)
mph.fit, version 3.0, 5/1/09 , running...
iter= 1 norm.diff= 1.414214 norm.score= 0.005202601
iter= 2 norm.diff= 1.414214 norm.score= 0.00191393
iter= 3 norm.diff= 1.414214 norm.score= 0.0007040955
iter= 4 norm.diff= 1.414214 norm.score= 0.0002590222
iter= 5 norm.diff= 1.414214 norm.score= 9.528896e-05
iter= 6 norm.diff= 1.414214 norm.score= 3.505485e-05
iter= 7 norm.diff= 1.414214 norm.score= 1.289596e-05
iter= 8 norm.diff= 1.414214 norm.score= 4.744158e-06
iter= 9 norm.diff= 1.414214 norm.score= 1.745278e-06
iter= 10 norm.diff= 1.414214 norm.score= 6.42052e-07
iter= 11 norm.diff= 1.414214 norm.score= 2.361977e-07
iter= 12 norm.diff= 1.414214 norm.score= 8.689228e-08
iter= 13 norm.diff= 1.414214 norm.score= 3.196589e-08
iter= 14 norm.diff= 1.414214 norm.score= 1.175959e-08
iter= 15 norm.diff= 1.414214 norm.score= 4.326112e-09
iter= 16 norm.diff= 1.414214 norm.score= 1.591488e-09
iter= 17 norm.diff= 1.414214 norm.score= 5.854756e-10
iter= 18 norm.diff= 1.414214 norm.score= 2.153844e-10
iter= 19 norm.diff= 1.414214 norm.score= 7.92355e-11
iter= 20 norm.diff= 1.414214 norm.score= 2.914911e-11
iter= 21 norm.diff= 1.414214 norm.score= 1.072336e-11
iter= 22 norm.diff= 1.414214 norm.score= 3.944907e-12
iter= 23 norm.diff= 1.414214 norm.score= 1.451259e-12
iter= 24 norm.diff= 1.414214 norm.score= 5.339131e-13
iter= 25 norm.diff= 1.414214 norm.score= 1.964825e-13
iter= 26 norm.diff= 1.414214 norm.score= 7.24633e-14
Did NOT meet norm diff convergence criterion [1e-06]!
Norm score convergence criterion [1e-09] was met.
Time Elapsed: 0.06 seconds , Iterations: 26
mph.summary(a.sat)
MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 0 ): Gsq = 0
Pearson's Score Stat (df= 0 ): Xsq = 0
Generalized Wald Stat (df= 0 ): Wsq = 0
Adj Resids: 0 0 ... 0 0 , Number |Adj Resid| > 2: 0
SAMPLING PLAN INFORMATION...
Number of strata: 8
Strata identifiers: 17-21, 21-25, 25-30, 30-40, 40-50, 50-60, 60-70, 70+
Strata with fixed sample sizes: all
Observed strata sample sizes: 18, 24, 26, 32, 32, 28, 16, 9
LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA) Z-ratio p-value
(Intercept) 0.1049383 0.09598275 1.093303 2.742606e-01
Age21-25 0.3395062 0.11544659 2.940807 3.273580e-03
Age25-30 0.3713931 0.12049262 3.082289 2.054152e-03
Age30-40 0.3716242 0.12904010 2.879913 3.977852e-03
Age40-50 0.4048274 0.12569882 3.220614 1.279164e-03
Age50-60 0.4333270 0.11931348 3.631836 2.814116e-04
Age60-70 0.4888117 0.11346716 4.307958 1.647690e-05
Age70+ 0.4629630 0.13967541 3.314563 9.178648e-04
OBS LINK ML LINK StdErr(L) LINK RESID
G.17 0.1049383 0.1049383 0.09598275 0
G.21 0.4444444 0.4444444 0.06415003 0
G.25 0.4763314 0.4763314 0.07284080 0
G.30 0.4765625 0.4765625 0.08624766 0
G.40 0.5097656 0.5097656 0.08116345 0
G.50 0.5382653 0.5382653 0.07087326 0
G.60 0.5937500 0.5937500 0.06051536 0
G.70+ 0.5679012 0.5679012 0.10147184 0
CONVERGENCE INFORMATION...
Original counts used.
iterations = 26 , time elapsed = 0.06
norm.diff = 1.41421 = dist between last and second last iterates.
Did NOT meet norm diff convergence criterion [1e-06]!
norm.score = 7.24633e-14 = norm of score at last iteration.
Norm score convergence criterion [1e-09] was met.
FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
EXAMPLE 3. ML Fit of Marginal Homogeneity Model
Display of unaided distance vision data from 30-39 year old FEMALE employees of United Kingdom Royal Ordnance factories during 1943-1946 (Kendall and Stuart 1961, "The Advanced Theory of Statistics, Vol. 2: Inference and Relationship," p564 and p 586. See also, Stokes, Davis, and Koch (2000), "Categorical Data Analysis using the SAS System," p. 443)
Left Grade
Right Grade 1 2 3 4 Total
----------- --------------------- -----
1 1520 266 124 66 1976
2 234 1512 432 78 2256
3 117 362 1772 205 2456
4 36 82 179 492 789
--------------------- -----
Total 1907 2222 2507 841 7477
Model of Marginal Homogeneity:
It is assumed that y <- Y ~ multinomial( n=7477, p), where the table probabilities satisfy:
pi+ = p+i, i=1,2,3, 4. That is, h(p) = [p1+ - p+1, p2+- p+2, p3+ - p+3]' = 0.
Note: The last constraint p4+ = p+4 is implied by the first three, so is omitted to avoid redundancies.
Equivalently, the model can be specified in terms of expected counts m:
mi+ = m+i, i=1,2,3, 4. That is, h(m) = [m1+ - m+1, m2+- m+2, m3+ - m+3]' = 0.
Alternatively, the marginal homogeneity model can be expressed in HLP form:
L(p) = [p1+, p2+, p3+, p4+, p+1, p+2, p+3, p+4 ]' = [b(1), b(2), b(3), b(4), b(1), b(2), b(3), b(5)]' = X b
Note: Given that (p1+ = p+1, p2+ = p+2 , p3+ = p+3) it follows that p4+ = p+4. That is, the last constraint is redundant. Hence, the introduction of b(5) in the HLP model.
y <- scan()
1520 266 124 66
234 1512 432 78
117 362 1772 205
36 82 179 492
RGrade <- gl(4,4,16)
LGrade <- gl(4,1,16)
data.frame(RGrade,LGrade,y)
strata <- rep(1,16)
MR <- M.fct(RGrade) #M.fct is included in mph.supp.Rcode.txt
ML <- M.fct(LGrade)
# Alternative code...
# MR <- kronecker(diag(4),matrix(1,1,4))
# ML <- kronecker(matrix(1,1,4),diag(4))
h.fct <- function(p) {
as.matrix(c(MR%*%p - ML%*%p)[-4])
}
# Fit the h(p) = 0 version of the model...
ap <- mph.fit(y,constraint=h.fct)
# Fit the h(m) = 0 version of the model...
am <- mph.fit(y,constraint=h.fct,h.mean=T)
#NOTE: When the model is specified in terms of m, rather than p,
# mph.fit checks for Z homogeneity. In the HLP setting, mph.fit also
# checks whether the link is an HLP link.
#
# Submitting, am <- mph.fit(y,constraint=h.fct,h.mean=T)
# results in the following message:
#
# CHECKING whether h(.) is Z homogeneous...Necessary condition [tol=1e-09] passed.
#
# The necessary condition is met. There is no evidence that h(.) is not Z homogeneous.
#
# If the necessary condition alluded to in the message is not passed, it could be
# for one of two reasons: (1) h(.) is actually not Z homogeneous or (2) the
# necessary condition check was subject to too much round-off error. You can
# try using larger tolerance values (default=1e-09), e.g. check.homog.tol = 1e-4.
# If the condition does not pass with relatively large tolerance value, then it is
# likely that h(.) is not Z homogeneous; which means that the model not only constrains the
# table probabilities, it also constrains the expected sample sizes.
# In this non-Z-homogeneous case, mph.fit can not be used to fit the model.
#
cbind(ap$Xsq,am$Xsq)
PEARSON SCORE STATISTIC PEARSON SCORE STATISTIC
11.96980 11.96980
cbind(ap$m,am$m,ap$p,am$p)
FV FV PROB PROB
m1 1520.00000 1520.00000 0.203290090 0.203290090
m2 252.48209 252.48209 0.033767833 0.033767833
m3 111.84293 111.84293 0.014958263 0.014958263
m4 56.96589 56.96589 0.007618817 0.007618817
m5 247.23710 247.23710 0.033066350 0.033066350
m6 1512.00000 1512.00000 0.202220142 0.202220142
m7 409.41751 409.41751 0.054756923 0.054756923
m8 70.58517 70.58517 0.009440307 0.009440307
m9 131.26859 131.26859 0.017556318 0.017556318
m10 383.13268 383.13268 0.051241497 0.051241497
m11 1772.00000 1772.00000 0.236993447 0.236993447
m12 195.25849 195.25849 0.026114549 0.026114549
m13 42.78522 42.78522 0.005722245 0.005722245
m14 91.62502 91.62502 0.012254249 0.012254249
m15 188.39931 188.39931 0.025197179 0.025197179
m16 492.00000 492.00000 0.065801792 0.065801792
mph.summary(am)
MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 3 ): Gsq = 11.9872 (p = 0.0074271 )
Pearson's Score Stat (df= 3 ): Xsq = 11.9698 (p = 0.0074873 )
Generalized Wald Stat (df= 3 ): Wsq = 11.95657 (p = 0.0075334 )
Adj Resids: -3.179 -2.733 ... 2.733 3.179 , Number |Adj Resid| > 2: 6
SAMPLING PLAN INFORMATION...
Number of strata: 1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes: 7477
CONVERGENCE INFORMATION...
Original counts used.
iterations = 8 , time elapsed = 0.04
norm.diff = 9.67905e-08 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 7.67784e-07 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.
FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
# Alternative to numerically computing the derivative of h() with respect to m,
# input the derivative explicitly...
derht.fct <- function(m) {
t((MR- ML)[-4,])
}
am2 <- mph.fit(y,constraint=h.fct, h.mean=T, derht.fct=derht.fct)
CHECKING whether h(.) is Z homogeneous...Necessary condition [tol=1e-09] passed.
mph.fit, version 3.0, 5/1/09 , running...
Convergence criteria have been met.
Time Elapsed: 0.03 seconds , Iterations: 8
# Fit the HLP version of the marginal homogeneity model...
L.fct <- function(p) {
L <- as.matrix(c(MR%*%p,ML%*%p))
rownames(L) <- c("P(R=1)","P(R=2)","P(R=3)","P(R=4)",
"P(L=1)","P(L=2)","P(L=3)","P(L=4)")
L
}
X <- scan()
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 0 1
X <- matrix(X, 8,5,byrow=T)
#L.fct = X beta, constrains the first three marginal probs to be equal.
# This implies the fourth marginal probs are also equal.
#
ap.HLP <- mph.fit(y,link=L.fct,X=X)
mph.summary(ap.HLP)
MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 3 ): Gsq = 11.9872 (p = 0.0074271 )
Pearson's Score Stat (df= 3 ): Xsq = 11.9698 (p = 0.0074873 )
Generalized Wald Stat (df= 3 ): Wsq = 11.97572 (p = 0.0074668 )
Adj Resids: -3.179 -2.733 ... 2.733 3.179 , Number |Adj Resid| > 2: 6
SAMPLING PLAN INFORMATION...
Number of strata: 1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes: 7477
LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA) Z-ratio p-value
beta1 0.2596350 0.004683916 55.43119 0
beta2 0.2994837 0.004642237 64.51281 0
beta3 0.3319058 0.004827707 68.75019 0
beta4 0.1089755 0.003177685 34.29398 0
beta5 0.1089755 0.003177685 34.29398 0
OBS LINK ML LINK StdErr(L) LINK RESID
P(R=1) 0.2642771 0.2596350 0.004683916 2.3908957
P(R=2) 0.3017253 0.2994837 0.004642237 0.8786678
P(R=3) 0.3284740 0.3319058 0.004827707 -1.3618671
P(R=4) 0.1055236 0.1089755 0.003177685 -2.0309320
P(L=1) 0.2550488 0.2596350 0.004683916 -2.3620900
P(L=2) 0.2971780 0.2994837 0.004642237 -0.9038095
P(L=3) 0.3352949 0.3319058 0.004827707 1.3449095
P(L=4) 0.1124783 0.1089755 0.003177685 2.0609046
CONVERGENCE INFORMATION...
Original counts used.
iterations = 37 , time elapsed = 0.39
norm.diff = 8.28358e-07 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 1.56281e-07 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.
FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
EXAMPLE 4. Loglinear models under different sampling plans
Bicycle Data (Table 16.4, p. 557, Stokes, Davis, and Koch (2000), "Categorical Data Analysis using the SAS System")
Wearing Helmet?
Bicycle Type Yes No
Mountain 34 32
Other 10 24
Fit the independence loglinear model under a variety of sampling plans.
d <- scan(what=list(bike="",helmet="",count=0)) Mountain Yes 34 Mountain No 32 Other Yes 10 Other No 24 d <- data.frame(d) y <- d$count a1 <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=1) # Full Multinomial a2 <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=1,fixed.strata="none") # Full Poisson a3 <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=d$bike) # Prod (Row) Multinomial a4 <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=d$helmet) # Prod (Col) Multinomial
c(a1$Xsq,a2$Xsq,a3$Xsq,a4$Xsq) [1] 4.449383 4.449383 4.449383 4.449383
cbind(a1$m,a2$m,a3$m,a4$m)
FV FV FV FV
m1 29.04 29.04 29.04 29.04 [Fitted values are identical]
m2 36.96 36.96 36.96 36.96
m3 14.96 14.96 14.96 14.96
m4 19.04 19.04 19.04 19.04
cbind(sqrt(diag(a1$covm)), sqrt(diag(a2$covm)), sqrt(diag(a3$covm)), sqrt(diag(a4$covm)))
[,1] [,2] [,3] [,4]
m1 3.882984 4.848792 3.276154 2.084319 [Approx Std Errors of fitted values depend on sampling plan]
m2 4.215491 5.606316 3.276154 2.652769
m3 2.681934 3.070958 1.687716 2.084319
m4 3.144132 3.675702 1.687716 2.652769
cbind(a1$beta,a2$beta,a3$beta,a4$beta)
BETA BETA BETA BETA
(Intercept) 3.6098362 3.6098362 3.6098362 3.6098362
bikeOther -0.6632942 -0.6632942 -0.6632942 -0.6632942
helmetYes -0.2411621 -0.2411621 -0.2411621 -0.2411621
cbind(sqrt(diag(a1$covbeta)),sqrt(diag(a2$covbeta)),sqrt(diag(a3$covbeta)),sqrt(diag(a4$covbeta)))
[,1] [,2] [,3] [,4]
(Intercept) 0.1140555 0.1516861 8.864053e-02 7.177406e-02
bikeOther 0.2111002 0.2111002 4.121552e-07 2.111002e-01
helmetYes 0.2014557 0.2014557 2.014557e-01 5.281921e-07
cbind(a1$p,sqrt(diag(a1$covp)), a2$p,sqrt(diag(a2$covp)), a3$p,sqrt(diag(a3$covp)),a4$p,sqrt(diag(a4$covp)))
PROB PROB PROB PROB
p1 0.2904 0.03882984 0.2904 0.03882984 0.44 0.04963869 0.66 0.04737088
p2 0.3696 0.04215491 0.3696 0.04215491 0.56 0.04963869 0.66 0.04737088
p3 0.1496 0.02681934 0.1496 0.02681934 0.44 0.04963869 0.34 0.04737088
p4 0.1904 0.03144132 0.1904 0.03144132 0.56 0.04963869 0.34 0.04737088
# Summarize the result for the product (row) multinomial sampling case...
mph.summary(a3,T)
MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 1 ): Gsq = 4.55692 (p = 0.032786 )
Pearson's Score Stat (df= 1 ): Xsq = 4.44938 (p = 0.034914 )
Generalized Wald Stat (df= 1 ): Wsq = 4.33093 (p = 0.037426 )
Adj Resids: -2.109 -2.109 2.109 2.109 , Number |Adj Resid| > 2: 4
SAMPLING PLAN INFORMATION...
Number of strata: 2
Strata identifiers: Mountain, Other
Strata with fixed sample sizes: all
Observed strata sample sizes: 66, 34
LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA) Z-ratio p-value
(Intercept) 3.6098362 8.864053e-02 4.072445e+01 0.0000000
bikeOther -0.6632942 4.121552e-07 -1.609331e+06 0.0000000
helmetYes -0.2411621 2.014557e-01 -1.197097e+00 0.2312688
OBS LINK ML LINK StdErr(L) LINK RESID
link1 3.526361 3.368674 0.11281521 1.947417
link2 3.465736 3.609836 0.08864053 -2.264984
link3 2.302585 2.705380 0.11281521 -2.562617
link4 3.178054 2.946542 0.08864053 1.874599
CELL-SPECIFIC STATISTICS...
strata OBS FV StdErr.FV PROB StdErr.PROB ADJUSTED.RESIDS
y1 Mountain 34 29.04 3.276154 0.44 0.04963869 2.109356
y2 Mountain 32 36.96 3.276154 0.56 0.04963869 -2.109356
y3 Other 10 14.96 1.687716 0.44 0.04963869 -2.109356
y4 Other 24 19.04 1.687716 0.56 0.04963869 2.109356
CONVERGENCE INFORMATION...
Original counts used.
iterations = 5 , time elapsed = 0.05
norm.diff = 5.82122e-11 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 2.37238e-12 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.
FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
# By default, the derivative of the link was numerically computed in
# the mph.fit calls above. In the log-linear model setting, the link
# derivative has a simple explicit form. This derivative with
# respect to m can be included as input. For example....
derLt.fct <- function(m) {
diag(c(1/m))
}
a11 <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=1,derLt.fct=derLt.fct)
EXAMPLE 5. Loglinear models of independence
Bicycle Data (Table 16.4, p. 557, Stokes, Davis, and Koch (2000), "Categorical Data Analysis using the SAS System")
Wearing Helmet?
Bicycle Type Yes No
Mountain 34 32
Other 10 24
Loglinear Model of Independence:
Assume that y <- multinomial(n=100, p), where the table probabilities p satisfy:
log(pij) = b(0) + b(Bike)i + b(Helmet)j ; generically log(p) = X b.
or equivalently, in constraint form,
h(p) = U'log(p) = 0, where U is an orthogonal complement of X.
This model can also be re-expressed in terms of the expected counts m:
log(mij) = b(0) + b(Bike)i + b(Helmet)j ; generically log(m) = X b.
or equivalently, in constraint form,
h(m) = U'log(m) = 0, where U is an orthogonal complement of X.
d <- scan(what=list(bike="",helmet="",count=0))
Mountain Yes 34
Mountain No 32
Other Yes 10
Other No 24
d <- data.frame(d)
y <- d$count
# Fit the log-linear models of independence...
ap <- mph.fit(y,link="logp",X=model.matrix(~bike+helmet,d),strata=1) #log(p) = Xb
am <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=1) #log(m) = Xb
# Fit the constraint specification of the independence model...
U <- create.U(model.matrix(~bike+helmet,d))
h.fct <-
function(p) {
t(U)%*%log(p)
}
ap2 <- mph.fit(y,constraint=h.fct, strata=1) #h(p) = U'log(p) = 0
am2 <- mph.fit(y,constraint=h.fct,h.mean=T,strata=1) #h(m) = U'log(m) = 0
cbind(ap$m,am$m,ap2$m,am2$m,ap$p,am$p,ap2$p,am2$p)
FV FV FV FV PROB PROB PROB PROB
m1 29.04 29.04 29.04 29.04 0.2904 0.2904 0.2904 0.2904 [Same fitted values and probability estimates.]
m2 36.96 36.96 36.96 36.96 0.3696 0.3696 0.3696 0.3696
m3 14.96 14.96 14.96 14.96 0.1496 0.1496 0.1496 0.1496
m4 19.04 19.04 19.04 19.04 0.1904 0.1904 0.1904 0.1904
#NOTE: When the models are specified in terms of m, rather than p,
# mph.fit checks for Z homogeneity. In the HLP setting, mph.fit also
# checks whether the link is an HLP link.
#
# Submitting, am <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=1)
# results in the following messages:
#
# CHECKING whether L(.) is an HLP link function...Necessary condition [tol=1e-09] passed.
# CHECKING whether h(.) is Z homogeneous...Necessary condition [tol=1e-09] passed.
#
# The necessary conditions are met. There is no evidence that the model is not an HLP model.
#
# If the necessary conditions alluded to in the messages are not passed, you can
# try using larger tolerance values, e.g. check.homog.tol = 1e-4 and check.HLP.tol= 1e-4.
#
EXAMPLE 6. Logit models under different sampling plans
Bicycle Data (Table 16.4, p. 557, Stokes, Davis, and Koch (2000), "Categorical Data Analysis using the SAS System")
Wearing Helmet?
Bicycle Type Yes No
Mountain 34 32
Other 10 24
Logit Model:
log(pi1/pi2) = b(0) + b(Bike)i.
Candidate Sampling Distributions:
If the data are the result of a simple random sample from a single population, then pij = P(Bike=i, Helmet = j). The expected counts have the form mij = ν pij, where ν is the expected sample size, which may [full-multinomial sampling plan] or may not [full-Poisson sampling plan] be fixed a priori.
If the data are the result of a row-stratified random sample, then pij = P(Helmet = j | Bike = i) . The expected counts have the form mij = νi pij, where νi is the expected sample size for the stratum [Bike = i]. These expected sample sizes may or may not be fixed a priori.
If the data are the result of a column-stratified random sample, then pij = P(Bike= i | Helmet = j) . The expected counts have the form mij = νj pij, where νj is the expected sample size for the stratum [Helmet = j]. These expected sample sizes may or may not be fixed a priori. Technically, under this sampling plan, the "logit" model is not a log odds model at all. It actually has the form log(pi1/pi2) = log(P(Bike=i|Helmet=1)/P(Bike=i|Helmet=2)) = log(P(Helmet=1|Bike=i)/P(Helmet=2|Bike=i)) + log(P(Helmet=2)/P(Helmet=1)).
d <- scan(what=list(bike="",helmet="",count=0))
Mountain Yes 34
Mountain No 32
Other Yes 10
Other No 24
d <- data.frame(d)
y <- d$count
L.fct <- function(p) {
p <- matrix(p,2,2,byrow=T)
L <- as.matrix(c(log(p[1,1]/p[1,2]), log(p[2,1]/p[2,2])))
L
}
d1 <- subset(d,helmet=="Yes")
d1
bike helmet count
1 Mountain Yes 34
3 Other Yes 10
a1 <- mph.fit(y,link=L.fct, X=model.matrix(~bike,d1),strata=1) #Full Multinomial
a2 <- mph.fit(y,link=L.fct, X=model.matrix(~bike,d1),strata=1, fixed.strata="none") #Full Poisson
a3 <- mph.fit(y,link=L.fct, X=model.matrix(~bike,d1),strata=d$bike) #Product Multinomial
a4 <- mph.fit(y,link=L.fct, X=model.matrix(~bike,d1),strata=d$helmet) #Product Multinomial
cbind(a1$beta,a2$beta,a3$beta,a4$beta)
BETA BETA BETA BETA
(Intercept) 0.06062462 0.06062462 0.06062462 0.3017867
bikeOther -0.93609336 -0.93609336 -0.93609336 -0.9360934
cbind(sqrt(diag(a1$covbeta)),sqrt(diag(a2$covbeta)),sqrt(diag(a3$covbeta)),sqrt(diag(a4$covbeta)))
[,1] [,2] [,3] [,4]
(Intercept) 0.2462961 0.2462961 0.2462961 0.1416946
bikeOther 0.4498093 0.4498093 0.4498093 0.4498093
cbind(a1$p,sqrt(diag(a1$covp)),a2$p,sqrt(diag(a2$covp)),a3$p,sqrt(diag(a3$covp)),a4$p,sqrt(diag(a4$covp)))
PROB PROB PROB PROB
p1 0.34 0.04737088 0.34 0.04737088 0.5151515 0.06151748 0.7727273 0.06317721
p2 0.32 0.04664762 0.32 0.04664762 0.4848485 0.06151748 0.5714286 0.06613001
p3 0.10 0.03000000 0.10 0.03000000 0.2941176 0.07814249 0.2272727 0.06317721
p4 0.24 0.04270831 0.24 0.04270831 0.7058824 0.07814249 0.4285714 0.06613001
EXAMPLE 7. Very sparse table: fine-tuning the fitting algorithm
y <- c(rep(0,499),1) y [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [60] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [119] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [178] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [237] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [296] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [355] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [414] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [473] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Model Fitted: y <- Y ~ multinomial(1, p), where p1 =
p2 = ... = p500 , equivalently m1 = ... =
m500. (This is the model that states all the cell probabilities are 0.002.)
C <- cbind(1,-diag(499))
h.fct <- function(m) {
C%*%m
}
derht.fct <- function(m) {
t(C)
}
a <- mph.fit(y,h.fct=h.fct, h.mean=T, derht.fct=derht.fct, print.iter=T) #Does not converge!
| REMARK: The MLe's exist for this model of equal probabilities. However, the many zero counts cause fitting problems. One way to mitigate fitting problems with zeroes is to begin the iterations using modified (non-zero) counts such as y+0.1. The next run does just this. By default, the modified counts y+y.eps are used until the 5th iteration, at which time the original counts y are used. |
a <- mph.fit(y,h.fct=h.fct,derht.fct=derht.fct,h.mean=T, y.eps=0.1) #Temporarily add 0.1 to counts # # Note: If derht.fct were not entered, then numerical derivatives of h wrt m would be used, and the # fitting would be quite a bit slower. # c(a$m) [1] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [20] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [39] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 ....[omitted output] [362] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [381] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [400] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [419] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [438] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [457] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [476] 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 [495] 0.002 0.002 0.002 0.002 0.002 0.002
# Another way to fine tune the algorithm is to set the 'step' parameter to a # smaller value (default is step=1). As an example, # a <- mph.fit(y,h.fct=h.fct,derht.fct=derht.fct,h.mean=T, step=0.5) does converge.
mph.summary(a)
MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 499 ): Gsq = 12.42922 (p = 1 )
Pearson's Score Stat (df= 499 ): Xsq = 499 (p = 0.49158 )
Generalized Wald Stat (df= 499 ): Wsq = 0.98008 (p = 1 )
WARNING: 100% of expected counts are less than 5.
Chi-square approximation may be questionable.
Adj Resids: -0.045 -0.045 ... -0.045 22.338 , Number |Adj Resid| > 2: 1
SAMPLING PLAN INFORMATION...
Number of strata: 1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes: 1
CONVERGENCE INFORMATION...
Original counts used.
iterations = 52 , time elapsed = 61.01
norm.diff = 1.1344e-06 = dist between last and second last iterates.
Did NOT meet norm diff convergence criterion [1e-06]!
norm.score = 2.43078e-09 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.
FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
| REMARK. Notice that the CONVERGENCE INFORMATION output tells you that the original counts y (not the modified counts y+y.eps) were used. |
EXAMPLE 8. Given row marginal probs = (0.2, 0.3, 0.5), column marginal probs = (0.4, 0.1, 0.5), and local odds ratios or11 = 3, or12=2, or21=1, or22=4, compute the joint distribution.
y <- matrix(1,9,1) # any reasonable positive numbers will work here
A <- gl(3,3,9)
B <- gl(3,1,9)
MA <- M.fct(A)
MB <- M.fct(B)
# Alternative way to create MA and MB...
# MA <- t(kronecker(diag(3),matrix(1,3,1)))
# MB <- t(kronecker(matrix(1,3,1),diag(3)))
#
h.fct <- function(p) {
or11 <- p[1]*p[5]/(p[2]*p[4])
or12 <- p[2]*p[6]/(p[3]*p[5])
or21 <- p[4]*p[8]/(p[5]*p[7])
or22 <- p[5]*p[9]/(p[6]*p[8])
h <- c(
MA%*%p - c(0.2,0.3,0.5),
MB%*%p - c(0.4,0.1,0.5),
or11-3,or12-2,or21-1,or22-4
)
as.matrix(h[c(-3,-6)]) #Delete the two redundant marginal prob constraints.
}
a <- mph.fit(y,h.fct=h.fct)
a$p
PROB
p1 0.15913038
p2 0.01804733
p3 0.02282230
p4 0.13631718
p5 0.04638010
p6 0.11730272
p7 0.10455244
p8 0.03557257
p9 0.35987499
#
# When both h.fct and (L.fct, X) are input, mph.fit ignores any constraints imposed by L.fct = X b, but
# it does compute ML estimates of L and b (under the model with constraint h.fct=0).
#
# This feature of mph.fit can be exploited to output estimates of functions of p.
# For example,....
#
L.fct <- function(p) {
or11 <- p[1]*p[5]/(p[2]*p[4])
or12 <- p[2]*p[6]/(p[3]*p[5])
or21 <- p[4]*p[8]/(p[5]*p[7])
or22 <- p[5]*p[9]/(p[6]*p[8])
L <- as.matrix(c(MA%*%p,MB%*%p,
or11,or12,or21,or22))
rownames(L) <- c("prow1","prow2","prow3","pcol1","pcol2","pcol3",
"or11","or12","or21","or22")
L
}
X <- diag(10)
b <- mph.fit(y,h.fct=h.fct,L.fct=L.fct,X=X)
mph.summary(b)
MODEL GOODNESS OF FIT: Test of Ho: h(p)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 8 ): Gsq = 7.36137 (pval = 0.4982 )
Pearson's Score Stat (df= 8 ): Xsq = 11.37638 (pval = 0.1813 )
Generalized Wald Stat (df= 8 ): Wsq = 13.37778 (pval = 0.0995 )
WARNING: 100% of expected counts are less than 5.
Chi-square approximation may be questionable.
Adj Resids: -1.555 -0.394 ... 1.774 2.097 , Number |Adj Resid| > 2: 1
SAMPLING PLAN INFORMATION...
Number of strata: 1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes: 9
LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA) Z-ratio p-value
beta1 0.2 0 Inf 0
beta2 0.3 0 535095116 0
beta3 0.5 0 Inf 0
beta4 0.4 0 217202655 0
beta5 0.1 0 Inf 0
beta6 0.5 0 Inf 0
beta7 3.0 0 Inf 0
beta8 2.0 0 Inf 0
beta9 1.0 0 27493920 0
beta10 4.0 0 51205638 0
OBS LINK ML LINK StdErr(L) LINK RESID
prow1 0.3333 0.2 0 1.0000
prow2 0.3333 0.3 0 0.2182
prow3 0.3333 0.5 0 -1.0000
pcol1 0.3333 0.4 0 -0.4082
pcol2 0.3333 0.1 0 2.3333
pcol3 0.3333 0.5 0 -1.0000
or11 1.0000 3.0 0 -0.2101
or12 1.0000 2.0 0 -0.1319
or21 1.0000 1.0 0 0.0000
or22 1.0000 4.0 0 -0.2881
CONVERGENCE INFORMATION...
Original counts used.
iterations = 7 , time elapsed = 0.03
norm.diff = 9.18893e-09 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 1.23277e-10 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.
FITTING PROGRAM USED: mph.fit, version 3.0, 5/16/09
EXAMPLE 9. Model constraint: area under ROC manifest "curve" (AUC) equal to a constant.
Neonatal Radiograph Data (source: see Table 1, Lang and Aspelund (2001), Statistical Modelling--An International Journal)
Video Image Rating
1 2 3 4 5 Total
------------------- -----
Normals 6 17 4 5 1 33
Abnormals 4 5 5 15 38 67
Models Fitted Below...
Sampling Distribution: Product Multinomial
(Yi1, ..., Yi5) indep ~ mult(ni, pi1, ... , pi5), i=1,2.
Here, n1 = 33 and n2 = 67. The table probabilities are defined as pij = P(Rating=j| Strata=i).
Systematic Component:
Constraint: In words, "Area under the manifest ROC `curve' equals 0.9." In symbols, h(p) = AUC(p) - 0.9 = 0, where AUC(p) = the area under the manifest ROC "curve." Specifically,
AUC(p) = p11*(p22 + p23 + p24 + p25) + p12*(p23 + p24 + p25) + p13*(p24 + p25) + p14*p25 + 0.5*(p11*p21 + p12*p22 + ... + p15*p25) = P(B > A) + 0.5*P(B=A), where A ~ (p11, ..., p15) and B ~ (p21, ..., p25).
y <- scan()
6 17 4 5 1
4 5 5 15 38
strata <- c(rep("Normals",5),rep("Abnormals",5))
h.fct <- function(p) {
p[1]*(p[7]+p[8]+p[9]+p[10]) +
p[2]*(p[8]+p[9]+p[10]) +
p[3]*(p[9]+p[10]) +
p[4]*(p[10]) +
0.5*(p[1]*p[6]+p[2]*p[7]+p[3]*p[8]+p[4]*p[9]+p[5]*p[10]) - 0.90
}
#
# To have mph.fit output the observed and fitted value of AUC(p), we augment the model
# specification by including both h.fct and (L.fct, X) as input. When h.fct and (L.fct,X),
# are input, mph.fit ignores any constraints imposed by L.fct = X b, but
# it does compute ML estimates of L(p) and b (under the model with constraint h.fct=0).
#
# This feature of mph.fit can be exploited to output estimates of function of p, here, AUC(p)...
#
L.fct <- function(p) {
L <- as.matrix(
p[1]*(p[7]+p[8]+p[9]+p[10]) +
p[2]*(p[8]+p[9]+p[10]) +
p[3]*(p[9]+p[10]) +
p[4]*(p[10]) +
0.5*(p[1]*p[6]+p[2]*p[7]+p[3]*p[8]+p[4]*p[9]+p[5]*p[10])
)
rownames(L) <- "AUC"
L
}
a <- mph.fit(y,h.fct=h.fct,L.fct=L.fct,X=as.matrix(1),strata=strata)
mph.summary(a,T)
MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 1 ): Gsq = 1.94502 (p = 0.16312 )
Pearson's Score Stat (df= 1 ): Xsq = 2.19474 (p = 0.13848 )
Generalized Wald Stat (df= 1 ): Wsq = 1.52065 (p = 0.21752 )
Adj Resids: -1.481 -1.481 ... 1.481 1.481 , Number |Adj Resid| > 2: 0
SAMPLING PLAN INFORMATION...
Number of strata: 2
Strata identifiers: Abnormals, Normals
Strata with fixed sample sizes: all
Observed strata sample sizes: 67, 33
LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA) Z-ratio p-value
beta1 0.9 4.418826e-10 2036740159 0
OBS LINK ML LINK StdErr(L) LINK RESID
AUC 0.85346 0.9 4.418826e-10 -1.505451
CELL-SPECIFIC STATISTICS...
strata OBS FV StdErr.FV PROB StdErr.PROB ADJUSTED.RESIDS
y1 Normals 6 6.7996921 2.2599169 0.20605128 0.06848233 -1.481464
y2 Normals 17 17.8647239 2.8022957 0.54135527 0.08491805 -1.481464
y3 Normals 4 3.8316273 1.8367962 0.11610992 0.05566049 1.481464
y4 Normals 5 3.9681015 1.7337208 0.12024550 0.05253699 1.481464
y5 Normals 1 0.5358551 0.6549779 0.01623803 0.01984781 1.481464
y6 Abnormals 4 2.5477157 1.2205911 0.03802561 0.01821778 1.481464
y7 Abnormals 5 3.8380538 1.7329256 0.05728439 0.02586456 1.481464
y8 Abnormals 5 4.6833212 2.0761170 0.06990032 0.03098682 1.481464
y9 Abnormals 15 15.2579803 3.4282561 0.22773105 0.05116800 -1.481464
y10 Abnormals 38 40.6729290 3.5674591 0.60705864 0.05324566 -1.481464
CONVERGENCE INFORMATION...
Original counts used.
iterations = 56 , time elapsed = 0.25
norm.diff = 8.33704e-07 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 1.07391e-07 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.
FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
EXAMPLE 10. Marginal homogeneity of multidimensional contingency tables
3x3x3 Table. See S. Kullback, Annals of Mathematical Statistics, Vol. 42, No. 2. (Apr., 1971), pp. 594-606.
y <- scan()
223 24 6 40 42 2 19 4 12
28 6 9 25 218 6 3 13 9
26 3 18 18 30 24 12 16 164
A <- gl(3,9,27)
B <- gl(3,3,27)
C <- gl(3,1,27)
data.frame(A,B,C,y)
MA <- M.fct(A)
MB <- M.fct(B)
MC <- M.fct(C)
# Alternatively, the M matrices can be computed as follows:
# MA <- kronecker(diag(3),matrix(1,1,9)))
# MB <- kronecker(matrix(1,1,3),kronecker(diag(3),matrix(1,1,3)))
# MC <- kronecker(matrix(1,1,9),diag(3))
C.mat <- scan()
1 0 0 -1 0 0 0 0 0
1 0 0 0 0 0 -1 0 0
0 1 0 0 -1 0 0 0 0
0 1 0 0 0 0 0 -1 0
C.mat <- matrix(C.mat,4,9,byrow=T)
L.fct <- function(p) {
pmA <- MA%*%p
pmB <- MB%*%p
pmC <- MC%*%p
L <- as.matrix(c(pmA,pmB,pmC))
rownames(L) <- c("P(A=1)","P(A=2)","P(A=3)","P(B=1)","P(B=2)","P(B=3)","P(C=1)","P(C=2)","P(C=3)")
L
}
h.fct <- function(p) {
C.mat%*%L.fct(p)
}
a <- mph.fit(y,h.fct=h.fct,L.fct=L.fct,X=diag(9))
mph.summary(a)
MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 4 ): Gsq = 50.39723 (p = 2.983e-10 )
Pearson's Score Stat (df= 4 ): Xsq = 48.9876 (p = 5.8737e-10 )
Generalized Wald Stat (df= 4 ): Wsq = 49.87851 (p = 3.828e-10 )
Adj Resids: -5.556 -4.739 ... 5.556 6.357 , Number |Adj Resid| > 2: 16
SAMPLING PLAN INFORMATION...
Number of strata: 1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes: 1000
LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA) Z-ratio p-value
beta1 0.3687506 0.01291861 28.54413 0
beta2 0.3569100 0.01271853 28.06221 0
beta3 0.2743394 0.01210117 22.67047 0
beta4 0.3687506 0.01291861 28.54413 0
beta5 0.3569100 0.01271853 28.06221 0
beta6 0.2743394 0.01210117 22.67047 0
beta7 0.3687506 0.01291861 28.54413 0
beta8 0.3569100 0.01271853 28.06221 0
beta9 0.2743394 0.01210117 22.67047 0
OBS LINK ML LINK StdErr(L) LINK RESID
P(A=1) 0.372 0.3687506 0.01291861 0.4003281
P(A=2) 0.317 0.3569100 0.01271853 -4.8482153
P(A=3) 0.311 0.2743394 0.01210117 5.0529720
P(B=1) 0.343 0.3687506 0.01291861 -3.1724923
P(B=2) 0.405 0.3569100 0.01271853 5.8419001
P(B=3) 0.252 0.2743394 0.01210117 -3.0790551
P(C=1) 0.394 0.3687506 0.01291861 3.1107435
P(C=2) 0.356 0.3569100 0.01271853 -0.1105505
P(C=3) 0.250 0.2743394 0.01210117 -3.3547171
CONVERGENCE INFORMATION...
Original counts used.
iterations = 56 , time elapsed = 0.8
norm.diff = 7.23424e-07 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 1.90481e-07 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.
FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
EXAMPLE 11. Contingency tables with given marginals.
3x4 Table. See C. T. Ireland, S. Kullback, Biometrika, Vol. 55, No. 1. (Mar., 1968), pp. 179-188.
Model Fitted Below:
y <- Y ~ multinomial(n, p), where the table probabilities p satisfy:
h(p) = [p1+, p2+, p+1, p+2, p+3] - [0.7837288, 0.14831812, 0.07827901, 0.46148631, 0.29658409] = 0.
y <- scan()
783 7426 4709 2145
517 928 622 703
207 373 337 425
V1 <- gl(3,4,12)
V2 <- gl(4,1,12)
M1 <- M.fct(V1)
M2 <- M.fct(V2)
# Alternatively, the M matrices can be computed as
# M1 <- kronecker(diag(3),matrix(1,1,4))
# M2 <- kronecker(matrix(1,1,3),diag(4))
marg <- scan()
15028 2844 1303 1501 8849 5687 3138
margprob <- marg/19175
A <- scan()
1 0 0 0 0 0 0
0 1 0 0 0 0 0
0 0 0 1 0 0 0
0 0 0 0 1 0 0
0 0 0 0 0 1 0
A <- matrix(A,5,7,byrow=T)
h.fct <- function(p) {
obsmp <- rbind(M1%*%p,M2%*%p)
A%*%(obsmp - margprob)
}
a <- mph.fit(y=y,h.fct=h.fct)
# Using the augmentation trick as described in Examples 8 and 9...
L.fct <- function(p) {
L <- as.matrix(c(M1%*%p,M2%*%p))
rownames(L) <- c("p[1,+]","p[2,+]","p[3,+]","p[+,1]","p[+,2]","p[+,3]","p[+,4]")
L
}
a2 <- mph.fit(y=y,h.fct=h.fct,L.fct=L.fct,X=diag(7))
mph.summary(a2)
MODEL GOODNESS OF FIT: Test of Ho: h(p)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 5 ): Gsq = 11.30702 (pval = 0.04562 )
Pearson's Score Stat (df= 5 ): Xsq = 11.37322 (pval = 0.04446 )
Generalized Wald Stat (df= 5 ): Wsq = 11.17887 (pval = 0.04795 )
Adj Resids: -2.372 -1.736 ... 2.15 2.816 , Number |Adj Resid| > 2: 3
SAMPLING PLAN INFORMATION...
Number of strata: 1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes: 19175
LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA) Z-ratio p-value
beta1 0.7837 0 15782181923 0
beta2 0.1483 0 6489036323 0
beta3 0.0680 0 1838678421 0
beta4 0.0783 0 Inf 0
beta5 0.4615 0 17166239010 0
beta6 0.2966 0 7509947313 0
beta7 0.1637 0 3517050151 0
OBS LINK ML LINK StdErr(L) LINK RESID
p[1,+] 0.7856 0.7837 0 0.6139
p[2,+] 0.1445 0.1483 0 -1.5036
p[3,+] 0.0700 0.0680 0 1.1191
p[+,1] 0.0786 0.0783 0 0.1613
p[+,2] 0.4551 0.4615 0 -1.7673
p[+,3] 0.2956 0.2966 0 -0.3004
p[+,4] 0.1707 0.1637 0 2.6352
CONVERGENCE INFORMATION...
Original counts used.
iterations = 38 , time elapsed = 0.15
norm.diff = 4.5344e-07 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 5.67182e-08 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.
FITTING PROGRAM USED: mph.fit, version 3.0, 5/16/09
cbind(a2$L,margprob)
ML LINK margprob
p[1,+] 0.78372881 0.78372881
p[2,+] 0.14831812 0.14831812
p[3,+] 0.06795306 0.06795306
p[+,1] 0.07827901 0.07827901
p[+,2] 0.46148631 0.46148631
p[+,3] 0.29658409 0.29658409
p[+,4] 0.16365059 0.16365059
Note: The fitted marginal probabilities do match the given marginal probabilities
EXAMPLE 12. Testing for (conditional) pairwise independence when response includes "unknown" category.
3x3x3 Table. For a description of the model, see Michael Haber's Example 2, p. 433, in Biometrics (in Shorter Communications), Vol. 42, No. 2. (Jun., 1986), pp. 429-435.
y <- scan()
201 28 37 21 8 7 12 0 0 27 9 5
14 4 4 2 0 0 142 15 0 27 12 0 0 0 0
#Counts are entered as y[i,j,k], where right most subscripts move fastest, i,j,k=3.
A <- gl(3,9,27)
B <- gl(3,3,27)
C <- gl(3,1,27)
MAB <- M.fct(paste(A,B))
MAC <- M.fct(paste(A,C))
MBC <- M.fct(paste(B,C))
# Alternatively,
# MAB <- kronecker(diag(9),matrix(1,1,3))
# MAC <- kronecker(diag(3),kronecker(matrix(1,1,3),diag(3)))
# MBC <- kronecker(matrix(1,1,3),diag(9))
M <- rbind(MAB,MAC,MBC)
Mr <- M[-c(3,6,7,8,9,12,15,16,17,18,21,24,25,26,27),]
# Mr*p gives only those marginal probabilities that do NOT involve the last unknown category.
C <- scan()
1 -1 -1 1 0 0 0 0 0 0 0 0
0 0 0 0 1 -1 -1 1 0 0 0 0
0 0 0 0 0 0 0 0 1 -1 -1 1
C <- matrix(C,3,12,byrow=T)
L.fct <- function(m) {
p <- m/sum(m)
C%*%log(Mr%*%p)
}
X <- scan()
1 1 0
1 0 1
1 0 0
X <- matrix(X,3,3,byrow=T)
h.fct <- function(m) {
L.fct(m)
}
a <- mph.fit(y=y,h.fct=h.fct,step=0.5,maxiter=200,print.iter=T) #zero counts require some fine tuning
mph.fit, version 3.0, 5/1/09 , running...
iter= 1 [ 0 ] norm.diff= 9.519322 norm.score= 1.945013
iter= 2 [ 0 ] norm.diff= 5.295804 norm.score= 2.219742
iter= 3 [ 0 ] norm.diff= 2.827484 norm.score= 1.714237
iter= 4 [ 0 ] norm.diff= 1.75119 norm.score= 1.205030
iter= 5 [ 0 ] norm.diff= 1.603112 norm.score= 0.7939237
iter= 6 [ 0 ] norm.diff= 1.595768 norm.score= 0.5042983
iter= 7 [ 0 ] norm.diff= 1.596907 norm.score= 0.310447
iter= 8 [ 0 ] norm.diff= 1.596947 norm.score= 0.1872117
iter= 9 [ 0 ] norm.diff= 1.596726 norm.score= 0.1111047
iter= 10 [ 0 ] norm.diff= 1.596618 norm.score= 0.06525863
iter= 11 [ 0 ] norm.diff= 1.596599 norm.score= 0.03803102
iter= 12 [ 0 ] norm.diff= 1.596611 norm.score= 0.02205917
[omitted lines]
iter= 39 [ 0 ] norm.diff= 1.596669 norm.score= 3.832624e-08
iter= 40 [ 0 ] norm.diff= 1.596669 norm.score= 2.224625e-08
iter= 41 [ 0 ] norm.diff= 1.596669 norm.score= 1.443151e-08
Did NOT meet norm diff convergence criterion [1e-06]!
Norm score convergence criterion [1e-06] was met.
Time Elapsed: 0.28 seconds , Iterations: 41
mph.summary(a,T)
MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 3 ): Gsq = 33.25029 (p = 2.852e-07 )
Pearson's Score Stat (df= 3 ): Xsq = 38.11701 (p = 2.6698e-08 )
Generalized Wald Stat (df= 3 ): Wsq = 33.77626 (p = 2.2088e-07 )
Adj Resids: -5.526 -4.59 ... 5.749 5.779 , Number |Adj Resid| > 2: 15
SAMPLING PLAN INFORMATION...
Number of strata: 1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes: 575
CELL-SPECIFIC STATISTICS...
strata OBS FV StdErr.FV PROB StdErr.PROB ADJUSTED.RESIDS
y1 1 201 1.826391e+02 1.069712e+01 3.176332e-01 1.860368e-02 5.749431e+00
y2 1 28 3.712722e+01 5.169310e+00 6.456908e-02 8.990104e-03 -3.225310e+00
y3 1 37 3.531513e+01 5.745554e+00 6.141762e-02 9.992268e-03 4.589575e+00
y4 1 21 3.395460e+01 5.143368e+00 5.905148e-02 8.944987e-03 -5.526221e+00
y5 1 8 5.326702e+00 1.879169e+00 9.263829e-03 3.268120e-03 2.023091e+00
y6 1 7 9.337218e+00 2.987685e+00 1.623864e-02 5.195974e-03 -4.589575e+00
y7 1 12 1.174490e+01 3.389471e+00 2.042592e-02 5.894732e-03 1.986529e+00
y8 1 0 1.413466e-10 1.188893e-05 2.458203e-13 2.067640e-08 -1.413466e-10
y9 1 0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09 -1.250153e-11
y10 1 27 3.866533e+01 5.286542e+00 6.724404e-02 9.193986e-03 -4.094285e+00
y11 1 9 7.836827e+00 1.643760e+00 1.362926e-02 2.858713e-03 5.187327e-01
y12 1 5 6.490560e+00 2.512333e+00 1.128793e-02 4.369275e-03 -4.589575e+00
y13 1 14 7.183925e+00 1.831879e+00 1.249378e-02 3.185877e-03 3.525268e+00
y14 1 4 1.102329e+00 9.212980e-01 1.917094e-03 1.602257e-03 5.778889e+00
y15 1 4 1.814156e+00 1.257619e+00 3.155055e-03 2.187164e-03 4.589575e+00
y16 1 2 2.230031e+00 1.485925e+00 3.878315e-03 2.584218e-03 -1.986529e+00
y17 1 0 1.754106e-16 1.324427e-08 3.050619e-19 2.303351e-11 -1.754106e-16
y18 1 0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09 -1.250153e-11
y19 1 142 1.377167e+02 1.016841e+01 2.395074e-01 1.768419e-02 3.705642e+00
y20 1 15 1.821994e+01 4.109449e+00 3.168686e-02 7.146868e-03 -3.705642e+00
y21 1 0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09 -1.250153e-11
y22 1 27 3.187476e+01 5.327032e+00 5.543437e-02 9.264404e-03 -3.705642e+00
y23 1 12 6.420549e+00 2.020355e+00 1.116617e-02 3.513660e-03 3.705642e+00
y24 1 0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09 -1.250153e-11
y25 1 0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09 -1.250153e-11
y26 1 0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09 -1.250153e-11
y27 1 0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09 -1.250153e-11
CONVERGENCE INFORMATION...
Original counts used.
iterations = 41 , time elapsed = 0.28
norm.diff = 1.59667 = dist between last and second last iterates.
Did NOT meet norm diff convergence criterion [1e-06]!
norm.score = 1.44315e-08 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.
FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
EXAMPLE 13. Kappa regression models.
Analysis of Multiple Sclerosis Data. See Landis and Koch (Biometrics 1977) for data source and a WLS analysis. See Lang (HLP 2002) for a more detailed description of the models and the corresponding ML analyses outlined below.
Model Systematic Components:
DIRECT SMOOTHING MODEL: K(p) = Xb, where K(p) is the vector of eight weighted kappa statistics. The design matrix X forces certain kappa coefficients to be equal.
INDIRECT SMOOTHING MODEL: L(p) = [L1(p), L2(p)], where L1(p) = log(p), L2(p) = K(p), the eight kappa coefficients. The model constrains the data parameters through
L1(p) = Xb, L2(p) = X2 κ, where X2 = diag(8), the 8x8 identity matrix.
The model L1(p) = Xb has (k,i,j) component of the form
log(pkij) = bk + b(A)ki + b(B)kj + b(A*B)k*i*j + b(agree)k*I(i=j), where k indexes the two populations, i and j are the responses by neurologist A and B.
The models have the generic form L(p) = Xb. They can be seen to be HLP models.
#CREATE Weighted Kappa function...
kappa.weighted.fct <- function(w,p,nr) {
#
# Computes Weighted Kappa Statistic
# See Landis and Koch 1977.
# Author: Joseph B. Lang
# Created: June 7, 2002
#
# Input: w = vectorized version of two-way
# table of weights, w = (w11,w12,...,wRR)
# p = vectorized version of two-way
# probabilities, p = (p11,p12,...,pRR)
# nr = number of rows (and columns) in the table.
# Output: Weighted kappa.
#
p <- matrix(p,nr,nr,byrow=T)
prow <- apply(p,1,sum)
pcol <- apply(p,2,sum)
pind <- prow%*%t(pcol)
pind <- matrix(t(pind),nr*nr,1)
p <- as.matrix(c(t(p)))
w <- as.matrix(w)
obsa <- t(w)%*%p
expa <- t(w)%*%pind
c((obsa - expa)/(1-expa))
}
w1 <- scan()
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
w2 <- scan()
1 1 0 0
1 1 0 0
0 0 1 0
0 0 0 1
w3 <- scan()
1 1 0 0
1 1 0 0
0 0 1 1
0 0 1 1
w4 <- scan()
1 1 0 0
1 1 1 0
0 1 1 1
0 0 1 1
y <- scan()
38 5 0 1
33 11 3 0
10 14 5 6
3 7 3 10
5 3 0 0
3 11 4 0
2 13 3 4
1 2 4 14
strata <- c(rep("Popn1",16),rep("Popn2",16))
# Fit the DIRECT SMOOTHING MODEL
L.fct <- function(p) {
p1 <- p[1:16]
p2 <- p[17:32]
L <- rbind(
kappa.weighted.fct(w1,p1 ,4),
kappa.weighted.fct(w2,p1 ,4),
kappa.weighted.fct(w3,p1 ,4),
kappa.weighted.fct(w4,p1 ,4),
kappa.weighted.fct(w1,p2 ,4),
kappa.weighted.fct(w2,p2 ,4),
kappa.weighted.fct(w3,p2 ,4),
kappa.weighted.fct(w4,p2 ,4))
rownames(L) <- c("K(w1,Popn1)","K(w2,Popn1)","K(w3,Popn1)","K(w4,Popn1)",
"K(w1,Popn2)","K(w2,Popn2)","K(w3,Popn2)","K(w4,Popn2)")
L
}
XK <- scan()
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 0 1
XK <- matrix(XK,8,5,byrow=T)
a <- mph.fit(y,L.fct=L.fct,X=XK,norm.diff.conv=10,norm.score.conv=1e-8,strata=strata)
# Don't forget to input the strata variable, as this determines the definition of the table probabilities.
mph.summary(a)
MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 3 ): Gsq = 2.16255 (p = 0.53936 )
Pearson's Score Stat (df= 3 ): Xsq = 2.11354 (p = 0.54918 )
Generalized Wald Stat (df= 3 ): Wsq = 2.26792 (p = 0.51869 )
Adj Resids: -1.392 -1.379 ... 1.412 1.413 , Number |Adj Resid| > 2: 0
SAMPLING PLAN INFORMATION...
Number of strata: 2
Strata identifiers: Popn1, Popn2
Strata with fixed sample sizes: all
Observed strata sample sizes: 149, 69
LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA) Z-ratio p-value
beta1 0.2349553 0.04241687 5.539195 3.038648e-08
beta2 0.3150272 0.04935842 6.382442 1.742859e-10
beta3 0.3888738 0.05743424 6.770766 1.281020e-11
beta4 0.5831200 0.06909629 8.439236 0.000000e+00
beta5 0.7934268 0.08080265 9.819316 0.000000e+00
OBS LINK ML LINK StdErr(L) LINK RESID
K(w1,Popn1) 0.2079425 0.2349553 0.04241687 -0.97835070
K(w2,Popn1) 0.3275399 0.3150272 0.04935842 0.31584009
K(w3,Popn1) 0.4081121 0.3888738 0.05743424 0.44185351
K(w4,Popn1) 0.5964663 0.5831200 0.06909629 0.41284558
K(w1,Popn2) 0.2965166 0.2349553 0.04241687 0.94238307
K(w2,Popn2) 0.3324734 0.3150272 0.04935842 0.26087642
K(w3,Popn2) 0.3864188 0.3888738 0.05743424 -0.02982431
K(w4,Popn2) 0.7893773 0.7934268 0.08080265 -0.12157636
CONVERGENCE INFORMATION...
Original counts used.
iterations = 16 , time elapsed = 5.17
norm.diff = 2.26061 = dist between last and second last iterates.
Norm diff convergence criterion [10] was met.
norm.score = 5.97723e-09 = norm of score at last iteration.
Norm score convergence criterion [1e-08] was met.
FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
# Fit the INDIRECT SMOOTHING MODEL
L.fct <- function(p) {
p1 <- p[1:16]
p2 <- p[17:32]
L <- rbind(log(p),
kappa.weighted.fct(w1,p1 ,4),
kappa.weighted.fct(w2,p1 ,4),
kappa.weighted.fct(w3,p1 ,4),
kappa.weighted.fct(w4,p1 ,4),
kappa.weighted.fct(w1,p2 ,4),
kappa.weighted.fct(w2,p2 ,4),
kappa.weighted.fct(w3,p2 ,4),
kappa.weighted.fct(w4,p2 ,4))
rownames(L) <- c(paste(sep=".","logp",1:32), "K(w1,Popn1)","K(w2,Popn1)","K(w3,Popn1)","K(w4,Popn1)",
"K(w1,Popn2)","K(w2,Popn2)","K(w3,Popn2)","K(w4,Popn2)")
L
}
A <- factor(gl(4,4))
B <- factor(gl(4,1,16))
Ascore <- c(A)
Bscore <- c(B)
agree <- scan()
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
XA1 <- model.matrix(~A + B + Ascore:Bscore + agree)
XA2 <- XA1
X <- block.fct(XA1,XA2,diag(8)) #block.fct is available in mph.supp.Rcode.txt
b <- mph.fit(y,L.fct=L.fct,X=X,strata=strata)
mph.summary(b)
MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 14 ): Gsq = 18.25286 (p = 0.19551 )
Pearson's Score Stat (df= 14 ): Xsq = 22.48127 (p = 0.069252 )
Generalized Wald Stat (df= 14 ): Wsq = 13.13214 (p = 0.51615 )
Adj Resids: -1.735 -1.675 ... 2.314 2.748 , Number |Adj Resid| > 2: 3
SAMPLING PLAN INFORMATION...
Number of strata: 2
Strata identifiers: Popn1, Popn2
Strata with fixed sample sizes: all
Observed strata sample sizes: 149, 69
LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA) Z-ratio p-value
(Intercept) -2.18195695 0.21789484 -10.01380733 0.000000e+00
A2 -0.98252487 0.34280421 -2.86614005 4.155104e-03
A3 -2.63574417 0.59030845 -4.46502870 8.005802e-06
A4 -5.07053731 1.00283130 -5.05622162 4.276444e-07
B2 -2.50209975 0.36908649 -6.77916906 1.208700e-11
B3 -5.95129014 0.85057662 -6.99677137 2.619238e-12
B4 -8.20966143 1.38875396 -5.91153053 3.389434e-09
agree -0.02783032 0.24285556 -0.11459617 9.087652e-01
Ascore:Bscore 0.80384747 0.15516362 5.18064414 2.211210e-07
(Intercept) -3.82657082 0.40135119 -9.53422059 0.000000e+00
A2 -0.93504420 0.59981172 -1.55889618 1.190210e-01
A3 -3.00645110 1.21010157 -2.48446177 1.297474e-02
A4 -6.18631027 2.08586141 -2.96582997 3.018673e-03
B2 -1.26080589 0.64291838 -1.96106681 4.987123e-02
B3 -5.23293410 1.44644377 -3.61779298 2.971259e-04
B4 -8.36055037 2.47380527 -3.37963156 7.258306e-04
agree 0.02769186 0.34868656 0.07941764 9.367004e-01
Ascore:Bscore 1.04115569 0.29708388 3.50458495 4.573196e-04
[3.1] 0.20794246 0.05113013 4.06692644 4.763727e-05
[3.2] 0.33160222 0.05237560 6.33123511 2.432063e-10
[3.3] 0.41743362 0.06030132 6.92246222 4.438672e-12
[3.4] 0.55622648 0.07116497 7.81601533 5.551115e-15
[3.5] 0.29651657 0.07818321 3.79258641 1.490863e-04
[3.6] 0.38035962 0.07129014 5.33537446 9.534758e-08
[3.7] 0.47547451 0.07625832 6.23505065 4.516318e-10
[3.8] 0.73473413 0.09436958 7.78570910 6.883383e-15
OBS LINK ML LINK StdErr(L) LINK RESID
logp.1 -1.3663601 -1.4059398 0.13749315 9.427814e-01
logp.2 -3.3945084 -3.0763617 0.28963410 -1.357818e+00
logp.3 -Inf -5.7217047 0.54095671 -Inf
logp.4 -5.0039463 -7.1762285 0.87175123 7.674599e-01
logp.5 -1.5074387 -1.5567869 0.15132358 1.046012e+00
logp.6 -2.6060510 -2.4790220 0.24826770 -1.173576e+00
logp.7 -3.9053340 -4.2926871 0.37191660 6.585275e-01
logp.8 -Inf -4.9433634 0.51585603 -Inf
logp.9 -2.7013612 -2.4061587 0.20273772 -1.809184e+00
logp.10 -2.3648890 -2.4967160 0.21627385 7.877068e-01
logp.11 -3.3945084 -3.5621943 0.36104443 5.317376e-01
logp.12 -3.2121868 -3.3811928 0.31556472 5.600586e-01
logp.13 -3.9053340 -4.0371044 0.44794027 3.168800e-01
logp.14 -3.0580362 -3.3238142 0.27927295 8.336371e-01
logp.15 -3.9053340 -3.5576147 0.36606789 -1.129845e+00
logp.16 -2.7013612 -2.6284264 0.26455789 -5.719039e-01
logp.17 -2.6246686 -2.7577233 0.40026306 5.738569e-01
logp.18 -3.1354942 -3.0050653 0.42768254 -4.227939e-01
logp.19 -Inf -5.9360378 0.76941354 -Inf
logp.20 -Inf -8.0224984 1.42386504 -Inf
logp.21 -3.1354942 -2.6793036 0.38587064 -2.085649e+00
logp.22 -1.8362112 -1.8301063 0.25548478 -5.932316e-02
logp.23 -2.8478121 -3.7476149 0.45024654 1.427064e+00
logp.24 -Inf -4.7929198 0.71595805 -Inf
logp.25 -3.5409593 -3.7095548 0.48406780 2.878691e-01
logp.26 -1.6691571 -1.8468936 0.24432611 1.336046e+00
logp.27 -3.1354942 -2.6678629 0.38466892 -2.171738e+00
logp.28 -2.8478121 -2.6997039 0.35137338 -5.315800e-01
logp.29 -4.2341065 -5.8482583 1.00711518 8.075958e-01
logp.30 -3.5409593 -2.9444414 0.40136378 -1.888459e+00
logp.31 -2.8478121 -2.7519469 0.38094578 -3.688923e-01
logp.32 -1.5950492 -1.6872485 0.23696240 1.051840e+00
K(w1,Popn1) 0.2079425 0.2079425 0.05113013 1.854186e-11
K(w2,Popn1) 0.3275399 0.3316022 0.05237560 -1.162164e-01
K(w3,Popn1) 0.4081121 0.4174336 0.06030132 -2.387584e-01
K(w4,Popn1) 0.5964663 0.5562265 0.07116497 1.160200e+00
K(w1,Popn2) 0.2965166 0.2965166 0.07818321 9.511947e-12
K(w2,Popn2) 0.3324734 0.3803596 0.07129014 -1.198556e+00
K(w3,Popn2) 0.3864188 0.4754745 0.07625832 -1.556305e+00
K(w4,Popn2) 0.7893773 0.7347341 0.09436958 1.884445e+00
CONVERGENCE INFORMATION...
Original counts used.
iterations = 6 , time elapsed = 2.36
norm.diff = 1.27439e-08 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 1.20776e-08 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.
FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
EXAMPLE 14. Marginal cumulative probit model (in conjunction with loglinear association model)
Marijuana use among U.S. male teens. Data Source: U.S. National Youth Survey (Elliott et al. 1989).
Use at 17
0 1 2
--------------
0 56 13 7
Use at 15 1 6 4 10
2 1 5 15
0 = None, 1=Used no more than once, 2=Used more than once per month in the past year
The models that are fitted below have the following form...
L(p) = [L1(p), L2(p)] = [X1b, X2a], where L1(p) = Φ-1[p2++ p3+, p3+, p+2 + p+3, p+3] and L2(p) = log(p).
Here, Φ(.) is the standard normal pdf.
The linear predictor for the marginal probit model is defined as
X1b = [b(cut1), b(cut2), b(cut1)+b(AGE), b(cut2)+b(AGE)],
and the loglinear association model L2(p) = log(p) = X2a has (i,j) component
log(pij) = a + a(Use15)i + a(Use17)j + a*Use15.scorei*Use17.scorej [linear-by-linear association model] or
log(pij) = a + a(Use15)i + a(Use17)j [independence association model].
The models, which have the generic form L(p) = Xb, are HLP models.
d <- scan(what=list(Use15="",Use17="",count=0))
0 0 56
0 1 13
0 2 7
1 0 6
1 1 4
1 2 10
2 0 1
2 1 5
2 2 15
d <- data.frame(d)
Use15.score <- c(d$Use15)-1
Use17.score <- c(d$Use17)-1
d <- data.frame(d,Use15.score,Use17.score)
y <- d$count
M15 <- M.fct(d$Use15)
M17 <- M.fct(d$Use17)
L.fct <- function(p) {
p15 <- M15%*%p
p17 <- M17%*%p
L <- as.matrix(c(
qnorm(p15[2]+p15[3]),
qnorm(p15[3]),
qnorm(p17[2]+p17[3]),
qnorm(p17[3]),
log(p)
))
rownames(L) <- c("qnorm(P(Use15>0))","qnorm(P(Use15>1))",
"qnorm(P(Use17>0))","qnorm(P(Use17>1))",
paste("logp",1:9))
L
}
CUT <- factor(c(1,2,1,2))
AGE <- factor(c(15,15,17,17))
Xprobit <- model.matrix(~CUT+AGE-1)
Xloglin <- model.matrix(~ Use15 + Use17 +Use15.score:Use17.score,d)
X <- block.fct(Xprobit,Xloglin)
a1 <- mph.fit(y,L.fct=L.fct,X=X)
mph.summary(a1)
MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 4 ): Gsq = 1.77057 (p = 0.77786 )
Pearson's Score Stat (df= 4 ): Xsq = 1.87796 (p = 0.75819 )
Generalized Wald Stat (df= 4 ): Wsq = 1.84803 (p = 0.76368 )
Adj Resids: -1.197 -0.64 ... 0.831 1.038 , Number |Adj Resid| > 2: 0
SAMPLING PLAN INFORMATION...
Number of strata: 1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes: 117
LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA) Z-ratio p-value
CUT1 -0.3909243 0.11400659 -3.428962 6.058944e-04
CUT2 -0.9091368 0.12846677 -7.076824 1.475042e-12
AGE17 0.2999332 0.09813575 3.056309 2.240799e-03
(Intercept) -0.7542226 0.09488147 -7.949103 1.776357e-15
Use151 -2.1584116 0.26134587 -8.258832 2.220446e-16
Use152 -3.7075766 0.62358890 -5.945546 2.755366e-09
Use171 -1.3618662 0.20239880 -6.728627 1.712719e-11
Use172 -2.0394637 0.36806236 -5.541082 3.006082e-08
Use15.score:Use17.score 1.1362827 0.20718661 5.484344 4.150058e-08
OBS LINK ML LINK StdErr(L) LINK RESID
qnorm(P(Use15>0)) -0.38416697 -0.39092427 0.11400659 0.1952372
qnorm(P(Use15>1)) -0.91732119 -0.90913681 0.12846677 -0.1962242
qnorm(P(Use17>0)) -0.09655862 -0.09099107 0.11249628 -0.1955444
qnorm(P(Use17>1)) -0.60224878 -0.60920361 0.11879774 0.1950816
logp 1 -0.73682224 -0.75422257 0.09488147 0.6982065
logp 2 -2.19722458 -2.11608873 0.14875078 -0.4043946
logp 3 -2.81626379 -2.79368631 0.32721044 -0.1455725
logp 4 -2.97041447 -2.91263418 0.22644742 -0.1850601
logp 5 -3.37587957 -3.13821761 0.28176346 -0.7192649
logp 6 -2.45958884 -2.67953247 0.16875835 0.7432298
logp 7 -4.76217393 -4.46179917 0.59937397 -0.4919918
logp 8 -3.15273602 -3.55109988 0.25921308 0.8452161
logp 9 -2.05412373 -1.95613201 0.21405273 -1.2569055
CONVERGENCE INFORMATION...
Original counts used.
iterations = 6 , time elapsed = 0.05
norm.diff = 3.97857e-09 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 2.05338e-10 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.
FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
#Fit the Probit Marginal model, along with the Independence Association model
Xprobit <- model.matrix(~CUT+AGE) Xloglin <- model.matrix(~Use15+Use17,d) X <- block.fct(Xprobit,Xloglin) a2 <- mph.fit(y,L.fct=L.fct,X=X)
a1$Xsq; a1$df; a2$Xsq; a2$df
PEARSON SCORE STATISTIC
1.877965
[1] 4
PEARSON SCORE STATISTIC
45.3922
[1] 5
cbind(a1$beta[3],sqrt(a1$covbeta[3,3]),a2$beta[3],sqrt(a2$covbeta[3,3]))
[,1] [,2] [,3] [,4]
[1,] 0.2999332 0.09813575 0.2975494 0.1575182 #The b(AGE) coef estimates and ase's.
| Remark: As expected, the independence/probit model appears to be untenable for these repeated measures data. Although the independence-model estimator of the marginal probit parameter b(AGE) is consistent, the reported standard error is not. In fact, it can be anticipated that the actual standard error of this independence-model estimator is significantly larger than the saturated-model estimator which correctly accommodates the association between the two responses, AGE15 and AGE17 marijuana use responses. |
#If we wanted to use log(m), rather than log(p), loglinear model, we'd have to use ...
Lm.fct <- function(m) {
p <- m/sum(m)
p15 <- M15%*%p
p17 <- M17%*%p
L <- as.matrix(c(
qnorm(p15[2]+p15[3]),
qnorm(p15[3]),
qnorm(p17[2]+p17[3]),
qnorm(p17[3]),
log(m)
))
rownames(L) <- c("qnorm(P(Use15>0))","qnorm(P(Use15>1))",
"qnorm(P(Use17>0))","qnorm(P(Use17>1))",
paste("logm",1:9))
L
}
Xprobit <- model.matrix(~CUT+AGE-1)
Xloglin <- model.matrix(~Use15+Use17+Use15.score:Use17.score,d)
X <- block.fct(Xprobit,Xloglin)
a3 <- mph.fit(y,L.fct=Lm.fct,X=X,L.mean=T)
EXAMPLE 15. Marginal Cumulative Logit and Linear-by-Linear Loglinear Association Model.
Opinions about Government Spending (data sources: General Social Survey 1989 and Table 1 of Lang and Agresti, JASA94).
Cities 1 2 3
Law Enforcement 1 2 3 1 2 3 1 2 3
Environment Health
1 1 62 17 5 90 42 3 74 31 11
2 11 7 0 22 18 1 19 14 3
3 2 3 1 2 0 1 1 3 1
2 1 11 3 0 21 13 2 20 8 3
2 1 4 0 6 9 0 6 5 2
3 1 0 1 2 1 1 4 3 1
3 1 3 0 0 2 1 0 9 2 1
2 1 0 0 2 1 0 4 2 0
3 1 0 0 0 0 0 1 2 3
1= too little, 2= about right, 3 = too much spending
Models fitted below:
y ← Y ~ multinomial(n, p), where the table probabilities p satisfy...
...the generalized loglinear model used in Lang and Agresti (JASA94): L(p) = [L1(p), L2(p)] = [X1a, X2b] = Xδ, where the components are defined as follows:
L1(p) = log(p).
L2(p) = vector of oneway cumulative logits = log (odd(Rt <= h) , t=1,2,3,4, h=1,2). Here, Rt is the tth response variable (t=1=Environment, 2=Health, 3=Cities, 4=Law Enforcement).
X1a has (ijkl)th element of the form:
a(0) + a(1)i + a(2)j+ a(3)k + a(4)l + a(1,2)*i*j + a(1,3)*i*k + a(1,4)*i*l + a(2,3)*j*k + a(2,4)*j*l + a(3,4)*k*l
X2b has (t,h)th element of the form: b(level)h + b(response)t
Note that L(p) = Xδ has the generalized loglinear model form ClogAp = Xδ. This model was denoted J(LxL)∩M(PO) in Lang and Agresti (JASA94), because the association structure is modeled via a linear-by-linear loglinear model and the one-way marginal structure is modeled via a proportional odds cumulative logit model.
For convenience, we will augment this model so that estimates of the oneway marginal probabilities along with their corresponding adjusted residuals will be output. Specifically, we will re-express the model in the equivalent form
L(p) = [L1(p), L2(p), L3(p)] = [X1a, X2b, X3θ] = Xβ, where L3(p) = (P(Rt=h), t=1,2,3,4, h=1,2,3) is the vector of one-way marginal probabilities and X3 = diag(12), the 12x12 identity matrix.
In this model, the marginal probabilities L3(p) are indirectly smoothed via the constraints implied by [L1(p), L2(p)] = [X1a, X2b]. See Lang (JASA 2005) for a discussion of these "indirect smoothing models."
The data model (data sampling model along with the data parameter model), y ←Y ~ multinomial(n, p) and L(p) = Xβ, is a homogeneous linear predictor model (see Lang JASA 2005).
y <- scan()
62 17 5 90 42 3 74 31 11
11 7 0 22 18 1 19 14 3
2 3 1 2 0 1 1 3 1
11 3 0 21 13 2 20 8 3
1 4 0 6 9 0 6 5 2
1 0 1 2 1 1 4 3 1
3 0 0 2 1 0 9 2 1
1 0 0 2 1 0 4 2 0
1 0 0 0 0 0 1 2 3
# The counts are entered as y = (yijkl: i,j,k,l=1,2,3), where the order follows the rule that
# the rightmost subscripts move faster. The subscripts on yijkl, correspond
# to the four responses as follows:
# i=Environment, j=Health, k=Cities, and l= Law Enforcement.
# For example, y1212 = 7 is the 11th observation in the vector y; it means
# that 7 of those sampled responded (Environ=1, Health=2, Cities=1, Law=2).
# See Table 1 of Lang and Agresti (JASA94).
Environ <- factor(gl(3,27,81))
Health <- factor(gl(3,9,81))
Cities <- factor(gl(3,3,81))
Law <- factor(gl(3,1,81))
data.frame(Environ,Health,Cities,Law,y)
Escore <- c(Environ)
Hscore <- c(Health)
Cscore <- c(Cities)
Lscore <- c(Law)
## CREATE THE LINK FUNCTION L(m). ##
ME <- M.fct(Environ)
MH <- M.fct(Health)
MC <- M.fct(Cities)
ML <- M.fct(Law)
M <- rbind(ME,MH,MC,ML)
CUM0 <- scan()
1 0 0
0 1 1
1 1 0
0 0 1
CUM0 <- matrix(CUM0,4,3,byrow=T)
CUM <- kronecker(diag(4),CUM0)
AMarg <- CUM%*%M
C0 <- scan()
1 -1 0 0
0 0 1 -1
C0 <- matrix(C0,2,4,byrow=T)
CMarg <- kronecker(diag(4),C0)
Lp.fct <- function(p) {
logp <- log(p)
onewayclogit <- CMarg%*%log(AMarg%*%p)
onewayprob <- M%*%p
L <- as.matrix(c(logp,onewayclogit,onewayprob))
rownames(L) <- c(paste("logp",1:81),
"loddsE<2","loddsE<3",
"loddsH<2","loddsH<3",
"loddsC<2","loddsC<3",
"loddsL<2","loddsL<3",
"P(E=1)","P(E=2)","P(E=3)",
"P(H=1)","P(H=2)","P(H=3)",
"P(C=1)","P(C=2)","P(C=3)",
"P(L=1)","P(L=2)","P(L=3)")
L
}
##CREATE DESIGN MATRIX X = block[XAssoc, XMarg, diag(12)] ##
XAssoc <- model.matrix(~Environ+Health+Cities+Law+Escore:Hscore + Escore:Cscore
+ Escore:Lscore + Hscore:Cscore + Hscore:Lscore + Cscore:Lscore)
dm <- scan(what=list(CUT="",RESP=""))
1 E
2 E
1 H
2 H
1 C
2 C
1 L
2 L
dm <- data.frame(dm)
XMarg <- model.matrix(~CUT + RESP - 1,data=dm)
X <- block.fct(XAssoc, XMarg, diag(12))
## FIT THE MODEL AND SUMMARIZE ##
ap <- mph.fit(y, L.fct=Lp.fct, X=X)
mph.summary(ap)
MODEL GOODNESS OF FIT: Test of Ho: h(m)=0 vs. Ha: not Ho...
Likelihood Ratio Stat (df= 69 ): Gsq = 71.535 (p = 0.39365 )
Pearson's Score Stat (df= 69 ): Xsq = 64.33728 (p = 0.6365 )
Generalized Wald Stat (df= 69 ): Wsq = 43.12471 (p = 0.99382 )
Adj Resids: -1.996 -1.743 ... 2.043 2.28 , Number |Adj Resid| > 2: 2
SAMPLING PLAN INFORMATION...
Number of strata: 1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes: 607
LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA) Z-ratio p-value
(Intercept) -3.856486433 0.178483120 -21.60700935 0.000000e+00
Environ2 -2.588205013 0.308823990 -8.38084183 0.000000e+00
Environ3 -5.493086736 0.655267978 -8.38296227 0.000000e+00
Health2 -2.595621274 0.297230071 -8.73270078 0.000000e+00
Health3 -5.610605951 0.637689942 -8.79832907 0.000000e+00
Cities2 -0.034223012 0.204332191 -0.16748713 8.669868e-01
Cities3 -0.937638792 0.413278262 -2.26878323 2.328151e-02
Law2 -1.826510927 0.263654376 -6.92767159 4.278133e-12
Law3 -4.197613831 0.551070670 -7.61719695 2.597922e-14
Escore:Hscore 0.498689176 0.112274533 4.44169449 8.925322e-06
Escore:Cscore 0.314319282 0.103927942 3.02439627 2.491299e-03
Escore:Lscore -0.003494676 0.112011470 -0.03119927 9.751106e-01
Hscore:Cscore 0.051558696 0.099563584 0.51784692 6.045651e-01
Hscore:Lscore 0.454699678 0.103442225 4.39568735 1.104228e-05
Cscore:Lscore 0.198569482 0.089578780 2.21670224 2.664345e-02
CUT1 -1.342154170 0.086191456 -15.57177748 0.000000e+00
CUT2 0.515427959 0.079998129 6.44300014 1.171345e-10
RESPE 2.336871534 0.117010601 19.97145142 0.000000e+00
RESPH 2.255916301 0.120197721 18.76837833 0.000000e+00
RESPL 1.874491508 0.111986680 16.73852200 0.000000e+00
[3.1] 0.730018682 0.017844267 40.91054536 0.000000e+00
[3.2] 0.215418742 0.013701431 15.72235327 0.000000e+00
[3.3] 0.054562575 0.005782026 9.43658465 0.000000e+00
[3.4] 0.713769394 0.018116177 39.39955879 0.000000e+00
[3.5] 0.227338141 0.013759216 16.52260869 0.000000e+00
[3.6] 0.058892465 0.006166318 9.55066935 0.000000e+00
[3.7] 0.207156029 0.014156292 14.63349475 0.000000e+00
[3.8] 0.418922021 0.014035304 29.84773458 0.000000e+00
[3.9] 0.373921950 0.018727908 19.96602862 0.000000e+00
[3.10] 0.630028094 0.019252284 32.72484993 0.000000e+00
[3.11] 0.286027282 0.014261072 20.05650639 0.000000e+00
[3.12] 0.083944625 0.007931189 10.58411547 0.000000e+00
OBS LINK ML LINK StdErr(L) LINK RESID
logp 1 -2.28139441 -2.34214479 0.095988109 0.7667174334
logp 2 -3.57531545 -3.51888124 0.121653595 -0.2852222757
logp 3 -4.79909088 -5.24020966 0.263546567 0.9008338501
logp 4 -1.90871912 -1.81192035 0.055480544 -1.3221152708
logp 5 -2.67085917 -2.79008731 0.083716619 0.8845427650
logp 6 -5.30991650 -4.31284624 0.146242602 -3.1537711693
logp 7 -2.10446370 -2.15088867 0.091828124 0.7271838532
logp 8 -2.97454159 -2.93048614 0.099761757 -0.3173819288
logp 9 -4.01063352 -4.25467560 0.194390396 0.8817534685
logp 10 -4.01063352 -3.93281852 0.147114161 -0.3155472181
logp 11 -4.46261864 -4.65485528 0.151960064 0.4990257493
logp 12 -Inf -5.92148402 0.258577660 -Inf
logp 13 -3.31748634 -3.35103537 0.103701811 0.1803506853
logp 14 -3.51815703 -3.87450266 0.116467520 1.4071862606
logp 15 -6.40852879 -4.94256192 0.143423649 -3.2094351956
logp 16 -3.46408981 -3.63844500 0.136629080 0.8473253486
logp 17 -3.76947146 -3.96334280 0.125889658 0.7369118170
logp 18 -5.30991650 -4.83283257 0.183270761 -1.1515979690
logp 19 -5.71538161 -5.94285565 0.314590661 0.3133186698
logp 20 -5.30991650 -6.21019273 0.279577646 1.0463497281
logp 21 -6.40852879 -7.02212179 0.399062617 0.4725321071
logp 22 -5.71538161 -5.30951381 0.215228332 -0.7599555379
logp 23 -Inf -5.37828141 0.161798208 -Inf
logp 24 -6.40852879 -5.99164099 0.265133659 -0.5440565922
logp 25 -6.40852879 -5.54536473 0.282673600 -1.4797191620
logp 26 -5.30991650 -5.41556285 0.217235401 0.1862820936
logp 27 -6.40852879 -5.83035295 0.309194311 -0.8490914514
logp 28 -4.01063352 -4.12083603 0.159071976 0.4036179341
logp 29 -5.30991650 -5.30106714 0.189103424 -0.0163488085
logp 30 -Inf -7.02589024 0.330757519 -Inf
logp 31 -3.36400635 -3.27629230 0.101436265 -0.4927805100
logp 32 -3.84357943 -4.25795393 0.130274768 1.3250353670
logp 33 -5.71538161 -5.78420754 0.207945842 0.0982490088
logp 34 -3.41279652 -3.30094133 0.120324284 -0.6616066827
logp 35 -4.32908725 -4.08403349 0.132070516 -0.8732852306
logp 36 -5.30991650 -5.41171762 0.225586007 0.1809478286
logp 37 -6.40852879 -5.21282057 0.184460807 -2.3148291504
logp 38 -5.02223443 -5.93835201 0.191118931 1.1960147896
logp 39 -Inf -7.20847543 0.292969107 -Inf
logp 40 -4.61676932 -4.31671815 0.132339573 -0.9290704495
logp 41 -4.21130421 -4.84368011 0.145148255 1.4646959078
logp 42 -Inf -5.91523404 0.168305243 -Inf
logp 43 -4.61676932 -4.28980849 0.142729601 -1.0435538445
logp 44 -4.79909088 -4.61820096 0.137112069 -0.4726692938
logp 45 -5.71538161 -5.49118542 0.179786309 -0.3707831882
logp 46 -6.40852879 -6.72416852 0.313328218 0.2799401698
logp 47 -Inf -6.99500029 0.280566245 -Inf
logp 48 -6.40852879 -7.81042402 0.394972611 0.7094006523
logp 49 -5.71538161 -5.77650740 0.193018061 0.0870906645
logp 50 -6.40852879 -5.84876968 0.141283077 -0.7549586446
logp 51 -6.40852879 -6.46562394 0.238341328 0.0570865128
logp 52 -5.02223443 -5.69803905 0.236612342 1.0261032969
logp 53 -5.30991650 -5.57173184 0.168630327 0.4124166195
logp 54 -6.40852879 -5.99001662 0.258680890 -0.5451050874
logp 55 -5.30991650 -6.21620397 0.340140679 1.0773149652
logp 56 -Inf -7.39992976 0.370492121 -Inf
logp 57 -Inf -9.12824753 0.552626047 -Inf
logp 58 -5.71538161 -5.05734095 0.193879185 -1.4039179534
logp 59 -6.40852879 -6.04249727 0.221722074 -0.4565869832
logp 60 -Inf -7.57224555 0.398403024 -Inf
logp 61 -4.21130421 -4.76767071 0.227096973 1.4838118658
logp 62 -5.71538161 -5.55425754 0.232411843 -0.2649112170
logp 63 -6.40852879 -6.88543634 0.400463523 0.3961777810
logp 64 -6.40852879 -6.80949934 0.327427053 0.3407838090
logp 65 -Inf -7.53852545 0.326104252 -Inf
logp 66 -Inf -8.81214355 0.475427281 -Inf
logp 67 -5.71538161 -5.59907763 0.179375460 -0.1813535926
logp 68 -6.40852879 -6.12953426 0.167189288 -0.3272239667
logp 69 -Inf -7.20458287 0.310174095 -Inf
logp 70 -5.02223443 -5.25784869 0.205559437 0.4513390204
logp 71 -5.71538161 -5.58973584 0.171184375 -0.1962228393
logp 72 -Inf -6.46621497 0.308626547 -Inf
logp 73 -6.40852879 -7.82215811 0.468598685 0.7167739163
logp 74 -Inf -8.09648455 0.431739288 -Inf
logp 75 -Inf -8.91540296 0.560422556 -Inf
logp 76 -Inf -6.56017771 0.314340321 -Inf
logp 77 -Inf -6.63593466 0.255312333 -Inf
logp 78 -Inf -7.25628360 0.380583676 -Inf
logp 79 -6.40852879 -6.16739007 0.340998263 -0.2950815133
logp 80 -5.71538161 -6.04457754 0.265725370 0.4171827178
logp 81 -5.30991650 -6.46635699 0.380972464 1.2104544224
loddsE<2 1.00207436 0.99471736 0.090538025 0.5786279423
loddsE<3 2.79379093 2.85229949 0.112086258 -0.4203606869
loddsH<2 0.91975294 0.91376213 0.088673239 0.4227572342
loddsH<3 2.79379093 2.77134426 0.111256896 0.1704312075
loddsC<2 -1.26125559 -1.34215417 0.086191456 1.5860601196
loddsC<3 0.47321734 0.51542796 0.079998130 -1.6717484121
loddsL<2 0.50117219 0.53233734 0.082594968 -1.9876428816
loddsL<3 2.65147985 2.38991947 0.103139187 2.5184851520
P(E=1) 0.73146623 0.73001868 0.017844267 0.5776478132
P(E=2) 0.21087315 0.21541874 0.013701431 -0.4772650886
P(E=3) 0.05766063 0.05456258 0.005782026 0.4314828693
P(H=1) 0.71499176 0.71376939 0.018116177 0.4222152620
P(H=2) 0.22734761 0.22733814 0.013759216 0.0009466647
P(H=3) 0.05766063 0.05889246 0.006166318 -0.1687532296
P(C=1) 0.22075783 0.20715603 0.014156292 1.6236398142
P(C=2) 0.39538715 0.41892202 0.014035304 -1.6476003693
P(C=3) 0.38385502 0.37392195 0.018727908 1.6804419408
P(L=1) 0.62273476 0.63002809 0.019252284 -1.9955680999
P(L=2) 0.31136738 0.28602728 0.014261072 2.1968010087
P(L=3) 0.06589786 0.08394462 0.007931189 -2.2597073814
CONVERGENCE INFORMATION...
Original counts used.
iterations = 10 , time elapsed = 1.5
norm.diff = 4.26534e-07 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 3.3329e-07 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.
FITTING PROGRAM USED: mph.fit, version 3.0, 5/1/09
# REMARK:
# The model fitted above was expressed in terms of the table probabilities p, as L(p) = X b.
# Alternatively, the model could have been specified in terms of the expected counts m,
# as L(m) = X b, where L(.) is slightly modified from the definition above.
# Below, the relevant code is given...
#
Lm.fct <- function(m) {
logm <- log(m)
onewayclogit <- CMarg%*%log(AMarg%*%m)
onewayprob <- M%*%m/sum(m)
L <- as.matrix(c(logm,onewayclogit,onewayprob))
rownames(L) <- c(paste("logm",1:81),
"loddsE<2","loddsE<3",
"loddsH<2","loddsH<3",
"loddsC<2","loddsC<3",
"loddsL<2","loddsL<3",
"P(E=1)","P(E=2)","P(E=3)",
"P(H=1)","P(H=2)","P(H=3)",
"P(C=1)","P(C=2)","P(C=3)",
"P(L=1)","P(L=2)","P(L=3)")
L
}
am <- mph.fit(y, L.fct=Lm.fct, L.mean=T, X=X)
mph.summary(am)
Document Last Updated:
05/22/2009 03:37:07 PM, Joseph B. Lang