Home > Numerical Examples

ML FITTING of MPH and HLP MODELS using MPH.FIT  (version 3.0):    NUMERICAL EXAMPLES

Author:   Joseph B. Lang,   Department of Statistics and Actuarial Science, University of Iowa, Iowa City, IA 52242 USA   (September 27, 2002,    last updated: April 15, 2005, February 7, 2007,  May 22, 2009)

The  numerical examples below were fitted using mph.fit, VERSION 3.0.    Please read about the changes from version 2.0 in the header of the file mph.fit.

Primary References:

Lang, J.B (2004).  "Multinomial-Poisson Homogeneous Models for Contingency Tables,"   Annals of Statistics, 32, 340-383 .

Lang, J.B. (2005). "Homogeneous Linear Predictor Models for Contingency Tables,"   JASA, 100, 121-134

Numerical Examples:

EXAMPLE  1. Mean Response Model (colds in children)
EXAMPLE  2. Dispersion Linear Trend Model (Danish marital status)
EXAMPLE  3. Marginal Homogeneity Model (eye grade)
EXAMPLE  4. Loglinear Models under Different Sampling Plans (bike helmet use)
EXAMPLE  5. Loglinear Models of Independence (bike helmet use)
EXAMPLE  6. Logit Models under Different Sampling Plans (bike helmet use)
EXAMPLE  7. Sparse Table Example (fine tuning the fitting algorithm)
EXAMPLE  8. Given Marginal Distributions and Odds Ratios, Compute Joint Distribution in a Two-Way Table.
EXAMPLE  9. Model Constraint:  ROC Area Equal to a Constant
EXAMPLE 10. Marginal Homogeneity of Multidimensional Contingency Tables
EXAMPLE 11. Contingency Tables with Given Marginals.
EXAMPLE 12. Testing for (Conditional) Pairwise Independence
EXAMPLE 13. Kappa Regression using Direct and Indirect Smoothing Models.
EXAMPLE 14. Marginal Cumulative Probit Model (in Conjunction with Loglinear Association Model)
EXAMPLE 15. Marginal Cumulative Logit and Linear-by-Linear Loglinear Association Model (Generalized Loglinear and Indirect Smoothing Model Specification)

EXAMPLE 1.  ML fit of mean-response models

Table 13.1 Colds in Children,  source:  Stokes, Davis, and Koch (2000), "Categorical Data Analysis Using the SAS System."

```                          Periods with Colds
Sex    Residence     0       1       2    Total
----   ---------    -------------------   -----
F    Rural         45      64      71    180
F    Urban         80     104     116    300
M    Rural         84     124      82    290
M    Urban        106     117      87    310

```

Models fitted below:

Sampling Distribution:   Product Multinomial

Conditional on row totals,

(yij1, yij2, yij3)  <- (Yij1, Yij2, Yij3)  indep ~ mult(nij, pij1, pij2, pij3),  i, j = 1,2.

Here,  n11 = 180, n12 = 300, n21 = 290,  n22 = 310  and the table probabilities are interpreted as pijk =  P(Periods = k|SEX=i, RES=j).

Systematic Components (Linear Predictor Models):

Saturated mean-response model:

Mij = b(0) + b(SEX)i + b(RES)j + b(SEX*RES)ij,

No-interaction mean-response model:

Mij = b(0) + b(SEX)i + b(RES)j.

Here, Mij  = 0*pij1 + 1*pij2 + 2*pij3 = mean number of periods for SEX=i, RES=j.

These mean-response models have the generic form L(p) = Xb.  These are examples of  0-order HLP models.

```
d <- scan(what=list(sex="",residence="",periods=0,count=0))
F R 0 45
F R 1 64
F R 2 71
F U 0 80
F U 1 104
F U 2 116
M R 0 84
M R 1 124
M R 2 82
M U 0 106
M U 1 117
M U 2 87

d <- data.frame(d)
y <- d\$count
sex.residence <- paste(d\$sex,d\$residence)        #[Strata are determined by (SEX, RESIDENCE) values]

L.fct <- function(p) {
p <- matrix(p,4,3,byrow=T)
mean1 <- sum( 0:2*p[1,] )
mean2 <- sum( 0:2*p[2,] )
mean3 <- sum( 0:2*p[3,] )
mean4 <- sum( 0:2*p[4,] )
L <- matrix(c(mean1,mean2,mean3,mean4))
rownames(L) <- c("mean.FR","mean.FU","mean.MR","mean.MU")
L
}

d0 <- subset(d,periods==0)

# SATURATED MEAN-RESPONSE MODEL....
 REMARK:  Don't forget to input the  strata  variable, as the strata determine how the table probabilities are defined.
```
mph.fit, version 3.0, 5/1/09 , running...
Convergence criteria have been met.
Time Elapsed: 0.03 seconds ,   Iterations:  1
mph.summary(a1)
MODEL GOODNESS OF FIT:   Test of   Ho: h(m)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 0 ):  Gsq =  0
Pearson's Score Stat  (df= 0 ):  Xsq =  0
Generalized Wald Stat (df= 0 ):  Wsq =  0

Adj Resids: 0 0 ... 0 0 , Number |Adj Resid| > 2:  0

SAMPLING PLAN INFORMATION...
Number of strata:   4
Strata identifiers: F R, F U, M R, M U
Strata with fixed sample sizes: all
Observed strata sample sizes:   180, 300, 290, 310

LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA)    Z-ratio    p-value
(Intercept)      1.14444444   0.05885860 19.4439633 0.00000000
sexM            -0.15134100   0.07374287 -2.0522796 0.04014250
residenceU      -0.02444444   0.07479380 -0.3268245 0.74380065
sexM:residenceU -0.02994933   0.09779569 -0.3062438 0.75941899

mean.FR 1.1444444 1.1444444 0.05885860          0
mean.FU 1.1200000 1.1200000 0.04614952          0
mean.MR 0.9931034 0.9931034 0.04442608          0
mean.MU 0.9387097 0.9387097 0.04467893          0

CONVERGENCE INFORMATION...
Original counts used.
iterations = 1 ,  time elapsed = 0.03
norm.diff  = 0 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 6.7408e-14 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/1/09

#   NO-INTERACTION MEAN-RESPONSE MODEL...

mph.summary(b,T,T)
MODEL GOODNESS OF FIT:   Test of   Ho: h(m)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 1 ):  Gsq =  0.09383 (p =  0.75936 )
Pearson's Score Stat  (df= 1 ):  Xsq =  0.09386 (p =  0.75933 )
Generalized Wald Stat (df= 1 ):  Wsq =  0.09379 (p =  0.75942 )

Adj Resids: -0.306 -0.306 ... 0.306 0.306 , Number |Adj Resid| > 2:  0```
 REMARK:  The no-interaction mean-response model fits well,  Xsq = 0.09386 (df=1, p-value=0.75933)  and all the adjusted cell-specific residuals are between -0.306 and +0.306.
```SAMPLING PLAN INFORMATION...
Number of strata:   4
Strata identifiers: F R, F U, M R, M U
Strata with fixed sample sizes: all
Observed strata sample sizes:   180, 300, 290, 310

LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA)    Z-ratio      p-value
(Intercept)  1.15526154   0.04695725 24.6024112 0.0000000000
sexM        -0.16834269   0.04842903 -3.4760696 0.0005088201
residenceU  -0.04194803   0.04817685 -0.8707093 0.3839129012

mean.FR 1.1444444 1.1552615 0.04695725 -0.3063624
mean.FU 1.1200000 1.1133135 0.04070944  0.3063624
mean.MR 0.9931034 0.9869188 0.03957190  0.3063624
mean.MU 0.9387097 0.9449708 0.03975182 -0.3063624

CELL-SPECIFIC STATISTICS...
strata OBS        FV StdErr.FV      PROB StdErr.PROB ADJUSTED.RESIDS
y1     F R  45  44.11273  4.991427 0.2450707  0.02773015       0.3063624
y2     F R  64  63.82746  6.393533 0.3545970  0.03551963       0.3063624
y3     F R  71  72.05981  5.589724 0.4003323  0.03105402      -0.3063624
y4     F U  80  80.94135  7.047109 0.2698045  0.02349036      -0.3063624
y5     F U 104 104.12325  8.235446 0.3470775  0.02745149      -0.3063624
y6     F U 116 114.93540  7.669822 0.3831180  0.02556607       0.3063624
y7     M R  84  84.90553  7.163143 0.2927777  0.02470049      -0.3063624
y8     M R 124 123.98247  8.424577 0.4275258  0.02905027       0.3063624
y9     M R  82  81.11200  7.072743 0.2796965  0.02438877       0.3063624
y10    M U 106 104.99696  7.662589 0.3386999  0.02471803       0.3063624
y11    M U 117 117.06512  8.533036 0.3776294  0.02752592      -0.3063624
y12    M U  87  87.93791  7.322569 0.2836707  0.02362119      -0.3063624

CONVERGENCE INFORMATION...
Original counts used.
iterations = 4 ,  time elapsed = 0.05
norm.diff  = 2.15676e-07 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 9.74239e-09 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.

MODEL INFORMATION...
Linear Predictor Model Link Function  L.fct (L.mean= FALSE ):
function(m) {
p <- m*c(1/Z%*%t(Z)%*%m)
as.matrix(L.input.fct(p))
}
<environment: 0x0157e07c>

Link information as originally input, L.input.fct:
function(p) {
p <- matrix(p,4,3,byrow=T)
m1 <- sum( 0:2*p[1,] )
m2 <- sum( 0:2*p[2,] )
m3 <- sum( 0:2*p[3,] )
m4 <- sum( 0:2*p[4,] )
L <- matrix(c(m1,m2,m3,m4))
rownames(L) <- c("mean.FR","mean.FU","mean.MR","mean.MU")
L
}

Derivative of Transpose Link Function derLt.fct:
 "Numerical derivatives used."

Linear Predictor Model Design Matrix  X:
(Intercept) sexM residenceU
1            1    0          0
4            1    0          1
7            1    1          0
10           1    1          1
attr(,"assign")
 0 1 2
attr(,"contrasts")
attr(,"contrasts")\$sex
 "contr.treatment"

attr(,"contrasts")\$residence
 "contr.treatment"

U = Orthogonal Complement of X:
[,1]
1  -1.773473
4   1.773473
7   1.773473
10 -1.773473

Constraint Function  h.fct (h.mean= FALSE ):
function(m) {
t(U)%*%L.fct(m)
}
<environment: 0x0157e07c>

Constraint information as originally input, h.input.fct:
NULL

Derivative of Transpose Constraint Function derht.fct:
 "Numerical derivatives used."

Population (Strata) Matrix Z:
F R F U M R M U
1    1   0   0   0
2    1   0   0   0
3    1   0   0   0
4    0   1   0   0
5    0   1   0   0
6    0   1   0   0
7    0   0   1   0
8    0   0   1   0
9    0   0   1   0
10   0   0   0   1
11   0   0   0   1
12   0   0   0   1
attr(,"assign")
 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")\$A
 "contr.treatment"

Sampling Constraint Matrix ZF:
F R F U M R M U
1    1   0   0   0
2    1   0   0   0
3    1   0   0   0
4    0   1   0   0
5    0   1   0   0
6    0   1   0   0
7    0   0   1   0
8    0   0   1   0
9    0   0   1   0
10   0   0   0   1
11   0   0   0   1
12   0   0   0   1
attr(,"assign")
 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")\$A
 "contr.treatment"

Fixed Sample Sizes n:
OBS
F R 180
F U 300
M R 290
M U 310

strata:
strata
[1,] "F R"
[2,] "F R"
[3,] "F R"
[4,] "F U"
[5,] "F U"
[6,] "F U"
[7,] "M R"
[8,] "M R"
[9,] "M R"
[10,] "M U"
[11,] "M U"
[12,] "M U"

fixed.strata:
 "all"

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/1/09```
```
```

EXAMPLE 2.  Dispersion Linear Trend Model.

Marital Status of Danes.  Source:  Lloyd, C. (1999), "Statistical Analysis of Categorical Data," p. 72.

```
Age       Single    Married   Divorced    Total
17-21         17         1         0        18
21-25         16         8         0        24
25-30          8        17         1        26
30-40          6        22         4        32
40-50          5        21         6        32
50-60          3        17         8        28
60-70          2         8         6        16
70+           1         3         5         9
```

Candidate Models:

(yi1, yi2, yi3)  <-  (Yi1, Yi2, Yi3) indep ~ mult(ni, pi1, pi2, pi3),   i=1,...,8.

Here,  n1 = 18,..., n8 = 9,   and the table probabilities are defined as     pij = P(Marital=j|Age Stratum= i).

The table probabilities are modeled as:

Linear-in-Age Model:     Gi = b(0) + b(AGE)*xi         or     Saturated Model:          Gi  = bi,

where  Gi = 1 - (pi12 + pi22 + pi32) = Gini dispersion for age category i and x = (x1,...x8) = (19,23,27,35,45,55,65,80) is the vector of AGE scores [midpoints of the age intervals].

```d <- scan(what=list(Age="",Marital = "",Freq=0))
17-21   Single   17
17-21  Married    1
17-21 Divorced    0
21-25   Single   16
21-25  Married    8
21-25 Divorced    0
25-30   Single    8
25-30  Married   17
25-30 Divorced    1
30-40   Single    6
30-40  Married   22
30-40 Divorced    4
40-50   Single    5
40-50  Married   21
40-50 Divorced    6
50-60   Single    3
50-60  Married   17
50-60 Divorced    8
60-70   Single    2
60-70  Married    8
60-70 Divorced    6
70+   Single    1
70+  Married    3
70+ Divorced    5

d <- data.frame(d)

y <- d\$Freq
strata <- d\$Age

L.fct <- function(p) {
p <- matrix(p,8,3,byrow=T)
L <- c()
for (i in 1:8) {
Gi <- 1- sum(p[i,]*p[i,])
L <- rbind(L, Gi)
}
rownames(L) <- c("G.17","G.21","G.25","G.30","G.40","G.50","G.60","G.70+")
L
}

d1 <- subset(d, Marital=="Single")
Agescore <- scan()
19 23 27 35 45 55 65 80

d1 <- data.frame(d1,Agescore)
d1
Age Marital Freq Agescore
1 17-21  Single   17       19
2 21-25  Single   16       23
3 25-30  Single    8       27
4 30-40  Single    6       35
5 40-50  Single    5       45
6 50-60  Single    3       55
7 60-70  Single    2       65
8   70+  Single    1       80

```
 REMARK:  Don't forget to input the  strata  variable, as the strata determine how the table probabilities are defined.
```mph.summary(a,T)

MODEL GOODNESS OF FIT:   Test of   Ho: h(m)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 6 ):  Gsq =  6.61522 (p =  0.3579 )
Pearson's Score Stat  (df= 6 ):  Xsq =  4.96987 (p =  0.54768 )
Generalized Wald Stat (df= 6 ):  Wsq =  10.09688 (p =  0.12063 )

Adj Resids: -2.144 -2.144 ... 0.873 2.144 , Number |Adj Resid| > 2:  3

SAMPLING PLAN INFORMATION...
Number of strata:   8
Strata identifiers: 17-21, 21-25, 25-30, 30-40, 40-50, 50-60, 60-70, 70+
Strata with fixed sample sizes: all
Observed strata sample sizes:   18, 24, 26, 32, 32, 28, 16, 9

LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA)  Z-ratio      p-value
(Intercept) 0.331544223  0.072438461 4.576909 4.718973e-06
Agescore    0.003625855  0.001398731 2.592245 9.535181e-03

G.17  0.1049383 0.4004355 0.04914697 -2.7342474
G.21  0.4444444 0.4149389 0.04471273  0.4734820
G.25  0.4763314 0.4294423 0.04056645  0.6290227
G.30  0.4765625 0.4584491 0.03356001  0.2198400
G.40  0.5097656 0.4947077 0.02879638  0.1907121
G.50  0.5382653 0.5309662 0.03038880  0.1102546
G.60  0.5937500 0.5672248 0.03753687  0.4409538
G.70+ 0.5679012 0.6216126 0.05358163 -1.0999550

CELL-SPECIFIC STATISTICS...
strata OBS           FV    StdErr.FV         PROB  StdErr.PROB ADJUSTED.RESIDS
y1   17-21  17 1.342063e+01 7.917228e-01 7.455907e-01 4.398460e-02    2.143869e+00
y2   17-21   1 3.642503e+00 1.177341e+00 2.023613e-01 6.540785e-02   -2.143869e+00
y3   17-21   0 9.368645e-01 8.349467e-01 5.204803e-02 4.638593e-02   -2.143869e+00
y4   21-25  16 1.694951e+01 1.300864e+00 7.062294e-01 5.420266e-02   -5.237165e-01
y5   21-25   8 7.050495e+00 1.300864e+00 2.937706e-01 5.420266e-02    5.237165e-01
y6   21-25   0 1.015483e-38 1.007712e-19 4.231178e-40 4.198798e-21   -1.015483e-38
y7   25-30   8 6.835339e+00 1.495050e+00 2.628976e-01 5.750194e-02    6.956249e-01
y8   25-30  17 1.839519e+01 1.165229e+00 7.075074e-01 4.481652e-02   -6.956249e-01
y9   25-30   1 7.694700e-01 7.980423e-01 2.959500e-02 3.069394e-02    6.956249e-01
y10  30-40   6 5.696848e+00 1.693282e+00 1.780265e-01 5.291507e-02    2.249922e-01
y11  30-40  22 2.253683e+01 9.857207e-01 7.042760e-01 3.080377e-02   -2.249922e-01
y12  30-40   4 3.766318e+00 1.498098e+00 1.176974e-01 4.681558e-02    2.249922e-01
y13  40-50   5 4.768420e+00 1.627600e+00 1.490131e-01 5.086250e-02    1.951100e-01
y14  40-50  21 2.148669e+01 9.149380e-01 6.714590e-01 2.859181e-02   -1.951100e-01
y15  40-50   6 5.744893e+00 1.733194e+00 1.795279e-01 5.416232e-02    1.951100e-01
y16  50-60   3 2.894242e+00 1.305963e+00 1.033658e-01 4.664152e-02    1.121326e-01
y17  50-60  17 1.725375e+01 1.225208e+00 6.162052e-01 4.375743e-02   -1.121326e-01
y18  50-60   8 7.852012e+00 1.976947e+00 2.804290e-01 7.060526e-02    1.121326e-01
y19  60-70   2 1.607440e+00 8.987154e-01 1.004650e-01 5.616971e-02    4.913694e-01
y20  60-70   8 8.718400e+00 1.352845e+00 5.449000e-01 8.455284e-02   -4.913694e-01
y21  60-70   6 5.674160e+00 1.795040e+00 3.546350e-01 1.121900e-01    4.913694e-01
y22    70+   1 1.577521e+00 9.291630e-01 1.752801e-01 1.032403e-01   -8.729608e-01
y23    70+   3 3.157069e+00 1.420296e+00 3.507855e-01 1.578107e-01   -8.729608e-01
y24    70+   5 4.265410e+00 1.239264e+00 4.739345e-01 1.376960e-01    8.729608e-01

CONVERGENCE INFORMATION...
Original counts used.
iterations = 65 ,  time elapsed = 0.89
norm.diff  = 1.27049 = dist between last and second last iterates.
Did NOT meet norm diff convergence criterion [1e-06]!
norm.score = 6.42349e-08 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/1/09
```
 REMARK:  The fact that norm.diff does not converge to 0 usually means that the ML estimates do not exist owing to 0 counts; indeed, for this example, a look at the fitted values indicates that the iterate estimates approach 0 for the 6th cell.   The  resulting estimates can be regarded as "extended" ML estimates, which allow for fitted values equal to 0, the boundary value. For the saturated model fitted below, the print.iter=T  option is used.  We can monitor the iterations and see how the norm.diff criterion is not met.
```#SATURATED MODEL...
mph.fit, version 3.0, 5/1/09 , running...
iter= 1  norm.diff= 1.414214  norm.score= 0.005202601
iter= 2  norm.diff= 1.414214  norm.score= 0.00191393
iter= 3  norm.diff= 1.414214  norm.score= 0.0007040955
iter= 4  norm.diff= 1.414214  norm.score= 0.0002590222
iter= 5  norm.diff= 1.414214  norm.score= 9.528896e-05
iter= 6  norm.diff= 1.414214  norm.score= 3.505485e-05
iter= 7  norm.diff= 1.414214  norm.score= 1.289596e-05
iter= 8  norm.diff= 1.414214  norm.score= 4.744158e-06
iter= 9  norm.diff= 1.414214  norm.score= 1.745278e-06
iter= 10  norm.diff= 1.414214  norm.score= 6.42052e-07
iter= 11  norm.diff= 1.414214  norm.score= 2.361977e-07
iter= 12  norm.diff= 1.414214  norm.score= 8.689228e-08
iter= 13  norm.diff= 1.414214  norm.score= 3.196589e-08
iter= 14  norm.diff= 1.414214  norm.score= 1.175959e-08
iter= 15  norm.diff= 1.414214  norm.score= 4.326112e-09
iter= 16  norm.diff= 1.414214  norm.score= 1.591488e-09
iter= 17  norm.diff= 1.414214  norm.score= 5.854756e-10
iter= 18  norm.diff= 1.414214  norm.score= 2.153844e-10
iter= 19  norm.diff= 1.414214  norm.score= 7.92355e-11
iter= 20  norm.diff= 1.414214  norm.score= 2.914911e-11
iter= 21  norm.diff= 1.414214  norm.score= 1.072336e-11
iter= 22  norm.diff= 1.414214  norm.score= 3.944907e-12
iter= 23  norm.diff= 1.414214  norm.score= 1.451259e-12
iter= 24  norm.diff= 1.414214  norm.score= 5.339131e-13
iter= 25  norm.diff= 1.414214  norm.score= 1.964825e-13
iter= 26  norm.diff= 1.414214  norm.score= 7.24633e-14
Did NOT meet norm diff convergence criterion [1e-06]!
Norm score convergence criterion [1e-09] was met.
Time Elapsed: 0.06 seconds ,   Iterations:  26

mph.summary(a.sat)

MODEL GOODNESS OF FIT:   Test of   Ho: h(m)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 0 ):  Gsq =  0
Pearson's Score Stat  (df= 0 ):  Xsq =  0
Generalized Wald Stat (df= 0 ):  Wsq =  0

Adj Resids: 0 0 ... 0 0 , Number |Adj Resid| > 2:  0

SAMPLING PLAN INFORMATION...
Number of strata:   8
Strata identifiers: 17-21, 21-25, 25-30, 30-40, 40-50, 50-60, 60-70, 70+
Strata with fixed sample sizes: all
Observed strata sample sizes:   18, 24, 26, 32, 32, 28, 16, 9

LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA)  Z-ratio      p-value
(Intercept) 0.1049383   0.09598275 1.093303 2.742606e-01
Age21-25    0.3395062   0.11544659 2.940807 3.273580e-03
Age25-30    0.3713931   0.12049262 3.082289 2.054152e-03
Age30-40    0.3716242   0.12904010 2.879913 3.977852e-03
Age40-50    0.4048274   0.12569882 3.220614 1.279164e-03
Age50-60    0.4333270   0.11931348 3.631836 2.814116e-04
Age60-70    0.4888117   0.11346716 4.307958 1.647690e-05
Age70+      0.4629630   0.13967541 3.314563 9.178648e-04

G.17  0.1049383 0.1049383 0.09598275          0
G.21  0.4444444 0.4444444 0.06415003          0
G.25  0.4763314 0.4763314 0.07284080          0
G.30  0.4765625 0.4765625 0.08624766          0
G.40  0.5097656 0.5097656 0.08116345          0
G.50  0.5382653 0.5382653 0.07087326          0
G.60  0.5937500 0.5937500 0.06051536          0
G.70+ 0.5679012 0.5679012 0.10147184          0

CONVERGENCE INFORMATION...
Original counts used.
iterations = 26 ,  time elapsed = 0.06
norm.diff  = 1.41421 = dist between last and second last iterates.
Did NOT meet norm diff convergence criterion [1e-06]!
norm.score = 7.24633e-14 = norm of score at last iteration.
Norm score convergence criterion [1e-09] was met.

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/1/09

```

EXAMPLE 3.  ML Fit of Marginal Homogeneity Model

Display of unaided distance vision data from 30-39 year old FEMALE employees of United Kingdom Royal Ordnance factories during 1943-1946 (Kendall and Stuart 1961, "The Advanced  Theory of Statistics, Vol. 2: Inference and Relationship," p564 and p 586. See also, Stokes, Davis, and Koch (2000), "Categorical Data Analysis using the SAS System," p. 443)

```                        Left Grade
Right Grade     1     2      3    4  Total
-----------   ---------------------  -----
1         1520   266   124   66   1976
2          234  1512   432   78   2256
3          117   362  1772  205   2456
4           36    82   179  492    789
---------------------  -----
Total        1907  2222  2507  841   7477
```

Model of Marginal Homogeneity:

It is assumed that y <- Y ~ multinomial( n=7477,   p),   where  the table probabilities satisfy:

pi+ = p+i,  i=1,2,3, 4.  That is,     h(p) = [p1+ - p+1, p2+- p+2, p3+ - p+3]' = 0.

Note:  The last constraint  p4+  = p+4  is implied by the first three, so is omitted to avoid redundancies.

Equivalently,  the model can be specified in terms of expected counts m:

mi+ = m+i,  i=1,2,3, 4.  That is,     h(m) = [m1+ - m+1, m2+- m+2, m3+ - m+3]' = 0.

Alternatively,  the marginal homogeneity model can be expressed in HLP form:

L(p) = [p1+, p2+, p3+, p4+,   p+1, p+2, p+3, p+4 ]' = [b(1), b(2), b(3), b(4),   b(1), b(2), b(3), b(5)]' = X b

Note:  Given that (p1+ = p+1, p2+ = p+2  , p3+ = p+3) it follows that p4+ = p+4.   That is,   the last constraint is redundant.  Hence, the introduction of b(5) in the HLP model.

```
y <- scan()
1520 266 124 66
234 1512 432 78
117 362 1772 205
36 82 179 492

strata <- rep(1,16)

MR  <- M.fct(RGrade)  #M.fct is included in mph.supp.Rcode.txt
# Alternative code...
#   MR <-  kronecker(diag(4),matrix(1,1,4))
#   ML <-  kronecker(matrix(1,1,4),diag(4))

h.fct <- function(p) {
as.matrix(c(MR%*%p - ML%*%p)[-4])
}
# Fit the h(p) = 0  version of the model...
ap <- mph.fit(y,constraint=h.fct)
# Fit the h(m) = 0  version of the model...
am <- mph.fit(y,constraint=h.fct,h.mean=T)```
```
#NOTE:  When the model is specified in terms of m, rather than p,
#       mph.fit checks for Z homogeneity.  In the HLP setting, mph.fit also
#
#       Submitting,  am <- mph.fit(y,constraint=h.fct,h.mean=T)
#       results in the following message:
#
#         CHECKING whether h(.) is Z homogeneous...Necessary condition [tol=1e-09] passed.
#
#       The necessary condition is met. There is no evidence that h(.) is not Z homogeneous.
#
#       If the necessary condition alluded to in the message is not passed, it could be
#       for one of two reasons:  (1)  h(.) is actually not Z homogeneous  or (2) the
#       necessary condition check was subject to too much round-off error.  You can
#       try using larger tolerance values (default=1e-09),  e.g. check.homog.tol = 1e-4.
#       If the condition does not pass with relatively large tolerance value, then it is
#       likely that h(.) is not Z homogeneous; which means that the model not only constrains the
#       table probabilities, it also constrains the expected sample sizes.
#       In this non-Z-homogeneous case, mph.fit can not be used to fit the model.
#

cbind(ap\$Xsq,am\$Xsq)
PEARSON SCORE STATISTIC  PEARSON SCORE STATISTIC
11.96980                11.96980
cbind(ap\$m,am\$m,ap\$p,am\$p)
FV         FV        PROB        PROB
m1  1520.00000 1520.00000 0.203290090 0.203290090
m2   252.48209  252.48209 0.033767833 0.033767833
m3   111.84293  111.84293 0.014958263 0.014958263
m4    56.96589   56.96589 0.007618817 0.007618817
m5   247.23710  247.23710 0.033066350 0.033066350
m6  1512.00000 1512.00000 0.202220142 0.202220142
m7   409.41751  409.41751 0.054756923 0.054756923
m8    70.58517   70.58517 0.009440307 0.009440307
m9   131.26859  131.26859 0.017556318 0.017556318
m10  383.13268  383.13268 0.051241497 0.051241497
m11 1772.00000 1772.00000 0.236993447 0.236993447
m12  195.25849  195.25849 0.026114549 0.026114549
m13   42.78522   42.78522 0.005722245 0.005722245
m14   91.62502   91.62502 0.012254249 0.012254249
m15  188.39931  188.39931 0.025197179 0.025197179
m16  492.00000  492.00000 0.065801792 0.065801792
```
```mph.summary(am)

MODEL GOODNESS OF FIT:   Test of   Ho: h(m)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 3 ):  Gsq =  11.9872 (p =  0.0074271 )
Pearson's Score Stat  (df= 3 ):  Xsq =  11.9698 (p =  0.0074873 )
Generalized Wald Stat (df= 3 ):  Wsq =  11.95657 (p =  0.0075334 )

Adj Resids: -3.179 -2.733 ... 2.733 3.179 , Number |Adj Resid| > 2:  6

SAMPLING PLAN INFORMATION...
Number of strata:   1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes:   7477

CONVERGENCE INFORMATION...
Original counts used.
iterations = 8 ,  time elapsed = 0.04
norm.diff  = 9.67905e-08 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 7.67784e-07 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/1/09

# Alternative to numerically computing the derivative of h() with respect to m,
# input the derivative explicitly...```
```derht.fct <- function(m) {
t((MR- ML)[-4,])
}

am2 <- mph.fit(y,constraint=h.fct, h.mean=T, derht.fct=derht.fct)
CHECKING whether h(.) is Z homogeneous...Necessary condition [tol=1e-09] passed.

mph.fit, version 3.0, 5/1/09 , running...
Convergence criteria have been met.
Time Elapsed: 0.03 seconds ,   Iterations:  8
```
`# Fit the HLP version of the marginal homogeneity model...`
```L.fct <- function(p) {
L <- as.matrix(c(MR%*%p,ML%*%p))
rownames(L) <- c("P(R=1)","P(R=2)","P(R=3)","P(R=4)",
"P(L=1)","P(L=2)","P(L=3)","P(L=4)")
L
}

X <- scan()
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 0 1

X <- matrix(X, 8,5,byrow=T)

#L.fct = X beta,  constrains the first three marginal probs to be equal.
#                 This implies the fourth marginal probs are also equal.
#

mph.summary(ap.HLP)

MODEL GOODNESS OF FIT:   Test of   Ho: h(m)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 3 ):  Gsq =  11.9872 (p =  0.0074271 )
Pearson's Score Stat  (df= 3 ):  Xsq =  11.9698 (p =  0.0074873 )
Generalized Wald Stat (df= 3 ):  Wsq =  11.97572 (p =  0.0074668 )

Adj Resids: -3.179 -2.733 ... 2.733 3.179 , Number |Adj Resid| > 2:  6

SAMPLING PLAN INFORMATION...
Number of strata:   1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes:   7477

LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA)  Z-ratio p-value
beta1 0.2596350  0.004683916 55.43119       0
beta2 0.2994837  0.004642237 64.51281       0
beta3 0.3319058  0.004827707 68.75019       0
beta4 0.1089755  0.003177685 34.29398       0
beta5 0.1089755  0.003177685 34.29398       0

P(R=1) 0.2642771 0.2596350 0.004683916  2.3908957
P(R=2) 0.3017253 0.2994837 0.004642237  0.8786678
P(R=3) 0.3284740 0.3319058 0.004827707 -1.3618671
P(R=4) 0.1055236 0.1089755 0.003177685 -2.0309320
P(L=1) 0.2550488 0.2596350 0.004683916 -2.3620900
P(L=2) 0.2971780 0.2994837 0.004642237 -0.9038095
P(L=3) 0.3352949 0.3319058 0.004827707  1.3449095
P(L=4) 0.1124783 0.1089755 0.003177685  2.0609046

CONVERGENCE INFORMATION...
Original counts used.
iterations = 37 ,  time elapsed = 0.39
norm.diff  = 8.28358e-07 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 1.56281e-07 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/1/09
```
```
```

EXAMPLE 4.  Loglinear models under different sampling plans

Bicycle Data  (Table 16.4, p. 557, Stokes, Davis, and Koch (2000), "Categorical Data Analysis using the SAS System")

```                  Wearing Helmet?
Bicycle Type      Yes       No
Mountain       34        32
Other          10        24```

Fit the independence loglinear model under a variety of sampling plans.

```d <- scan(what=list(bike="",helmet="",count=0))
Mountain Yes 34
Mountain No 32
Other    Yes 10
Other    No 24

d <- data.frame(d)
y <- d\$count

a1 <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=1)                   # Full Multinomial
a2 <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=1,fixed.strata="none") # Full Poisson
a3 <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=d\$bike)              # Prod (Row) Multinomial
a4 <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=d\$helmet)            # Prod (Col) Multinomial```
```
c(a1\$Xsq,a2\$Xsq,a3\$Xsq,a4\$Xsq)
 4.449383 4.449383 4.449383 4.449383```
```
cbind(a1\$m,a2\$m,a3\$m,a4\$m)
FV    FV    FV    FV
m1 29.04 29.04 29.04 29.04                [Fitted values are identical]
m2 36.96 36.96 36.96 36.96
m3 14.96 14.96 14.96 14.96
m4 19.04 19.04 19.04 19.04```
```
cbind(sqrt(diag(a1\$covm)), sqrt(diag(a2\$covm)), sqrt(diag(a3\$covm)), sqrt(diag(a4\$covm)))
[,1]     [,2]     [,3]     [,4]
m1 3.882984 4.848792 3.276154 2.084319    [Approx Std Errors of fitted values depend on sampling plan]
m2 4.215491 5.606316 3.276154 2.652769
m3 2.681934 3.070958 1.687716 2.084319
m4 3.144132 3.675702 1.687716 2.652769```
```
cbind(a1\$beta,a2\$beta,a3\$beta,a4\$beta)
BETA       BETA       BETA       BETA
(Intercept)  3.6098362  3.6098362  3.6098362  3.6098362
bikeOther   -0.6632942 -0.6632942 -0.6632942 -0.6632942
helmetYes   -0.2411621 -0.2411621 -0.2411621 -0.2411621```
```
cbind(sqrt(diag(a1\$covbeta)),sqrt(diag(a2\$covbeta)),sqrt(diag(a3\$covbeta)),sqrt(diag(a4\$covbeta)))
[,1]      [,2]         [,3]         [,4]
(Intercept) 0.1140555 0.1516861 8.864053e-02 7.177406e-02
bikeOther   0.2111002 0.2111002 4.121552e-07 2.111002e-01
helmetYes   0.2014557 0.2014557 2.014557e-01 5.281921e-07```
```
cbind(a1\$p,sqrt(diag(a1\$covp)), a2\$p,sqrt(diag(a2\$covp)), a3\$p,sqrt(diag(a3\$covp)),a4\$p,sqrt(diag(a4\$covp)))
PROB              PROB            PROB            PROB
p1 0.2904 0.03882984 0.2904 0.03882984 0.44 0.04963869 0.66 0.04737088
p2 0.3696 0.04215491 0.3696 0.04215491 0.56 0.04963869 0.66 0.04737088
p3 0.1496 0.02681934 0.1496 0.02681934 0.44 0.04963869 0.34 0.04737088
p4 0.1904 0.03144132 0.1904 0.03144132 0.56 0.04963869 0.34 0.04737088```
```
# Summarize the result for the product (row) multinomial sampling case...
mph.summary(a3,T)

MODEL GOODNESS OF FIT:   Test of   Ho: h(m)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 1 ):  Gsq =  4.55692 (p =  0.032786 )
Pearson's Score Stat  (df= 1 ):  Xsq =  4.44938 (p =  0.034914 )
Generalized Wald Stat (df= 1 ):  Wsq =  4.33093 (p =  0.037426 )

Adj Resids: -2.109 -2.109 2.109 2.109 , Number |Adj Resid| > 2:  4

SAMPLING PLAN INFORMATION...
Number of strata:   2
Strata identifiers: Mountain, Other
Strata with fixed sample sizes: all
Observed strata sample sizes:   66, 34

LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA)       Z-ratio   p-value
(Intercept)  3.6098362 8.864053e-02  4.072445e+01 0.0000000
bikeOther   -0.6632942 4.121552e-07 -1.609331e+06 0.0000000
helmetYes   -0.2411621 2.014557e-01 -1.197097e+00 0.2312688

CELL-SPECIFIC STATISTICS...
strata OBS    FV StdErr.FV PROB StdErr.PROB ADJUSTED.RESIDS
y1 Mountain  34 29.04  3.276154 0.44  0.04963869        2.109356
y2 Mountain  32 36.96  3.276154 0.56  0.04963869       -2.109356
y3    Other  10 14.96  1.687716 0.44  0.04963869       -2.109356
y4    Other  24 19.04  1.687716 0.56  0.04963869        2.109356

CONVERGENCE INFORMATION...
Original counts used.
iterations = 5 ,  time elapsed = 0.05
norm.diff  = 5.82122e-11 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 2.37238e-12 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/1/09

```
```# By default, the derivative of the link was numerically computed in
# the mph.fit calls above.  In the log-linear model setting, the link
# derivative has a simple explicit form.  This derivative with
# respect to m can be included as input.    For example....

derLt.fct <- function(m) {
diag(c(1/m))
}
```

EXAMPLE 5.  Loglinear models of independence

Bicycle Data  (Table 16.4, p. 557, Stokes, Davis, and Koch (2000), "Categorical Data Analysis using the SAS System")

```                  Wearing Helmet?
Bicycle Type      Yes       No
Mountain       34        32
Other          10        24
```

Loglinear Model of Independence:

Assume that  y <- multinomial(n=100, p),  where  the table probabilities p satisfy:

log(pij) = b(0) + b(Bike)i  + b(Helmet)j ;   generically  log(p) = X b.

or equivalently,  in constraint form,

h(p) = U'log(p) = 0,   where   U is an orthogonal complement of X.

This model can also be re-expressed in terms of the expected counts m:

log(mij) = b(0) + b(Bike)i  + b(Helmet)j ;   generically  log(m) = X b.

or equivalently,  in constraint form,

h(m) = U'log(m) = 0,   where   U is an orthogonal complement of X.

```d <- scan(what=list(bike="",helmet="",count=0))
Mountain Yes 34
Mountain No 32
Other    Yes 10
Other    No 24

d <- data.frame(d)
y <- d\$count

# Fit the log-linear models of independence...
ap <- mph.fit(y,link="logp",X=model.matrix(~bike+helmet,d),strata=1)  #log(p) = Xb
am <- mph.fit(y,link="logm",X=model.matrix(~bike+helmet,d),strata=1)  #log(m) = Xb

# Fit the constraint specification of the independence model...
U <- create.U(model.matrix(~bike+helmet,d))
h.fct <-
function(p) {
t(U)%*%log(p)
}

ap2 <- mph.fit(y,constraint=h.fct,         strata=1)    #h(p) = U'log(p) = 0
am2 <- mph.fit(y,constraint=h.fct,h.mean=T,strata=1)    #h(m) = U'log(m) = 0

cbind(ap\$m,am\$m,ap2\$m,am2\$m,ap\$p,am\$p,ap2\$p,am2\$p)
FV    FV    FV    FV   PROB   PROB   PROB   PROB
m1 29.04 29.04 29.04 29.04 0.2904 0.2904 0.2904 0.2904    [Same fitted values and probability estimates.]
m2 36.96 36.96 36.96 36.96 0.3696 0.3696 0.3696 0.3696
m3 14.96 14.96 14.96 14.96 0.1496 0.1496 0.1496 0.1496
m4 19.04 19.04 19.04 19.04 0.1904 0.1904 0.1904 0.1904

#NOTE:  When the models are specified in terms of m, rather than p,
#       mph.fit checks for Z homogeneity.  In the HLP setting, mph.fit also
#
#       results in the following messages:
#
#         CHECKING whether L(.) is an HLP link function...Necessary condition [tol=1e-09] passed.
#         CHECKING whether h(.) is Z homogeneous...Necessary condition [tol=1e-09] passed.
#
#       The necessary conditions are met.  There is no evidence that the model is not an HLP model.
#
#       If the necessary conditions alluded to in the messages are not passed, you can
#       try using larger tolerance values,  e.g. check.homog.tol = 1e-4 and check.HLP.tol= 1e-4.
#
```
```
```

EXAMPLE 6.  Logit models under different sampling plans

Bicycle Data  (Table 16.4, p. 557, Stokes, Davis, and Koch (2000), "Categorical Data Analysis using the SAS System")

```                 Wearing Helmet?
Bicycle Type      Yes       No
Mountain       34        32
Other          10        24

```

Logit Model:

log(pi1/pi2)  = b(0) + b(Bike)i.

Candidate Sampling Distributions:

If the data are the result of a simple random sample from a single population, then  pij = P(Bike=i, Helmet = j).   The expected counts have the form mij = ν pij, where    ν is the expected sample size, which may [full-multinomial sampling plan] or may not  [full-Poisson sampling plan] be fixed a priori.

If the data are the result of a row-stratified random sample, then  pij = P(Helmet = j | Bike = i) .  The expected counts have the form mij = νi pij,  where   νi  is the expected sample size for the stratum [Bike = i].  These expected sample sizes may or may not be fixed a priori.

If the data are the result of a column-stratified random sample, then  pij = P(Bike= i | Helmet = j) .  The expected counts have the form mij = νj pij,  where   νj  is the expected sample size for the stratum [Helmet = j].  These expected sample sizes may or may not be fixed a priori.  Technically, under this sampling plan, the "logit" model is not a log odds model at all.  It actually has the form  log(pi1/pi2) = log(P(Bike=i|Helmet=1)/P(Bike=i|Helmet=2)) = log(P(Helmet=1|Bike=i)/P(Helmet=2|Bike=i)) + log(P(Helmet=2)/P(Helmet=1)).

```d <- scan(what=list(bike="",helmet="",count=0))
Mountain Yes 34
Mountain No 32
Other    Yes 10
Other    No 24

d <- data.frame(d)
y <- d\$count

L.fct <- function(p) {
p <- matrix(p,2,2,byrow=T)
L <- as.matrix(c(log(p[1,1]/p[1,2]),  log(p[2,1]/p[2,2])))
L
}

d1 <- subset(d,helmet=="Yes")
d1
bike helmet count
1 Mountain    Yes    34
3    Other    Yes    10

a1 <- mph.fit(y,link=L.fct, X=model.matrix(~bike,d1),strata=1)                     #Full Multinomial
a2 <- mph.fit(y,link=L.fct, X=model.matrix(~bike,d1),strata=1, fixed.strata="none")  #Full Poisson
a3 <- mph.fit(y,link=L.fct, X=model.matrix(~bike,d1),strata=d\$bike)                #Product Multinomial
a4 <- mph.fit(y,link=L.fct, X=model.matrix(~bike,d1),strata=d\$helmet)              #Product Multinomial
```
```cbind(a1\$beta,a2\$beta,a3\$beta,a4\$beta)
BETA        BETA        BETA       BETA
(Intercept)  0.06062462  0.06062462  0.06062462  0.3017867
bikeOther   -0.93609336 -0.93609336 -0.93609336 -0.9360934```
```cbind(sqrt(diag(a1\$covbeta)),sqrt(diag(a2\$covbeta)),sqrt(diag(a3\$covbeta)),sqrt(diag(a4\$covbeta)))
[,1]      [,2]      [,3]      [,4]
(Intercept) 0.2462961 0.2462961 0.2462961 0.1416946
bikeOther   0.4498093 0.4498093 0.4498093 0.4498093

cbind(a1\$p,sqrt(diag(a1\$covp)),a2\$p,sqrt(diag(a2\$covp)),a3\$p,sqrt(diag(a3\$covp)),a4\$p,sqrt(diag(a4\$covp)))
PROB            PROB                 PROB                 PROB
p1 0.34 0.04737088 0.34 0.04737088 0.5151515 0.06151748 0.7727273 0.06317721
p2 0.32 0.04664762 0.32 0.04664762 0.4848485 0.06151748 0.5714286 0.06613001
p3 0.10 0.03000000 0.10 0.03000000 0.2941176 0.07814249 0.2272727 0.06317721
p4 0.24 0.04270831 0.24 0.04270831 0.7058824 0.07814249 0.4285714 0.06613001
```
```
```

EXAMPLE 7. Very sparse table:  fine-tuning the fitting algorithm

```y <- c(rep(0,499),1)
y
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
```

Model Fitted:  y <- Y ~ multinomial(1, p),   where  p1 = p2 = ... = p500 ,  equivalently m1 = ... = m500.    (This is the model that states all the cell probabilities are 0.002.)

```C <- cbind(1,-diag(499))
h.fct <- function(m) {
C%*%m
}
derht.fct <- function(m) {
t(C)
}
a <- mph.fit(y,h.fct=h.fct, h.mean=T, derht.fct=derht.fct, print.iter=T)             #Does not converge!```
 REMARK:  The MLe's exist for this model of equal probabilities.  However, the many zero counts cause fitting problems. One way to mitigate fitting problems with zeroes is to begin the iterations using modified (non-zero) counts such as y+0.1.     The next run does just this.  By default, the modified counts y+y.eps are used until the 5th iteration, at which time the original counts y are used.
```a <- mph.fit(y,h.fct=h.fct,derht.fct=derht.fct,h.mean=T, y.eps=0.1)       #Temporarily add 0.1 to counts
#
#    Note:  If derht.fct were not entered, then numerical derivatives of h wrt m would be used, and the
#           fitting would be quite a bit slower.
#
c(a\$m)
 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002
 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002
 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002
....[omitted output]
 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002
 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002
 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002
 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002
 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002
 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002
 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002
 0.002 0.002 0.002 0.002 0.002 0.002
```
```#  Another way to fine tune the algorithm is to set the 'step' parameter to a
#  smaller value (default is step=1).  As an example,
#  a <- mph.fit(y,h.fct=h.fct,derht.fct=derht.fct,h.mean=T, step=0.5)  does converge.

```
```mph.summary(a)

MODEL GOODNESS OF FIT:   Test of   Ho: h(m)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 499 ):  Gsq =  12.42922 (p =  1 )
Pearson's Score Stat  (df= 499 ):  Xsq =  499 (p =  0.49158 )
Generalized Wald Stat (df= 499 ):  Wsq =  0.98008 (p =  1 )

WARNING: 100% of expected counts are less than 5.
Chi-square approximation may be questionable.

Adj Resids: -0.045 -0.045 ... -0.045 22.338 , Number |Adj Resid| > 2:  1

SAMPLING PLAN INFORMATION...
Number of strata:   1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes:   1

CONVERGENCE INFORMATION...
Original counts used.
iterations = 52 ,  time elapsed = 61.01
norm.diff  = 1.1344e-06 = dist between last and second last iterates.
Did NOT meet norm diff convergence criterion [1e-06]!
norm.score = 2.43078e-09 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/1/09 ```
 REMARK. Notice that the CONVERGENCE INFORMATION output tells you that the original counts y (not the modified counts y+y.eps) were used.
` `
```
```

EXAMPLE 8.   Given row marginal probs = (0.2, 0.3, 0.5),   column marginal probs = (0.4, 0.1, 0.5),  and  local odds ratios  or11 = 3, or12=2, or21=1, or22=4,   compute the joint distribution.

```y <- matrix(1,9,1)  # any reasonable positive numbers will work here
A <- gl(3,3,9)
B <- gl(3,1,9)

MA <- M.fct(A)
MB <- M.fct(B)

# Alternative way to create MA and MB...
#  MA <- t(kronecker(diag(3),matrix(1,3,1)))
#  MB <- t(kronecker(matrix(1,3,1),diag(3)))
#

h.fct <- function(p) {
or11 <- p*p/(p*p)
or12 <- p*p/(p*p)
or21 <- p*p/(p*p)
or22 <- p*p/(p*p)
h <- c(
MA%*%p - c(0.2,0.3,0.5),
MB%*%p - c(0.4,0.1,0.5),
or11-3,or12-2,or21-1,or22-4
)
as.matrix(h[c(-3,-6)])   #Delete the two redundant marginal prob constraints.
}
a <- mph.fit(y,h.fct=h.fct)
a\$p
PROB
p1 0.15913038
p2 0.01804733
p3 0.02282230
p4 0.13631718
p5 0.04638010
p6 0.11730272
p7 0.10455244
p8 0.03557257
p9 0.35987499

#
#  When both h.fct and (L.fct, X) are input,  mph.fit ignores any constraints imposed by L.fct = X b, but
#  it does compute ML estimates of L and b  (under the model with constraint h.fct=0).
#
#  This feature of mph.fit can be exploited to output estimates of functions of p.
#  For example,....
#
L.fct <- function(p) {
or11 <- p*p/(p*p)
or12 <- p*p/(p*p)
or21 <- p*p/(p*p)
or22 <- p*p/(p*p)
L <- as.matrix(c(MA%*%p,MB%*%p,
or11,or12,or21,or22))
rownames(L) <- c("prow1","prow2","prow3","pcol1","pcol2","pcol3",
"or11","or12","or21","or22")
L
}
X <- diag(10)

b <- mph.fit(y,h.fct=h.fct,L.fct=L.fct,X=X)
mph.summary(b)
```
```MODEL GOODNESS OF FIT:   Test of   Ho: h(p)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 8 ):  Gsq =  7.36137 (pval =  0.4982 )
Pearson's Score Stat  (df= 8 ):  Xsq =  11.37638 (pval =  0.1813 )
Generalized Wald Stat (df= 8 ):  Wsq =  13.37778 (pval =  0.0995 )

WARNING: 100% of expected counts are less than 5.
Chi-square approximation may be questionable.

Adj Resids: -1.555 -0.394 ... 1.774 2.097 , Number |Adj Resid| > 2:  1

SAMPLING PLAN INFORMATION...
Number of strata:   1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes:   9

LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA)   Z-ratio p-value
beta1   0.2            0       Inf       0
beta2   0.3            0 535095116       0
beta3   0.5            0       Inf       0
beta4   0.4            0 217202655       0
beta5   0.1            0       Inf       0
beta6   0.5            0       Inf       0
beta7   3.0            0       Inf       0
beta8   2.0            0       Inf       0
beta9   1.0            0  27493920       0
beta10  4.0            0  51205638       0

prow1   0.3333     0.2         0     1.0000
prow2   0.3333     0.3         0     0.2182
prow3   0.3333     0.5         0    -1.0000
pcol1   0.3333     0.4         0    -0.4082
pcol2   0.3333     0.1         0     2.3333
pcol3   0.3333     0.5         0    -1.0000
or11    1.0000     3.0         0    -0.2101
or12    1.0000     2.0         0    -0.1319
or21    1.0000     1.0         0     0.0000
or22    1.0000     4.0         0    -0.2881

CONVERGENCE INFORMATION...
Original counts used.
iterations = 7 ,  time elapsed = 0.03
norm.diff  = 9.18893e-09 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 1.23277e-10 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/16/09
```
```
```

EXAMPLE 9.   Model constraint: area under ROC manifest "curve" (AUC)  equal to a constant.

Neonatal Radiograph Data (source: see Table 1, Lang and Aspelund (2001), Statistical Modelling--An International Journal)

```               Video Image Rating
1   2   3   4   5   Total
-------------------  -----
Normals     6  17   4   5   1    33
Abnormals   4   5   5  15  38    67

```

Models Fitted Below...

Sampling Distribution:    Product Multinomial

(Yi1, ..., Yi5)  indep ~ mult(ni, pi1, ... , pi5),  i=1,2.

Here,  n1 = 33 and n2 = 67.   The table probabilities are defined as pij = P(Rating=j| Strata=i).

Systematic Component:

Constraint:      In words, "Area under the manifest ROC `curve' equals 0.9."   In symbols,   h(p) =  AUC(p)  -  0.9 = 0,          where AUC(p) = the area under the manifest ROC "curve."    Specifically,

AUC(p) = p11*(p22 + p23 + p24 + p25) + p12*(p23  + p24 + p25) + p13*(p24 + p25) + p14*p25 + 0.5*(p11*p21 + p12*p22 + ... + p15*p25) = P(B > A) + 0.5*P(B=A),  where A ~ (p11, ..., p15)  and B ~ (p21, ..., p25).

```
y <- scan()
6 17 4 5 1
4 5 5 15 38

strata <- c(rep("Normals",5),rep("Abnormals",5))

h.fct <- function(p) {
p*(p+p+p+p) +
p*(p+p+p) +
p*(p+p) +
p*(p) +
0.5*(p*p+p*p+p*p+p*p+p*p) - 0.90
}
#
#  To have mph.fit output the observed and fitted value of AUC(p),  we augment the model
#  specification by including both h.fct and (L.fct, X) as input.   When h.fct and (L.fct,X),
#  are input,  mph.fit ignores any constraints imposed by L.fct = X b, but
#  it does compute ML estimates of L(p) and b  (under the model with constraint h.fct=0).
#
#  This feature of mph.fit can be exploited to output estimates of function of p, here, AUC(p)...
#
L.fct <- function(p) {
L <- as.matrix(
p*(p+p+p+p) +
p*(p+p+p) +
p*(p+p) +
p*(p) +
0.5*(p*p+p*p+p*p+p*p+p*p)
)
rownames(L) <- "AUC"
L
}

a <- mph.fit(y,h.fct=h.fct,L.fct=L.fct,X=as.matrix(1),strata=strata)
mph.summary(a,T)

MODEL GOODNESS OF FIT:   Test of   Ho: h(m)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 1 ):  Gsq =  1.94502 (p =  0.16312 )
Pearson's Score Stat  (df= 1 ):  Xsq =  2.19474 (p =  0.13848 )
Generalized Wald Stat (df= 1 ):  Wsq =  1.52065 (p =  0.21752 )

Adj Resids: -1.481 -1.481 ... 1.481 1.481 , Number |Adj Resid| > 2:  0

SAMPLING PLAN INFORMATION...
Number of strata:   2
Strata identifiers: Abnormals, Normals
Strata with fixed sample sizes: all
Observed strata sample sizes:   67, 33

LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA)    Z-ratio p-value
beta1  0.9 4.418826e-10 2036740159       0

AUC  0.85346     0.9 4.418826e-10  -1.505451

CELL-SPECIFIC STATISTICS...
strata OBS         FV StdErr.FV       PROB StdErr.PROB ADJUSTED.RESIDS
y1    Normals   6  6.7996921 2.2599169 0.20605128  0.06848233       -1.481464
y2    Normals  17 17.8647239 2.8022957 0.54135527  0.08491805       -1.481464
y3    Normals   4  3.8316273 1.8367962 0.11610992  0.05566049        1.481464
y4    Normals   5  3.9681015 1.7337208 0.12024550  0.05253699        1.481464
y5    Normals   1  0.5358551 0.6549779 0.01623803  0.01984781        1.481464
y6  Abnormals   4  2.5477157 1.2205911 0.03802561  0.01821778        1.481464
y7  Abnormals   5  3.8380538 1.7329256 0.05728439  0.02586456        1.481464
y8  Abnormals   5  4.6833212 2.0761170 0.06990032  0.03098682        1.481464
y9  Abnormals  15 15.2579803 3.4282561 0.22773105  0.05116800       -1.481464
y10 Abnormals  38 40.6729290 3.5674591 0.60705864  0.05324566       -1.481464

CONVERGENCE INFORMATION...
Original counts used.
iterations = 56 ,  time elapsed = 0.25
norm.diff  = 8.33704e-07 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 1.07391e-07 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/1/09
```
```
```

EXAMPLE 10.  Marginal homogeneity of multidimensional contingency tables

3x3x3 Table.  See S. Kullback, Annals of Mathematical Statistics, Vol. 42, No. 2. (Apr., 1971), pp. 594-606.

```y <- scan()
223 24 6 40 42 2 19 4 12
28 6 9 25 218 6 3 13 9
26 3 18 18 30 24 12 16 164

A <- gl(3,9,27)
B <- gl(3,3,27)
C <- gl(3,1,27)

data.frame(A,B,C,y)

MA <- M.fct(A)
MB <- M.fct(B)
MC <- M.fct(C)

#  Alternatively, the M matrices can be computed as follows:
#    MA <- kronecker(diag(3),matrix(1,1,9)))
#    MB <- kronecker(matrix(1,1,3),kronecker(diag(3),matrix(1,1,3)))
#    MC <- kronecker(matrix(1,1,9),diag(3))

C.mat <- scan()
1 0 0 -1 0 0 0 0 0
1 0 0 0 0 0 -1 0 0
0 1 0 0 -1 0 0 0 0
0 1 0 0 0 0 0 -1 0

C.mat <- matrix(C.mat,4,9,byrow=T)

L.fct <- function(p) {
pmA <- MA%*%p
pmB <- MB%*%p
pmC <- MC%*%p
L <- as.matrix(c(pmA,pmB,pmC))
rownames(L) <- c("P(A=1)","P(A=2)","P(A=3)","P(B=1)","P(B=2)","P(B=3)","P(C=1)","P(C=2)","P(C=3)")
L
}

h.fct <- function(p) {
C.mat%*%L.fct(p)
}

a <- mph.fit(y,h.fct=h.fct,L.fct=L.fct,X=diag(9))
mph.summary(a)

MODEL GOODNESS OF FIT:   Test of   Ho: h(m)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 4 ):  Gsq =  50.39723 (p =  2.983e-10 )
Pearson's Score Stat  (df= 4 ):  Xsq =  48.9876 (p =  5.8737e-10 )
Generalized Wald Stat (df= 4 ):  Wsq =  49.87851 (p =  3.828e-10 )

Adj Resids: -5.556 -4.739 ... 5.556 6.357 , Number |Adj Resid| > 2:  16

SAMPLING PLAN INFORMATION...
Number of strata:   1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes:   1000

LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA)  Z-ratio p-value
beta1 0.3687506   0.01291861 28.54413       0
beta2 0.3569100   0.01271853 28.06221       0
beta3 0.2743394   0.01210117 22.67047       0
beta4 0.3687506   0.01291861 28.54413       0
beta5 0.3569100   0.01271853 28.06221       0
beta6 0.2743394   0.01210117 22.67047       0
beta7 0.3687506   0.01291861 28.54413       0
beta8 0.3569100   0.01271853 28.06221       0
beta9 0.2743394   0.01210117 22.67047       0

P(A=1)    0.372 0.3687506 0.01291861  0.4003281
P(A=2)    0.317 0.3569100 0.01271853 -4.8482153
P(A=3)    0.311 0.2743394 0.01210117  5.0529720
P(B=1)    0.343 0.3687506 0.01291861 -3.1724923
P(B=2)    0.405 0.3569100 0.01271853  5.8419001
P(B=3)    0.252 0.2743394 0.01210117 -3.0790551
P(C=1)    0.394 0.3687506 0.01291861  3.1107435
P(C=2)    0.356 0.3569100 0.01271853 -0.1105505
P(C=3)    0.250 0.2743394 0.01210117 -3.3547171

CONVERGENCE INFORMATION...
Original counts used.
iterations = 56 ,  time elapsed = 0.8
norm.diff  = 7.23424e-07 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 1.90481e-07 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/1/09

```

EXAMPLE 11. Contingency tables with given marginals.

3x4 Table.  See C. T. Ireland, S. Kullback, Biometrika, Vol. 55, No. 1. (Mar., 1968), pp. 179-188.

Model Fitted Below:

y <- Y ~ multinomial(n, p),   where the table probabilities p satisfy:

h(p) = [p1+, p2+, p+1, p+2, p+3] - [0.7837288, 0.14831812, 0.07827901, 0.46148631, 0.29658409] = 0.

```y <- scan()
783 7426 4709 2145
517 928 622 703
207 373 337 425

V1 <- gl(3,4,12)
V2 <- gl(4,1,12)

M1 <- M.fct(V1)
M2 <- M.fct(V2)

# Alternatively, the M matrices can be computed as
#   M1 <- kronecker(diag(3),matrix(1,1,4))
#   M2 <- kronecker(matrix(1,1,3),diag(4))

marg <- scan()
15028 2844 1303 1501 8849 5687 3138

margprob <- marg/19175

A <- scan()
1 0 0 0 0 0 0
0 1 0 0 0 0 0
0 0 0 1 0 0 0
0 0 0 0 1 0 0
0 0 0 0 0 1 0

A <- matrix(A,5,7,byrow=T)

h.fct <- function(p) {
obsmp <- rbind(M1%*%p,M2%*%p)
A%*%(obsmp - margprob)
}

a <- mph.fit(y=y,h.fct=h.fct)

#  Using the augmentation trick as described in Examples 8 and 9...
L.fct <- function(p) {
L <- as.matrix(c(M1%*%p,M2%*%p))
rownames(L) <- c("p[1,+]","p[2,+]","p[3,+]","p[+,1]","p[+,2]","p[+,3]","p[+,4]")
L
}

a2 <- mph.fit(y=y,h.fct=h.fct,L.fct=L.fct,X=diag(7))

mph.summary(a2)```
```MODEL GOODNESS OF FIT:   Test of   Ho: h(p)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 5 ):  Gsq =  11.30702 (pval =  0.04562 )
Pearson's Score Stat  (df= 5 ):  Xsq =  11.37322 (pval =  0.04446 )
Generalized Wald Stat (df= 5 ):  Wsq =  11.17887 (pval =  0.04795 )

Adj Resids: -2.372 -1.736 ... 2.15 2.816 , Number |Adj Resid| > 2:  3

SAMPLING PLAN INFORMATION...
Number of strata:   1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes:   19175

LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA)     Z-ratio p-value
beta1 0.7837            0 15782181923       0
beta2 0.1483            0  6489036323       0
beta3 0.0680            0  1838678421       0
beta4 0.0783            0         Inf       0
beta5 0.4615            0 17166239010       0
beta6 0.2966            0  7509947313       0
beta7 0.1637            0  3517050151       0

p[1,+]   0.7856  0.7837         0     0.6139
p[2,+]   0.1445  0.1483         0    -1.5036
p[3,+]   0.0700  0.0680         0     1.1191
p[+,1]   0.0786  0.0783         0     0.1613
p[+,2]   0.4551  0.4615         0    -1.7673
p[+,3]   0.2956  0.2966         0    -0.3004
p[+,4]   0.1707  0.1637         0     2.6352

CONVERGENCE INFORMATION...
Original counts used.
iterations = 38 ,  time elapsed = 0.15
norm.diff  = 4.5344e-07 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 5.67182e-08 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/16/09

cbind(a2\$L,margprob)
p[1,+] 0.78372881 0.78372881
p[2,+] 0.14831812 0.14831812
p[3,+] 0.06795306 0.06795306
p[+,1] 0.07827901 0.07827901
p[+,2] 0.46148631 0.46148631
p[+,3] 0.29658409 0.29658409
p[+,4] 0.16365059 0.16365059
Note: The fitted marginal probabilities do match the given marginal probabilities
```
```
```

EXAMPLE 12. Testing for (conditional) pairwise independence when response includes "unknown" category.

3x3x3 Table.  For a description of the model, see Michael Haber's Example 2, p. 433, in  Biometrics (in Shorter Communications), Vol. 42, No. 2. (Jun., 1986), pp. 429-435.

```
y <- scan()
201 28 37 21 8 7 12 0 0 27 9 5
14 4 4 2 0 0 142 15 0 27 12 0 0 0 0

#Counts are entered as y[i,j,k],  where right most subscripts move fastest, i,j,k=3.

A <- gl(3,9,27)
B <- gl(3,3,27)
C <- gl(3,1,27)

MAB <- M.fct(paste(A,B))
MAC <- M.fct(paste(A,C))
MBC <- M.fct(paste(B,C))

# Alternatively,
#   MAB <- kronecker(diag(9),matrix(1,1,3))
#   MAC <- kronecker(diag(3),kronecker(matrix(1,1,3),diag(3)))
#   MBC <- kronecker(matrix(1,1,3),diag(9))

M <- rbind(MAB,MAC,MBC)
Mr <- M[-c(3,6,7,8,9,12,15,16,17,18,21,24,25,26,27),]

# Mr*p gives only those marginal probabilities that do NOT involve the last unknown category.

C <- scan()
1 -1 -1 1 0 0 0 0 0 0 0 0
0 0 0 0 1 -1 -1 1 0 0 0 0
0 0 0 0 0 0 0 0 1 -1 -1 1

C <- matrix(C,3,12,byrow=T)

L.fct <- function(m) {
p <- m/sum(m)
C%*%log(Mr%*%p)
}

X <- scan()
1 1 0
1 0 1
1 0 0

X <- matrix(X,3,3,byrow=T)

h.fct <- function(m) {
L.fct(m)
}
a <- mph.fit(y=y,h.fct=h.fct,step=0.5,maxiter=200,print.iter=T)    #zero counts require some fine tuning

mph.fit, version 3.0, 5/1/09 , running...
iter= 1 [ 0 ]  norm.diff= 9.519322  norm.score= 1.945013
iter= 2 [ 0 ]  norm.diff= 5.295804  norm.score= 2.219742
iter= 3 [ 0 ]  norm.diff= 2.827484  norm.score= 1.714237
iter= 4 [ 0 ]  norm.diff= 1.75119  norm.score= 1.205030
iter= 5 [ 0 ]  norm.diff= 1.603112  norm.score= 0.7939237
iter= 6 [ 0 ]  norm.diff= 1.595768  norm.score= 0.5042983
iter= 7 [ 0 ]  norm.diff= 1.596907  norm.score= 0.310447
iter= 8 [ 0 ]  norm.diff= 1.596947  norm.score= 0.1872117
iter= 9 [ 0 ]  norm.diff= 1.596726  norm.score= 0.1111047
iter= 10 [ 0 ]  norm.diff= 1.596618  norm.score= 0.06525863
iter= 11 [ 0 ]  norm.diff= 1.596599  norm.score= 0.03803102
iter= 12 [ 0 ]  norm.diff= 1.596611  norm.score= 0.02205917
[omitted lines]
iter= 39 [ 0 ]  norm.diff= 1.596669  norm.score= 3.832624e-08
iter= 40 [ 0 ]  norm.diff= 1.596669  norm.score= 2.224625e-08
iter= 41 [ 0 ]  norm.diff= 1.596669  norm.score= 1.443151e-08
Did NOT meet norm diff convergence criterion [1e-06]!
Norm score convergence criterion [1e-06] was met.
Time Elapsed: 0.28 seconds ,   Iterations:  41

mph.summary(a,T)

MODEL GOODNESS OF FIT:   Test of   Ho: h(m)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 3 ):  Gsq =  33.25029 (p =  2.852e-07 )
Pearson's Score Stat  (df= 3 ):  Xsq =  38.11701 (p =  2.6698e-08 )
Generalized Wald Stat (df= 3 ):  Wsq =  33.77626 (p =  2.2088e-07 )

Adj Resids: -5.526 -4.59 ... 5.749 5.779 , Number |Adj Resid| > 2:  15

SAMPLING PLAN INFORMATION...
Number of strata:   1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes:   575

CELL-SPECIFIC STATISTICS...
strata OBS           FV    StdErr.FV         PROB  StdErr.PROB ADJUSTED.RESIDS
y1       1 201 1.826391e+02 1.069712e+01 3.176332e-01 1.860368e-02    5.749431e+00
y2       1  28 3.712722e+01 5.169310e+00 6.456908e-02 8.990104e-03   -3.225310e+00
y3       1  37 3.531513e+01 5.745554e+00 6.141762e-02 9.992268e-03    4.589575e+00
y4       1  21 3.395460e+01 5.143368e+00 5.905148e-02 8.944987e-03   -5.526221e+00
y5       1   8 5.326702e+00 1.879169e+00 9.263829e-03 3.268120e-03    2.023091e+00
y6       1   7 9.337218e+00 2.987685e+00 1.623864e-02 5.195974e-03   -4.589575e+00
y7       1  12 1.174490e+01 3.389471e+00 2.042592e-02 5.894732e-03    1.986529e+00
y8       1   0 1.413466e-10 1.188893e-05 2.458203e-13 2.067640e-08   -1.413466e-10
y9       1   0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09   -1.250153e-11
y10      1  27 3.866533e+01 5.286542e+00 6.724404e-02 9.193986e-03   -4.094285e+00
y11      1   9 7.836827e+00 1.643760e+00 1.362926e-02 2.858713e-03    5.187327e-01
y12      1   5 6.490560e+00 2.512333e+00 1.128793e-02 4.369275e-03   -4.589575e+00
y13      1  14 7.183925e+00 1.831879e+00 1.249378e-02 3.185877e-03    3.525268e+00
y14      1   4 1.102329e+00 9.212980e-01 1.917094e-03 1.602257e-03    5.778889e+00
y15      1   4 1.814156e+00 1.257619e+00 3.155055e-03 2.187164e-03    4.589575e+00
y16      1   2 2.230031e+00 1.485925e+00 3.878315e-03 2.584218e-03   -1.986529e+00
y17      1   0 1.754106e-16 1.324427e-08 3.050619e-19 2.303351e-11   -1.754106e-16
y18      1   0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09   -1.250153e-11
y19      1 142 1.377167e+02 1.016841e+01 2.395074e-01 1.768419e-02    3.705642e+00
y20      1  15 1.821994e+01 4.109449e+00 3.168686e-02 7.146868e-03   -3.705642e+00
y21      1   0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09   -1.250153e-11
y22      1  27 3.187476e+01 5.327032e+00 5.543437e-02 9.264404e-03   -3.705642e+00
y23      1  12 6.420549e+00 2.020355e+00 1.116617e-02 3.513660e-03    3.705642e+00
y24      1   0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09   -1.250153e-11
y25      1   0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09   -1.250153e-11
y26      1   0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09   -1.250153e-11
y27      1   0 1.250153e-11 3.535750e-06 2.174179e-14 6.149131e-09   -1.250153e-11

CONVERGENCE INFORMATION...
Original counts used.
iterations = 41 ,  time elapsed = 0.28
norm.diff  = 1.59667 = dist between last and second last iterates.
Did NOT meet norm diff convergence criterion [1e-06]!
norm.score = 1.44315e-08 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/1/09
```
```
```

EXAMPLE 13. Kappa regression models.

Analysis of Multiple Sclerosis Data. See  Landis and Koch (Biometrics 1977) for data source and a WLS analysis.  See Lang (HLP 2002) for a more detailed description of the models and the corresponding ML analyses outlined below.

Model Systematic Components:

DIRECT SMOOTHING MODEL:  K(p) = Xb,  where K(p) is the vector of eight weighted kappa statistics.  The design matrix X forces certain kappa coefficients to be equal.

INDIRECT SMOOTHING MODEL:  L(p) = [L1(p), L2(p)],  where  L1(p) = log(p),  L2(p) = K(p), the eight kappa coefficients.   The model constrains the data parameters through

L1(p) = Xb,  L2(p)  = X2 κ,    where  X2  = diag(8), the 8x8 identity matrix.

The model L1(p) =  Xb has (k,i,j) component of the form

log(pkij) =    bk + b(A)ki + b(B)kj + b(A*B)k*i*j + b(agree)k*I(i=j),  where k indexes the two populations, i and j are the responses by neurologist A and B.

The models have the generic form L(p) = Xb.  They can be seen to be HLP models.

```
#CREATE Weighted Kappa function...
kappa.weighted.fct <- function(w,p,nr) {
#
#  Computes Weighted Kappa Statistic
#  See Landis and Koch 1977.
#  Author:  Joseph B. Lang
#  Created: June 7, 2002
#
#  Input: w = vectorized version of two-way
#                  table of weights, w = (w11,w12,...,wRR)
#           p = vectorized version of two-way
#                 probabilities, p = (p11,p12,...,pRR)
#          nr = number of rows (and columns) in the table.
#  Output:    Weighted kappa.
#
p <- matrix(p,nr,nr,byrow=T)
prow <- apply(p,1,sum)
pcol <- apply(p,2,sum)
pind <- prow%*%t(pcol)
pind <- matrix(t(pind),nr*nr,1)
p <- as.matrix(c(t(p)))
w <- as.matrix(w)
obsa <- t(w)%*%p
expa <- t(w)%*%pind
c((obsa - expa)/(1-expa))
}

w1 <- scan()
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1

w2 <- scan()
1 1 0 0
1 1 0 0
0 0 1 0
0 0 0 1

w3 <- scan()
1 1 0 0
1 1 0 0
0 0 1 1
0 0 1 1

w4 <- scan()
1 1 0 0
1 1 1 0
0 1 1 1
0 0 1 1

y <- scan()
38  5  0  1
33 11  3  0
10 14  5  6
3  7  3 10
5 3 0 0
3 11 4 0
2 13 3 4
1 2 4 14

strata <- c(rep("Popn1",16),rep("Popn2",16))
# Fit the DIRECT SMOOTHING MODEL
L.fct <- function(p) {
p1 <- p[1:16]
p2 <- p[17:32]
L <- rbind(
kappa.weighted.fct(w1,p1 ,4),
kappa.weighted.fct(w2,p1 ,4),
kappa.weighted.fct(w3,p1 ,4),
kappa.weighted.fct(w4,p1 ,4),
kappa.weighted.fct(w1,p2 ,4),
kappa.weighted.fct(w2,p2 ,4),
kappa.weighted.fct(w3,p2 ,4),
kappa.weighted.fct(w4,p2 ,4))
rownames(L) <- c("K(w1,Popn1)","K(w2,Popn1)","K(w3,Popn1)","K(w4,Popn1)",
"K(w1,Popn2)","K(w2,Popn2)","K(w3,Popn2)","K(w4,Popn2)")
L
}

XK <- scan()
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 0 1

XK <- matrix(XK,8,5,byrow=T)
a <- mph.fit(y,L.fct=L.fct,X=XK,norm.diff.conv=10,norm.score.conv=1e-8,strata=strata)
# Don't forget to input the strata variable, as this determines the definition of the table probabilities.
mph.summary(a)

MODEL GOODNESS OF FIT:   Test of   Ho: h(m)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 3 ):  Gsq =  2.16255 (p =  0.53936 )
Pearson's Score Stat  (df= 3 ):  Xsq =  2.11354 (p =  0.54918 )
Generalized Wald Stat (df= 3 ):  Wsq =  2.26792 (p =  0.51869 )

Adj Resids: -1.392 -1.379 ... 1.412 1.413 , Number |Adj Resid| > 2:  0

SAMPLING PLAN INFORMATION...
Number of strata:   2
Strata identifiers: Popn1, Popn2
Strata with fixed sample sizes: all
Observed strata sample sizes:   149, 69

LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA)  Z-ratio      p-value
beta1 0.2349553   0.04241687 5.539195 3.038648e-08
beta2 0.3150272   0.04935842 6.382442 1.742859e-10
beta3 0.3888738   0.05743424 6.770766 1.281020e-11
beta4 0.5831200   0.06909629 8.439236 0.000000e+00
beta5 0.7934268   0.08080265 9.819316 0.000000e+00

K(w1,Popn1) 0.2079425 0.2349553 0.04241687 -0.97835070
K(w2,Popn1) 0.3275399 0.3150272 0.04935842  0.31584009
K(w3,Popn1) 0.4081121 0.3888738 0.05743424  0.44185351
K(w4,Popn1) 0.5964663 0.5831200 0.06909629  0.41284558
K(w1,Popn2) 0.2965166 0.2349553 0.04241687  0.94238307
K(w2,Popn2) 0.3324734 0.3150272 0.04935842  0.26087642
K(w3,Popn2) 0.3864188 0.3888738 0.05743424 -0.02982431
K(w4,Popn2) 0.7893773 0.7934268 0.08080265 -0.12157636

CONVERGENCE INFORMATION...
Original counts used.
iterations = 16 ,  time elapsed = 5.17
norm.diff  = 2.26061 = dist between last and second last iterates.
Norm diff convergence criterion  was met.
norm.score = 5.97723e-09 = norm of score at last iteration.
Norm score convergence criterion [1e-08] was met.

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/1/09

# Fit the INDIRECT SMOOTHING MODEL

L.fct <- function(p) {
p1 <- p[1:16]
p2 <- p[17:32]
L <- rbind(log(p),
kappa.weighted.fct(w1,p1 ,4),
kappa.weighted.fct(w2,p1 ,4),
kappa.weighted.fct(w3,p1 ,4),
kappa.weighted.fct(w4,p1 ,4),
kappa.weighted.fct(w1,p2 ,4),
kappa.weighted.fct(w2,p2 ,4),
kappa.weighted.fct(w3,p2 ,4),
kappa.weighted.fct(w4,p2 ,4))
rownames(L) <- c(paste(sep=".","logp",1:32), "K(w1,Popn1)","K(w2,Popn1)","K(w3,Popn1)","K(w4,Popn1)",
"K(w1,Popn2)","K(w2,Popn2)","K(w3,Popn2)","K(w4,Popn2)")
L
}

A <- factor(gl(4,4))
B <- factor(gl(4,1,16))
Ascore <- c(A)
Bscore <- c(B)

agree <- scan()
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1

XA1 <-  model.matrix(~A + B + Ascore:Bscore + agree)
XA2 <- XA1

X <- block.fct(XA1,XA2,diag(8))         #block.fct  is available in mph.supp.Rcode.txt
b <- mph.fit(y,L.fct=L.fct,X=X,strata=strata)

mph.summary(b)

MODEL GOODNESS OF FIT:   Test of   Ho: h(m)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 14 ):  Gsq =  18.25286 (p =  0.19551 )
Pearson's Score Stat  (df= 14 ):  Xsq =  22.48127 (p =  0.069252 )
Generalized Wald Stat (df= 14 ):  Wsq =  13.13214 (p =  0.51615 )

Adj Resids: -1.735 -1.675 ... 2.314 2.748 , Number |Adj Resid| > 2:  3

SAMPLING PLAN INFORMATION...
Number of strata:   2
Strata identifiers: Popn1, Popn2
Strata with fixed sample sizes: all
Observed strata sample sizes:   149, 69

LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA)      Z-ratio      p-value
(Intercept)   -2.18195695   0.21789484 -10.01380733 0.000000e+00
A2            -0.98252487   0.34280421  -2.86614005 4.155104e-03
A3            -2.63574417   0.59030845  -4.46502870 8.005802e-06
A4            -5.07053731   1.00283130  -5.05622162 4.276444e-07
B2            -2.50209975   0.36908649  -6.77916906 1.208700e-11
B3            -5.95129014   0.85057662  -6.99677137 2.619238e-12
B4            -8.20966143   1.38875396  -5.91153053 3.389434e-09
agree         -0.02783032   0.24285556  -0.11459617 9.087652e-01
Ascore:Bscore  0.80384747   0.15516362   5.18064414 2.211210e-07
(Intercept)   -3.82657082   0.40135119  -9.53422059 0.000000e+00
A2            -0.93504420   0.59981172  -1.55889618 1.190210e-01
A3            -3.00645110   1.21010157  -2.48446177 1.297474e-02
A4            -6.18631027   2.08586141  -2.96582997 3.018673e-03
B2            -1.26080589   0.64291838  -1.96106681 4.987123e-02
B3            -5.23293410   1.44644377  -3.61779298 2.971259e-04
B4            -8.36055037   2.47380527  -3.37963156 7.258306e-04
agree          0.02769186   0.34868656   0.07941764 9.367004e-01
Ascore:Bscore  1.04115569   0.29708388   3.50458495 4.573196e-04
[3.1]          0.20794246   0.05113013   4.06692644 4.763727e-05
[3.2]          0.33160222   0.05237560   6.33123511 2.432063e-10
[3.3]          0.41743362   0.06030132   6.92246222 4.438672e-12
[3.4]          0.55622648   0.07116497   7.81601533 5.551115e-15
[3.5]          0.29651657   0.07818321   3.79258641 1.490863e-04
[3.6]          0.38035962   0.07129014   5.33537446 9.534758e-08
[3.7]          0.47547451   0.07625832   6.23505065 4.516318e-10
[3.8]          0.73473413   0.09436958   7.78570910 6.883383e-15

logp.1      -1.3663601 -1.4059398 0.13749315  9.427814e-01
logp.2      -3.3945084 -3.0763617 0.28963410 -1.357818e+00
logp.3            -Inf -5.7217047 0.54095671          -Inf
logp.4      -5.0039463 -7.1762285 0.87175123  7.674599e-01
logp.5      -1.5074387 -1.5567869 0.15132358  1.046012e+00
logp.6      -2.6060510 -2.4790220 0.24826770 -1.173576e+00
logp.7      -3.9053340 -4.2926871 0.37191660  6.585275e-01
logp.8            -Inf -4.9433634 0.51585603          -Inf
logp.9      -2.7013612 -2.4061587 0.20273772 -1.809184e+00
logp.10     -2.3648890 -2.4967160 0.21627385  7.877068e-01
logp.11     -3.3945084 -3.5621943 0.36104443  5.317376e-01
logp.12     -3.2121868 -3.3811928 0.31556472  5.600586e-01
logp.13     -3.9053340 -4.0371044 0.44794027  3.168800e-01
logp.14     -3.0580362 -3.3238142 0.27927295  8.336371e-01
logp.15     -3.9053340 -3.5576147 0.36606789 -1.129845e+00
logp.16     -2.7013612 -2.6284264 0.26455789 -5.719039e-01
logp.17     -2.6246686 -2.7577233 0.40026306  5.738569e-01
logp.18     -3.1354942 -3.0050653 0.42768254 -4.227939e-01
logp.19           -Inf -5.9360378 0.76941354          -Inf
logp.20           -Inf -8.0224984 1.42386504          -Inf
logp.21     -3.1354942 -2.6793036 0.38587064 -2.085649e+00
logp.22     -1.8362112 -1.8301063 0.25548478 -5.932316e-02
logp.23     -2.8478121 -3.7476149 0.45024654  1.427064e+00
logp.24           -Inf -4.7929198 0.71595805          -Inf
logp.25     -3.5409593 -3.7095548 0.48406780  2.878691e-01
logp.26     -1.6691571 -1.8468936 0.24432611  1.336046e+00
logp.27     -3.1354942 -2.6678629 0.38466892 -2.171738e+00
logp.28     -2.8478121 -2.6997039 0.35137338 -5.315800e-01
logp.29     -4.2341065 -5.8482583 1.00711518  8.075958e-01
logp.30     -3.5409593 -2.9444414 0.40136378 -1.888459e+00
logp.31     -2.8478121 -2.7519469 0.38094578 -3.688923e-01
logp.32     -1.5950492 -1.6872485 0.23696240  1.051840e+00
K(w1,Popn1)  0.2079425  0.2079425 0.05113013  1.854186e-11
K(w2,Popn1)  0.3275399  0.3316022 0.05237560 -1.162164e-01
K(w3,Popn1)  0.4081121  0.4174336 0.06030132 -2.387584e-01
K(w4,Popn1)  0.5964663  0.5562265 0.07116497  1.160200e+00
K(w1,Popn2)  0.2965166  0.2965166 0.07818321  9.511947e-12
K(w2,Popn2)  0.3324734  0.3803596 0.07129014 -1.198556e+00
K(w3,Popn2)  0.3864188  0.4754745 0.07625832 -1.556305e+00
K(w4,Popn2)  0.7893773  0.7347341 0.09436958  1.884445e+00

CONVERGENCE INFORMATION...
Original counts used.
iterations = 6 ,  time elapsed = 2.36
norm.diff  = 1.27439e-08 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 1.20776e-08 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/1/09

```

EXAMPLE 14.  Marginal cumulative probit model (in conjunction with loglinear association model)

Marijuana use among U.S. male teens.  Data Source: U.S. National Youth Survey (Elliott et al. 1989).

```                      Use at  17
0    1    2
--------------
0     56   13    7
Use at 15     1      6    4   10
2      1    5   15

0 = None,  1=Used no more than once, 2=Used more than once per month in the past year
```

The models that are fitted below have the following form...

L(p) = [L1(p), L2(p)] = [X1b, X2a],     where     L1(p) = Φ-1[p2++ p3+, p3+, p+2 + p+3, p+3]   and   L2(p) = log(p).

Here,   Φ(.) is the standard normal pdf.

The linear predictor for the marginal probit model is defined as

X1b  = [b(cut1), b(cut2), b(cut1)+b(AGE), b(cut2)+b(AGE)],

and the loglinear association model L2(p) = log(p) = X2a has (i,j) component

log(pij) =  a + a(Use15)i + a(Use17)j + a*Use15.scorei*Use17.scorej   [linear-by-linear association model]  or

log(pij) = a + a(Use15)i + a(Use17)j   [independence association model].

The models, which have the generic form L(p) = Xb, are HLP models.

```d <- scan(what=list(Use15="",Use17="",count=0))
0 0 56
0 1 13
0 2 7
1 0 6
1 1 4
1 2 10
2 0 1
2 1 5
2 2 15

d <- data.frame(d)
Use15.score <- c(d\$Use15)-1
Use17.score <- c(d\$Use17)-1
d <- data.frame(d,Use15.score,Use17.score)
y <- d\$count

M15 <- M.fct(d\$Use15)
M17 <- M.fct(d\$Use17)

L.fct <- function(p) {
p15 <- M15%*%p
p17 <- M17%*%p
L <- as.matrix(c(
qnorm(p15+p15),
qnorm(p15),
qnorm(p17+p17),
qnorm(p17),
log(p)
))
rownames(L) <- c("qnorm(P(Use15>0))","qnorm(P(Use15>1))",
"qnorm(P(Use17>0))","qnorm(P(Use17>1))",
paste("logp",1:9))
L
}

CUT <- factor(c(1,2,1,2))
AGE <- factor(c(15,15,17,17))

Xprobit <- model.matrix(~CUT+AGE-1)
Xloglin <- model.matrix(~ Use15 + Use17 +Use15.score:Use17.score,d)
X <- block.fct(Xprobit,Xloglin)

a1 <- mph.fit(y,L.fct=L.fct,X=X)

mph.summary(a1)

MODEL GOODNESS OF FIT:   Test of   Ho: h(m)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 4 ):  Gsq =  1.77057 (p =  0.77786 )
Pearson's Score Stat  (df= 4 ):  Xsq =  1.87796 (p =  0.75819 )
Generalized Wald Stat (df= 4 ):  Wsq =  1.84803 (p =  0.76368 )

Adj Resids: -1.197 -0.64 ... 0.831 1.038 , Number |Adj Resid| > 2:  0

SAMPLING PLAN INFORMATION...
Number of strata:   1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes:   117

LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA)   Z-ratio      p-value
CUT1                    -0.3909243   0.11400659 -3.428962 6.058944e-04
CUT2                    -0.9091368   0.12846677 -7.076824 1.475042e-12
AGE17                    0.2999332   0.09813575  3.056309 2.240799e-03
(Intercept)             -0.7542226   0.09488147 -7.949103 1.776357e-15
Use151                  -2.1584116   0.26134587 -8.258832 2.220446e-16
Use152                  -3.7075766   0.62358890 -5.945546 2.755366e-09
Use171                  -1.3618662   0.20239880 -6.728627 1.712719e-11
Use172                  -2.0394637   0.36806236 -5.541082 3.006082e-08
Use15.score:Use17.score  1.1362827   0.20718661  5.484344 4.150058e-08

qnorm(P(Use15>0)) -0.38416697 -0.39092427 0.11400659  0.1952372
qnorm(P(Use15>1)) -0.91732119 -0.90913681 0.12846677 -0.1962242
qnorm(P(Use17>0)) -0.09655862 -0.09099107 0.11249628 -0.1955444
qnorm(P(Use17>1)) -0.60224878 -0.60920361 0.11879774  0.1950816
logp 1            -0.73682224 -0.75422257 0.09488147  0.6982065
logp 2            -2.19722458 -2.11608873 0.14875078 -0.4043946
logp 3            -2.81626379 -2.79368631 0.32721044 -0.1455725
logp 4            -2.97041447 -2.91263418 0.22644742 -0.1850601
logp 5            -3.37587957 -3.13821761 0.28176346 -0.7192649
logp 6            -2.45958884 -2.67953247 0.16875835  0.7432298
logp 7            -4.76217393 -4.46179917 0.59937397 -0.4919918
logp 8            -3.15273602 -3.55109988 0.25921308  0.8452161
logp 9            -2.05412373 -1.95613201 0.21405273 -1.2569055

CONVERGENCE INFORMATION...
Original counts used.
iterations = 6 ,  time elapsed = 0.05
norm.diff  = 3.97857e-09 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 2.05338e-10 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/1/09

#Fit the Probit Marginal model, along with the Independence Association model```
```
Xprobit <- model.matrix(~CUT+AGE)
Xloglin <- model.matrix(~Use15+Use17,d)
X <- block.fct(Xprobit,Xloglin)

a2 <- mph.fit(y,L.fct=L.fct,X=X)```
```
a1\$Xsq; a1\$df; a2\$Xsq; a2\$df
PEARSON SCORE STATISTIC
1.877965
 4
PEARSON SCORE STATISTIC
45.3922
 5
cbind(a1\$beta,sqrt(a1\$covbeta[3,3]),a2\$beta,sqrt(a2\$covbeta[3,3]))
[,1]       [,2]      [,3]      [,4]
[1,] 0.2999332 0.09813575 0.2975494 0.1575182    #The b(AGE) coef estimates and ase's.```
 Remark:  As expected, the independence/probit model appears to be untenable for these repeated measures data.  Although the independence-model estimator of the marginal probit parameter b(AGE) is consistent, the reported standard error is not.  In fact, it can be anticipated that the actual standard error of this independence-model estimator is significantly larger than the  saturated-model estimator which correctly accommodates the association between the two responses, AGE15 and AGE17 marijuana use responses.
```#If we wanted to use log(m), rather than log(p),  loglinear model, we'd have to use ...

Lm.fct <- function(m) {
p <- m/sum(m)
p15 <- M15%*%p
p17 <- M17%*%p
L <- as.matrix(c(
qnorm(p15+p15),
qnorm(p15),
qnorm(p17+p17),
qnorm(p17),
log(m)
))
rownames(L) <- c("qnorm(P(Use15>0))","qnorm(P(Use15>1))",
"qnorm(P(Use17>0))","qnorm(P(Use17>1))",
paste("logm",1:9))
L
}
Xprobit <- model.matrix(~CUT+AGE-1)
Xloglin <- model.matrix(~Use15+Use17+Use15.score:Use17.score,d)
X <- block.fct(Xprobit,Xloglin)

a3 <- mph.fit(y,L.fct=Lm.fct,X=X,L.mean=T)

```

EXAMPLE 15.   Marginal Cumulative Logit and Linear-by-Linear Loglinear Association Model.

Opinions about Government Spending (data sources: General Social Survey 1989 and  Table 1 of Lang and Agresti, JASA94).

```       Cities                      1              2             3
Law Enforcement             1    2   3     1    2   3    1    2    3
Environment   Health
1           1            62   17   5    90   42   3   74   31   11
2            11    7   0    22   18   1   19   14    3
3             2    3   1     2    0   1    1    3    1
2           1            11    3   0    21   13   2   20    8    3
2             1    4   0     6    9   0    6    5    2
3             1    0   1     2    1   1    4    3    1
3           1             3    0   0     2    1   0    9    2    1
2             1    0   0     2    1   0    4    2    0
3             1    0   0     0    0   0    1    2    3

1= too little,  2= about right,  3 = too much  spending```

Models fitted below:

y ← Y ~ multinomial(n, p),  where the table probabilities p  satisfy...

...the generalized loglinear model  used in Lang and Agresti (JASA94):  L(p) = [L1(p), L2(p)] = [X1a, X2b] = Xδ,   where the components are defined as follows:

L1(p) = log(p).

L2(p) = vector of oneway cumulative logits = log (odd(Rt <= h) , t=1,2,3,4,  h=1,2).  Here, Rt is the tth response variable (t=1=Environment, 2=Health, 3=Cities, 4=Law Enforcement).

X1a  has (ijkl)th element of the form:

a(0) + a(1)i + a(2)j+ a(3)k + a(4)l + a(1,2)*i*j + a(1,3)*i*k + a(1,4)*i*l + a(2,3)*j*k + a(2,4)*j*l + a(3,4)*k*l

X2b has (t,h)th element of the form:   b(level)h + b(response)t

Note that L(p) = Xδ  has the generalized loglinear model form ClogAp = Xδ.  This model was denoted J(LxL)∩M(PO) in Lang and Agresti (JASA94), because the association structure is modeled via a  linear-by-linear loglinear model and the one-way marginal structure is modeled via a proportional odds cumulative logit model.

For convenience, we will augment this model so that estimates of the oneway marginal probabilities along with their corresponding adjusted residuals will be output. Specifically,  we will re-express the  model in the equivalent  form

L(p) = [L1(p), L2(p), L3(p)] = [X1a, X2b, X3θ] = Xβ,   where  L3(p) = (P(Rt=h), t=1,2,3,4, h=1,2,3) is the vector of one-way marginal probabilities and X3 = diag(12), the 12x12 identity matrix.

In this model, the marginal probabilities L3(p) are indirectly smoothed via the constraints implied by [L1(p), L2(p)] = [X1a, X2b].   See Lang (JASA 2005) for a discussion of these "indirect smoothing models."

The data model (data sampling model along with the data parameter model),  y ←Y ~ multinomial(n, p)  and   L(p) = Xβ,  is a homogeneous linear predictor model  (see Lang JASA 2005).

```y <- scan()
62 17 5 90 42 3 74 31 11
11 7 0 22 18 1 19 14 3
2 3 1 2 0 1 1 3 1
11 3 0 21 13 2 20 8 3
1 4 0 6 9 0 6 5 2
1 0 1 2 1 1 4 3 1
3 0 0 2 1 0 9 2 1
1 0 0 2 1 0 4 2 0
1 0 0 0 0 0 1 2 3

# The counts are entered as y = (yijkl: i,j,k,l=1,2,3), where the order follows the rule that
# the rightmost subscripts move faster. The subscripts on yijkl,  correspond
# to the four responses as follows:
#        i=Environment, j=Health, k=Cities,  and l= Law Enforcement.
# For example, y1212 = 7 is the 11th observation in the vector y; it means
# that 7 of those sampled responded (Environ=1, Health=2, Cities=1, Law=2).
# See Table 1 of Lang and Agresti (JASA94).

Environ <- factor(gl(3,27,81))
Health  <- factor(gl(3,9,81))
Cities <- factor(gl(3,3,81))
Law <- factor(gl(3,1,81))

data.frame(Environ,Health,Cities,Law,y)

Escore <- c(Environ)
Hscore <- c(Health)
Cscore <- c(Cities)
Lscore <- c(Law)

## CREATE THE LINK FUNCTION L(m). ##

ME <- M.fct(Environ)
MH <- M.fct(Health)
MC <- M.fct(Cities)
ML <- M.fct(Law)
M <- rbind(ME,MH,MC,ML)

CUM0 <- scan()
1 0 0
0 1 1
1 1 0
0 0 1

CUM0 <- matrix(CUM0,4,3,byrow=T)
CUM <- kronecker(diag(4),CUM0)
AMarg <- CUM%*%M

C0 <- scan()
1 -1 0 0
0 0 1 -1

C0 <- matrix(C0,2,4,byrow=T)
CMarg <- kronecker(diag(4),C0)

Lp.fct <- function(p) {
logp <- log(p)
onewayclogit <- CMarg%*%log(AMarg%*%p)
onewayprob <- M%*%p
L <- as.matrix(c(logp,onewayclogit,onewayprob))
rownames(L) <- c(paste("logp",1:81),
"loddsE<2","loddsE<3",
"loddsH<2","loddsH<3",
"loddsC<2","loddsC<3",
"loddsL<2","loddsL<3",
"P(E=1)","P(E=2)","P(E=3)",
"P(H=1)","P(H=2)","P(H=3)",
"P(C=1)","P(C=2)","P(C=3)",
"P(L=1)","P(L=2)","P(L=3)")
L
}

##CREATE DESIGN MATRIX  X = block[XAssoc, XMarg, diag(12)] ##

XAssoc <- model.matrix(~Environ+Health+Cities+Law+Escore:Hscore + Escore:Cscore
+ Escore:Lscore + Hscore:Cscore + Hscore:Lscore + Cscore:Lscore)

dm <- scan(what=list(CUT="",RESP=""))
1 E
2 E
1 H
2 H
1 C
2 C
1 L
2 L

dm <- data.frame(dm)

XMarg <- model.matrix(~CUT + RESP - 1,data=dm)

X <- block.fct(XAssoc, XMarg, diag(12))

## FIT THE MODEL AND SUMMARIZE ##

ap <-  mph.fit(y, L.fct=Lp.fct, X=X)
mph.summary(ap)

MODEL GOODNESS OF FIT:   Test of   Ho: h(m)=0 vs. Ha: not Ho...

Likelihood Ratio Stat (df= 69 ):  Gsq =  71.535 (p =  0.39365 )
Pearson's Score Stat  (df= 69 ):  Xsq =  64.33728 (p =  0.6365 )
Generalized Wald Stat (df= 69 ):  Wsq =  43.12471 (p =  0.99382 )

Adj Resids: -1.996 -1.743 ... 2.043 2.28 , Number |Adj Resid| > 2:  2

SAMPLING PLAN INFORMATION...
Number of strata:   1
Strata identifiers: 1
Strata with fixed sample sizes: all
Observed strata sample sizes:   607

LINEAR PREDICTOR MODEL RESULTS...
BETA StdErr(BETA)      Z-ratio      p-value
(Intercept)   -3.856486433  0.178483120 -21.60700935 0.000000e+00
Environ2      -2.588205013  0.308823990  -8.38084183 0.000000e+00
Environ3      -5.493086736  0.655267978  -8.38296227 0.000000e+00
Health2       -2.595621274  0.297230071  -8.73270078 0.000000e+00
Health3       -5.610605951  0.637689942  -8.79832907 0.000000e+00
Cities2       -0.034223012  0.204332191  -0.16748713 8.669868e-01
Cities3       -0.937638792  0.413278262  -2.26878323 2.328151e-02
Law2          -1.826510927  0.263654376  -6.92767159 4.278133e-12
Law3          -4.197613831  0.551070670  -7.61719695 2.597922e-14
Escore:Hscore  0.498689176  0.112274533   4.44169449 8.925322e-06
Escore:Cscore  0.314319282  0.103927942   3.02439627 2.491299e-03
Escore:Lscore -0.003494676  0.112011470  -0.03119927 9.751106e-01
Hscore:Cscore  0.051558696  0.099563584   0.51784692 6.045651e-01
Hscore:Lscore  0.454699678  0.103442225   4.39568735 1.104228e-05
Cscore:Lscore  0.198569482  0.089578780   2.21670224 2.664345e-02
CUT1          -1.342154170  0.086191456 -15.57177748 0.000000e+00
CUT2           0.515427959  0.079998129   6.44300014 1.171345e-10
RESPE          2.336871534  0.117010601  19.97145142 0.000000e+00
RESPH          2.255916301  0.120197721  18.76837833 0.000000e+00
RESPL          1.874491508  0.111986680  16.73852200 0.000000e+00
[3.1]          0.730018682  0.017844267  40.91054536 0.000000e+00
[3.2]          0.215418742  0.013701431  15.72235327 0.000000e+00
[3.3]          0.054562575  0.005782026   9.43658465 0.000000e+00
[3.4]          0.713769394  0.018116177  39.39955879 0.000000e+00
[3.5]          0.227338141  0.013759216  16.52260869 0.000000e+00
[3.6]          0.058892465  0.006166318   9.55066935 0.000000e+00
[3.7]          0.207156029  0.014156292  14.63349475 0.000000e+00
[3.8]          0.418922021  0.014035304  29.84773458 0.000000e+00
[3.9]          0.373921950  0.018727908  19.96602862 0.000000e+00
[3.10]         0.630028094  0.019252284  32.72484993 0.000000e+00
[3.11]         0.286027282  0.014261072  20.05650639 0.000000e+00
[3.12]         0.083944625  0.007931189  10.58411547 0.000000e+00

logp 1   -2.28139441 -2.34214479 0.095988109  0.7667174334
logp 2   -3.57531545 -3.51888124 0.121653595 -0.2852222757
logp 3   -4.79909088 -5.24020966 0.263546567  0.9008338501
logp 4   -1.90871912 -1.81192035 0.055480544 -1.3221152708
logp 5   -2.67085917 -2.79008731 0.083716619  0.8845427650
logp 6   -5.30991650 -4.31284624 0.146242602 -3.1537711693
logp 7   -2.10446370 -2.15088867 0.091828124  0.7271838532
logp 8   -2.97454159 -2.93048614 0.099761757 -0.3173819288
logp 9   -4.01063352 -4.25467560 0.194390396  0.8817534685
logp 10  -4.01063352 -3.93281852 0.147114161 -0.3155472181
logp 11  -4.46261864 -4.65485528 0.151960064  0.4990257493
logp 12         -Inf -5.92148402 0.258577660          -Inf
logp 13  -3.31748634 -3.35103537 0.103701811  0.1803506853
logp 14  -3.51815703 -3.87450266 0.116467520  1.4071862606
logp 15  -6.40852879 -4.94256192 0.143423649 -3.2094351956
logp 16  -3.46408981 -3.63844500 0.136629080  0.8473253486
logp 17  -3.76947146 -3.96334280 0.125889658  0.7369118170
logp 18  -5.30991650 -4.83283257 0.183270761 -1.1515979690
logp 19  -5.71538161 -5.94285565 0.314590661  0.3133186698
logp 20  -5.30991650 -6.21019273 0.279577646  1.0463497281
logp 21  -6.40852879 -7.02212179 0.399062617  0.4725321071
logp 22  -5.71538161 -5.30951381 0.215228332 -0.7599555379
logp 23         -Inf -5.37828141 0.161798208          -Inf
logp 24  -6.40852879 -5.99164099 0.265133659 -0.5440565922
logp 25  -6.40852879 -5.54536473 0.282673600 -1.4797191620
logp 26  -5.30991650 -5.41556285 0.217235401  0.1862820936
logp 27  -6.40852879 -5.83035295 0.309194311 -0.8490914514
logp 28  -4.01063352 -4.12083603 0.159071976  0.4036179341
logp 29  -5.30991650 -5.30106714 0.189103424 -0.0163488085
logp 30         -Inf -7.02589024 0.330757519          -Inf
logp 31  -3.36400635 -3.27629230 0.101436265 -0.4927805100
logp 32  -3.84357943 -4.25795393 0.130274768  1.3250353670
logp 33  -5.71538161 -5.78420754 0.207945842  0.0982490088
logp 34  -3.41279652 -3.30094133 0.120324284 -0.6616066827
logp 35  -4.32908725 -4.08403349 0.132070516 -0.8732852306
logp 36  -5.30991650 -5.41171762 0.225586007  0.1809478286
logp 37  -6.40852879 -5.21282057 0.184460807 -2.3148291504
logp 38  -5.02223443 -5.93835201 0.191118931  1.1960147896
logp 39         -Inf -7.20847543 0.292969107          -Inf
logp 40  -4.61676932 -4.31671815 0.132339573 -0.9290704495
logp 41  -4.21130421 -4.84368011 0.145148255  1.4646959078
logp 42         -Inf -5.91523404 0.168305243          -Inf
logp 43  -4.61676932 -4.28980849 0.142729601 -1.0435538445
logp 44  -4.79909088 -4.61820096 0.137112069 -0.4726692938
logp 45  -5.71538161 -5.49118542 0.179786309 -0.3707831882
logp 46  -6.40852879 -6.72416852 0.313328218  0.2799401698
logp 47         -Inf -6.99500029 0.280566245          -Inf
logp 48  -6.40852879 -7.81042402 0.394972611  0.7094006523
logp 49  -5.71538161 -5.77650740 0.193018061  0.0870906645
logp 50  -6.40852879 -5.84876968 0.141283077 -0.7549586446
logp 51  -6.40852879 -6.46562394 0.238341328  0.0570865128
logp 52  -5.02223443 -5.69803905 0.236612342  1.0261032969
logp 53  -5.30991650 -5.57173184 0.168630327  0.4124166195
logp 54  -6.40852879 -5.99001662 0.258680890 -0.5451050874
logp 55  -5.30991650 -6.21620397 0.340140679  1.0773149652
logp 56         -Inf -7.39992976 0.370492121          -Inf
logp 57         -Inf -9.12824753 0.552626047          -Inf
logp 58  -5.71538161 -5.05734095 0.193879185 -1.4039179534
logp 59  -6.40852879 -6.04249727 0.221722074 -0.4565869832
logp 60         -Inf -7.57224555 0.398403024          -Inf
logp 61  -4.21130421 -4.76767071 0.227096973  1.4838118658
logp 62  -5.71538161 -5.55425754 0.232411843 -0.2649112170
logp 63  -6.40852879 -6.88543634 0.400463523  0.3961777810
logp 64  -6.40852879 -6.80949934 0.327427053  0.3407838090
logp 65         -Inf -7.53852545 0.326104252          -Inf
logp 66         -Inf -8.81214355 0.475427281          -Inf
logp 67  -5.71538161 -5.59907763 0.179375460 -0.1813535926
logp 68  -6.40852879 -6.12953426 0.167189288 -0.3272239667
logp 69         -Inf -7.20458287 0.310174095          -Inf
logp 70  -5.02223443 -5.25784869 0.205559437  0.4513390204
logp 71  -5.71538161 -5.58973584 0.171184375 -0.1962228393
logp 72         -Inf -6.46621497 0.308626547          -Inf
logp 73  -6.40852879 -7.82215811 0.468598685  0.7167739163
logp 74         -Inf -8.09648455 0.431739288          -Inf
logp 75         -Inf -8.91540296 0.560422556          -Inf
logp 76         -Inf -6.56017771 0.314340321          -Inf
logp 77         -Inf -6.63593466 0.255312333          -Inf
logp 78         -Inf -7.25628360 0.380583676          -Inf
logp 79  -6.40852879 -6.16739007 0.340998263 -0.2950815133
logp 80  -5.71538161 -6.04457754 0.265725370  0.4171827178
logp 81  -5.30991650 -6.46635699 0.380972464  1.2104544224
loddsE<2  1.00207436  0.99471736 0.090538025  0.5786279423
loddsE<3  2.79379093  2.85229949 0.112086258 -0.4203606869
loddsH<2  0.91975294  0.91376213 0.088673239  0.4227572342
loddsH<3  2.79379093  2.77134426 0.111256896  0.1704312075
loddsC<2 -1.26125559 -1.34215417 0.086191456  1.5860601196
loddsC<3  0.47321734  0.51542796 0.079998130 -1.6717484121
loddsL<2  0.50117219  0.53233734 0.082594968 -1.9876428816
loddsL<3  2.65147985  2.38991947 0.103139187  2.5184851520
P(E=1)    0.73146623  0.73001868 0.017844267  0.5776478132
P(E=2)    0.21087315  0.21541874 0.013701431 -0.4772650886
P(E=3)    0.05766063  0.05456258 0.005782026  0.4314828693
P(H=1)    0.71499176  0.71376939 0.018116177  0.4222152620
P(H=2)    0.22734761  0.22733814 0.013759216  0.0009466647
P(H=3)    0.05766063  0.05889246 0.006166318 -0.1687532296
P(C=1)    0.22075783  0.20715603 0.014156292  1.6236398142
P(C=2)    0.39538715  0.41892202 0.014035304 -1.6476003693
P(C=3)    0.38385502  0.37392195 0.018727908  1.6804419408
P(L=1)    0.62273476  0.63002809 0.019252284 -1.9955680999
P(L=2)    0.31136738  0.28602728 0.014261072  2.1968010087
P(L=3)    0.06589786  0.08394462 0.007931189 -2.2597073814

CONVERGENCE INFORMATION...
Original counts used.
iterations = 10 ,  time elapsed = 1.5
norm.diff  = 4.26534e-07 = dist between last and second last iterates.
Norm diff convergence criterion [1e-06] was met.
norm.score = 3.3329e-07 = norm of score at last iteration.
Norm score convergence criterion [1e-06] was met.

FITTING PROGRAM USED:  mph.fit, version 3.0, 5/1/09

# REMARK:
#     The model fitted above was expressed in terms of the table probabilities p,  as L(p) = X b.
#     Alternatively,  the model could have been specified in terms of the expected counts m,
#     as L(m) = X b, where L(.) is slightly modified from the definition above.
#     Below,  the relevant code is given...
#
Lm.fct <- function(m) {
logm <- log(m)
onewayclogit <- CMarg%*%log(AMarg%*%m)
onewayprob <- M%*%m/sum(m)
L <- as.matrix(c(logm,onewayclogit,onewayprob))
rownames(L) <- c(paste("logm",1:81),
"loddsE<2","loddsE<3",
"loddsH<2","loddsH<3",
"loddsC<2","loddsC<3",
"loddsL<2","loddsL<3",
"P(E=1)","P(E=2)","P(E=3)",
"P(H=1)","P(H=2)","P(H=3)",
"P(C=1)","P(C=2)","P(C=3)",
"P(L=1)","P(L=2)","P(L=3)")
L
}
am <-  mph.fit(y, L.fct=Lm.fct, L.mean=T, X=X)
mph.summary(am)

```

Document Last Updated:  05/22/2009 03:37:07 PM, Joseph B. Lang