C.2 Code for rhDNase Data Analyses The data in Section D.3 of Appendix D for the first few subjects can be re-formatted and read into S-PLUS or R as a data frame in the counting process formulation as follows : > rhDNase <- read.table("rhDNase.dat", header=F) > dimnames(rhDNase)[[2]] <- c("id","trt","fev","fev2","time1","time2", "status","etype","enum","enum1","enum2") > rhDNase[1:18, c("id","trt","fev","time1","time2","status", "etype","enum","enum1")] id trt fev time1 time2 status etype enum enum1 1 493301 1 28.8 0 168 0 1 1 1 2 493303 1 64.0 0 169 0 1 1 1 3 493305 0 67.2 0 65 1 1 1 1 4 493305 0 67.2 65 75 1 2 2 1 5 493305 0 67.2 75 168 0 1 3 2 6 493309 1 57.6 0 168 0 1 1 1 7 493310 0 57.6 0 171 0 1 1 1 8 493311 1 25.6 0 166 0 1 1 1 9 493312 0 86.4 0 168 0 1 1 1 10 493313 0 32.0 0 90 1 1 1 1 11 493313 0 32.0 90 104 1 2 2 1 12 493313 0 32.0 104 166 0 1 3 2 13 589301 1 86.4 0 169 0 1 1 1 14 589302 0 28.8 0 8 1 1 1 1 15 589302 0 28.8 8 22 1 2 2 1 16 589302 0 28.8 22 63 1 1 3 2 17 589302 0 28.8 63 88 1 2 4 2 18 589302 0 28.8 88 169 0 1 5 3 Consider the data for the first two events, create a gap time variable, and center FEV. > rhDNase.etype1 <- rhDNase[rhDNase$etype == 1,] > rhDNase.etype1$gtime <- rhDNase.etype1$time2 - rhDNase.etype1$time1 > rhDNase.etype1$fevc <- rhDNase.etype1$fev - mean(rhDNase.etype1$fev[rhDNase.etype1$enum1==1]) Data for First and Second Gaps : > rhDNase1 <- rhDNase.etype1[rhDNase.etype1$enum1 == 1,] > rhDNase2 <- rhDNase.etype1[rhDNase.etype1$enum1 == 2,] Create a data frame in which we can use the first gap time as a covariate in the second gap time analysis. > temp <- rhDNase1[rhDNase1$status == 1, c("id","gtime")] > dimnames(temp)[[2]] <- c("id","gtime1") > rhDNase2 <- merge(rhDNase2, temp, by="id", all.x=T) > rhDNase2$gtime1c <- rhDNase2$gtime1 - mean(rhDNase2$gtime1) Fit a Cox model to first and second gap times. > fit1 <- coxph(Surv(gtime,status) ~ trt + fevc, data=rhDNase1, method="breslow") > summary(fit1) Call: coxph(formula = Surv(gtime, status) ~ trt + fevc, data = rhDNase1, method = "breslow") n= 645 coef exp(coef) se(coef) z p trt -0.3828 0.682 0.12971 -2.95 3.2e-03 fevc -0.0206 0.980 0.00277 -7.44 1.0e-13 exp(coef) exp(-coef) lower .95 upper .95 trt 0.682 1.47 0.529 0.879 fevc 0.980 1.02 0.974 0.985 Rsquare= 0.103 (max possible= 0.991 ) Likelihood ratio test= 69.8 on 2 df, p=6.66e-16 Wald test = 63.5 on 2 df, p=1.65e-14 Score (logrank) test = 65.9 on 2 df, p=4.77e-15 > fit2 <- coxph(Surv(gtime,status) ~ trt + fevc + gtime1c, data=rhDNase2, method="breslow") > summary(fit2) Call: coxph(formula = Surv(gtime, status) ~ trt + fevc + gtime1c, data = rhDNase2, method = "breslow") n= 227 coef exp(coef) se(coef) z p trt 0.358071 1.431 0.22456 1.595 0.11000 fevc 0.000932 1.001 0.00538 0.173 0.86000 gtime1c -0.014310 0.986 0.00394 -3.634 0.00028 exp(coef) exp(-coef) lower .95 upper .95 trt 1.431 0.699 0.921 2.222 fevc 1.001 0.999 0.990 1.012 gtime1c 0.986 1.014 0.978 0.993 Rsquare= 0.068 (max possible= 0.968 ) Likelihood ratio test= 15.9 on 3 df, p=0.00121 Wald test = 14.8 on 3 df, p=0.00196 Score (logrank) test = 15.4 on 3 df, p=0.00153 Model checking can be carried out using the S-PLUS function cox.zph. Test for proportion hazards assumption. > cox.zph(fit1, transform="identity") rho chisq p trt 0.0239 0.139 0.709 fevc 0.0363 0.288 0.591 GLOBAL NA 0.422 0.810 > cox.zph(fit1, transform="log") rho chisq p trt 0.0621 0.932 0.334 fevc 0.0477 0.499 0.480 GLOBAL NA 1.412 0.494 Gap time analysis may also be carried out using parametric log-normal models as in Section 4.3.2. > fit1 <- survReg(Surv(gtime,status) ~ trt + fevc, data=rhDNase1, dist="lognormal") > summary(fit1) Call: survReg(formula = Surv(gtime, status) ~ trt + fevc, data = rhDNase1, dist = "lognormal") Value Std. Error z p (Intercept) 5.4030 0.1048 51.53 0.00e+00 trt 0.4302 0.1371 3.14 1.71e-03 fevc 0.0217 0.0029 7.47 8.03e-14 Log(scale) 0.3688 0.0512 7.21 5.68e-13 Scale= 1.45 Log Normal distribution Loglik(model)= -1625 Loglik(intercept only)= -1660.8 Chisq= 71.51 on 2 degrees of freedom, p= 3.3e-16 Number of Newton-Raphson Iterations: 3 n= 645 > rhDNase2$lgtime1 <- log(rhDNase2$gtime1) > fit2 <- survReg(Surv(gtime,status) ~ trt + fevc + lgtime1, data=rhDNase2, dist="lognormal") Call: survReg(formula = Surv(gtime, status) ~ trt + fevc + lgtime1, data = rhDNase2, dist = "lognormal") Value Std. Error z p (Intercept) 3.20899 0.4867 6.593 4.30e-11 trt -0.22657 0.2105 -1.077 2.82e-01 fevc -0.00454 0.0047 -0.965 3.34e-01 lgtime1 0.41730 0.1322 3.157 1.59e-03 Log(scale) 0.20559 0.0859 2.392 1.67e-02 Scale= 1.23 Log Normal distribution Loglik(model)= -484.9 Loglik(intercept only)= -490.9 Chisq= 11.99 on 3 degrees of freedom, p= 0.0074 Number of Newton-Raphson Iterations: 3 n= 227