\documentclass[11pt]{article} \setlength{\topmargin}{-0.25in} \setlength{\oddsidemargin}{0.0in} \setlength{\evensidemargin}{0.0in} \setlength{\textwidth}{5.75in} \setlength{\textheight}{8.5in} \setlength{\parindent}{0in} \setlength{\parsep}{0in} \setlength{\topsep}{0in} \setlength{\parskip}{1.2ex} \usepackage{Sweave} \begin{document} \large \begin{verbatim} Name: ___________________________________ \end{verbatim} \begin{center} {\bf Computing in Statistics}, STAT:5400\\ Midterm 2, Fall 2018 \\ \end{center} \normalsize \vskip 0.2 in You must work in the Linux environment. Submit your answers in the ICON drop box as an .Rnw file and .pdf file produced using Sweave, with your name as author. If you can't get your .Rnw file to compile, submit it anyway and include your R output in a separate text file. In your document, have a named section for each problem, and, where needed, a numbered list of answers to multipart questions. You don't have to type any other text except where needed to answer a question. \section{Simulation study} Carry out two simulation studies to estimate the bias of the sample .95 quantile as an estimator of the population .95 quantile when the population distribution is poisson with parameter 6. (Note that the R {\tt quantile} function computes sample quantiles, and the {\tt qpois} function produces the true theoretical quantile.) Do one simulation study for each of two sample sizes -- $n = 10$ and $n = 50$. In both cases, set your number of replicate datasets $S$ such that the standard error of your estimate of bias will be no larger than 0.01. Time how long it takes to run each simulation study. Submit: \begin{itemize} \item Your R code. \item Your R output containing the results of the simulation studies. \item R output showing how long it took to run each simulation study. \item A brief paragraph interpreting the results. \end{itemize} <<>>= # find out the true populaton value trueq <- qpois(0.95,6) # function to carry out simulation study simstudy <- function(S,n, truth) { mydat <- matrix(rpois(S * n, 6), nrow = S) sampquants <- apply(mydat,1,quantile, 0.95) variance = var(sampquants) list( bias = mean(sampquants) - truth, variance = variance, se = sqrt(variance/S) ) } set.seed(24) # pilot study to estimate standard error S <- 500 n <- 10 pilot.out <- simstudy(S,n,trueq) S <- ceiling(pilot.out$variance / 0.01^2) # carry out the sim studies system.time(n10out <- simstudy(S, n, trueq)) n <- 50 system.time(n50out <- simstudy(S, n, trueq)) results <- rbind(cbind(n10out$bias, n50out$bias), cbind(n10out$se, n50out$se)) colnames(results) <- cbind("n = 10", "n = 50") rownames(results) <- cbind("bias", "standard error") print(results) @ The number of replicate datasets required to keep the standard error below 0.01 was \Sexpr{S}. The bias of the sample quantile was substantial and negative for sample size $n=10$ at \Sexpr{round(results[1,1],4)}. When the sample size was $n = 50$, the bias was positive and small at \Sexpr{round(results[1,2],5)}, but still significantly different from zero since the standard error was only \Sexpr{round(results[2,2],5)}. \section{The bootstrap} Here is a dataset of size 10. \begin{verbatim} 4 8 5 5 6 6 3 5 6 7 \end{verbatim} \subsection{Nonparametric bootstrap} Now perform a nonparametric bootstrap analysis using this dataset to assess the bias in the .95 sample quantile as an estimator of the .95 quantile of the population from which this sample was drawn. Sutmit: \begin{itemize} \item Your R code. \item Your R output containing the numeric results of the nonparametric bootstrap \end{itemize} <<>>= library(boot) mydat <- c(4, 8, 5, 5, 6, 6, 3, 5, 6, 7) myquantfunc <- function(x,ind) quantile(x[ind], 0.95) bootout <- boot(mydat, myquantfunc, 3000) print(bootout) bias <- mean(bootout$t) - bootout$t0 print(paste("Estimated bias is", bias)) @ \subsection{Parametric bootstrap} Now perform a parametric bootstrap analysis for the same purpose. Assume that the population is poisson. Submit: \begin{itemize} \item Your R code. \item Your R output containing the numeric results of the parametric bootstrap \end{itemize} <<>>= myrandgen <- function(x,d) rpois(length(x), d$parm) myfunc2 <- function(x) quantile(x,0.95) bootout2 <- boot(mydat,myfunc2, 3000, sim="parametric", ran.gen = myrandgen, mle = list(parm = mean(mydat) )) print(bootout2) bias2 <- mean(bootout2$t) - bootout2$t0 print(paste("Estimated bias is", bias2)) @ \end{document}