## Files and Folders

It is essential that you name your folders and files exactly as specified. We will run checks like

cd HW1
Rscript -e 'rmarkdown::render("hw1.Rmd")'

from the top of a clone of your repository. If the folders and files are not named exactly as specified, these checks will fail.

Consistent file naming is often an important part of a data analysis project. Consistent use of upper and lower case letters is important for an analysis to be reproducible. These may seem like small issues, but if your files do not use the conventions your code expects then the code will fail.

There are two kinds of file systems in common use:

• Case preserving but case insensitive (Windows, most macOS systems): If you create a file as hw1.Rmd it will be shown with this spelling, but you can also access it as Hw1.rMd or any other case variation.

• Case sensitive (Linux, some macOS systems): You can have separate files named hw1.Rmd and Hw1.rMd.

To make your work as reproducible as possible:

• Always use the same naming and case conventions, even on a case insensitive system.

• Never create two files with names that differ only in case, even if your file system is case sensitive and allows this.
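The second rule can be checked mechanically. The following is a minimal sketch, not part of the course tooling: `find_case_collisions` is a hypothetical helper name, and the check is only informative on a case sensitive file system, since a case insensitive one cannot hold two such files in the first place.

```r
# Hypothetical helper: return the file names that collide when case is ignored.
find_case_collisions <- function(paths) {
    lowered <- tolower(paths)
    # Flag every name whose lower-case form occurs more than once.
    dup <- lowered %in% lowered[duplicated(lowered)]
    paths[dup]
}

# Check all files below the current directory, e.g. the top of a clone.
repo_files <- list.files(".", recursive = TRUE)
collisions <- find_case_collisions(repo_files)
if (length(collisions) > 0) {
    message("Names differing only in case: ",
            paste(collisions, collapse = ", "))
}
```

Run from the top of the repository, this prints a message only when a collision is found.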

## Reproducibility

Make sure your .Rmd file will knit without errors.

• Except for packages, your code should not depend on anything not contained in your repository.

• Your code should not attempt to make any modifications outside your repository, including installing packages.
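One way to respect both points is to check for required packages and stop with a message instead of installing them. This is a sketch under assumptions: the package list `needed` and the helper `missing_packages` are hypothetical, and you would list the packages your own document uses.

```r
# Hypothetical list; replace with the packages your document actually uses.
needed <- c("stats", "graphics")

missing_packages <- function(pkgs) {
    # requireNamespace() only tries to load a package; it never installs one.
    ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
    pkgs[!ok]
}

not_installed <- missing_packages(needed)
if (length(not_installed) > 0) {
    stop("Please install before knitting: ",
         paste(not_installed, collapse = ", "))
}
```

A chunk like this near the top of the .Rmd file makes a missing package fail early with a clear message, without modifying anything outside the repository.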

For now, if your submission does not knit successfully on our test systems (the CLAS Linux systems), we will:

• Fix your .Rmd file, commit the changes, and push them to your UI GitLab repository.
• Open an issue on GitLab notifying you of the change. The issue is assigned to you, which generates an email to you.

You will then need to pull our changes to bring them into your local repository. There will of course be a deduction if we have to do this.

## Rmarkdown Usage and Coding Style

Make sure you are using Rmarkdown properly, with explanatory text surrounding short code chunks. In particular, you should not have just one big code chunk.

• Your Rmarkdown code and your R code should be readable, and the R code should follow the coding standards. This makes maintaining your code and document easier.

• Your rendered HTML page should be a report with text supporting numerical and graphical results. Code only needs to be visible if you are explaining how to do something (which is a goal of the class notes).

You can use chunk options to hide code and show only results. For example, the chunk

{r, echo = FALSE}
hist(faithful$waiting)

will show only the plot and not the code.

• Numbers in your text inserted with inline code and numbers in tables should be rounded to an appropriate number of decimals. The round function can be used in inline code. The kable functions have options for controlling the number of digits.

• Your homework solutions should use the same headers as the assignment to make it easier to grade. The assignment problem headers are all level two headers, created by starting a line with ##.

• The text should be all your own, not stray material from templates.

## Name and Date

Make sure your Rmarkdown file header contains a name: field with your name; a date: field with an appropriate date is also useful. Your header should look something like this:

---
title: "Assignment 1"
output: html_document
name: "Your Name"
date: "January 26, 2022"
---

The template shows a way to have the current date inserted.

The following is a sample solution.

## 1. Average Waiting Time Between Eruptions

The average waiting time between eruptions of the Old Faithful geyser in the data set faithful is 70.9 minutes.

## 2. First Four Eruption Durations

The first four eruption durations can be computed by using the $ operator to extract the eruptions variable, and then the subset operator to get the subset of the first four observations:

faithful$eruptions[1 : 4]
## [1] 3.600 1.800 3.333 2.283

## 3. First Five Records of the Eruptions Data

The following table shows the first five eruption durations and waiting times to the subsequent eruption for the Old Faithful geyser recorded in the faithful data frame. Times are in minutes.

| eruptions | waiting |
|----------:|--------:|
|      3.60 |      79 |
|      1.80 |      54 |
|      3.33 |      74 |
|      2.28 |      62 |
|      4.53 |      85 |
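A table like this one, with durations rounded to two decimals, could be produced along the following lines. This is only a sketch of one possible approach, not necessarily how the sample solution was written. The `knitr::kable` call is guarded because knitr may not be installed where the code runs; inside an .Rmd chunk the `print` calls would normally be dropped.

```r
# First five records of the built-in faithful data frame.
first_five <- head(faithful, 5)

# Round explicitly with base R ...
first_five$eruptions <- round(first_five$eruptions, 2)
print(first_five)

# ... or let kable do the rounding through its digits argument.
if (requireNamespace("knitr", quietly = TRUE)) {
    print(knitr::kable(head(faithful, 5), digits = 2))
}

# The same round() call works in inline code, e.g. for an average.
avg_wait <- round(mean(faithful$waiting), 1)
```

Rounding at the point of display keeps the underlying data at full precision for any later computation.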

## 4. Histogram of Eruption Durations

The following plot shows a histogram of the eruption durations for the faithful data set.

The distribution appears to be bimodal, with one mode around 2 minutes and one around 4.5 minutes.
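For reference, a histogram of this kind can be drawn with base R as sketched below. The title and axis label are illustrative choices, not taken from the sample solution. In an .Rmd file the chunk would typically carry the echo = FALSE option so that only the plot appears in the report.

```r
# Histogram of eruption durations from the built-in faithful data set.
# hist() draws the plot and also returns the bin counts invisibly.
h <- hist(faithful$eruptions,
          main = "Old Faithful eruption durations",
          xlab = "Duration (minutes)")

# Every observation falls in exactly one bin.
total_binned <- sum(h$counts)
```

Inspecting `h$mids` and `h$counts` is one way to check the location of the two modes numerically.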