Background 
R is a language, or an environment, for data analysis and data visualization.
R is derived from the [S language](https://en.wikipedia.org/wiki/S_(programming_language)) developed at AT&T Bell Laboratories.
R was originally developed for teaching at the University of Auckland, New Zealand, by Ross Ihaka and Robert Gentleman.
R is now maintained by an international group of about 20 statisticians and computer scientists.
A great strength of R is the large number of extension packages that have been developed.
The number available on CRAN  is now over 19,000.
 
Basic Usage 
Interactive R uses a command line interface  (CLI).
The interface runs a read-evaluate-print loop  (REPL).
A simple interaction with the R interpreter:
> 1 + 2
[1] 3
Values can be assigned to variables using a left arrow <- combination:
> x <- c(1, 3, 5)
> x
[1] 1 3 5
The = sign can also be used for assignment, but <- is recommended.
 
Basic arithmetic operations work element-wise on vectors:
> x + x
[1]  2  6 10
Scalars are recycled to the length of the longer operand:
> x + 1
[1] 2 4 6
> 2 * x
[1]  2  6 10
Some ways to create new vectors:
> c(1, 2, 3)
[1] 1 2 3
> c("a", "b", "c")
[1] "a" "b" "c"
> 1 : 3
[1] 1 2 3
These examples show a prompt as you would see in the interpreter.
Usually Rmarkdown documents show code and results like this:
2 * x
## [1]  2  6 10
This makes it easier to copy code for pasting it into another document or the R console.
 
Data Frames 
Data sets in R are often organized in named lists  of variables called data frames .
The value of the variable faithful is a data frame with two variables recorded for eruptions of the Old Faithful  geyser in Yellowstone National Park:
eruptions: Eruption duration (minutes)
waiting: Waiting time to next eruption (minutes)
head() shows the first 6 rows:
head(faithful)
##   eruptions waiting
## 1     3.600      79
## 2     1.800      54
## 3     3.333      74
## 4     2.283      62
## 5     4.533      85
## 6     2.883      55 
A Simple Scatter Plot 
with(faithful,
     plot(eruptions, waiting,
          xlab = "Eruption time (min)",
          ylab = "Waiting time to next eruption (min)"))
Several graphics systems are available for R.
plot() is part of base graphics .
 
Fitting a Linear Regression 
fit <- with(faithful, lm(waiting ~ eruptions))
fit
## 
## Call:
## lm(formula = waiting ~ eruptions)
## 
## Coefficients:
## (Intercept)    eruptions  
##       33.47        10.73
You can also use the data argument to lm():
fit <- lm(waiting ~ eruptions, data = faithful)
coef() extracts the coefficients:
coef(fit)
## (Intercept)   eruptions 
##    33.47440    10.72964
summary(fit) provides more details:
summary(fit)
## 
## Call:
## lm(formula = waiting ~ eruptions, data = faithful)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -12.0796  -4.4831   0.2122   3.9246  15.9719 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  33.4744     1.1549   28.98   <2e-16 ***
## eruptions    10.7296     0.3148   34.09   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.914 on 270 degrees of freedom
## Multiple R-squared:  0.8115, Adjusted R-squared:  0.8108 
## F-statistic:  1162 on 1 and 270 DF,  p-value: < 2.2e-16 
Adding the Regression Line to the Plot 
Original plot:
with(faithful,
     plot(eruptions, waiting,
          xlab = "Eruption time (min)",
          ylab = "Waiting time to next eruption (min)"))
With a regression line:
with(faithful,
     plot(eruptions, waiting,
          xlab = "Eruption time (min)",
          ylab = "Waiting time to next eruption (min)"))
abline(coef(fit), col = "red", lwd = 3)
 
Packages and Package Libraries 
Extension code and data sets are often made available in packages .
Packages are stored in folders or directories as collections called libraries .
.libPaths() will show you the libraries your R process will search.
search() shows what packages are attached to the global search path.
The library() function is used to find packages in the libraries and attach them to the search path.
The expression pkg::var gets the value of variable var from package pkg without attaching pkg.
You can install packages using the install.packages function or the Install Packages  item in the RStudio Tools  menu.
By default packages are installed from CRAN .
It is also possible to use functions in the remotes package to install packages hosted on GitHub  or GitLab .
 
A Useful Package: ggplot2 
The ggplot2 package provides a powerful alternative to the base graphics system.
The geyser example can be done in ggplot2 like this:
library(ggplot2)
ggplot(data = faithful) +
    geom_point(mapping = aes(x = eruptions, y = waiting)) +
    geom_smooth(mapping = aes(x = eruptions, y = waiting),
                method = "lm", se = FALSE)
ggplot2 is part of a useful collection of packages called the tidyverse 
ggplot is based on the Grammar of Graphics .
A basic template for creating a plot with ggplot:
ggplot(data = <DATA>) + <GEOM>(mapping = aes(<MAPPINGS>)) 
 
Subsetting and Extracting Components 
The subset operator [ can be used to extract elements by index:
month.abb
##  [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
month.abb[1 : 3]
## [1] "Jan" "Feb" "Mar"
Subsetting can also be based on a logical expression that returns TRUE or FALSE for each element:
(starts_with_J <- substr(month.abb, 1, 1) == "J")
##  [1]  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
month.abb[starts_with_J]
## [1] "Jan" "Jun" "Jul"
The value of an assignment operation is the right hand side value.
Ordinarily this value is not printed. 
Placing the assignment expression in parentheses causes it to be printed. 
 
 
Individual elements can be extracted using the element operator  [[:
month.abb[[3]]
## [1] "Mar"
Components of named lists, like data frames, can be extracted with the $ operator:
names(faithful)
## [1] "eruptions" "waiting"
head(faithful, 4)
##   eruptions waiting
## 1     3.600      79
## 2     1.800      54
## 3     3.333      74
## 4     2.283      62
head(faithful$eruptions, 4)
## [1] 3.600 1.800 3.333 2.283
The element operator can be used as well:
head(faithful[["eruptions"]], 4)
## [1] 3.600 1.800 3.333 2.283 
Functions 
Simple Functions 
All computations in R are carried out by functions.
Defining a function allows you to avoid cutting and pasting code.
A simple function:
ms <- function(x) list(mean = mean(x), sd = sd(x))
ms(faithful$eruptions)
## $mean
## [1] 3.487783
## 
## $sd
## [1] 1.141371 
Generic Functions and Object-Oriented Programming 
R supports several mechanisms for object-oriented programming based on generic functions .
The most commonly used mechanism, called S3, allows a function to dispatch  to a method  based on the class  of its first argument.
plot is a very simple generic function.
plot
## function (x, y, ...) 
## UseMethod("plot")
## <bytecode: 0x562822b00028>
## <environment: namespace:base>
For example, the plot method for linear model fit objects produces a set of 4 plots commonly used to assess regression fits.
plot(fit)
 
Lazy and Non-Standard Evaluation 
An unusual but useful feature of R is that function arguments are not evaluated until their value is needed, so they may not be evaluated at all.
This is called lazy evaluation .
log("A")
## Error in log("A"): non-numeric argument to mathematical function
f <- function(x) NULL
f(log("A"))
## NULL
Functions can also capture the expression of the arguments they were called with:
f <- function(x) deparse(substitute(x))
f(a + b)
## [1] "a + b"
Together these features allow functions to evaluate their arguments in non-standard ways.
This is most commonly used to allow values for variables in arguments to be found in a provided data frame.
The with() function is a simple example:
mean(eruptions)
## Error in eval(expr, envir, enclos): object 'eruptions' not found
with(faithful, mean(eruptions))
## [1] 3.487783
Non-standard evaluation of this type is used extensively in the tidyverse.
 
 
The Tidyverse 
Tidyverse functions are designed to perform operations on data frames.
The dplyr package provides a grammar for data manipulation .
A simple example: computing means and standard deviations for the waiting times after the short (less than 3 minutes) and the long (3 minutes or more) eruptions:
library(dplyr)
tmp <- mutate(faithful,
              type = ifelse(eruptions < 3,
                            "short",
                            "long"))
head(tmp)
##   eruptions waiting  type
## 1     3.600      79  long
## 2     1.800      54 short
## 3     3.333      74  long
## 4     2.283      62 short
## 5     4.533      85  long
## 6     2.883      55 short
summarize(group_by(tmp, type),
          mean = mean(waiting),
          sd = sd(waiting))
## # A tibble: 2 × 3
##   type   mean    sd
##   <chr> <dbl> <dbl>
## 1 long   80.0  5.99
## 2 short  54.5  5.84
Tidyverse functions like to work with an enhanced form of data frame called a tibble .
 
A computation like this can be viewed as a transformation pipeline  consisting of three stages:
mutation (adding a new variable) 
grouping (splitting by type) 
summarizing within groups. 
 
Tidyverse code often uses the forward pipe operator  %>% provided by the magrittr package to express such a pipeline.
R 4.1.0 and later also provides a native pipe operator  |>.
The pipe operator allows a call f(x) to be written as
x |> f()
The left hand value is passed implicitly as the first argument to the function called on the right.
Using the pipe operator, the code for computing means and standard deviations can be written as
faithful |>
    mutate(type = ifelse(eruptions < 3, "short", "long")) |>
    group_by(type) |>
    summarize(mean = mean(waiting),
              sd = sd(waiting))
## # A tibble: 2 × 3
##   type   mean    sd
##   <chr> <dbl> <dbl>
## 1 long   80.0  5.99
## 2 short  54.5  5.84
There are trade-offs:
Manipulation pipelines expressed this way are often more compact than ones using intermediate variables and/or nested calls.
With pipe notation there is no need to come up with intermediate variable names.
Pipe notation obscures the function calls that are actually happening and this can make debugging harder.
 
 
Contrast to Point-and-Click Interfaces 
Even simple tasks require learning some of the R language.
Once you can do simple tasks, you have learned some of the R language.
More complicated tasks become easier.
Even very complicated tasks become possible.
 
 
R and Reproducibility 
Analyses in R are carried out by running code describing the tasks to perform.
This code can be
audited to make sure the analysis is right;
replayed to make sure the results are reproducible;
reused after changes in the data or on new data.
 
Literate data analysis  tools like Rmarkdown provide support for this.
 
Finding Out More 
Getting Help on Functions 
help(mean) will show help for the function mean.
This can be abbreviated as ?mean.
 
 
Some R Introductions and Tutorials 
 
Introductions to the Tidyverse 
 
 
Interactive Tutorial 
An interactive learnr tutorial for these notes is available.
You can run the tutorial with
STAT4580::runTutorial("Rintro")
You can install the current version of the STAT4580 package with
remotes::install_gitlab("luke-tierney/STAT4580")
You may need to install the remotes package from CRAN first.
 
Exercises 
Compute the mean of the numbers 1, 3, 5, 8. 
 
What is the mean of the eruptions variable in the faithful data frame? 
 
Find the average of the first 50 eruption durations in the faithful data frame. 
 
Use the median function to modify the pipe example in the tidyverse section  to include medians. 
 
 
---
title: "A Brief Overview of R"
output:
  html_document:
    toc: yes
    code_folding: show
    code_download: true
---

```{r setup, include = FALSE}
source(here::here("setup.R"))
knitr::opts_chunk$set(collapse = TRUE,
                      fig.height = 5, fig.width = 6, fig.align = "center")
```
<link rel="stylesheet" href="stat4580.css" type="text/css" />


## Background

R is a language, or an environment, for data analysis and data visualization.

R is derived from the [_S_
language](https://en.wikipedia.org/wiki/S_(programming_language))
developed at AT&T Bell Laboratories.

R was originally developed for teaching at the University of Auckland,
New Zealand, by Ross Ihaka and Robert Gentleman.

R is now maintained by an international group of about 20
statisticians and computer scientists.

A great strength of R is the large number of extension packages that
have been developed.

The number available on [CRAN](https://cran.r-project.org) is now over
19,000.


## Basic Usage

Interactive R uses a _command line interface_ (CLI).

The interface runs a _read-evaluate-print loop_ (REPL).

A simple interaction with the R interpreter:

```{r, prompt = TRUE, comment = ""}
1 + 2
```

Values can be assigned to variables using a left arrow `<-` combination:

```{r, prompt = TRUE, comment = ""}
x <- c(1, 3, 5)
x
```

<div class = "alert">
The `=` sign can also be used for assignment, but `<-` is recommended.
</div>

Basic arithmetic operations work element-wise on vectors:

```{r, prompt = TRUE, comment = ""}
x + x
```

Scalars are _recycled_ to the length of the longer operand:

```{r, prompt = TRUE, comment = ""}
x + 1
```

```{r, prompt = TRUE, comment = ""}
2 * x
```
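
Recycling is not limited to scalars. As a small sketch (the vectors `long_vec` and `short_vec` are made up for illustration), a shorter vector is repeated until it matches the length of the longer one:

```{r}
long_vec <- c(1, 2, 3, 4)
short_vec <- c(10, 20)
long_vec + short_vec    # short_vec is reused as 10, 20, 10, 20
```

R gives a warning when the longer length is not a multiple of the shorter one.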

Some ways to create new vectors:

```{r, prompt = TRUE, comment = ""}
c(1, 2, 3)
```

```{r, prompt = TRUE, comment = ""}
c("a", "b", "c")
```

```{r, prompt = TRUE, comment = ""}
1 : 3
```
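
A few other base R functions are often used to build vectors. This is only a sampling, chosen here for illustration:

```{r}
seq(2, 10, by = 2)             # evenly spaced values: 2 4 6 8 10
seq_len(4)                     # the integers 1 2 3 4
rep(c("a", "b"), times = 2)    # repeat a vector: "a" "b" "a" "b"
```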

These examples show a _prompt_ as you would see in the interpreter.

Usually Rmarkdown documents show code and results like this:

```{r}
2 * x
```

This makes it easier to copy code for pasting it into another document
or the R console.


## Data Frames

Data sets in R are often organized in _named lists_ of variables
called _data frames_.

The value of the variable `faithful` is a data frame with two
variables recorded for eruptions of the _Old Faithful_ geyser in Yellowstone
National Park:

* `eruptions`:  Eruption duration (minutes)
* `waiting`:  Waiting time to next eruption (minutes)

`head()` shows the first 6 rows:

```{r}
head(faithful)
```
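
A data frame can also be built directly from vectors. The sketch below uses a hypothetical data frame `geysers` with made-up values, just to show that a data frame is a named list of equal-length variables:

```{r}
# Made-up values for illustration only
geysers <- data.frame(name = c("A", "B", "C"),
                      height = c(30, 45, 12))
is.list(geysers)    # a data frame is a list of variables
names(geysers)      # the variable names
nrow(geysers)       # the number of rows (observations)
str(geysers)        # a compact description of the structure
```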


## A Simple Scatter Plot

```{r geyser, eval = FALSE}
with(faithful,
     plot(eruptions, waiting,
          xlab = "Eruption time (min)",
          ylab = "Waiting time to next eruption (min)"))
```

<!-- ## nolint start -->
```{r eval = TRUE, echo = FALSE, fig.height = 4}
op <- par(mar = c(4, 4, 0.1, 0.1))
<<geyser>>
par(op)
```
<!-- ## nolint end -->

Several graphics systems are available for R.

`plot()` is part of _base graphics_.


## Fitting a Linear Regression
```{r}
fit <- with(faithful, lm(waiting ~ eruptions))
fit
```

You can also use the `data` argument to `lm()`:

```{r}
fit <- lm(waiting ~ eruptions, data = faithful)
```

`coef()` extracts the coefficients:

```{r}
coef(fit)
```

`summary(fit)` provides more details:

```{r}
summary(fit)
```
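
A fitted model can also be used for prediction. As a minimal sketch, assuming we want predicted waiting times after eruptions lasting 2 and 4 minutes (the data frame `new_eruptions` is introduced here for illustration):

```{r}
# Same model as above, refit so this chunk can be run on its own
fit <- lm(waiting ~ eruptions, data = faithful)

# New data must use the same variable name as the model formula
new_eruptions <- data.frame(eruptions = c(2, 4))
pred <- predict(fit, newdata = new_eruptions)
pred
```

Each prediction is the intercept plus the slope times the eruption duration.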


## Adding the Regression Line to the Plot

Original plot:

<!-- ## nolint start -->
```{r, eval = FALSE}
<<geyser>>
```
<!-- ## nolint end -->
<!-- ## nolint start -->
```{r, eval = TRUE, echo = FALSE}
<<geyser>>
```
<!-- ## nolint end -->

With a regression line:

<!-- ## nolint start -->
```{r geyser-with-line, eval = FALSE}
<<geyser>>
abline(coef(fit), col = "red", lwd = 3)
```
<!-- ## nolint end -->

```{r geyser-with-line, eval = TRUE, echo = FALSE}
```


## Packages and Package Libraries

Extension code and data sets are often made available in _packages_.

Packages are stored in folders or directories as collections called _libraries_.

`.libPaths()` will show you the libraries your R process will search.

`search()` shows what packages are attached to the global search path.

The `library()` function is used to find packages in the libraries and
attach them to the search path.

The expression `pkg::var` gets the value of variable `var` from
package `pkg` without attaching `pkg`.

You can install packages using the `install.packages` function or
the **Install Packages** item in the RStudio **Tools** menu.

By default packages are installed from [CRAN](https://cran.r-project.org).

It is also possible to use functions in the `remotes` package to
install packages hosted on [GitHub](https://github.com/) or
[GitLab](https://about.gitlab.com/).
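
A short sketch of these ideas, using the `stats` package that comes with R and assuming `ggplot2` as the example of a package that may or may not be installed:

```{r}
# Use a function from the stats package without attaching the package
stats::sd(c(1, 3, 5))    # 2

# Check whether a package is installed before attaching it
if (requireNamespace("ggplot2", quietly = TRUE)) {
    library(ggplot2)
} else {
    message("ggplot2 is not installed; install.packages(\"ggplot2\") would fetch it from CRAN")
}
```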


## A Useful Package: `ggplot2`

The `ggplot2` package provides a powerful alternative to the base
graphics system.

The geyser example can be done in `ggplot2` like this:

```{r geyser-ggplot, eval = FALSE, echo = TRUE}
library(ggplot2)
ggplot(data = faithful) +
    geom_point(mapping = aes(x = eruptions, y = waiting)) +
    geom_smooth(mapping = aes(x = eruptions, y = waiting),
                method = "lm", se = FALSE)
```
```{r geyser-ggplot, eval = TRUE, echo = FALSE, message = FALSE}
```

`ggplot2` is part of a useful collection of packages called the
[_tidyverse_](https://www.tidyverse.org/).

<div class="alert">
`ggplot` is based on the _Grammar of Graphics_.

* Plots are composed of _geometric objects_ (`geoms`).

* Variables are _mapped_ to _aesthetic features_ of geometric objects.

A basic template for creating a plot with `ggplot`:

<!-- # nolint start -->
```{r, eval = FALSE}
ggplot(data = <DATA>) + <GEOM>(mapping = aes(<MAPPINGS>))
```
<!-- # nolint end -->
</div>
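
As one way to fill in the template, the sketch below uses `faithful` for the data, `geom_histogram` for the geom, and maps `eruptions` to the `x` aesthetic. The name `p_hist` and the choice of 20 bins are arbitrary.

```{r, message = FALSE}
library(ggplot2)
# <DATA> is faithful, <GEOM> is geom_histogram, <MAPPINGS> is x = eruptions
p_hist <- ggplot(data = faithful) +
    geom_histogram(mapping = aes(x = eruptions), bins = 20)
p_hist
```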


## Subsetting and Extracting Components

The _subset operator_ `[` can be used to extract elements by index:

```{r}
month.abb
month.abb[1 : 3]
```

Subsetting can also be based on a logical expression that returns
`TRUE` or `FALSE` for each element:

```{r}
(starts_with_J <- substr(month.abb, 1, 1) == "J")
month.abb[starts_with_J]
```

<div class = "alert">
The value of an assignment operation is the right hand side value.

* Ordinarily this value is not printed.
* Placing the assignment expression in parentheses causes it to be
  printed.
</div>

Individual elements can be extracted using the _element operator_ `[[`:

```{r}
month.abb[[3]]
```

Components of named lists, like _data frames_, can be extracted with
the `$` operator:

```{r}
names(faithful)
head(faithful, 4)
head(faithful$eruptions, 4)
```

The element operator can be used as well:

```{r}
head(faithful[["eruptions"]], 4)
```
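
The same ideas carry over to the rows of a data frame. A small sketch (the name `short` is introduced here for illustration):

```{r}
# Rows with eruptions shorter than 3 minutes; the empty column
# position after the comma keeps all columns
short <- faithful[faithful$eruptions < 3, ]
head(short, 3)

# Negative indices drop elements instead of selecting them
month.abb[-(1:9)]    # "Oct" "Nov" "Dec"
```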


## Functions


### Simple Functions

All computations in R are carried out by functions.

Defining a function allows you to avoid cutting and pasting code.

A simple function:

```{r}
ms <- function(x) list(mean = mean(x), sd = sd(x))
ms(faithful$eruptions)
```
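
Functions can have several arguments, and arguments can have default values. A minimal sketch, where `standardize` and its `na.rm` argument are hypothetical names chosen for this example:

```{r}
# Center and scale a numeric vector; na.rm defaults to FALSE
standardize <- function(x, na.rm = FALSE) {
    (x - mean(x, na.rm = na.rm)) / sd(x, na.rm = na.rm)
}
z <- standardize(faithful$eruptions)
c(mean = mean(z), sd = sd(z))    # mean is essentially 0, sd is 1
```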


### Generic Functions and Object-Oriented Programming

R supports several mechanisms for object-oriented programming based on
_generic functions_.

The most commonly used mechanism, called S3, allows a function to
_dispatch_ to a _method_ based on the _class_ of its first argument.

`plot` is a very simple generic function.

```{r}
plot
```

For example, the `plot` method for linear model fit objects produces a
set of 4 plots commonly used to assess regression fits.

```{r geyser_lm_fit, eval = FALSE}
plot(fit)
```
```{r, echo = FALSE, fig.height = 6.5, fig.width = 7}
op <- par(mfrow = c(2, 2))
<<geyser_lm_fit>>
par(op)
```
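
You can define your own generic functions and methods in the same way. The sketch below uses a hypothetical generic called `describe`; S3 finds a method by pasting the generic name and the class name together.

```{r}
# A hypothetical generic with a default method and a method for lm objects
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) {
    cat("An object of class", class(x)[1], "\n")
}
describe.lm <- function(x, ...) {
    cat("A linear model with", length(coef(x)), "coefficients\n")
}

describe(1:3)                                        # uses describe.default
describe(lm(waiting ~ eruptions, data = faithful))   # uses describe.lm
```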


### Lazy and Non-Standard Evaluation

An unusual but useful feature of R is that function arguments are not
evaluated until their value is needed, so they may not be evaluated at
all.

This is called _lazy evaluation_.

```{r, error = TRUE}
log("A")
f <- function(x) NULL
f(log("A"))
```

Functions can also capture the expression of the arguments they were
called with:

```{r}
f <- function(x) deparse(substitute(x))
f(a + b)
```

Together these features allow functions to evaluate their arguments in
_non-standard_ ways.

This is most commonly used to allow values for variables in arguments
to be found in a provided data frame.

The `with()` function is a simple example:

```{r, error = TRUE}
mean(eruptions)
```

```{r}
with(faithful, mean(eruptions))
```

Non-standard evaluation of this type is used extensively in the _tidyverse_.
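
To see how the pieces fit together, here is a minimal sketch of a `with()`-like function. The name `my_with` is hypothetical, and the real `with()` is more general.

```{r}
my_with <- function(data, expr) {
    e <- substitute(expr)            # capture the unevaluated expression
    eval(e, data, parent.frame())    # look up variables in data first
}
my_with(faithful, mean(eruptions))
```

Because `expr` is never evaluated in the usual way, it does not matter that there is no variable `eruptions` in the calling environment.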


## The Tidyverse

[_Tidyverse_](https://www.tidyverse.org/) functions are designed to
perform operations on data frames.

The `dplyr` package provides a _grammar for data manipulation_.

A simple example: computing means and standard deviations for the
waiting times after the short (less than 3 minutes) and the long (3
minutes or more) eruptions:

```{r, message = FALSE}
library(dplyr)
tmp <- mutate(faithful,
              type = ifelse(eruptions < 3,
                            "short",
                            "long"))
head(tmp)
```
```{r}
summarize(group_by(tmp, type),
          mean = mean(waiting),
          sd = sd(waiting))
```
<div class = "alert">
Tidyverse functions like to work with an enhanced form of data frame
called a _tibble_.
</div>

A computation like this can be viewed as a _transformation pipeline_
consisting of three stages:

* mutation (adding a new variable)
* grouping (splitting by `type`)
* summarizing within groups.
  

Tidyverse code often uses the _forward pipe operator_ `%>%` provided by
the `magrittr` package to express such a pipeline.

R 4.1.0 and later also provides a _native pipe operator_ `|>`.

The pipe operator allows a call `f(x)` to be written as

```{r, eval = FALSE}
x |> f()
```

The left hand value is passed implicitly as the first argument to the
function called on the right.
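
A small sketch with base R functions, assuming R 4.1.0 or later for the native pipe:

```{r}
# A nested call and its pipe equivalent give the same result
sum(sqrt(c(1, 4, 9)))              # 6
c(1, 4, 9) |> sqrt() |> sum()      # 6

# Additional arguments follow the implicitly passed first argument
c(3.14159, 2.71828) |> round(digits = 2)
```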

Using the pipe operator, the code for computing means and standard
deviations can be written as

```{r}
faithful |>
    mutate(type = ifelse(eruptions < 3, "short", "long")) |>
    group_by(type) |>
    summarize(mean = mean(waiting),
              sd = sd(waiting))
```

There are trade-offs:

* Manipulation pipelines expressed this way are often more compact
  than ones using intermediate variables and/or nested calls.

* With pipe notation there is no need to come up with intermediate
  variable names.

* Pipe notation obscures the function calls that are actually
  happening and this can make debugging harder.


## Contrast to Point-and-Click Interfaces

* Even simple tasks require learning some of the R language.

* Once you can do simple tasks, you have learned some of the R language.

* More complicated tasks become easier.

* Even very complicated tasks become possible.


## R and Reproducibility

Analyses in R are carried out by running code describing the tasks to
perform.

This code can be

* audited to make sure the analysis is right;

* replayed to make sure the results are reproducible;

* reused after changes in the data or on new data.

_Literate data analysis_ tools like Rmarkdown provide support for
this.
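
For code that uses random numbers, replaying an analysis exactly also requires starting the random number generator from a known state. A minimal sketch, where the seed value is arbitrary and chosen only for this illustration:

```{r}
set.seed(4580)              # arbitrary seed for illustration
first <- rnorm(3)
set.seed(4580)              # reset to the same state
second <- rnorm(3)
identical(first, second)    # TRUE: same seed, same draws
```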


## Finding Out More


### Getting Help on Functions

* `help(mean)` will show help for the function `mean`.
* This can be abbreviated as `?mean`
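
Two other base R functions can help when you do not remember a function's exact name or arguments. A brief sketch:

```{r}
args(sd)               # show the arguments of a function
apropos("^median")     # objects whose names start with "median"
# help.search("regression"), or ??regression, searches the help pages
```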


### Some R Introductions and Tutorials

* [An Introduction to R](https://cran.r-project.org/doc/manuals/R-intro.html)
  introduces the language and shows how to use R for
  statistical analysis and graphics.
* Another
  [introduction to R](http://zoonek2.free.fr/UNIX/48_R/all.html) by
  Vincent Zoonekynd.
* [Quick-R](https://www.statmethods.net/) web site related to *R
  in Action* book.
* [R For
  Beginners](https://cran.r-project.org/doc/contrib/Paradis-rdebuts_en.pdf).
* [TryR](https://www.pluralsight.com/search?q=R) at Codeschool.
* [swirl: Learn R in R](https://swirlstats.com/).
* [_Hands-On Programming with R_](https://rstudio-education.github.io/hopr/).
* [_R for Data Science_](https://r4ds.hadley.nz/).
* [Data Science Dojo YouTube
  tutorials](https://www.youtube.com/c/Datasciencedojo/playlists?view=50&sort=dd&shelf_id=2).
* [Tutorials at RStudio](https://education.rstudio.com/learn/).
* [R for the Rest of Us](https://rfortherestofus.com/).
* There are _many_ others.


### Introductions to the Tidyverse

  * Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund
    (2023), [_R for Data Science (2nd
    Edition)_](https://r4ds.hadley.nz/), O'Reilly. ([Book source on
    GitHub](https://github.com/hadley/r4ds))

  * [R Basics
    chapter](https://rafalab.dfci.harvard.edu/dsbook-part-1/R/R-basics.html)
    in Rafael A. Irizarry (2019), [Introduction to Data Science: _Data
    Analysis and Prediction Algorithms with
    R_](https://rafalab.dfci.harvard.edu/dsbook-part-1/), Chapman &
    Hall/CRC. ([Book source on
    GitHub](https://github.com/rafalab/dsbook-part-1))


### R Markdown Tutorials

* [R Markdown: The Definitive Guide](https://bookdown.org/yihui/rmarkdown/)
  by Yihui Xie is a book-length presentation.
* The [R Markdown Home Page](https://rmarkdown.rstudio.com) has a link
  to a [tutorial](https://rmarkdown.rstudio.com/lesson-1.html).


## Interactive Tutorial

An interactive [`learnr`](https://rstudio.github.io/learnr/) tutorial
for these notes is [available](`r WLNK("tutorials/Rintro.Rmd")`).

You can run the tutorial with

```{r, eval = FALSE}
STAT4580::runTutorial("Rintro")
```

You can install the current version of the `STAT4580` package with

```{r, eval = FALSE}
remotes::install_gitlab("luke-tierney/STAT4580")
```

You may need to install the `remotes` package from CRAN first.


## Exercises

1. Compute the mean of the numbers 1, 3, 5, 8.

<!--
The answer to Exercise 1 is closest to
* 4.25
  5.75
  3.75
  5.25
-->

2. What is the mean of the `eruptions` variable in the `faithful` data
   frame?

<!--
The answer to Exercise 2 is closest to
* 3.49
  3.35
  3.87
  3.16
-->

3. Find the average of the first 50 eruption durations in the `faithful`
   data frame.

<!--
The answer to Exercise 3 is closest to
* 3.30
  2.50
  3.13
  4.33
-->

4. Use the `median` function to modify the pipe example in the
   [tidyverse section](#the-tidyverse) to include medians.
