--- title: "Deep Learning Examples" author: "Luke Tierney" output: html_document --- ```{r, include = FALSE} knitr::opts_chunk$set(collapse = TRUE, message = FALSE) set.seed(1234) ``` These examples use the R package [`keras`](https://keras.rstudio.com/). This packages uses package [`tensorflow`](https://tensorflow.rstudio.com/) to make use of the [_TensorFlow_](https://www.tensorflow.org/) platrotm. ## California Air Pollution Data This is adapted from the [basic regression tutorial](https://keras.rstudio.com/articles/tutorial_basic_regression.html) at the `keras` package [web site](https://keras.rstudio.com). The data needs to have features scaled comparably. ```{r} data(calif.air.poll, package = "SemiPar") calif.scaled <- scale(calif.air.poll[-1]) calif.ozone <- calif.air.poll$ozone.level ``` This function fits a neural network with two densely connected hidden layers of `size` nodes with an _ReLU_ activation function to a subset of the data specified by `train`: ```{r} library(keras) ca.dnn <- function(train = TRUE, epochs = 100, size = 64) { X <- calif.scaled[train,] y <- calif.ozone[train] nx <- ncol(X) model <- keras_model_sequential() %>% layer_dense(units = size, activation = "relu", input_shape = nx) %>% layer_dense(units = size, activation = "relu") %>% layer_dense(units = 1) compile(model, loss = "mse", optimizer = optimizer_rmsprop(), metrics = list("mean_squared_error")) history <- fit(model, calif.scaled, calif.ozone, epochs = epochs, validation_split = 0.2, verbose = 0) attr(model, "history") <- history model } ``` The fit is obtained by ```{r} dnn.ca <- ca.dnn() ``` The print method provides a summary of the model characteristics: ```{r} op <- options(width = 60) dnn.ca options(op) ``` A diagnostic visualization: ```{r} library(ggplot2) plot(attr(dnn.ca, "history"), metrics = "mean_squared_error", smooth = FALSE) + coord_cartesian(ylim = c(0, 50)) ``` ```{r} ca.predict.keras <- function(fit, X) { X.scaled <- scale(X, attr(calif.scaled, "scaled:center"), attr(calif.scaled, "scaled:scale")) X.scaled <- X.scaled predict(fit, X.scaled) } ``` The fit surface: ```{r} np <- 50 dpg <- seq(-60, 100, len = np) ibh <- seq(200, 4000, len = np) gg <- expand.grid(daggett.pressure.gradient = dpg, inversion.base.height = ibh, inversion.base.temp = c(60, 80)) library(lattice) wireframe(ca.predict.keras(dnn.ca, gg) ~ daggett.pressure.gradient * inversion.base.height, group = inversion.base.temp, data = gg, auto.key = TRUE) wireframe(ca.predict.keras(dnn.ca, gg) ~ daggett.pressure.gradient * inversion.base.height | inversion.base.temp, data = gg) ``` 10-fold cross vaildation: ```{r} ca.dnn.mse <- function(fit, test = TRUE) { y <- calif.air.poll$ozone.level[test] yhat <- ca.predict.keras(fit, calif.air.poll[test, -1]) mean((y - yhat) ^ 2) } set.seed(54321) cvsplit <- function(n, k) { x <- seq_len(n) brk <- quantile(x, seq(0, 1, length.out = k + 1)) y <- sample(n, n) structure(split(y, cut(x, brk, include.lowest=TRUE)), names = NULL) } cv <- cvsplit(nrow(calif.air.poll), 10) system.time(mse_dnn <- sapply(cv, function(test) ca.dnn.mse(ca.dnn(-test), test))) mean(mse_dnn) sd(mse_dnn) / sqrt(length(mse_dnn)) ``` ## Recognizing Handwritten Digits The [`keras` package web page](https://keras.rstudio.com/) shows an example of fitting a two-layer DNN to the MNIST digits data. - The accuracy of the fit on the test data is about 98%. - The fit takes about 2 minutes to compute on a laptop. 
An example using a convolutional neural network (CNN) is also [available](https://keras.rstudio.com/articles/examples/mnist_cnn.html).
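The main change for a CNN is to keep the images as 28 x 28 x 1 arrays
so the convolutional layers can exploit their spatial structure.
Again as a sketch, not evaluated here, and with illustrative filter
counts, dropout rates, and epochs rather than the exact settings of
the linked example; the `mnist` object and the labels `y_train` and
`y_test` from the previous sketch are reused:

```{r, eval = FALSE}
## keep the spatial structure: 28 x 28 images with a single channel
x_train <- array_reshape(mnist$train$x,
                         c(nrow(mnist$train$x), 28, 28, 1)) / 255
x_test <- array_reshape(mnist$test$x,
                        c(nrow(mnist$test$x), 28, 28, 1)) / 255

## two convolutional layers and max pooling to extract image
## features, followed by a dense classifier
model <- keras_model_sequential() %>%
    layer_conv_2d(filters = 32, kernel_size = c(3, 3),
                  activation = "relu",
                  input_shape = c(28, 28, 1)) %>%
    layer_conv_2d(filters = 64, kernel_size = c(3, 3),
                  activation = "relu") %>%
    layer_max_pooling_2d(pool_size = c(2, 2)) %>%
    layer_dropout(rate = 0.25) %>%
    layer_flatten() %>%
    layer_dense(units = 128, activation = "relu") %>%
    layer_dropout(rate = 0.5) %>%
    layer_dense(units = 10, activation = "softmax")

compile(model,
        loss = "categorical_crossentropy",
        optimizer = optimizer_rmsprop(),
        metrics = c("accuracy"))

fit(model, x_train, y_train,
    epochs = 12, batch_size = 128,
    validation_split = 0.2, verbose = 0)

evaluate(model, x_test, y_test)
```

A CNN typically reaches somewhat higher test accuracy than a dense
network on MNIST, at the cost of a longer fit.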