General Issues

---
title: "Assignment 3"
author: "Fred Frog"
date: "`r Sys.Date()`"
output: html_document
---

General Comments

1. Life Expectancy Distribution by Continent

The subset of the data for years since 1990 can be extracted using the filter function from dplyr:

library(gapminder)
library(dplyr)
gap1990 <- filter(gapminder, year >= 1990)
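
As a quick check on the filter step, the sketch below confirms which survey years remain after filtering. It assumes the gapminder and dplyr packages are installed; `years_kept` is a name introduced here for illustration.

```r
library(gapminder)
library(dplyr)

# Keep only observations from 1990 onwards
gap1990 <- filter(gapminder, year >= 1990)

# gapminder records every fifth year from 1952 to 2007,
# so only a handful of survey years should remain
years_kept <- sort(unique(gap1990$year))
years_kept

# Number of rows (countries) in each remaining year
count(gap1990, year)
```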

A faceted display using ggplot and facet_wrap:

library(ggplot2)
ggplot(gap1990, aes(x = lifeExp)) + geom_density() + facet_wrap(~continent)

2. Boxplots of Life Expectancy by Continent

Boxplots for the same data:

ggplot(gap1990) + geom_boxplot(aes(x = continent, y = lifeExp))
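
The quantities each box displays can also be computed directly. The following is a sketch using group_by and summarise from dplyr; the summary column names (`q1`, `med`, `q3`, `n`) are chosen here for illustration.

```r
library(gapminder)
library(dplyr)

gap1990 <- filter(gapminder, year >= 1990)

# Quartiles of life expectancy per continent: the lower edge,
# centre line and upper edge of each box
box_stats <- gap1990 %>%
    group_by(continent) %>%
    summarise(q1 = quantile(lifeExp, 0.25),
              med = median(lifeExp),
              q3 = quantile(lifeExp, 0.75),
              n = n())
box_stats
```

The `n` column is a reminder that the boxes rest on very different numbers of observations.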

3. Ridgeline Plots of Life Expectancy

Density ridges for the 12 years show that overall life expectancy distributions have shifted upwards.

library(ggridges)  # provides geom_density_ridges
ggplot(gapminder) +
    geom_density_ridges(aes(x = lifeExp, y = year, group = year))
## Picking joint bandwidth of 3.88

The distribution shape has changed from skewed right in 1952 to skewed left in 2007. Adding lines at the medians emphasises this shift:

ggplot(gapminder) +
    geom_density_ridges(aes(x = lifeExp, y = year, group = year),
                        quantile_lines = TRUE, quantiles = 2)
## Picking joint bandwidth of 3.88
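
A rough numerical check of the change in skewness compares the mean and the median for each year: a mean above the median points to a right skew, a mean below it to a left skew. This is a heuristic sketch, not a formal skewness measure, and the column names are chosen here for illustration.

```r
library(gapminder)
library(dplyr)

# Mean minus median of life expectancy for each survey year;
# positive values suggest right skew, negative values left skew
skew_check <- gapminder %>%
    group_by(year) %>%
    summarise(mean_le = mean(lifeExp),
              median_le = median(lifeExp),
              diff = mean_le - median_le)
skew_check
```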

Separating the distributions by continent shows some striking differences:

ggplot(mutate(gapminder, continent = reorder(continent, -lifeExp))) +
    geom_density_ridges(aes(x = lifeExp, y = year,
                            group = interaction(year, continent),
                            fill = continent), scale = 1.3, alpha = 0.8)
## Picking joint bandwidth of 2.24

Life expectancy is highest among European countries, with a steady increase over the years and consistently low variability among countries. Variability in life expectancy among the Americas has decreased and overall levels have increased, but remain below those for Europe. Life expectancy among countries in Asia has improved overall, but variability among the countries remains substantially higher than among European countries. Variability among African countries has increased, with some at life expectancy levels comparable to the Americas but the bulk remaining quite a bit lower.
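
These comparisons of level and variability can be backed by summaries per continent and year. The sketch below uses the median for level and the interquartile range for spread, restricted to the first and last survey years; using the IQR as the variability measure is an assumption made here, not something taken from the plot.

```r
library(gapminder)
library(dplyr)

# Level (median) and spread (IQR) of life expectancy
# for each continent in 1952 and 2007
spread <- gapminder %>%
    filter(year %in% c(1952, 2007)) %>%
    group_by(continent, year) %>%
    summarise(median_le = median(lifeExp),
              iqr_le = IQR(lifeExp),
              .groups = "drop")
spread
```

Oceania contributes only two countries, so its spread figures carry little weight.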

4. Find a Better Visualization

The original:

Some issues:

A simple bar chart with a zero base line:

# Approval figures by president, with party and year
d <- data.frame(pres = c("Obama", "Carter", "Clinton",
                         "G.W. Bush", "Reagan", "G.H.W. Bush", "Trump"),
                appr = c(79, 78, 68, 65, 58, 56, 40),
                party = c("D", "D", "D", "R", "R", "R", "R"),
                year = c(2009, 1977, 1993, 2001, 1981, 1989, 2017))
d <- mutate(d, pres = reorder(pres, appr))
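
The reorder call turns pres into a factor whose levels are sorted by appr, which is what puts the bars in order of approval. A small base R sketch on three of the values from the data frame above; `pres` and `appr` here are standalone vectors used for illustration, not the columns of d.

```r
# Three of the presidents and ratings from the data frame above
pres <- c("Obama", "Reagan", "Trump")
appr <- c(79, 58, 40)

# Default factor levels are alphabetical
levels(factor(pres))

# reorder() sorts the levels by the accompanying numeric values,
# in increasing order
pres_ordered <- reorder(pres, appr)
levels(pres_ordered)
```

With coord_flip the first level is drawn at the bottom, so the highest rating ends up at the top of the chart.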

p <- ggplot(d, aes(x = pres, y = appr, fill = party)) +
     geom_col() + coord_flip()
p

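With the default palette, D is drawn in a reddish hue and R in a bluish one, the reverse of the usual party colors. The mapping can be set explicitly using scale_fill_manual:

p + scale_fill_manual(values = c(R = "red", D = "blue"))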

We can reduce the saturation and the value in the HSV color representation to obtain less intense colors; this is commonly used in red state/blue state maps:

myred <- hsv(0, 0.6, 0.8)
myblue <- hsv(2 / 3, 0.6, 0.8)
p + scale_fill_manual(values = c(R = myred, D = myblue)) 
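
To see what reducing saturation and value does, the RGB components of the muted colors can be compared with those of fully saturated red and blue. A base R sketch; `fullred` and `fullblue` are names introduced here for illustration.

```r
# Fully saturated, full-value red and blue
fullred <- hsv(0, 1, 1)
fullblue <- hsv(2 / 3, 1, 1)

# Muted versions: saturation 0.6, value 0.8
myred <- hsv(0, 0.6, 0.8)
myblue <- hsv(2 / 3, 0.6, 0.8)

# Compare the RGB components (0-255) of each color, one column per color
col2rgb(c(fullred = fullred, myred = myred,
          fullblue = fullblue, myblue = myblue))
```

Lowering the saturation moves a color toward gray, and lowering the value darkens it.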

Some enhancements:

p + scale_fill_manual(values = c(R = myred, D = myblue)) + theme_void() +
    geom_text(aes(y = 3, label = pres),
              size = 8, hjust = "left", color = "white") +
    geom_text(aes(y = appr - 3, label = appr),
              size = 8, hjust = "right", color = "white")

Some notes: