---
title: "Color Issues"
output:
  html_document:
    toc: yes
    code_folding: show
    code_download: true
---
```{r setup, include = FALSE}
source(here::here("setup.R"))
options(htmltools.dir.version = FALSE)
options(conflicts.policy = "depends.ok")
knitr::opts_chunk$set(collapse = TRUE,
fig.height = 5, fig.width = 6, fig.align = "center")
library(lattice)
library(tidyverse)
theme_set(theme_minimal() +
theme(text = element_text(size = 16)) +
theme(panel.border = element_rect(color = "grey30", fill = NA)))
set.seed(12345)
```
Color is very effective when used well.
But using color well is not easy.
Some of the issues:
* Perception depends on context.
* Simple color assignments may not separate equally well.
* Effectiveness may vary with the medium (screen, projector, print).
* Some people do not perceive the full spectrum of colors.
* Color differences may be lost in grey scale printing.
* Some colors have cultural significance.
* Cultural significance may vary among cultures and with time.
An internet "controversy" in 2015:
[The Dress](https://en.wikipedia.org/wiki/The_dress) (and a follow-up
[article](https://www.independent.co.uk/life-style/fashion/news/the-dress-actual-colour-brand-and-price-details-revealed-10074686.html))
## Color Spaces
### RGB and HSV Color Spaces
Computer monitors and projectors work in terms of red, green, and blue
light.
Amounts of red, green, and blue (and alpha level) are stored as integers
in the range 0 to 255 (8-bit bytes).
```{r}
cols <- c("red", "green", "blue", "yellow", "cyan", "magenta")
rgbcols <- col2rgb(cols)
colnames(rgbcols) <- cols  ## label the columns before printing
rgbcols
```
Colors are often encoded in _hexadecimal_ form (base 16).
```{r}
rgb(1, 0, 0) ## pure red
rgb(0, 0, 1) ## pure blue
```
```{r}
rgb(255, 0, 0, maxColorValue = 255)
rgb(0, 0, 255, maxColorValue = 255)
```
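Each pair of hexadecimal digits encodes one 8-bit channel. As a minimal sketch (the color `#FF8000` and the variable names are arbitrary illustrations, not taken from these notes), a hex string can be decoded by hand and checked against `col2rgb()`:

```{r}
## Decode a hexadecimal color string by hand: each pair of hex digits
## is one 8-bit channel (red, green, blue).
hex <- "#FF8000"
hex_pairs <- substring(hex, c(2, 4, 6), c(3, 5, 7))
channels <- strtoi(hex_pairs, base = 16L)
names(channels) <- c("red", "green", "blue")
channels
## The same values from col2rgb(), and the round trip back through rgb()
col2rgb(hex)[, 1]
hex_back <- rgb(channels[["red"]], channels[["green"]], channels[["blue"]],
                maxColorValue = 255)
hex_back
```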
Hue, saturation, value (HSV) is a simple transformation of RGB.
```{r}
rgb2hsv(rgbcols)
```
HSV is a little more convenient since it allows the hue to be
controlled separately.
But saturation and value attributes are not particularly useful for
specifying colors that work well perceptually.
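A small sketch of controlling hue separately with `hsv()`: six evenly spaced hues at full saturation and value should reproduce the `rainbow(6)` palette. The `substr()` call is only a guard, on the assumption that some R versions append alpha digits to the `rainbow()` output.

```{r}
## Six evenly spaced hues at full saturation and value
hue_steps <- seq(0, 5 / 6, length.out = 6)
hsv_cols <- hsv(h = hue_steps, s = 1, v = 1)
hsv_cols
## Compare with rainbow(6), ignoring any alpha digits
identical(hsv_cols, substr(rainbow(6), 1, 7))
```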
A color wheel of fully saturated colors:
```{r, class.source = "fold-hide"}
wheel <- function(col, radius = 1, ...)
pie(rep(1, length(col)),
col = col, radius = radius, ...)
wheel(rainbow(6))
```
Removing saturation:
```{r, class.source = "fold-hide"}
library(colorspace)
wheel(desaturate(rainbow(6)))
```
Fully saturated yellow is brighter than red, which is brighter than
blue.
### HCL Color Space
The _rainbow_ palette of the color wheel is often a default in
visualization systems.
A [blog post](https://eeecon.uibk.ac.at/~zeileis/news/endrainbow/)
illustrates why this is a bad idea.
The rainbow hues are evenly spaced in the color spectrum, but chroma
and luminance are not.
Luminance in particular is not monotone across the palette.
```{r, fig.width = 10, class.source = "fold-hide"}
rgb2hcl <- function(col) {
## ignores alpha
col <- RGB(t(col[1 : 3, ]) / 255)
col <- as(col, "polarLUV")
col <- t(col@coords[, 3 : 1, drop = FALSE])
rownames(col) <- tolower(rownames(col))
col
}
col2hcl <- function(col) rgb2hcl(col2rgb(col))
pal <- function(col, border = "light gray", ...) {
n <- length(col)
plot(0, 0, type = "n", xlim = c(0, 1), ylim = c(0, 1),
axes = FALSE, xlab = "", ylab = "", ...)
rect((0 : (n - 1)) / n, 0, (1 : n) / n, 1, col = col, border = border)
}
par(mfrow = c(1, 2))
pal(rainbow(6), main = "Saturated Rainbow")
pal(desaturate(rainbow(6)), main = "Desaturated")
```
```{r, class.source = "fold-hide"}
specplot(rainbow(6), lwd = 4)
```
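The lack of monotonicity can also be checked numerically. This is a sketch that assumes the `colorspace` functions `hex2RGB()` and `coords()` and reads the luminance as the `L` coordinate of the `polarLUV` space; the variable names are illustrative.

```{r}
library(colorspace)
## Luminance (L coordinate of polarLUV) of the six rainbow colors.
## substr() drops any alpha digits that some R versions append.
rain_hex <- substr(rainbow(6), 1, 7)
rain_lum <- coords(as(hex2RGB(rain_hex), "polarLUV"))[, "L"]
round(rain_lum)
## Signs of successive differences: a mix of +1 and -1 means that
## luminance goes up and down along the palette.
sign(diff(rain_lum))
```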
The hue, chroma, luminance (HCL) space allows separate control of:
* _Hue_, the color.
* _Chroma_, the amount of the color.
* _Luminance_, or perceived brightness.
HCL makes it easier to create perceptually uniform color palettes.
A palette with constant chroma, evenly spaced hues and evenly spaced
luminance values:
```{r, fig.width = 10, class.source = "fold-hide"}
rain6 <- hcl(seq(0, 360 * 5 / 6, len = 6), 50, seq(60, 80, len = 6))
par(mfrow = c(1, 2))
pal(rain6, main = "Uniform Rainbow")
pal(desaturate(rain6), main = "Desaturated")
```
```{r, class.source = "fold-hide"}
specplot(rain6, lwd = 4)
```
For a fully saturated red, varying only chroma to reduce the amount of
color:
```{r, class.source = "fold-hide"}
red_hcl <- list(h = 12.17395, c = 179.04076, l = 53.24059)
specplot(hcl(red_hcl$h, red_hcl$c * seq(0, 1, len = 10), red_hcl$l), lwd = 4)
```
For a given hue, not all combinations of chroma and luminance are
possible.
In particular, for low luminance values the available chroma range is
limited.
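One way to see the gamut limits, sketched with base R's `hcl()`: with `fixup = FALSE`, combinations that cannot be displayed are returned as `NA` rather than being clipped. The hue of 12 (roughly red) and the chroma grid are arbitrary choices for illustration.

```{r}
## With fixup = FALSE, hcl() returns NA for (h, c, l) combinations
## that fall outside the displayable RGB gamut.
chroma_steps <- seq(0, 150, by = 30)
dark_red <- hcl(h = 12, c = chroma_steps, l = 10, fixup = FALSE)
mid_red  <- hcl(h = 12, c = chroma_steps, l = 50, fixup = FALSE)
data.frame(chroma = chroma_steps, l10 = dark_red, l50 = mid_red)
```

At the low luminance the high-chroma entries come back as `NA`, while more of the chroma range is available at the mid luminance.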
The `ggplot` book contains this visualization of the HCL space.
* Hue is mapped to angle.
* Chroma is mapped to radius.
* Luminance is mapped to facets.
The origins with zero chroma are shades of grey.
```{r, out.width = 500, echo = FALSE}
knitr::include_graphics(IMG("ggplot2_hclspace.png"))
```
HCL is a transformation of the
[CIEluv](https://en.wikipedia.org/wiki/CIELUV) color space designed
for perceptual uniformity.
The definition of the luminance takes into account the light
sensitivity of a standard human observer at various wavelengths.
Light sensitivity for different wavelengths in daylight conditions
(photopic vision) and under dark-adapted conditions (scotopic vision):
```{r, echo = FALSE}
knitr::include_graphics(IMG("light-sensitivity.jpg"))
```
### Munsell Color Space
Another color space, similar to HCL, is the
[Munsell system](https://en.wikipedia.org/wiki/Munsell_color_system)
developed in the early 1900s.
This system uses a Hue, Value, Chroma encoding.
The [`munsell` package](https://github.com/cwickham/munsell) provides
an R interface and is used in `ggplot`.
Munsell specifications are of the form `"H V/C"`, such as
`5R 5/10`.
Possible hues are
```{r, message = FALSE}
library(munsell, exclude = "desaturate")
mnsl_hues()
```
`V` should be an integer between 0 and 10.
`C` should be an even integer less than 24, but not all combinations
are possible.
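As a rough sketch of how a specification breaks down, the string can be split into its hue, value, and chroma parts with base R, and `mnsl()` from the `munsell` package converts it to a hex color. The variable names are illustrative only.

```{r}
## Split a Munsell specification "H V/C" into its three parts
spec <- "5R 5/10"
spec_parts <- strsplit(spec, "[ /]")[[1]]
names(spec_parts) <- c("hue", "value", "chroma")
spec_parts
## Convert the specification to a hex color
spec_hex <- munsell::mnsl(spec)
spec_hex
```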
Adjusting colors in the value, chroma, and hue dimensions:
```{r munsell-blues, echo = FALSE}
my_blue <- "5PB 5/8"
plot_mnsl(c(
lighter(my_blue, 2), my_blue, darker(my_blue, 2),
munsell::desaturate(my_blue, 2), my_blue, saturate(my_blue, 2),
rygbp(my_blue, 2), my_blue, pbgyr(my_blue, 2)))
```
```{r munsell-blues, eval = FALSE}
```
Creating scales:
```{r, warning=FALSE, fig.width = 8, fig.height = 4}
plot_mnsl(sapply(0 : 6, darker, col = "5PB 7/4")) + facet_wrap(~ num, nrow = 1)
```
Examining available colors:
```{r, warning = FALSE}
hue_slice("5R")
```
```{r}
value_slice(5)
```
Complementary colors:
```{r, warning=FALSE, fig.width = 8, fig.height = 4}
complement_slice("5R")
```
## Opponent Process Theory
The _Opponent Process Model_ of vision says that the brain divides the
visual signal among three opposing contrast pairs:
* black and white;
* red and green;
* yellow and blue.
The black/white pair corresponds to luminance in HCL.
Hue and chroma in HCL span the two chromatic axes.
The luminance axis has higher resolution than the two chromatic axes.
The major form of color vision deficiency reflects an inability to
distinguish differences along the red/green axis.
Impairment along the yellow/blue axis does occur as well but is much
rarer.
## Contrast and Comparisons
Vision reacts to differences, not absolutes.
* Contrast is very important.
* Context is very important.
Small differences in shading or hue can be recognized when objects are
contiguous but are much harder to see when they are separated.

_Simultaneous brightness contrast_: a grey patch on a dark background
looks lighter than the same grey patch on a light background.
```{r, class.source = "fold-hide"}
plot(0, 0, type = "n", xlim = c(0, 1), ylim = c(0, 1),
axes = FALSE, xlab = "", ylab = "")
rect(0, 0, 0.5, 1, col = "lightgrey", border = NA)
rect(0.5, 0, 1, 1, col = "darkgrey", border = NA)
rect(0.2, 0.3, 0.3, 0.7, col = "grey", border = NA)
rect(0.7, 0.3, 0.8, 0.7, col = "grey", border = NA)
```
An example we saw earlier (figures not shown).
Some more are available
[here](https://blog.revolutionanalytics.com/2018/08/luminance-illusion.html),
including:
```{r, echo = FALSE}
knitr::include_graphics(IMG("luminanim.gif"))
```
```{r, eval = FALSE, echo = FALSE}
plot(0, 0, type = "n", xlim = c(0, 1), ylim = c(0, 1),
axes = FALSE, xlab = "", ylab = "")
rect(0, 0, 0.5, 1, col = hcl(0), border = NA)
rect(0.5, 0, 1, 1, col = hcl(120), border = NA)
pink1 <- rgb(1, 192 / 255, 220 / 255)
rect(0.2, 0.3, 0.3, 0.7, col = pink1, border = NA)
rect(0.7, 0.3, 0.8, 0.7, col = pink1, border = NA)
```
Using luminance or grey scale alone does not work well for encoding
categorical variables against a key.
Grey scale can be effective for showing continuous transitions in
pseudo-color images.
```{r, class.source = "fold-hide"}
filled.contour(volcano, color.palette = grey.colors)
```
Grey scale is less effective for segmented maps, or choropleth maps;
only a few levels can be accurately decoded.
## Interactions with Size, Background and Proximity
```{r, echo = FALSE, eval = FALSE}
library(latticeExtra)
library(mapproj)
data(USCancerRates)
mapplot(rownames(USCancerRates) ~ log(rate.male), data = USCancerRates,
##colramp = grey.colors,
border = NA,
map = map("county", plot = FALSE, fill = TRUE,
projection = "mercator"))
```
For small items more contrast and more saturated colors are needed:
```{r, fig.height = 7, class.source = "fold-hide"}
x <- runif(6, 0.1, 0.9)
y <- runif(6, 0.1, 0.9)
cols <- c("red", "green", "blue", "yellow", "cyan", "magenta")
f <- function(size = 1, black = FALSE) {
plot(x, y, type = "n", xlim = c(0, 1), ylim = c(0, 1))
if (black) rect(0, 0, 1, 1, col = "black")
text(x, y, cols, col = cols, cex = size)
}
opar <- par(mfrow = c(2, 2))
f(1)
f(4)
f(1, TRUE)
f(4, TRUE)
par(opar)
```
Variations in luminance are particularly helpful for seeing fine
structure, such as small text or small symbols:
```{r, class.source = "fold-hide"}
plot(0, type = "n", xlim = c(0, 1), ylim = c(0, 1),
axes = FALSE, xlab = "", ylab = "")
rect(0, 0, 1, 1, col = hcl(0)) ## defaults: c = 35, l = 85
qbf <- "The quick brown fox jumps ..."
text(0.5, 0.3, label = qbf, col = hcl(180)) ## hue
text(0.5, 0.5, label = qbf, col = hcl(0, c = 70)) ## chroma
text(0.5, 0.7, label = qbf, col = hcl(0, l = 50)) ## luminance
```
Chrominance (hue and chroma) differences alone are not sufficient for
small items.
Ware recommends a luminance contrast of at least 3:1 for small text;
10:1 is preferable.
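Ware's exact definition of luminance contrast is not reproduced in these notes. As an assumption, the sketch below uses the WCAG 2 contrast ratio based on sRGB relative luminance as a stand-in; the helper names `rel_lum()` and `contrast_ratio_wcag()` are hypothetical. It compares the three text colors from the chunk above against the same background.

```{r}
## Relative luminance of sRGB colors (WCAG 2 definition; an assumed
## stand-in for the luminance contrast Ware refers to).
rel_lum <- function(col) {
    s <- col2rgb(col) / 255
    lin <- ifelse(s <= 0.03928, s / 12.92, ((s + 0.055) / 1.055)^2.4)
    drop(c(0.2126, 0.7152, 0.0722) %*% lin)
}
## Contrast ratio of two colors: lighter over darker, so always >= 1
contrast_ratio_wcag <- function(a, b) {
    la <- rel_lum(a)
    lb <- rel_lum(b)
    (pmax(la, lb) + 0.05) / (pmin(la, lb) + 0.05)
}
contrast_ratio_wcag("black", "white")  ## largest possible ratio
bg_col <- hcl(0)                       ## background used above
ratios <- c(hue = contrast_ratio_wcag(hcl(180), bg_col),
            chroma = contrast_ratio_wcag(hcl(0, c = 70), bg_col),
            luminance = contrast_ratio_wcag(hcl(0, l = 50), bg_col))
round(ratios, 2)
```

Only the text that differs in luminance gets a ratio well above 1, which matches how much easier it is to read.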
Small areas also need variation in more than hue:
```{r, echo = FALSE}
plot(0, 0, type = "n", xlim = c(0, 1), ylim = c(0, 1),
axes = FALSE, xlab = "", ylab = "")
pal0 <- function(col, bottom = 0, top = 1) {
n <- length(col)
rect((0 : (n - 1)) / n, bottom,
(1 : n) / n, top, col = col, border = "lightgrey")
}
pal0(rainbow_hcl(8), 0, 0.2)
pal0(rainbow_hcl(8), 0.3, 0.31)
pal0(rainbow(8), 0.5, 0.7)
pal0(rainbow(8), 0.8, 0.81)
```
Contrasting borders can help for larger areas with similar luminance:
* outlines on text;
* borders on symbols;
* borders on regions, e.g. in [maps](http://colorbrewer2.org).
```{r, fig.width = 10, echo = FALSE}
N <- nrow(trees)
op <- palette(rainbow(N, end = 0.9))
f <- function(fg, main = "")
with(trees,
symbols(Height, Volume, circles = Girth / 16,
inches = FALSE, bg = 1 : N,
fg = fg, main = main))
par(mfrow = c(1, 2))
f(NA, "No Borders")
f("gray30", "Grey Borders")
palette(op)
```
## Color Specification in R
```{r, include = FALSE}
ncols <- length(colors())
```
A large number of _named colors_ are available (currently `r ncols`).
Some examples:
```{r}
col2rgb("red")
col2rgb("forestgreen")
col2rgb("deepskyblue")
col2rgb("firebrick")
```
These will show some details:
```{r, eval = FALSE}
colors()
demo(colors)
```
The available named colors follow a widely used
[standard](https://www.w3schools.com/colors/colors_x11.asp).
These colors include the 140 [_web
colors_](https://htmlcolorcodes.com/color-names/) supported on modern
browsers.
Individual colors can also be specified using `rgb()` or `hcl()` or as
hexadecimal specifications.
```{r}
library(colorspace)
hex2RGB("#FF0000")
```
Using color spaces:
```{r}
rgb(1, 0, 0)
rgb(255, 0, 0, max = 255)
rgb2hsv(col2rgb("red"))
```
Converting to HCL:
```{r}
rgb2hcl <- function(col) {
## ignores alpha
col <- RGB(t(col[1 : 3, ]) / 255)
col <- as(col, "polarLUV")
col <- t(col@coords[, 3 : 1, drop = FALSE])
rownames(col) <- tolower(rownames(col))
col
}
col2hcl <- function(col) rgb2hcl(col2rgb(col))
col2hcl("red")
col2hcl("green")
col2hcl("blue")
col2hcl("yellow")
col2hcl("cyan")
col2hcl("magenta")
hcl(12.17, 179.04, 53.24)
```
Color pickers can help:
* Google search: https://www.google.com/search?q=color+picker
* `colourPicker()` in the `colourpicker` package.
When a set of colors is needed to encode variable values it is usually
best to use a suitable _palette_.
## Color Palettes
_Color palettes_ are collections of colors that work well together.
It is useful to distinguish three kinds of palettes:
* qualitative/categorical palettes;
* sequential palettes;
* diverging palettes.
Tools for selecting palettes include:
* [ColorBrewer](http://colorbrewer2.org); available in the
`RColorBrewer` package.
* [HCL Wizard](http://hclwizard.org/); also available as `hclwizard`
in the `colorspace` package.
A
[blog post](http://flowingdata.com/2018/04/12/visualization-color-picker-based-on-perception-research/)
describes some further options.
Some [current US government
work](https://designsystem.digital.gov/components/colors/) on color
palettes; [more extensive
notes](https://xdgov.github.io/data-design-standards/components/colors)
and [code](https://github.com/uswds/uswds).
R color palette functions:
* `rainbow()`
* `heat.colors()`
* `terrain.colors()`
* `topo.colors()`
* `cm.colors()`
* `grey.colors()`
* `gray.colors()`
These all take the number of colors as an argument, as well as some
additional optional arguments.
The `hcl.colors()` function provides access to the palettes defined in
the `colorspace` package.
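A brief sketch of this interface, assuming R 3.6.0 or later where `hcl.colors()` and `hcl.pals()` are available in `grDevices`; the palette name `"Blue-Red"` is just one example.

```{r}
## Names of some of the available diverging palettes
head(hcl.pals("diverging"))
## Seven colors from one of them
blue_red7 <- hcl.colors(7, palette = "Blue-Red")
blue_red7
```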
`colorRampPalette()` can be used to create a _palette function_ that
interpolates between a set of colors using
* RGB space or Lab (similar to HCL) space;
* linear or spline interpolation.
```{r}
rwb <- colorRampPalette(
c("red", "white", "blue"))
rwb(5)
```
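The interpolation space is chosen with the `space` argument, which `colorRampPalette()` passes on to `colorRamp()`. A sketch comparing the two spaces for the same end points (the names `rwb_rgb` and `rwb_lab` are illustrative):

```{r}
## Same end points, interpolated in RGB space and in Lab space
rwb_rgb <- colorRampPalette(c("red", "white", "blue"), space = "rgb")
rwb_lab <- colorRampPalette(c("red", "white", "blue"), space = "Lab")
rbind(rgb = rwb_rgb(5), Lab = rwb_lab(5))
```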
```{r palette-function-example, eval = FALSE}
filled.contour(volcano,
color.palette = rwb,
asp = 1)
```
```{r palette-function-example, echo = FALSE}
```
With more perceptually comparable extremes (from the Blue-Red palette
of HCL Wizard):
```{r palette-function-muted1, eval = FALSE}
rwb1 <- colorRampPalette(
c("#8E063B", "white", "#023FA5"))
filled.contour(volcano,
color.palette = rwb1,
asp = 1)
```
```{r palette-function-muted1, echo = FALSE}
```
An alternative uses the `muted` function from package `scales`:
```{r palette-function-muted2, eval = FALSE}
rwb2 <- colorRampPalette(
c(scales::muted("red"),
"white",
scales::muted("blue")))
filled.contour(volcano,
color.palette = rwb2,
asp = 1)
```
```{r palette-function-muted2, echo = FALSE}
```
Most base and `lattice` functions allow a vector of colors to be
specified.
Some, like `filled.contour()` and `levelplot()` allow a palette function
to be provided.
`ggplot` provides a framework for specifying palette functions to use
with `scale_color_xyz()` and `scale_fill_xyz()` functions.
Packages like `colorspace` and `viridis` provide additional
`scale_color_xyz()` and `scale_fill_xyz()` functions.
## RColorBrewer Palettes
The available palettes:
```{r brewer-palletes, eval = FALSE}
library(RColorBrewer)
display.brewer.all()
```
Palettes in the first group are _sequential_.
The second group are _qualitative_.
The third group are _diverging_.
```{r brewer-palletes, echo = FALSE, fig.width = 8, fig.height = 8}
```
The `"Blues"` palette:
```{r blues-palette, eval = FALSE}
display.brewer.pal(9, "Blues")
```
```{r blues-palette, echo = FALSE}
```
As RGB values:
```{r}
brewer.pal(9, "Blues")
```
The palettes are limited to a maximum number of levels.
To obtain more levels you can interpolate.
```{r}
brewer.pal(10, "Blues")
pbrbl <- colorRampPalette(brewer.pal(9, "Blues"), interpolate = "spline")
pbrbl
pbrbl(10)
```
## Colorspace Palettes
The `colorspace` package provides a wide range of pre-defined palettes:
```{r, fig.width = 10}
library(colorspace)
hcl_palettes(plot = TRUE)
```
A particular number of colors from one of these palettes can be
obtained with
```{r}
qualitative_hcl(4, palette = "Dark 3")
```
The functions `sequential_hcl()` and `diverging_hcl()` are analogous.
For use with `ggplot2` the package provides scale functions like
`scale_fill_discrete_qualitative()` and
`scale_color_continuous_sequential()`.
A [package
vignette](https://cran.r-project.org/package=colorspace/vignettes/colorspace.html)
provides more details and background.
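A minimal sketch of one of these scale functions in use. The `mpg` data set from `ggplot2` and the object name `p_cs` are chosen only for illustration.

```{r}
library(ggplot2)
library(colorspace)
## Qualitative HCL palette applied to a discrete color aesthetic
p_cs <- ggplot(mpg, aes(displ, hwy, color = drv)) +
    geom_point() +
    scale_color_discrete_qualitative(palette = "Dark 3")
p_cs
```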
## Viridis Palettes
```{r, message = FALSE, echo = FALSE}
n_col <- 128
img <- function(obj, nam) {
image(1:length(obj), 1, as.matrix(1:length(obj)), col=obj,
main = nam, ylab = "", xaxt = "n", yaxt = "n", bty = "n")
}
library(viridis)
par(mfrow=c(8, 1), mar=rep(1, 4))
img(rev(viridis(n_col)), "viridis")
img(rev(magma(n_col)), "magma")
img(rev(plasma(n_col)), "plasma")
img(rev(inferno(n_col)), "inferno")
img(rev(cividis(n_col)), "cividis")
img(rev(mako(n_col)), "mako")
img(rev(rocket(n_col)), "rocket")
img(rev(turbo(n_col)), "turbo")
```
These are provided in package `viridisLite`.
Palette functions are `viridis()`, `mako()`, etc.
They are also available via the `hcl.colors()` function.
For use in `ggplot` they can be specified in the viridis color scale
functions.
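A short sketch of the three routes mentioned above. The `faithfuld` data set from `ggplot2`, the `"magma"` option, and the object names are illustrative choices; the colors from `hcl.colors()` may differ slightly from those of the `viridisLite` functions.

```{r}
library(ggplot2)
## Palette function from viridisLite
vir5 <- viridisLite::viridis(5)
vir5
## The palette of the same name from hcl.colors()
vir5_hcl <- hcl.colors(5, palette = "viridis")
vir5_hcl
## A viridis color scale in ggplot
p_vir <- ggplot(faithfuld, aes(waiting, eruptions, fill = density)) +
    geom_raster() +
    scale_fill_viridis_c(option = "magma")
p_vir
```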
## Palettes in R Graphics
```{r, echo = FALSE}
if (! file.exists("Playfair.dat"))
download.file("http://www.stat.uiowa.edu/~luke/data/Playfair",
"Playfair.dat")
Playfair <- read.table("Playfair.dat")
Playfair$city <- rownames(Playfair)
rownames(Playfair) <- NULL
```
`ggplot` uses `scale_color_xyz()` or `scale_fill_xyz()`.
For discrete scales the choices for `xyz` include
* `hue` varies the hue (default for unordered factors);
* `grey` uses grey scale;
* `brewer` uses ColorBrewer palettes;
* `manual` allows explicit specification;
* `viridis_d` for an alternative palette family (default for ordered factors).
For continuous scales the choices for `xyz` include
* `gradient` interpolates between two colors, low and high;
* `gradient2` interpolates between three colors, low, medium, high;
* `gradientn` interpolates between a vector of colors;
* `distiller` for ColorBrewer palettes;
* `viridis_c` for an alternative palette family.
Others are available in packages such as `colorspace`.
The default qualitative and sequential discrete palettes:
```{r, fig.width = 10, fig.height = 4, class.source = "fold-hide"}
library(gapminder)
gap_2007 <- filter(gapminder, year == 2007) %>% top_n(20, pop)
p <- mutate(gap_2007, country = reorder(country, pop)) %>%
ggplot(aes(x = gdpPercap, y = lifeExp, fill = continent)) +
scale_size_area(max_size = 10) +
scale_x_log10() +
geom_point(size = 4, shape = 21) +
guides(fill = guide_legend(override.aes = list(size = 4)))
p1 <- p + ggtitle("Hue")
p2 <- p + scale_fill_viridis_d() + ggtitle("Viridis")
library(patchwork)
p1 + p2
```
Discrete examples for `brewer`, `colorspace` and `manual`:
```{r, fig.width = 9, fig.height = 6, class.source = "fold-hide"}
p1 <- p + scale_fill_brewer(palette = "Set1") +
ggtitle("Brewer Set1")
p2 <- p + scale_fill_brewer(palette = "Set2") +
ggtitle("Brewer Set2")
p3 <- p + scale_fill_discrete_qualitative("Dark 3") +
ggtitle("Colorspace Dark 3")
p4 <- p + scale_fill_manual(values = c(Africa = "red", Asia = "blue",
Americas = "green", Europe = "grey")) +
ggtitle("Manual")
(p1 + p2) / (p3 + p4)
```
The default for continuous scales is `gradient` from a dark blue to a
light blue:
```{r, class.source = "fold-hide"}
V <- data.frame(x = rep(seq_len(nrow(volcano)), ncol(volcano)),
y = rep(seq_len(ncol(volcano)), each = nrow(volcano)),
z = as.vector(volcano))
p <- ggplot(V, aes(x, y, fill = z)) + geom_raster() + coord_fixed()
p
```
Some alternatives:
```{r alt-continuous, echo = FALSE, fig.width = 12, fig.height = 9}
```
```{r alt-continuous, eval = FALSE, class.source = "fold-hide"}
p1 <- p + scale_fill_gradient2(
low = "red", mid = "white", high = "blue",
midpoint = median(volcano)) +
ggtitle("Red-White-Blue Gradient")
p2 <- p + scale_fill_viridis_c() +
ggtitle("Viridis")
p3 <- p + scale_fill_gradientn(
colors = terrain.colors(8)) +
ggtitle("Terrain")
vbins <- seq(80, by = 20, length.out = 7)
nc <- length(vbins) - 1
p4 <- ggplot(mutate(V, z = fct_rev(cut(z, vbins))),
aes(x, y, fill = z)) +
geom_raster() +
scale_fill_manual(values = rev(terrain.colors(nc))) +
ggtitle("Discretized Terrain")
(p1 + p2) / (p3 + p4)
```
Discretizing a continuous range to a modest number of levels can make
decoding values from a legend easier.
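Binned scales in `ggplot2` are another way to discretize while keeping the variable continuous in the data. The sketch below assumes `ggplot2` 3.3.0 or later; the break values and the names `volc_df` and `p_bin` are illustrative choices.

```{r}
library(ggplot2)
## Long-format version of the volcano matrix
volc_df <- data.frame(vx = as.vector(row(volcano)),
                      vy = as.vector(col(volcano)),
                      vz = as.vector(volcano))
## A binned viridis scale with explicit breaks
p_bin <- ggplot(volc_df, aes(vx, vy, fill = vz)) +
    geom_raster() +
    coord_fixed() +
    scale_fill_viridis_b(breaks = seq(100, 180, by = 20))
p_bin
```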
## Reduced Color Vision
Color vision deficiency affects about 10% of males and a smaller
percentage of females.
The most common form is reduced ability to distinguish red and green.
Some web sites provide tools to simulate how a visualization would
look to a color vision deficient viewer.
The R packages `dichromat`, `colorspace`, and `colorblindr` provide
tools for simulating how colors would look to a color vision deficient
viewer for three major types of color vision deficiency:
* deuteranomaly (green cone cells defective);
* protanomaly (red cone cells defective);
* tritanomaly (blue cone cells defective).
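The simulation functions in `colorspace` can be applied directly to a vector of colors. A sketch using the red and green from the ColorBrewer `Set1` palette (the object names are illustrative):

```{r}
library(colorspace)
## Red and green from ColorBrewer Set1
rg_cols <- c(red = "#E41A1C", green = "#4DAF4A")
## Simulated appearance under the three types of deficiency
cvd_sim <- rbind(original = rg_cols,
                 deutan = deutan(rg_cols),
                 protan = protan(rg_cols),
                 tritan = tritan(rg_cols))
cvd_sim
```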
An article explaining the color vision impairment simulation is
available
[here](http://colorspace.r-forge.r-project.org/articles/color_vision_deficiency.html).
Using some tools from packages `colorspace` and `colorblindr` we can
simulate what a plot would look like in grey scale and to someone
with some of the major types of color impairment.
A plot with the default discrete color palette:
```{r, class.source = "fold-hide"}
p <- ggplot(gap_2007, aes(gdpPercap, lifeExp, color = continent)) +
geom_point(size = 4) +
scale_x_log10() +
guides(color = guide_legend(override.aes = list(size = 4)))
p
```
```{r cvd-examples, fig.width = 9.5, fig.height = 6.5, class.source = "fold-hide"}
library(colorblindr)
library(colorspace)
library(grid)
color_check <- function(p) {
p1 <- edit_colors(p + ggtitle("Desaturated"), desaturate)
p2 <- edit_colors(p + ggtitle("deutan"), deutan)
p3 <- edit_colors(p + ggtitle("protan"), protan)
p4 <- edit_colors(p + ggtitle("tritan"), tritan)
gridExtra::grid.arrange(p1, p2, p3, p4, nrow = 2)
}
color_check(p)
```
For the Viridis palette:
```{r, class.source = "fold-hide"}
pv <- p + scale_color_viridis_d()
pv
```
```{r, fig.width = 9.5, fig.height = 6.5, class.source = "fold-hide"}
color_check(pv)
```
The `swatchplot()` function in the `colorspace` package can be used
with the `cvd = TRUE` argument to simulate how specific palettes work
for different color vision deficiencies:
```{r}
colorspace::swatchplot(rainbow(6), cvd = TRUE)
```
```{r}
colorspace::swatchplot(hcl.colors(6), cvd = TRUE)
```
## Two Issues to Watch Out For
### Missing Values
It is common for default settings to not assign a color for missing values.
In a choropleth map with (made-up) data where one state's value is
missing this might not be noticed.
```{r, class.source = "fold-hide"}
m <- map_data("state")
d <- data.frame(region = unique(m$region),
val = ordered(sample(1 : 4, 49, replace = TRUE)))
m <- left_join(m, d, "region")
pm <- ggplot(m) +
geom_polygon(aes(long, lat, group = group, fill = val)) +
coord_map() +
ggthemes::theme_map()
dnm <- mutate(m, val = replace(val, region == "michigan", NA))
pm %+% dnm
```
Unless the viewer is very familiar with US geography.
Or is from Michigan.
In a scatterplot there are even fewer cues:
```{r, fig.width = 10, fig.height = 4, class.source = "fold-hide"}
gnc <- mutate(gap_2007, continent = replace(continent, country == "China", NA))
pv +
pv %+% gnc
```
Specifying `na.value = "red"`, or some other color, will make sure
`NA` values are visible:
```{r, fig.width = 14, fig.height = 5, class.source = "fold-hide"}
(pm %+% dnm +
scale_fill_viridis_d(na.value = "red") + theme(legend.position = "top")) +
(pv %+% gnc + scale_color_viridis_d(na.value = "red"))
```
Using outlines can also help:
```{r, fig.width = 14, fig.height = 5, class.source = "fold-hide"}
p1 <- pm %+% dnm +
geom_polygon(aes(long, lat, group = group),
fill = NA, color = "black", linewidth = 0.1) +
theme(legend.position = "top")
p2 <- pv %+% gnc +
geom_point(shape = 21, fill = NA, color = "black", size = 4)
p1 + p2
```
A final plot might handle missing values differently, but for initial
explorations it is a good idea to make sure they are clearly visible.
### Aligning Diverging Palettes
Diverging palettes are very useful for showing deviations above or
below a baseline.
```{r, fig.height = 3, fig.width = 7, class.source = "fold-hide"}
par(mfrow = c(1, 2))
RColorBrewer::display.brewer.pal(7, "PRGn")
RColorBrewer::display.brewer.pal(6, "PRGn")
```
For a diverging palette to work properly, the palette baseline needs
to be aligned with the data baseline.
How to do this will depend on the palette, but you do need to keep
this in mind when using a diverging palette.
Just using `scale_fill_brewer` is not enough when the value range is
not symmetric around the baseline:
```{r, class.source = "fold-hide"}
m <- map_data("state")
d <- data.frame(region = unique(m$region),
val = ordered(sample((1 : 6) - 3, 49, replace = TRUE)))
m <- left_join(m, d, "region")
p <- ggplot(m) +
geom_polygon(aes(long, lat, group = group, fill = val)) +
coord_map() +
ggthemes::theme_map() +
theme(legend.position = "right")
p + scale_fill_brewer(palette = "PRGn")
```
Setting the scale limits explicitly forces a 7-category symmetric scale
that aligns the zero value with the middle color:
```{r, warning = FALSE}
lims <- -3 : 3
p + scale_fill_brewer(palette = "PRGn",
limits = lims)
```
This shows a category in the legend for -3 that does not appear in the map.
This is often what you want.
But if you want to drop the -3 category, one option is to use a manual
scale:
```{r}
vals <- RColorBrewer::brewer.pal(7, "PRGn")
names(vals) <- lims
p + scale_fill_manual(values = vals[-1])
```
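The same idea carries over to continuous scales: making the limits symmetric around the baseline puts the neutral color at the baseline value. A sketch with made-up data on a grid, where the values range from -1 to 3 and the baseline is assumed to be zero; all names are illustrative.

```{r}
library(ggplot2)
## Made-up values that are not symmetric around zero
div_df <- data.frame(gx = rep(1:10, 10),
                     gy = rep(1:10, each = 10),
                     gz = seq(-1, 3, length.out = 100))
## Symmetric limits align the middle of the palette with zero
div_lim <- max(abs(div_df$gz))
p_div <- ggplot(div_df, aes(gx, gy, fill = gz)) +
    geom_raster() +
    scale_fill_distiller(palette = "PRGn",
                         limits = c(-div_lim, div_lim))
p_div
```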
## Bivariate Palettes
It is possible to encode two variables in a palette.
Some sample palettes:
```{r, echo = FALSE}
knitr::include_graphics(IMG("bivpal.png"))
```
Bivariate palettes are sometimes used in [bivariate choropleth
maps](http://www.joshuastevens.net/cartography/make-a-bivariate-choropleth-map/).
Some recommendations from Cynthia Brewer are available
[here](http://www.personal.psu.edu/cab38/ColorSch/Schemes.html).
Unless one variable is binary, and the palette is very well chosen, it
is hard to decode a visualization using a bivariate palette without
constantly referring to the key.
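For experimenting, a simple bivariate palette can be sketched by crossing hue with luminance in HCL space. The particular hues, luminance levels, and chroma below are arbitrary choices, not a recommended scheme.

```{r}
## 3 x 3 grid of colors: rows vary hue, columns vary luminance
biv_hues <- c(0, 120, 240)
biv_lums <- c(85, 65, 45)
biv_pal <- outer(biv_hues, biv_lums,
                 function(h, l) hcl(h = h, c = 40, l = l))
dimnames(biv_pal) <- list(hue = biv_hues, luminance = biv_lums)
biv_pal
## Show the grid as a key
image(1:3, 1:3, matrix(1:9, 3), col = biv_pal,
      axes = FALSE, xlab = "hue", ylab = "luminance")
```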
## Culture, Tradition, and Conventions
Colors can have different meanings in different cultures and at
different times.
* A [visual
representation](http://www.informationisbeautiful.net/visualizations/colours-in-cultures/)
* A [blog
post](http://www.huffingtonpost.com/smartertravel/what-colors-mean-in-other_b_9078674.html)
at the Huffington Post.
* A [similar
post](https://www.shutterstock.com/blog/color-symbolism-and-meanings-around-the-world)
at Shutterstock.
Conventions can also give colors particular meanings:
* red/green in traffic lights;
* red/green colors in microarray heatmaps;
* red states and blue states;
* pink for breast cancer;
* pink for girls, blue for boys;
* black for mourning.
### Traffic Lights
[Traffic
lights](https://www.autoevolution.com/news/automotive-wiki-why-are-traffic-lights-red-yellow-and-green-42557.html)
use red/green, even though this is a major axis of color vision
deficiency.
The convention comes from railroads.
The red used generally contains some orange and the green contains
blue to help with red/green color vision deficiency.
Position provides an alternate encoding. Orientations do vary.
### Microarray Heatmaps
- [Microarrays](https://en.wikipedia.org/wiki/DNA_microarray) are used
for the analysis of gene-level changes and differences in
bio-medical research.
- Dyes are used that result in genes with a high response appearing red
and genes with a low response appearing green.
- In keeping with this physical characteristic of microarrays, a
common visualization of the data is as a red/green heat map.
### Red States and Blue States
It is now standard in the US to refer to Republican-leaning states as
red states and Democratic-leaning states as blue states.
This is a fairly recent convention, dating back to the 2000
presidential election.
Prior to 1980 it was somewhat more traditional to use red for more
left-leaning Democrats.
A map of the 1960 election results uses these more traditional colors.
```{r, echo = FALSE}
knitr::include_graphics(IMG("e1960_ecmap.GIF"))
```
In 1996 the _New York Times_ used blue for Democrat, red for
Republican, but the _Washington Post_ used the opposite color scheme.
The long, drawn out process of the 2000 election may have contributed
to fixing the color scheme at the current convention.
## Notes
Points need more saturation and luminance than areas.
False color images may benefit from discretizing.
Bivariate encodings (e.g. `x = hue, y = luminance`) are possible but
tricky and not often a good idea. Best if at least one is binary.
Providing a second encoding, e.g. shape or position, can help for color
vision deficient viewers and photocopying.
In area plots and maps it is important to distinguish between baseline
values and missing values.
If observed values only cover part of a possible range, it is
sometimes appropriate to use a color coding that applies to the entire
possible range.
For diverging palettes, some care may be needed to make sure the
neutral color and the neutral value are properly aligned.
## References
> Few, Stephen. "Practical rules for using color in charts." Visual
> Business Intelligence Newsletter 11
> (2008): 25. ([PDF](http://www.perceptualedge.com/articles/visual_business_intelligence/rules_for_using_color.pdf))
> Harrower, M. A. and Brewer, C. A. (2003). ColorBrewer.org: An online
> tool for selecting color schemes for maps.
> _The Cartographic Journal_, 40, 27--37. [ColorBrewer web
> site](http://colorbrewer2.org). The `RColorBrewer` package provides
> an R interface.
> Ihaka, R. (2003). Colour for presentation graphics, in K. Hornik,
> F. Leisch, and A. Zeileis (eds.), [_Proceedings of the 3rd_
> _International Workshop on Distributed Statistical_
> _Computing_](https://www.r-project.org/conferences/DSC-2003/Proceedings/),
> Vienna,
> Austria. [PDF](https://www.r-project.org/conferences/DSC-2003/Proceedings/Ihaka.pdf).
> See also the `colorspace` package and
> [vignette](https://cran.r-project.org/package=colorspace/vignettes/hcl-colors.pdf).
> Lumley, T. (2006). Color coding and color blindness in statistical
> graphics. _ASA Statistical Computing & Graphics Newsletter_, 17(2),
> 4--7. [PDF](http://stat-computing.org/newsletter/issues/scgn-17-2.pdf).
> Munzner, T. (2014), _Visualization Analysis and Design_, Chapter 10.
> Lisa Charlotte Muth (2021). 4-part series of blog posts on choosing
> color scales. [Part
> 1](https://blog.datawrapper.de/which-color-scale-to-use-in-data-vis/);
> [Part
> 2](https://blog.datawrapper.de/quantitative-vs-qualitative-color-scales);
> [Part
> 3](https://blog.datawrapper.de/diverging-vs-sequential-color-scales);
> [Part
> 4](https://blog.datawrapper.de/classed-vs-unclassed-color-scales).
> Lisa Charlotte Muth (2022). A detailed guide to colors in data vis
> style guides. [Blog
> post](https://blog.datawrapper.de/colors-for-data-vis-style-guides/).
> Treinish, Lloyd A. "Why Should Engineers and Scientists Be Worried
> About Color?." IBM Thomas J. Watson Research Center, Yorktown
> Heights, NY (2009): 46. ([pdf](https://www.researchgate.net/profile/Ahmed_Elhattab2/post/Please_suggest_some_good_3D_plot_tool_Software_for_surface_plot/attachment/5c05ba35cfe4a7645506948e/AS%3A699894335557644%401543879221725/download/Why+Should+Engineers+and+Scientists+Be+Worried+About+Color_.pdf))
> Ware, C. (2012), _Information Visualization: Perception for Design_,
> 3rd ed., Chapters 3 & 4.
> Zeileis, A., Murrell, P. and Hornik, K. (2009). Escaping RGBland:
> Selecting colors for statistical graphics, _Computational Statistics
> & Data Analysis_, 53(9), 3259--3270
> ([PDF](http://eeecon.uibk.ac.at/~zeileis/papers/Zeileis+Hornik+Murrell-2009.pdf)).
> Achim Zeileis, Paul Murrell (2019). HCL-Based Color Palettes in
> `grDevices`. [R Blog
> post](https://developer.r-project.org/Blog/public/2019/04/01/hcl-based-color-palettes-in-grdevices/index.html).
> Achim Zeileis et al. (2020). “colorspace: A Toolbox for Manipulating
> and Assessing Colors and Palettes.” _Journal of Statistical Software_,
> 96(1), 1--49. [doi:10.18637/jss.v096.i01](https://doi.org/10.18637/jss.v096.i01).
## Coloring Political Statements
POLITIFACT reviews the accuracy of statements by politicians and
publishes [summaries of the
results](http://www.politifact.com/personalities/michele-bachmann/).
A [2016
post](http://www.dailykos.com/story/2016/8/7/1556666/-Three-lessons-from-the-rise-of-Donald-Trump)
on [Daily Kos](http://www.dailykos.com/) included a
[visualization](http://images.dailykos.com/images/283152/large/dataviz_robert_mann.png?1470323514)
of the results for a number of politicians.
Kaiser Fung posted a
[critique](http://junkcharts.typepad.com/junk_charts/2017/04/what-does-lying-politicians-have-in-common-with-rainbow-colors.html)
at [JunkCharts](http://junkcharts.typepad.com) and proposed an
alternative.
I scraped the data from POLITIFACT as of April 11, 2017; they are
available [here](https://www.stat.uiowa.edu/~luke/data/polfac.dat).
```{r, class.source = "fold-hide"}
## Download the data once and keep a local copy
if (! file.exists("polfac.dat"))
    download.file("https://www.stat.uiowa.edu/~luke/data/polfac.dat",
                  "polfac.dat")
pft <- read.table("polfac.dat")
## Row-wise proportions, with the grade columns in reverse order
vcp <- prop.table(as.matrix(pft), 1)[, 6 : 1]
colnames(vcp) <- gsub("\\.", " ", colnames(vcp))
head(vcp)
```
The Daily Kos chart orders the politicians by the percentage of their
statements that are more false than true. A function to produce a bar
chart of these data with a specified color palette:
```{r, class.source = "fold-hide"}
## lattice version
polbars <- function(col = cm.colors(6)) {
    barchart(vcp[order(rowSums(vcp[, 1 : 3])), ], auto.key = TRUE,
             par.settings = list(superpose.polygon = list(col = col)))
}

## ggplot version (this definition of polbars replaces the lattice one)
gvcp <- as.data.frame(vcp) %>%
    rownames_to_column("Name") %>%
    pivot_longer(-1, names_to = "Grade", values_to = "prop") %>%
    mutate(Grade = fct_rev(fct_inorder(Grade)))

## names sorted by decreasing total proportion in the grades
## up to "Half True" in the factor level order
nm <- mutate(gvcp, Grade = ordered(Grade)) %>%
    filter(Grade <= "Half True") %>%
    group_by(Name) %>%
    summarize(prop = sum(prop)) %>%
    arrange(desc(prop)) %>%
    pull(Name)
gvcp <- mutate(gvcp, Name = factor(Name, nm))

pvcp <- ggplot(gvcp, aes(Name, prop, fill = Grade)) +
    geom_col(position = "fill", width = 0.7) +
    coord_flip() +
    theme(legend.position = "top",
          plot.margin = margin(r = 50),
          legend.text = element_text(size = 10)) +
    scale_y_continuous(labels = scales::percent, expand = c(0, 0)) +
    guides(fill = guide_legend(title = NULL, nrow = 1, reverse = TRUE)) +
    labs(x = "", y = "")

polbars <- function(col = cm.colors(6))
    pvcp + scale_fill_manual(values = rev(col))
polbars()
```
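
The ordering used for the chart can be checked on a small example.
The following sketch uses made-up counts for three hypothetical
speakers (`A`, `B`, `C`). It assumes the six grade columns run from
most false to most true, which is what ordering by the first three
columns of `vcp` suggests. The names `toy_counts`, `toy_prop`, and
`false_share` are introduced only for this illustration.

```{r}
## Made-up counts, one row per hypothetical speaker
toy_counts <- matrix(c(1, 2, 2, 5, 5, 5,
                       6, 6, 3, 3, 1, 1,
                       2, 4, 4, 4, 4, 2),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(c("A", "B", "C"),
                                     c("Pants on Fire", "False",
                                       "Mostly False", "Half True",
                                       "Mostly True", "True")))

## Convert counts to within-speaker proportions (each row sums to 1)
toy_prop <- prop.table(toy_counts, 1)

## Share of statements that are more false than true:
## the sum of the first three columns (an assumption about column order)
false_share <- rowSums(toy_prop[, 1 : 3])
false_share

## Rows ordered by that share, as in the lattice version of polbars()
toy_prop[order(false_share), ]
```

Because the proportions in each row sum to one, sorting by the share
of mostly-false grades in increasing order gives the same arrangement
as sorting by the share of the remaining grades in decreasing order.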
The original Daily Kos chart seems to use a slightly modified version
of the ColorBrewer `Spectral` palette, a diverging palette.
```{r, class.source = "fold-hide"}
polbars(brewer.pal(6, "Spectral"))
dkcols <- brewer.pal(6, "Spectral")
dkcols[4] <- "lightgrey"
polbars(dkcols)
```
The JunkCharts plot uses another diverging palette, close to the
`Blue-Red` palette available in `hclwizard`.
```{r, class.source = "fold-hide"}
rwbcols <- c("#4A6FE3", "#8595E1", "#B5BBE3", "#E2E2E2",
"#E6AFB9", "#E07B91", "#D33F6A")
polbars(rwbcols)
polbars(rev(rwbcols))
```
Another diverging palette:
```{r, class.source = "fold-hide"}
polbars(brewer.pal(7, "PiYG"))
```
A sequential palette:
```{r, class.source = "fold-hide"}
polbars(rev(brewer.pal(6, "Oranges")))
```
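
One way to see the difference between the diverging and sequential
palettes used above is to compare their lightness profiles. The
following is a minimal sketch using only base R. It reuses the
`Blue-Red` hex values from above and, as an assumption, substitutes
the `hcl.colors()` palette named `"Oranges"` for the ColorBrewer one,
so the numbers will differ somewhat from `brewer.pal(6, "Oranges")`.
The helper `lightness` is introduced here only for illustration.

```{r}
## Hypothetical helper: CIE L* lightness of a vector of colors
lightness <- function(cols) {
    srgb <- t(col2rgb(cols)) / 255   # one row per color, values in [0, 1]
    convertColor(srgb, from = "sRGB", to = "Lab")[, 1]   # column 1 is L*
}

## Diverging palette: the Blue-Red hex values used above
div_cols <- c("#4A6FE3", "#8595E1", "#B5BBE3", "#E2E2E2",
              "#E6AFB9", "#E07B91", "#D33F6A")

## Sequential palette: base R stand-in for the ColorBrewer "Oranges"
seq_cols <- hcl.colors(6, "Oranges")

round(lightness(div_cols))
round(lightness(seq_cols))
```

For a diverging palette the lightness should peak at the neutral
middle color and fall off on both sides, while for a sequential
palette it should change steadily in one direction.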
## Reading
Section [_Perception and Data
Visualization_](https://socviz.co/lookatdata.html#perception-and-data-visualization)
in [_Data Visualization_](https://socviz.co/).
Chapter [_Color scales_](https://clauswilke.com/dataviz/color-basics.html)
in [_Fundamentals of Data
Visualization_](https://clauswilke.com/dataviz/).
## Exercises
1. A color can be specified in hexadecimal notation. Given such a
color specification you can find out what it looks like by using
it in a simple plot, or by using the Google color picker. Which
of the following best describes the color `#B22222`?
a. a shade of green
b. a shade of blue
c. orange
d. a shade of red
2. The following shows how to view the colors in the
`RColorBrewer` palette named `Reds` with 7 colors:
```{r, fig.cap = ""}
library(RColorBrewer)
display.brewer.pal(7, "Reds")
```
Which of the following `RColorBrewer` palettes is diverging?
a. `Blues`
b. `PuRd`
c. `Set1`
d. `RdGy`
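
For exercises like the first one, a hexadecimal specification can be
inspected directly in R. The sketch below uses a different color,
`#1E90FF`, chosen here purely as an illustration: `col2rgb()` decodes
the red, green, and blue components, and a filled rectangle shows the
color itself.

```{r}
hex <- "#1E90FF"   # illustrative color, not one from the exercises

## Each pair of hex digits is one 8-bit channel: 1E, 90, FF
col2rgb(hex)
strtoi(c("1E", "90", "FF"), base = 16L)

## Show the color as a filled swatch in a simple plot
plot.new()
rect(0, 0, 1, 1, col = hex, border = NA)
```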