Color is very effective when used well.
But using color well is not easy.
Some of the issues:
Perception depends on context.
Simple color assignments may not separate equally well.
Effectiveness may vary with the medium (screen, projector,
print).
Some people do not perceive the full spectrum of colors.
Grey scale printing.
Some colors have cultural significance.
Cultural significance may vary among cultures and with
time.
An internet “controversy” in 2015: The Dress (and a
follow-up article).
Color Spaces
RGB and HSV Color Spaces
Computer monitors and projectors work in terms of red, green, and
blue light.
Amounts of red, green, and blue (and the alpha level) are stored as
integers in the range 0 to 255 (8-bit bytes).
cols <- c("red", "green", "blue", "yellow", "cyan", "magenta")
rgbcols <- col2rgb(cols); colnames(rgbcols) <- cols
rgbcols
## red green blue yellow cyan magenta
## red 255 0 0 255 0 255
## green 0 255 0 255 255 0
## blue 0 0 255 0 255 255
Colors are often encoded in hexadecimal form (base 16).
rgb(1, 0, 0) ## pure red
## [1] "#FF0000"
rgb(0, 0, 1) ## pure blue
## [1] "#0000FF"
rgb(255, 0, 0, maxColorValue = 255)
## [1] "#FF0000"
rgb(0, 0, 255, maxColorValue = 255)
## [1] "#0000FF"
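The hexadecimal encoding can be unpacked by hand. The following is a minimal sketch using only base R; the color "#FF7F00" and the variable names are chosen here purely for illustration.

```r
## Split a "#RRGGBB" string into its three two-digit hex components
## and convert each from base 16 to an integer in 0..255.
hex <- "#FF7F00"
channels <- strtoi(substring(hex, c(2, 4, 6), c(3, 5, 7)), base = 16L)
names(channels) <- c("red", "green", "blue")
channels

## col2rgb() does the same decoding; an optional fourth pair of hex
## digits encodes the alpha level.
col2rgb(hex)
col2rgb("#FF7F0080", alpha = TRUE)
```

Each pair of hex digits covers exactly one 8-bit byte, which is why two digits per channel are enough.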
Hue, saturation, value (HSV) is a simple transformation of RGB.
rgb2hsv(rgbcols)
## red green blue yellow cyan magenta
## h 0 0.3333333 0.6666667 0.1666667 0.5 0.8333333
## s 1 1.0000000 1.0000000 1.0000000 1.0 1.0000000
## v 1 1.0000000 1.0000000 1.0000000 1.0 1.0000000
HSV is a little more convenient since it allows the hue to be
controlled separately.
A simple color picker: https://www.google.com/search?q=color+picker
But saturation and value attributes are not particularly useful for
specifying colors that work well perceptually.
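As a small illustration of controlling hue separately, the base R function hsv() can step through hues while saturation and value are held fixed. This is only a sketch; the particular hue grid is an arbitrary choice.

```r
## Six evenly spaced hues at full saturation and value
hues <- seq(0, 5 / 6, length.out = 6)
full <- hsv(h = hues, s = 1, v = 1)
full

## The same hues at half saturation: paler versions of the same colors
pale <- hsv(h = hues, s = 0.5, v = 1)
pale
```

Equal steps in hue, saturation, or value here do not correspond to equal perceived steps.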
A color wheel of fully saturated colors:
wheel <- function(col, radius = 1, ...)
pie(rep(1, length(col)),
col = col, radius = radius, ...)
wheel(rainbow(6))
Removing saturation:
library(colorspace)
wheel(desaturate(rainbow(6)))
Fully saturated yellow is brighter than red, which is brighter than
blue.
HCL Color Space
The rainbow palette of the color wheel is often a default in
visualization systems.
A blog post
illustrates why this is a bad idea.
The rainbow hues are evenly spaced in the color spectrum, but chroma
and luminance are not.
Luminance in particular is not monotone across the palette.
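One way to check this numerically is to convert the palette to CIE Luv with base R and look at the L coordinate. This is a rough sketch; the helper name luminance() is made up for this example and is not part of any package.

```r
## Approximate luminance of each color via base R's CIE Luv
## conversion (L is the first coordinate of the result).
luminance <- function(col) {
    srgb <- t(col2rgb(col)) / 255
    grDevices::convertColor(srgb, from = "sRGB", to = "Luv")[, 1]
}
lum <- luminance(rainbow(6))
round(lum)

## Successive differences change sign, so luminance is not monotone
sign(diff(lum))
```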
rgb2hcl <- function(col) {
    ## col is a matrix as returned by col2rgb(); alpha is ignored
    ## col2rgb() values are gamma-corrected sRGB, so use sRGB(), not RGB()
    col <- sRGB(t(col[1 : 3, ]) / 255)
    col <- as(col, "polarLUV")
    col <- t(col@coords[, 3 : 1, drop = FALSE])
    rownames(col) <- tolower(rownames(col))
    col
}
col2hcl <- function(col) rgb2hcl(col2rgb(col))
pal <- function(col, border = "light gray", ...) {
n <- length(col)
plot(0, 0, type = "n", xlim = c(0, 1), ylim = c(0, 1),
axes = FALSE, xlab = "", ylab = "", ...)
rect((0 : (n - 1)) / n, 0, (1 : n) / n, 1, col = col, border = border)
}
par(mfrow = c(1, 2))
pal(rainbow(6), main = "Saturated Rainbow")
pal(desaturate(rainbow(6)), main = "Desaturated")
specplot(rainbow(6), lwd = 4)
The hue, chroma, luminance (HCL) space
allows separate control of:
Hue, the color.
Chroma, the amount of the color.
Luminance, or perceived brightness.
HCL makes it easier to create perceptually uniform color
palettes.
A palette with constant chroma, evenly spaced hues and evenly spaced
luminance values:
rain6 <- hcl(seq(0, 360 * 5 / 6, len = 6), 50, seq(60, 80, len = 6))
par(mfrow = c(1, 2))
pal(rain6, main = "Uniform Rainbow")
pal(desaturate(rain6), main = "Desaturated")
specplot(rain6, lwd = 4)
For a fully saturated red, varying only chroma to reduce the amount
of color:
red_hcl <- list(h = 12.17395, c = 179.04076, l = 53.24059)
specplot(hcl(red_hcl$h, red_hcl$c * seq(0, 1, len = 10), red_hcl$l), lwd = 4)
For a given hue, not all combinations of chroma and luminance are
possible.
In particular, for low luminance values the available chroma range is
limited.
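The limits can be probed with hcl() itself: with fixup = FALSE it returns NA for combinations that fall outside the displayable range, rather than adjusting them. The hue of 266 (roughly blue) and the chroma grid below are arbitrary choices for illustration.

```r
## With fixup = FALSE, hcl() returns NA when the requested combination
## falls outside the displayable sRGB range instead of adjusting it.
chroma <- seq(0, 100, by = 20)
dark  <- hcl(h = 266, c = chroma, l = 10, fixup = FALSE)
light <- hcl(h = 266, c = chroma, l = 50, fixup = FALSE)
data.frame(chroma, dark, light)
```

At the low luminance, the high-chroma entries come back as NA, while the lighter column keeps valid colors for more of the chroma range.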
The ggplot book
contains this visualization of the HCL space.
Hue is mapped to angle.
Chroma is mapped to radius.
Luminance is mapped to facets.
The origins with zero chroma are shades of grey.
HCL color
picker.
HCL is a transformation of the CIELUV color space
designed for perceptual uniformity.
The definition of the luminance takes into account the light
sensitivity of a standard human observer at various wavelengths.
Light sensitivity for different wavelengths in daylight conditions
(photopic vision) and under dark adapted conditions (scotopic
vision):
Munsell Color Space
Another color space, similar to HCL, is the Munsell
system developed in the early 1900s.
This system uses a Hue, Value, Chroma encoding.
The munsell
package provides an R interface and is used in
ggplot.
Munsell specifications are of the form "H V/C", such as
5R 5/10.
Possible hues are
library(munsell, exclude = "desaturate")
mnsl_hues()
## [1] "2.5R" "5R" "7.5R" "10R" "2.5YR" "5YR" "7.5YR" "10YR" "2.5Y"
## [10] "5Y" "7.5Y" "10Y" "2.5GY" "5GY" "7.5GY" "10GY" "2.5G" "5G"
## [19] "7.5G" "10G" "2.5BG" "5BG" "7.5BG" "10BG" "2.5B" "5B" "7.5B"
## [28] "10B" "2.5PB" "5PB" "7.5PB" "10PB" "2.5P" "5P" "7.5P" "10P"
## [37] "2.5RP" "5RP" "7.5RP" "10RP"
V should be an integer between 0 and 10.
C should be an even integer less than 24, but not all
combinations are possible.
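To make the "H V/C" form concrete, here is a small base R sketch that splits a specification into its parts. The function parse_mnsl() is a hypothetical helper written for this example, not part of the munsell package, and it does not handle neutral greys or check whether the combination exists.

```r
## Hypothetical helper: split a Munsell specification of the form
## "H V/C" into hue, value, and chroma.
parse_mnsl <- function(spec) {
    pattern <- "^([0-9.]+[A-Z]+) ([0-9]+)/([0-9]+)$"
    m <- regmatches(spec, regexec(pattern, spec))[[1]]
    if (length(m) != 4) stop("not of the form 'H V/C': ", spec)
    list(hue = m[2], value = as.integer(m[3]), chroma = as.integer(m[4]))
}
str(parse_mnsl("5R 5/10"))
str(parse_mnsl("2.5PB 7/4"))
```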
Adjusting colors in the value, chroma, and hue dimensions:
my_blue <- "5PB 5/8"
plot_mnsl(c(
lighter(my_blue, 2), my_blue, darker(my_blue, 2),
munsell::desaturate(my_blue, 2), my_blue, saturate(my_blue, 2),
rygbp(my_blue, 2), my_blue, pbgyr(my_blue, 2)))
Creating scales:
plot_mnsl(sapply(0 : 6, darker, col = "5PB 7/4")) + facet_wrap(~ num, nrow = 1)
Examining available colors:
hue_slice("5R")
value_slice(5)
Complementary colors:
complement_slice("5R")
Opponent Process Theory
The Opponent Process Model of vision says that the brain
divides the visual signal among three opposing contrast pairs:
black and white;
red and green;
yellow and blue.
The black/white pair corresponds to luminance in HCL.
Hue and chroma in HCL span the two chromatic axes.
The luminance axis has higher resolution than the two chromatic
axes.
The major form of color vision deficiency reflects an inability to
distinguish differences along the red/green axis.
Impairment along the yellow/blue axis does occur as well but is much
rarer.
Contrast and Comparisons
Vision reacts to differences, not absolutes.
Small differences in shading or hue can be recognized when objects
are contiguous but are much harder to see when they are separated.
Simultaneous brightness contrast: a grey patch on a dark
background looks lighter than the same grey patch on a light
background.
plot(0, 0, type = "n", xlim = c(0, 1), ylim = c(0, 1),
axes = FALSE, xlab = "", ylab = "")
rect(0, 0, 0.5, 1, col = "lightgrey", border = NA)
rect(0.5, 0, 1, 1, col = "darkgrey", border = NA)
rect(0.2, 0.3, 0.3, 0.7, col = "grey", border = NA)
rect(0.7, 0.3, 0.8, 0.7, col = "grey", border = NA)
An example we saw earlier:
Some more are available here,
including:
Using luminance or grey scale alone does not work well for encoding
categorical variables against a key.
Grey scale can be effective for showing continuous transitions in
pseudo-color images.
filled.contour(volcano, color.palette = grey.colors)
Grey scale is less effective for segmented maps, or choropleth maps;
only a few levels can be accurately decoded.
Interactions with Size, Background and Proximity
For small items more contrast and more saturated colors are
needed:
x <- runif(6, 0.1, 0.9)
y <- runif(6, 0.1, 0.9)
cols <- c("red", "green", "blue", "yellow", "cyan", "magenta")
f <- function(size = 1, black = FALSE) {
plot(x, y, type = "n", xlim = c(0, 1), ylim = c(0, 1))
if (black) rect(0, 0, 1, 1, col = "black")
text(x, y, cols, col = cols, cex = size)
}
opar <- par(mfrow = c(2, 2))
f(1)
f(4)
f(1, TRUE)
f(4, TRUE)
par(opar)
Variations in luminance are particularly helpful for seeing fine
structure, such as small text or small symbols:
plot(0, type = "n", xlim = c(0, 1), ylim = c(0, 1),
axes = FALSE, xlab = "", ylab = "")
rect(0, 0, 1, 1, col = hcl(0)) ## defaults: c = 35, l = 85
qbf <- "The quick brown fox jumps ..."
text(0.5, 0.3, label = qbf, col = hcl(180)) ## hue
text(0.5, 0.5, label = qbf, col = hcl(0, c = 70)) ## chroma
text(0.5, 0.7, label = qbf, col = hcl(0, l = 50)) ## luminance
Chrominance (hue and chroma) differences alone are not sufficient for
small items.
Ware recommends a luminance contrast of at least 3:1 for small text;
10:1 is preferable.
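A contrast ratio can be estimated in base R. The sketch below uses the contrast-ratio definition from the WCAG web accessibility guidelines, which is an assumption made here and may not be exactly the measure Ware has in mind; the function names are made up for this example.

```r
## Relative luminance of an sRGB color (0 = black, 1 = white)
rel_luminance <- function(col) {
    s <- col2rgb(col) / 255
    lin <- ifelse(s <= 0.04045, s / 12.92, ((s + 0.055) / 1.055)^2.4)
    drop(c(0.2126, 0.7152, 0.0722) %*% lin)
}
## WCAG-style contrast ratio between a text and a background color
contrast_ratio <- function(fg, bg) {
    l <- sort(c(rel_luminance(fg), rel_luminance(bg)), decreasing = TRUE)
    (l[1] + 0.05) / (l[2] + 0.05)
}
contrast_ratio("black", "white")           # the largest possible ratio, 21
contrast_ratio(hcl(0, l = 50), hcl(0))     # luminance differs
contrast_ratio(hcl(180), hcl(0))           # only hue differs
```

The text that differs from its background only in hue has a ratio close to 1, which is consistent with it being the hardest to read.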
Small areas also need variation in more than hue:
Contrasting borders can help for larger areas with similar
luminance:
Color Specification in R
A large number of named colors are available (currently
657).
Some examples:
col2rgb("red")
## [,1]
## red 255
## green 0
## blue 0
col2rgb("forestgreen")
## [,1]
## red 34
## green 139
## blue 34
col2rgb("deepskyblue")
## [,1]
## red 0
## green 191
## blue 255
col2rgb("firebrick")
## [,1]
## red 178
## green 34
## blue 34
These will show some details:
colors()
demo(colors)
The available named colors follow a widely used standard.
These colors include the 140 web colors
supported on modern browsers.
Individual colors can also be specified using rgb() or
hcl() or as hexadecimal specifications.
library(colorspace)
hex2RGB("#FF0000")
## R G B
## [1,] 1 0 0
Using color spaces:
rgb(1, 0, 0)
## [1] "#FF0000"
rgb(255, 0, 0, maxColorValue = 255)
## [1] "#FF0000"
rgb2hsv(col2rgb("red"))
## [,1]
## h 0
## s 1
## v 1
Converting to HCL:
rgb2hcl <- function(col) {
    ## col is a matrix as returned by col2rgb(); alpha is ignored
    ## col2rgb() values are gamma-corrected sRGB, so use sRGB(), not RGB()
    col <- sRGB(t(col[1 : 3, ]) / 255)
    col <- as(col, "polarLUV")
    col <- t(col@coords[, 3 : 1, drop = FALSE])
    rownames(col) <- tolower(rownames(col))
    col
}
col2hcl <- function(col) rgb2hcl(col2rgb(col))
col2hcl("red")
## [,1]
## h 12.17395
## c 179.04076
## l 53.24059
col2hcl("green")
## [,1]
## h 127.7235
## c 135.7811
## l 87.7351
col2hcl("blue")
## [,1]
## h 265.87278
## c 130.67593
## l 32.29567
col2hcl("yellow")
## [,1]
## h 85.87351
## c 107.06462
## l 97.13951
col2hcl("cyan")
## [,1]
## h 192.16714
## c 72.09794
## l 91.11330
col2hcl("magenta")
## [,1]
## h 307.72618
## c 137.40166
## l 60.32351
hcl(12.17, 179.04, 53.24)
## [1] "#FF0000"
Color pickers can help:
When a set of colors is needed to encode variable values it is
usually best to use a suitable palette.
Color Palettes
Color palettes are collections of colors that work well
together.
It is useful to distinguish three kinds of palettes: sequential, qualitative, and diverging.
Tools for selecting palettes include:
A blog
post with some further options.
Some current US
government work on color palettes; more
extensive notes and code.
R color palette functions:
rainbow()
heat.colors()
terrain.colors()
topo.colors()
cm.colors()
grey.colors()
gray.colors()
These all take the number of colors as an argument, as well as some
additional optional arguments.
The hcl.colors() function provides access to the palettes
defined in the colorspace package.
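A brief sketch of how these palette functions are called, using only base R; the number of colors and the "viridis" palette name are arbitrary choices for illustration.

```r
n <- 5
heat <- heat.colors(n)
heat
terr <- terrain.colors(n, alpha = 0.5)   # optional transparency
terr
hcl.colors(n, palette = "viridis")

## Names of the palettes available through hcl.colors()
head(hcl.pals())
```

With an alpha level supplied, the returned strings carry two extra hex digits for the transparency.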
colorRampPalette() can be used to create a palette
function that interpolates between a set of colors:
rwb <- colorRampPalette(
c("red", "white", "blue"))
rwb(5)
## [1] "#FF0000" "#FF7F7F" "#FFFFFF" "#7F7FFF" "#0000FF"
filled.contour(volcano,
color.palette = rwb,
asp = 1)
With more perceptually comparable extremes (from the Blue-Red palette
of HCL Wizard):
rwb1 <- colorRampPalette(
c("#8E063B", "white", "#023FA5"))
filled.contour(volcano,
color.palette = rwb1,
asp = 1)
An alternative uses the muted function from package
scales:
rwb2 <- colorRampPalette(
c(scales::muted("red"),
"white",
scales::muted("blue")))
filled.contour(volcano,
color.palette = rwb2,
asp = 1)
Most base and lattice functions allow a vector of colors
to be specified.
Some, like filled.contour() and levelplot()
allow a palette function to be provided.
ggplot provides a framework for specifying palette
functions to use with scale_color_xyz() and
scale_fill_xyz() functions.
Packages like colorspace and viridis
provide additional scale_color_xyz() and
scale_fill_xyz() functions.
RColorBrewer Palettes
The available palettes:
library(RColorBrewer)
display.brewer.all()
Palettes in the first group are sequential.
The second group are qualitative.
The third group are diverging.
The "Blues" palette:
display.brewer.pal(9, "Blues")
As RGB values:
brewer.pal(9, "Blues")
## [1] "#F7FBFF" "#DEEBF7" "#C6DBEF" "#9ECAE1" "#6BAED6" "#4292C6" "#2171B5"
## [8] "#08519C" "#08306B"
The palettes are limited to a maximum number of levels.
To obtain more levels you can interpolate.
brewer.pal(10, "Blues")
## Warning in brewer.pal(10, "Blues"): n too large, allowed maximum for palette Blues is 9
## Returning the palette you asked for with that many colors
## [1] "#F7FBFF" "#DEEBF7" "#C6DBEF" "#9ECAE1" "#6BAED6" "#4292C6" "#2171B5"
## [8] "#08519C" "#08306B"
pbrbl <- colorRampPalette(brewer.pal(9, "Blues"), interpolate = "spline")
pbrbl
## function (n)
## {
## x <- ramp(seq.int(0, 1, length.out = n))
## if (ncol(x) == 4L)
## rgb(x[, 1L], x[, 2L], x[, 3L], x[, 4L], maxColorValue = 255)
## else rgb(x[, 1L], x[, 2L], x[, 3L], maxColorValue = 255)
## }
## <bytecode: 0x59e9da5e7628>
## <environment: 0x59e9d9124d20>
pbrbl(10)
## [1] "#F7FBFF" "#E0ECF7" "#CCDEF1" "#ADD0E5" "#81BBDA" "#57A1CF" "#3687C0"
## [8] "#1A69B0" "#064D98" "#08306B"
Colorspace Palettes
The colorspace package provides a wide range of
pre-defined palettes:
library(colorspace)
hcl_palettes(plot = TRUE)
A particular number of colors from one of these palettes can be
obtained with
qualitative_hcl(4, palette = "Dark 3")
## [1] "#E16A86" "#909800" "#00AD9A" "#9183E6"
The functions sequential_hcl() and
diverging_hcl() are analogous.
For use with ggplot2 the package provides scale
functions like scale_fill_discrete_qualitative() and
scale_color_continuous_sequential().
A package
vignette provides more details and background.
Viridis Palettes
These are provided in package viridisLite.
Palette functions are viridis(), mako(),
etc.
They are also available via the hcl.colors()
function.
For use in ggplot they can be specified in the viridis
color scale functions.
Palettes in R Graphics
ggplot uses scale_color_xyz() or
scale_fill_xyz().
For discrete scales the choices for xyz include
hue varies the hue (default for unordered
factors);
grey uses grey scale;
brewer uses ColorBrewer palettes;
manual allows explicit specification;
viridis_d for an alternative palette family (default
for ordered factors).
For continuous scales the choices for xyz include
gradient interpolates between two colors, low and
high;
gradient2 interpolates between three colors, low,
mid, high;
gradientn interpolates between a vector of colors;
distiller for ColorBrewer palettes;
viridis_c for an alternative palette family.
Others are available in packages such as colorspace.
The default qualitative and sequential discrete palettes:
library(tidyverse)  ## provides dplyr, ggplot2, and forcats, used below
library(gapminder)
gap_2007 <- filter(gapminder, year == 2007) |> slice_max(pop, n = 20)
p <- mutate(gap_2007, country = reorder(country, pop)) |>
ggplot(aes(x = gdpPercap, y = lifeExp, fill = continent)) +
scale_size_area(max_size = 10) +
scale_x_log10() +
geom_point(size = 4, shape = 21) +
guides(fill = guide_legend(override.aes = list(size = 4)))
p1 <- p + ggtitle("Hue")
p2 <- p + scale_fill_viridis_d() + ggtitle("Viridis")
library(patchwork)
p1 + p2
Discrete examples for brewer, colorspace
and manual:
p1 <- p + scale_fill_brewer(palette = "Set1") +
ggtitle("Brewer Set1")
p2 <- p + scale_fill_brewer(palette = "Set2") +
ggtitle("Brewer Set2")
p3 <- p + scale_fill_discrete_qualitative("Dark 3") +
ggtitle("Colorspace Dark 3")
p4 <- p + scale_fill_manual(values = c(Africa = "red", Asia = "blue",
Americas = "green", Europe = "grey")) +
ggtitle("Manual")
(p1 + p2) / (p3 + p4)
The default for continuous scales is a gradient from a
dark blue to a light blue:
V <- data.frame(x = rep(seq_len(nrow(volcano)), ncol(volcano)),
y = rep(seq_len(ncol(volcano)), each = nrow(volcano)),
z = as.vector(volcano))
p <- ggplot(V, aes(x, y, fill = z)) + geom_raster() + coord_fixed()
p
Some alternatives:
p1 <- p + scale_fill_gradient2(
low = "red", mid = "white", high = "blue",
midpoint = median(volcano)) +
ggtitle("Red-White-Blue Gradient")
p2 <- p + scale_fill_viridis_c() +
ggtitle("Viridis")
p3 <- p + scale_fill_gradientn(
colors = terrain.colors(8)) +
ggtitle("Terrain")
vbins <- seq(80, by = 20, length.out = 7)
nc <- length(vbins) - 1
p4 <- ggplot(mutate(V, z = fct_rev(cut(z, vbins))),
aes(x, y, fill = z)) +
geom_raster() +
scale_fill_manual(values = rev(terrain.colors(nc))) +
ggtitle("Discretized Terrain")
(p1 + p2) / (p3 + p4)
Discretizing a continuous range to a modest number of levels can make
decoding values from a legend easier.
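The binning step can be sketched with base R alone. The breaks below match the ones used for the discretized terrain panel; the variable zbin is introduced here just for illustration.

```r
## Bin the volcano heights into six 20-unit classes
vbins <- seq(80, by = 20, length.out = 7)
nc <- length(vbins) - 1
zbin <- cut(as.vector(volcano), vbins)
table(zbin)

## One color per class; the key now has only nc entries to decode
setNames(rev(terrain.colors(nc)), levels(zbin))
```

Passing the same breaks and colors to a base function such as image() would give a discretized version of the plot.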
Reduced Color Vision
Color vision deficiency affects about 10% of males and a smaller
percentage of females.
The most common form is reduced ability to distinguish red and
green.
Some web sites provide tools to simulate how a visualization would
look to a color vision deficient viewer.
The R packages dichromat, colorspace, and
colorblindr provide tools for simulating how colors would
look to a color vision deficient viewer for three major types of color
vision deficiency:
deuteranomaly (green cone cells defective);
protanomaly (red cone cells defective);
tritanomaly (blue cone cells defective).
An article explaining the color vision impairment simulation is
available here.
Using some tools from packages colorspace and
colorblindr we can simulate what a plot would look like in
grey scale and to someone with some of the major types of color
impairment.
A plot with the default discrete color palette:
p <- ggplot(gap_2007, aes(gdpPercap, lifeExp, color = continent)) +
geom_point(size = 4) +
scale_x_log10() +
guides(color = guide_legend(override.aes = list(size = 4)))
p
library(colorblindr)
library(colorspace)
library(grid)
color_check <- function(p) {
p1 <- edit_colors(p + ggtitle("Desaturated"), desaturate)
p2 <- edit_colors(p + ggtitle("deutan"), deutan)
p3 <- edit_colors(p + ggtitle("protan"), protan)
p4 <- edit_colors(p + ggtitle("tritan"), tritan)
gridExtra::grid.arrange(p1, p2, p3, p4, nrow = 2)
}
color_check(p)
For the Viridis palette:
pv <- p + scale_color_viridis_d()
pv
color_check(pv)
The swatchplot() function in the colorspace
package can be used with the cvd = TRUE argument to
simulate how specific palettes work for different color vision
deficiencies:
colorspace::swatchplot(rainbow(6), cvd = TRUE)
colorspace::swatchplot(hcl.colors(6), cvd = TRUE)
Two Issues to Watch Out For
Missing Values
It is common for default settings to not assign a color for missing
values.
In a choropleth map with (made-up) data where one state’s value is
missing this might not be noticed.
m <- map_data("state")
d <- data.frame(region = unique(m$region),
val = ordered(sample(1 : 4, 49, replace = TRUE)))
m <- left_join(m, d, "region")
pm <- ggplot(m) +
geom_polygon(aes(long, lat, group = group, fill = val)) +
coord_map() +
ggthemes::theme_map()
dnm <- mutate(m, val = replace(val, region == "michigan", NA))
pm %+% dnm
## Warning: <ggplot> %+% x was deprecated in ggplot2 4.0.0.
## ℹ Please use <ggplot> + x instead.
## This warning is displayed once per session.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
Unless the viewer is very familiar with US geography.
Or is from Michigan.
In a scatterplot there are even fewer cues:
gnc <- mutate(gap_2007, continent = replace(continent, country == "China", NA))
pv +
pv %+% gnc
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
Specifying na.value = "red", or some other color, will
make sure NA values are visible:
(pm %+% dnm +
scale_fill_viridis_d(na.value = "red") + theme(legend.position = "top")) +
(pv %+% gnc + scale_color_viridis_d(na.value = "red"))
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.
Using outlines can also help:
p1 <- pm %+% dnm +
geom_polygon(aes(long, lat, group = group),
fill = NA, color = "black", linewidth = 0.1) +
theme(legend.position = "top")
p2 <- pv %+% gnc +
geom_point(shape = 21, fill = NA, color = "black", size = 4)
p1 + p2
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
A final plot might handle missing values differently, but for initial
explorations it is a good idea to make sure they are clearly
visible.
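The same idea applies when colors are assigned by hand in base graphics, where an NA color typically means the item is simply not drawn. A minimal sketch with made-up values:

```r
## Made-up values with one missing entry
val <- c(3, 1, NA, 4, 2)
pal_cols <- hcl.colors(4, "viridis")
cols <- pal_cols[val]          # an NA index gives an NA color
cols[is.na(val)] <- "red"      # make the missing value stand out
cols
```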
Aligning Diverging Palettes
Diverging palettes are very useful for showing deviations above or
below a baseline.
par(mfrow = c(1, 2))
RColorBrewer::display.brewer.pal(7, "PRGn")
RColorBrewer::display.brewer.pal(6, "PRGn")
For a diverging palette to work properly, the palette baseline needs
to be aligned with the data baseline.
How to do this will depend on the palette, but you do need to keep
this in mind when using a diverging palette.
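For numeric values, one possible approach is to make the breaks symmetric around the baseline, so that the middle bin, and with it the neutral middle color, contains the baseline. The following is only a sketch with made-up data and an ad hoc purple-white-green ramp.

```r
## Made-up values with an asymmetric range around a baseline of 0
x <- c(-2, -1, 0, 1, 2, 3, 5)
baseline <- 0
n <- 7                                  # odd, so there is a middle color
half <- max(abs(x - baseline))          # largest deviation in either direction
breaks <- baseline + seq(-half, half, length.out = n + 1)
div_cols <- colorRampPalette(c("purple", "white", "darkgreen"))(n)
bin <- cut(x, breaks, labels = FALSE, include.lowest = TRUE)
data.frame(x, bin, col = div_cols[bin])
```

The cost is that some colors at the shorter end of the range go unused, which is the price of keeping the two sides comparable.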
Just using scale_fill_brewer is not enough when the
value range is not symmetric around the baseline:
m <- map_data("state")
d <- data.frame(region = unique(m$region),
val = ordered(sample((1 : 6) - 3, 49, replace = TRUE)))
m <- left_join(m, d, "region")
p <- ggplot(m) +
geom_polygon(aes(long, lat, group = group, fill = val)) +
coord_map() +
ggthemes::theme_map() +
theme(legend.position = "right")
p + scale_fill_brewer(palette = "PRGn")
Setting the scale limits explicitly forces a 7-category symmetric
scale that aligns the zero value with the middle color:
lims <- as.character(-3 : 3)
p + scale_fill_brewer(palette = "PRGn",
limits = lims)
This shows a category in the legend for -3 that does not appear in
the map.
This is often what you want.
But if you want to drop the -3 category, one option is to use a
manual scale:
vals <- RColorBrewer::brewer.pal(7, "PRGn")
names(vals) <- lims
p + scale_fill_manual(values = vals[-1])
Bivariate Palettes
It is possible to encode two variables in a palette.
Some sample palettes:
Bivariate palettes are sometimes used in bivariate
choropleth maps.
Some recommendations from Cynthia Brewer are available here.
A discussion
of a recent example.
Unless one variable is binary, and the palette is very well chosen,
it is hard to decode a visualization using a bivariate palette without
constantly referring to the key.
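A small bivariate palette of the easier kind, with one binary variable, can be sketched with hcl(). The hue, chroma, and luminance values and the labels below are arbitrary choices for illustration, not a recommended palette.

```r
## Rows: three levels of one variable, encoded by luminance.
## Columns: two levels of a binary variable, encoded by hue.
lum <- c(85, 65, 45)
hue <- c(30, 250)
bivar <- outer(lum, hue, function(l, h) hcl(h = h, c = 40, l = l))
dimnames(bivar) <- list(paste("level", 1:3), c("group A", "group B"))
bivar
```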
Culture, Tradition, and Conventions
Colors can have different meanings in different cultures and at
different times.
Conventions can also give colors particular meanings:
red/green in traffic lights;
red/green colors in microarray heatmaps;
red states and blue states;
pink for breast cancer;
pink for girls, blue for boys;
black for mourning.
Traffic Lights
Traffic
lights use red/green, even though this is a major axis of color
vision deficiency.
The convention comes from railroads.
The red used generally contains some orange and the green contains
blue to help with red/green color vision deficiency.
Position provides an alternate encoding. Orientations do vary.
Microarray Heatmaps
Microarrays are
used for the analysis of gene-level changes and differences in
bio-medical research.
Dyes are used that result in genes with a high response appearing
red and genes with a low response appearing green.
In keeping with this physical characteristic of microarrays, a
common visualization of the data is as a red/green heat map.
Red States and Blue States
It is now standard in the US to refer to Republican-leaning states as
red states and Democrat-leaning states as blue states.
This is a fairly recent convention, dating back to the 2000
presidential election.
Prior to 1980 it was somewhat more traditional to use red for more
left-leaning Democrats.
A map of the 1960 election results uses these more traditional
colors.
In 1996 the New York Times used blue for Democrat, red for
Republican, but the Washington Post used the opposite color
scheme.
The long, drawn out process of the 2000 election may have contributed
to fixing the color scheme at the current convention.
Notes
Points need more saturation and luminance contrast than areas.
False color images may benefit from discretizing.
Bivariate encodings (e.g. x = hue, y = luminance) are
possible but tricky and not often a good idea. Best if at least one is
binary.
Providing a second encoding, such as shape or position, can help for
color vision deficient viewers and photocopying.
In area plots and maps it is important to distinguish between
baseline values and missing values.
If observed values only cover part of a possible range, it is
sometimes appropriate to use a color coding that applies to the entire
possible range.
For diverging palettes, some care may be needed to make sure the
neutral color and the neutral value are properly aligned.
Using a well-designed palette is usually better than creating your
own.
Choosing a palette can consider many factors, including appearance
and branding.
A UI Diverging Palette?
UI color guidelines are
spelled out in the brand
manual.
UIgold <- rgb(255, 205, 0, maxColorValue=255)
pdUI <- colorRampPalette(c("black", "white", UIgold))
colorspace::swatchplot(pdUI(5))
colorspace::swatchplot(pdUI(5), cvd = TRUE)
References
Few, Stephen. “Practical rules for using color in charts.” Visual
Business Intelligence Newsletter 11 (2008): 25. (PDF)
Harrower, M. A. and Brewer, C. M. (2003). ColorBrewer.org: An online
tool for selecting color schemes for maps. The Cartographic
Journal, 40, 27–37. ColorBrewer
web site. The RColorBrewer package provides an R
interface.
Ihaka, R. (2003). Colour for presentation graphics, in K. Hornik, F.
Leisch, and A. Zeileis (eds.), Proceedings
of the 3rd International Workshop on Distributed
Statistical Computing, Vienna, Austria. PDF.
See also the colorspace package and vignette.
Lumley, T. (2006). Color coding and color blindness in statistical
graphics. ASA Statistical Computing & Graphics Newsletter,
17(2), 4–7. PDF.
Munzner, T. (2014), Visualization Analysis and Design,
Chapter 10.
Lisa Charlotte Muth (2021). 4-part series of blog posts on choosing
color scales. Part 1; Part 2; Part 3; Part 4.
Lisa Charlotte Muth (2022). A detailed guide to colors in data vis
style guides. Blog post.
Treinish, Lloyd A. “Why Should Engineers and Scientists Be Worried
About Color?” IBM Thomas J. Watson Research Center, Yorktown Heights,
NY (2009): 46. (PDF)
Ware, C. (2012), Information Visualization: Perception for
Design, 3rd ed., Chapters 3 & 4.
Zeileis, A., Murrell, P. and Hornik, K. (2009). Escaping RGBland:
Selecting colors for statistical graphics, Computational Statistics
& Data Analysis, 53(9), 3259–3270 (PDF).
Achim Zeileis, Paul Murrell (2019). HCL-Based Color Palettes in
grDevices. R Blog post.
Achim Zeileis et al. (2020). “colorspace: A Toolbox for Manipulating
and Assessing Colors and Palettes.” Journal of Statistical Software,
96(1), 1-49. doi:10.18637/jss.v096.i01.
Coloring Political Statements
POLITIFACT reviews the accuracy of statements by politicians and
publishes summaries
of the results.
A 2016
post on Daily Kos included a
visualization
of the results for a number of politicians.
Kaiser Fung posted a critique
at JunkCharts and
proposed an alternative.
I scraped the data as of April 11, 2017, from POLITIFACT; they are
available here.
if (! file.exists("polfac.dat"))
download.file("https://stat.uiowa.edu/~luke/data/polfac.dat",
"polfac.dat")
pft <- read.table("polfac.dat")
vcp <- prop.table(as.matrix(pft), 1)[, 6 : 1]
colnames(vcp) <- gsub("\\.", " ", colnames(vcp))
head(vcp)
## Pants on Fire False Mostly False Half True Mostly True True
## Trump 0.16279070 0.3281654 0.1989664 0.14470284 0.12403101 0.04134367
## Bachmann 0.26229508 0.3606557 0.1311475 0.09836066 0.06557377 0.08196721
## Cruz 0.06779661 0.2796610 0.3050847 0.12711864 0.16101695 0.05932203
## Gingrich 0.13924051 0.1898734 0.2025316 0.25316456 0.12658228 0.08860759
## Palin 0.09523810 0.3015873 0.1428571 0.14285714 0.09523810 0.22222222
## Santorum 0.08333333 0.2833333 0.2000000 0.21666667 0.11666667 0.10000000
The Daily Kos chart is ordered by the percentage of statements that
are more false than true. A function to produce a bar chart with a
specified color palette:
## lattice version
polbars <- function(col = cm.colors(6)) {
barchart(vcp[order(rowSums(vcp[, 1 : 3])), ], auto.key = TRUE,
par.settings = list(superpose.polygon = list(col = col)))
}
## ggplot version
gvcp <- as.data.frame(vcp) |>
rownames_to_column("Name") |>
pivot_longer(-1, names_to = "Grade", values_to = "prop") |>
mutate(Grade = fct_rev(fct_inorder(Grade)))
nm <- mutate(gvcp, Grade = ordered(Grade)) |>
filter(Grade <= "Half True") |>
group_by(Name) |>
summarize(prop = sum(prop)) |>
arrange(desc(prop)) |>
pull(Name)
gvcp <- mutate(gvcp, Name = factor(Name, nm))
pvcp <- ggplot(gvcp, aes(Name, prop, fill = Grade)) +
geom_col(position = "fill", width = 0.7) +
coord_flip() +
theme(legend.position = "top",
plot.margin = margin(r = 50),
legend.text = element_text(size = 10)) +
scale_y_continuous(labels = scales::percent, expand = c(0, 0)) +
guides(fill = guide_legend(title = NULL, nrow = 1, reverse = TRUE)) +
labs(x = "", y = "")
polbars <- function(col = cm.colors(6))
pvcp + scale_fill_manual(values = rev(col))
polbars()
The original Daily Kos chart seems to use a slightly modified version
of the ColorBrewer Spectral palette, a diverging
palette.
polbars(brewer.pal(6, "Spectral"))
dkcols <- brewer.pal(6, "Spectral")
dkcols[4] <- "lightgrey"
polbars(dkcols)
The JunkCharts plot uses another diverging palette, close to the
Blue-Red palette available in hclwizard.
rwbcols <- c("#4A6FE3", "#8595E1", "#B5BBE3", "#E2E2E2",
"#E6AFB9", "#E07B91", "#D33F6A")
polbars(rwbcols)
polbars(rev(rwbcols))
Another diverging palette:
polbars(brewer.pal(7, "PiYG"))
A sequential palette:
polbars(rev(brewer.pal(6, "Oranges")))
Exercises
A color can be specified in hexadecimal notation. Given such a
color specification you can find out what it looks like by using it in a
simple plot, or by using the Google color picker. Which of the following
best describes the color #B22222?
a shade of green
a shade of blue
orange
a shade of red
The following shows how to view the colors in the
RColorBrewer palette named Reds with 7
colors:
library(RColorBrewer)
display.brewer.pal(7, "Reds")
Which of the following RColorBrewer palettes is
diverging?
Blues
PuRd
Set1
RdGy
