Some things to keep an eye out for when looking at data on a numeric variable:

skewness, multimodality

gaps, outliers

rounding, e.g. to integer values, or *heaping*, i.e. a few particular values occur very frequently

impossible or suspicious values
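Some of these checks can also be done numerically before plotting. The following is a minimal sketch on simulated data; the vector `x`, the heaped value 70, and the sentinel value `-999` are made up for illustration and do not come from the data sets used in these notes.

```r
# Simulated data: integer-rounded values, with heaping at 70
# and one impossible value (-999); all of this is made up.
set.seed(1)
x <- c(round(rnorm(200, mean = 70, sd = 10)), rep(70, 30), -999)

# Heaping: which values occur most often?
top_values <- head(sort(table(x), decreasing = TRUE), 5)
top_values

# Rounding: what proportion of values are whole numbers?
prop_integer <- mean(x == round(x))
prop_integer

# Impossible or suspicious values: check the range
x_range <- range(x)
x_range
```

A frequency table like this complements a plot: heaping shows up as a few counts that are much larger than the rest, and the range exposes values outside what is plausible.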

A variant of the dot plot is known as a *strip plot*. A strip plot for the city temperature data is

```
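library(lattice)    # stripplot
library(ggplot2)    # ggplot, geom_point
library(gridExtra)  # grid.arrange
p1 <- stripplot(~ temp, data = citytemps)
p2 <- ggplot(citytemps) + geom_point(aes(x = temp, y = "All"))
grid.arrange(p1, p2, nrow = 1)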
```

One way to reduce the vertical space is to use the *chunk option* `fig.height = 2`.

The strip plot can reveal gaps and outliers.

After looking at the plot we might want to examine the one high value and the one low value:

```
filter(citytemps, temp > 100)
## city temp
## 1 Asuncion 102
filter(citytemps, temp < 0)
## city temp
## 1 Anadyr -5
```

The strip plot is most useful for showing subsets corresponding to a categorical variable.

A strip plot for the yields for different varieties in the barley data is

`ggplot(barley) + geom_point(aes(x = yield, y = variety))`

Scalability in this form is limited due to over-plotting.

A simple strip plot of `price` within the different `cut` levels in the `diamonds` data is not very helpful:

`ggplot(diamonds) + geom_point(aes(x = price, y = cut))`

Several approaches are available to reduce the impact of over-plotting:

reducing the point size;

random displacement of points, called *jittering*;

making the points translucent, or *alpha blending*.
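Each approach can also be tried on its own. The sketch below is one possible way to do this; the particular values for `size`, `height`, `seed`, and `alpha` are arbitrary choices for illustration, and jittering only vertically (`width = 0`) is a design choice made here so that the prices themselves are not perturbed.

```r
library(ggplot2)

p <- ggplot(diamonds, aes(x = price, y = cut))

# Smaller points
p_small <- p + geom_point(size = 0.2)

# Jittering: here only vertically, so prices are not perturbed;
# the height and seed values are arbitrary choices
p_jitter <- p +
    geom_point(position = position_jitter(width = 0, height = 0.3, seed = 1))

# Alpha blending: mostly transparent points
p_alpha <- p + geom_point(alpha = 0.05)
```

Printing any of these objects, for example `p_jitter`, draws the corresponding plot.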

Combining all three produces

```
ggplot(diamonds) +
    geom_point(aes(x = price, y = cut),
               size = 0.2, position = "jitter", alpha = 0.2)
```

Skewness of the price distributions can be seen in this plot, though other approaches will show this more clearly.

A peculiar feature revealed by this plot is the gap below 2000. Examining the subset with `price < 2000` shows the gap is roughly symmetric around 1500:

```
ggplot(filter(diamonds, price < 2000)) +
    geom_point(aes(x = price, y = cut),
               size = 0.2, position = "jitter", alpha = 0.2)
```
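The location of the gap can also be checked numerically. The following is a rough sketch, not part of the original analysis: it looks for the largest difference between consecutive distinct prices below 2000. The names `prices` and `gap_bounds` are introduced here only for illustration.

```r
library(ggplot2)

# Distinct prices below 2000, in increasing order
prices <- sort(unique(diamonds$price[diamonds$price < 2000]))

# Largest difference between consecutive observed prices
i <- which.max(diff(prices))
gap_bounds <- c(lower = prices[i], upper = prices[i + 1])
gap_bounds
```

If the gap seen in the plot is real, the two bounds should bracket the empty region.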

With a good combination of point size, jittering, and alpha blending, the strip plot for groups of data can scale to several hundred thousand observations and ten to twenty groups.

Strip plots can reveal gaps, outliers, and data outside of the expected range.

Skewness and multi-modality can be seen, but other visualizations show these more clearly.

Storage needed for vector graphics images grows linearly with the number of observations.
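The growth in storage can be checked with a small experiment. This sketch uses simulated uniform data and base graphics; the helper `pdf_size` is hypothetical, written only for this illustration. It writes a jittered strip chart to a temporary PDF file and reports the file size.

```r
# Size in bytes of a PDF strip chart of n simulated values
pdf_size <- function(n) {
    f <- tempfile(fileext = ".pdf")
    pdf(f)
    stripchart(runif(n), method = "jitter", pch = 16, cex = 0.3)
    dev.off()
    file.size(f)
}

set.seed(1)
n_obs <- c(1000, 10000, 100000)
sizes <- sapply(n_obs, pdf_size)
data.frame(n = n_obs, bytes = sizes)
```

A vector format stores every point, so the file grows with the number of observations. The size of a raster format such as PNG is instead largely determined by the pixel dimensions of the image.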

Base graphics provides `stripchart`:

`stripchart(yield ~ variety, data = barley)`

Lattice provides `stripplot`:

`stripplot(variety ~ yield, data = barley)`
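Both functions support jittering. The following sketch shows one way to request it; the values for `pch`, `cex`, and `alpha` are arbitrary choices for illustration.

```r
library(lattice)  # stripplot() and the barley data

# Base graphics: jittered strip chart with small solid points
stripchart(yield ~ variety, data = barley,
           method = "jitter", pch = 16, cex = 0.6)

# Lattice: jittered, translucent points
p_lattice <- stripplot(variety ~ yield, data = barley,
                       jitter.data = TRUE, alpha = 0.5)
p_lattice
```

Note that the formula is `yield ~ variety` for `stripchart` but `variety ~ yield` for `stripplot`, following the two calls shown above.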