1. Air Pollution Data

When there is a clear dependent variable, that variable should go on the vertical axis; here that is ozone.level.

If you use ggpairs, it is a good idea to put the dependent variable last, so that the bottom row of the plot matrix shows the dependent variable on the vertical axis against each predictor variable:

library(GGally)  # provides ggpairs() and ggparcoord(); also loads ggplot2
ggpairs(calif.air.poll[c(2 : 4, 1)])
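Selecting columns by position works only while the column order is known. A minimal sketch of the same reordering done by name, using a made-up data frame: the values and the column `other.predictor` are hypothetical and are not taken from calif.air.poll.

```r
# Made-up stand-in for calif.air.poll; values and 'other.predictor' are hypothetical
toy <- data.frame(ozone.level = c(3, 5, 9),
                  inversion.base.height = c(5000, 2000, 800),
                  other.predictor = c(1, 2, 3),
                  inversion.base.temp = c(40, 60, 75))

response <- "ozone.level"

# Move the response to the last column, keeping the predictors in their original order
reordered <- toy[c(setdiff(names(toy), response), response)]
names(reordered)
```

A data frame reordered this way can be passed to ggpairs in place of the positional index.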

The conditioning plots show an increasing relation between ozone level and inversion base temperature within each height range; the slope decreases as inversion base height increases.

library(lattice)  # provides xyplot() and equal.count()
xyplot(ozone.level ~ inversion.base.temp |
       equal.count(inversion.base.height, 9, overlap = 0),
       type = c("p", "smooth"), data = calif.air.poll, col.line="red")

The top two height panels both contain points with heights of 5000.
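One plausible explanation is that many observations are tied at 5000; this is an assumption, since the text does not say why. equal.count builds its intervals from sorted positions, so a large block of ties can span more than one interval even with overlap = 0. A small sketch with made-up heights (not the calif.air.poll values) shows the effect:

```r
library(lattice)

# Made-up heights (NOT the calif.air.poll values):
# 60 distinct values plus 40 observations tied at 5000
h <- c(seq(100, 4000, length.out = 60), rep(5000, 40))

sh <- equal.count(h, 9, overlap = 0)
intervals <- unclass(levels(sh))   # list of c(lower, upper) pairs, one per panel

# Flag the intervals that contain the tied value 5000
contains_5000 <- sapply(intervals, function(iv) iv[1] <= 5000 && 5000 <= iv[2])
sum(contains_5000)
```

Inspecting levels(sh) directly shows the interval endpoints; intervals that would be exact duplicates are dropped, so fewer than 9 panels can result.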

2. Olive Oils

olives <- read.csv("http://homepage.divms.uiowa.edu/~luke/data/olives.csv")

Focus on the northern region:

library(dplyr)  # provides filter()
olivesN <- filter(olives, Region == "North")
olivesN <- droplevels(olivesN)

A parallel coordinates plot of all the values suggests looking more closely at oleic, stearic, and linolenic:

ggparcoord(olivesN, 3:10, groupColumn="Area", scale = "uniminmax")
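The scale = "uniminmax" option rescales each variable separately so that its minimum maps to 0 and its maximum to 1, which is what makes the axes comparable. A rough sketch of that rescaling, where `rescale_minmax` is a hypothetical helper and the input values are made up rather than taken from olives.csv:

```r
# Hypothetical helper mirroring what scale = "uniminmax" does to each column:
# map the column minimum to 0 and the column maximum to 1
rescale_minmax <- function(x) (x - min(x)) / (max(x) - min(x))

# Made-up values for illustration only
x <- c(7000, 7500, 7870, 8100)
scaled <- rescale_minmax(x)
range(scaled)
```

Because each column is rescaled on its own, the original units (and thresholds such as 7870) are not visible on the parallel coordinates plot.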

The plot of stearic against oleic shows that the Umbria oils all have oleic values above 7870:

ggplot(olivesN) +
    geom_point(aes(oleic, stearic, color = Area)) +
    geom_vline(xintercept = 7870, linetype = 2)  # constant reference line, set outside aes()

Among the oils with oleic > 7870, all of the Umbria oils, and only the Umbria oils, have stearic < 230 and linolenic > 15:

ggplot(filter(olivesN, oleic > 7870)) +
    geom_point(aes(linolenic, stearic, color = Area)) +
    geom_vline(xintercept = 15, linetype = 2) +   # constant reference lines, set outside aes()
    geom_hline(yintercept = 230, linetype = 2)
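The three thresholds can also be checked numerically by cross-tabulating the rule against Area. The sketch below is only illustrative: `umbria_rule` is a hypothetical helper, and the toy rows are made up rather than taken from olives.csv.

```r
# Hypothetical helper (not in the original): TRUE where all three thresholds hold
umbria_rule <- function(oleic, stearic, linolenic) {
    oleic > 7870 & stearic < 230 & linolenic > 15
}

# Made-up rows for illustration only; these are NOT values from olives.csv
toy <- data.frame(Area = c("A", "B", "C"),
                  oleic = c(7900, 7900, 7700),
                  stearic = c(200, 250, 200),
                  linolenic = c(20, 20, 20))

toy$rule <- umbria_rule(toy$oleic, toy$stearic, toy$linolenic)
table(toy$Area, toy$rule)
```

On the real data, with(olivesN, table(Area, umbria_rule(oleic, stearic, linolenic))) would show whether the TRUE column is non-zero only for Umbria.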