# 22S:295:001 Topics in Statistical Graphics and Visualization

Luke Tierney

Fall 2008

### Course Information

• Class meets 10:30-11:20 AM on Thursdays in Schaeffer 241B.
• September 4: Read the first three chapters of Sarkar's book on lattice and try out some examples. You might also look at the material available on the Trellis Display page, and you might want to look at some of the on-line resources shown below.
• September 11: Read chapters 3, 4, and 5 of Sarkar's book on lattice and try out some examples. Some questions to think about:
• How do you change themes, and is there a way to change the theme for a single plot? Some notes.
• How might you add the outliers from a box-and-whisker plot to a violin plot? One possible solution.
• September 18: Read chapters 5, 6, and 7 of Sarkar's book on lattice and try out some examples. Some questions to think about:
• How would you get the regression line and smooth in Figure 5.10 to have different line styles and/or colors? Some notes.
A note on panel.grid.
• September 25: Read sections 5.4-5.6 and chapter 6, and skim chapters 7, 8, and 9. Some questions to think about:
• How can you add error bands to a smooth or linear fit in a scatterplot?
• How can you put histograms or density plots on the diagonal panels of a scatter-plot matrix? One possible solution.
• Why isn't the linear fit in Figure 6.8 linear?
• Can you create a stereo view of a surface? An example.
Note: It appears that parallel(..., horizontal = FALSE) works in the version of lattice bundled with R 2.7.2 but not yet in 2.7.1.
• October 2: Read Chapter 6 and skim chapters 7-12. Some questions to think about:
• How can you add error bands to a smooth or linear fit in a scatterplot? One possible solution.
• Why isn't the linear fit in Figure 6.8 linear?
• How can you change the colors used by a wireframe plot with shade = TRUE?
• October 9: Read chapters 13 and 14 of Sarkar's book. Some questions to think about:
• Can you use the gridBase package to put standard graphics histograms on the diagonals of a splom? A possible approach.
• How can you change the colors used by a wireframe plot with shade = TRUE? If this can't be done now, can you write a version of the appropriate panel function that allows the colors to be changed using the col.regions argument? Some ideas.
• Do linked micromaps (here is one example) fit into the lattice framework? Another example at USDA. Slides from a talk at UseR 2006.
• Try producing some maps of the county-level 2004 U.S. presidential election results. You can look at some examples in a lecture from Deb Nolan's class at Berkeley. Some issues:
• The order of polygons/names needs to be matched.
• Some counties have multiple polygons.
• There are also some polygons that do not correspond to counties.
• Data for Virginia is missing but can be filled in from the USA Today web page.
• October 16: No meeting this week.
• October 23: Read the draft ggplot2 book and look at the supporting materials on the ggplot2 home page. It may also be useful to look at the first two chapters of Wilkinson's Grammar of Graphics book.
• October 30: Look at the ggplot2 materials on the web site and read Chapter 2 in Hadley's thesis (or the paper if I get a copy in time). Some things to think about and try:
• Can you create a ggplot theme that makes the defaults more like those in standard graphics (white background, no grid lines, larger labels, etc.)?
• Do lattice shingles fit into the ggplot framework?
• How can one apply a color scale that uses alpha blending?
• How would one create violin plots in the ggplot framework?
• Can you create a scatterplot matrix with ggplot?
• The coxcomb (Nightingale) plot is a bar chart of the square roots of values in polar coordinates. How can this be done? (The coord_polar help page example misses the square root.)
• Is there a mechanism to control aspect ratio, and would it be possible to fit the 45-degree banking rule into this framework?
• Try some variations on the election plots on the ISU Elections page.
• Try to reproduce some interesting standard or lattice graphics in ggplot2.
• November 6: Read chapters 1 and 2 of the GGobi book.
• November 13: Read chapters 1-3 of the GGobi book. Also work through the GGobi manual and have a look at the paper on rggobi in the October 2008 RNews issue. There are also some demos and lectures on the GGobi web site that are worth looking at.

Some notes:

• Some files with R code and additional data are available here.
• The XML version of the places rated data does contain the location names.
• The xml file for the laser data contains some information about the experiment.
• November 20: Read chapters 4 and 5 of the GGobi book and try some of the examples and problems.
• December 4: Read chapters 5 and 6 of the GGobi book and try some of the examples and problems. You might also try looking at the simple data sets created in the file surface.R and see if you can detect the surface using a combination of brushing and touring.
• December 11: Some things to do for the last class:
• Look at the simple data sets created in the file surface.R (or variations of your own) and see if you can detect the surface using a combination of brushing and touring.
• Look at the article on the animation package in the latest issue of RNews.
• Look at some of the examples at the gapminder site.
• Some larger projects:
• Some displays for small samples: a lattice version of stem-and-leaf plots, or a stacked dot plot (like a strip plot but with tied or near-tied observations shown as a stack of points). A variant of the MINITAB dot plot is available here. Another, lattice-based, variant is available as panel.dotplot.tb in package HH.
• Create a variant of panel.locfit that draws only the fit, not the points, and makes use of the lattice parameters it receives in the ... argument for choosing line properties.
• Is it possible to modify the code of panel.densityplot and/or panel.histogram so they can be used for the diagonals of a scatterplot matrix?
• Instead of using transparency or cutouts, another way of displaying multiple contours of a function of three variables is to place each contour in a different plot but use a common bounding box. Create a lattice function that does this. A possible formula interface might be f ~ x * y * z | levels.

### References and Resources

#### Interactive and Dynamic Graphics

• Dianne Cook and Deborah F. Swayne, Interactive and Dynamic Graphics for Data Analysis with R and GGobi, Springer-Verlag, 2007. Springer e-book link.
• GGobi web site.

#### ASA Section on Statistical Graphics

• Joint Computing/Graphics Newsletter. A number of interesting graphics papers have appeared in the newsletter.
• Movies. Library of videos, many on developments in statistical graphics.
• Data Expo. Data sets and some results from the Data Exposition sponsored by the Sections on Statistical Computing and Statistical Graphics every few years at the Joint Statistical Meetings.

#### Miscellaneous

Luke Tierney 2008-12-04