---
title: "Interactive Exploratory Data Visualization in R"
subtitle: A workshop introduction to loon and related packages
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  echo = TRUE,
  tidy.opts = list(width.cutoff = 75),
  tidy = TRUE)
set.seed(12314159)
imageDirectory <- "../img"
dataDirectory <- "../data"
prepDirectory <- "../prep"
path_concat <- function(path1, ..., sep="/") {
  # The "/" is standard unix directory separator and so will
  # work on Macs and Linux.
  # In windows the separator might have to be sep = "\" or
  # even sep = "\\" or possibly something else.
  paste(path1, ..., sep = sep)
}
```

![](../img/SSC_arms.png){width=15%} ![](../img/loon.png){width=10%} ![](../img/loon.data.png){width=10%} ![](../img/loon.ggplot.png){width=10%}

## Workshop date and times:

The workshop will be held live online via a **Webex** meeting on Saturday, June 6 2020.

The **meeting times across the country and the online link** are found [here](../prep/meeting_link.html)

## Preparing for the workshop

Please follow [these instructions](../prep/Before_the_workshop.html) for downloading loon **before** the workshop.

Download again the night before the workshop in case there have been updates.

----

## Abstract

> "Exposure, the effective laying open of the data to display the unanticipated,
> is to us a major portion of data analysis." ... John Tukey and Martin Wilk (1966)

This workshop will demonstrate the power of highly interactive visualization tools in exploratory data analysis. Through a variety of "hands-on" analyses in `R`, workshop participants will develop a facility for interactive exploratory data visualization using `loon` and related `R` packages.

`loon` is an interactive visualization toolkit for analysts/users/developers engaged in open-ended, creative, and possibly unscripted data exploration.
Loon's base set of plots includes scatterplots, histograms, barplots, pairs plots, parallel and radial axes plots, graph structures, and any combination of these.

Among the topics that will be covered are the following:

- interactive [histograms](https://youtu.be/JFJcg855HcQ), scatterplots, and [3d scatterplots](https://youtu.be/mMJllGOuDLY)
- data query by selection across linked displays
- brushing scatterplots and histograms
- conditional analysis by linking, by facetting, or both
- interactive parallel (and radial) coordinate plots
- connecting interactive graphics with pipes
- interactive graphics and the `R` `grid` package
- turning `ggplot`s into interactive `loon` plots
- using interactive graphics to teach statistics

In this workshop, participants will become familiar with loon's functionality through a series of examples and hands-on exercises. These will cover a wide spectrum of applications beginning with data analysis, including high-dimensional exploratory data analysis, methodological exploration for the classroom or research, as well as exploratory prototyping of new interactive visualizations.

All `R` code will be provided. No previous exposure to interactive graphics or `loon` is expected. Even participants who have never used `R` will be able to explore data using `loon` and will get some appreciation of the value of interactive data visualization. The main personal requirement is an interest in data and its exploration.

Participants will of course need access to their own computer and (ideally) two screens (or one really big one). You do need to [check and prepare your computing environment](../prep/Before_the_workshop.html) prior to the workshop.
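A quick way to see whether the packages named on this page are already installed is sketched below. It is only an illustrative check and not a replacement for the linked instructions: the package list (`loon`, `loon.data`, `loon.ggplot`) is an assumption based on the names mentioned here, and `is_installed()` is a hypothetical helper written just for this sketch.

```r
# Packages named on this page; the full list needed for the workshop
# is an assumption (the linked instructions are the real reference).
pkgs <- c("loon", "loon.data", "loon.ggplot")

# Hypothetical helper: TRUE if a package is installed.
# system.file() returns "" when the package cannot be found,
# and it does not load the package.
is_installed <- function(pkg) {
  nzchar(system.file(package = pkg))
}

# Named logical vector, one entry per package
installed <- vapply(pkgs, is_installed, logical(1))
print(installed)

missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All listed packages are installed.")
}
```

Anything reported as not yet installed can then be sorted out by following the linked instructions.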

----

## Instructor

Wayne Oldford is a [Professor of Statistics](https://uwaterloo.ca/statistics-and-actuarial-science/people-profiles/wayne-oldford) at the [University of Waterloo](https://uwaterloo.ca/math/) and will look something like this during the workshop:

![](../img/rwo.png){width=25%}

### Recent related talks

Some sense of loon and its interactive use is given by the slides of a few recent talks which might be of interest:

- [Interactive Visualization for Exploratory Data Analysis](http://www.math.uwaterloo.ca/~rwoldfor/talks/Arizona2019/). A 50 minute invited talk given in June 2019 at the School of Informatics, Computing, and Cyber Systems of Northern Arizona University.
- [Interactive ggplots in R](http://www.math.uwaterloo.ca/~rwoldfor/talks/SDSS2019/loon.ggplot/). A 20 minute talk given in May 2019 at the ASA's Symposium on Data Science and Statistics in Seattle, Washington.
- [Exploratory Visualization via Extendible Interactive Graphics](http://www.math.uwaterloo.ca/~rwoldfor/talks/SDSS2018/). A 20 minute talk given in May 2018 at the ASA's Symposium on Data Science and Statistics in Reston, Virginia.
- [Exploratory Visualization of higher dimensional data](http://www.math.uwaterloo.ca/~rwoldfor/talks/Vienna2018/). A 50 minute talk given in June 2018 at the Institute for Statistics and Mathematics at Vienna University in Austria.

### Some introductory notes and videos on `loon`

- [Introduction to loon](https://great-northern-diver.github.io/loon/articles/introduction.html)
- [loon manual](https://great-northern-diver.github.io/loon/)
- [3D interactive scatterplots in loon](https://youtu.be/mMJllGOuDLY)
- [the (not so) humble (interactive) histogram](https://www.youtube.com/watch?v=JFJcg855HcQ&feature=youtu.be)
- Adrian Waddell's PhD thesis: [Interactive Visualization and Exploration of High-Dimensional Data](https://uwspace.uwaterloo.ca/handle/10012/10188)

The workshop will consist of two 2 hour blocks with lots of opportunities for you to see how you might make use of `loon` as a data analysis tool as well as a research tool in data visualization.

----

## Presentation material

Please follow [these instructions](../prep/Before_the_workshop.html) for downloading loon **before** the workshop.

The workshop will have two parts of about 2 hours each. The material consists mainly of R scripts (below) and a few slides.

Since this is the first time for this material AND the first time trying to work through it in a large online group, I am not sure where the material will break between the two parts. I am also not sure whether there is too much material or too little. I am trying not to go into too much depth in the workshop, assuming that `loon` is new to pretty much everyone.

We do not need to cover it all, and we can always do more on the fly if there is interest and time. We can also end early.

So please take this as a rough guess at how the material may divide between parts.

- Part 0: 0 hours
    - [beforeWeBegin.R](./R/beforeWeBegin.R) contains some R code, mainly `library()` statements to make sure you have everything you need installed before we start.
    - Please make sure that your machine has all the necessary software installed and that everything is working. Again see and follow [these instructions](../prep/Before_the_workshop.html) **before** the workshop.
    - Note that you will need the `0.1.0` version of the `loon.data` package (which just got pushed to CRAN this past week).
- Part 1: 2 hours
    - [pdf slides](./slides/Overview.pdf)
    - R code:
        - [covidNZ.R](./R/covidNZ.R) An introduction to loon via an exploratory analysis of a simple but topical data set: New Zealand Covid-19 cases.
        - [plots3D.R](./R/plots3D.R) Three dimensional scatterplots involving the analysis of two different data sets: the human immunoglobulin G1 antibody molecule (serologically detected to confirm Covid-19) and the Tonga trench earthquakes north of New Zealand.
- Big break: 1 hour (Go eat; have a walk; stretch)
- Part 2: 2 hours
    - pdf slides (same as above)
    - R code:
        - [higher_dimensional.R](./R/higher_dimensional.R) A few interactive methods for visualizing more than three dimensions. A few standard data sets are used to illustrate.
        - [pipes.R](./R/pipes.R) With `tidyverse`, especially `dplyr`, `magrittr`, and `ggplot2`, using pipes to organize data analyses has become increasingly popular. Here, we illustrate how the *construction* of loon plots is done using the pipes of `magrittr`. This also involves analysis of an interesting Canadian data set on the distribution of visible minorities in cities across Canada in 2006. Some of the higher dimensional methods are used.
        - [loon.ggplot.R](./R/loon.ggplot.R) How loon and ggplot2 can work together is shown through several quick examples. This depends on the package `loon.ggplot`, available on GitHub.
        - [teachingDemo.R](./R/teachingDemo.R) A few demonstrations are shown where `loon` can be (and has been) used in the classroom. The demonstration code is available and can be adapted to new lessons.

By then we should be exhausted. At least I will be. :-)