Data Visualization

A 4-Day Remote Seminar Taught by Kieran Healy, Ph.D.

The effective use of graphs and charts is an important way to explore data for yourself and to communicate your ideas and results to others. Being able to produce effective plots from data is also the best way to develop an eye for reading and understanding visualizations made by others, whether presented in academia, business, policy, or the media.

This seminar provides an intensive, hands-on introduction to the principles and practice of data visualization. We will begin with an overview of some basic principles. We will focus not just on the aesthetic aspects of good plots, but also on how their effectiveness is rooted in the way we perceive properties like length, absolute and relative size, orientation, shape, and color. Students will learn how to produce and refine plots using ggplot2, a powerful, versatile, and widely used visualization library for R. ggplot2 implements a “grammar of graphics” that gives us a coherent way to produce visualizations by expressing relationships between the attributes of data and their graphical representation.

Starting June 15, we are offering this seminar as a 4-day synchronous*, remote workshop. Each day will consist of a 3-hour live lecture held via the free video-conferencing software Zoom. You are encouraged to join the lecture live, but will have the opportunity to view the recorded session later in the day if you are unable to attend at the scheduled time.

Each lecture session will conclude with a hands-on exercise reviewing the content covered, to be completed on your own. Additional lab sessions will be held on Tuesday and Thursday afternoons, where you can review the exercise results with the instructor and ask questions.

*We understand that scheduling is difficult during this unpredictable time. If you prefer, you may take all or part of the course asynchronously. The video recordings will be made available within 24 hours of each session and will be accessible for two weeks after the seminar, meaning that you will get all of the class content and discussions even if you cannot participate synchronously.

Closed captioning is available for all live and recorded sessions.

More Details About the Course Content

Through a series of worked examples, students will learn how to build plots piece by piece, beginning with summaries of single variables and moving on to more complex graphics. Topics covered include plotting continuous and categorical variables; layering information on graphics; faceting grouped data to produce effective “small multiple” plots; transforming data to easily produce visual summaries on the graph such as trend lines, linear fits, error ranges, and boxplots; and creating maps, together with simpler alternatives to maps for country- or state-level data.

We will also cover cases where we are not working directly with a dataset but rather with estimates from a statistical model. Using these tools, we will then explore the practical process of refining plots to accomplish common tasks such as highlighting key features of the data, labeling particular points, annotating plots, and changing their overall appearance. Finally, we will examine some strategies for presenting graphical results in different formats (such as in print, online, or in slides) and to different sorts of audiences.

At the end of the course, participants will:
– Understand the basic principles behind effective data visualization
– Have a practical sense for why some graphs and figures work well while others may fail to inform or actively mislead
– Know how to create a wide range of plots in R using ggplot2
– Know how to refine plots for effective presentation

Computing

To participate in the hands-on exercises, you are strongly encouraged to use a laptop computer with the most recent version of R installed, together with the tidyverse collection of packages. Participants are also encouraged to download and install RStudio, a front-end for R that makes it easier to work with. All of this software is free and available for Windows, Mac, and Linux platforms.

If you’d like to take this course but are concerned that you don’t know enough R, there are excellent online resources for learning the basics. Here are our recommendations.

Who Should Register?

This course is for anyone who wants to learn how to produce, refine, and present effective visualizations generated from datasets, summary tables, or the output of statistical models.

Some familiarity with the R programming language is helpful.

Outline

1. Basic principles of data visualization
– Why look at data?
– Beyond “good taste” in graphics
– Object perception and misperception
– Encoding data graphically
2. Using ggplot2 and R
– How to think about R
– How to think about ggplot
3. Understanding the grammar of graphics
– Mapping data values to plot aesthetics
– Building plots layer by layer
4. Plots of one, two, or more continuous or categorical variables
5. Grouped data, faceting, and small multiples
6. Plots of estimates and effects
– Plotting tables of results
– Plotting directly from models
7. Plots in space and time
– Drawing maps, and their alternatives
– Animating plots
8. Refining plots for presentation
– Adding annotations or highlighting features
– Keys and labels
– Controlling overall appearance with themes
– Redrawing bad plots

Reviews of Data Visualization

“The course was incredibly well balanced, showing in logical direct ways how to build up the different layers of a plot. Course also covered different areas relevant for statisticians, epidemiologists, general scientists, journalists and other disciplines.”
  Kristian Lynch, University of South Florida

“This is an amazing comprehensive-compact course. I learned so much from it. Dr. Healy clearly explained the core concepts of data visualization while showing the power and range ggplot and tidyverse has. His mastery of the subject matter and his teaching prowess to communicate seemingly complex ideas is exceptional. I highly recommend this course to anyone looking to learn ggplot seriously.”
  Prashant Bhandari, University of Florida

“I thought this course was one of the best courses on the topic of data analysis that I have ever taken. It was so well thought out and carefully constructed to unpackage the logical steps towards using R for data viz. I am not a strong user of R at this point but this course also pushed along my understanding of R greatly!”
  Abigail Eaton, Kaiser Permanente

Seminar Information

Tuesday, June 15, 2021 – Friday, June 18, 2021

Each day will follow this schedule:

11:00am-2:00pm ET (New York time): Live lecture via Zoom

4:00pm-5:00pm ET: Live lab session via Zoom (Tuesday and Thursday only)

Payment Information

The fee of $895 includes all course materials.

PayPal and all major credit cards are accepted.

Our Tax ID number is 26-4576270.