Data Visualization

A 4-Day Livestream Seminar Taught by Kieran Healy, Ph.D.


The effective use of graphs and charts is an important way to explore data for yourself and to communicate your ideas and results to others. Being able to produce effective plots from data is also the best way to develop an eye for reading and understanding visualizations made by others, whether presented in academia, business, policy, or the media.

This seminar provides an intensive, hands-on introduction to the principles and practice of data visualization. We will begin with an overview of some basic principles. We will focus not just on the aesthetic aspects of good plots, but on how their effectiveness is rooted in the way we perceive properties like length, absolute and relative size, orientation, shape, and color. Students will learn how to produce and refine plots using ggplot2, a powerful, versatile, and widely used visualization library for R. ggplot2 implements a “grammar of graphics” that gives us a coherent way to produce visualizations by expressing relationships between the attributes of data and their graphical representation.
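To give a flavor of what this looks like in practice, here is a minimal sketch rather than an excerpt from the course materials. It assumes the ggplot2 package is installed and uses `mpg`, an example dataset bundled with ggplot2; the choice of variables and labels is ours, purely for illustration. Columns of the data are mapped to the x position, y position, and color of points.

```r
library(ggplot2)

# mpg is an example dataset that ships with ggplot2 (an assumption of this
# sketch; the seminar may use different data).
# aes() declares the mapping from data columns to visual properties:
# engine size -> x position, highway mileage -> y position, class -> color.
p <- ggplot(data = mpg,
            mapping = aes(x = displ, y = hwy, color = class)) +
  geom_point() +  # draw one point per row of the data
  labs(x = "Engine displacement (litres)",
       y = "Highway miles per gallon",
       color = "Vehicle class")

print(p)
```

Changing the mapping inside `aes()`, for example swapping `color = class` for `shape = drv`, changes how the same data is represented without altering the rest of the plot specification.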

Starting May 17, we are offering this seminar as a 4-day synchronous*, livestream workshop held via the free video-conferencing software Zoom. Each day will consist of two lecture sessions that include hands-on exercises, separated by a 1-hour break. Participants are encouraged to join the lecture live, but will have the opportunity to view the recorded session later in the day if they are unable to attend at the scheduled time.

*We understand that scheduling is difficult during this unpredictable time. If you prefer, you may take all or part of the course asynchronously. The video recordings will be made available within 24 hours of each session and will be accessible for four weeks after the seminar, meaning that you will get all of the class content and discussions even if you cannot participate synchronously.

Closed captioning is available for all live and recorded sessions.

More Details About the Course Content

Through a series of worked examples, students will learn how to build plots piece by piece, beginning with summaries of single variables and moving on to more complex graphics. Topics covered include plotting continuous and categorical variables; layering information on graphics; faceting grouped data to produce effective “small multiple” plots; transforming data to easily produce visual summaries on the graph such as trend lines, linear fits, error ranges, and boxplots; and creating maps, together with simpler alternatives to maps for country- or state-level data.
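As a rough illustration of layering and faceting, the sketch below builds one plot in three steps: points, a linear fit with its error band, and a split into small multiples. It again assumes ggplot2 is installed and uses the bundled `mpg` dataset; the grouping variable `drv` (drive train) is our choice for the example, not something taken from the course.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  # Layer 1: the raw observations, slightly transparent to reduce overplotting
  geom_point(alpha = 0.5) +
  # Layer 2: a linear fit computed from the data, drawn with its default
  # confidence band
  geom_smooth(method = "lm", formula = y ~ x) +
  # Facet: one panel per drive-train type, giving a "small multiples" display
  facet_wrap(~ drv)

print(p)
```

Each `+` adds a piece to the plot, so a layer can be removed or replaced without touching the others.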

We will also cover cases where we are not working directly with a dataset but rather with estimates from a statistical model. Using these tools, we will then explore the practical process of refining plots to accomplish common tasks such as highlighting key features of the data, labeling particular points, annotating plots, and changing their overall appearance. Finally, we will examine some strategies for presenting graphical results in different formats (such as in print, online, or in slides) and to different sorts of audiences.
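One common way to plot estimates from a model, sketched here under our own assumptions rather than as the method taught in the seminar, is to collect the coefficients and their confidence intervals in a data frame and plot that table. The model, the dataset (`mpg`, bundled with ggplot2), and the column names `term`, `estimate`, `lower`, and `upper` are all illustrative choices.

```r
library(ggplot2)

# Fit an ordinary linear model; the formula is only an example.
fit <- lm(hwy ~ displ + cyl + drv, data = mpg)

# Gather the estimates and 95% confidence intervals into a tidy table.
ci <- confint(fit)
est <- data.frame(term = rownames(ci),
                  estimate = coef(fit),
                  lower = ci[, 1],
                  upper = ci[, 2],
                  row.names = NULL)

# Drop the intercept so the remaining coefficients are easier to compare.
est <- est[est$term != "(Intercept)", ]

# Plot each estimate as a point with a horizontal interval, and mark zero
# with a dashed reference line.
p <- ggplot(est, aes(x = estimate, y = term, xmin = lower, xmax = upper)) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  geom_pointrange() +
  labs(x = "Estimate (95% confidence interval)", y = NULL)

print(p)
```

Because the plot is drawn from an ordinary data frame, the same refinements used for raw data (labels, annotations, themes) apply here as well.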

At the end of the course, participants will:

  • Understand the basic principles behind effective data visualization
  • Have a practical sense of why some graphs and figures work well while others fail to inform or actively mislead
  • Know how to create a wide range of plots in R using ggplot2
  • Know how to refine plots for effective presentation

Computing

To participate in the hands-on exercises, you are strongly encouraged to use a laptop computer with the most recent version of R installed, together with the tidyverse package. Participants are also encouraged to download and install RStudio, a front end that makes R easier to work with. This software is free and available for Windows, Mac, and Linux platforms. Basic familiarity with R is highly desirable, but even novice R coders should be able to follow the presentation and do the exercises.

If you’d like to take this course but are concerned that you don’t know enough R, there are excellent online resources for learning the basics. Here are our recommendations.

Who Should Register?

This course is for anyone who wants to learn how to produce, refine, and present effective visualizations generated from datasets, summary tables, or the output of statistical models.

Outline

1. Basic principles of data visualization

  • Why look at data?
  • Beyond “good taste” in graphics
  • Object perception and misperception
  • Encoding data graphically

2. Using ggplot2 and R

  • How to think about R
  • How to think about ggplot

3. Understanding the grammar of graphics

  • Mapping data values to plot aesthetics
  • Building plots layer by layer

4. Plots of one, two, or more continuous or categorical variables

5. Grouped data, faceting, and small multiples

6. Plots of estimates and effects

  • Plotting tables of results
  • Plotting directly from models

7. Plots in space and time

  • Drawing maps, and their alternatives
  • Animating plots

8. Refining plots for presentation

  • Adding annotations or highlighting features
  • Keys and labels
  • Controlling overall appearance with themes
  • Redrawing bad plots

Reviews of Data Visualization

“The instructor is personable and droll. The effort put into his slides and his ‘story’ is incredible, and shines through on every slide. He has thought through every slide (even digressions), and so he has firm command of the direction he wants to move his class in. Obviously, he is beyond expert to elite-level in his knowledge of this domain, and the set of worked examples he has provided will serve students in good stead for a long time.”
  Michael Marsiske, University of Florida

“I really enjoyed the ability to see what the instructor was doing (e.g., his keystrokes and the way he could use a spotlight to show where his mouse was). The flow of the topics was very natural and logical, and the instructor’s presentation style was very engaging and easy to follow.”
  Alicia Jaramillo-Underwood, Centers for Disease Control and Prevention

“I appreciated the concrete examples of high and low quality data visualization and continuous rationale throughout. I could tell the instructor was very knowledgeable.”
  Laura Faith, HSR&D

“This was a fantastic course all around! The instructor was very knowledgeable and kept the content/pace both interesting and engaging. It was really great to dive into working with code and examples.”
  Tim Hannigan, University of Alberta

“The fundamental approach that explained what inputs ggplot is looking for, as well as decoding all the different potential inputs in a stepwise fashion, was an incredible way to learn, and also will help when I come back later.”
  Atu Agawu, Children’s Hospital of Philadelphia

Seminar information

Tuesday, May 17 – Friday, May 20, 2022

Schedule:
All sessions are held live via Zoom.

10:30am-12:30pm ET (New York time)
1:30pm-3:00pm ET

Payment Information

The fee of $995 includes all course materials.

PayPal and all major credit cards are accepted.

Our Tax ID number is 26-4576270.