
Getting Started

The ggplot2 package is a very powerful graphics package created by Hadley Wickham. It is not part of base R, so you will need to install it if you have not already done so.

install.packages("ggplot2")

Once it is installed, you can load the package from your R library as follows:

library(ggplot2)

Overview

ggplot2 is a graphics package based on The Grammar of Graphics, a book by Leland Wilkinson that presents a unique way of producing graphics. Creating plots in ggplot2 is different from using base graphics, but once you get used to it you will most likely find it very useful, and with practice you will be able to create just about any graphic display of data you can think of. This tutorial is just a basic introduction to the package. To find out more, visit the package’s webpage at https://ggplot2.tidyverse.org/, which has a link to a nice ggplot2 cheatsheet.

For this tutorial we will use the mtcars data, which is part of the datasets package that ships with base R. You can load it with the data() function, as I do below. I also convert some of the variables to factors. You should also open the help page for the dataset by typing ?mtcars into the R console.

data(mtcars)

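# Convert numeric codes to factors so they are treated as categories
mtcars <- within(mtcars, {
  cyl <- factor(cyl)                                    # number of cylinders
  vs <- factor(vs, labels = c("V-shaped", "straight"))  # engine: 0 = V-shaped, 1 = straight
  am <- factor(am, labels = c("automatic", "manual"))   # transmission: 0 = automatic, 1 = manual
})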
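
To confirm that the conversion did what was intended, it can help to inspect the result. The following is a minimal sketch, not part of the original tutorial; it rebuilds the converted data in a copy with the hypothetical name cars_check so that it runs on its own.

```r
# Rebuild the converted data in a separate copy (cars_check is a hypothetical name)
cars_check <- within(datasets::mtcars, {
  cyl <- factor(cyl)
  vs <- factor(vs, labels = c("V-shaped", "straight"))
  am <- factor(am, labels = c("automatic", "manual"))
})

# str() shows the type of each column; all three should now be factors
str(cars_check[, c("cyl", "vs", "am")])

# levels() lists the categories of a factor in order
levels(cars_check$cyl)
levels(cars_check$am)
```

If any of these columns still shows up as num in the str() output, the within() step was not applied to the data frame you are inspecting.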

The grammar of graphics is a principled way to combine the components of graphs, much as we combine words from the different parts of speech to construct meaningful sentences. The basic parts of graphics are listed in Programming Skills for Data Science, Ch. 16.

I will not cover all these parts in this tutorial, but will instead cover enough to get you started. Generally, you start by calling the ggplot() function and passing it a data frame and some aesthetics. Not surprisingly, the data frame is passed to the data argument. Variables from the data are passed to the mapping argument, wrapped in the aes() function. The aesthetics determine how the variables are used; options include the x and y axes, color, group, and shape. Then a geometric object (a geom) is added to draw the data. The parts of a ggplot object are combined with a plus sign (+).
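
The way the parts combine can be sketched in two steps. This is an illustrative example rather than part of the original tutorial: the object names base and scatter are hypothetical, and wt (car weight) is used on the x axis only for illustration.

```r
library(ggplot2)

# Step 1: data and aesthetic mappings only. Printing `base` would draw
# the axes but no data, because no geom has been added yet.
base <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg))

# Step 2: adding a geom with + returns a new plot object.
# `base` itself is left unchanged and can be reused with other geoms.
scatter <- base + geom_point()

print(scatter)
```

Storing the base object like this is optional, but it makes it easy to try different geoms on the same data and mappings.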

You can create many types of plots with ggplot2, but I will focus on scatterplots in this tutorial, as they are very common and useful for visualizing many statistical models.

Below I create a ggplot object from the mtcars data with mpg on the y axis and hp on the x axis. Then I add the point geom, geom_point(), to produce a scatterplot.

ggplot(data = mtcars, mapping = aes(x = hp, y = mpg)) + geom_point()

An aesthetic maps a variable onto a visual property used to display its values. So here, aes(x = hp) means to map the values of hp onto the x axis.
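
The same idea extends to aesthetics other than position. As a minimal sketch (not from the original tutorial), the example below maps cyl to point color. It converts cyl to a factor in a copy with the hypothetical name cars_col so that it runs on its own; the object name p_col is also hypothetical.

```r
library(ggplot2)

# Work on a copy and make cyl a factor so color is treated as discrete
cars_col <- within(datasets::mtcars, cyl <- factor(cyl))

# color = cyl maps each level of cyl to its own point color;
# ggplot2 adds a legend for the mapping automatically
p_col <- ggplot(data = cars_col, mapping = aes(x = hp, y = mpg, color = cyl)) +
  geom_point()

print(p_col)
```

If cyl were left numeric, ggplot2 would treat it as continuous and use a color gradient instead of distinct colors, which is one reason for converting it to a factor first.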
