Data visualisation

Bio300B Lecture 4

Richard J. Telford

Institutt for biovitenskap, UiB

22 September 2023

  • A picture is worth a thousand words
  • Tell a story with figures
  • Avoid common mistakes

“reflect the data, tell a story, and look professional” Wilke


ggplot2

  • one of at least three schemes for graphics in R
  • part of tidyverse

A system for ‘declaratively’ creating graphics, based on “The Grammar of Graphics”.

You provide the data, tell ‘ggplot2’ how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.

ggplot in action

library(ggplot2)
library(palmerpenguins)             # provides the penguins data

plot <- ggplot(data = penguins,     # Data
       mapping = aes(               # Aesthetics
         x = body_mass_g,
         y = bill_length_mm,
         colour = species)) +
  geom_point() +                    # Geometries
  scale_colour_brewer(palette = "Set1") + # Scales
  labs(x = "Body mass, g",          # Labels
       y = "Bill length, mm",
       colour = "Species") +
  theme_bw()                        # Themes
                                    # Also facets

Data

Tibble or data frame with data to be plotted.

Tidy data

Can process data within ggplot but usually best to do it first

Can add data to the whole plot or to individual geoms

# needs dplyr for group_by() and summarise()
penguin_summary <- penguins |>
  group_by(species) |>
  summarise(body_mass_g = mean(body_mass_g, na.rm = TRUE),
            bill_length_mm = mean(bill_length_mm, na.rm = TRUE))

ggplot(penguins, aes(x = body_mass_g, y = bill_length_mm, colour = species)) +
  geom_point() +
  geom_text(aes(label = species), data = penguin_summary, colour = "black")


Aesthetics

The mapping argument specifies, with aes(), which variables in the data are mapped onto which aesthetics

Each geom takes different aesthetics

Common aesthetics

  • x, y
  • fill, colour
  • shape
  • linetype
  • group

Setting vs mapping

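Mapping in aes()

ggplot(penguins,
       aes(x = flipper_length_mm,
           fill = "blue")) +     # "blue" is treated as data, not as a colour
  geom_histogram()

Setting in the geom

ggplot(penguins,
       aes(x = flipper_length_mm)) +
  geom_histogram(fill = "blue")  # bars are drawn in blue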
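
The difference between mapping and setting can be checked by looking at the fill values each layer actually draws. This is a minimal sketch, assuming the palmerpenguins package supplies penguins; layer_data() is ggplot2's helper for inspecting a built layer, and the object names are illustrative only.

```r
library(ggplot2)
library(palmerpenguins)  # assumed source of the penguins data

# Mapping: "blue" is treated as a one-level variable, so the fill
# comes from the default discrete scale and a legend is drawn.
p_mapped <- ggplot(penguins, aes(x = flipper_length_mm, fill = "blue")) +
  geom_histogram(bins = 30)

# Setting: the fill is a fixed property of the geom, so there is
# no scale and no legend.
p_set <- ggplot(penguins, aes(x = flipper_length_mm)) +
  geom_histogram(fill = "blue", bins = 30)

# Fill colours actually used by each layer
fill_mapped <- unique(layer_data(p_mapped)$fill)
fill_set <- unique(layer_data(p_set)$fill)
```

Printing fill_set should give "blue", whereas fill_mapped holds whatever colour the default discrete fill scale assigns to the single level "blue".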


Geometries

Use different geoms for different plot types

Important geoms

  • geom_point()
  • geom_boxplot()
  • geom_histogram()
  • geom_smooth()
  • geom_line()
  • geom_text()

Many geoms, some in extra packages

Geoms to show distributions

base <- ggplot(penguins, aes(x = flipper_length_mm))
hist <- base + geom_histogram()
dens <- base + geom_density()

Geoms to show many distributions

base <- ggplot(penguins, aes(x = species, y = flipper_length_mm))

p_prange <- base + stat_summary(fun = "mean", geom = "col")
p_box <- base + geom_boxplot(aes(fill = species))
p_vio <- base + geom_violin(aes(fill = species))
p_jit <- base + geom_jitter(aes(colour = species))
p_quasi <- base + ggbeeswarm::geom_quasirandom(aes(colour = species))

Boxplots can mislead

p <- datasauRus::box_plots |> 
  pivot_longer(everything()) |> 
  ggplot(aes(x = name, y = value))

# with patchwork loaded, the second `p` starts a new panel beside the first
p + geom_boxplot() +
p + geom_violin()

Show the raw data

Top-left panel shows mean + SE only; top-right panel shows mean + SE together with widely spread jittered raw data. Bottom panels show the same with more data, so the SEs are smaller.

Geoms for scatterplots

ggplot(penguins, aes(x = body_mass_g,  y = bill_length_mm, colour = species)) +
  geom_point() +
  geom_smooth(method = "lm")


Scales

Control

  • how variables are mapped onto the aesthetics
  • axis breaks

All named scale_<aesthetic>_<description>

  • scale_x_log10()
  • scale_y_reverse()
  • scale_colour_viridis_c()
  • scale_shape_manual()
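
The naming pattern can be seen by adding one scale per aesthetic to a single plot. This is a minimal sketch, assuming the palmerpenguins package supplies penguins; the breaks, palette and shape codes are arbitrary choices for illustration.

```r
library(ggplot2)
library(palmerpenguins)  # assumed source of the penguins data

p_scaled <- ggplot(penguins,
                   aes(x = body_mass_g, y = bill_length_mm,
                       colour = species, shape = species)) +
  geom_point() +
  # x aesthetic: log10 transformation with hand-picked breaks
  scale_x_log10(breaks = c(3000, 4000, 5000, 6000)) +
  # colour aesthetic: qualitative ColorBrewer palette
  scale_colour_brewer(palette = "Dark2") +
  # shape aesthetic: one plotting symbol per species, chosen by hand
  scale_shape_manual(values = c(Adelie = 16, Chinstrap = 17, Gentoo = 15))
```

Each scale touches only its own aesthetic, so they can be added in any order.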


Labels

  • plot, axis and legend titles

ggplot(penguins, aes(x = body_mass_g, y = bill_length_mm, colour = species)) +
  geom_point() +
  labs(x = "Body mass, g",
       y = "Bill length, mm",
       colour = "Species",
       title = "Bill length against body mass")


Facets

Split data into separate panels.

plot + facet_wrap(facets = vars(species))

facet_grid() for two dimensional arrays of subplots

plot + facet_grid(rows = vars(species),
                  cols = vars(island))


Themes

Change how non-data elements of the plot look

Entire themes


Can also change individual elements

plot + theme(legend.position = "top")

Removing elements

plot + theme(panel.grid = element_blank())
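
Entire themes and individual element changes can be combined. This is a minimal sketch, assuming the palmerpenguins package supplies penguins; theme_minimal() is just one of the complete themes shipped with ggplot2, and the object names are illustrative.

```r
library(ggplot2)
library(palmerpenguins)  # assumed source of the penguins data

base_plot <- ggplot(penguins,
                    aes(x = body_mass_g, y = bill_length_mm,
                        colour = species)) +
  geom_point()

# Complete theme first, then individual tweaks on top of it
p_themed <- base_plot +
  theme_minimal() +
  theme(legend.position = "top",            # move the legend
        panel.grid.minor = element_blank()) # remove minor grid lines
```

Order matters here: a complete theme added after theme() would replace the individual tweaks.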

Colour & fills

Colour deficient vision

den <- ggplot(penguins, aes(x = bill_length_mm, fill = species)) +
  geom_density(alpha = 0.7)

#End rainbow

Better colour scale

den <- ggplot(penguins, aes(x = bill_length_mm, fill = species)) +
  geom_density(alpha = 0.7) +
  scale_fill_brewer(palette = "Set1")

Using colour effectively

Choose an appropriate palette.

Qualitative palettes

RColorBrewer::display.brewer.all(type = "qual")

Sequential palettes

RColorBrewer::display.brewer.all(type = "seq")

Diverging palettes

RColorBrewer::display.brewer.all(type = "div")
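
The three palette types roughly correspond to three kinds of variable. This is a minimal sketch, assuming the palmerpenguins package supplies penguins; the palette names and the derived column bill_dev are arbitrary choices for illustration, not taken from the lecture.

```r
library(ggplot2)
library(palmerpenguins)  # assumed source of the penguins data

# Qualitative palette: unordered categories
p_qual <- ggplot(penguins,
                 aes(x = body_mass_g, y = flipper_length_mm,
                     colour = species)) +
  geom_point() +
  scale_colour_brewer(palette = "Dark2")

# Sequential palette: continuous variable running from low to high
p_seq <- ggplot(penguins,
                aes(x = body_mass_g, y = flipper_length_mm,
                    colour = bill_length_mm)) +
  geom_point() +
  scale_colour_distiller(palette = "YlGnBu", direction = 1)

# Diverging palette: continuous variable with a meaningful midpoint,
# here the deviation from the mean bill length (hypothetical column)
penguins_dev <- penguins
penguins_dev$bill_dev <- penguins_dev$bill_length_mm -
  mean(penguins_dev$bill_length_mm, na.rm = TRUE)
max_dev <- max(abs(penguins_dev$bill_dev), na.rm = TRUE)

p_div <- ggplot(penguins_dev,
                aes(x = body_mass_g, y = flipper_length_mm,
                    colour = bill_dev)) +
  geom_point() +
  # symmetric limits keep the palette's midpoint at zero
  scale_colour_distiller(palette = "RdBu", limits = c(-max_dev, max_dev))
```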


ggplot(penguins, aes(x = body_mass_g, y = flipper_length_mm)) +
  geom_point(aes(colour = flipper_length_mm))


ggplot(penguins, aes(x = body_mass_g, y = flipper_length_mm)) +
  geom_point(colour = "red") +
  gghighlight::gghighlight(species == "Chinstrap")

Redundant encoding

ggplot(penguins,
       aes(x = body_mass_g,
           y = flipper_length_mm,
           colour = species,
           shape = species)) +
  geom_point()

Avoiding legends
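
One possible way to avoid a legend is to label the groups directly on the plot and then switch the legend off. This is a minimal sketch, assuming the palmerpenguins package supplies penguins; species_centres and p_direct are illustrative names, and placing labels at the group means is only one of several options.

```r
library(ggplot2)
library(dplyr)
library(palmerpenguins)  # assumed source of the penguins data

# One label position per species: the group means
species_centres <- penguins |>
  group_by(species) |>
  summarise(body_mass_g = mean(body_mass_g, na.rm = TRUE),
            flipper_length_mm = mean(flipper_length_mm, na.rm = TRUE))

p_direct <- ggplot(penguins,
                   aes(x = body_mass_g, y = flipper_length_mm,
                       colour = species)) +
  geom_point(alpha = 0.4) +
  # labels drawn from the summary data, not from the full data set
  geom_text(aes(label = species), data = species_centres,
            colour = "black", fontface = "bold") +
  # the labels make the legend redundant
  theme(legend.position = "none")
```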


Most common mistake in presentations

plot with very small labels


theme_bw(base_size = 18)


  • You can plot anything you can imagine
  • Whole ecosystem of packages to help
  • #tidytuesday for inspiration