I have made several plots with ggplot2
in the past 2 years and occasionally got errors related to the group aesthetics. I solved these issues without once taking the time to fully understand how the group aesthetic works. This blogpost is a result of my experiments to finally explore how it works. My understanding is a combination of my experiments and Hadley Wickhams outstanding book about ggplot2. (https://github.com/hadley/ggplot2-book)
Scenario 1: mapping based on one variable
Our dummy data will be a unit square. We labeled its points as common in maths: counter-clockwise.
dt <- data.table(
x = c(0, 0, 1, 1),
y = c(0, 1, 0, 1),
grp = c('red', 'blue', 'blue', 'red'),
id = c('A', 'D', 'B', 'C')
)
Case 1: we have a global mapping applicable to lines
dt %>% ggplot(aes(x, y, col = I(grp))) +
geom_point(size = 3) +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
We may want to join the points with segments. By default, the line geom inherits the color argument from the call to ggplot
. The group aesthetic is a combination of all discrete mappings, in this case the default group for the line geom will be the same as for the colors.
dt %>% ggplot(aes(x, y, col = I(grp))) +
geom_point(size = 3) +
geom_line() +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
We may want to link the points with segments joining together. One option is to overwrite the color mapping for this layer.
dt %>% ggplot(aes(x, y, col = I(grp))) +
geom_point(size = 3) +
geom_line(aes(col = NULL)) +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
Another option is to overwrite the silently set group
aesthetic and setting it constant. This way the segments will inherit their color from their (left) neighbour point.
dt %>% ggplot(aes(x, y, col = I(grp))) +
geom_point(size = 3) +
geom_line(aes(group = 'arbitrary_constant_value')) +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
What order are the points linked? Now left-to-right, and order of appearance in data is coincidentally the same order as the points appear in the dataset, so we have to make further experiments.
dt_mixed <- data.table(
x = c(1, 0, 1, 0),
y = c(1, 0, 0, 1),
grp = c('red', 'red', 'blue', 'blue'),
id = c('A', 'C', 'B', 'D')
)
dt_mixed %>% ggplot(aes(x, y, col = I(grp))) +
geom_point(size = 3) +
geom_line(aes(group = 'arbitrary_constant_value')) +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
We can safely conclude that the order in which the points are linked is left-to-right, order of appearance.
Case 2: we have a global mapping not applicable to lines
A slightly different case when the aesthetic defined in the ggplot
call is not applicable to lines. It will still effect the silently set group aesthetics.
dt %>% ggplot(aes(x, y, pch = grp)) +
geom_point(size = 3) +
geom_line() +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
We can still overwrite the pch
aesthetics in the geom_line
call thus silently unsetting the group
variable.
dt %>% ggplot(aes(x, y, pch = grp)) +
geom_point(size = 3) +
geom_line(aes(pch = NULL)) +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
However in this case explicitly setting the group
aesthetics is inarguably more clear. We even got a warning saying “Ignoring unknown aesthetics: shape”.
dt %>% ggplot(aes(x, y, pch = grp)) +
geom_point(size = 3) +
geom_line(aes(group = 1)) +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
What happens when the group variable is a combination of more mappings?
Scenario 2: mapping based on two variables
Case 1: one discrete and one continuous variable
To join the points in a group with line segments we need at least two points so we need to define a slightly bigger dataset to experiment with.
dt <- data.table(
x = c(0,2,4,5,4,2,0,-1),
y = c(0,-1,0,2,4,5,4,2),
grp_1 = c('red','blue','red','blue','red','blue','red','blue'),
grp_2 = c(2,2,4,4,2,2,4,4),
id = LETTERS[1:8]
)
dt %>% ggplot(aes(x, y, col = I(grp_1), size = I(grp_2))) +
geom_point() +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
By default we will have 2 group of 4 points linked together: one group for each value of the discrete variable used in the aesthetics call in ggplot
.
dt %>% ggplot(aes(x, y, col = I(grp_1), size = I(grp_2))) +
geom_point() +
geom_line() +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
One option is to overwrite the color and size mapping in geom_line
:
dt %>% ggplot(aes(x, y, col = I(grp_1), size = I(grp_2))) +
geom_point() +
geom_line(aes(col = NULL, size = NULL)) +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
Another option is to set the group aesthetics to a constant:
dt %>% ggplot(aes(x, y, col = I(grp_1), size = I(grp_2))) +
geom_point() +
geom_line(aes(group = 1)) +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
Case 2: two discrete variables
dt <- data.table(
x = c(0,2,4,5,4,2,0,-1),
y = c(0,-1,0,2,4,5,4,2),
grp_1 = c('red','blue','red','blue','red','blue','red','blue'),
grp_2 = c('a', 'a', 'b', 'b', 'a', 'a', 'b', 'b'),
id = LETTERS[1:8]
)
By default we have 4 pair of pairwise linked points: one for each combination of the two discrete variables used in the aesthetics call of ggplot
.
dt %>% ggplot(aes(x, y, col = I(grp_1), pch = grp_2)) +
geom_point(size = 3) +
geom_line() +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
We can overwride one or two of these in the aes call in geom_line
:
dt %>% ggplot(aes(x, y, col = I(grp_1), pch = grp_2)) +
geom_point(size = 3) +
geom_line(aes(pch = NULL)) +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
dt %>% ggplot(aes(x, y, col = I(grp_1), pch = grp_2)) +
geom_point(size = 3) +
geom_line(aes(col = NULL, pch = NULL)) +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
We can also set the group aesthetics to constant or combine these two approaches.
dt %>% ggplot(aes(x, y, col = I(grp_1), pch = grp_2)) +
geom_point(size = 3) +
geom_line(aes(group = 1)) +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
dt %>% ggplot(aes(x, y, col = I(grp_1), pch = grp_2)) +
geom_point(size = 3) +
geom_line(aes(group = 1, pch = NULL)) +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
What happens if we specify the group
variable outside aes
?
dt %>% ggplot(aes(x, y, col = I(grp_1), pch = grp_2)) +
geom_point(size = 3) +
geom_line(group = 1) +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
The order in which the segments are linked has been changed. But to what? Let’s find out by targeted experiments.
dt %>% ggplot(aes(x, y, col = I(grp_1), pch = grp_2)) +
geom_point(size = 3) +
geom_line(group = 1, aes(col = NULL)) +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
dt %>% ggplot(aes(x, y, col = I(grp_1), pch = grp_2)) +
geom_point(size = 3) +
geom_line(group = 1, aes(pch = NULL)) +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
dt %>% ggplot(aes(x, y, col = I(grp_1), pch = grp_2)) +
geom_point(size = 3) +
geom_line(group = 1, aes(col = NULL, pch = NULL)) +
geom_label(aes(label = id), vjust = "inward", hjust = "inward")
This way in the background we still have the group in aesthetics: it has effect although we overwrote the effect of joining lines, we did not overwrote every effect of the aesthetics mapping. So the order in which the points are linked: first levels of the group, then left to right, then order of appearance in dataset. Check for yourself with the above examples!
e.g. blue < red, circle < triangle.
Conclusion
Group is the interaction of discrete variables set inside aes
unless explicitly overwritten.
The order in which the points are linked:
- order of group levels
- left-to-right
- order of appearance in data