geom_dotplot - wch/ggplot2 GitHub Wiki
I've added an implementation of Wilkinson-style dot plots with geom_dotplot.
The parameters are:
- binwidth: width of the bins.
- method: "dotdensity" (default), which uses the dot-density algorithm from Wilkinson (1999), or "histodot", which uses fixed-width bins, just like a regular histogram.
- stackratio: vertical spacing between dots, relative to the dot diameter (default 1).
- dotsize: diameter of each dot, relative to the maximum bin width (for dotdensity) or the bin width (for histodot).
- binaxis: which axis to bin along. "x" (default) or "y".
- stackdir: the way to stack the dots. "up", "down", "center", or "centerwhole". See below for examples.
- It also supports alpha, colour, and fill.
- binpositions: this is used for dot-density binning. "bygroup" (default) tells it to find bin positions for each group. "all" tells it to find bin positions across all groups. This is used for aligning dot stacks across groups.
- stackgroups: should dots be stacked across groups? This has the effect that ‘position = "stack"’ should have, but can't (because this geom has some odd properties).
There are some weird things about it at this point:
- If stacking along the y axis (binning along x), the y axis label is "count" and the y axis has a total range of 1, but these are meaningless. You can hide them with scale_y_continuous(name = "", breaks = NULL).
This happens because the dots are stacked visually, which may or may not align with the y tick marks. Unfortunately, it's not possible with ggplot2 to align the circles to a y scale. (To see how this works, try resizing the window vertically -- the dots stay visually stacked but the y scaling changes.)
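Hiding the axis can be tried on a small example. This is a minimal sketch, assuming a recent ggplot2 release, where breaks are suppressed with NULL (NA is rejected there); the data frame is made up for illustration.

```r
library(ggplot2)

set.seed(122)
dat <- data.frame(x = rnorm(20))

# Bin along x, stack along y, and drop the meaningless "count" axis:
# no axis title, no tick marks, no grid lines on y.
p <- ggplot(dat, aes(x)) +
  geom_dotplot(binwidth = 0.4) +
  scale_y_continuous(name = NULL, breaks = NULL)

p
```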
Other notes:
- Coord transforms (other than coord_flip) don't work. At this point I don't know if they even make sense conceptually for these objects.
- With dot-density binning, it is possible for dot stacks to overlap with each other, by up to 50% of the dot width. This is a consequence of the binning algorithm. Wilkinson also mentions a smoothing algorithm, but I haven't implemented this yet.
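The overlap is easier to see with the binning written out. What follows is a simplified base R sketch in the spirit of dot-density binning, not the code used by geom_dotplot: the helper name dot_density_bins is hypothetical, and it assumes each bin starts at the first unbinned observation, spans at most binwidth, and is drawn at the midpoint of the observations it contains.

```r
# Hypothetical helper, not part of ggplot2: simplified one-dimensional
# dot-density binning.
dot_density_bins <- function(x, binwidth) {
  x <- sort(x[!is.na(x)])
  centers <- numeric(0)
  counts <- integer(0)
  i <- 1
  while (i <= length(x)) {
    # A bin starts at the first unbinned value and spans at most `binwidth`.
    in_bin <- which(x >= x[i] & x < x[i] + binwidth)
    # The stack is centered over the observations that fell into the bin.
    centers <- c(centers, (min(x[in_bin]) + max(x[in_bin])) / 2)
    counts <- c(counts, length(in_bin))
    # Continue with the first value that was not binned yet.
    i <- max(in_bin) + 1
  }
  data.frame(center = centers, count = counts)
}

bins <- dot_density_bins(c(0, 0.1, 0.3, 0.45, 0.5, 1.2), binwidth = 0.4)
bins
```

In this toy input the first two stacks end up at 0.15 and 0.475, which is 0.325 apart. That is less than the dot diameter of 0.4, so the two stacks overlap slightly.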
Examples
# Load ggplot2 and generate data
library(ggplot2)
set.seed(122)
dat <- data.frame(x=rnorm(20), y=rnorm(20))
Bin along x axis
# Stack vertically
dp1 <- ggplot(dat, aes(x)) + geom_rug() + scale_x_continuous(breaks=seq(-4,4,.4))
dp1 + geom_dotplot(binwidth=.4)
# Notice each dot stack is centered over a set of observations. The binning is done with
# Wilkinson's (1999) dot density algorithm. 'binwidth' sets the maximum bin width.
# The y range is set from 0 to 1, but the y axis scale actually has nothing
# to do with the y positioning of the dots. The dot diameter is the same as the maximum
# bin width and they're stacked visually; if you resize the window to make it taller
# or shorter, they stay visually stacked. You could resize the window so that the dots
# align with the tick marks.
# Use histodot binning
# This uses the same algorithm as stat_bin, with fixed-width intervals. However, I
# couldn't directly use stat_bin because I needed to generalize the binning to work
# along x and y.
dp1 + geom_dotplot(binwidth=.4, method="histodot")
# Squish together vertically with smaller stackratio
dp1 + geom_dotplot(binwidth=.4, stackratio=.8)
# Dot diameter expanded to 1.4 * max binwidth. Stacking stays so that they're just touching
dp1 + geom_dotplot(binwidth=.4, dotsize=1.4)
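The fixed-width binning behind method="histodot" can be pictured with a few lines of base R. This is only a sketch, under the assumption that bin edges sit at multiples of binwidth; the bin origin that stat_bin or geom_dotplot actually chooses may differ, and the names bin_index, counts, and centers are illustrative only.

```r
set.seed(122)
x <- rnorm(20)
binwidth <- 0.4

# Assumption: bin edges sit at multiples of binwidth (..., -0.4, 0, 0.4, ...).
bin_index <- floor(x / binwidth)

# Number of observations (dots) in each occupied bin.
counts <- table(bin_index)

# With fixed-width bins every stack sits at the bin midpoint,
# not over the observations themselves as with dot-density binning.
centers <- (as.numeric(names(counts)) + 0.5) * binwidth

data.frame(center = centers, count = as.vector(counts))
```

Because the centers fall on a regular grid, neighbouring stacks are always at least one bin width apart, which is why histodot stacks do not overlap the way dot-density stacks can.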
Stacking methods
# stack up (default)
dp1 + geom_dotplot(binwidth=.4, stackdir="up")
# stack down
dp1 + geom_dotplot(binwidth=.4, stackdir="down")
# stack center
dp1 + geom_dotplot(binwidth=.4, stackdir="center")
# stack centerwhole - add one dot up, then one down, then one up, etc.
dp1 + geom_dotplot(binwidth=.4, stackdir="centerwhole")
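The four stacking directions differ only in where the dot centers of a stack are placed relative to the baseline. The helper below is a hypothetical illustration, not a ggplot2 function; it assumes a spacing of exactly one dot diameter (stackratio = 1) and returns offsets in units of the dot diameter.

```r
# Hypothetical helper, not a ggplot2 function: offsets of the dot centers
# in one stack of n dots, measured in dot diameters from the baseline.
stack_offsets <- function(n, stackdir = c("up", "down", "center", "centerwhole")) {
  stackdir <- match.arg(stackdir)
  i <- seq_len(n)
  switch(stackdir,
    up          = i - 0.5,                    # first dot rests on the baseline
    down        = -(i - 0.5),                 # mirror image of "up"
    center      = i - 1 - (n - 1) / 2,        # stack is symmetric about the baseline
    centerwhole = i - 1 - floor((n - 1) / 2)  # centered, but dots stay on whole steps
  )
}

stack_offsets(4, "center")       # -1.5 -0.5  0.5  1.5
stack_offsets(4, "centerwhole")  # -1  0  1  2
```

With an even number of dots, "center" places the baseline between two dots, while "centerwhole" keeps one dot on the baseline and lets the extra dot go to one side.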
Bin along y axis
To bin along the y axis, you need to set binaxis="y".
# Y direction
dp1y <- ggplot(dat, aes(x=0, y=y)) + geom_rug() + scale_y_continuous(breaks=seq(-4,4,.4))
dp1y + geom_dotplot(binwidth=.4, binaxis="y", stackdir="center")
# Y direction, stack centerwhole
dp1y + geom_dotplot(binwidth=.4, binaxis="y", stackdir="centerwhole")
Grouped data
# New data with x and g as factors
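# (stringsAsFactors=TRUE is needed in R >= 4.0, where strings are no longer
# converted to factors by default; as.numeric(x) and nlevels() below rely on factors)
dat2 <- data.frame(x=LETTERS[1:3], y=rnorm(90), g=LETTERS[1:2], stringsAsFactors=TRUE)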
# Plot with groups on x axis
dp2 <- ggplot(dat2, aes(x=x, y=y)) + scale_y_continuous(breaks=seq(-4,4,.4))
dp2 + geom_dotplot(binwidth=.25, binaxis="y", stackdir="centerwhole")
# Groups on x axis with violins (also smaller bin size)
dp2 + geom_violin() +
  geom_dotplot(binwidth=.15, binaxis="y", stackdir="center")
# With boxplots and violins
dp2 + geom_violin() +
  geom_boxplot(width=.2, outlier.colour=NA) +
  geom_dotplot(alpha=.3, binwidth=.15, binaxis="y", stackdir="center")
# Above corresponding box plots
# This uses a little hack to move the dots above the boxplots
dp2 + geom_boxplot(width=.4) +
  geom_dotplot(aes(x=as.numeric(x)+.2, group=x),
               binwidth=0.15, binaxis="y", stackdir="up") +
  coord_flip()
# Beside corresponding box plots
# This uses a hack to move the dot clusters and boxplots: convert their x-values
# to continuous, then make the continuous axis look like it is discrete
dp2 +
  geom_boxplot(aes(x=as.numeric(x) - 0.2, group=x), width=0.4) +
  geom_dotplot(aes(x=as.numeric(x) + 0.2, group=x),
               binwidth=0.15, binaxis="y", stackdir="center") +
  scale_x_continuous(breaks=1:nlevels(dat2$x), labels=levels(dat2$x))
# Dodging, mapping "x" to fill instead of x
ggplot(dat2, aes(x="foo", y=y, fill=x)) + scale_y_continuous(breaks=seq(-4,4,.4)) +
  geom_dotplot(binwidth=.25, alpha=.4, position="dodge", binaxis="y", stackdir="center")
# Grouping on x and g, dodging
ggplot(dat2, aes(x=x, y=y, fill=g)) + scale_y_continuous(breaks=seq(-4,4,.4)) +
  geom_dotplot(binwidth=.2, alpha=.2, position="dodge", binaxis="y", stackdir="center")
# These clusters don't have a "real" x width, so dodging is a bit weird. In this case
# the clusters are too close together, but if you just make the window wider, the clusters
# will move apart (within each cluster the dots will stay together).
# Stacking groups, using dotdensity
ggplot(dat2, aes(x=y, fill=x)) +
  geom_dotplot(binwidth=.25, stackgroups=TRUE, binpositions="all")
# Stacking groups, using histodot
ggplot(dat2, aes(x=y, fill=x)) +
  geom_dotplot(binwidth=.25, stackgroups=TRUE, method="histodot")
# Stacking groups, using histodot, along y axis
ggplot(dat2, aes(x=1, y=y, fill=x)) +
  geom_dotplot(binaxis="y", binwidth=.25, stackgroups=TRUE, method="histodot")