This vignette goes through all the functionality of the package. If you want to see examples with real data, you can refer to vignette('examples', 'ggHoriPlot').
The data used throughout this vignette are tables of sine waves, which aim to mimic time-series data. The data look like this:
library(tidyverse)
library(patchwork)
library(ggthemes)

# First sine wave: 300 points with growing amplitude
x = 1:300
y = x * sin(0.1 * x)
dat_tab <- tibble(x = x,
                  xend = x+0.9999,
                  y = y)

# Second sine wave: 400 points, shifted upwards by 100
x = 1:400
y = x * sin(0.2 * x) + 100
dat_tab_bis <- tibble(x = x,
                      xend = x+0.9999,
                      y = y)

tab_tot <- mutate(dat_tab, type = 'A') %>%
  bind_rows(mutate(dat_tab_bis, type='B'))

tab_tot %>%
  ggplot() +
  geom_line(aes(x, y)) +
  facet_wrap(~type, scales = 'free_y', ncol = 1) +
  theme_few()
This representation of the dataset is fine if we only have a few waves. However, if we aim to represent and compare time series with many entries, it might be challenging to plot them as line charts. A more convenient way to plot this type of dataset is a horizon plot, which condenses the data while still retaining all the information. You can learn more about horizon plots here.
ggHoriPlot allows you to easily build horizon plots in ggplot2. First we will load the package and a helper function that can be used to visualize and compare horizon plots and line charts.
library(ggHoriPlot)

plotAllLayers <- function(dat, ori, cutpoints, colors){
  # Helper function to plot the origin and cutpoints
  # of the horizon plot for comparison
  p <- ggplot()
  acc <- 1
  # Ribbons for the intervals below the origin
  for (i in cutpoints[cutpoints<=ori]) {
    colo <- colors[acc]
    p <- p + geom_ribbon(aes(x = x, y = y, ymin = y, ymax = ori),
                         fill = colo,
                         data = mutate(dat, y = ifelse(between(y, i, ori), y,
                                                       ifelse(y<ori, i, ori))))
    acc <- acc+1
  }
  # Ribbons for the intervals above the origin
  for (i in cutpoints[cutpoints>=ori]) {
    colo <- colors[acc]
    p <- p + geom_ribbon(aes(x = x, y = y, ymin = ori, ymax = y),
                         fill = colo,
                         data = mutate(dat, y = ifelse(between(y, ori, i), y,
                                                       ifelse(y>ori, i, ori))))
    acc <- acc+1
  }
  p+geom_line(aes(x, y), data=dat)+
    theme_few()
}
We are now all set! By using geom_horizon() we can add a layer in the ggplot2 framework to build a horizon plot.
a <- dat_tab %>%
  ggplot() +
  geom_horizon(aes(x = x, y=y))

a
The default ggplot2 fill colors might not be the best choice of palette for horizon plots. Instead, we can use the scale_fill_hcl() function to choose an appropriate color scheme. The default palette will color low values red and large values blue.
a <- dat_tab %>%
  ggplot() +
  geom_horizon(aes(x = x, y=y)) +
  theme_few() +
  scale_fill_hcl()

a
To understand how horizon plots are related to line charts, we can plot both side by side.
a <- dat_tab %>%
  ggplot() +
  geom_horizon(
    aes(x = x,
        y=y)
  ) +
  theme_few() +
  scale_fill_hcl()

cutpoints <- tibble(
  cuts = c(78.48221, 172.74027, 266.99833, -110.03390, -204.29196, -298.55001),
  names = c('ypos1', 'ypos2', 'ypos3', 'yneg1', 'yneg2', 'yneg3'),
  color = c("#D7E2D4", "#36ABA9", "#324DA0", "#F6DE90", "#E78200", "#A51122")
) %>%
  mutate(names = factor(names, rev(names))) %>%
  arrange(names)

# Origin: midpoint of the data range
mid <- sum(range(dat_tab$y, na.rm = TRUE))/2

b <- plotAllLayers(dat_tab, mid, cutpoints$cuts, cutpoints$color)

b/a + plot_layout(guides = 'collect', heights = c(6, 1))
The resulting figure shows how the sine curve of this example can be condensed into a stripe instead of a full line chart.
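To make the condensing step more concrete, the following base-R sketch folds the sine wave into bands by hand. It is only an illustration of the general horizon-plot idea, not ggHoriPlot's internal code: the variable names (`offset`, `band`, `folded`) are made up for this example, and the band width is assumed to be the data range divided by 6, which is consistent with the cutpoints listed above.

```r
# Sine-wave data, built the same way as dat_tab
x <- 1:300
y <- x * sin(0.1 * x)

origin <- sum(range(y)) / 2    # midpoint of the data range
width  <- diff(range(y)) / 6   # assumed band width: full range split into 6 bands

offset <- y - origin
side   <- ifelse(offset >= 0, "above", "below")

# Band index (1 = closest to the origin, 3 = farthest), clamped to 1..3
band <- pmin(pmax(ceiling(abs(offset) / width), 1), 3)

# Height of each value inside its own band, between 0 and width
folded <- abs(offset) - (band - 1) * width

head(data.frame(x, y, side, band, folded))
```

Drawing `folded` as overlaid areas, colored by `side` and `band`, gives a stripe whose height is one band width rather than the full data range.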
ggHoriPlot can also output the exact intervals for each cutpoint by simply adding fill=..Cutpoints.. in the aesthetics:
a <- dat_tab %>%
  ggplot() +
  geom_horizon(
    aes(x = x,
        y=y,
        fill=..Cutpoints..)
  ) +
  theme_few() +
  scale_fill_hcl()

cutpoints <- tibble(
  cuts = c(78.48221, 172.74027, 266.99833, -110.03390, -204.29196, -298.55001),
  names = c('ypos1', 'ypos2', 'ypos3', 'yneg1', 'yneg2', 'yneg3'),
  color = c("#D7E2D4", "#36ABA9", "#324DA0", "#F6DE90", "#E78200", "#A51122")
) %>%
  mutate(names = factor(names, rev(names))) %>%
  arrange(names)

mid <- sum(range(dat_tab$y, na.rm = TRUE))/2

b <- plotAllLayers(dat_tab, mid, cutpoints$cuts, cutpoints$color)

b/a + plot_layout(guides = 'collect', heights = c(6, 1))
The above example with default settings calculates the origin of the horizon plot as the midpoint of the data range. Sometimes, however, we might want to use some other origin for our data. In ggHoriPlot this can be achieved by specifying the desired origin argument in geom_horizon(). For example, if we want to use the median as the origin:
a <- dat_tab %>%
  ggplot() +
  geom_horizon(
    aes(x = x,
        y=y,
        fill=..Cutpoints..),
    origin = 'median'
  ) +
  theme_few() +
  scale_fill_hcl()

cutpoints <- tibble(
  cuts = c(96.20134, 190.4594, 284.7174, -92.31478, -186.57283, -280.83089),
  names = c('ypos1', 'ypos2', 'ypos3', 'yneg1', 'yneg2', 'yneg3'),
  color = c("#D7E2D4", "#36ABA9", "#324DA0", "#F6DE90", "#E78200", "#A51122")
) %>%
  mutate(names = factor(names, rev(names))) %>%
  arrange(names)

# Origin: median of the data
me <- median(dat_tab$y, na.rm = TRUE)

b <- plotAllLayers(dat_tab, me, cutpoints$cuts, cutpoints$color)

b/a + plot_layout(guides = 'collect', heights = c(6, 1))
Note that the horizon scale (that is, the regular interval that determines the cutpoints) is still the same as when using the midpoint. This might produce some cutpoints that do not entirely match the range of values. In the above example, the limit for the upper interval (the bluest interval) falls outside of the range of values. At the other end, the limit for the lower interval (the reddest interval) falls within the range of the data. All the data values that are above the upper limit or (as happens in this case) below the lower limit are colored as the closest interval.
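The cutpoints listed in the examples are consistent with a simple rule: the interval width is the data range divided by the number of cuts, and the cutpoints sit at whole multiples of that width on either side of the origin. The sketch below reproduces that arithmetic in base R. The helper `sketch_cutpoints()` is hypothetical and written only for this illustration; it is not a ggHoriPlot function and may not match the package's internals in every case.

```r
# Sine-wave data, built the same way as dat_tab
x <- 1:300
y <- x * sin(0.1 * x)

# Hypothetical helper (not part of ggHoriPlot): regular cutpoints around an origin
sketch_cutpoints <- function(y, origin, n_cuts = 6) {
  width  <- diff(range(y, na.rm = TRUE)) / n_cuts  # same width for any origin
  n_up   <- floor(n_cuts / 2)                      # intervals above the origin
  n_down <- n_cuts - n_up                          # intervals below the origin
  list(upper = origin + seq_len(n_up) * width,
       lower = origin - seq_len(n_down) * width)
}

mid_cuts <- sketch_cutpoints(y, sum(range(y)) / 2)  # midpoint origin
med_cuts <- sketch_cutpoints(y, median(y))          # median origin

mid_cuts
med_cuts

# With the median origin, the top cutpoint overshoots the data and the
# bottom cutpoint stops short of the minimum
max(med_cuts$upper) > max(y)
min(med_cuts$lower) > min(y)
```

Because the width does not depend on the origin, moving the origin shifts all cutpoints by the same amount, which is why the outer limits no longer line up with the minimum and maximum of the data.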
The origin can also be specified to be the mean:
a <- dat_tab %>%
  ggplot() +
  geom_horizon(
    aes(x = x,
        y=y,
        fill=..Cutpoints..),
    origin = 'mean'
  ) +
  theme_few() +
  scale_fill_hcl()

cutpoints <- tibble(
  cuts = c(91.89319, 186.15125, 280.40931, -96.62292, -190.88098, -285.13903),
  names = c('ypos1', 'ypos2', 'ypos3', 'yneg1', 'yneg2', 'yneg3'),
  color = c("#D7E2D4", "#36ABA9", "#324DA0", "#F6DE90", "#E78200", "#A51122")
) %>%
  mutate(
    names = factor(names, rev(names)),
    y_max = ifelse(cuts == min(cuts),
                   -Inf,
                   ifelse(
                     cuts == max(cuts),
                     Inf,
                     cuts))) %>%
  arrange(names)

# Origin: mean of the data
me <- mean(dat_tab$y, na.rm = TRUE)

b <- plotAllLayers(dat_tab, me, cutpoints$cuts, cutpoints$color)

b/a + plot_layout(guides = 'collect', heights = c(6, 1))
Alternatively, the origin might also be a manually chosen number:
a <- dat_tab %>%
  ggplot() +
  geom_horizon(
    aes(x = x,
        y=y,
        fill=..Cutpoints..),
    origin = 50
  ) +
  theme_few() +
  scale_fill_hcl()

cutpoints <- tibble(
  cuts = c(144.25806, 238.51611, 332.77417, -44.25806, -138.51611, -232.77417),
  names = c('ypos1', 'ypos2', 'ypos3', 'yneg1', 'yneg2', 'yneg3'),
  color = c("#D7E2D4", "#36ABA9", "#324DA0", "#F6DE90", "#E78200", "#A51122")
) %>%
  mutate(names = factor(names, rev(names))) %>%
  arrange(names)

b <- plotAllLayers(dat_tab, 50, cutpoints$cuts, cutpoints$color)

b/a + plot_layout(guides = 'collect', heights = c(6, 1))
If we specify the origin to be 'quantiles', then the origin will be set to the median and the cutpoints will be set to equally sized quantiles:
a <- dat_tab %>%
  ggplot() +
  geom_horizon(
    aes(x = x,
        y=y,
        fill=..Cutpoints..),
    origin = 'quantiles'
  ) +
  theme_few() +
  scale_fill_hcl()

cutpoints <- tibble(
  cuts = c(43.02642, 124.15063, 266.99833, -45.43396, -119.31147, -298.55001),
  names = c('ypos1', 'ypos2', 'ypos3', 'yneg1', 'yneg2', 'yneg3'),
  color = c("#D7E2D4", "#36ABA9", "#324DA0", "#F6DE90", "#E78200", "#A51122")
) %>%
  mutate(names = factor(names, rev(names))) %>%
  arrange(names)

me <- median(dat_tab$y, na.rm = TRUE)

b <- plotAllLayers(dat_tab, me, cutpoints$cuts, cutpoints$color)

b/a + plot_layout(guides = 'collect', heights = c(6, 1))
Note that this might produce intervals that do not have the same size, which can be undesirable and/or deceiving.
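As a rough illustration of why the intervals differ in size, the base-R sketch below splits the same sine wave into six equally populated groups with `quantile()`. This assumes that "equally sized quantiles" means probabilities spaced evenly between 0 and 1; the exact quantile definition used inside ggHoriPlot is not stated here, so the boundaries may differ slightly from the package's cutpoints.

```r
# Sine-wave data, built the same way as dat_tab
x <- 1:300
y <- x * sin(0.1 * x)

# Assumption: six equally populated groups, hence seven boundaries
probs <- seq(0, 1, length.out = 7)
q <- quantile(y, probs = probs)

# Width of each interval in data units; these are generally unequal
widths <- diff(unname(q))

q
widths
```

The middle boundary is the median, and the outermost boundaries are the minimum and maximum of the data. Each interval holds roughly the same number of points, but covers a different span of y values.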
Sometimes we are not interested in plotting values both above and below the origin. In those cases, we can specify the origin to be the smallest value by setting origin='min', so all values are above the origin.
a <- dat_tab %>%
  ggplot() +
  geom_horizon(
    aes(x = x,
        # xend = xend,
        y=y,
        fill=..Cutpoints..),
    horizonscale = 6,
    origin = 'min'
  ) +
  theme_few() +
  scale_fill_hcl()

cutpoints_a <- tibble(
  cuts = c(-15.78, 78.48, 172.74, 266.998, -110.034, -204.292, -298.55),
  color = c("#D7E2D4", "#36ABA9", "#324DA0", 'white', "#F6DE90", "#E78200", "#A51122")
)

cutpoints_a <- cutpoints_a %>% arrange(desc(cuts))

b <- plotAllLayers(dat_tab, -298.55, cutpoints_a$cuts, cutpoints_a$color)

(b/a) + plot_layout(guides = 'collect', heights = c(6, 1))
For this example the red and blue coloring does not make much sense. Instead, you can choose another hcl palette and specify it in scale_fill_hcl(). For example, a sequential palette is much more appropriate:
a <- dat_tab %>%
  ggplot() +
  geom_horizon(
    aes(x = x,
        # xend = xend,
        y=y,
        fill=..Cutpoints..),
    horizonscale = 6,
    origin = 'min'
  ) +
  theme_few() +
  scale_fill_hcl(palette = 'Purple-Orange', reverse = TRUE)

cutpoints_a <- tibble(
  cuts = c(-15.78, 78.48, 172.74, 266.998, -110.034, -204.292, -298.55),
  color = c("#B76AA8", "#8F4D9F", "#5B3794", 'white', "#D78CB1", "#F1B1BE", "#F8DCD9")
)

cutpoints_a <- cutpoints_a %>% arrange(desc(cuts))

b <- plotAllLayers(dat_tab, -298.55, cutpoints_a$cuts, cutpoints_a$color)

(b/a) + plot_layout(guides = 'collect', heights = c(6, 1))
You can list all available palettes by running hcl.pals().
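For instance, the palette names can be browsed by type, and the colors of a given palette previewed, with base R's grDevices functions (available from R 3.6.0). This sketch only uses `hcl.pals()` and `hcl.colors()`; it assumes, as the text above suggests, that the names they report are the ones accepted by the `palette` argument of scale_fill_hcl().

```r
# All sequential palettes shipped with grDevices
seq_pals <- hcl.pals(type = "sequential")
seq_pals

# Check that the palette used above is among them
"Purple-Orange" %in% seq_pals

# Preview six colors from that palette, in reversed order
cols <- hcl.colors(6, palette = "Purple-Orange", rev = TRUE)
cols
```

The other palette types accepted by `hcl.pals()` are "qualitative", "diverging" and "divergingx"; diverging palettes suit an origin in the middle of the data, while sequential ones suit origin = 'min'.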
Apart from the origin, ggHoriPlot also allows you to customize the horizon scale, that is, the number of cuts and where they happen. The default number of cuts is set to 6, as in all of the examples above, but it can be set to any other integer, such as 5 intervals:
a <- dat_tab %>%
  ggplot() +
  geom_horizon(
    aes(x = x,
        # xend = xend,
        y=y,
        fill=..Cutpoints..),
    horizonscale = 5,
    origin = 'midpoint'
  ) +
  theme_few() +
  scale_fill_hcl()

cutpoints <- tibble(
  cuts = c(97.33383, 210.44349, -128.88551, -241.99518, -355.10485),
  names = c('ypos1', 'ypos2', 'yneg1', 'yneg2', 'yneg3'),
  color = c("#69BBAB", "#324DA0", "#FEFDBE", "#EB9C00", "#A51122")
) %>%
  mutate(names = factor(names, rev(names))) %>%
  arrange(names)

mid <- sum(range(dat_tab$y, na.rm = TRUE))/2

b <- plotAllLayers(dat_tab, mid, cutpoints$cuts, cutpoints$color)

b/a + plot_layout(guides = 'collect', heights = c(5, 1))
or 10 intervals instead:
a <- dat_tab %>%
  ggplot() +
  geom_horizon(
    aes(x = x,
        # xend = xend,
        y=y,
        fill=..Cutpoints..),
    horizonscale = 10,
    origin = 'midpoint'
  ) +
  theme_few() +
  scale_fill_hcl()

cutpoints <- tibble(
  cuts = c(40.77899, 97.33383, 153.88866, 210.44349, 266.99833,
           -72.33068, -128.88551, -185.44035, -241.99518, -298.55001),
  names = c('ypos1', 'ypos2', 'ypos3', 'ypos4', 'ypos5', 'yneg1', 'yneg2', 'yneg3', 'yneg4', 'yneg5'),
  color = c("#E5F0D6", "#ACD2BB", "#4EB2A9", "#0088A7", "#324DA0",
            "#FAEDA9", "#F1C363", "#E98E00", "#DC4A00", "#A51122")
) %>%
  mutate(names = factor(names, rev(names))) %>%
  arrange(names)

mid <- sum(range(dat_tab$y, na.rm = TRUE))/2

b <- plotAllLayers(dat_tab, mid, cutpoints$cuts, cutpoints$color)

b/a + plot_layout(guides = 'collect', heights = c(10, 1))
Finally, we can also specify our own intervals by providing a vector of cutpoints:
a <- dat_tab %>%
  ggplot() +
  geom_horizon(
    aes(x = x,
        y=y,
        fill=..Cutpoints..),
    horizonscale = c(78.48221, 172.74027,
                     266.99833, -110.03390,
                     -204.29196, -298.55001),
    origin = -15.77584
  ) +
  theme_few() +
  scale_fill_hcl()

cutpoints <- tibble(
  cuts = c(78.48221, 172.74027, 266.99833, -110.03390, -204.29196, -298.55001),
  names = c('ypos1', 'ypos2', 'ypos3', 'yneg1', 'yneg2', 'yneg3'),
  color = c("#D7E2D4", "#36ABA9", "#324DA0", "#F6DE90", "#E78200", "#A51122")
) %>%
  mutate(names = factor(names, rev(names))) %>%
  arrange(names)

mid <- sum(range(dat_tab$y, na.rm = TRUE))/2

b <- plotAllLayers(dat_tab, mid, cutpoints$cuts, cutpoints$color)

b/a + plot_layout(guides = 'collect', heights = c(6, 1))
Some data might have start and end points for x values. If that is the case, the line chart will have a step-like shape. ggHoriPlot can also plot this kind of data. We simply need to specify the end coordinates using the xend aesthetic inside geom_horizon():
a <- dat_tab %>%
  ggplot() +
  geom_horizon(
    aes(x = x,
        xend = xend,
        y=y,
        fill=..Cutpoints..),
    horizonscale = 6,
    origin = 'midpoint'
  ) +
  theme_few() +
  scale_fill_hcl()

cutpoints <- tibble(
  cuts = c(78.48221, 172.74027, 266.99833, -110.03390, -204.29196, -298.55001),
  names = c('ypos1', 'ypos2', 'ypos3', 'yneg1', 'yneg2', 'yneg3'),
  color = c("#D7E2D4", "#36ABA9", "#324DA0", "#F6DE90", "#E78200", "#A51122")
) %>%
  mutate(names = factor(names, rev(names))) %>%
  arrange(names)

# Stack start and end coordinates so the line chart is drawn as steps
dt <- dat_tab %>%
  pivot_longer(c(x, xend)) %>%
  mutate(x = value)

b <- plotAllLayers(dt, mid, cutpoints$cuts, cutpoints$color)

b/a + plot_layout(guides = 'collect', heights = c(6, 1))
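The reshaping used for the comparison line chart can be hard to picture, so here is a small base-R sketch of the same idea on five points. It interleaves the start and end coordinates so that each y value is held constant over its own interval. The names `step_x` and `step_y` are illustrative only, and this is a sketch of the step shape rather than of how geom_horizon() handles xend internally.

```r
# Five intervals, built like the x, xend and y columns of dat_tab
x    <- 1:5
xend <- x + 0.9999
y    <- x * sin(0.1 * x)

# Interleave start and end coordinates: x1, xend1, x2, xend2, ...
step_x <- as.vector(rbind(x, xend))

# Each y value is repeated so it stays constant over its [x, xend] interval
step_y <- rep(y, each = 2)

data.frame(step_x, step_y)
```

Plotting `step_y` against `step_x` as a line gives flat segments joined by near-vertical jumps, which is the step-like shape described above.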