A36 Span Chart - cimat/data-visualization-patterns GitHub Wiki

# A 3.6: SPAN CHART

Description

Use a span chart to display multiple datasets, with each set containing a minimum and a maximum value and only these extreme values being of interest to the observer. In other words, your data table has two variable columns, both applying the same scale and unit. These datasets can also contain more than two values. However, those values “in between” will get lost as only minimum and maximum matter to this representation type (Behrens, 2008).

Required Data

Multiple datasets, each reduced to a minimum and a maximum value. The data table therefore has two variable columns (minimum and maximum), both using the same scale and unit. A dataset may contain more than two values, but the values "in between" are not displayed, because only the minimum and maximum matter to this representation type (Behrens, 2008).
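As a minimal sketch of the data shape described above, the following reduces raw datasets to one minimum/maximum row each. The dataset names and values are made up for illustration, and `to_min_max_table` is a hypothetical helper, not part of any library.

```python
# Illustrative raw measurements (made-up values); every dataset uses the same unit.
raw = {
    "A": [4.0, 7.5, 5.2, 9.1],
    "B": [2.3, 3.8, 6.0],
    "C": [8.0, 8.4],
}


def to_min_max_table(datasets):
    """Return one (name, minimum, maximum) row per dataset."""
    return [(name, min(values), max(values)) for name, values in datasets.items()]


table = to_min_max_table(raw)
for name, low, high in table:
    print(f"{name}: min={low}, max={high}")
```

Each row keeps only the two extremes, so the intermediate values (for example 7.5 and 5.2 in dataset "A") no longer appear in the table.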

Usage

Create a two-dimensional Cartesian coordinate system. Label and scale the axes appropriately. In your data, determine the minimum and maximum values of each dataset you want to display. For each of these sets, draw a bar in the coordinate grid and size it so that its bottom end represents the minimum value and its top end the maximum value (Behrens, 2008).
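The steps above can be sketched as a small computation: each bar starts at the minimum and its height is the difference between maximum and minimum. The helper names `span_bars` and `draw_span_chart` and the sample rows are assumptions made for this illustration; the drawing function assumes Matplotlib is installed and imports it only when called.

```python
def span_bars(table):
    """Turn (label, minimum, maximum) rows into (label, bottom, height) bars."""
    bars = []
    for label, low, high in table:
        if low > high:
            raise ValueError(f"minimum exceeds maximum for {label!r}")
        # The bar floats: it starts at the minimum and ends at the maximum.
        bars.append((label, low, high - low))
    return bars


def draw_span_chart(bars, ylabel):
    """Draw floating bars with Matplotlib (imported lazily, only when drawing)."""
    import matplotlib.pyplot as plt

    labels = [str(label) for label, _, _ in bars]
    bottoms = [bottom for _, bottom, _ in bars]
    heights = [height for _, _, height in bars]
    fig, ax = plt.subplots()
    ax.bar(labels, heights, bottom=bottoms)
    ax.set_ylabel(ylabel)
    return fig


# Hypothetical minimum/maximum rows, all in the same unit.
bars = span_bars([("A", 4.0, 9.0), ("B", 2.0, 6.5), ("C", 8.0, 8.5)])
```

Calling `draw_span_chart(bars, "Value")` would return a figure with one floating bar per dataset. Keeping the geometry separate from the drawing makes it easy to check the bar positions without a plotting backend.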

Rationale

The span chart converts potentially large sets of data into slim graphics, leading the user's attention to the extreme values that matter in this case. That said, keep in mind that this type of display tells the user nothing about the values between the two extremes: neither their distribution nor their average (Behrens, 2008).
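A short illustration of this limitation, using two made-up datasets: both have the same minimum and maximum and would therefore be drawn as identical bars, although their values are distributed differently. The variable names and the `span` helper are hypothetical.

```python
from statistics import mean

# Two illustrative datasets (made-up values) with identical extremes.
evenly_spread = [10, 15, 20, 25, 30]
clustered_low = [10, 11, 12, 13, 30]


def span(values):
    """Return the (minimum, maximum) pair that a span chart would display."""
    return min(values), max(values)


# Both datasets produce the same bar in a span chart...
same_span = span(evenly_spread) == span(clustered_low)

# ...although their averages differ.
means_differ = mean(evenly_spread) != mean(clustered_low)

print(same_span, means_differ)
```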

If you need to display more detailed information, consider a different representation method instead. The box plot, a diagram style widely used in descriptive statistics, lets the reader identify several key values in a set of single items: besides the maximum and minimum values, the graph contains a median marker (showing the middle value, which is not necessarily the average), a box spanning the quartiles (the 25 percent of items on either side of the median), and outliers that deviate significantly from the rest of the data set (Benjamini, 1988).
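To make the difference concrete, here is a minimal sketch of the values a box plot summarizes, computed with the Python standard library (`statistics.quantiles` requires Python 3.8 or later). The function name `box_plot_summary`, the sample values, and the 1.5 × IQR outlier rule are assumptions for this illustration; the 1.5 × IQR rule is a common convention, not something the text above prescribes.

```python
from statistics import median, quantiles


def box_plot_summary(values):
    """Return the key values a box plot displays for one dataset."""
    ordered = sorted(values)
    # Quartiles with the default 'exclusive' method of statistics.quantiles.
    q1, _, q3 = quantiles(ordered, n=4)
    iqr = q3 - q1
    # Assumed convention: points beyond 1.5 * IQR from the box are outliers.
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in ordered if v < low_fence or v > high_fence]
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median(ordered),
        "q3": q3,
        "max": ordered[-1],
        "outliers": outliers,
    }


# Made-up values with one clearly deviating item.
summary = box_plot_summary([10, 11, 12, 13, 14, 15, 40])
print(summary)
```

A span chart of the same data would show only `min` and `max`; the summary additionally exposes where the bulk of the values lies and which items deviate.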

Related Patterns

  • A 3.1 Simple Bar Chart

PYTHON IMPLEMENTATION

Data Set

from datos import data
d=data('mtcars')
d.head()

Dependencies

  • Matplotlib
  • Seaborn
  • Pyqtgraph
  • Pandas

Code Example

Matplotlib

import matplotlib.pyplot as plt
import pandas as pd
from datos import data

d = data('mtcars')
# One subset per cylinder count
subset1, subset2, subset3 = d[d.cyl == 4], d[d.cyl == 6], d[d.cyl == 8]
# Minimum and maximum mpg of each subset
x = pd.DataFrame({
    'Max': [max(subset1.mpg), max(subset2.mpg), max(subset3.mpg)],
    'Min': [min(subset1.mpg), min(subset2.mpg), min(subset3.mpg)],
})
# Span = distance between the two extremes
x['Span'] = x.Max - x.Min
x.index = [4, 6, 8]
bar_width = 0.8
opacity = 0.4
# Each bar starts at the minimum (bottom) and is as tall as the span,
# so its top end sits at the maximum
plt.bar(x.index, x.Span, bar_width, alpha=opacity, color='g', bottom=x.Min)
plt.xlabel('Cylinders')
plt.ylabel('Miles per Gallon')
plt.title('Range of Miles per Gallon (mpg) by Cylinders (cyl)', size=15)
plt.show()


Seaborn

import seaborn as sns
import matplotlib.pyplot as plt
from datos import data
import pandas as pd

sns.set(style="white")
f, ax = plt.subplots(figsize=(6, 15))
d = data('mtcars')
# One subset per cylinder count
subset1, subset2, subset3 = d[d.cyl == 4], d[d.cyl == 6], d[d.cyl == 8]
# Minimum and maximum mpg of each subset
datos = pd.DataFrame({
    'Max': [max(subset1.mpg), max(subset2.mpg), max(subset3.mpg)],
    'Min': [min(subset1.mpg), min(subset2.mpg), min(subset3.mpg)],
})
datos['Span'] = datos.Max - datos.Min
datos.index = [4, 6, 8]
# Draw bars from zero up to the maximum, then cover the part below the
# minimum with white bars so that only the span remains visible
sns.barplot(x=datos.index, y=datos.Max, color="#2ecc71", linewidth=0)
sns.barplot(x=datos.index, y=datos.Min, color="white", linewidth=0)
# sns.axlabel is not available in current seaborn releases; label the axes directly
ax.set_xlabel('Cylinders')
ax.set_ylabel('Miles per Gallon')
plt.title('Range of Miles per Gallon (mpg) by Cylinders (cyl)',
          family='Serif', size=16)
plt.show()


Pyqtgraph

References

R IMPLEMENTATION

Data Set

head(mtcars)

##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Dependencies

  • lattice
  • ggplot2

Code Example

Graphics

# Minimum and maximum mpg for each cylinder count
x<-subset(mtcars, mtcars$cyl==4)
x<-range(x$mpg)
y<-subset(mtcars, mtcars$cyl==6)
y<-range(y$mpg)
z<-subset(mtcars, mtcars$cyl==8)
z<-range(z$mpg)
df<-data.frame(cyl=c(4,6,8),from=c(x[1],y[1],z[1]), to=c(x[2],y[2],z[2]))
# Draw bars up to the maximum, then cover the part below the minimum in white
barplot(df$to, border='transparent', space=1, col='blue')
barplot(df$from, space=1, add=TRUE, col='white',
        border='transparent', names.arg=df$cyl,
        cex.names=0.8,
        main="Range of Miles per Gallon (mpg) by Cylinders (cyl)",
        xlab="Cylinders", ylab="Miles per Gallon")
box(bty='l')

Lattice

library(lattice)
x<-subset(mtcars, mtcars$cyl==4)
x<-range(x$mpg)
y<-subset(mtcars, mtcars$cyl==6)
y<-range(y$mpg)
z<-subset(mtcars, mtcars$cyl==8)
z<-range(z$mpg)
df<-data.frame(cyl=c(4,6,8),from=c(x[1],y[1],z[1]), to=c(x[2],y[2],z[2]))
xyplot( to  ~ cyl + from, df, ylim = c(0, 35), xlim=c(2, 10), border='transparent', 
        main="Range of Miles per Gallon (mpg) by Cylinders (cyl)", xlab="Cylinders", ylab="Miles per Gallon",
       panel = function(x, y, subscripts, groups,...){
         panel.barchart(y = y[subscripts], 
                    x = x[subscripts[groups=="cyl"]],  horizontal=FALSE, col="blue",box.width=1,
                    ...)
         panel.barchart(y = x[subscripts[groups=="from"]], 
                        x = x[subscripts[groups=="cyl"]],  horizontal=FALSE, col = "white", box.width=1,
                        ...)
       })

ggplot2

library("ggplot2")
x<-subset(mtcars, mtcars$cyl==4)
x<-range(x$mpg)
y<-subset(mtcars, mtcars$cyl==6)
y<-range(y$mpg)
z<-subset(mtcars, mtcars$cyl==8)
z<-range(z$mpg)
df<-data.frame(cyl=c(4,6,8),from=c(x[1],y[1],z[1]), to=c(x[2],y[2],z[2]))
g <- ggplot(df, aes(cyl,to)) + geom_crossbar(aes(ymin = from, ymax = to), width = 0.8, fill="blue")
g + labs(title = "Range of Miles per Gallon (mpg) by Cylinders (cyl)", x = "Cylinders", y = "Miles per Gallon")

References
