A36 Span Chart - cimat/data-visualization-patterns GitHub Wiki
# A 3.6: SPAN CHART
Use a span chart to display multiple datasets, with each set containing a minimum and a maximum value and only these extreme values being of interest to the observer. In other words, your data table has two variable columns, both applying the same scale and unit. These datasets can also contain more than two values. However, those values “in between” will get lost as only minimum and maximum matter to this representation type (Behrens, 2008).
Create a two-dimensional Cartesian coordinate system. Label and scale the axes appropriately. In your database, determine the minimum and maximum values of each dataset you want to display. For each of these sets, create a bar in the coordinate grid and resize it so that its bottom end represents the minimum value and its top end the maximum value (Behrens, 2008).
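The construction steps above can be sketched in plain Python. This is only an illustrative sketch: the helper name `span_bars` and the temperature data are hypothetical and not part of the original pattern. It reduces each dataset to the two numbers a bar needs, its bottom (the minimum) and its height (maximum minus minimum).

```python
def span_bars(datasets):
    """Map each dataset label to (bottom, height) for its span bar.

    bottom is the minimum value and height is maximum - minimum,
    so the top edge of the bar sits at the maximum.
    """
    bars = {}
    for label, values in datasets.items():
        lowest, highest = min(values), max(values)
        bars[label] = (lowest, highest - lowest)
    return bars


# Hypothetical example data: a few temperature readings per city.
temperatures = {
    "Oslo": [-3, 1, 4, 0],
    "Cairo": [14, 19, 22, 17],
    "Lima": [18, 20, 19, 21],
}

for city, (bottom, height) in span_bars(temperatures).items():
    print(f"{city}: bar from {bottom} to {bottom + height}")
```

The `(bottom, height)` pairs correspond to the `bottom` and height arguments that a bar-drawing function typically expects, which is how the Matplotlib example further down positions its bars.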
The span chart converts potentially large sets of data into slim graphics, leading the user’s attention to the extreme values that matter in this case. That said, keep in mind that this type of display tells the user nothing about the values between the two extremes: neither their distribution nor their average (Behrens, 2008).
If you need to display more detailed information, consider a different representation method: the box plot, a diagram style widely used in descriptive statistics, lets the reader identify several key figures of a data set. Besides the maximum and minimum values, the graph contains a median marker (the middle value of the sorted items, not the arithmetic mean), quartile ranges representing the 25 percent of items on either side of the median, and outliers that deviate significantly from the rest of the data set [Benjamini 1988].
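To make the difference between the two chart types concrete, the following sketch computes the five values a box plot is built from and then the two values a span chart keeps. The function name `five_number_summary` and the sample list are hypothetical. The quartiles use the "inclusive" method of Python's `statistics.quantiles`; other quartile conventions give slightly different numbers.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, q2, q3, max(values))


# Hypothetical sample values.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]

summary = five_number_summary(sample)  # what a box plot shows
span = (summary[0], summary[-1])       # what a span chart keeps

print("box plot:", summary)
print("span chart:", span)
```

Everything between the first and last entry of the summary is what the span chart discards.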
- A 3.1 Simple Bar Chart
from datos import data
d=data('mtcars')
d.head()
- Matplotlib
- Seaborn
- Pyqtgraph
- Pandas
import matplotlib.pyplot as plt
import pandas as pd
from datos import data
d = data('mtcars')
# Split the cars by number of cylinders
subset1, subset2, subset3 = d[d.cyl == 4], d[d.cyl == 6], d[d.cyl == 8]
# Minimum, maximum, and span (max - min) of mpg for each group
x = pd.DataFrame({
    'Max': [max(subset1.mpg), max(subset2.mpg), max(subset3.mpg)],
    'Min': [min(subset1.mpg), min(subset2.mpg), min(subset3.mpg)],
    'Span': [max(subset1.mpg) - min(subset1.mpg),
             max(subset2.mpg) - min(subset2.mpg),
             max(subset3.mpg) - min(subset3.mpg)]})
x.index = [4, 6, 8]
bar_width = 0.8
opacity = 0.4
# Each bar starts at the group minimum (bottom) and extends by the span
plt.bar(x.index, x.Span, bar_width, alpha=opacity, color='g',
        bottom=x.Min)
plt.xlabel('Cylinders')
plt.ylabel('Miles per Gallon')
plt.title('Range of Miles per Gallon (mpg) by Cylinders (cyl)',
          size=15)
plt.show()
import seaborn as sns
import matplotlib.pyplot as plt
from datos import data
import pandas as pd
sns.set(style="white")
f, ax = plt.subplots(figsize=(6, 15))
d = data('mtcars')
# Split the cars by number of cylinders
subset1, subset2, subset3 = d[d.cyl == 4], d[d.cyl == 6], d[d.cyl == 8]
# Minimum, maximum, and span (max - min) of mpg for each group
datos = pd.DataFrame({
    'Max': [max(subset1.mpg), max(subset2.mpg), max(subset3.mpg)],
    'Min': [min(subset1.mpg), min(subset2.mpg), min(subset3.mpg)],
    'Span': [max(subset1.mpg) - min(subset1.mpg),
             max(subset2.mpg) - min(subset2.mpg),
             max(subset3.mpg) - min(subset3.mpg)]})
datos.index = [4, 6, 8]
# Draw bars up to the maximum, then paint over the part below the minimum in white
sns.barplot(x=datos.index, y=datos.Max, color="#2ecc71", linewidth=0)
sns.barplot(x=datos.index, y=datos.Min, color="white", linewidth=0)
# Label the axes on the Axes object instead of the older sns.axlabel helper
ax.set_xlabel('Cylinders')
ax.set_ylabel('Miles per Gallon')
plt.title('Range of Miles per Gallon (mpg) by Cylinders (cyl)',
          family='Serif', size=16)
plt.show()
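The minimum, maximum, and span that the two examples above assemble by hand for each cylinder count can also be derived with a pandas `groupby`. The sketch below is one possible variant, not the wiki's own method: the helper `span_by_group` is hypothetical, and the small inline table is made-up sample data standing in for the `cyl` and `mpg` columns so that the block runs without the `datos` module.

```python
import pandas as pd

# Hypothetical sample rows standing in for the cyl and mpg columns.
cars = pd.DataFrame({
    "cyl": [4, 4, 4, 6, 6, 8, 8, 8],
    "mpg": [22.8, 33.9, 21.4, 21.0, 17.8, 18.7, 10.4, 19.2],
})


def span_by_group(df, group_col, value_col):
    """Return one row per group with its minimum, maximum, and span."""
    table = df.groupby(group_col)[value_col].agg(["min", "max"])
    table["span"] = table["max"] - table["min"]
    return table


spans = span_by_group(cars, "cyl", "mpg")
print(spans)
```

The resulting `min` and `span` columns play the same role as `x.Min` and `x.Span` in the Matplotlib example, so they could be passed to `plt.bar` as `bottom` and height in the same way.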
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
- lattice
- ggplot2
# Range (minimum, maximum) of mpg for each cylinder group
x <- subset(mtcars, mtcars$cyl == 4)
x <- range(x$mpg)
y <- subset(mtcars, mtcars$cyl == 6)
y <- range(y$mpg)
z <- subset(mtcars, mtcars$cyl == 8)
z <- range(z$mpg)
df <- data.frame(cyl = c(4, 6, 8), from = c(x[1], y[1], z[1]), to = c(x[2], y[2], z[2]))
# Draw bars up to the maximum, then paint over the part below the minimum in white
barplot(df$to, border = 'transparent', space = 1, col = 'blue')
barplot(df$from, space = 1, add = TRUE, col = 'white',
        border = 'transparent', names.arg = df$cyl,
        cex.names = 0.8, main = "Range of Miles per Gallon (mpg) by Cylinders (cyl)",
        xlab = "Cylinders", ylab = "Miles per Gallon")
box(bty = 'l')
library(lattice)
x<-subset(mtcars, mtcars$cyl==4)
x<-range(x$mpg)
y<-subset(mtcars, mtcars$cyl==6)
y<-range(y$mpg)
z<-subset(mtcars, mtcars$cyl==8)
z<-range(z$mpg)
df<-data.frame(cyl=c(4,6,8),from=c(x[1],y[1],z[1]), to=c(x[2],y[2],z[2]))
xyplot( to ~ cyl + from, df, ylim = c(0, 35), xlim=c(2, 10), border='transparent',
main="Range of Miles per Gallon (mpg) by Cylinders (cyl)", xlab="Cylinders",ylab="Miles per Gallon",
panel = function(x, y, subscripts, groups,...){
panel.barchart(y = y[subscripts],
x = x[subscripts[groups=="cyl"]], horizontal=FALSE, col="blue",box.width=1,
...)
panel.barchart(y = x[subscripts[groups=="from"]],
x = x[subscripts[groups=="cyl"]], horizontal=FALSE, col = "white", box.width=1,
...)
})
library("ggplot2")
x<-subset(mtcars, mtcars$cyl==4)
x<-range(x$mpg)
y<-subset(mtcars, mtcars$cyl==6)
y<-range(y$mpg)
z<-subset(mtcars, mtcars$cyl==8)
z<-range(z$mpg)
df<-data.frame(cyl=c(4,6,8),from=c(x[1],y[1],z[1]), to=c(x[2],y[2],z[2]))
g <- ggplot(df, aes(cyl, to)) + geom_crossbar(aes(ymin = from, ymax = to), width = 0.8, fill = "blue")
g + labs(title = "Range of Miles per Gallon (mpg) by Cylinders (cyl)", x = "Cylinders", y = "Miles per Gallon")