A12 Bubble Chart - cimat/data-visualization-patterns GitHub Wiki

A 1.2: BUBBLE CHART

Description

Bubble charts share certain similarities with scatterplots: they are drawn in a Cartesian coordinate system and provide information about the correlation between the quantitative attributes represented by the two coordinate axes. In contrast to a scatterplot, however, the raw data of a bubble chart does not consist of an array of anonymous pairs of variates that only become meaningful in the context of a larger group of items. Instead, each data record has a unique label, usually a plain-text name, that identifies the corresponding object in the coordinate grid (Behrens, 2008).

While two numerical variables can be read from a bubble’s x and y coordinates, the remaining data attributes are displayed by the bubble’s graphic features, such as size, fill color, or brightness. The choice depends on the format of the raw data: quantitative values can be displayed by the bubble’s position in the coordinate grid, its size, or its brightness, whereas qualitative (or categorical) values are usually distinguished by fill color. These considerations are crucial to the correct use of the bubble chart and go back to Jacques Bertin’s theory of graphic variables (Bertin, 1973).

Required Data

Use a bubble chart to display tabular data. Each record is identifiable by a unique label and consists of an array of variables. At least two of these variables must be quantitative. The other attributes can be of any scale of measurement; it does not matter whether they represent quantitative or qualitative values (Behrens, 2008).
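The data requirements above can be checked before plotting. The following is a minimal sketch using only the standard library; the function name `check_bubble_data`, the `label` key, and the list-of-dictionaries layout are assumptions made for illustration and are not part of the pattern itself. The sample values are taken from the mtcars rows shown in the R section.

```python
from numbers import Real


def check_bubble_data(rows, label_key="label"):
    """Return the quantitative column names, or raise ValueError.

    Hypothetical helper: `rows` is a list of dicts, one per record.
    """
    labels = [row[label_key] for row in rows]
    if len(set(labels)) != len(labels):
        raise ValueError("every record needs a unique label")

    candidate_keys = [key for key in rows[0] if key != label_key]
    quantitative = [
        key
        for key in candidate_keys
        # bool is a subclass of int, so exclude it explicitly
        if all(
            isinstance(row[key], Real) and not isinstance(row[key], bool)
            for row in rows
        )
    ]
    if len(quantitative) < 2:
        raise ValueError("at least two quantitative variables are required")
    return quantitative


cars = [
    {"label": "Mazda RX4", "mpg": 21.0, "wt": 2.620, "cyl": 6},
    {"label": "Datsun 710", "mpg": 22.8, "wt": 2.320, "cyl": 4},
    {"label": "Valiant", "mpg": 18.1, "wt": 3.460, "cyl": 6},
]
quantitative_columns = check_bubble_data(cars)
print(quantitative_columns)
```

Any two of the returned columns can serve as the axes; the remaining ones are candidates for size, color, or brightness.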

Usage

Create a two-dimensional Cartesian coordinate system. Choose two of the quantitative variables from your data table, and apply them to the axes. For each row in the table, draw a circle with its center at the corresponding coordinates. Determine which and how many graphic variables you need to express the remaining variables from your table, and apply them to each bubble accordingly. The most common graphic attributes here are the bubble’s size or, for non-numeric variables, fill color and brightness.
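The steps above can be sketched independently of any plotting library. In this illustrative example, the function `build_bubbles`, its parameters, and the color palette are hypothetical names chosen for the sketch. The radius is derived from the square root of the value, mirroring the `sqrt(cyl / pi)` expression used in the R examples of this page.

```python
import math


def build_bubbles(rows, x, y, size=None, category=None,
                  palette=("blue", "orange", "green")):
    """Map each table row to a bubble description (hypothetical helper)."""
    colors = {}  # category value -> fill color, assigned in order of appearance
    bubbles = []
    for row in rows:
        # Two quantitative variables give the center of the circle.
        bubble = {"label": row["label"], "x": row[x], "y": row[y]}
        if size is not None:
            # A third quantitative variable controls the radius.
            bubble["radius"] = math.sqrt(row[size] / math.pi)
        if category is not None:
            # A qualitative variable is expressed by the fill color.
            value = row[category]
            if value not in colors:
                colors[value] = palette[len(colors) % len(palette)]
            bubble["color"] = colors[value]
        bubbles.append(bubble)
    return bubbles


cars = [
    {"label": "Mazda RX4", "mpg": 21.0, "wt": 2.620, "cyl": 6, "am": 1},
    {"label": "Datsun 710", "mpg": 22.8, "wt": 2.320, "cyl": 4, "am": 1},
    {"label": "Hornet 4 Drive", "mpg": 21.4, "wt": 3.215, "cyl": 6, "am": 0},
]
bubbles = build_bubbles(cars, x="wt", y="mpg", size="cyl", category="am")
for bubble in bubbles:
    print(bubble)
```

The resulting list can then be handed to whichever drawing routine is in use, for example one circle per dictionary.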

When you use the bubble size to express a value, keep in mind that there are two competing ways to translate the represented attribute into a geometric value. The problem is that a one-dimensional value is represented by a two-dimensional graphical object. The "traditional" way is to draw the circle’s diameter proportionally to the numerical value. As a result, an object that doubles in diameter (that is, in both width and height) appears four times as big, and if the value triples, the area grows ninefold. This is why the technique has drawn criticism for a long time and is often cited as a prime example of the visual distortion of facts (Bertin, 1973; Hartmann, 2006). The alternative is to scale the circle’s area proportionally to the value, so that the radius grows with the square root of the value.
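The distortion described above is easy to verify numerically. The sketch below compares diameter-proportional sizing with area-proportional sizing; the function names and the `scale` parameter are assumptions made for illustration.

```python
import math


def radius_by_diameter(value, scale=1.0):
    """Diameter proportional to the value (the criticized approach)."""
    return scale * value / 2.0


def radius_by_area(value, scale=1.0):
    """Area proportional to the value, so radius grows with sqrt(value)."""
    return math.sqrt(scale * value / math.pi)


def circle_area(radius):
    return math.pi * radius ** 2


def area_ratio(radius_function, value, reference=1):
    """How much larger the drawn area is for `value` than for `reference`."""
    return (circle_area(radius_function(value))
            / circle_area(radius_function(reference)))


for value in (1, 2, 3):
    print(
        value,
        round(area_ratio(radius_by_diameter, value), 6),
        round(area_ratio(radius_by_area, value), 6),
    )
```

With diameter scaling the drawn area grows with the square of the value, whereas with area scaling it grows in step with the value itself.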

Rationale

The Bubble Chart pattern is a convenient alternative to pseudo-3D diagrams that aim to display data of three variables in two-dimensional environments such as a book page or a computer screen. In some cases, the bubble chart can even display more than three variables. Set in a conventional Cartesian coordinate system, it appears familiar to the user while extending the possibilities of standard scatterplots (Behrens, 2008).

Related Patterns


PYTHON IMPLEMENTATION

Data Set

We use the mtcars data set from R. Create a file named datos.py with the following code.

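from rpy2.robjects import r
from rpy2.robjects import pandas2ri

def data(name):
    # Convert the named R data set into a pandas DataFrame.
    # Note: ri2py is the rpy2 2.x name of the conversion function;
    # newer rpy2 releases renamed it, so adjust this call if needed.
    return pandas2ri.ri2py(r[name])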

Then import the datos.py module into your project.

from datos import data
d=data('mtcars')
d

Dependencies

  • rpy2: The rpy2 package is used to access all R datasets from Python.
  • Matplotlib
  • Seaborn
  • Pyqtgraph

Code Example

Matplotlib

import numpy as np
import matplotlib.pyplot as plt
from datos import data

d = data('mtcars')
# Marker size in points**2; here the radius grows linearly with cyl
area = np.pi * (2 * d.cyl)**2
# x = car weight, y = miles per gallon, matching the axis labels below
plt.scatter(d.wt, d.mpg, s=area, c='blue', alpha=0.5)
plt.title('Bubble Chart by Miles per Gallon and Car Weight',
          family='serif', size=16)
plt.xlabel('Car Weight', family='serif')
plt.ylabel('Miles per Gallon', family='serif')
plt.show()


Seaborn

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datos import data

d = data('mtcars')
sns.set(style="white")
g = sns.FacetGrid(d)
# Marker size in points**2; here the radius grows linearly with cyl
area = np.pi * (2 * d.cyl)**2
g.map(plt.scatter, "wt", "mpg", s=area)
plt.title("Bubble Chart by Miles per Gallon and Car Weight",
          family='serif', size=16)
g.set_axis_labels("Car Weight", "Miles per Gallon")
plt.show()


Pyqtgraph

import pyqtgraph as pg
from pyqtgraph.Qt import QtCore, QtGui
import numpy as np
from datos import data

d = data('mtcars')
# Note: GraphicsWindow is deprecated in recent pyqtgraph releases
# in favor of GraphicsLayoutWidget.
win = pg.GraphicsWindow()
win.resize(800, 500)
win.setWindowTitle('Bubble Chart')
plt = win.addPlot(title="Bubble Chart by Miles per Gallon and Car Weight")
# The symbol size is taken from the number of cylinders
plt.plot(d.wt, d.mpg, pen=None, symbol='o', symbolSize=d.cyl,
         symbolPen=(255, 255, 255, 200), symbolBrush=(0, 0, 255, 150))
plt.setLabel('left', "Miles per Gallon", units='mpg')
plt.setLabel('bottom', "Car Weight", units='lbs')

if __name__ == '__main__':
    import sys
    # Start the Qt event loop unless running in interactive mode
    if (sys.flags.interactive != 1) or not hasattr(QtCore, 'PYQT_VERSION'):
        QtGui.QApplication.instance().exec_()


References

R IMPLEMENTATION

Data Set

head(mtcars)

##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Dependencies

  • Lattice
  • ggplot2

Code Example

Graphics

radius <- sqrt( mtcars$cyl/ pi )
symbols(mtcars$wt, mtcars$mpg, circles=radius, inches=0.17, fg="blue", bg="blue", xlab="Car Weight", ylab="Miles per Gallon", main="Bubble Chart by Miles per Gallon and Car Weight")
legend("topright", legend=c(4,6,8), col="blue",title = "cyl", pch=16, cex= 1.8, pt.cex=3:5)

Lattice

library(lattice)
xyplot(mtcars$mpg~mtcars$wt, aspect = 2/3,
       grid = TRUE,
       cex = sqrt( mtcars$cyl/ pi )*2.3, fill.color = rep("blue", nrow(mtcars)),
       col = "blue", main="Bubble Chart by Miles per Gallon and Car Weight",
       xlab="Car Weight", ylab="Miles per Gallon",
       panel = function(x, y, ..., cex, fill.color, subscripts) {
         panel.xyplot(x, y, cex = cex[subscripts],
                      pch = 21, fill = fill.color[subscripts], ...)
       })

ggplot2

library(ggplot2)
g <- ggplot(mtcars, aes(wt, mpg, size = cyl)) + geom_point(colour="blue") + scale_size_continuous(range=c(6,10))
# Pass the labels as named arguments; wrapping them in list() is not accepted by current ggplot2
g + labs(title = "Bubble Chart by Miles per Gallon and Car Weight", x = "Car Weight", y = "Miles per Gallon")

References
