A51 Sankey Diagram - cimat/data-visualization-patterns GitHub Wiki

A 5.1 Sankey Diagram

Description

The Sankey diagram visualizes complex systems of material or energy flows. It describes an isolated system by means of its input and output flows, and shows the proportional magnitude of each single flow as it contributes to the entire system. The inputs of such a system are depicted as arrows leading into a main flow (usually running from left to right), while outputs are drawn as arrows leading away from the system. The proportional magnitude of each contribution is displayed through the width of the respective arrow.

Required Data

Use the Sankey diagram to display quantitative data of a sequential nature where the main focus is on the proportional amount of each data class. It fits when you want to describe an isolated system on the basis of its input and output components, and your main objective is the amount of “material” that goes into the system from different sources, or leaves it in different ways.
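The data requirement above can be sketched as a small check. This is a minimal illustration, not part of the original pattern: the flow names and values are hypothetical, and it assumes that "isolated system" means the inputs and outputs add up to the same total.

```python
import math


def is_balanced(inputs, outputs, rel_tol=1e-9):
    """Return True if the total of the inputs matches the total of the outputs."""
    return math.isclose(sum(inputs.values()), sum(outputs.values()), rel_tol=rel_tol)


def shares(flows):
    """Return each flow's proportion of the total (the quantity a Sankey diagram shows)."""
    total = sum(flows.values())
    if total <= 0:
        raise ValueError("flows must add up to a positive total")
    return {name: value / total for name, value in flows.items()}


# Hypothetical energy system: what goes in and how it leaves.
inputs = {"coal": 60.0, "gas": 30.0, "solar": 10.0}
outputs = {"electricity": 35.0, "heat": 45.0, "losses": 20.0}

balanced = is_balanced(inputs, outputs)
input_shares = shares(inputs)
```

If the check fails, the data probably miss a source or a sink, and the diagram would show a main flow that does not match its branches.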

Usage

Create a graphic object to represent a flow, e.g. a big arrow pointing from left to right across the display, at a width that correlates to the total amount of the system. Draw a set of branches leading to the system, and a set that leads away from it, with one branch for each contribution to the system, and at a proportional width. If applicable, arrange the branches in the causal order in which they contribute to the system.
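The construction described above can be sketched without any plotting library. The function name `layout_branches`, the flow names, and the trunk width of 200 units are hypothetical choices for illustration; the sketch only computes where each branch joins the main flow and how wide it is.

```python
def layout_branches(flows, trunk_width):
    """Place branches side by side across the trunk.

    `flows` is a list of (name, value) pairs in the order the branches
    should appear (for example their causal order). Each branch gets a
    width proportional to its value, so all widths add up to `trunk_width`.
    Returns a list of (name, offset, width) tuples.
    """
    total = sum(value for _, value in flows)
    if total <= 0:
        raise ValueError("flows must add up to a positive total")

    bands = []
    offset = 0.0
    for name, value in flows:
        width = trunk_width * value / total
        bands.append((name, offset, width))
        offset += width  # the next branch starts where this one ends
    return bands


# Hypothetical inputs, listed in the order they feed the system.
inputs = [("coal", 60.0), ("gas", 30.0), ("solar", 10.0)]
bands = layout_branches(inputs, trunk_width=200.0)
```

The same function can be applied to the outputs, so that the branches leaving the system use the same scale as the ones entering it.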

Rationale

Sankey diagrams clearly depict the proportional amount a flow channel adds to a whole system. Similarly to pie or bar charts, a graphic element’s width is directly linked to the one-dimensional data value it represents, making the major contributors instantly visible.

Related Patterns

A 2.3 Stacked Area Chart

A 4.1 Simple Pie Chart

A 5.2 Thread Arcs

Python Implementation Pattern

A Sankey diagram is a graphic made up of inputs and outputs, where each group adds up to a total. Such a diagram makes it easy to compare the relative sizes of the individual inputs and outputs.

Data Set

This example uses the mtcars data set, which ships with R by default. To use it from Python, the rpy2 module was used, which gives Python access to R data sets. The data were extracted from the 1974 Motor Trend US magazine and comprise fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).

Dependencies

List of modules required for the implementation:

Matplotlib

Code example

Code Example With Matplotlib


import matplotlib.pyplot as plt
from matplotlib.sankey import Sankey
# `datos` is a project-local helper that returns the R data set (see Data Set above)
from datos import data

# Load mtcars and read the values of a single car
d = data('mtcars')
car = "Datsun 710"
mpg = d.loc[car, 'mpg']
disp = d.loc[car, 'disp']
qsec = d.loc[car, 'qsec']
gear = d.loc[car, 'gear']
cyl = d.loc[car, 'cyl']
hp = d.loc[car, 'hp']

fig, ax = plt.subplots(figsize=(15, 15))

# Positive flows are inputs, negative flows are outputs;
# labels and orientations follow the same order as the flows
Sankey(ax, margin=10, flows=[disp, mpg, -qsec, -gear, -hp, -cyl],
       labels=['Displacement\n', 'Miles/(US) gallon', '1/4 Mile Time',
               'Number of forward gears', 'Gross horsepower',
               'Number of cylinders'],
       orientations=[-1, 0, 1, 0, 0, 0]).finish()

# Hide the axes, which carry no meaning in a Sankey diagram
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.title("Datsun 710 Sankey Diagram")
plt.show()
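The flows in the example above do not add up to zero, and Matplotlib's Sankey class logs a warning in that case. The following is a minimal sketch of one way to handle this, not the method used by this wiki: the helpers `balanced_flows`, `suggested_scale` and `draw` and the label "Unaccounted" are hypothetical, and the numbers are the Datsun 710 values from the mtcars table. The scale follows the rule of thumb in the Matplotlib documentation that the scale times the sum of the inputs should be about 1.0.

```python
def balanced_flows(inputs, outputs):
    """Return signed flows (inputs > 0, outputs < 0) that sum to zero.

    Any difference between inputs and outputs is appended as one extra flow.
    """
    flows = [abs(v) for v in inputs] + [-abs(v) for v in outputs]
    remainder = -sum(flows)
    if abs(remainder) > 1e-9:
        flows.append(remainder)
    return flows


def suggested_scale(flows):
    """Return a scale so that scale * (sum of the inputs) equals 1.0."""
    total_in = sum(f for f in flows if f > 0)
    return 1.0 / total_in


def draw(flows, scale, labels, orientations):
    """Draw the flows with Matplotlib and return the figure."""
    # Imported here so the helpers above also work without Matplotlib installed.
    import matplotlib.pyplot as plt
    from matplotlib.sankey import Sankey

    fig, ax = plt.subplots(figsize=(8, 6),
                           subplot_kw=dict(xticks=[], yticks=[]))
    Sankey(ax=ax, scale=scale, flows=flows, labels=labels,
           orientations=orientations).finish()
    ax.set_title("Datsun 710 Sankey Diagram")
    return fig


# Datsun 710 values taken from the mtcars table.
inputs = [108.0, 22.8]               # disp, mpg
outputs = [18.61, 4.0, 93.0, 4.0]    # qsec, gear, hp, cyl

flows = balanced_flows(inputs, outputs)
scale = suggested_scale(flows)
labels = ['Displacement', 'Miles/(US) gallon', '1/4 Mile Time',
          'Number of forward gears', 'Gross horsepower',
          'Number of cylinders', 'Unaccounted']
orientations = [-1, 0, 1, 0, 0, 0, -1]
```

Calling `draw(flows, scale, labels, orientations).savefig("datsun710.png")` would then write the diagram to a file. Since the mtcars columns use different units, the extra "Unaccounted" flow has no physical meaning; it only keeps the diagram balanced.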

R Implementation Pattern

A Sankey diagram is a graphic made up of inputs and outputs, where each group adds up to a total. Such a diagram makes it easy to compare the relative sizes of the individual inputs and outputs.

Data Set

This example uses the mtcars data set, which ships with R by default. The data were extracted from the 1974 Motor Trend US magazine and comprise fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).

head(mtcars)

##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Dependencies

Graphics - default package on R

The graphics example also depends on a script called Sankey.R, which is hosted at this link: [https://gist.github.com/aaronberdanier/1423501#file-sankey-r]

GoogleVis

Code example

Code Example With Graphics

source('Sankey.R')

mpg=mtcars["Datsun 710",1]
hp=mtcars["Datsun 710",4]
cyl=mtcars["Datsun 710",2]
wt= mtcars["Datsun 710",6]
disp=mtcars["Datsun 710",3]
qsec=mtcars["Datsun 710",7]
gear =mtcars["Datsun 710",10]
inputs = c(mpg,disp)
losses = c(qsec,gear,cyl,hp)
unit = "n ="

labels = c("Miles/(US) gallon",
           "Displacement\n",
           "1/4 mile time",
           "Number of forward gears",
           "Number of cylinders",
           "Datsun 710\nGross HP")

SankeyR(inputs,losses,unit,labels)

# Clean up my mess
rm("inputs", "labels", "losses", "SankeyR", "unit")

Code Example With GoogleVis

This example uses googleVis, an R interface to Google Charts. The package generates web graphics, which can also be used in desktop applications.

require(googleVis)

mpg=mtcars["Datsun 710",1]
hp=mtcars["Datsun 710",4]
cyl=mtcars["Datsun 710",2]
wt= mtcars["Datsun 710",6]
disp=mtcars["Datsun 710",3]
qsec=mtcars["Datsun 710",7]
gear =mtcars["Datsun 710",10]
# Both sources link to the same four targets, so targets and weights are repeated twice
dat <- data.frame(From=c(rep("Miles/(US) gallon",4), rep("Displacement", 4)),
                  To=rep(c("1/4 mile time",
                           "Number of forward gears",
                           "Number of cylinders",
                           "Datsun 710 Gross horsepower"), 2),
                  Weight=rep(c(mpg,gear,cyl,hp), 2))

sk1 <- gvisSankey(dat, from="From", to="To", weight="Weight")
plot(sk1)
