Lecture 2 - iloughman/D3-Lecture-Project GitHub Wiki
Intro
GeoJSON
GeoJSON is a format for encoding geographic data. A GeoJSON object may represent a geometry, a feature, or a collection of features. Common geometry types include points, line strings, polygons, and geometry collections. A feature is an object that usually has a geometry associated with it, but also has a properties object, which holds additional information that a bare geometry cannot carry. Finally, a feature collection holds an array of feature objects.
Here is an example taken directly from the GeoJSON specification.
{ "type": "FeatureCollection",
"features": [
{ "type": "Feature",
"geometry": {"type": "Point", "coordinates": [102.0, 0.5]},
"properties": {"prop0": "value0"}
},
{ "type": "Feature",
"geometry": {
"type": "LineString",
"coordinates": [
[102.0, 0.0], [103.0, 1.0], [104.0, 0.0], [105.0, 1.0]
]
},
"properties": {
"prop0": "value0",
"prop1": 0.0
}
},
{ "type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [
[ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0],
[100.0, 1.0], [100.0, 0.0] ]
]
},
"properties": {
"prop0": "value0",
"prop1": {"this": "that"}
}
}
]
}
This example looks a little peculiar because it doesn't seem to be describing anything that we recognize. Let's look at something more familiar.
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"id": "01",
"properties": { "name": "Alabama" },
"geometry": {
"type": "Polygon",
"coordinates": [[[-87.359296,35.00118],
[-85.606675,34.984749],[-85.431413,34.124869],
[-85.184951,32.859696],[-85.069935,32.580372],
[-84.960397,32.421541],[-85.004212,32.322956],
[-84.889196,32.262709],[-85.058981,32.13674]
...
]]
}
},
{
"type": "Feature",
"id": "02",
"properties": { "name": "Alaska" },
"geometry": {
"type": "MultiPolygon",
"coordinates": [[[[-131.602021,55.117982],
[-131.569159,55.28229],[-131.355558,55.183705],
[-131.38842,55.01392],[-131.645836,55.035827],
[-131.602021,55.117982]]],[[[-131.832052,55.42469],
[-131.645836,55.304197],[-131.749898,55.128935],
[-131.832052,55.189182],
...
]]]
}
}]
}
Take a moment to look at this object more closely. It has a type of FeatureCollection, meaning that it must have a features key, which holds an array of feature objects. In this example, the individual feature objects are states. The properties key tells us the name of the state. The geometry key provides a geometry object, which encodes the border of the state as nested arrays of positions, each position being a [longitude, latitude] pair. A Polygon is an array of rings; a state made up of several separate land masses, such as Alaska, uses a MultiPolygon, which adds one more level of nesting.
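To make the nesting concrete, here is a small sketch in plain JavaScript (no D3 required) that counts the coordinate pairs in each feature. The names countPositions and summarize and the tiny sample collection are made up for illustration, and only the Polygon and MultiPolygon cases discussed above are handled.

```javascript
// Count [longitude, latitude] pairs in a Polygon or MultiPolygon geometry.
// A Polygon is an array of rings; a MultiPolygon is an array of Polygons.
function countPositions(geometry) {
  var polygons;
  if (geometry.type === 'Polygon') {
    polygons = [geometry.coordinates];
  } else if (geometry.type === 'MultiPolygon') {
    polygons = geometry.coordinates;
  } else {
    throw new Error('Unhandled geometry type: ' + geometry.type);
  }
  var total = 0;
  polygons.forEach(function (rings) {
    rings.forEach(function (ring) {
      total += ring.length; // each ring entry is one [lon, lat] pair
    });
  });
  return total;
}

// Produce one summary row per feature in a FeatureCollection.
function summarize(collection) {
  return collection.features.map(function (feature) {
    return {
      name: feature.properties.name,
      type: feature.geometry.type,
      positions: countPositions(feature.geometry)
    };
  });
}

// Hypothetical sample data, not taken from states.json.
var sample = {
  type: 'FeatureCollection',
  features: [
    { type: 'Feature', properties: { name: 'Square' },
      geometry: { type: 'Polygon',
        coordinates: [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]] } },
    { type: 'Feature', properties: { name: 'Two triangles' },
      geometry: { type: 'MultiPolygon',
        coordinates: [
          [[[0, 0], [1, 0], [0, 1], [0, 0]]],
          [[[5, 5], [6, 5], [5, 6], [5, 5]]]
        ] } }
  ]
};

console.log(summarize(sample));
```

Running the same summarize function over a real file is a quick way to check which states are stored as a MultiPolygon before trying to draw them.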
Paths
D3 provides a tool for drawing paths when given data in the form of longitude and latitude coordinates: d3.geo.path(). Let's see how we use this.
Let's assume (for now) that we have a GeoJSON file encoding the US state borders: states.json. We can load this file asynchronously using d3.json.
One of the most frustrating aspects of map making with D3 is fitting the map to the screen. Fortunately, for a map of the US, D3 provides projections that do most of the work for us. Let's set this up.
var projection = d3.geo.albersUsa()
var path = d3.geo.path()
.projection(projection)
We first specify the type of projection to be used. We then make sure that our path generator is aware of that projection.
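To get a feel for what a path generator does with a projection, here is a deliberately simplified sketch in plain JavaScript. It is not D3's implementation: toyProjection and ringToPath are hypothetical names, the projection is a made-up linear one rather than albersUsa, and only a single ring is handled.

```javascript
// A toy projection: maps [longitude, latitude] to [x, y] in pixels.
// The latitude is negated because the SVG y axis points downward.
function toyProjection(point) {
  return [point[0] * 10, -point[1] * 10];
}

// Turn one ring of positions into the string used for a <path> "d" attribute:
// move to the first point, draw lines to the rest, then close the shape.
function ringToPath(ring, projection) {
  return ring.map(function (point, i) {
    var xy = projection(point);
    return (i === 0 ? 'M' : 'L') + xy[0] + ',' + xy[1];
  }).join('') + 'Z';
}

var ring = [[0, 0], [1, 0], [1, 1], [0, 0]];
console.log(ringToPath(ring, toyProjection)); // M0,0L10,0L10,-10L0,0Z
```

The real generator does far more (clipping, resampling, multiple geometry types), but the core idea is the same: project every position, then emit SVG path commands.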
The last step is to bind each array element of map.features (which is just a specific state feature) to a <path> element contained in our SVG. Defining the d attribute of our path completes the map.
var width = 960;
var height = 800;
var svg = d3.select("body").append("svg")
.attr("width", width)
.attr("height", height)
var projection = d3.geo.albersUsa()
var path = d3.geo.path()
.projection(projection)
d3.json('states.json', function(err,map){
svg.selectAll('path')
.data(map.features)
.enter().append('path')
.attr('d', path)
})
Notice that when defining the d attribute we pass in only the path function, rather than function(d){return path(d)}. The wrapper would work as well, but it is cleaner to pass in only path, since D3 calls it with each bound datum.
Let's set some simple styling so that our map appears only as an outline.
path {
fill: white;
stroke: black;
}
We should now see the following map render in our browser.
Making Our Map Interactive
Not only can we bind data to elements with D3, but we can also bind event listeners with D3. Let's bind an event listener that will recognize clicks, and console log the clicked state as a result. Working inside the callback function:
svg.selectAll('path')
.data(map.features)
.enter().append('path')
.attr('d', path)
.on('click', function(d,i){
console.log(d.properties.NAME)
})
This will log the name of the clicked state, provided the name is available as d.properties.NAME. The exact property key depends on the GeoJSON file you are using (the sample file shown earlier, for instance, uses a lowercase name).
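Because the key varies between files, a small accessor can keep the click handler independent of it. The following is a sketch only: featureName and the fallback label are hypothetical, and the keys it tries are just the two that appear in this lecture.

```javascript
// Return a display name for a GeoJSON feature, trying the property keys
// seen in this lecture. Falls back to a placeholder if neither is present.
function featureName(feature) {
  var props = feature.properties || {};
  return props.NAME || props.name || '(unnamed)';
}

console.log(featureName({ properties: { NAME: 'Alabama' } }));
console.log(featureName({ properties: { name: 'Alaska' } }));
```

Inside the click listener, console.log(featureName(d)) would then work with either style of file.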
Setting event handlers with D3 is very similar to other JavaScript libraries. For now we will not go into a lengthy discussion about them, but simply explain functionality as we use it. Feel free to read more about them here.
Map Formatting
In the previous example, the problem of displaying our map properly was solved by setting the projection type to albersUsa. With most GeoJSON datasets that describe a region, things will not be so easy. Let's discuss a few tactics for properly displaying our maps.
What happens if we decrease the size of our SVG? If we set the width to 500px and the height to 300px we get:
The obvious fix for this is to enlarge the SVG, but we can instead fit our map to the SVG by adjusting the projection.
var projection = d3.geo.albersUsa()
.scale(500)
.translate([width/2,height/2])
By chaining the scale and translate methods onto our projection function, we can fit our map back into the bounds of the SVG.
Read more about how these functions interact with projection here.
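Rather than guessing a scale such as 500, one common approach is to measure the map's bounding box at a scale of 1 and derive the scale and translate from it. The sketch below shows only the arithmetic, in plain JavaScript. fitToBox, the 0.95 margin factor, and the sample bounds are illustrative assumptions; in practice the bounds would be measured with the path generator (for example with path.bounds in D3 v3) while the projection is set to scale 1 and translate [0, 0].

```javascript
// bounds: [[x0, y0], [x1, y1]], the map's bounding box measured at scale 1
// with no translation. Returns a scale and translate that center the map
// inside a width x height box, leaving a 5% margin.
function fitToBox(bounds, width, height) {
  var dx = bounds[1][0] - bounds[0][0];
  var dy = bounds[1][1] - bounds[0][1];
  // The tighter of the two dimensions decides how far we can zoom in.
  var scale = 0.95 / Math.max(dx / width, dy / height);
  var translate = [
    (width - scale * (bounds[1][0] + bounds[0][0])) / 2,
    (height - scale * (bounds[1][1] + bounds[0][1])) / 2
  ];
  return { scale: scale, translate: translate };
}

// Hypothetical unit-scale bounds for a 500 x 300 SVG.
var fit = fitToBox([[0, 0], [2, 1]], 500, 300);
console.log(fit.scale, fit.translate);
```

The resulting values would then be passed to projection.scale and projection.translate in place of hand-picked numbers.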
Data Visualization on Maps
There are many ways to visualize data on maps. One of the most popular visualizations is the choropleth. Even if you've never heard this word before, you've probably seen visualizations using it. Here's an example.
This section will walk through the basics of creating a choropleth.
Let's begin by understanding our two data sets. We have school_districts.geojson, a GeoJSON file representing the boundaries of each of the 32 public school districts in New York City, and School_Attendance.csv, which provides a percent attendance for public schools in each district.
Loading the data
In this example we will load both data sets by nesting callback functions. We could also use a D3 plugin, queue, which is a minimal asynchronous helper library. Read more about that here.
In the code below, we load the attendance data second and follow the loading with a quick renaming of the properties of interest, mainly for convenience.
We then do a bit of data scrubbing. Looping through the attendance data, we make sure that the attendance percentage is formatted as a decimal and also that the school district is a number rather than a string.
Finally, again for convenience, we attach the attendance percentage to the properties of the school district feature in our map object.
d3.json('school_districts.geojson', function(err,map){
  console.log(map.features)
  d3.csv('School_Attendance.csv')
    // keep only the two columns of interest, under shorter names
    .row(function(d){return {district: d.District, att: d['YTD % Attendance (Avg)']}})
    .get(function(err,data){
      data.forEach(function(row){
        // scrub: percentage string -> decimal, district label -> number
        row.att = parsePercent(row.att);
        row.district = parseDist(row.district)
        // attach the attendance figure to the matching district feature
        map.features.forEach(function(feature){
          if (feature.properties.SchoolDist === row.district){
            feature.properties.att = row.att;
          }
        })
      })
    })
})
The parse functions used are shown below.
var parsePercent = function (str){
return str.substring(0,str.length-1)/100;
}
var parseDist = function (str){
return +str.substring(str.length-3, str.length)
}
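The nested forEach visits every feature once per CSV row, which is fine for 32 districts. As an alternative, the join can be done through a lookup object keyed by district number. The sketch below runs in plain JavaScript; the sample rows and features are hypothetical, and the input formats ('DISTRICT 01', '91.5%') are assumptions about the CSV, which is not shown here.

```javascript
// The two parse helpers from above, repeated so the sketch is self-contained.
var parsePercent = function (str) {
  return str.substring(0, str.length - 1) / 100; // drop the trailing '%'
};
var parseDist = function (str) {
  return +str.substring(str.length - 3, str.length); // last three characters
};

// Hypothetical rows and features in the assumed formats.
var rows = [
  { district: 'DISTRICT 01', att: '91.5%' },
  { district: 'DISTRICT 02', att: '90%' }
];
var features = [
  { properties: { SchoolDist: 1 } },
  { properties: { SchoolDist: 2 } },
  { properties: { SchoolDist: 3 } }
];

// Pass 1: build a lookup from district number to attendance.
var attByDistrict = {};
rows.forEach(function (row) {
  attByDistrict[parseDist(row.district)] = parsePercent(row.att);
});

// Pass 2: attach attendance to each feature that has a matching row.
features.forEach(function (feature) {
  var att = attByDistrict[feature.properties.SchoolDist];
  if (att !== undefined) {
    feature.properties.att = att;
  }
});

console.log(features);
```

A feature with no matching row (district 3 here) is simply left without an att property, which is worth checking for before coloring the map.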
Next, let's set up our SVG, path generator, and projection (remember, we should probably do this prior to loading our data).
var width = 960;
var height = 800;
var svg = d3.select("body").append("svg")
.attr("width", width)
.attr("height", height)
var projection = d3.geo.mercator()
.center([-74.02, 40.65])
.scale(65000)
.translate([480,400])
var path = d3.geo.path()
.projection(projection);
This time, we are going to use a Mercator projection because it simplifies the process of displaying our map. Observe that we have chained an additional method off of our projection: projection.center. This centers the projection at a specified longitude and latitude. In this case, we're pulling these coordinates directly from the GeoJSON dataset describing New York City's school districts!
A downside to using the Mercator projection is that it does not preserve areas. This is undesirable for creating choropleth visualizations, because regions will become distorted. But for a map of this scale, the distortions are trivial, so it is still an appropriate projection to use.
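To see how center, scale, and translate work together, here is a simplified Mercator sketch in plain JavaScript. It is an approximation for illustration, not D3's code: mercatorRaw and makeMercator are hypothetical names, and rotation and clipping are left out.

```javascript
// Raw spherical Mercator: longitude and latitude in radians in, unitless x/y out.
function mercatorRaw(lambda, phi) {
  return [lambda, Math.log(Math.tan(Math.PI / 4 + phi / 2))];
}

// Build a projection function from a center ([lon, lat] in degrees),
// a scale factor, and a translate ([x, y] in pixels).
function makeMercator(center, scale, translate) {
  var rad = Math.PI / 180;
  var c = mercatorRaw(center[0] * rad, center[1] * rad);
  return function (point) {
    var p = mercatorRaw(point[0] * rad, point[1] * rad);
    return [
      translate[0] + scale * (p[0] - c[0]),
      translate[1] - scale * (p[1] - c[1]) // subtract: SVG y grows downward
    ];
  };
}

// Same settings as the lecture's projection.
var project = makeMercator([-74.02, 40.65], 65000, [480, 400]);
console.log(project([-74.02, 40.65])); // the center lands on the translate point
console.log(project([-73.9, 40.8]));   // a point to the north-east of the center
```

In this sketch the center always maps to the translate point, which is why translate([480,400]) on a 960 by 800 SVG puts the chosen longitude and latitude in the middle of the drawing.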
Now we can create our map.
svg.selectAll('path')
.data(map.features)
.enter().append('path')
.attr('d', path)
With no styling, our map looks a little strange, but don't worry, we're going to fix that.
To create a choropleth, we need each district to correspond to a shade of some color (let's choose blue for now). The correspondence between district and color should be determined by the attendance data bound to that district. To do this efficiently we create a quantize function that maps consecutive ranges of attendance percentage to one of ten different strings. Each string is used as the class of a path, and the classes render progressively darker shades of blue as the fill color.
Below is the quantize function.
var quantize = d3.scale.quantize()
.domain([.86,.94])
.range(d3.range(10).map(function(i) {return 'a'+i}))
The interval from 0.86 to 0.94 (the range of attendance percentages for all the districts) is divided into ten equal subintervals, which are mapped to the strings a0 through a9. These strings are used as class names, each of which fills a path with a shade of blue.
.a0 { fill:rgb(247,251,255); }
.a1 { fill:rgb(222,235,247); }
.a2 { fill:rgb(198,219,239); }
.a3 { fill:rgb(158,202,225); }
.a4 { fill:rgb(107,174,214); }
.a5 { fill:rgb(66,146,198); }
.a6 { fill:rgb(33,113,181); }
.a7 { fill:rgb(8,81,156); }
.a8 { fill:rgb(8,48,107); }
.a9 { fill:rgb(8,24,80); }
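The bucketing itself is simple arithmetic. The following plain JavaScript sketch mimics what a quantize scale does with this domain and range; makeQuantize is a hypothetical stand-in written for illustration, not D3's own function.

```javascript
// Build a function that maps a continuous domain [min, max] onto a
// discrete list of output values by splitting the domain into equal bins.
function makeQuantize(domain, range) {
  var x0 = domain[0];
  var binsPerUnit = range.length / (domain[1] - domain[0]);
  var last = range.length - 1;
  return function (x) {
    var i = Math.floor(binsPerUnit * (x - x0));
    // Values outside the domain are clamped to the first or last bin.
    return range[Math.max(0, Math.min(last, i))];
  };
}

// Class names a0 through a9, matching the CSS rules above.
var classes = [];
for (var i = 0; i < 10; i++) {
  classes.push('a' + i);
}

var quantize = makeQuantize([0.86, 0.94], classes);
console.log(quantize(0.865)); // lowest bin
console.log(quantize(0.905)); // a middle bin
console.log(quantize(0.935)); // highest bin
```

Each bin here is 0.008 wide, so two districts whose attendance differs by less than a percentage point can still land in different shades.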
Finally, we chain an additional method onto the path definition, setting the class attribute.
svg.selectAll('path')
.data(map.features)
.enter().append('path')
.attr('d', path)
.attr('class', function(d){
return quantize(d.properties.att)
})
This will render the map.
Great! Although you may be surprised to learn that the darker colors in this map represent higher attendance rates. Usually we associate darker colors with poorer performance in choropleth visualizations. We can fix this easily enough by reversing the order of our class assignments. Try it!
Tool Tips
Despite how satisfying this picture may be, it still isn't actually telling us anything. We need to display what each color represents. We could do this easily enough with a fixed legend, but that's boring. Let's create something more dynamic.
First, let's define a <div> element that will appear whenever our mouse hovers over a district.
var toolTip = d3.select('body').append('div')
.attr('class','toolTip')
Next we set some of the styling for the tool tip class.
.toolTip {
color: #222;
background: #fff;
padding: .5em;
text-shadow: #f5f5f5 0 1px 0;
border-radius: 2px;
box-shadow: 0px 0px 2px 0px #a6a6a6;
opacity: 0.9;
position: absolute;
}
.hidden {
display: none;
}
Finally, we append event listeners to our path elements so that the tool tip will appear and display the data.
svg.selectAll('path')
  .data(map.features)
  .enter().append('path')
  .attr('d', path)
  .attr('class', function(d){
    return quantize(d.properties.att)
  })
  .on('mousemove', function(d){
    var mouse = d3.mouse(svg.node());
    toolTip.classed("hidden", false)
      .attr("style", "left:"+(mouse[0]+25)+"px;top:"+mouse[1]+"px")
      .html('School District: '+d.properties.SchoolDist+'<br>% Attendance: '+Math.floor(d.properties.att*100))
  })
  .on("mouseout", function(d,i) {
    toolTip.classed("hidden", true)
  });
There are a few unfamiliar things going on here. First, the function d3.mouse(container) returns the x and y coordinates of the current event relative to the specified container. In other words, we get the current coordinate of the mouse in the SVG coordinate system. Setting the style attribute positions the <div> relative to the mouse location. Finally, the mouseout event listener hides the <div> when the mouse leaves the district.
When we hover over a district, we now see the relevant data about that district (also, notice that we switched our color assignments around).