Pew’s recent study made it clear that people want more context with their data. To meet this demand, governments and vendors have created tools that allow for custom visualizations within data portals, while resident civic hackers aim to share stories and ease friction through detailed analyses or apps. One commonly used, and misused, visualization tool is the heatmap. Governments, vendors, and civic hackers all frequently deploy heatmaps when trying to tell a spatial story, but is there a reason for this strong preference, or is there a better method?
What are heatmaps?
You’ve definitely seen one of these, and if you haven’t, just check out a weather map. Heatmaps use a color scale to display different values, and the fuzzy borders between zones make them good for expressing probabilities when making predictions. If you’re plotting out specific points from a dataset, however, these fuzzy borders lower the resolution of the location and make it more difficult to determine patterns. In many cases, it’s also not possible to control the arbitrary groupings for values along the color scale and they default to an equal interval method. If a dataset has even a few significant outliers, the map will be distorted by minimizing the difference between dimmer hotspots and their surrounding area. Despite these issues, heatmaps are often the only density visualization available to portal users and they continue to be popular among bloggers.
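To make the outlier problem concrete, here is a minimal sketch of equal interval classification. The counts are made-up numbers, not values from any real dataset, and `equal_interval_class` is a hypothetical helper written for this illustration rather than a function from any portal or GIS package.

```python
def equal_interval_class(value, lo, hi, n_classes):
    """Return the 0-based color class of `value` when [lo, hi] is split
    into `n_classes` bands of equal width."""
    if hi == lo:
        return 0
    width = (hi - lo) / n_classes
    idx = int((value - lo) / width)
    # The maximum value lands exactly on the upper edge; keep it in the top class.
    return min(idx, n_classes - 1)


# Hypothetical per-area counts: two modest hotspots (40, 45) and one extreme outlier.
counts = [2, 3, 5, 8, 12, 40, 45, 900]
lo, hi = min(counts), max(counts)
classes = [equal_interval_class(c, lo, hi, 5) for c in counts]
print(classes)
```

With five classes, each band is roughly 180 units wide, so every value except the outlier falls into the lowest band. The areas with 40 and 45 events end up the same color as the area with 2.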
The alternative: hex binning
Hex binning creates a grid of hexagons over the map, which are then shaded like a choropleth map. Why hexagons? Of the regular shapes that tile a plane without gaps, they are the most compact and the closest to a circle, which is good when you’re trying to represent circular points on a map in a structured and comprehensive manner. In doing so, you can see much more detailed patterns emerge. Further, this visualization is inherently customizable: you can increase the “resolution” of the data by creating smaller hexagons, and you can control the color scale on any software that handles GIS visualizations. When paired with a method of devising breaks (such as Jenks natural breaks or head/tail breaks), you now have a powerful method of accurately displaying spatial density patterns.
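As a rough sketch of how the pieces fit together, the code below bins points into hexagons and then derives class breaks with a simple head/tail rule. It assumes the points are already in a projected, planar coordinate system (not raw latitude and longitude), uses pointy-top hexagons indexed by axial coordinates, and treats the 40% head threshold as a conventional default. All function names and the sample points are hypothetical and chosen for illustration; this is not the method used by any particular GIS package.

```python
import math
from bisect import bisect_left
from collections import Counter


def hex_cell(x, y, size):
    """Return the axial (q, r) index of the pointy-top hexagon containing (x, y).
    `size` is the distance from a hexagon's center to its corners."""
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    s = -q - r
    # Round in cube coordinates, then repair the component with the largest
    # rounding error so that q + r + s stays zero.
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr


def hexbin(points, size):
    """Count how many points fall into each hexagon."""
    return Counter(hex_cell(x, y, size) for x, y in points)


def head_tail_breaks(values, max_head_share=0.4):
    """Split repeatedly at the mean, recursing into the values above it
    while they remain a minority of the current group."""
    breaks = []
    current = list(values)
    while len(current) > 1:
        mean = sum(current) / len(current)
        head = [v for v in current if v > mean]
        if not head:  # all values equal, nothing left to split
            break
        breaks.append(mean)
        if len(head) / len(current) > max_head_share:
            break
        current = head
    return breaks


# Hypothetical planar points: three near the origin, two about one cell to the east.
points = [(0, 0), (0.1, 0.1), (-0.2, 0.05), (1.7, 0.05), (1.8, -0.1)]
counts = hexbin(points, size=1.0)
breaks = head_tail_breaks(counts.values())
# Class 0 is the lightest shade; values above a break move up one class.
shade = {cell: bisect_left(breaks, n) for cell, n in counts.items()}
```

Shrinking `size` raises the “resolution” of the grid, and swapping `head_tail_breaks` for another classification method changes only how the per-hexagon counts map to colors, not the counts themselves.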
Side by side
A local blog, Baltimore: Byte-Sized, recently did a thorough analysis of towed cars in Baltimore which used heatmaps in conjunction with other plots. The first map shows the complete dataset, but as the author describes it, “the disproportional number of vehicles towed from downtown visually compresses the variation between other regions of the city.” To see the dimmer hotspots, in other words, the author had to create a second map that excluded the activity in downtown Baltimore.
Upon doing so, it becomes clear that there are actually several other comparative hotspots. Even while excluding the downtown data, the overflow from the surrounding points causes the downtown area to be shaded in anyway. The same is true for several parks and other greenspaces. The pattern begins to emerge when the viewer can hold both maps in their mind at once, and locations that might be the source of the high volume of activities – hospitals, courthouses, and busy roads – start to pop out. The author overlays a layer of points from the most towed streets to add back in another layer of detail and confirm the earlier hypotheses.
Was all of this necessary? How accurate is the final product? If it took a skilled practitioner three maps and a special program to convey all of this information, what hope does the average resident have when using tools through a data portal?
Below is the same dataset visualized through hex binning. Though the downtown area is clearly the most active, other hotspots are also visible. Further, the resolution of the hexagons allows the viewer to make out individual streets and compare them to the low-traffic neighborhoods and parks surrounding them. Each hexagon is shaded according to its own unique value, rather than a gradient based on its neighbor. As a result, the viewer can find specific high-activity locations more easily.
In the close-up thumbnails of the downtown neighborhoods, this contrast is even more striking. The density and concentration of data points in the center is visible on each, but the faint glows to the left and right of the heatmap are much more clearly defined on the hexmap. At this level of clarity, a Baltimorean would likely be able to identify Union Memorial to the north or UMMC in the west, giving additional context for them to interpret this data.
The tools and analyses that residents have access to represent a growing investment in open data, and the democratization of insight that they afford is largely unprecedented, but this is still a goal half-achieved without expanding choices for visualization. To their credit, some platforms are working to implement more options or are advocating for this technique, and SpatiaLite recently implemented a function, ST_HexagonalGrid. Hex binning is also widely available through ready-made solutions for a number of different programs, and excellent walkthroughs exist to help hackers and bloggers adopt it. With increased interest and accessibility, hopefully more residents will be able to benefit from these visualizations and other GIS innovations.