While I've been working on the forthcoming book and MOOC I've been doing some data wrangling in the background at work. For the 2012 Presidential election I made a gallery of maps that illustrated diverse styles of cartography along with some comments on the map types. Each map can tell a different story of the election. I've been in the process of updating this with a new gallery of the 2016 election results (currently around ten maps but more to come) and I got to the tricky one - the dasymetric dot density map. It requires quite a bit of manipulation of data so here is the map, and in this blog I'll explain a little of the process.
In 2012 I made a similar map for the Obama/Romney election. It was a product of the web mapping technology of the time, made using ArcMap (full disclosure for those who don't know: I work for Esri, who make ArcGIS). At the smallest scale 1 dot = 1,000 votes. At the largest, 1 dot = 10 votes and if you printed the map out it would be as large as a football field. It took 3 months to cajole the largest scale map onto the web! I wanted to update the map and the four years that have intervened have brought new software capabilities. For 2012 I had to generate up to 12 million points and position them. Now, using ArcGIS Pro, I can use the dot density renderer and let the software take the strain, and if I were going all out then why not try and make a map where 1 dot = 1 vote? So, for me, the map is a technical challenge. Part of what I do at work is to push the software to see what it is capable of, to test it and to show others what capabilities it affords.
So how to make the map? Well, it's a product of a number of decisions, each one of which propagates into the map. I'll be doing a proper write-up on the ArcGIS blog in due course but, in summary, a dasymetric map takes data held at one spatial unit (in this case counties) and reapportions it to different (usually smaller) areas. It uses a technique developed by the late Waldo Tobler called pycnophylactic reallocation modelling. Those different areas are, broadly, urban. The point of the map is to show where people live and vote rather than simply painting an entire county with a colour, which creates a map that often misleads. [Waldo sadly passed away recently and I was running the model when I heard of his death a couple of weeks ago. I met him a few times and his legacy to computational geography and cartography is immense.]
I used the National Land Cover Database to extract urban areas. It's a raster dataset at 30m resolution. I used the impervious surface categories and created a polygon dataset with three classes, broadly dense urban, urban, and rural. I then did some data wrangling in ArcGIS Pro (more of that in a different blog) to reapportion the Democrat and Republican total votes at county level into the new polygons. There's some weighting involved so the dense urban polygons get (in total) 50% of the data. The urban get 35% of the data and the rural polygons get 15% of the data. Then I got the dot density renderer in ArcGIS Pro to draw the dots, one for each vote resulting in a map with nearly 130 million dots.
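The weighting step can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the ArcGIS Pro workflow I actually ran: the function name, the tuple layout and the class labels are all hypothetical. It also makes two assumptions the description above doesn't spell out: within a class, votes are shared between polygons in proportion to area, and if a county lacks one of the three classes the remaining weights are rescaled so no votes are lost.

```python
# Class weights from the text: dense urban 50%, urban 35%, rural 15%.
CLASS_WEIGHTS = {"dense_urban": 0.50, "urban": 0.35, "rural": 0.15}


def reapportion(county_votes, polygons, weights=CLASS_WEIGHTS):
    """Split one county's vote total across its land-cover polygons.

    polygons is a list of (polygon_id, land_class, area) tuples.
    Assumptions (illustrative only): votes within a class are shared by
    area, and weights are rescaled when a class is absent from the county.
    """
    # Total area of each class that is actually present in this county.
    area_by_class = {}
    for _, land_class, area in polygons:
        area_by_class[land_class] = area_by_class.get(land_class, 0.0) + area

    # Rescale so the weights of the classes present sum to 1.
    present_weight = sum(weights[c] for c in area_by_class)

    result = {}
    for polygon_id, land_class, area in polygons:
        class_share = weights[land_class] / present_weight
        area_share = area / area_by_class[land_class]
        result[polygon_id] = county_votes * class_share * area_share
    return result


# A made-up county with 1,000 votes for one party and four polygons.
example = reapportion(
    1000,
    [
        ("a", "dense_urban", 2.0),
        ("b", "dense_urban", 2.0),
        ("c", "urban", 10.0),
        ("d", "rural", 100.0),
    ],
)
```

In this made-up county the two small dense urban polygons share half of the votes between them, even though the rural polygon covers most of the area. That is exactly the effect the map relies on.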
The result is a map that pushes the data into areas where people actually live. It leaves areas where no-one lives devoid of data. It reveals the structure of the US population surface. Most maps that take a dasymetric approach will end up looking like this but I think there's value in the approach. To me it presents a better visual comparison of the amount of red and blue than the standard county level map, which maps geography, not people, and overemphasises relatively sparsely populated large geographical areas.
So the map I saw on my desktop late Tuesday afternoon took 35 minutes to draw. Technical challenge achieved. ArcGIS Pro nailed it. This is a map that I couldn't have made in the previous election cycle. I was excited and so I took a quick screengrab, sent out a tweet and went home to walk Wisley the dog.
And that, I thought, was that. I'd put the map on the backburner and return to doing layout reviews for the book and doing last-minute work on the MOOC over the next couple of weeks. But then something unexpected happened. My phone started pinging. Slowly at first but then a little more during the evening as people began to see the map on Twitter and like or re-tweet it. That's nice, I thought. I went to bed. Wednesday morning I woke to a relative avalanche of likes and retweets. I spent the day in Palm Springs at our Developer Summit and my phone never stopped. By the end of the day it had received around 3,000 likes and had been retweeted 2,000 times. I'm writing this Thursday morning and it's currently at 7,000 likes and a little over 3,000 retweets. The side-effect of this 15 minutes of map fame is I've picked up an extra 1,000 followers (a 25% increase) on my nearly 10-year-old Twitter habit.
But there's a problem. The screengrab was quick and dirty and while there have been many and varied comments on the 'map' it's by no means the finished article. I want to create a hi-res version and also make a web map like the 2012 version. I don't have time to do this in the next couple of weeks but it will happen. But be assured, I am aware of a number of issues. Some have already spotted them and commented.
The symbols - I chose a very default red and blue. Each dot has 90% transparency so overlapping dots at this scale will undoubtedly coalesce into clumps. The impression is of colour bleeding across the map. I need to tweak the colours (less saturated) and adjust the transparency to get a better effect. I will also likely do what I did for the 2012 map and classify the data so that at small scales 1 dot = 100 or 1,000 votes etc., to remove visual 'noise' at those scales. I'll also check for too many overlaps and overprinting. I actually think there's a problem in some areas with blue dots overprinting red. There should be more mixing and more purple. And no, there are no yellow dots. The map only displays Democrat and Republican votes in what remains, effectively, a binary voting outcome.
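To see why draw order matters for the overprinting problem, here is a small Python sketch. It assumes dots are blended with the standard 'over' operator at 10% opacity (90% transparency) on a white background. That is an assumption made for illustration and not a description of how the ArcGIS Pro renderer composites symbols. The function name and the colour tuples are hypothetical.

```python
def composite(dots, alpha=0.1, background=(1.0, 1.0, 1.0)):
    """Paint dots in order over a background using 'over' blending.

    dots is a sequence of (r, g, b) colours with channels in 0..1.
    alpha=0.1 corresponds to the 90% transparency mentioned in the text.
    """
    r, g, b = background
    for dot_r, dot_g, dot_b in dots:
        r = dot_r * alpha + r * (1 - alpha)
        g = dot_g * alpha + g * (1 - alpha)
        b = dot_b * alpha + b * (1 - alpha)
    return (r, g, b)


RED = (1.0, 0.0, 0.0)
BLUE = (0.0, 0.0, 1.0)

# Same twenty dots on the same pixel, drawn in two different orders.
red_then_blue = composite([RED] * 10 + [BLUE] * 10)
blue_then_red = composite([BLUE] * 10 + [RED] * 10)
```

Under this assumption, whichever colour is drawn last dominates the pixel even when the counts are equal. That is consistent with blue appearing to overprint red instead of mixing to purple.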
The data - it's county data, reapportioned. Dot maps convey a positioning that is a function of the processing, not where people actually live or vote. Dots are positioned randomly within the reapportioned polygons. Some have, quite reasonably, interpreted the map as showing where individual votes are and this is a fundamental drawback of the approach. No personal information is in the map at all. I also need to double-check a few areas where people have pointed out apparent anomalies in the map, compared to their personal knowledge of the areas. There may be errors. I need to check. That said, it's a function of the way I've used the NLCD so that data is the basis for reapportionment.
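Random placement can be illustrated with a short Python sketch using rejection sampling: pick random points in the polygon's bounding box and keep the ones that fall inside. This is a generic technique shown for illustration, not the algorithm the ArcGIS Pro dot density renderer uses internally. The function names and the L-shaped test polygon are made up, and the polygon is assumed to be a simple ring with non-zero area (otherwise the loop would never finish).

```python
import random


def point_in_polygon(x, y, ring):
    """Ray-casting test: is (x, y) inside the ring of (x, y) vertices?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def random_dots(ring, count, rng=None):
    """Scatter `count` random dots inside a polygon by rejection sampling."""
    rng = rng or random.Random()
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)
    dots = []
    while len(dots) < count:
        x = rng.uniform(min_x, max_x)
        y = rng.uniform(min_y, max_y)
        if point_in_polygon(x, y, ring):
            dots.append((x, y))
    return dots


# A made-up L-shaped polygon and 200 dots (one per vote, say).
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]
dots = random_dots(l_shape, 200, rng=random.Random(2016))
```

Every dot lands somewhere inside the polygon, but its exact position carries no meaning. Run it with a different seed and the same 200 votes appear in different places.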
The geography - yes, I hold my hand up. There's no Alaska or Hawaii. I apologise. I'm not sure I'll go back as it requires moving those states to position them around the lower 48 and put them back in. It's conceptually easy but still a non-trivial task when you're working in a GIS. I'll think about it. I understand this is unpalatable for some and I accept that criticism.
The interpretations - many have offered some fascinating insights into the gaps and the patterns through Twitter replies. I'll be going through these more carefully when the hullabaloo dies down and teasing out some. But more than anything I've been blown away by the nice things that have been said about the map. It shows the election result in a different way. It tells a different story. One of my favourite responses was this by Thomas de Beus...a lovely mashup and play on the classic photo of Trump's preferred view of the data to hang on the wall of the White House by Trey Yingst.
And this is the point of making a map like this. It presents the SAME data in a different way. It leads to different insights, different interpretations and a different perception. Neither of the above is right or wrong. They are different. Of course, we all have our own view on which serves our needs and which we prefer but that's for us as individuals.
My only regret is that I excitedly tweeted a rough version. I should have waited until I made the map properly. I'll do that but I suspect this is my one viral 15 minutes of fame and I regret it doesn't reflect the quality I know the final version will exhibit. A finished map likely won't get the same traction but we'll see. At the very least it has ignited a discussion. It brings different cartographic eyes to the dataset. Will it ever be hung in the White House? Unlikely.
Thanks for your interest and comments thus far!
Hurriedly written from a hotel in Palm Springs during which time the map's had many more likes, 11 more mentions and I've picked up another 86 followers. I can only apologise to them when they realise I tweet just as much about beer and football as I do about maps.
Update: There's now a web map which shows the data at 6 scales in much more detail than the screengrab above. Check it out here.