cartogram design

I study cartogram design. Cartograms are thematic maps in which the enumeration units (states, countries) are resized based on a particular attribute (population, carbon emissions). There are dozens of types/designs of cartogram and many methods/algorithms for cartogram production.

These have gotten a lot of attention lately (uses the Gastner-Newman diffusion-based algorithm).

Some cartographers manually tweak their automatically generated cartograms to better preserve shape or topology (from the Dutch company Mapping Worlds).

Some preserve shape completely (from the Online Atlas of the Millennium Development Goals also by Mapping Worlds).

Others abstract enumeration units to geometric primitives (generated by my Python script, based on Daniel Dorling’s algorithm).

And of course many other designs can be found between these extremes (redrawn from the NY Times 2006 election results app).

The standard approach to cartogram design is to classify them as either contiguous or noncontiguous (with some adding a pseudo-contiguous category). As the small gallery above illustrates, this is inadequate. It seems to me that cartogram designs vary along three dimensions, and that variation along each dimension is continuous.

  1. Shape preservation — how much the original shapes are preserved on the transformed cartogram (can be quantified with local angles and edge length ratios)
  2. Topology preservation — how well adjacencies are preserved
  3. Density equalization — how accurately unit size represents the chosen attribute
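To make the second dimension a bit more concrete, here is a minimal Python sketch of one way topology preservation could be scored: the fraction of original adjacencies that survive on the cartogram. The function name, the pair-based input format, and the toy state pairs are all assumptions for illustration, not the measure used in my thesis.

```python
def adjacency_preservation(original_pairs, cartogram_pairs):
    """Return the fraction of original adjacencies still present on the cartogram.

    Each adjacency is a pair of unit names; order within a pair is ignored.
    """
    original = {frozenset(pair) for pair in original_pairs}
    transformed = {frozenset(pair) for pair in cartogram_pairs}
    if not original:
        return 1.0  # nothing to preserve, so nothing was lost
    return len(original & transformed) / len(original)


# Hypothetical toy data: four neighbor pairs on the base map,
# three of which still touch on the cartogram.
original = [("WI", "MN"), ("WI", "IA"), ("WI", "IL"), ("MN", "IA")]
cartogram = [("MN", "WI"), ("WI", "IL"), ("MN", "IA")]

score = adjacency_preservation(original, cartogram)
print(f"adjacencies preserved: {score:.2f}")
```

A score of 1.0 would be a fully contiguous cartogram and 0.0 a fully exploded one, which fits the idea that this dimension is continuous rather than a contiguous/noncontiguous binary.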

The last of these perhaps requires more explanation. Isn’t size on a cartogram supposed to perfectly reflect the chosen attribute? Sure, but some recent cartogram algorithms (Gastner-Newman slightly, Kocmoud-House somewhat more) allow for some inaccuracy in order to better preserve shape or topology. Since readers can’t accurately estimate area anyway, this seems like a fair tradeoff.
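As a rough illustration of what "inaccuracy" means here, the sketch below compares each unit's area on the cartogram with the area it would have if size matched the attribute exactly. The function name, the relative-error formulation, and the numbers are assumptions of mine; the algorithms named above may define their error terms differently.

```python
def area_errors(areas, values):
    """Relative area error per unit: (actual - target) / target.

    `areas` maps unit name -> area on the cartogram.
    `values` maps unit name -> attribute value (assumed positive).
    The target area gives every unit the same share of total area
    as it has of the total attribute value.
    """
    total_area = sum(areas.values())
    total_value = sum(values.values())
    errors = {}
    for unit, value in values.items():
        target = total_area * value / total_value
        errors[unit] = (areas[unit] - target) / target
    return errors


# Hypothetical cartogram with three units.
areas = {"A": 50.0, "B": 30.0, "C": 20.0}
values = {"A": 500, "B": 250, "C": 250}

errors = area_errors(areas, values)
for unit, err in errors.items():
    print(f"{unit}: {err:+.0%}")
```

In this made-up case unit A is exact, B is drawn too large, and C too small. A perfectly density-equalized cartogram would report zero for every unit.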

To show these continua, and better portray the tradeoffs involved in preserving individual properties, I drafted the Cartogram Cube. It has helped me think through some of these issues while writing my thesis.


Cartogram Cube

Does any of the above matter? Well, only to the extent that it helps us make better maps. But I believe cartogram effectiveness has a lot to do with these design characteristics, and depends largely on the property tradeoffs made in cartogram design (for both manually and algorithmically produced cartograms). Indeed, this is precisely what the results of my thesis, to be defended on May 13, indicate.

More on my actual results later.

aag post 0

Tomorrow, two of my fellow UW cartographers and I board a train in Chicago for our 24-hour ride to the Annual Meeting of the Association of American Geographers in Boston. I’ll present a paper, Cartograms for Political Cartography: A Question of Design, on Saturday, but I’m more excited about the visualization and cartography talks throughout the week.

To see what I can look forward to, I worked up a quick tag cloud of the keywords used by presentations marked ‘visualization’ in the AAG Preliminary Program.

3d visualization cartography climate change geographic visualization geovisualization gis giscience interactive map modeling self-organizing maps uncertainty visual analytics visualization

And for my fellow cartophiles, a cloud of the keywords used by presentations marked ‘cartography’.

animation art atlas cartograms cartography cinema computer mapping critical cartography education geographic education geology gis hazards historical cartography historical geography history of cartography map map use mapping maps navigation political cartography political geography propaganda maps remote sensing united states visualization
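For anyone wanting to try something similar, a tag cloud like these boils down to counting keywords and scaling each count to a font size. This is a minimal sketch under my own assumptions (the function name, the point-size range, and the sample keywords are made up), not the tool that produced the clouds above.

```python
from collections import Counter


def tag_sizes(keywords, min_pt=10, max_pt=32):
    """Map each keyword to a font size scaled linearly by its frequency."""
    counts = Counter(k.strip().lower() for k in keywords if k.strip())
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    sizes = {}
    for tag, n in counts.items():
        # If every tag is equally common, give them all the largest size.
        scale = (n - lo) / span if span else 1.0
        sizes[tag] = round(min_pt + scale * (max_pt - min_pt))
    # Alphabetical order, as tag clouds are usually laid out.
    return dict(sorted(sizes.items()))


# Hypothetical keyword list, one entry per presentation keyword.
sample = ["GIS", "visualization", "gis", "Cartography",
          "visualization", "visualization"]

cloud = tag_sizes(sample)
print(cloud)
```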

When I return, I’ll probably follow up this post with one or two more. My angle? Visualization in contemporary academic geography.

BTW, the map above is from Alex Tait & Erik Steiner’s actually quite cool Amtrak Route Atlas. Oh, and our route stops in Albany because of track repairs; from there we hop on a bus to Boston. Bring it on, Amtrak!

choropleth mapping and standardization

Choropleth maps, like the one above, use the visual variable of value (aka shade or lightness), sometimes in concert with hue or saturation, to present data about features. Academic cartographers teach that this symbology should only be used in certain circumstances.

  1. The phenomenon being mapped must be thought to vary only between enumeration units, and not within them. That is, the phenomenon is observed to change abruptly at enumeration unit borders (abrupt).
  2. There is little variation in the shape/size of enumeration units (uniformity).
  3. The data itself must be standardized (divided) by another attribute value of the feature, typically land area or a related variable (standardized).
  4. The phenomenon being mapped is continuous, that is, present everywhere (continuous).

Though I think there’s reason to talk about/problematize all four, I’m most interested in the third criterion: standardization. It is taken as gospel in academic cartography that for a choropleth to “work”, the underlying data must be standardized. Standardization is needed, according to Dent (1996), because “the varying size of areas and their mapped values will alter the impression of the distribution.” Slocum (2005) backs up Dent, writing that standardization is necessary to “account for varying sizes of enumeration units”.

These same cartographers, though, stress that standardization need not be by land area. Slocum lists three methods of standardization other than land area (crimes per capita, say, or really any per capita variable). Slocum justifies using non-area variables because “this approach indirectly standardizes for area because larger areas tend to have larger values for both attributes”.

This seems questionable, especially since small enumeration units with high values are a problem standardization was designed to address. My point: dividing a variable (say, elderly population) by a non-area variable (say, total population) does nothing to address the issue of varying unit size on the resultant choropleth map.
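A quick worked example, with entirely hypothetical counties and numbers, shows the issue. Two units with identical populations and elderly counts but very different land areas get the same per capita value, and so the same shade, even though one covers a hundred times more of the map.

```python
# Hypothetical enumeration units: same people, very different land areas.
units = {
    "Large County": {"area_km2": 10000.0, "population": 20000, "elderly": 4000},
    "Small County": {"area_km2": 100.0, "population": 20000, "elderly": 4000},
}

# Standardize by a non-area variable (total population).
shares = {name: u["elderly"] / u["population"] for name, u in units.items()}

# Standardize by land area.
densities = {name: u["elderly"] / u["area_km2"] for name, u in units.items()}

for name in units:
    print(f"{name}: share = {shares[name]:.2f}, "
          f"density = {densities[name]:.1f} per km2")
```

The per capita shares are identical (0.20), so both counties would fall in the same class on the choropleth. Only the per-area densities (0.4 versus 40.0 per km2) respond to the difference in unit size.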

It seems to me we’re left with two options.

  • Either we must standardize our data, and it must be by land area, to correct for the varying size of areas on the map.
  • Or we can standardize by any other variable, and indeed abandon standardization altogether by mapping raw totals.

I won’t weigh in just yet, as I could really go either way. Are there other justifications for standardization that don’t require standardization by land area?