<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>indiemaps.com/blog</title>
	<atom:link href="http://indiemaps.com/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://indiemaps.com/blog</link>
	<description>the notebook of cartographer zachary forest johnson</description>
	<pubDate>Fri, 26 Feb 2010 04:37:02 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
	<language>en</language>
			<item>
		<title>Building topology in Flash</title>
		<link>http://indiemaps.com/blog/2010/01/building-topology-in-flash/</link>
		<comments>http://indiemaps.com/blog/2010/01/building-topology-in-flash/#comments</comments>
		<pubDate>Wed, 13 Jan 2010 05:27:32 +0000</pubDate>
		<dc:creator>zach'ry</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[bitmap]]></category>

		<category><![CDATA[BitmapData.threshold()]]></category>

		<category><![CDATA[borders]]></category>

		<category><![CDATA[cartograms]]></category>

		<category><![CDATA[circular cartograms]]></category>

		<category><![CDATA[collision detection]]></category>

		<category><![CDATA[color]]></category>

		<category><![CDATA[dorling]]></category>

		<category><![CDATA[flash]]></category>

		<category><![CDATA[Flex]]></category>

		<category><![CDATA[game development]]></category>

		<category><![CDATA[generalization]]></category>

		<category><![CDATA[GIS]]></category>

		<category><![CDATA[mapping]]></category>

		<category><![CDATA[mathematics]]></category>

		<category><![CDATA[raster]]></category>

		<category><![CDATA[topology]]></category>

		<guid isPermaLink="false">http://indiemaps.com/blog/?p=106</guid>
		<description><![CDATA[
For a while now, I&#8217;ve wanted to build geographic topology in Flash.  Topology, as described in an ESRI white paper, is
a set of rules and behaviors that model how points, lines, and polygons share geometry.  For example, adjacent features, such as two counties, will share a common edge. 
For the applications I&#8217;ve had [...]]]></description>
			<content:encoded><![CDATA[<div class="centerIMG"><img src="http://indiemaps.com/images/bitmapTopology/rgb.png" alt="" /></div>
<p>For a while now, I&#8217;ve wanted to build geographic topology in Flash.  Topology, as described in an <a href="http://www.esri.com/library/whitepapers/pdfs/gis_topology.pdf">ESRI white paper</a>, is</p>
<blockquote><p>a set of rules and behaviors that model how points, lines, and polygons share geometry.  For example, adjacent features, such as two counties, will share a common edge. </p></blockquote>
<p>For the applications I&#8217;ve had in mind, polygonal or areal topology is all that&#8217;s required: I need to know which features share a common border with which features. As a bonus, it&#8217;d be nice to know how long a border they share (how contiguous are they?). </p>
<p>Described further down is a raster method to determine vector feature adjacency. First, though, I&#8217;ll cover why such information must be built in the first place and how it is useful for certain cartographic applications.</p>
<h3>Application to cartography and visualization</h3>
<p>Topology needs to be built because it is not encoded in the most popular vector GIS formats (KML and the shapefile).  In these formats, features (states or counties perhaps) are all stored separately; redundant points (like the shared corner of the &#8220;four corners&#8221; states) are repeated. There <em>are</em> topological GIS formats, like <a href="http://indiemaps.com/blog/2009/02/e00parser-an-actionscript-3-parser-for-the-arcinfo-export-topological-gis-format/">Arc/Info Export (e00)</a>, but geospatial data are rarely distributed in such formats.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/bitmapTopology/mapAndGraph.png" alt="" /></div>
<p>One potential use for topological information is graph decomposition.  Dasgupta <em>et al</em> write in <a href="http://www.cs.berkeley.edu/~vazirani/algorithms.html"><em>Algorithms</em></a>:</p>
<blockquote><p>A wide range of problems can be expressed with clarity and precision in the concise pictorial language of graphs. For instance, consider the task of coloring a political map. What is the minimum number of colors needed, with the obvious restriction that neighboring countries should have different colors? One of the difficulties in attacking this problem is that the map itself, even a stripped-down version like Figure 3.1(a), is usually cluttered with irrelevant information: intricate boundaries, border posts where three or more countries meet, open seas, and meandering rivers. Such distractions are absent from the mathematical object of Figure 3.1(b), a graph with one <em>vertex</em> for each country (1 is Brazil, 11 is Argentina) and <em>edges</em> between neighbors. It contains exactly the information needed for coloring, and nothing more.</p></blockquote>
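<p>As a minimal sketch of that idea, assuming adjacency has already been determined, a map can be held as a plain dictionary of neighbor sets and colored greedily. The <span class="incode">greedy_coloring</span> helper and the sample data are hypothetical, and a greedy pass does not guarantee the minimum number of colors.</p>

```python
def greedy_coloring(adjacency):
    """Assign each feature the lowest color index not used by a colored neighbor."""
    colors = {}
    # Visit the most-connected features first; this is a heuristic, not an optimum.
    for feature in sorted(adjacency, key=lambda f: (-len(adjacency[f]), f)):
        taken = {colors[n] for n in adjacency[feature] if n in colors}
        color = 0
        while color in taken:
            color += 1
        colors[feature] = color
    return colors


# Hypothetical adjacency for the "four corners" states, counting only shared edges
# (the states that meet only at the corner point are not listed as neighbors).
four_corners = {
    "Utah": {"Colorado", "Arizona"},
    "Colorado": {"Utah", "New Mexico"},
    "Arizona": {"Utah", "New Mexico"},
    "New Mexico": {"Colorado", "Arizona"},
}
coloring = greedy_coloring(four_corners)
```

<p>Only the neighbor sets are needed here; none of the boundary geometry enters the computation.</p>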
<p>The simple graph above is two steps away from a circular cartogram of the form popularized by Daniel Dorling. On such maps, 1) feature circles are sized according to some numeric attribute, and neighbors &#8212; instead of being connected by lines &#8212; are 2) iteratively moved together or apart so that adjacencies are maintained where possible while avoiding circle overlap. The circular cartogram below is a snapshot of a <a href="http://www.nytimes.com/interactive/2008/08/04/sports/olympics/20080804_MEDALCOUNT_MAP.html">NY Times interactive app</a> developed in part <a href="http://leebyron.com/how/2008/08/09/olympic-medals-cartogram/">by Lee Byron</a>.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/bitmapTopology/nyTimesDorling.png" alt="" /></div>
<p>In addition to knowing the neighbors of all features, <a href="http://indiemaps.com/blog/2008/01/dorlingpy/">Daniel Dorling&#8217;s original algorithm</a> requires as input the shared border lengths of all contiguous features. These are used in the force-repulsion algorithm &#8212; while attempting to maintain all feature adjacencies, Dorling&#8217;s algorithm applies extra attraction forces to features that share relatively longer borders.</p>
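<p>One plausible way to turn shared border lengths into relative attraction strengths is sketched below. This is an assumption for illustration, not Dorling&#8217;s published code: the <span class="incode">attraction_weights</span> helper and the sample lengths are hypothetical, and the lengths could just as well be overlapping-pixel counts.</p>

```python
def attraction_weights(shared_border):
    """Scale each neighbor's pull by its share of the feature's total shared border."""
    weights = {}
    for feature, borders in shared_border.items():
        total = sum(borders.values())
        # A feature with no neighbors yields an empty dict, so no division occurs.
        weights[feature] = {n: length / total for n, length in borders.items()}
    return weights


# Hypothetical shared border lengths (any consistent unit, e.g. pixels).
shared_border = {
    "A": {"B": 30, "C": 10},
    "B": {"A": 30},
    "C": {"A": 10},
}
weights = attraction_weights(shared_border)
```

<p>In this sketch, feature A would be pulled three times as strongly toward B as toward C.</p>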
<p>In cartography, topology has much value beyond that described above.  Generalization immediately comes to mind; to generalize/simplify/smooth the outline of a polygonal feature, one must also consider the feature&#8217;s neighbors.  If the cartographer fails to do so, gaps may be created where features previously neatly adjoined. That noted, I should say that the method described here &#8212; while quite accurate in evaluating adjacencies &#8212; can only determine adjacency and relative (pixel-based) feature overlap, whereas polygonal shapefile generalization necessitates a true (and computationally intensive) topological indexing of each and every coordinate in the dataset. More on this later.</p>
<h3>Method</h3>
<p>Features stored in KML and shapefiles are defined by a series of latitude-longitude coordinates.  The brute force method of determining if two features are topological neighbors would be to loop through their coordinates searching for an exact match.  Not only would this be quite computationally intensive, but it wouldn&#8217;t detect true neighbors that were digitized independently; a shared border does not necessarily mean that the two features are defined using the exact same control points.</p>
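<p>The brute force test and its blind spot can be sketched in a few lines. The <span class="incode">share_vertex</span> helper and the coordinates are hypothetical, chosen only to show the failure case.</p>

```python
from itertools import product


def share_vertex(ring_a, ring_b):
    """Brute force: compare every coordinate pair, so cost grows as len(a) * len(b)."""
    return any(p == q for p, q in product(ring_a, ring_b))


# Two unit squares that share the edge x = 1, digitized with identical corner points.
west = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
east = [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)]

# A taller eastern neighbor digitized independently: its corners fall elsewhere
# along the line x = 1, so no vertex matches even though the borders coincide.
east_redigitized = [(1.0, -0.5), (2.0, -0.5), (2.0, 1.5), (1.0, 1.5)]

found_exact = share_vertex(west, east)
found_redigitized = share_vertex(west, east_redigitized)
```

<p>The first pair is reported as neighbors; the second is missed, despite sharing the same stretch of border.</p>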
<p>Of course it can be done.  Most GIS software today will accurately build topology from a non-topological data source.  The NY Times&#8217; <a href="http://maps.grammata.com/">Matt Bloch</a> detailed a server-side method in his Wisconsin (natch) Cartography MS thesis (as well as a <a href="http://www.geography.wisc.edu/~harrower/pdf/AUTOCARTO_06_Bloch_Harrower.pdf">2006 AUTOCARTO proceedings paper</a>), used in his <a href="http://mapshaper.org/">MapShaper</a> app, for essentially converting a polygonal shapefile to a non-redundant polyline shapefile.  Though more efficient than simple brute force, these applications still rely on algorithms that require a large number of computations, and this scales up quickly as features get more complex.  This is why topology is always built server-side.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/bitmapTopology/arkansasTopologyTest.png" alt="" /></div>
<p>The method I settled on is similar to <a href="http://www.gamedev.net/reference/articles/article735.asp">pixel-based collision detection</a> (also called &#8220;pro pixel collision&#8221; or &#8220;pixel-perfect collision&#8221;) familiar to game developers.  In Flash, hitTests are only possible with specific points.  To test for overlap of complex polygonal features, game developers have often relied on a pixel-level bitmap test for overlap.  As described in an <a href="http://troygilbert.com/2007/06/pixel-perfect-collision-detection-in-actionscript3/">article by Troy Gilbert</a>,</p>
<blockquote><p>It&#8217;s pretty straightforward: you render two display objects to separate color channels, combine the color channels, then search the resulting image for any overlapping color.</p></blockquote>
<p>I first encountered the technique in a 2005 <a href="http://www.gskinner.com/blog/archives/2005/08/flash_8_shape_b.html">post by Grant Skinner</a> describing a method for shape-based collision detection in Flash. GSkinner&#8217;s method doesn&#8217;t work <em>as is</em> for detecting adjacency between geographic features drawn from shapefiles. But with a few additions, the method is quite accurate and efficient in detecting feature neighbors in real time. Try it out:</p>

<object	type="application/x-shockwave-flash"
			data="http://indiemaps.com/flash/bitmapTopology/FlashTopologyTest.swf"
			base="http://indiemaps.com/flash/bitmapTopology/"
			width="850"
			height="400">
	<param name="movie" value="http://indiemaps.com/flash/bitmapTopology/FlashTopologyTest.swf" />
	<param name="base" value="http://indiemaps.com/flash/bitmapTopology/" />
</object>
<h4>Some code</h4>
<p>This isn&#8217;t really fully developed.  It&#8217;s just something I&#8217;ve been thinking about for a while and wanted to get out there.  <a href="http://indiemaps.com/flash/bitmapTopology/AdjacencyTest.as">Here though</a> is a class I wrote to perform this pixel-based adjacency testing, as used in the above demo. It contains two chief public static methods:</p>
<ul>
<li><span class="incode">getNeighborsForFeature()</span>: Returns a Vector of DisplayObject features determined to be neighbors of the passed-in feature (from a passed-in Vector of all features)</li>
<li><span class="incode">checkFeatureAdjacency()</span>: Checks the two passed-in DisplayObject features for adjacency.  Returns the number of overlapping pixels.</li>
</ul>
<h4>How it works</h4>
<p>To test for adjacency, features are drawn to a bitmap.  The second feature is overlain using the <a href="http://en.wikipedia.org/wiki/Blend_modes#Difference"><em>difference</em> blend mode</a>. Any pixels shared by both features will now be lighter than before; I can check for such pixels using the <a href="http://livedocs.adobe.com/flash/9.0/ActionScriptLangRefV3/flash/display/BitmapData.html#threshold()"><span class="incode">BitmapData.threshold()</span></a> method.</p>
<p>Adjacent features, though, will only overlap if they&#8217;ve been drawn with an external stroke. The two methods above accept a <span class="incode">strokeWidth</span> parameter.  If this is less than 2 (features with external strokes of 1 sometimes fail to detect overlap accurately), I apply a precise GlowFilter to each raster.  This increases accuracy but affects performance; though it&#8217;s not necessary, the methods perform best when features are initially drawn with a 2px stroke (as in the demo above).</p>
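<p>The idea can be sketched outside Flash with plain pixel sets. This is a stand-in under stated assumptions, not the code in <span class="incode">AdjacencyTest.as</span>: set intersection replaces the difference blend and <span class="incode">BitmapData.threshold()</span>, features are simple rectangles, and all function names are hypothetical.</p>

```python
def rasterize_rect(x0, y0, x1, y1):
    """Return the set of integer pixels covered by a half-open rectangle."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


def add_stroke(pixels, width):
    """Grow a pixel set outward by `width` pixels (a square dilation)."""
    offsets = range(-width, width + 1)
    return {(x + dx, y + dy) for x, y in pixels for dx in offsets for dy in offsets}


def overlapping_pixels(a, b, stroke_width=1):
    """Count pixels covered by both features once each has an outer stroke."""
    return len(add_stroke(a, stroke_width).intersection(add_stroke(b, stroke_width)))


west = rasterize_rect(0, 0, 10, 10)
east = rasterize_rect(10, 0, 20, 10)   # shares the edge x = 10 with `west`
far = rasterize_rect(30, 0, 40, 10)    # nowhere near `west`

touching = overlapping_pixels(west, east)
apart = overlapping_pixels(west, far)
unstroked = overlapping_pixels(west, east, stroke_width=0)
```

<p>Without the outer stroke the two adjacent rectangles share no pixels at all, which is why the stroke (or an equivalent glow) is needed before testing.</p>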
<h3>Limitations</h3>
<h4>Accuracy</h4>
<p>This method is quite accurate for most realistic geographies.  But of course it&#8217;s all dependent on 1) the scale and complexity of your data and 2) the size of the BitmapData instance used for testing.  One particular area of concern is features that share a corner but no actual border line segments. In some applications, these should be considered neighbors; in others, not.  With a large enough BitmapData, such adjacencies will be detected.  </p>
<div class="centerIMG"><img src="http://indiemaps.com/images/bitmapTopology/sharedCorner.png" alt="" /></div>
<p>If such features should not be considered neighbors in your application, a threshold may be set.  <span class="incode">checkFeatureAdjacency()</span> returns the number of overlapping pixels; if this number is sufficiently low, features can be considered non-adjacent.</p>
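<p>A minimal sketch of such a cutoff, again using pixel sets rather than BitmapData. The helper names and the value of <span class="incode">min_pixels</span> are hypothetical; a workable cutoff depends on the bitmap size and stroke width.</p>

```python
def stroked_square(x0, y0, size, stroke=1):
    """Pixels of a square plus an outer stroke, as a set of (x, y) tuples."""
    span = range(-stroke, size + stroke)
    return {(x0 + dx, y0 + dy) for dx in span for dy in span}


def is_neighbor(a, b, min_pixels=1):
    """Adjacent only if at least `min_pixels` stroked pixels are shared."""
    return len(a.intersection(b)) >= min_pixels


home = stroked_square(0, 0, 10)
edge_mate = stroked_square(10, 0, 10)     # shares a full edge with `home`
corner_mate = stroked_square(10, 10, 10)  # touches `home` at one corner only

corner_overlap = len(home.intersection(corner_mate))
edge_overlap = len(home.intersection(edge_mate))
```

<p>Here the corner contact produces far fewer overlapping pixels than the shared edge, so a cutoff between the two counts keeps the edge neighbor and drops the corner one.</p>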
<h4>Performance</h4>
<p>I haven&#8217;t done any real performance tests, but the method worked fine in a test with the 3000+ U.S. counties drawn from a large-scale shapefile, run in Firefox on my MacBook; neighbors were detected in real time and nothing was cached.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/bitmapTopology/countyTopology.png" alt="" /></div>
<p>As noted above, the methods perform fastest when features are initially drawn with a 2px stroke.  If the methods are to be called repeatedly, it&#8217;s best to pass in shared BitmapData instances to be used for testing; the larger the BitmapData instance, the greater the accuracy, though this is offset by increased processing time for the BitmapData instance methods. No matter what you pass in, though, this raster-based method is far faster than any true coordinate-based adjacency testing algorithm (important because it is being performed client-side).</p>
<h4>Generalization</h4>
<p>This method simply doesn&#8217;t work for generalization (simplification or smoothing of feature border linework). Such processes require knowledge of all shared line segments, necessitating true coordinate-based topology building; my method only returns the number of overlapping pixels, which only correlates with actual shared border length (this, though, is sufficient information for the Dorling circular cartogram algorithm).</p>
<p>But I hope it&#8217;s useful for at least a few visualization applications.  True topological indexing is best; but this computationally intensive process cannot yet be performed client-side in Flash. I&#8217;ll be working with these methods in the future, most likely in an implementation of Dorling&#8217;s circular cartograms in <a href="http://indiemapper.com/">indiemapper</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://indiemaps.com/blog/2010/01/building-topology-in-flash/feed/</wfw:commentRss>
		</item>
		<item>
		<title>the first thematic maps</title>
		<link>http://indiemaps.com/blog/2009/11/the-first-thematic-maps/</link>
		<comments>http://indiemaps.com/blog/2009/11/the-first-thematic-maps/#comments</comments>
		<pubDate>Wed, 04 Nov 2009 04:22:03 +0000</pubDate>
		<dc:creator>zach'ry</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[Arthur Robinson]]></category>

		<category><![CDATA[cartogram]]></category>

		<category><![CDATA[cartography]]></category>

		<category><![CDATA[Charles Dupin]]></category>

		<category><![CDATA[choropleth]]></category>

		<category><![CDATA[dot density]]></category>

		<category><![CDATA[Edwin Halley]]></category>

		<category><![CDATA[Émile Levasseur]]></category>

		<category><![CDATA[flow maps]]></category>

		<category><![CDATA[Frère de Montizon]]></category>

		<category><![CDATA[Henry Drury Harness]]></category>

		<category><![CDATA[history]]></category>

		<category><![CDATA[isoline]]></category>

		<category><![CDATA[Joseph Minard]]></category>

		<category><![CDATA[Pieter Bruinsz]]></category>

		<category><![CDATA[proportional symbols]]></category>

		<category><![CDATA[symbology]]></category>

		<category><![CDATA[thematic mapping]]></category>

		<category><![CDATA[visualization]]></category>

		<category><![CDATA[William Playfair]]></category>

		<guid isPermaLink="false">http://indiemaps.com/blog/?p=100</guid>
		<description><![CDATA[Below&#8217;s a quick outline of the first maps created with six common cartographic symbologies. All of the below is out there, in books, articles, and blog posts.  Particularly helpful are Alan MacEachren&#8217;s 1979 article The Evolution of Thematic Cartography, Arthur Robinson&#8217;s Early Thematic Mapping in the History of Cartography (1982), Borden Dent&#8217;s thematic cartography [...]]]></description>
			<content:encoded><![CDATA[<p>Below&#8217;s a quick outline of the first maps created with six common cartographic symbologies. All of the below is out there, in books, articles, and blog posts.  Particularly helpful are Alan MacEachren&#8217;s 1979 article <a href="http://www.geovista.psu.edu/publications/MacEachren/MacEachren_Evolution_1979.pdf"><em>The Evolution of Thematic Cartography</em></a>, Arthur Robinson&#8217;s <a href="http://www.press.uchicago.edu/presssite/metadata.epl?mode=synopsis&#038;bookkey=3642990"><em>Early Thematic Mapping in the History of Cartography</em></a> (1982), <a href="http://en.wikipedia.org/wiki/Borden_Dent">Borden Dent&#8217;s</a> thematic cartography textbook, and Michael Friendly and Daniel J. Denis&#8217; <a href="http://datavis.ca/milestones/"><em>Milestones</em></a> chronology.  I couldn&#8217;t find a good modern summary, though, of these &#8220;firsts&#8221; of thematic cartography.  So I put a quick one together.</p>
<p>The six symbologies are <em>the</em> classic thematic cartography representation methods: choropleth, proportional symbol, dot density, flow, isarithmic, and cartogram.  Borden Dent&#8217;s great cartography textbook, in the section titled &#8220;Techniques of Quantitative Thematic Mapping&#8221;, dedicates a chapter to each of these symbologies, and to no others.  The classic Robinson textbook, as well as the modern Slocum <em>et al</em> one, also dedicate more space to these six representation methods than to any others.</p>
<h3>isoline</h3>
<p>The first isarithmic (representation of continuous phenomena using lines of equal value) map is <a href="http://datavis.ca/milestones/index.php?group=1700s&#038;mid=ms53">often ascribed to</a> Edmond Halley, with his <strong>1701</strong> isogonic contour maps of magnetic declination.  </p>
<div class="centerIMG"><img src="http://indiemaps.com/images/firsts/halley_isoline_1701.jpg" alt="Edmond Halley's (1701) isogonic contour map" /></div>
<p>Not so, as the symbology has been traced back to as early as <strong>1584</strong>. Robinson notes in <em>Early Thematic Mapping</em>:</p>
<blockquote><p>
The contour had tentative beginnings as early as the end of the sixteenth century in the form of lines of equal depth on a map of the River Spaarne made in 1584 by the Dutch surveyor Pieter Bruinsz.  The next use of the symbol came more than a hundred years later in 1697 by Pierre Ancellin&#8230;
</p></blockquote>
<div class="centerIMG"><img src="http://indiemaps.com/images/firsts/bruinsz_isoline_1584.png" alt="the earliest known isoline or contour map, produced by Pieter Bruinsz in 1584" /></div>
<p>You can&#8217;t see too much in the above &#8212; a terrible scan I made from Robinson&#8217;s<em> Early Thematic Mapping</em>.  This map isn&#8217;t listed on <a href="http://datavis.ca/milestones/"><em>Milestones</em></a>, nor could I find any reproductions of Bruinsz&#8217;s map on the internet. What I did find was a <a href="http://liber-maps.kb.nl/articles/15egmond15.jpg">detail image</a> of unknown provenance on the interesting site, <a href="http://liber-maps.kb.nl/articles/15egmond.html"><em>Dutch thematic maps on the web</em></a>.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/firsts/bruinsz_isoline_1584_detail.png" alt="detail of the earliest known isoline or contour map, produced by Pieter Bruinsz in 1584" /></div>
<p>On the above you can clearly see the dashed bathymetric line of equal depth.</p>
<h3>choropleth</h3>
<p>Frenchman <a href="http://en.wikipedia.org/wiki/Charles_Dupin">Charles Dupin&#8217;s</a> (<strong>1826</strong>) <a href="http://datavis.ca/milestones/index.php?group=1800%2B&#038;mid=ms99"><em>Carte figurative de l&#8217;instruction populaire de la France</em></a> is the first known choropleth map.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/classedCartograms/dupin1826.jpg" alt="the earliest choropleth map, produced by Charles Dupin in 1826" /></div>
<p>Dupin&#8217;s map was and is well known, and had a great impact on cartographers and statisticians at the time.  Interestingly, the particular form of Dupin&#8217;s choropleth map &#8212; the unclassed variety, in which each unique value gets a unique shade &#8212; didn&#8217;t stick.  The classed variety quickly took hold, with the first such map appearing in an <strong>1828</strong> Prussian atlas, <em>Administrativ-Statistischer Atlas vom Preussischen Staate</em>.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/firsts/prussianChoropleth_classed_1828.png" alt="the earliest classed choropleth, included in an 1828 Prussian atlas" /></div>
<p>Notice not the subtle lean in my scan (from Robinson since I couldn&#8217;t find it elsewhere), but the class breaks shown along the bottom.  The author of the thematic portion of the atlas is unknown.</p>
<h3>dot density</h3>
<p>Dupin&#8217;s fellow countryman Frère de Montizon is responsible for the first dot density map, <a href="http://datavis.ca/milestones/index.php?group=1800%2B&#038;mid=ms105"><em>Carte philosophique figurant la population de la France</em></a> (<strong>1830</strong>). </p>
<div class="centerIMG"><img src="http://indiemaps.com/images/firsts/montizon_dotMap_1830.png" alt="the earliest dot density map, produced by Frère de Montizon in 1830" /></div>
<p>Unlike Dupin, who is fairly well known, Robinson describes Montizon as a &#8220;mystery man&#8221;.  He continues,</p>
<blockquote><p>It is one of the accidents of history that Frère de Montizon&#8217;s invention of the thematic dot map should have gone completely unnoticed.  In terms of cartographic innovation it ranks with the isothermal map, yet as far as can be ascertained no reference to him or to his dot map was made by anyone well into the twentieth century&#8230;this basically simple, logical idea had to wait some thirty years to be reinvented and much longer than that to become generally known.
</p></blockquote>
<p>The symbology was &#8220;reinvented&#8221; by Thure Alexander von Mentzer in his <strong>1859</strong> map of population distribution in the Scandinavian peninsula.  I can&#8217;t find any reproductions of this map anywhere.</p>
<h3>proportional symbols</h3>
<p>The first known example of a proportional symbol map appeared in the <em>Atlas to Accompany Second Report of the Railway Commissioners, Ireland</em> (<strong>1837</strong>).  Though the atlas was apparently quite innovative with regard to thematic cartography, it went largely unnoticed due to limited distribution.</p>
<p>Here is Henry Drury Harness&#8217;s map of Irish population density from the atlas.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/firsts/harness_propSymbols.png" alt="the earliest proportional symbol map, of Irish population, published by Henry Drury Harness in 1837" /></div>
<p>They&#8217;re a bit hard to make out in this horrible scan (again from Robinson &#8212; listed as part of his personal collection), but there they are: circles scaled according to population centered at various Irish cities. This wasn&#8217;t the first time that proportional circles had been used to convey data, but it was the first time it had been done on a map.  Notice the shading? This is also one of the first <a href="http://astro.temple.edu/~jmennis/research/dasymetric/index.htm">dasymetric maps</a>.</p>
<h3>flow maps</h3>
<p>The following map is <a href="http://datavis.ca/milestones/index.php?group=1800%2B&#038;mid=ms113">better known</a>, and could share the title of first proportional symbol map, as it appeared in the same <strong>1837</strong> atlas as the above map.  Here, the innovation is the use of line thickness to convey a quantitative value, in this case the traffic between Irish cities (again by Henry Drury Harness).</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/firsts/harness_flowMap.png" alt="the earliest known flow map, produced by Henry Drury Harness in 1837" /></div>
<h3>cartogram</h3>
<p>I&#8217;ve <a href="http://indiemaps.com/blog/2008/12/early-cartograms/">written about this before</a>, so below&#8217;s a quick summary.</p>
<p>Many sources, including Dent&#8217;s text, point to <a href="http://en.wikipedia.org/wiki/Pierre_Émile_Levasseur">Émile Levasseur</a> as the originator of the value-by-area cartogram.  Levasseur included diagrammatic maps like the following <strong>1868</strong> map of Europe in his economic geography textbooks.</p>
<div class="centerIMG"><img src='/images/levasseur.png' alt='early diagrammatic map (supposedly first cartogram) by Levasseur' class='alignnone' /></div>
<p>The above map was reprinted <a href="http://www.jstor.org/pss/301591">by Funkhouser</a> in 1937 and by Waldo Tobler in his <a href="http://www.geog.ucsb.edu/~tobler/publications/pdf_docs/inprog/Thirtyfiveyears.pdf"><em>Thirty Five Years of Computer Cartograms</em></a>. In the latter, Tobler notes:</p>
<blockquote><p>This is a map of the countries of Europe in which each country is represented by a square whose size is proportional to the area of the country, and with countries in their approximately correct position and adjacency. Could this be called an equal area map? Or is it an equal area cartogram?</p></blockquote>
<p>I believe the former: that this should be considered a diagrammatic equal area map, but not a value-by-area cartogram.  Sara Fabrikant <a href="http://www.geog.ucsb.edu/~sara/html/research/pubs/fabrikant_cagis03.pdf">makes the case</a> for the German provenance of the first cartogram:</p>
<blockquote><p>Another notable and early European contribution to the thematic mapper&#8217;s toolbox is the value-by-area cartogram. Hermann Haack and H. Wiechel published a cartogram depicting election results from the German Reichstag in <strong>1903</strong> (cited in Eckert 1925).</p></blockquote>
<p>Haack and Wiechel&#8217;s map is indeed cited in Eckert&#8217;s famous <em>Die Kartenwissenschaft</em> volumes, but it isn&#8217;t reprinted there.  Nor is it clear that unit areas on their election maps represented something other than land area. The earliest true value-by-area cartogram that I&#8217;ve seen reproduced came by way of Professor John Krygier, who <a href="http://makingmaps.net/2008/02/19/1911-cartogram-apportionment-map/">reprinted</a> William Bailey&#8217;s <strong>1911</strong> population cartogram.</p>
<div class="centerIMG"><img src='/images/bailey-1911.png' alt='perhaps the first American cartogram' class='alignnone' /></div>
<p>I doubt the above is truly the first cartogram, and welcome any scans or references to true value-by-area cartograms that precede the 1911 map.</p>
<h3>conclusion</h3>
<p>Four of the six classic thematic cartography symbologies &#8212; choropleth, dot density, proportional symbol, and flow &#8212; originated between 1826 and 1837. Two of them &#8212; proportional symbol and flow &#8212; were initially produced by one man (Harness), and appeared in the same obscure railway atlas. All were refined in the 19th century, and only one (isolines) predates the century.</p>
<p>Conspicuously absent above are the names <a href="http://en.wikipedia.org/wiki/William_Playfair">William Playfair</a> and <a href="http://en.wikipedia.org/wiki/Charles_Joseph_Minard">Joseph Minard</a>.  Indeed, some sources still cite these well-known engineer-statisticians as the originators of the proportional symbol and flow symbologies.  Playfair may have been the first to use proportional (value-by-area) symbols (specifically circles) to represent quantitative data, but it wasn&#8217;t on a map.</p>
<p>Funkhouser <a href="http://www.jstor.org/pss/301591">credited</a> Minard with the invention of the flow symbology, ignoring Harness&#8217;s innovation eight years earlier:</p>
<blockquote><p>The second, called cartograms with bands (<em>cartogramme a bandes</em>), was originated simultaneously and independently by the French engineer, Minard, and the Belgian engineer, Belpaire, in 1845. This consists of colored bands or ribbons which follow the watercourses and railways on a map, the width of the band being proportional to the amount of traffic or number of passengers carried.</p></blockquote>
<p>It&#8217;s certainly the case, though, that Playfair paved the way for much thematic mapping experimentation in the early 19th C. and that Minard helped popularize the flow, proportional symbol, and choropleth symbologies.</p>
]]></content:encoded>
			<wfw:commentRss>http://indiemaps.com/blog/2009/11/the-first-thematic-maps/feed/</wfw:commentRss>
		</item>
		<item>
		<title>classed cartograms</title>
		<link>http://indiemaps.com/blog/2009/10/classed-cartograms/</link>
		<comments>http://indiemaps.com/blog/2009/10/classed-cartograms/#comments</comments>
		<pubDate>Sat, 17 Oct 2009 04:15:36 +0000</pubDate>
		<dc:creator>zach'ry</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[cartograms]]></category>

		<category><![CDATA[classification]]></category>

		<category><![CDATA[indiemapper]]></category>

		<category><![CDATA[proportional symbols]]></category>

		<category><![CDATA[symbology]]></category>

		<guid isPermaLink="false">http://indiemaps.com/blog/?p=97</guid>
		<description><![CDATA[
Classification is commonplace in thematic cartography.  In choropleth mapping, classification is the norm.  This was not always so.  The first choropleth map (created by Baron Charles Dupin in 1826) was unclassed.  

According to Arthur Robinson, in his Early Thematic Mapping in the History of Cartography:
So far as we now know, the [...]]]></description>
			<content:encoded><![CDATA[<div class="centerIMG"><img src="http://indiemaps.com/images/classedCartograms/classedStates.png" alt="" /></div>
<p>Classification is commonplace in thematic cartography.  In choropleth mapping, classification is the norm.  This was not always so.  The first choropleth map (created by <a href="http://en.wikipedia.org/wiki/Charles_dupin">Baron Charles Dupin</a> in 1826) was unclassed.  </p>
<div class="centerIMG"><img src="http://indiemaps.com/images/classedCartograms/dupin1826.jpg" alt="" /></div>
<p>According to Arthur Robinson, in his <em><a href="http://www.jstor.org/pss/3104629">Early Thematic Mapping in the History of Cartography</a></em>:</p>
<blockquote><p>So far as we now know, the first choropleth map to provide a legend and give class limits to the tones was the 1828 Prussian manuscript map of population density. After the early 1830s most choropleth maps employed classes of various sorts, some based on rankings of the districts, some based on percentage departures from a mean, and some based on categorizing the data itself.</p></blockquote>
<p>It seems that classification in choropleth mapping was adopted for technical rather than theoretical reasons.  Robinson notes that &#8220;engravers were unsuccessful in making such [unclassed] maps, since they did not have full control over the darkness of very many flat tones.&#8221; Nonetheless, empirical research, begun in the late 1970s and continuing into the 1990s, backed up the decision, at least as regards the <em>acquisition</em> of specific information. Data on the <em>recall</em> of specific information and the <em>acquisition</em> of general information was less conclusive; a good summary of this research is provided in Slocum <em>et al</em>&#8217;s cartography textbook.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/classedCartograms/histoLegend.png" alt="" /></div>
<p class="caption">(classed histogram legend from my first thematic map, produced for Mark Harrower&#8217;s <a href="http://www.geography.wisc.edu/~harrower/Geog370/">Geography 370: Introduction to Cartography</a>)</p>
<h3>Range-graded proportional symbols</h3>
<p>In proportional symbol mapping, the <em>unclassed</em> form is the norm.  This, too, seems to be a technical rather than theoretically-based trend.  As noted in the Slocum text:</p>
<blockquote><p>Classed, or <em>range-graded</em>, maps can be created by classing the data and letting a single symbol size represent a range of data values, but <em>unclassed</em> proportional symbol maps are more common.  This might seem surprising given the frequency with which <em>classed</em> choropleth maps are used.  The difference stems, in part, from the ease with which unclassed proportional symbol maps could be created in manual cartography (either an ink compass or a circle template could be used to draw circles of numerous sizes).</p></blockquote>
<p>The logical extension of classed choropleth mapping to proportional symbol mapping was first suggested by Hans Joachim Meihoefer in the late 1960s. Here, too, classification is purported to make the thematic map easier to comprehend; also from Slocum:</p>
<blockquote><p>Range grading is considered advantageous because readers can easily discriminate symbol sizes and thus readily match map and legend symbols; another advantage is that the contrast in circle sizes might enhance the map pattern, in a fashion similar to the use of an arbitrary exponent&#8230;</p></blockquote>
<p>This comes at the expense of precision, as map readers can never get exact values for enumeration units. Borden Dent notes, &#8220;In this scaling method, symbol-size discrimination is the design goal, rather than magnitude estimation.&#8221;</p>
<p>Rarely, though, is exact value retrieval (magnitude estimation) the goal of thematic cartography, at least of the static variety.  Further, map readers are notoriously bad size estimators, as an array of psychophysical studies in and outside of academic cartography have established.  The graph below (from John Krygier&#8217;s excellent <a href="http://makingmaps.net/2007/08/28/perceptual-scaling-of-map-symbols/">blog post on perceptual scaling</a>) sums up the average response to 1, 2, and 3D proportional symbols.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/classedCartograms/apparentmagnitudegraph.png" alt="" /></div>
<p>The above suggests that subjects uniformly underestimate the areas of proportional symbols. Perceptual scaling has been developed in response, in which symbol sizes are perceptually scaled according to a power function, rather than being scaled mathematically.</p>
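<p>To make the difference concrete, here is a minimal sketch of both scaling approaches for circles. Mathematical scaling keeps area proportional to the data value, so the radius grows with the square root (exponent 0.5). Perceptual scaling uses a slightly larger exponent; the 0.5716 below is the value commonly attributed to James Flannery and is an assumption here, not something read off the graph above. The function names are hypothetical.</p>

```javascript
// Radius of a proportional circle relative to a reference value and radius.
// exponent 0.5 gives mathematical (area-proportional) scaling; a larger
// exponent exaggerates big symbols to offset underestimation.
function scaledRadius(value, refValue, refRadius, exponent) {
  return refRadius * Math.pow(value / refValue, exponent);
}

function mathematicalRadius(value, refValue, refRadius) {
  return scaledRadius(value, refValue, refRadius, 0.5);
}

// 0.5716 is the commonly cited Flannery exponent (an assumption in this sketch).
function perceptualRadius(value, refValue, refRadius) {
  return scaledRadius(value, refValue, refRadius, 0.5716);
}

// A value four times the reference doubles the radius under mathematical
// scaling and grows it somewhat more under perceptual scaling.
console.log(mathematicalRadius(4, 1, 10)); // 20
console.log(perceptualRadius(4, 1, 10));
```

<p>Griffin&#8217;s caution still applies: a single exponent corrects the average reader, not any particular one.</p>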
<p>Perhaps more important, though, are the deviations among subjects hidden in the graph above.  T.L.C. Griffin, in a 1985 study, found an <em>average</em> underestimation similar to previous studies, but stressed that &#8220;perceptual rescaling was shown to be inadequate to correct the estimates of poor judges, while seriously impairing the results of those who were more consistent.&#8221;</p>
<p>Both the Slocum <em>et al</em> and Dent cartography textbooks introduce range-grading of proportional symbols as a potential solution to 1) the poor average estimation of symbol size and 2) the high deviation in this underestimation among readers.</p>
<h3>Cartograms then</h3>
<p>This makes a case for classification in proportional symbol mapping.  It also makes the case for classification on cartograms.  Cartograms can be considered a variant of the proportional symbol map in which the symbol shape used is that of the original enumeration unit in some projection.  But in my cartogram research for the M.S. Cartography degree at the University of Wisconsin, I ran across no references to classifying the area of cartogram units.  A more recent search also revealed no references.  Classification seems particularly appropriate to cartograms because the limited research done on their perception suggests that users estimate cartogram feature areas much less accurately than they estimate the areas of the simplified shapes on standard proportional symbol maps.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/classedCartograms/usPopulationCartogram_unclassed.png" alt="" /></div>
<p>The above is a normal, unclassed cartogram showing U.S. population.  Notice the continuous range of state areas, shown below in order of population (but smaller).</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/classedCartograms/orderedStates.png" alt="" /></div>
<p>Despite the two legend chips, I think it&#8217;s quite difficult to estimate the population of states with any degree of precision.  The following classed cartogram abandons precision in favor of a more easily and quickly interpreted map.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/classedCartograms/usPopulationCartogram_classed.png" alt="" /></div>
<p>Exact values can never be retrieved, but each state can quickly be matched with one of four classes (grouped according to <a href="http://www.terraseer.com/help/stis/interface/map/classify/About_natural_breaks.htm">Jenks natural breaks</a> classification).</p>
<p>The above maps were created with a beta version of <a href="http://indiemapper.com">indiemapper</a>.  No other mapping software allows the creation of classed cartograms.  You can get around this when using tools such as <a href="http://scapetoad.choros.ch/">ScapeToad</a> or Frank Hardisty&#8217;s <a href="http://people.cas.sc.edu/hardistf/cartograms/">Cartogram Generator</a> (both of which utilize the contiguous <a href="http://www-personal.umich.edu/~mejn/election/2008/">Gastner-Newman cartogram algorithm</a>) by pre-classing your data, so that only 3-7 unique values are found in the dataset for a given attribute.</p>
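<p>The pre-classing workaround can be sketched in a few lines. The version below replaces every value with the mean of its class, so the attribute ends up with only as many unique values as there are classes. The class bounds and population figures are made up for illustration, and <span class="incode">preClass</span> is a hypothetical helper, not part of any of the tools mentioned above.</p>

```javascript
// Replace each value with the mean of its class, so the dataset holds
// only as many unique values as there are classes.
// upperBounds must be ascending; the last bound should be >= the largest value.
function preClass(values, upperBounds) {
  // Index of the first class whose upper bound contains the value.
  function classIndex(v) {
    for (let i = 0; i < upperBounds.length; i++) {
      if (v <= upperBounds[i]) return i;
    }
    return upperBounds.length - 1; // clamp anything above the top bound
  }

  // Accumulate sums and counts per class to compute class means.
  const sums = upperBounds.map(function () { return 0; });
  const counts = upperBounds.map(function () { return 0; });
  values.forEach(function (v) {
    const c = classIndex(v);
    sums[c] += v;
    counts[c] += 1;
  });

  return values.map(function (v) {
    const c = classIndex(v);
    return sums[c] / counts[c];
  });
}

// Made-up populations (in millions) and made-up breaks, for illustration only.
console.log(preClass([1, 3, 8, 12, 30, 36], [5, 15, 40]));
// [2, 2, 10, 10, 33, 33]
```

<p>Using the class mean is only one choice; any representative value per class would work, since the cartogram algorithm only needs the units within a class to share a value.</p>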
<p>In addition to 1) increased discriminability of symbol sizes and 2) a potentially enhanced spatial pattern, classed cartograms may hold a third advantage over the unclassed variety, though only in bivariate mapping.  Cartograms, though, are most typically and appropriately employed in bivariate mapping.  When constructing a bivariate cartogram, the cartographer sizes enumeration units according to one variable (typically population) and colors the units, choropleth-style, according to some other variable (often election results).  In unclassed cartograms, some features may shrink down so as to be nearly invisible, making the reading of the second (coloring) attribute impossible.  In the classed variant, a small but still legible minimum size can be established for the first class of data, ensuring that the coloring attribute can be interpreted on all units.  A minimum size can be established on an unclassed cartogram, but if mathematical/proportional scaling is employed it may result in absurdly large features at the higher end of the mapped variable.</p>
<p>Of course, classed cartograms won&#8217;t always have a larger minimum size; this must be a conscious design decision (in the U.S. cartograms above, the classed cartogram minimum size is smaller than the minimum size found on the unclassed version).</p>
<h3>Inappropriate</h3>
<p>Classification in cartogram scaling is not always appropriate.  Indeed, the typical unclassed form is preferable in many cases. Borden Dent writes the following of classification and choropleth mapping; I believe it&#8217;s equally applicable to the proportional and cartogram symbologies:</p>
<blockquote><p>The purpose of the choropleth map dictates its form.  If the map&#8217;s main purpose is to simplify a complex geographical distribution for the purpose of imparting a particular message, theme, or concept, the conventional [classed] choropleth technique should be followed.  On the other hand, if the goal is to provide an inventory for the reader in a form that the reader must simplify and generalize, then the unclassed form should be chosen.
</p></blockquote>
<p>This sounds a lot like the modern distinction between cartography and geovisualization, or like the older one between communication and representation.  These distinctions have been overblown; experts and amateurs are often looking at the same map.  But there are certainly cases where the purpose and audience are more clear-cut.  In such cases, classed cartograms may be an option.</p>
<p>While providing the cartographer with the &#8220;ability to direct the message of the communication&#8221; (Dent), this power comes with great responsibility &#8212; specifically the responsible choice of the 1) number of classes, 2) classification method, and 3) appropriate and easily discriminable symbol sizes for each class.</p>
<p>Research on classification in choropleth and proportional symbol mapping has settled on three and seven as the minimum and maximum number of classes that are effective.  Classification methods have been much studied in academic cartography, statistics, and other fields.  Jenks natural breaks classification is often employed; equal intervals and quantiles are also typical.  Cartographers have less guidance on the third choice, that of an appropriate symbol size for each class.</p>
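<p>Of the methods named above, equal intervals and quantiles are simple enough to sketch directly (Jenks natural breaks requires an optimization step and is left out). The functions below return the upper bound of each class; the function names and the sample data are hypothetical.</p>

```javascript
// Upper class bounds for equal-interval classification:
// the data range is split into numClasses equally wide classes.
function equalIntervalBreaks(values, numClasses) {
  const min = Math.min.apply(null, values);
  const max = Math.max.apply(null, values);
  const width = (max - min) / numClasses;
  const breaks = [];
  for (let i = 1; i <= numClasses; i++) {
    // Use max itself for the top class to avoid floating-point drift.
    breaks.push(i === numClasses ? max : min + i * width);
  }
  return breaks;
}

// Upper class bounds for quantile classification: each class receives
// (as nearly as possible) the same number of observations.
function quantileBreaks(values, numClasses) {
  const sorted = values.slice().sort(function (a, b) { return a - b; });
  const breaks = [];
  for (let i = 1; i <= numClasses; i++) {
    const lastIndex = Math.ceil(i * sorted.length / numClasses) - 1;
    breaks.push(sorted[lastIndex]);
  }
  return breaks;
}

// Made-up data with one outlier, to show how the two methods differ.
const sampleData = [1, 2, 3, 4, 5, 6, 7, 8, 100];
console.log(equalIntervalBreaks(sampleData, 3)); // [34, 67, 100]
console.log(quantileBreaks(sampleData, 3));      // [3, 6, 100]
```

<p>With skewed data like this, equal intervals lump nearly everything into the first class, while quantiles spread the units evenly but hide the size of the outlier.</p>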
<div class="centerIMG"><img src="http://indiemaps.com/images/classedCartograms/meihoeferClasses.png" alt="" /></div>
<p>The above shows a set of ten class sizes for range grading developed by Hans Joachim Meihoefer as the result of a visual experiment; the symbol sizes were developed for maximum discriminability, while maintaining a realistic size range (none too small, none too large).  Ten classes weren&#8217;t recommended, but supposedly any five contiguous sizes could be chosen.  Indiemapper uses a slight variation of Meihoefer&#8217;s sizing scheme for classed proportional symbol and cartogram sizing.</p>
<p>Despite the additional responsibilities placed on the cartographer, I believe a carefully-crafted classed cartogram can be a very effective representation of some datasets for certain audiences.</p>
<div class="centerIMG"><a href="http://indiemapper.com"><img src="http://indiemaps.com/images/classedCartograms/imBadge_webBig_mah.png" alt="" /></a></div>
]]></content:encoded>
			<wfw:commentRss>http://indiemaps.com/blog/2009/10/classed-cartograms/feed/</wfw:commentRss>
		</item>
		<item>
		<title>visualizing MLB hit locations on a Google Map</title>
		<link>http://indiemaps.com/blog/2009/07/visualizing-mlb-hit-locations-on-a-google-map/</link>
		<comments>http://indiemaps.com/blog/2009/07/visualizing-mlb-hit-locations-on-a-google-map/#comments</comments>
		<pubDate>Fri, 24 Jul 2009 09:55:20 +0000</pubDate>
		<dc:creator>zach'ry</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[baseball]]></category>

		<category><![CDATA[data]]></category>

		<category><![CDATA[geography]]></category>

		<category><![CDATA[geometry]]></category>

		<category><![CDATA[github]]></category>

		<category><![CDATA[googlemaps]]></category>

		<category><![CDATA[javascript]]></category>

		<category><![CDATA[jquery]]></category>

		<category><![CDATA[json]]></category>

		<category><![CDATA[mapping]]></category>

		<category><![CDATA[mapstraction]]></category>

		<category><![CDATA[mashup]]></category>

		<category><![CDATA[neocartography]]></category>

		<category><![CDATA[neogeography]]></category>

		<category><![CDATA[sabermetrics]]></category>

		<category><![CDATA[slippy maps]]></category>

		<category><![CDATA[sportsviz]]></category>

		<category><![CDATA[visualization]]></category>

		<category><![CDATA[yql]]></category>

		<category><![CDATA[yui]]></category>

		<guid isPermaLink="false">http://indiemaps.com/blog/?p=91</guid>
		<description><![CDATA[Here&#8217;s a map of yesterday&#8217;s perfect game pitched by Mark Buehrle &#8212; only the 18th in MLB history. Highlighted is defensive replacement Dewayne Wise&#8217;s perfect game-saving catch over the wall in the 9th inning.

Here the same, as displayed by MLB&#8217;s Gameday, for which the initial pixel coordinates were collected.

The conversion from the pixel-based coordinates used [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a map of yesterday&#8217;s <a href="http://www.nytimes.com/aponline/2009/07/24/sports/AP-BBA-Rays-White-Sox.html?ref=global">perfect game pitched by Mark Buehrle</a> &#8212; only the 18th in MLB history. Highlighted is defensive replacement Dewayne Wise&#8217;s <a href="http://mlb.mlb.com/media/video.jsp?content_id=5699255">perfect game-saving catch</a> over the wall in the 9th inning.</p>
<div class="centerIMG"><a href="http://indiemaps.com/js-hit-locations-map/index.html?gid=gid_2009_07_23_tbamlb_chamlb_1"><img src="http://indiemaps.com/pitchalyzer/images/gid_2009_07_23_tbamlb_chamlb_1.png" alt="" /></a></div>
<p>Here the same, as <a href="http://mlb.mlb.com/mlb/gameday/index.jsp?gid=2009_07_23_tbamlb_chamlb_1&#038;mode=gameday">displayed by MLB&#8217;s Gameday</a>, for which the initial pixel coordinates were collected.</p>
<div class="centerIMG"><img src="http://indiemaps.com/pitchalyzer/images/gid_2009_07_23_tbamlb_chamlb_1_gameday.png" alt="" /></div>
<p>The conversion from the pixel-based coordinates used on the above to a geographic latitude-longitude space took a fair amount of work. More on that in a bit. So why even attempt it? What is the value added by presenting these data on a satellite photo? </p>
<p>At first, it was just to see if I could do it. From <a href="http://projects.flowingdata.com/walmart/">Wal-Marts</a> to <a href="http://oakland.crimespotting.org/">crimes</a> to <a href="http://www.mapchannels.com/twittermap/iranelection.htm">tweets</a>, the slippy map has become the <em>de facto</em> platform for geovisualization. Hits &#8212; balls in play &#8212; are inherently spatial, though at a micro level that gets rare consideration from neocartographers. The conversion to geographic points would also let me take advantage of the various symbologies written for mapping APIs, from <a href="http://www.heatmapapi.com/">heatmaps</a> to <a href="http://www.acme.com/javascript/#Clusterer">clusterers</a>.</p>
<p>As I continued work, though, I became more interested in the issue of scale: the images produced reveal the scale of baseball, easily compared to the scale of urban life. For example, most MLB ballparks have fences as far as 400 feet from home plate.  In New York City, the average N-S block is 264 feet long.  Such a block may contain dozens of businesses or residences, all of which we&#8217;d expect Google Maps and others to accurately geolocate. So why not hits?</p>
<p>Here a Jake Fox homer at the intersection of Kenmore and Waveland outside Wrigley Field.</p>
<div class="centerIMG"><img src="http://indiemaps.com/pitchalyzer/images/gid_2009_07_05_milmlb_chnmlb_1.png" alt="" /></div>
<p>And here, a <a href="http://indiemaps.com/js-hit-locations-map/index.html?gid=gid_2009_07_19_chnmlb_wasmlb_1">recent Washington Nationals game</a>, with hits placed as though home plate were located at the center of Dupont Circle.</p>
<div class="centerIMG"><img src="http://indiemaps.com/pitchalyzer/images/gid_2009_07_19_chnmlb_wasmlb_1.png" alt="" /></div>
<p>To get an idea for how field dimensions affect the game, we can display hits from a particular stadium as if they occurred at any other.  Here a recent Pirates-Phillies game that took place at the relatively small <a href="http://en.wikipedia.org/wiki/Citizens_Bank_Park">Citizens Bank Park</a>.  In the second image, I&#8217;ve placed and oriented hits as though they took place at the Pirates&#8217; <a href="http://en.wikipedia.org/wiki/Pnc_park">PNC Park</a>.  As you can see, two, maybe three of the home runs would have been swallowed up by the visiting team&#8217;s more cavernous outfield.</p>
<p><img style="float:left; position:relative; left:-35px; margin-right: -25px"  src="http://indiemaps.com/pitchalyzer/images/gid_2009_07_11_pitmlb_phimlb_1.png" alt="" /><img src="http://indiemaps.com/pitchalyzer/images/gid_2009_07_11_pitmlb_phimlb_1_pnc.png" alt="" /></p>
<p>What follows is a description of how I got the data, transformed it to lat-long coordinates, and displayed it on a slippy map.  For the demo proof of concept application, go <a href="http://indiemaps.com/js-hit-locations-map/">here</a>.</p>
<h3>Getting the data</h3>
<p>MLB&#8217;s <a href="http://webusers.npl.illinois.edu/~a-nathan/pob/pitchtracker.html">PITCHf/x</a> system provides highly accurate real-time data on pitch speed, break, and location, all measured using high-speed video cameras. The data &#8212; freely available <a href="http://gd2.mlb.com/components/game/mlb/">from MLB Gameday</a> &#8212; are used in real-time by a number of media outlets and after the fact by many number-crunching sabermetricians. A similar system for batted balls does not exist, though is <a href="http://www.baseballdailydigest.com/blogs/2009/01/14/whats-in-store-for-pitch-hit-fx-in-2009/">apparently in the works</a>.  Fairly accurate hit location data are available from the for-profit firms <a href="http://www.baseballinfosolutions.com/">Baseball Info Solutions</a> and <a href="http://www.stats.com/">STATS Inc</a>, and data on home run distance/location is disseminated freely by <a href="http://www.hittrackeronline.com/">HitTracker</a>.  But the only free source of locational information on all balls in play is the observational data collected by MLB Gameday for non-analytic purposes.</p>
<p>These data are purely observational, meaning hits are plotted by an MLB employee watching each game. For more on the accuracy of such data, see <a href="http://www.hardballtimes.com/main/article/is-seeing-believing/"><em>Is Seeing Believing?</em></a> at <a href="http://www.hardballtimes.com/">The Hardball Times</a>. For each ball in play, then, we have an X-Y coordinate.  This coordinate is in an arbitrary pixel space and must be transformed quite a bit; more on that later.</p>
<h4>Gameday&#8217;s XML data</h4>
<p>MLB Gameday provides both the PITCHf/x and observational hit location data in online directories, starting <a href="http://gd2.mlb.com/components/game/mlb/">here</a>, that are organized chronologically and then by game.  No API is provided, though, for querying this massive database of millions of pitches and hits &#8212; just directories and directories of XML files. Each game is described by dozens of XML files; the root game URL is accessed thusly:</p>
<p><span class="incode">http://gd2.mlb.com/components/game/mlb/ + year_{year}/month_{month}/day_{day}/ + gid_{year}_{month}_{day}_{away_team}mlb_{home_team}mlb_{game_number}/<br />
</span></p>
<p>Ex. <a href="http://gd2.mlb.com/components/game/mlb/year_2009/month_07/day_21/gid_2009_07_21_balmlb_nyamlb_1/">gid_2009_07_21_balmlb_nyamlb_1</a>, <a href="http://gd2.mlb.com/components/game/mlb/year_2009/month_04/day_05/gid_2009_04_05_atlmlb_phimlb_1/">gid_2009_04_05_atlmlb_phimlb_1</a></p>
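<p>Since the date is embedded in the game id, the root URL can be derived from the id alone. A minimal sketch, assuming ids always begin with <span class="incode">gid_{year}_{month}_{day}_</span> as in the examples above (<span class="incode">gamedayUrl</span> is a hypothetical helper name):</p>

```javascript
// Build the root Gameday directory URL for a game id such as
// "gid_2009_07_21_balmlb_nyamlb_1", following the pattern described above.
function gamedayUrl(gid) {
  const match = /^gid_(\d{4})_(\d{2})_(\d{2})_/.exec(gid);
  if (!match) {
    throw new Error("Unrecognized game id: " + gid);
  }
  return "http://gd2.mlb.com/components/game/mlb/" +
    "year_" + match[1] + "/month_" + match[2] + "/day_" + match[3] + "/" +
    gid + "/";
}

console.log(gamedayUrl("gid_2009_07_21_balmlb_nyamlb_1"));
```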
<p>In <a href="http://oreilly.com/catalog/9780596009427/"><em>Baseball Hacks</em></a>, Joseph Adler provides Perl scripts for traversing these directories, parsing the various XML files found in them, and storing the parsed data in a MySQL database. Mike Fast has <a href="http://fastballs.wordpress.com/2007/08/23/how-to-build-a-pitch-database/">updated the scripts</a>, which can be used pretty quickly to create a pitch and hit database with <a href="http://mikefast.googlepages.com/pbp_database_structure_2008.txt">this</a> structure. I&#8217;ve done so, and the advantages of such a database are many, but the speed of queries is perhaps the greatest.  For a small webapp, though, I didn&#8217;t want to go to the hassle of talking to a database, nor did I want the server-side dependencies. The <a href="http://developer.yahoo.com/yql/">Yahoo Query Language</a> (YQL) provides a great solution in such cases, and allowed me to create an API where none existed before and use SQL-like syntax to query it.</p>
<h4>My YQL tables</h4>
<p>YQL lets you use a universal SQL-like syntax on any API or web data source. Since you can use <a href="http://developer.yahoo.com/yql/guide/joins.html">subselects</a>, YQL also lets you join disparate data sources. Typically it is used as a wrapper for APIs, so that developers may use the same syntax across many different data sources. They do so by writing an XML file that describes a YQL table. In a basic table, data are returned from the API unmodified. Since I&#8217;ve no API to call, I rely on YQL&#8217;s <a href="http://developer.yahoo.net/blog/archives/2009/04/yql_execute.html"><span class="incode">execute</span> element</a> to run server-side Javascript for each query.  With just a dozen or so lines of Javascript in the <span class="incode">execute</span> element of each table, I can quite easily return usable data from MLB&#8217;s directories of XML files. </p>
<p>I can also <a href="http://developer.yahoo.net/blog/archives/2009/07/yql_insert.html">use the execute element to insert</a> records into my own database (which can be MySQL or otherwise), allowing me to keep my database up-to-date more easily than with the Adler/Fast Perl update script.  I should note, though, that the YQL queries are nowhere near as fast as querying a local database, and likely aren&#8217;t quick enough for actual production (though they&#8217;re great for the kind of experimentation and rapid prototyping described here).</p>
<p>I initially set out to recreate Mike Fast&#8217;s <a href="http://mikefast.googlepages.com/pbp_database_structure_2008.txt">database environment</a> with YQL tables: seven tables, one each for atbats, games, game types, pitches, pitch types, players, and umpires. For this exercise, I needed only the games and atbats tables.  The latter includes the hit location information if a ball is sent into play.  For both, I&#8217;ve added fields, grabbing more data from the Gameday XML files as needed. I also needed a stadiums table, though this draws from <a href="http://indiemaps.com/pitchalyzer/yql/openairstadiums.xml">data I&#8217;ve personally collected</a>, rather than the Gameday XML files. The three tables are hosted <a href="http://github.com/indiemaps/js-hit-locations-map/tree/master/yql">on github</a> and available for use by anyone.  If you&#8217;re logged into Yahoo, you can try em out right now by <a href="http://developer.yahoo.com/yql/console/?env=http://indiemaps.com/js-hit-locations-map/yql/mlb.gd2.env">loading up the YQL console</a> with the <a href="http://github.com/indiemaps/js-hit-locations-map/raw/master/yql/mlb.gd2.env">mlb.gd2.env</a> environment file. You can then use SQL syntax on these tables. A few example queries (links are to REST queries that can be called from any web app):</p>
<ul>
<li><a href="http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20atbats%20where%20gid%3D%22gid_2009_04_11_nyamlb_kcamlb_1%22%20and%20hit_x%3C%3E%22NaN%22&#038;format=xml&#038;env=http%3A%2F%2Findiemaps.com%2Fjs-hit-locations-map%2Fyql%2Fmlb.gd2.env">SELECT * FROM atbats WHERE gid=&quot;gid_2009_04_11_nyamlb_kcamlb_1&quot; AND hit_x&lt;&gt;&quot;NaN&quot;</a></li>
<li><a href="http://query.yahooapis.com/v1/public/yql?q=SELECT%20stadium_id%20FROM%20games%20WHERE%20year%3D%222009%22%20AND%20month%3D%2207%22%20AND%20day%3D%2223%22&#038;format=xml&#038;env=http%3A%2F%2Findiemaps.com%2Fjs-hit-locations-map%2Fyql%2Fmlb.gd2.env">SELECT stadium_id FROM games WHERE year=&quot;2009&quot; AND month=&quot;07&quot; AND day=&quot;23&quot;</a></li>
<li><a href="http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20stadiums%20where%20id%3D%223%22&#038;format=xml&#038;env=http%3A%2F%2Findiemaps.com%2Fjs-hit-locations-map%2Fyql%2Fmlb.gd2.env">SELECT * FROM stadiums WHERE id=&quot;3&quot;</a></li>
</ul>
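<p>The REST links above all follow the same shape: the query, a format, and the environment file, each URL-encoded. A minimal sketch of building such a link, with the endpoint and environment file location copied from the links above (<span class="incode">yqlUrl</span> is a hypothetical helper name):</p>

```javascript
// Build a public YQL REST URL for a query against the tables in this post.
// The endpoint and env file location are copied from the example links above.
function yqlUrl(query, format) {
  const endpoint = "http://query.yahooapis.com/v1/public/yql";
  const env = "http://indiemaps.com/js-hit-locations-map/yql/mlb.gd2.env";
  return endpoint +
    "?q=" + encodeURIComponent(query) +
    "&format=" + encodeURIComponent(format || "xml") +
    "&env=" + encodeURIComponent(env);
}

console.log(yqlUrl('select * from stadiums where id="3"'));
```

<p>Passing <span class="incode">"json"</span> as the second argument would request JSON instead of XML, assuming the service supports that format for these tables.</p>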
<h3>Transforming the data</h3>
<h4>Pixels to feet</h4>
<p>The hit location data recorded by MLB is reported in pixels.  The employees watching the games plot locations on a 250&#215;250 image of the ballpark; the (0,0) origin is &#8212; not home plate &#8212; but simply the upper left corner of the image.  The first step in making these data usable is to convert from this arbitrary pixel coordinate system to one in feet or meters.  We need to know 1) the scale of the 250&#215;250 image (feet per pixel) and 2) a known point on the field (preferably home plate) in the original pixel coordinates. </p>
<p>Initially, I assumed that these numbers would be the same or similar for all MLB ballparks.  An <a href="http://www.hardballtimes.com/main/article/using-gameday-to-build-a-fielding-metric-part-1/">article by Peter Jensen</a> disabused me of this notion. Jensen undertook to determine the home plate X-Y location and distance multiplier for each MLB ballpark by assuming a uniform hit ball distribution and imposing physical constraints on hit balls. Unfortunately, when I plotted hits using Jensen&#8217;s numbers, the ball distribution diverged quite obviously from the Gameday-plotted version: outfield balls were plotted too far from home plate, infield balls too close.  Jensen&#8217;s study used 2007-2008 hit locations, and MLB changes their images from season-to-season, so this may explain the divergence.  Either way, I needed a new method.</p>
<p>I devised this simpler but less systematic approach. To determine the distance multiplier, I took a look at the stadium images provided by MLB Gameday. The stadium diagrams shown on <a href="http://mlb.mlb.com/mlb/gameday/index.jsp?gid=2009_07_22_balmlb_nyamlb_1&#038;mode=gameday">Gameday&#8217;s real-time app</a> are all 250&#215;250 in expanded mode. Though I&#8217;m not positive that these are the exact diagrams upon which hits are initially plotted, they likely are as the resultant distance multipliers are quite accurate.  Given a known distance (typically the straightaway center field fence distance), I can calculate the scale of the image in feet per pixel.</p>
<div class="centerIMG"><img src="http://indiemaps.com/pitchalyzer/images/yankeeStadium.png" alt="" /></div>
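<p>The scale calculation amounts to dividing a known distance in feet by the same distance measured in pixels on the diagram. A minimal sketch; the pixel coordinates and the 400-foot distance are placeholders, not measurements from any real stadium image, and <span class="incode">feetPerPixel</span> is a hypothetical name:</p>

```javascript
// Feet per pixel, given home plate and a landmark (e.g. the straightaway
// center field fence) in image pixels, plus the known distance between
// them in feet.
function feetPerPixel(homePlatePx, landmarkPx, knownDistanceFeet) {
  const dx = landmarkPx.x - homePlatePx.x;
  const dy = landmarkPx.y - homePlatePx.y;
  const pixelDistance = Math.sqrt(dx * dx + dy * dy);
  return knownDistanceFeet / pixelDistance;
}

// Placeholder numbers: a fence 160 pixels straight up from home plate
// that is known to be 400 feet away gives 2.5 feet per pixel.
console.log(feetPerPixel({ x: 125, y: 210 }, { x: 125, y: 50 }, 400)); // 2.5
```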
<p>Determining home plate locations in Gameday&#8217;s coordinate system was less straightforward.  In <em>Baseball Hacks</em>, Joseph Adler creates spray charts of the field using Gameday data.  Unlike Jensen, Adler simply uses (125, 210) as the home plate location in pixels.  I figured I could use this as a starting point, but to my surprise I found that I only needed to modify this origin for a half dozen stadiums. Of course, the only accuracy check I have is eyeballing back-and-forth between my rendering and the Gameday version, but given the observational nature of the data, I believe this is good enough. </p>
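<p>With a home plate location and a scale in hand, the pixel-to-feet step is a translation, a flip of the y axis (image y grows downward), and a multiplication. A minimal sketch, using Adler&#8217;s (125, 210) as home plate and a placeholder scale of 2.5 feet per pixel; <span class="incode">pixelsToFeet</span> is a hypothetical name:</p>

```javascript
// Convert a Gameday pixel coordinate to feet relative to home plate.
// Image y grows downward, so it is flipped to make +y point toward
// center field.
function pixelsToFeet(hitPx, homePlatePx, scaleFeetPerPixel) {
  return {
    x: (hitPx.x - homePlatePx.x) * scaleFeetPerPixel,
    y: (homePlatePx.y - hitPx.y) * scaleFeetPerPixel
  };
}

// Home plate at (125, 210) as in Baseball Hacks; the 2.5 ft/px scale
// and the hit location are placeholders.
console.log(pixelsToFeet({ x: 165, y: 90 }, { x: 125, y: 210 }, 2.5));
// { x: 100, y: 300 }
```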
<h4>Stadium locations and orientations</h4>
<p>Using Google Earth, I recorded home plate latitude-longitude coordinates and eyeballed stadium orientations for all open MLB ballparks.  Two stadiums are domes, so they&#8217;re out.</p>
<p><img style="float:left; position:relative; left:-35px; margin-right: -25px" src="http://indiemaps.com/pitchalyzer/images/humphreyMetrodome.png" alt="" /><img src="http://indiemaps.com/pitchalyzer/images/tropicanaField.png" alt="" /></p>
<p>And three of the league&#8217;s six retractable-roofed stadiums were closed during their satellite shots.</p>
<p><img style="float:left; position:relative; left:-175px; margin-right: -170px" src="http://indiemaps.com/pitchalyzer/images/chaseField.png" alt="" /><img style="float:left; margin-right: 5px" src="http://indiemaps.com/pitchalyzer/images/rogersCentre.png" alt="" /><img src="http://indiemaps.com/pitchalyzer/images/minuteMaidPark.png" alt="" /></p>
<p>LandShark Stadium was being used as a football field, and Citi Field hadn&#8217;t been built yet in Google&#8217;s satellite imagery, so home plate locations and stadium orientations weren&#8217;t collected for these.</p>
<p><img style="float:left; position:relative; left:-35px; margin-right: -25px" src="http://indiemaps.com/pitchalyzer/images/landSharkStadium.png" alt="" /><img src="http://indiemaps.com/pitchalyzer/images/citiField.png" alt="" /></p>
<p>For the other 23 stadiums, home plate latitude-longitude and X-Y coordinates, distance multipliers, and orientations are available in my <a href="http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20stadiums&#038;format=xml&#038;env=http%3A%2F%2Findiemaps.com%2Fjs-hit-locations-map%2Fyql%2Fmlb.gd2.env">YQL stadiums table</a>.</p>
<h4>Feet to latitude/longitude</h4>
<p>Given the above, I can convert all pixel locations to feet with home plate as the origin. For stadiums oriented away from due north, these feet-based locations can be rotated about their home plate origins.  The final step &#8212; converting from feet to geographic coordinates &#8212; is fairly simple, though I initially tried a couple of more complicated methods.  First, I converted hit locations to meters, converted the home plate lat-long to UTM coordinates, added the hit coordinates to the latter, and converted back to lat-long.  This worked fine, as did the second method I tried: a <a href="http://www.movable-type.co.uk/scripts/latlong.html">formula</a> that calculates the hit&#8217;s geographic coordinate given the distance and bearing from home plate. </p>
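<p>The rotation and the distance-and-bearing method can be sketched as follows. This is a sketch under stated assumptions rather than the code used for the maps above: it treats a stadium&#8217;s orientation as the compass bearing from home plate to center field (clockwise from north), assumes a spherical Earth, and uses a placeholder home plate location. All function names are hypothetical.</p>

```javascript
const EARTH_RADIUS_M = 6371000; // mean Earth radius, spherical approximation
const METERS_PER_FOOT = 0.3048;

function toRadians(deg) { return deg * Math.PI / 180; }
function toDegrees(rad) { return rad * 180 / Math.PI; }

// Rotate field coordinates (x toward the right-field side, y toward center
// field, in feet) into east/north offsets. orientationDeg is the bearing from
// home plate to center field, clockwise from north (an assumed convention).
function rotateToEastNorth(fieldFt, orientationDeg) {
  const t = toRadians(orientationDeg);
  return {
    east: fieldFt.x * Math.cos(t) + fieldFt.y * Math.sin(t),
    north: -fieldFt.x * Math.sin(t) + fieldFt.y * Math.cos(t)
  };
}

// Destination point on a sphere, given a start point, a distance in meters,
// and an initial bearing in degrees.
function destinationPoint(latDeg, lngDeg, distanceMeters, bearingDeg) {
  const lat1 = toRadians(latDeg);
  const lng1 = toRadians(lngDeg);
  const brng = toRadians(bearingDeg);
  const d = distanceMeters / EARTH_RADIUS_M; // angular distance
  const lat2 = Math.asin(Math.sin(lat1) * Math.cos(d) +
    Math.cos(lat1) * Math.sin(d) * Math.cos(brng));
  const lng2 = lng1 + Math.atan2(
    Math.sin(brng) * Math.sin(d) * Math.cos(lat1),
    Math.cos(d) - Math.sin(lat1) * Math.sin(lat2));
  return { lat: toDegrees(lat2), lng: toDegrees(lng2) };
}

// Convert a hit in field feet to a lat-long via its distance and bearing
// from home plate.
function hitToLatLng(fieldFt, homePlate, orientationDeg) {
  const en = rotateToEastNorth(fieldFt, orientationDeg);
  const meters = Math.sqrt(en.east * en.east + en.north * en.north) * METERS_PER_FOOT;
  const bearingDeg = toDegrees(Math.atan2(en.east, en.north));
  return destinationPoint(homePlate.lat, homePlate.lng, meters, bearingDeg);
}

// Placeholder home plate and orientation: a ball hit 400 feet to straightaway
// center in a park whose center field lies due east of home plate.
console.log(hitToLatLng({ x: 0, y: 400 }, { lat: 40.0, lng: -75.0 }, 90));
```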
<p>Given the huge scale and low accuracy required, I can utilize a much simpler method.  Most mapping APIs include a distance method of some kind, taking as input two lat-long coordinates and returning the great circle (spherical, so perhaps off by as much as .3%) distance in meters or km. Here&#8217;s <a href="http://developer.yahoo.com/flash/maps/classreference/com/yahoo/maps/api/utils/Distance.html">Yahoo&#8217;s</a>, <a href="http://code.google.com/apis/maps/documentation/reference.html#GLatLng.distanceFrom">Google&#8217;s</a> and <a href="http://mapstraction.com/doc/LatLonPoint.html#distance">Mapstraction&#8217;s</a>.  With this method, I can determine an approximate conversion factor between meters and degrees latitude-longitude:</p>
<div class="codecolorer-container javascript"><div class="codecolorer" style="font-family: monospace;"><span class="kw2">var</span> latConv = homePlate_latLong.<span class="me1">distance</span><span class="br0">&#40;</span> <span class="kw2">new</span> LatLonPoint<span class="br0">&#40;</span> homePlate_latLong.<span class="me1">lat</span> + .<span class="nu0">001</span>, homePlate_latLong.<span class="me1">lng</span> <span class="br0">&#41;</span> <span class="br0">&#41;</span> * <span class="nu0">1000000</span>;<br />
<span class="kw2">var</span> lngConv = homePlate_latLong.<span class="me1">distance</span><span class="br0">&#40;</span> <span class="kw2">new</span> LatLonPoint<span class="br0">&#40;</span> homePlate_latLong.<span class="me1">lat</span>, homePlate_latLong.<span class="me1">lng</span> + .<span class="nu0">001</span> <span class="br0">&#41;</span> <span class="br0">&#41;</span> * <span class="nu0">1000000</span>;</div></div>
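<p>Applying the conversion factors is then a matter of dividing a hit&#8217;s north and east offsets (in meters) by the local meters-per-degree values and adding the results to home plate&#8217;s coordinates. The self-contained sketch below shows the same idea, with a haversine function standing in for a mapping API&#8217;s distance method; the function names and the home plate location are placeholders.</p>

```javascript
const EARTH_RADIUS_METERS = 6371000; // spherical approximation

function toRad(deg) { return deg * Math.PI / 180; }

// Great-circle (haversine) distance in meters, a stand-in for a mapping
// API's distance method.
function distanceMeters(lat1, lng1, lat2, lng2) {
  const dLat = toRad(lat2 - lat1);
  const dLng = toRad(lng2 - lng1);
  const a = Math.sin(dLat / 2) * Math.sin(dLat / 2) +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) *
    Math.sin(dLng / 2) * Math.sin(dLng / 2);
  return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
}

// Local meters-per-degree factors near home plate, measured over a
// 0.001 degree step in each direction.
function metersPerDegree(homePlate) {
  const step = 0.001;
  return {
    lat: distanceMeters(homePlate.lat, homePlate.lng, homePlate.lat + step, homePlate.lng) / step,
    lng: distanceMeters(homePlate.lat, homePlate.lng, homePlate.lat, homePlate.lng + step) / step
  };
}

// Offset home plate by a hit's east/north position in meters.
function offsetLatLng(homePlate, eastMeters, northMeters) {
  const conv = metersPerDegree(homePlate);
  return {
    lat: homePlate.lat + northMeters / conv.lat,
    lng: homePlate.lng + eastMeters / conv.lng
  };
}

// Placeholder home plate; a point 50 m east and 100 m north of it.
console.log(offsetLatLng({ lat: 40.0, lng: -75.0 }, 50, 100));
```

<p>Treating degrees as locally linear like this only holds over short distances, which is fine for a ballpark.</p>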
<h4>Accuracy</h4>
<p>A game was taking place when the satellite image of <a href="http://en.wikipedia.org/wiki/At%26t_park">AT&#038;T Park</a> was taken; you can see the fans, players, and &#8212; importantly &#8212; the bases.  Using the formulas above, I&#8217;ve placed white squares on the map where the bases should be, and this shows the typical accuracy achieved.</p>
<div class="centerIMG"><img src="http://indiemaps.com/pitchalyzer/images/gid_2009_04_11_phimlb_colmlb_1.png" alt="" /></div>
<p>I&#8217;ve also plotted known outfield fence coordinates with similar accuracy.</p>
<h3>Putting it all together</h3>
<div class="centerIMG"><a href="http://indiemaps.com/js-hit-locations-map/"><img src="http://indiemaps.com/pitchalyzer/images/js-hit-locations-map.png" alt="" /></a></div>
<p>I created a <a href="http://indiemaps.com/js-hit-locations-map/">demo application</a> using <a href="http://mapstraction.com/">Mapstraction</a>, <a href="http://jquery.com/">jQuery</a>, and <a href="http://developer.yahoo.com/yui/">YUI</a>. It&#8217;s very much a proof-of-concept, and the YQL queries can take quite a while to populate the map and table. Nonetheless, this is the only non-trivial web development I&#8217;ve done outside of Flash/Flex, and I&#8217;m quite impressed by these tools.</p>
<p>The app lets you visualize the hits in any individual game that took place in an open stadium in 2009.  It&#8217;ll show games from earlier years, but I can&#8217;t vouch for the accuracy of the plotted points due to changes in Gameday&#8217;s observational data collection. By default, the app shows this year&#8217;s all-star game at Busch Stadium.  To see another game, add a <span class="incode">gid</span> URL parameter.  Here, for example, are a few games from last night.</p>
<ul>
<li>the perfect game: <a href="http://indiemaps.com/js-hit-locations-map/index.html?gid=gid_2009_07_23_tbamlb_chamlb_1"> gid_2009_07_23_tbamlb_chamlb_1</a></li>
<li><a href="http://indiemaps.com/js-hit-locations-map/index.html?gid=gid_2009_07_23_sfnmlb_atlmlb_1">gid_2009_07_23_sfnmlb_atlmlb_1</a></li>
<li><a href="http://indiemaps.com/js-hit-locations-map/index.html?gid=gid_2009_07_23_seamlb_detmlb_1">gid_2009_07_23_seamlb_detmlb_1</a></li>
</ul>
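<p>Reading a <span class="incode">gid</span> parameter like the ones in the links above takes only a few lines. This is a minimal sketch, not the demo app&#8217;s actual code: the helper name, the fallback value, and the validation pattern (inferred from the three ids listed above) are all assumptions.</p>

```javascript
// Hypothetical helper: returns the gid from a URL, or a fallback when the
// parameter is missing or does not look like a Gameday id.
function gameIdFromUrl(href, fallbackGid) {
  const gid = new URL(href).searchParams.get("gid");
  // Pattern inferred from ids such as gid_2009_07_23_tbamlb_chamlb_1.
  const pattern = /^gid_\d{4}_\d{2}_\d{2}_[a-z0-9]{3}mlb_[a-z0-9]{3}mlb_\d+$/;
  return gid !== null ? (pattern.test(gid) ? gid : fallbackGid) : fallbackGid;
}

const gid = gameIdFromUrl(
  "http://indiemaps.com/js-hit-locations-map/index.html?gid=gid_2009_07_23_tbamlb_chamlb_1",
  "placeholder_default_gid" // placeholder, not the app's real default
);
```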
<p>The display is pretty basic: just pink dots for outs and blue ones for hits.  And if you want to do any of the crazy stuff with plotting hits at different ballparks or arbitrary locations, you&#8217;ll have to dig around a bit. I can imagine many aggregation, filtering, and visualization options for these data. None are explored here. But all the code and data&#8217;s <a href="http://github.com/indiemaps/js-hit-locations-map/tree/master">up on github</a> if you want to have a go at it. </p>
<p class="update">
update Aug 08 2009: My <a href="http://indiemaps.com/js-hit-locations-map/">proof-of-concept app</a> was failing for a few days.  Should be back now.  My MLB Gameday YQL tables weren&#8217;t working as loaded from github, so I copied em over to indiemaps. Project&#8217;s still hosted <a href="http://github.com/indiemaps/js-hit-locations-map/tree/master">on github</a>. Any query links below utilizing the tables have been changed.</p>
]]></content:encoded>
			<wfw:commentRss>http://indiemaps.com/blog/2009/07/visualizing-mlb-hit-locations-on-a-google-map/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Lens tools and fisheye map browsing</title>
		<link>http://indiemaps.com/blog/2009/04/lens-tools-and-fisheye-map-browsing/</link>
		<comments>http://indiemaps.com/blog/2009/04/lens-tools-and-fisheye-map-browsing/#comments</comments>
		<pubDate>Mon, 06 Apr 2009 23:30:23 +0000</pubDate>
		<dc:creator>zach'ry</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[fisheye]]></category>

		<category><![CDATA[flash]]></category>

		<category><![CDATA[Flex]]></category>

		<category><![CDATA[lens tool]]></category>

		<category><![CDATA[map browsing]]></category>

		<category><![CDATA[mapping]]></category>

		<category><![CDATA[panning]]></category>

		<category><![CDATA[semantic zoom]]></category>

		<category><![CDATA[ui]]></category>

		<category><![CDATA[visualization]]></category>

		<category><![CDATA[zooming]]></category>

		<guid isPermaLink="false">http://indiemaps.com/blog/?p=89</guid>
		<description><![CDATA[L.A.&#8217;s Cartifact recently released Cartifact Maps, a Flash-based tilemaps viewer with custom cartography and advanced map browsing tools.  The historic overlays and beautiful cartographic design are perhaps of most interest, but I&#8217;m equally impressed by their implementation of a novel map browsing UI featuring a magnifying glass or &#8220;lens tool&#8221;.

I first saw this map [...]]]></description>
			<content:encoded><![CDATA[<p>L.A.&#8217;s <a href="http://cartifact.com/">Cartifact</a> recently released <a href="http://maps.cartifact.com/">Cartifact Maps</a>, a Flash-based tilemaps viewer with custom cartography and advanced map browsing tools.  The historic overlays and beautiful cartographic design are perhaps of most interest, but I&#8217;m equally impressed by their implementation of a novel map browsing UI featuring a magnifying glass or &#8220;lens tool&#8221;.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/lensTool/cartifactLensTool.png" alt="" /></div>
<p>I first saw this map browsing technique in a <a href="http://maps.grammata.com/mapviewer/ChinaMissingGirls.html">minimal browser Matt Bloch created</a> for an older static project.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/lensTool/blochLensTool.png" alt="" /></div>
<p>I implemented a lens tool in the final project version of the <a href="http://freedom.indiemaps.com/">World Freedom Atlas</a>.  I also experimented in some of the early prototypes with a continuous fisheye effect for map browsing. The latter never really took off because of the distortion and pixelation inherent in the raster method.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/lensTool/wfaLensTool.png" alt="" /></div>
<div class="centerIMG"><img src="http://indiemaps.com/images/lensTool/wfaFisheye.png" alt="" /></div>
<p>And the same idea in a Google Maps <a href="http://mundanemaps.googlepages.com/mu004_memories.html">mashup</a>.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/lensTool/googleMapsLensTool.png" alt="" /></div>
<p>The originator of the fisheye/magnification method for multi-scale mapping is probably Edgar Kant, in a 1957 map he produced for a migration study of Asby, Sweden, by Torsten Hägerstrand. Here the &#8220;distance from the centre shrinks proportionally to the logarithm of the real distance.&#8221;</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/lensTool/hagerstrandLogarithmic.png" alt="" /></div>
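<p>The logarithmic idea quoted above can be sketched as a radial transform: each point keeps its direction from the centre, but its distance is replaced by a logarithm of the real distance. This is only an illustration, not a reconstruction of the 1957 map; in particular, using <span class="incode">ln(1 + r)</span> rather than a bare logarithm (so the centre stays put) and the <span class="incode">scale</span> and <span class="incode">unit</span> parameters are my assumptions.</p>

```javascript
// Map `point` so that its distance from `center` becomes proportional to
// ln(1 + r / unit), where r is the real distance. Direction is unchanged.
function logRadial(point, center, scale = 1, unit = 1) {
  const dx = point.x - center.x;
  const dy = point.y - center.y;
  const r = Math.hypot(dx, dy);
  if (r === 0) return { x: center.x, y: center.y };
  const rLog = scale * Math.log1p(r / unit); // 0 at the center, grows slowly
  return { x: center.x + (dx / r) * rLog, y: center.y + (dy / r) * rLog };
}

const origin = { x: 0, y: 0 };
const near = logRadial({ x: 10, y: 0 }, origin);  // lands at about x = 2.4
const far = logRadial({ x: 100, y: 0 }, origin);  // lands at about x = 4.6
```

<p>A tenfold increase in real distance adds only a couple of units on the page, which is what lets nearby detail and distant context share one map.</p>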
<p>Much work followed on multi-scale map projections, with the touchstone article being Snyder&#8217;s 1987 &#8220;&#8216;Magnifying-glass&#8217; azimuthal map projections&#8221;.  Good coverage of such projections (including parallels to cartogram distortion) can be found in Canters&#8217; <a href="http://books.google.com/books?id=8cR7yG5ohHoC&#038;pg=PA158&#038;lpg=PA158&#038;dq=%22Magnifying-glass”+azimuthal+map+projections&#038;source=bl&#038;ots=G5H3VjyQGH&#038;sig=8dtAXDFAthNWy2CyV8bf_jmmvs8&#038;hl=en&#038;ei=5z_aSe7-NaT0tQOU2uCmCg&#038;sa=X&#038;oi=book_result&#038;ct=result&#038;resnum=10#PPA157,M1"><em>Small-scale Map Projection Design</em></a>.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/lensTool/snyderMagnifyingGlass.png" alt="" /></div>
<p>In non-mapping UIs, the magnification/fisheye effect is fairly common; the Mac dock does it and even <a href="http://en.wikipedia.org/wiki/Cover_Flow">Cover Flow</a> can be considered somewhat of a variant.  For browsing and selection from a &#8220;large linear list&#8221;, Ben Bederson at <a href="http://www.cs.umd.edu/hcil/">HCIL</a> came up with <a href="http://www.cs.umd.edu/hcil/fisheyemenu/">fisheye menus</a>.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/lensTool/fisheyeMenu.png" alt="" /></div>
<p>So nothing new in general UI terms, but still pretty novel and perhaps especially applicable to online map browsing.</p>
<h3>Map browsing</h3>
<p><a href="http://www.axismaps.com/">Axis Maps</a> cartographer Andy Woodruff did a great <a href="http://www.cartogrammar.com/blog/map-panning-and-zooming-methods/">post</a> on a variety of map panning and zooming methods.  The lens or magnifying glass tool performs both panning and zooming functions, and should be considered as an alternative to the nine methods outlined there.</p>
<p>In interactive applications, the approach&#8217;s major strengths are threefold:</p>
<ul>
<li>low mouse mileage for panning and making selections</li>
<li>less disorientation or &#8220;getting lost&#8221; as general cues are always available</li>
<li>the ability to see generals and specifics simultaneously</li>
</ul>
<p>The last is particularly important in cartographic applications where the success of a good thematic map is often seen as its ability to present overall trends and specific values in the same map.</p>
<h3>Semantic zoom lens tool</h3>
<p>The Cartifact example above is particularly interesting because of the <a href="http://www.infovis-wiki.net/index.php/Semantic_Zoom">semantic zoom</a> inherent in its lens tool.  In normal, geometric zooming, a map (or other image) is simply blown up; more detail is shown by definition, because more pixels are dedicated to the image. In semantic zooming, different (typically more detailed) larger-scale renderings are shown at higher zoom levels; not only are features larger, but more details (and labels) are shown. Such semantic zooming is standard in slippy maps, which are produced and tiled at predefined zoom levels.  Nonetheless, the application to a lens tool is noteworthy, especially in thematic cartography; generals and specifics can be presented simultaneously, and both can be tailored semantically to different zoom levels.</p>
<p>Here&#8217;s a quick example I threw together in Flex using <a href="http://modestmaps.com/">Modest Maps</a> (right-click to view source).</p>

<object	type="application/x-shockwave-flash"
			data="http://indiemaps.com/flash/lensTool/ModestMapsLensTool.swf"
			base="http://indiemaps.com/flash/lensTool/"
			width="800"
			height="500">
	<param name="movie" value="http://indiemaps.com/flash/lensTool/ModestMapsLensTool.swf" />
	<param name="base" value="http://indiemaps.com/flash/lensTool/" />
</object>
<p>I like inverting the above, or perhaps more interestingly, showing Microsoft Aerial as the base and Microsoft Hybrid as the lens; the spotlight (zoomed in or otherwise) then serves to provide political/cultural details for the moused-over region.</p>
<h3>Application to thematic cartography</h3>
<p>In online thematic cartography, the practice of showing smaller enumeration units at higher zoom levels is somewhat common.  The <em>NY Times</em> has done it a few times, including their <a href="http://elections.nytimes.com/2008/results/president/map.html">Election 2008 results maps</a> (the county-level choropleth is revealed by zooming in).</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/lensTool/nyTimesCounty.png" alt="" /></div>
<p>The same idea can be applied to a lens tool. Here the lens reveals the county vote results, and can be zoomed (again, a quick Flex job) to further investigate the local-level results.  Click to launch the map (it&#8217;ll take a few secs to load, project, and draw the data).</p>
<div class="centerIMG"><a href="http://indiemaps.com/flash/lensTool/ThematicMapLensTool.html"><img src="http://indiemaps.com/images/lensTool/thematicLensTool.png" alt="" /></a></div>
<p>The above is based on some projections and choropleth code I <a href="http://indiemaps.com/blog/2008/12/noncontiguous-area-cartograms/">released</a> last year. I think there&#8217;s more room for experimentation here: the size of the lens could be user-modifiable and semi-transparent (so you can still see where you&#8217;re mousing over the main map); I&#8217;d also like to create a less-pixellated fisheye lens and try out multiple lenses/fisheyes (for detailed comparisons of multiple areas while still providing context).</p>
]]></content:encoded>
			<wfw:commentRss>http://indiemaps.com/blog/2009/04/lens-tools-and-fisheye-map-browsing/feed/</wfw:commentRss>
		</item>
		<item>
		<title>E00Parser, an ActionScript 3 parser for the Arc/Info Export topological GIS format</title>
		<link>http://indiemaps.com/blog/2009/02/e00parser-an-actionscript-3-parser-for-the-arcinfo-export-topological-gis-format/</link>
		<comments>http://indiemaps.com/blog/2009/02/e00parser-an-actionscript-3-parser-for-the-arcinfo-export-topological-gis-format/#comments</comments>
		<pubDate>Sun, 22 Feb 2009 01:49:18 +0000</pubDate>
		<dc:creator>zach'ry</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[actionscript 3]]></category>

		<category><![CDATA[cartograms]]></category>

		<category><![CDATA[circular cartograms]]></category>

		<category><![CDATA[code]]></category>

		<category><![CDATA[Daniel Dorling]]></category>

		<category><![CDATA[flash]]></category>

		<category><![CDATA[Flex]]></category>

		<category><![CDATA[geography]]></category>

		<category><![CDATA[GIS]]></category>

		<category><![CDATA[graphs]]></category>

		<category><![CDATA[library]]></category>

		<category><![CDATA[topology]]></category>

		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://indiemaps.com/blog/?p=83</guid>
		<description><![CDATA[First off, why mess with such a retro format as Arc/Info Export (.e00)?&#8211;  any code written for this ASCII file type in the last few years has been on how to go from e00 to pretty much anything (especially to the non-topological data format, the shapefile).  
Put simply, topological information makes a lot [...]]]></description>
			<content:encoded><![CDATA[<p>First off, why mess with such a retro format as <a href="http://avce00.maptools.org/docs/v7_e00_cover.html">Arc/Info Export</a> (.e00)?&#8211;  any code written for this ASCII file type in the last few years has been on how to go <em>from</em> e00 <em>to</em> pretty much anything (especially to the non-topological data format, <a href="http://en.wikipedia.org/wiki/Shapefile">the shapefile</a>).  </p>
<p>Put simply, topological information makes a lot of things possible for the intrepid ActionScripter.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/e00/mapToGraph.png" alt="" /></div>
<p>E00 files non-redundantly store all nodes, lines, and polygons that make up a geographic data layer.  This geodata format is one of three currently <a href="http://www.census.gov/geo/www/cob/st2000.html">distributed by the Census Bureau</a> for boundary files (the others are the shapefile and the Ungenerate ASCII format).  The GIS formats used in most web mapping applications (I&#8217;m thinking of shapefiles, GeoJSON, and KML) are non-topological, meaning features are stored independently, and topological information on shared borders and the like is quite difficult to extract.  Like seriously hard.  Something you don&#8217;t want to be doing in the browser.  <a href="http://maps.grammata.com/">Matthew Bloch</a>, of the <em>NY Times</em>, did his cartography master&#8217;s thesis (at <a href="http://www.geography.wisc.edu/">Wisconsin</a>, natch) on <a href="http://www.mapshaper.org/">MapShaper</a>, much of which involved a C++ server-side solution for building topology from a polygonal shapefile.  Generalization requires non-redundant polylines so as not to create gaps between features when smoothing.  Other visualization techniques, including cartogram construction and graph decomposition, also require knowing the shared borders of geographic features.</p>
<p>Ideally, such topological information could be created/extracted for any geography, regardless of the datasource.  In reality, topology building is intensive and best suited to server-side processing.  Using E00 files and my E00Parser lets you experiment with the visualization and cartographic techniques only possible when such topological information is known, without the expensive processing necessary to build it. </p>
<h3>The code</h3>
<p>I&#8217;ve gotten a ton of use out of Edwin van Rijkom&#8217;s <a href="http://shp.riaforge.org/"><span class="incode">SHP</span> library</a>.  My noncontiguous cartogram, isolining, and political choropleth experiments relied on the code to load coordinate data in shapefile form at run-time, as did the <a href="http://indiemaps.com/blog/2008/01/shapefiles-projections-in-flash-as3/">early experiments</a> that led to <a href="http://indiemapper.com">indiemapper</a>.  I&#8217;m hoping I&#8217;ll get just as much use out of this parser, for when adjacency information is critical to the visualization technique.</p>
<p>There are two main classes, <span class="incode">E00Parser</span> and <span class="incode">E00Tools</span>.  <span class="incode">E00Parser</span> is based on the Perl extension <a href="http://search.cpan.org/~zummo/Geo-E00-0.05/lib/Geo/E00.pm"><span class="incode">Geo::E00</span></a> by Alessandro Zummo and Bert Tijhuis, with much aid from the (world famous) <a href="http://avce00.maptools.org/docs/v7_e00_cover.html">Arc/Info Export Format Analysis</a>.  There&#8217;s no way I would have attempted to write the AS3 E00 parser without Zummo and Tijhuis&#8217; code, as theirs appears to be the only stand-alone open source code available for reading the format.  Their Perl regular expressions were copied with few modifications, though I did fix an issue in some that was keeping their code from accurately reading certain sections of double-precision coverages.  I wrote <span class="incode">E00Tools</span> to collect a handful of methods for working with the resultant data.</p>
<p>I set up a <a href="http://code.google.com/p/e00parser/">Google Code project</a> for this work, as topology will likely form the basis for a decent amount of my cartographic experimentation in the near future.</p>
<ul>
<li>to browse the code, just go <a href="http://code.google.com/p/e00parser/source/browse/#svn/trunk/e00/src/com/indiemaps/mapping/data/parsers/e00">here</a></li>
<li>the latest zip distribution is available <a href="http://code.google.com/p/e00parser/downloads/list">here</a></li>
<li>three examples are included in <a href="http://code.google.com/p/e00parser/source/browse/#svn/trunk/e00/src/com/indiemaps/mapping/data/parsers/e00/examples"><span class="incode">com.indiemaps.mapping.data.parsers.e00.examples</span></a></li>
</ul>
<div class="centerIMG"><img src="http://indiemaps.com/images/e00/e00-mouseover.png" alt="" /></div>
<p>Oh, BTW:</p>
<blockquote><p>ESRI considers the export/import file format to be proprietary. As a consequence, the identified format can only constitute a &#8220;best guess&#8221; and must always be considered as tentative and subject to revision, as more is learned.</p></blockquote>
<p>(from the <em>Arc/Info Export Format Analysis</em>)</p>
<h3>How to use</h3>
<p>After loading your ASCII E00 file into a string, use something like the following to parse it.</p>
<div class="codecolorer-container actionscript" style="height:20px;"><div class="codecolorer" style="font-family: monospace;"><span class="kw2">var</span> <span class="kw3">data</span> : <span class="kw3">Object</span> = E00Parser.<span class="me1">parse</span><span class="br0">&#40;</span> e00Text <span class="br0">&#41;</span>;</div></div>
<p>The returned data object includes all information contained in the file, and can have as many as nine sections.  Of most use are the <span class="incode">arc</span> (non-redundant list of polylines), <span class="incode">pal</span> (list of all polygons and their associated lines), and <span class="incode">ifo</span> (attributes and labels) sections.  The exact structure of the returned object is described on the wiki <a href="http://code.google.com/p/e00parser/wiki/Structure">here</a>.</p>
<p>There are three <em>sweet</em> examples to be found in <a href="http://code.google.com/p/e00parser/source/browse/#svn/trunk/e00/src/com/indiemaps/mapping/data/parsers/e00/examples"><span class="incode">com.indiemaps.mapping.data.parsers.e00.examples</span></a>.</p>
<h3>Tools</h3>
<p><span class="incode">E00Tools</span> contains some methods for working with the resultant data of <span class="incode">E00Parser.parse()</span>.  Included are methods for:</p>
<ul>
<li>Drawing all <em>features</em></li>
<li>Drawing individual <em>features</em></li>
<li>Getting a list of polygon IDs for all <em>features</em></li>
<li>Getting the centroid of a <em>feature</em></li>
<li>Getting the shared border length of all <em>features</em> and their neighbors</li>
</ul>
<p>Key above is the idea of the <em>feature</em>.  <em>Michigan</em> is a <em>feature</em>.  <em>Features</em> are not directly encoded in E00 files like they are in other formats.  In a polygonal shapefile, for example, each feature is encoded as a <em>multipolygon</em>, constituted of one or more rings of points.  In E00 files, only polygons are directly encoded; feature information (which polygons make up which features) can be ascertained from the INFO (<span class="incode">ifo</span>) section.</p>
<h3>Experimentation</h3>
<div class="centerIMG"><img src="http://indiemaps.com/images/e00/experimentation.png" alt="" /></div>
<p>I created these AS3 classes for myself, because I wanted to experiment with topological geodata in visualization and cartographic applications.  This typically boils down to knowing which features are neighbors and how much of a border they share. The <span class="incode">E00Tools</span> methods <span class="incode">getAllFeatureNeighbors</span> and <span class="incode">getAllFeatureSharedBorderLengths</span> give you easy access to this information.  </p>
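<p>To show why non-redundant arcs make this cheap, here is a toy sketch of the shared-border computation. It is not the <span class="incode">E00Tools</span> implementation and does not use the parser&#8217;s real data structure (that is documented on the wiki); the arc objects, with a polygon id on each side and a list of coordinates, are a hypothetical simplification.</p>

```javascript
// Length of a polyline given as [[x, y], ...].
function polylineLength(coords) {
  // coords.slice(1)[i] is coords[i + 1], so coords[i] is the previous point.
  return coords.slice(1).reduce(
    (total, pt, i) => total + Math.hypot(pt[0] - coords[i][0], pt[1] - coords[i][1]),
    0
  );
}

// Sum arc lengths per unordered pair of polygons. Because each arc is stored
// once with the polygon on either side, no geometric matching is needed.
function sharedBorderLengths(arcs) {
  const lengths = {};
  arcs.forEach((arc) => {
    if (arc.left === null || arc.right === null) return; // outer boundary
    const key = [arc.left, arc.right].sort().join("|");
    lengths[key] = (lengths[key] || 0) + polylineLength(arc.coords);
  });
  return lengths;
}

// Hypothetical toy data: polygon ids are letters, null means "outside".
const toyArcs = [
  { left: "A", right: "B", coords: [[0, 0], [0, 3], [4, 3]] }, // 3 + 4 = 7
  { left: "B", right: "C", coords: [[4, 3], [4, 0]] },         // 3
  { left: "A", right: null, coords: [[0, 0], [-5, 0]] },       // ignored
  { left: "B", right: "A", coords: [[4, 3], [7, 7]] },         // 5
];
const borders = sharedBorderLengths(toyArcs);
// borders["A|B"] is 12 and borders["B|C"] is 3
```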
<p>Daniel Dorling popularized the circular cartogram form among academic cartographers, outlining the symbology most notably in his <a href="http://www.sasi.group.shef.ac.uk/thesis/index.html">1991 PhD thesis</a> and 1996&#8217;s <em>Area Cartograms: Their Use and Creation</em> (available <a href="http://www.qmrg.org.uk/?page_id=141">here</a> in PDF form along with many other gems of quantitative geography).  Dr. Dorling made Pascal and C code available.  I <a href="http://indiemaps.com/blog/2008/01/dorlingpy/">ported it to Python</a>, and began experimenting, <a href="http://indiemaps.com/blog/2008/01/mine-is-more-like-poorling/">mostly in vain</a>, on a method that worked with a shapefile as input, but without the expense of building topology.  It produced at best a pale imitation.  Dorling describes the gravity model used to produce the cartograms in his dissertation:</p>
<blockquote><p>The algorithm which was developed to create the area cartograms worked by repeatedly applying a series of forces to the circles representing the places. Circles attract those they are topologically adjacent to; the strength of this attraction being greater the larger the distance is between them and the longer their common boundary. </p></blockquote>
<p>The algorithm thus requires the shared border lengths of all features and their neighbors.  Producing this info is easy with <span class="incode">E00Tools</span>, but it seems kind of backward to parse my geodata in ActionScript only to produce the rendering in Python.  I&#8217;m working on porting Dorling&#8217;s algorithm to AS3 so I can go directly from geodata to cartogram without switching platforms.</p>
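<p>As a loose illustration of the attraction described in the quote, the sketch below pulls each pair of neighboring circles together by an amount that grows with the gap between them and with the share of each circle&#8217;s boundary that the pair has in common. It is not Dorling&#8217;s published code: the repulsion between overlapping circles is left out entirely, and the <span class="incode">rate</span> parameter and data shapes are assumptions.</p>

```javascript
// circles: { id: { x, y, r } }, borders: [{ a, b, length }]
// Returns new circle positions after one attraction-only step.
function attractionStep(circles, borders, rate = 0.1) {
  // Total shared border per circle, used to weight each neighbor.
  const perimeter = {};
  borders.forEach(({ a, b, length }) => {
    perimeter[a] = (perimeter[a] || 0) + length;
    perimeter[b] = (perimeter[b] || 0) + length;
  });

  const moves = {};
  Object.keys(circles).forEach((id) => {
    moves[id] = { dx: 0, dy: 0 };
  });

  borders.forEach(({ a, b, length }) => {
    const ca = circles[a];
    const cb = circles[b];
    const dx = cb.x - ca.x;
    const dy = cb.y - ca.y;
    const dist = Math.hypot(dx, dy);
    const gap = dist - (ca.r + cb.r);
    if (!(gap > 0)) return; // touching or overlapping: nothing to pull
    // Pull grows with the gap and with the common boundary's share.
    const pullA = rate * gap * (length / perimeter[a]);
    const pullB = rate * gap * (length / perimeter[b]);
    moves[a].dx += (dx / dist) * pullA;
    moves[a].dy += (dy / dist) * pullA;
    moves[b].dx -= (dx / dist) * pullB;
    moves[b].dy -= (dy / dist) * pullB;
  });

  const next = {};
  Object.keys(circles).forEach((id) => {
    const c = circles[id];
    next[id] = { x: c.x + moves[id].dx, y: c.y + moves[id].dy, r: c.r };
  });
  return next;
}
```

<p>Calling this repeatedly, alongside a repulsion step that separates overlapping circles, is the general shape of a force-based layout of this kind.</p>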
<p>Lee Byron <a href="http://leebyron.com/how/2008/08/09/olympic-medals-cartogram/">mentions another technique</a>, used to generate the <a href="http://www.nytimes.com/interactive/2008/08/04/sports/olympics/20080804_MEDALCOUNT_MAP.html">Olympic Medal Count cartograms</a> he helped produce for the <em>Times</em>.  Byron didn&#8217;t release any code, but notes that a soft body force directed layout algorithm written in ActionScript was used.  I haven&#8217;t been able to reproduce his method, but I&#8217;ve included an example that drops the topological information gathered from an E00 file into a <a href="http://flare.prefuse.org/">Flare</a> visualization using a <a href="http://flare.prefuse.org/api/flare/vis/operator/layout/ForceDirectedLayout.html">force directed layout</a>.  The example is minimal, but shows how the E00 classes can be integrated with the Flare visualization API, and may point the way to a slightly different method for producing circular cartograms client-side.</p>
]]></content:encoded>
			<wfw:commentRss>http://indiemaps.com/blog/2009/02/e00parser-an-actionscript-3-parser-for-the-arcinfo-export-topological-gis-format/feed/</wfw:commentRss>
		</item>
		<item>
		<title>political cartography: voting with our pocketbooks</title>
		<link>http://indiemaps.com/blog/2009/01/political-cartography-voting-with-our-pocketbooks/</link>
		<comments>http://indiemaps.com/blog/2009/01/political-cartography-voting-with-our-pocketbooks/#comments</comments>
		<pubDate>Mon, 19 Jan 2009 10:46:11 +0000</pubDate>
		<dc:creator>zach'ry</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[2008]]></category>

		<category><![CDATA[ActionScript]]></category>

		<category><![CDATA[bivariate]]></category>

		<category><![CDATA[brightness]]></category>

		<category><![CDATA[cartogram]]></category>

		<category><![CDATA[cartograms]]></category>

		<category><![CDATA[color]]></category>

		<category><![CDATA[election]]></category>

		<category><![CDATA[flash]]></category>

		<category><![CDATA[mapping]]></category>

		<category><![CDATA[McCain]]></category>

		<category><![CDATA[Obama]]></category>

		<category><![CDATA[open source]]></category>

		<category><![CDATA[politics]]></category>

		<category><![CDATA[python]]></category>

		<category><![CDATA[thematic mapping]]></category>

		<guid isPermaLink="false">http://indiemaps.com/blog/?p=80</guid>
		<description><![CDATA[These election maps are kinda late.  Here I&#8217;m interested in comparing how we, as a country, voted with our ballots versus how we voted with our dollars.  Obama received about 70% of the money donated to the major candidates in 2008, but only 53% of the votes, so I expected a bluer map. [...]]]></description>
			<content:encoded><![CDATA[<p>These election maps are kinda late.  Here I&#8217;m interested in comparing how we, as a country, voted with our ballots versus how we voted with our dollars.  Obama received about 70% of the money donated to the major candidates in 2008, but only 53% of the votes, so I expected a bluer map.  But I wasn&#8217;t sure what the spatial distribution of the difference would be.</p>
<p>As a first blush, the state level is alright (sorry Alaska, Hawaii).  Here I&#8217;m showing the proportion of the dollars donated to the major candidates that went to Barack Obama.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/politicalCartography/state-donations.png" alt="donations to the major candidates in the 2008 presidential election" /></div>
<p>Compare that blue and purple beauty, with only Mississippi to be embarrassed of, with this &#8212; the results of the popular vote.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/politicalCartography/state-votes.png" alt="votes to the major candidates in the 2008 presidential election" /></div>
<p>Some of those states were obviously more consequential to the candidates&#8217; finances.  Here&#8217;s an interactive cartogram, sized by either votes or total dollars.  Both cartograms use the 10th <em>densest</em> state as the &#8220;anchor unit&#8221; (in both cases, New Jersey), so comparisons between the two are meaningful.  I talk more about that in <a href="http://indiemaps.com/blog/2008/12/noncontiguous-area-cartograms/">my post on noncontiguous cartograms</a>.</p>

<object	type="application/x-shockwave-flash"
			data="http://indiemaps.com/flash/politicalCartography/VoteDonationComparisonExample.swf"
			base="http://indiemaps.com/flash/politicalCartography/"
			width="850"
			height="500">
	<param name="movie" value="http://indiemaps.com/flash/politicalCartography/VoteDonationComparisonExample.swf" />
	<param name="base" value="http://indiemaps.com/flash/politicalCartography/" />
</object>
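<p>The anchor-unit idea can be sketched in a few lines. In a noncontiguous cartogram each unit is shrunk or grown about its own center so that its drawn area is proportional to its value; the anchor unit keeps its original size, and every other unit gets a linear scale factor equal to the square root of its density relative to the anchor&#8217;s. The code below is an illustrative sketch with made-up numbers, not the code behind the cartogram above.</p>

```javascript
// units: [{ name, value, area }]. The anchor is the unit ranked `anchorRank`
// by density (value / area); it is drawn at its original size.
function cartogramScales(units, anchorRank = 10) {
  const density = (u) => u.value / u.area;
  const byDensity = [...units].sort((a, b) => density(b) - density(a));
  const anchor = byDensity[Math.min(anchorRank, byDensity.length) - 1];
  const scales = {};
  units.forEach((u) => {
    // Areas scale with the square of the linear factor, hence the sqrt.
    scales[u.name] = Math.sqrt(density(u) / density(anchor));
  });
  return { anchor: anchor.name, scales };
}

// Made-up toy units, with the 2nd densest as the anchor.
const result = cartogramScales(
  [
    { name: "A", value: 100, area: 1 },
    { name: "B", value: 50, area: 2 },
    { name: "C", value: 4, area: 4 },
  ],
  2
);
// result.anchor is "B"; A is drawn at twice its linear size, C at one fifth
```

<p>Units denser than the anchor grow past their original outlines, which is why picking a fairly dense anchor keeps overlap to a minimum.</p>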
<p>The state view is too coarse.  The obvious choice is the county level, but such aggregated data is not available from the FEC, nor from the <a href="http://open.blogs.nytimes.com/2008/10/14/announcing-the-new-york-times-campaign-finance-api/"><em>NY Times</em> Campaign Finance API</a>, where I retrieved all the finance data for this post. The data are available as individual records, or as summaries requestable by state or ZIP code.<strong>*</strong>  </p>
<p>So I wrote some Python scripts to retrieve and process all 32,800 ZIP codes available from the <em>Times</em> API.  There are more ZIP codes out there, but perhaps they had no donations in 2008.  This had to be spread over a few days, because the <em>Times</em> limits requests to 5000 per day per API key.</p>
<p>Thanks to the shapefiles available from <a href="http://www.census.gov/geo/www/cob/z52000.html#shp">the Census here</a> I was able to map the proportion of donations to Obama from those 32,800 ZIP codes. But too many ZIP codes lacked donations, leading to an unsightly choropleth characterized by radical change and data-less regions.  Best to aggregate to larger units (but smaller than the states above). The <em>NY Times</em> made some nice <a href="http://elections.nytimes.com/2008/president/campaign-finance/map.html">interactive campaign finance maps</a>, candidate-by-candidate, and aggregated to sub-state regions (ex. &#8220;Southern Wisconsin&#8221;, &#8220;Eastern Shore and northern Maryland&#8221;). I&#8217;ve settled on a finer unit, the <a href="http://www.census.gov/geo/www/cob/z32000.html">three-digit ZIP Code Tabulation Areas</a> (ZCTAs) of the Census Bureau (an aggregation of ZIP codes based on their first three digits).  These first three digits correspond to the <a href="http://en.wikipedia.org/wiki/Sectional_center_facility_(SCF)">sectional center facility</a> of the USPS that serves the area.  Though that sounds rather arbitrary, the Census Bureau has aggregated to such units in some of their data since 2000.  The following shows donations originating in the 877 three-digit ZIP code regions of the U.S.A., using the same color scheme as the maps above.</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/politicalCartography/zipcode-donations.png" alt="donations to the major candidates in the 2008 presidential election" /></div>
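<p>The aggregation from five-digit ZIP codes to three-digit regions is just a group-by on the first three digits. The sketch below assumes a hypothetical record shape (<span class="incode">zip</span>, <span class="incode">obama</span>, <span class="incode">mccain</span>); it is not the format returned by the <em>Times</em> API, and the dollar amounts are invented.</p>

```javascript
// Sum donations by three-digit ZIP prefix and compute the share of
// major-candidate dollars that went to Obama.
function aggregateByZip3(records) {
  const totals = {};
  records.forEach(({ zip, obama, mccain }) => {
    // Pad first so that numeric ZIPs such as 2140 become "02140".
    const zip3 = String(zip).padStart(5, "0").slice(0, 3);
    const t = totals[zip3] || (totals[zip3] = { obama: 0, mccain: 0 });
    t.obama += obama;
    t.mccain += mccain;
  });
  Object.keys(totals).forEach((zip3) => {
    const t = totals[zip3];
    const sum = t.obama + t.mccain;
    t.obamaShare = sum > 0 ? t.obama / sum : null; // null: no data
  });
  return totals;
}

// Invented sample records.
const regions = aggregateByZip3([
  { zip: "53703", obama: 300, mccain: 100 },
  { zip: "53711", obama: 100, mccain: 100 },
  { zip: "02139", obama: 50, mccain: 0 },
  { zip: 2140, obama: 25, mccain: 25 },
]);
// regions["537"].obamaShare is 2/3 and regions["021"].obamaShare is 0.75
```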
<p>As above, compared to the outcome of the popular vote (but by county):</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/politicalCartography/county-votes.png" alt="votes to the major candidates in the 2008 presidential election" /></div>
<p>I&#8217;ll spare you the ZIP code regions noncontiguous cartogram.  Cartograms rely on the recognizability of features on the distorted image, and 3-digit ZIP code regions lack familiarity save when they happen to line up with county boundaries.  A better technique in such cases is described by <a href="http://www.cartogrammar.com/blog/the-election-at-night/">Andy Woodruff</a> of <a href="http://www.axismaps.com/">Axis Maps</a>:</p>
<blockquote><p>It’s a standard red-blue map indicating the winner of each county in the lower 48 states, where the transparency indicates the population of a county. The many counties with low population fade into the background, diminishing their visual prominence. This is meant to accomplish something similar to a cartogram, where sizes are distorted to show the actual distribution of votes.</p></blockquote>
<p>Their <a href="http://www.axismaps.com/blog/2008/11/a-new-kind-of-election-map/">election maps</a> adapt the technique of encoding uncertainty information in transparency initially <a href="http://www.geovista.psu.edu/publications/MacEachren/maceachren_uncertainty_CP1992.pdf">suggested by Alan MacEachren in 1992</a> and refined by Igor Drecki in 1999.</p>
<p>Andy tells me they grouped counties into 16 opacity classes using the natural breaks (Jenks optimization) method.  I do the same here for my ZIP code regions.  This method minimizes the sum of squared deviations from class means, so each class is as internally homogeneous as possible.  Sixteen classes ensure the appearance of a smooth gradient of transparencies.  I used <a href="http://www.r-project.org/">R</a> and the add-on package <a href="http://cran.r-project.org/web/packages/classInt/index.html">classInt</a> to create the classification.  Here then: finance compared to votes, with both <em>opacitized</em> by <em>consequentiality</em> (total dollars donated in one case, total votes cast in the other).</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/politicalCartography/transparency-black.png" alt="" class='alignnone' /></div>
<p>And here the same over a white background (thus switching the visual variable representing consequentiality to saturation).</p>
<div class="centerIMG"><img src="http://indiemaps.com/images/politicalCartography/transparency-white.png" alt="" /></div>
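<p>For the curious, the opacity classing used for the two maps above can be sketched in a few lines of Python.  This is only an illustration, not the R/classInt code that produced the maps: the function names <code>jenks_breaks</code> and <code>opacity_for</code> are made up, the squared-deviation (Fisher-Jenks) variant is assumed, and the even spacing of opacities across classes is also an assumption.</p>

```python
from bisect import bisect_left


def jenks_breaks(values, k):
    """Upper bounds of k classes that minimize within-class squared deviations.

    Hypothetical helper: a plain dynamic-programming version of the
    Fisher-Jenks idea. Needs len(values) >= k.
    """
    data = sorted(values)
    n = len(data)
    # Prefix sums let us get the squared deviation of any slice in O(1).
    s1, s2 = [0.0], [0.0]
    for v in data:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def ssd(i, j):
        """Sum of squared deviations from the mean for data[i:j]."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[c][j]: best total deviation when data[:j] is split into c classes.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    split = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cand = cost[c - 1][i] + ssd(i, j)
                if cost[c][j] > cand:
                    cost[c][j] = cand
                    split[c][j] = i
    # Walk back through the split table to recover each class's upper bound.
    bounds = []
    j = n
    for c in range(k, 0, -1):
        bounds.append(data[j - 1])
        j = split[c][j]
    return bounds[::-1]


def opacity_for(value, bounds):
    """Map a value to an opacity in (0, 1]; higher classes are more opaque."""
    idx = min(bisect_left(bounds, value), len(bounds) - 1)
    return (idx + 1) / len(bounds)
```

<p>With 16 classes, <code>opacity_for</code> would hand back multiples of 1/16, which could be fed straight to a fill alpha.  The triple loop is fine for a few hundred regions but would want a smarter implementation for anything much larger.</p>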
<p>I&#8217;ve said very little about what these maps actually show.  I&#8217;ll let the maps do the talking on that, though please do contact me if you&#8217;d like the data used in these maps for your own experiments.</p>
<p>One thing I&#8217;ve neglected to mention thus far: all of the above graphics were produced with ActionScript 3, using just a text editor and the latest <a href="http://opensource.adobe.com/wiki/display/flexsdk/Download+Flex+4">free Flex SDK</a>.  I used <a href="http://www.python.org/">Python</a> to retrieve and process the campaign finance data, <a href="http://www.openoffice.org/">OpenOffice</a> to paste the processed data into the DBF files of the shapefiles retrieved from the <a href="http://www.census.gov/geo/www/cob/bdy_files.html">Census Bureau</a>, and <a href="http://www.r-project.org/">R</a> to classify the data.  It&#8217;s pretty sweet that such visualizations can be created using only free tools and data.</p>
<p class="update">
update: As I toiled on my ZIP code detour, it turns out <a href="http://finder.geocommons.com/">GeoCommons Finder</a> was <a href="http://maker.geocommons.com/searches?mh_query=FEC+individual+county">accumulating the data</a> I craved.  As described there: &#8220;The monthly individual donor data was downloaded from FEC (Federal Election Commission), geocoded and then aggregated to county level for the lower 48 states.&#8221; The data provided there by county will still require some processing and doesn&#8217;t cover the full range of the data presented by ZIP code region above, but the common county aggregation makes further comparisons with voting data possible, and I&#8217;ll show some bivariate maps utilizing this new data in the near future.</p>
]]></content:encoded>
			<wfw:commentRss>http://indiemaps.com/blog/2009/01/political-cartography-voting-with-our-pocketbooks/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Early cartograms</title>
		<link>http://indiemaps.com/blog/2008/12/early-cartograms/</link>
		<comments>http://indiemaps.com/blog/2008/12/early-cartograms/#comments</comments>
		<pubDate>Mon, 08 Dec 2008 08:40:26 +0000</pubDate>
		<dc:creator>zach'ry</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[bubble charts]]></category>

		<category><![CDATA[cartograms]]></category>

		<category><![CDATA[history of cartography]]></category>

		<category><![CDATA[map projections]]></category>

		<category><![CDATA[mapping]]></category>

		<category><![CDATA[symbology]]></category>

		<category><![CDATA[timeline]]></category>

		<category><![CDATA[visualization]]></category>

		<category><![CDATA[Waldo Tobler]]></category>

		<guid isPermaLink="false">http://indiemaps.com/blog/?p=78</guid>
		<description><![CDATA[I&#8217;m kind of on a cartogram kick lately.  I&#8217;m interested in the pioneers of the form, those who first thought to distort borders and explode topologies in order to convey the distribution of some thematic variable.  When was the first cartogram produced, where, and by whom?  I ran into a lot of [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m kind of on a cartogram kick lately.  I&#8217;m interested in the pioneers of the form, those who first thought to distort borders and explode topologies in order to convey the distribution of some thematic variable.  When was the first cartogram produced, where, and by whom?  I ran into a lot of material while researching my thesis; this post only begins the discussion.</p>
<h3>1868</h3>
<p>The honor typically goes to <a href="http://en.wikipedia.org/wiki/Pierre_Émile_Levasseur">Émile Levasseur</a> for the diagrammatic maps contained in his 1868 and 1875 economic geography textbooks.  </p>
<div class="centerIMG"><img src='/images/levasseur.png' alt='early diagrammatic map by Levasseur' class='alignnone' /></div>
<p>H. Gray Funkhouser (1937) wrote of these &#8220;colored bar graphs&#8221;,</p>
<blockquote><p>
<em>squares proportional to the extent of surfaces, population, budget, commerce, merchant marine of the countries of Europe</em>, the squares being grouped about each other in such a manner as to correspond to their geographical position
</p></blockquote>
<p>Interestingly, Waldo Tobler (2004) points out that the example printed by Funkhouser (above) was sized by land area and thus not a true value-by-area cartogram. I don&#8217;t have access to Levasseur&#8217;s texts, and it&#8217;s odd that the only available scan of Levasseur&#8217;s first cartogram shows a diagrammatic map, not a true cartogram.</p>
<h3>1897</h3>
<p>On the other hand, there is the image below, whose units are definitely sized to the data, but whose geographic arrangement is questionable.  I first saw this page from an 1897 Rand McNally <em>Atlas of the World</em> in a <a href="http://spacecollective.org/rodneyw/2738/The-Debt-of-Your-Country">SpaceCollective post</a>; a <a href="http://www.davidrumsey.com/luna/servlet/detail/RUMSEY~8~1~20703~550091:The-world-s-gold-and-silver-money,-">high-res version</a> is available from the David Rumsey Map Collection.</p>
<div class="centerIMG"><img src='/images/atlas-1897.png' alt='a bubble chart, perhaps a circular cartogram, from an 1897 atlas' class='alignnone' /></div>
<p>Circles on the left are sized proportional to population, those on the right to debt.  Though the arrangement seems haphazard, geography is not ignored as the circles are grouped together by continent.  I don&#8217;t really buy these as cartograms, but they&#8217;re certainly a predecessor to the circular cartogram form popularized by <a href="http://www.shef.ac.uk/geography/staff/dorling_danny/">Danny Dorling</a> nearly 100 years later.</p>
<h3>1903</h3>
<p>Others refer to the election maps of Hermann Haack and Hans Wiechel as the first cartograms (presumably because the popular example of Levasseur&#8217;s earlier technique was scaled by land area). These cartograms, which show election results in the Reichstag, were sized to population and of the same rectangular form as the earlier Levasseur diagrams.</p>
<p>Though these maps are mentioned in the second volume of Eckert&#8217;s touchstone history of cartography, <em>Die Kartenwissenschaft</em> (1925), they&#8217;re not reprinted there, and I&#8217;ve no access to the original sources.</p>
<h3>1911</h3>
<p>Professor John Krygier <a href="http://makingmaps.wordpress.com/2008/02/19/1911-cartogram-apportionment-map/">dug this one up</a> — perhaps the first American cartogram, and an interesting example of the form.</p>
<div class="centerIMG"><img src='/images/bailey-1911.png' alt='perhaps the first American cartogram' class='alignnone' /></div>
<p>This is the earliest non-rectangular cartogram I&#8217;ve seen of any provenance, and it is unique in maintaining the exact outer shape of the United States, while abandoning unit shape and position.</p>
<h3>1929</h3>
<div class="centerIMG"><img src='/images/grundy-1929.png' alt="Grundy&#8217;s early American cartogram" class='alignnone' /></div>
<p>Another early American example, the above map by Joseph Grundy was published by the <em>Washington Post</em> in 1929.  States are scaled &#8220;on the basis of population and Federal taxes&#8221;.  Though it is somewhat rough, I consider this the first modern cartogram, as it maintains topology without abstracting to rectangles.</p>
<h3>1934</h3>
<p><a href="http://en.wikipedia.org/wiki/Erwin_Raisz">Erwin Raisz</a> was the first to give cartograms academic attention, describing their production in &#8220;The rectangular statistical cartogram&#8221; (1934) and devoting significant coverage to the form in his popular cartography textbooks.</p>
<div class="centerIMG"><img src='/images/raisz-1934.png' alt='rectangular statistical cartogram by Erwin Raisz, 1934' class='alignnone' /></div>
<h3>1961</h3>
<p>A later example, but still a first. In 1961, <a href="http://www.geog.ucsb.edu/~tobler/">Waldo Tobler</a> arrived at the University of Michigan as an assistant professor and began working on the first computer programs for cartogram production. His &#8220;pseudo cartograms&#8221; were created by expanding or compressing the lat/long grid until the minimum root mean squared error of unit densities resulted. </p>
<div class="centerIMG"><img src='/images/tobler-1961.png' alt='early computer cartogram by Waldo Tobler' class='alignnone' /></div>
<p>Errors were typically quite high, though likely no higher than those on the manually-produced ones that came before. All subsequent cartogram algorithms can be considered variants of Tobler&#8217;s method.</p>
]]></content:encoded>
			<wfw:commentRss>http://indiemaps.com/blog/2008/12/early-cartograms/feed/</wfw:commentRss>
		</item>
		<item>
		<title>noncontiguous area cartograms</title>
		<link>http://indiemaps.com/blog/2008/12/noncontiguous-area-cartograms/</link>
		<comments>http://indiemaps.com/blog/2008/12/noncontiguous-area-cartograms/#comments</comments>
		<pubDate>Thu, 04 Dec 2008 07:13:26 +0000</pubDate>
		<dc:creator>zach'ry</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[actionscript 3]]></category>

		<category><![CDATA[cartograms]]></category>

		<category><![CDATA[code]]></category>

		<category><![CDATA[flash]]></category>

		<category><![CDATA[flash cartograms]]></category>

		<category><![CDATA[Flex]]></category>

		<category><![CDATA[Judy Olson]]></category>

		<category><![CDATA[map projections]]></category>

		<category><![CDATA[noncontiguous]]></category>

		<category><![CDATA[portfolio]]></category>

		<category><![CDATA[symbology]]></category>

		<guid isPermaLink="false">http://indiemaps.com/blog/?p=75</guid>
		<description><![CDATA[notes on noncontiguous cartograms and ActionScript 3 classes for producing them
Fully contiguous cartograms have stretched and distorted borders but perfectly maintained topologies. Like the Gastner-Newman diffusion-based cartograms we see all over the place.  Though all sorts of cartogram designs have been produced, those with perfect topology preservation (fully contiguous cartograms) receive the majority of [...]]]></description>
			<content:encoded><![CDATA[<p><em>notes on noncontiguous cartograms and ActionScript 3 classes for producing them</em></p>
<p>Fully contiguous cartograms, like the <a href="http://www.worldmapper.org/">Gastner-Newman diffusion-based cartograms</a> we see all over the place, have stretched and distorted borders but perfectly maintained topologies.  Though all sorts of cartogram designs have been produced, those with perfect topology preservation (fully contiguous cartograms) receive the majority of academic and popular press attention.</p>
<p>Some notable exceptions are the well done animated ones by <a href="http://show.mappingworlds.com/">Mapping Worlds</a> and a <a href="http://www.nytimes.com/interactive/2008/11/02/opinion/20081102_OPCHART.html">recent <em>NY Times</em> example</a> showing electors per voter that I&#8217;ll return to later.  These fully noncontiguous cartograms preserve the shapes of enumeration units perfectly, but don&#8217;t even attempt to preserve any borders or adjacencies from the original map.</p>
<div class="centerIMG"><img src='/images/noncontiguous/nonconExample.png' alt='noncontiguous cartogram close-up' class='alignnone' /></div>
<p>Judy Olson (Wisconsin Geography alum natch) wrote the only <a href="http://www3.interscience.wiley.com/journal/119642159/abstract">academic article</a> to focus specifically on this cartogram symbology in 1976.  She believed noncontiguous cartograms held three potential advantages over contiguous cartograms (I&#8217;ve three more below):</p>
<ol>
<li>&#8220;the empty areas, or gaps, between observation units are meaningful representations of discrepancies of values, these discrepancies generally being a major reason for constructing a cartogram&#8221;</li>
<li>production of noncontiguous cartograms involves &#8220;only the discrete units for which information is available and only the lines which can be accurately relocated on the original map appear on the noncontiguous cartogram&#8221;</li>
<li>because of perfect shape preservation, &#8220;recognition of the units represented is relatively uncomplicated for the reader&#8221;</li>
</ol>
<p>Despite these inherent advantages (along with ease of production), all the early value-by-area cartograms I&#8217;ve seen maintain contiguity.  Some took the radical step of abstracting features to geometric primitives, like Levasseur&#8217;s early French examples (which may not have been cartograms) and Erwin Raisz&#8217;s early American &#8220;rectangular statistical cartograms&#8221;.  But in many ways the noncontiguous design is the more radical cartogram, as it actually breaks the basemap apart — rather than skewing shared borders it abandons them.</p>
<h4>my AS3 classes</h4>
<p>Olson outlines a technique — the projector method — for manually producing such cartograms.  A projector capable of precise numeric reduction/enlargement was required, but not much else, and accurate cartograms could be produced in minutes.  A scaling factor was calculated for each enumeration unit, the projector was set to this value, and the projected borders were traced, keeping units centered on their original centers.</p>
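<p>The arithmetic behind the projector method is simple enough to sketch in Python.  This is a rough illustration under stated assumptions, not Olson&#8217;s procedure verbatim and not my AS3 code: the function names are invented, each unit is a single simple ring of planar (already projected) x/y pairs, and the anchor density is handed in rather than chosen.  The one firm fact is that areas grow with the square of a linear scale, so the linear factor is the square root of the density ratio.</p>

```python
import math


def ring_area(ring):
    """Planar area of a ring of (x, y) pairs via the shoelace formula."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def ring_centroid(ring):
    """Area-weighted centroid of a simple (non-self-intersecting) ring."""
    cx = cy = signed = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        signed += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * signed), cy / (3.0 * signed)


def scale_ring(ring, factor, center=None):
    """Scale a ring about a center (its own centroid unless one is given)."""
    cx, cy = center if center is not None else ring_centroid(ring)
    return [(cx + (x - cx) * factor, cy + (y - cy) * factor) for x, y in ring]


def cartogram_ring(ring, value, anchor_density):
    """Resize one unit so its drawn area is proportional to its value.

    density = value / area; the unit whose density equals anchor_density
    keeps its size, denser units grow, sparser units shrink.
    """
    density = value / ring_area(ring)
    factor = math.sqrt(density / anchor_density)
    return scale_ring(ring, factor)
```

<p>After the resize, every unit&#8217;s area equals its value divided by the anchor density, so relative areas carry the data while each unit stays centered where it started.</p>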
<p>My AS3 NoncontiguousCartogram class works similarly.  It takes an array of objects containing geometry and attribute properties and creates a noncontiguous cartogram.  I include methods for creating the input array from a shapefile/dbf combo, but using KML, WKT, or GeoJSON representations wouldn&#8217;t be too hard.  Methods are included for projecting this lat/long linework (to Lambert&#8217;s Conformal Conic projection at least).  The NoncontiguousCartogram class draws the input geography, figures the area of each feature, and scales features according to their density in the chosen thematic variable.</p>
<p>It&#8217;s all good/in ActionScript 3, so can be used in Flash or Flex. The <a href="http://indiemaps.com/flash/noncontiguous/noncontiguous.zip">zip distribution</a> includes the following:</p>
<ul>
<li>the main NoncontiguousCartogram.as class</li>
<li>two example applications and the data needed to run them</li>
<li>utility classes, including some that make creating cartograms from shp/dbf input quite easy</li>
<li>Edwin Van Rijkom&#8217;s <a href="http://code.google.com/p/vanrijkom-flashlibs/">SHP and DBF libraries</a>, which are used to load the shapefiles in both of the included examples</li>
<li>Keith Peters&#8217; <a href="http://www.bit-101.com/blog/?p=1126">MinimalComps</a> AS3 component library, for the components used in one of the examples</li>
<li>Grant Skinner&#8217;s <a href="http://www.gskinner.com/libraries/gtween/">gTween</a> class, which is required by the NoncontiguousCartogram class for tween transitions</li>
</ul>
<p><a href="http://indiemaps.com/flash/noncontiguous/srcview/">Browse all the above</a> or <a href="http://indiemaps.com/flash/noncontiguous/noncontiguous.zip">download the zip</a>.</p>
<h4>Flash examples</h4>
<p>I created a few examples, all started from just shapefile input.  The following are screenshots, but there are interactive examples further down.  First, an example out of Olson&#8217;s article, presumably produced using the &#8220;projector method&#8221; described above:</p>
<div class="centerIMG"><img src='/images/noncontiguous/olsonExample-reduced.png' alt='noncontiguous cartogram from Olson&#039;s article' class='alignnone' /></div>
<p>Here I&#8217;ve updated Olson&#8217;s example with more recent data and a snazzy red outline:</p>
<div class="centerIMG"><img src='/images/noncontiguous/oldPopulation.png' alt='' class='alignnone' /></div>
<p>This one took a few secs to load the shapefile, project to Winkel Tripel, draw, and scale the cartogram by population:</p>
<div class="centerIMG"><img src='/images/noncontiguous/worldPopulation.png' alt='' class='alignnone' /></div>
<div class="centerIMG"><img src='/images/noncontiguous/usCountyPopulation.png' alt='' class='alignnone' /></div>
<p>The above is great as an artistic rendering, but is probably a bad depiction of the data as 10% of the counties have increased in size and there is a fair bit of overlap.  Decreasing the alpha of the counties reveals the overlap, and creates a really interesting rendering of the data.</p>
<div class="centerIMG"><img src='/images/noncontiguous/usCountyPopulation-take2.png' alt='' class='alignnone' /></div>
<p>If you&#8217;ve read this far, you probably at least want to see some shit move around or something. Here&#8217;s an example I put together with election data; it shows off the basic updating and tweening capabilities built into the class.  Notice how all features are drawn around their centroid and therefore remain centered as they change scale.  Tweening is via Grant Skinner&#8217;s <a href="http://www.gskinner.com/libraries/gtween/">gTween</a> and interface components are from Keith Peters&#8217; <a href="http://www.bit-101.com/blog/?p=1126">MinimalComps</a> library, which is great for prototyping in only AS3.</p>

<object	type="application/x-shockwave-flash"
			data="http://indiemaps.com/flash/noncontiguous/ElectionResultsCartogramExample.swf"
			base="http://indiemaps.com/flash/noncontiguous/"
			width="700"
			height="500">
	<param name="movie" value="http://indiemaps.com/flash/noncontiguous/ElectionResultsCartogramExample.swf" />
	<param name="base" value="http://indiemaps.com/flash/noncontiguous/" />
</object>
<p>In the above, the &#8220;electors per voter&#8221; option is based on the same data as the well-made <a href="http://www.nytimes.com/interactive/2008/11/02/opinion/20081102_OPCHART.html"><em>NY Times</em> example</a> I mentioned earlier, but something&#8217;s fishy about one of our scalings, because they <a href="http://indiemaps.com/images/noncontiguous/dontLineUp.png">don&#8217;t line up</a>.</p>
<h4>design considerations</h4>
<p>Position and size.  Both affect the one thing you&#8217;re trying to avoid on a noncontiguous cartogram: feature overlap.  If features are left at their original centroids, overlap is likely if any units are enlarged.  When sizing features on a noncontiguous cartogram, one feature is chosen as the &#8220;anchor unit&#8221;.  This feature will stay the same size, and all others will be scaled up or down depending on whether their density is higher or lower than the anchor.  If the unit with the highest density is chosen, overlap will be avoided because all other units will be reduced.  This can often produce large areas of white space and units too small to recognize.</p>
<p>To address this, both the anchor unit and the feature centers are configurable.  The &#8220;anchor percentile&#8221; is set in the class constructor, and can range from zero to one.  If set to one, all but one feature will be reduced; if set to zero, all but one feature will be enlarged.  </p>
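<p>Here is a hedged Python sketch of how an anchor percentile could translate into per-feature scale factors.  The function name <code>scale_factors</code> and the nearest-rank way of picking the anchor are assumptions for illustration; the AS3 class may pick its anchor differently.  Densities are assumed to be positive.</p>

```python
import math


def scale_factors(densities, anchor_percentile=1.0):
    """Linear scale factor for each feature, given an anchor percentile.

    anchor_percentile runs from 0 (lowest density is the anchor) to
    1 (highest density is the anchor). Nearest-rank selection is an
    assumption of this sketch.
    """
    ranked = sorted(densities)
    idx = round(anchor_percentile * (len(ranked) - 1))
    anchor = ranked[idx]
    # Areas scale with the square of a linear factor, hence the square root.
    return [math.sqrt(d / anchor) for d in densities]


# Three features with densities 1, 4 and 16.
shrink_only = scale_factors([1, 4, 16], 1.0)   # nothing grows, so no new overlap
grow_only = scale_factors([1, 4, 16], 0.0)     # nothing shrinks
middle = scale_factors([1, 4, 16], 0.5)        # median feature keeps its size
```

<p>Setting the percentile somewhere between the extremes trades a little overlap risk for units that stay large enough to recognize.</p>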
<p>By default, features scale around their original centroids.  You can pass a Dictionary of feature centers (using the objects contained in the combined array as keys) to the constructor (or via the settable property, featureCenters), which allows you to manually position the features.  I include the following example to give you some ideas for how to use this feature.  I create an interactive cartogram on which features can be moved around.  Well Known Text representations of the feature centers are generated and converted into the required Dictionary via some included utility methods.  <a href="http://indiemaps.com/flash/noncontiguous/srcview/source/ManualCartogramTest.as.html">Check out the source</a> of this example to see what&#8217;s going on.</p>

<object	type="application/x-shockwave-flash"
			data="http://indiemaps.com/flash/noncontiguous/ManualCartogramTest.swf"
			base="http://indiemaps.com/flash/noncontiguous/"
			width="800"
			height="450">
	<param name="movie" value="http://indiemaps.com/flash/noncontiguous/ManualCartogramTest.swf" />
	<param name="base" value="http://indiemaps.com/flash/noncontiguous/" />
</object>
<h4>projections</h4>
<p>In her 1976 article, Judy Olson used linework projected with an equal area projection.  This was a requirement of her method, as she was using known land area measurements in her calculation of densities and scaling factors.  My class calculates areas of features — projected or otherwise — before scaling, so any projection can be utilized and an accurately-scaled cartogram will result.  In the examples above, I use a conformal projection (included in the source) on the points before passing to the cartogram class.  Conformal projections preserve local angles (shape, sort of) and may result in features that are easier to recognize.</p>
<p>That said, for most applications it&#8217;s probably a good idea to use equivalent (equal area) projections before constructing these cartograms, as the divergence of the thematic variable from land area is often the point of creating the cartogram in the first place.  </p>
<h4>more advantages</h4>
<p>In my thesis research last spring, noncontiguous cartograms performed quite well: subjects rated them highly on aesthetics and could locate and estimate the areas of features with relatively high accuracy.  I would add the following to Olson&#8217;s list of noncontiguous cartogram advantages.</p>
<ol>
<li>Olson concentrates on the perfect shape preservation of noncontiguous cartograms.  The form (well, those with units centered on the original enumeration unit centroids, as in Olson&#8217;s projector method) also perfectly preserves the <em>location</em> of the features on the resultant transformed cartogram. Not only are features easier to recognize, but locations within the transformed units can be accurately located as well (cities or mountain ranges from the original geography can be accurately plotted on the transformed cartogram).</li>
<li>Because units are separate on the transformed cartogram, their figure-ground is increased and areas of features can therefore be more accurately estimated.</li>
<li>Many cartogram designs (including most manual cartograms and the Gastner-Newman-produced cartograms) sacrifice some <em>accuracy</em> for shape recognition.  This is a defensible tradeoff, especially as area estimation is notoriously inaccurate and nonlinear.  Yet it&#8217;s a tradeoff that noncontiguous cartograms need not make, as they can always perfectly represent the data with relative areas without sacrificing shape preservation.</li>
</ol>
<p>Thus, noncontiguous cartograms seem to excel at the cartogram&#8217;s two main map-reading tasks: shape recognition and area estimation.  This is tempered somewhat by the chief advantage of contiguous cartograms: compactness.  Because no space is created between enumeration units, contiguous cartogram enumeration units can be larger than those on noncontiguous cartograms, all other things equal.  The increased size on contiguous cartograms may improve their legibility.</p>
]]></content:encoded>
			<wfw:commentRss>http://indiemaps.com/blog/2008/12/noncontiguous-area-cartograms/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Nightingale&#8217;s roses in ActionScript 3</title>
		<link>http://indiemaps.com/blog/2008/10/nightingales-roses-in-actionscript-3/</link>
		<comments>http://indiemaps.com/blog/2008/10/nightingales-roses-in-actionscript-3/#comments</comments>
		<pubDate>Tue, 21 Oct 2008 08:30:00 +0000</pubDate>
		<dc:creator>zach'ry</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[actionscript 3]]></category>

		<category><![CDATA[charting]]></category>

		<category><![CDATA[coxcomb]]></category>

		<category><![CDATA[flash]]></category>

		<category><![CDATA[Flex]]></category>

		<category><![CDATA[nightingale]]></category>

		<category><![CDATA[pie chart]]></category>

		<category><![CDATA[rose]]></category>

		<category><![CDATA[symbology]]></category>

		<category><![CDATA[temporal]]></category>

		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://indiemaps.com/blog/?p=70</guid>
		<description><![CDATA[I&#8217;ve long been a sucker for the polar area/coxcomb/rose charts popularized by Florence Nightingale.  These multivariate charts can show ordered or unordered categorical data.  As noted in an Economist piece on influential information graphics,
As with today&#8217;s pie charts, the area of each wedge is proportional to the figure it stands for, but it [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve long been a sucker for the polar area/coxcomb/rose charts popularized by Florence Nightingale.  These multivariate charts can show ordered or unordered categorical data.  As noted in <a href="http://www.economist.com/world/europe/displaystory.cfm?story_id=10278643">an <em>Economist</em> piece</a> on influential information graphics,</p>
<blockquote><p>As with today&#8217;s pie charts, the area of each wedge is proportional to the figure it stands for, but it is the radius of each slice (the distance from the common centre to the outer edge) rather than the angle that is altered to achieve this.</p></blockquote>
<div class="centerIMG"><img src="/images/coxcomb/nightingaleCoxcombs.png" alt="" /></div>
<p>I wanted to produce some just for kicks, so looked around for a script in AS3.  No dice.  OK, any language?  Didn&#8217;t see anything.  So I sat on the idea for a while and then finally thought up the technique that made producing them in AS3 quite easy.  With the resultant classes, producing graphics like the following small multiples of U.S. soldier deaths in Iraq is a snap.  The classes are written in AS3, so can be used with Flash, Flex, or mxmlc.  All the example screenshots below are PNGs captured from SWFs produced with only AS3 (extended Sprites).  To see the code (which includes a lot of ugly annotation), click &#8216;<a href="http://indiemaps.com/flash/coxcomb/srcview/">view source</a>&#8217; below any image.  All source code is included in the ZIP distribution linked below.</p>
<h3>U.S. Soldier Deaths in Iraq, March 2003 to October 2008</h3>
<div class="centerIMG"><img src="/images/coxcomb/usSoldierDeathsIraq.png" alt="" width="867" height="185" /></div>
<p class="caption"><a href="http://indiemaps.com/flash/coxcomb/srcview/source/IraqDeathCountCoxcombs.as.html">view source</a></p>
<p>The above chart series utilizes the coxcomb in the same manner as Nightingale&#8217;s original — as a month-by-month temporal chart.  The following is also a small multiples presentation, but utilizes categorical coxcombs to show car color popularity by percent manufactured.</p>
<h3>Cars manufactured by color, 2003 to 2006</h3>
<div class="centerIMG"><img class="alignnone" src="/images/coxcomb/carColors.png" alt="" width="737" height="380" /></div>
<p class="caption"><a href="http://indiemaps.com/flash/coxcomb/srcview/source/CarColors.as.html">view source</a></p>
<p>And as the ultimate test for the classes, I decided to try my hand at reproducing <a href="http://indiemaps.com/images/coxcomb/Nightingale-mortality.jpg">Nightingale&#8217;s original graphic</a>.  I relied on <a href="http://www.florence-nightingale-avenging-angel.co.uk/">Hugh Small&#8217;s</a> beautiful reproduction of Nightingale&#8217;s graphic.  I quickly gave up on reproducing the fonts, and indeed a few of Nightingale&#8217;s embellishments/idiosyncrasies are not reproduced in my graphic.  But one of these nights I&#8217;ll extend the class to more faithfully reproduce Nightingale&#8217;s original.  Nevertheless, not bad eh?</p>
<h3> </h3>
<div class="centerIMG"><img src="/images/coxcomb/myNightingale.png" alt="" width="1005" height="610" /></div>
<p class="caption"><a href="http://indiemaps.com/flash/coxcomb/srcview/source/NightingaleCoxcomb.as.html">view source</a></p>
<p>I&#8217;m open to the criticism that these coxcombs (and other area-based symbologies) are rarely preferable to a simple bar chart (a length-based symbology).  But I still believe they ought to be a part of the information designer&#8217;s toolkit, especially as they hold definite advantages over their more popular cousin, the pie chart.  For animated or small multiples applications in particular, the coxcomb is preferable to the pie chart because of the constant position/angle of slices.  It&#8217;s easy to glance across a series of small multiples coxcombs, or watch an animating coxcomb, and follow individual slices.  I&#8217;ve built basic interactivity and tweening (using Grant Skinner&#8217;s lightweight <a href="http://www.gskinner.com/blog/archives/2008/08/gtween_a_new_tw.html">GTween Engine</a>) into the class.  See <a href="http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=00007o">this discussion</a> over at Edward Tufte&#8217;s site for some of the advantages and disadvantages of coxcomb charts.</p>
<p>Further, though bar charts may be easier to read, they may not work in all contexts, including interactive mapping, in which <em>compactness</em> may outweigh other concerns.</p>
<p>Here&#8217;s an example of some of the basic tweening and interactive capabilities of the class.</p>

<object	type="application/x-shockwave-flash"
			data="http://indiemaps.com/flash/coxcomb/CoxcombTest.swf"
			width="550"
			height="550">
	<param name="movie" value="http://indiemaps.com/flash/coxcomb/CoxcombTest.swf" />
</object>
<p class="caption"><a href="http://indiemaps.com/flash/coxcomb/srcview/">view source</a></p>
<p>When I initially thought of coding these charts, I balked at the idea of constructing slices whose relative areas would faithfully represent the data.  But I eventually realized that I could just use masked circles.  The circles could be accurately scaled to the data, and since each segment would mask the same percentage of the circle, the resultant pie slices would faithfully represent the data.  The technique (demonstrated in the diagram below) has the added advantage of allowing tweening of slices in animated displays.</p>
<div class="centerIMG"><img src="/images/coxcomb/coxcombAS3Diagram.png" alt="" /></div>
<p class="caption">the main technique behind CoxcombChart.as — the masking and rotation of CoxcombSlices</p>
<p>A bit easier said than done, since the strokes on each slice need to be drawn.  But that boils down to geometry.  Each slice need only be rotated into place to form the final rose chart.</p>
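<p>The geometry above boils down to one fact: a wedge with sweep angle &theta; and radius r has area &theta;r&sup2;/2, so with equal sweeps the radius must grow with the square root of the value.  A minimal Python sketch of that bookkeeping follows.  It is not CoxcombChart.as; the function name <code>coxcomb_slices</code> and the dictionary keys are invented for illustration, and scaling the largest value to a chosen maximum radius is an assumption.</p>

```python
import math


def coxcomb_slices(values, max_radius):
    """Radius and rotation for each slice of an equal-angle rose chart.

    Every slice gets the same sweep angle; the largest value is drawn at
    max_radius and the rest are scaled so wedge AREA tracks the value.
    """
    sweep = 2 * math.pi / len(values)
    peak = max(values)
    slices = []
    for i, v in enumerate(values):
        radius = max_radius * math.sqrt(v / peak)
        slices.append({
            "rotation": i * sweep,             # where the wedge starts, in radians
            "sweep": sweep,                    # identical for every slice
            "radius": radius,                  # radius of the circle being masked
            "area": 0.5 * sweep * radius ** 2  # resulting wedge area
        })
    return slices


# Three slices in a chart 350 units across, so a maximum radius of 175.
demo = coxcomb_slices([15, 20, 25], 175)
```

<p>Because every slice masks the same fraction of its circle, sizing the circles correctly is all it takes for the wedges to be right, and tweening a slice is just tweening one radius.</p>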
<p>I&#8217;ll create a few more examples in the future — I&#8217;m particularly interested in the cartographic applications of this multivariate chart.  I&#8217;ll likely post a <a href="http://modestmaps.com">Modest Maps</a> layer showing coxcomb charts of average precipitation-by-month for major cities, or something, in the near future.  For now, here&#8217;s all the code used to produce the graphics above.  It should be pretty easy to produce a coxcomb chart of any data using this code, and my interface is flexible enough to modify the appearance and interactivity of the charts for most applications.  But feel free to extend the class and override methods to alter these properties.</p>
<ul>
<li>A <a href="http://indiemaps.com/flash/coxcomb/coxcomb.zip">ZIP of the full distribution</a>.  This also includes Grant Skinner&#8217;s lightweight <a href="http://www.gskinner.com/blog/archives/2008/08/gtween_a_new_tw.html">GTween Engine</a> (a single .as class) which is required for the class.  Andy Woodruff&#8217;s <a href="http://www.cartogrammar.com/blog/drawing-dashed-lines-with-actionscript-3/">DashedLine</a> class is also included, as it is used in the Nightingale example.</li>
<li>For you browsers, view all the source <a href="http://indiemaps.com/flash/coxcomb/srcview/">here</a></li>
</ul>
<p>Creating the coxcombs should be pretty easy.  Here&#8217;s an example instantiation of a non-interactive coxcomb, 350 pixels wide/tall, with 3 slices, each a different color.</p>
<div class="codecolorer-container actionscript"><div class="codecolorer" style="font-family: monospace;"><span class="kw2">var</span> myCoxcomb : CoxcombChart = <span class="kw2">new</span> CoxcombChart<span class="br0">&#40;</span> <br />
&nbsp; &nbsp; <span class="br0">&#91;</span> <br />
&nbsp; &nbsp; &nbsp; <span class="br0">&#123;</span> label : <span class="st0">'red'</span>, value : <span class="nu0">15</span> <span class="br0">&#125;</span>,<br />
&nbsp; &nbsp; &nbsp; <span class="br0">&#123;</span> label : <span class="st0">'white'</span>, value : <span class="nu0">20</span> <span class="br0">&#125;</span>,<br />
&nbsp; &nbsp; &nbsp; <span class="br0">&#123;</span> label : <span class="st0">'blue'</span>, value : <span class="nu0">25</span> <span class="br0">&#125;</span> <br />
&nbsp;&nbsp; &nbsp;<span class="br0">&#93;</span>,<br />
&nbsp;&nbsp; &nbsp;<span class="nu0">350</span>, <span class="br0">&#91;</span> 0xff0000, 0xffffff, 0x0000ff <span class="br0">&#93;</span>, 0x888888<br />
<span class="br0">&#41;</span>;</div></div>
<div class="centerIMG"><img src="/images/coxcomb/simpleCoxcomb.png" alt="" /></div>
]]></content:encoded>
			<wfw:commentRss>http://indiemaps.com/blog/2008/10/nightingales-roses-in-actionscript-3/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
