easy shapefile loading in python

My recent work on creating a more flexible circular cartogram algorithm meant that I needed (well, wanted) to load shapefiles into a Python application. After many searches and scouring of message boards, I settled on the OGR/GDAL libraries. And 3 hours later I had it installed on my MacBook…

I’m sure that OGR (vector) and GDAL (raster; pronounced, apparently, like ‘goodle’) are great for their intended purpose: providing a jack-of-all-trades library for doing pretty much anything with geospatial data. But for users who need to do only simple tasks, they’re overkill. And slow. Loading a small shapefile took a few seconds, and accessing points to, say, compute centroids took a few more. This was too much. I needed a simple and quick module to load shapefiles and populate a list of their contents. I couldn’t find one, so I wrote my own. Have a gander:

[UPDATE: The following file has been modified so that it actually works]
shpUtils.py
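
For a sense of what a loader like this has to do first, here is a minimal sketch of reading the fixed 100-byte header at the start of a .shp file with nothing but the standard struct module. This is not the code from shpUtils.py; the function name read_shp_header and the keys of the returned dictionary are made up for illustration. The byte offsets follow the published ESRI shapefile layout.

```python
import struct

# Shape type codes for the common 2D shape types in the ESRI shapefile format.
SHAPE_TYPES = {0: 'Null', 1: 'Point', 3: 'PolyLine', 5: 'Polygon', 8: 'MultiPoint'}


def read_shp_header(fp):
    """Parse the 100-byte main header of a .shp file opened in binary mode.

    Hypothetical helper for illustration; not part of shpUtils.py.
    """
    header = fp.read(100)
    if len(header) < 100:
        raise ValueError('file is too short to be a shapefile')

    # Bytes 0-3: file code, big-endian, always 9994.
    file_code, = struct.unpack('>i', header[0:4])
    if file_code != 9994:
        raise ValueError('not a shapefile (file code %d)' % file_code)

    # Bytes 24-27: file length in 16-bit words, big-endian.
    length_words, = struct.unpack('>i', header[24:28])
    # Bytes 28-35: version and shape type, little-endian.
    version, shape_type = struct.unpack('<ii', header[28:36])
    # Bytes 36-67: bounding box as four little-endian doubles.
    bbox = struct.unpack('<4d', header[36:68])

    return {
        'file_length_bytes': length_words * 2,
        'version': version,
        'shape_type': shape_type,
        'shape_name': SHAPE_TYPES.get(shape_type, 'Other'),
        'bbox': bbox,  # (xmin, ymin, xmax, ymax)
    }
```

Something like read_shp_header(open('shapefiles/gb.shp', 'rb')) would then report the shape type and bounding box before any records are read. The mix of big-endian and little-endian fields is part of the format, not a typo.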

Here’s a simple use example:

import shpUtils

# load the shapefile, populating a list of dictionaries
shpRecords = shpUtils.loadShapefile('shapefiles/gb.shp')

# now do whatever you want with the resulting data;
# I'm going to just print out the first feature in this shapefile
first = shpRecords[0]
print(first['dbf_data'])
for part, value in first['shp_data'].items():
    print('%s %s' % (part, value))

The above code, tried out on a shapefile of British counties, produces the following output (copy-pasted to a file).
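
As an example of doing something with the loaded data, here is a rough sketch of the centroid computation mentioned earlier. It assumes a layout that you should check against your own output: that each record's 'shp_data' dictionary has a 'parts' list, that each part has a 'points' list, and that each point is a dictionary with 'x' and 'y' keys. The function names ring_centroid and record_centroids are hypothetical and not part of shpUtils.py.

```python
def ring_centroid(points):
    """Area-weighted centroid of one ring given as [{'x': .., 'y': ..}, ...].

    Uses the standard shoelace formula. Falls back to the plain mean of the
    vertices when the ring has zero area (for example a line or single point).
    """
    n = len(points)
    if n == 0:
        raise ValueError('ring has no points')

    area2 = 0.0  # twice the signed area of the ring
    cx = cy = 0.0
    for i in range(n):
        p, q = points[i], points[(i + 1) % n]
        cross = p['x'] * q['y'] - q['x'] * p['y']
        area2 += cross
        cx += (p['x'] + q['x']) * cross
        cy += (p['y'] + q['y']) * cross

    if area2 == 0:
        return (sum(p['x'] for p in points) / float(n),
                sum(p['y'] for p in points) / float(n))
    # Centroid = sum / (6 * area), and 6 * area == 3 * area2.
    return (cx / (3.0 * area2), cy / (3.0 * area2))


def record_centroids(record):
    """One centroid per part of a record (ASSUMED layout, see above)."""
    parts = record['shp_data'].get('parts', [])
    return [ring_centroid(part['points']) for part in parts]
```

With that layout, record_centroids(shpRecords[0]) would return one (x, y) tuple per ring of the first feature. A multi-part feature with holes needs more care than this, since each ring is treated independently here.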

After working on the module for a while, I realized Python doesn’t have a standard utility for loading DBF files (like PHP’s dbase_open). A few such libraries exist for Python, but all were overkill for my purposes. Then I stumbled onto this on ASPN — a very simple DBF reader/writer module by Raymond Hettinger. I copy-pasted it into the following file:

dbfUtils.py

You’ll need that file, too, if you want to use my shapefile module. Enjoy!
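
If you're curious what a DBF reader has to deal with, the header is simple enough to sketch in a few lines. The following is my own illustration and not Raymond Hettinger's recipe or the contents of dbfUtils.py; the names read_dbf_fields and parse_numeric are hypothetical. It assumes the common dBase III layout: a 32-byte file header, one 32-byte descriptor per field, then a single terminator byte.

```python
import struct


def read_dbf_fields(fp):
    """Return (record_count, fields) from a DBF file opened in binary mode.

    fields is a list of (name, type_code, size, decimals) tuples.
    Hypothetical helper for illustration only.
    """
    # File header: skip 4 bytes (version and date), then the record count
    # (uint32) and the header length (uint16); the rest is skipped.
    record_count, header_length = struct.unpack('<4xLH22x', fp.read(32))

    # Header = 32 bytes + 32 bytes per field + 1 terminator byte.
    field_count = (header_length - 33) // 32
    fields = []
    for _ in range(field_count):
        name, type_code, size, decimals = struct.unpack(
            '<11sc4xBB14x', fp.read(32))
        name = name.split(b'\0', 1)[0].decode('ascii')  # strip NUL padding
        fields.append((name, type_code.decode('ascii'), size, decimals))
    return record_count, fields


def parse_numeric(raw, decimals):
    """Convert the raw bytes of an 'N' field, tolerating blank values.

    Blank numeric fields do occur in real files, so this returns None
    for them instead of raising.
    """
    text = raw.replace(b'\0', b'').strip()
    if not text:
        return None
    text = text.decode('ascii')
    return float(text) if decimals else int(text)
```

Each record after the header is fixed-width text: one deletion-flag byte followed by every field at its declared size, which is why the field list above is all you need to slice records apart.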

10 Comments

  1. I’m working on a new module, based on this, which will let you convert the shapefile into GeoJSON (shp2json) or KML (shp2kml), in addition to the current dictionary output.

    admin
    Posted May 31, 2008 at 12:34 pm | Permalink
  2. Michael Geary uses shpUtils.py in his recent Mapping the Vote work. He’s fixed a few bugs and posted his version here.

    admin
    Posted May 31, 2008 at 12:35 pm | Permalink
  3. and there’s a cool video up on YouTube of Michael’s Google talk of his Mapping the Votes work.

    admin
    Posted May 31, 2008 at 12:36 pm | Permalink
  4. Hi Zachary,

    I have been trying to use your lib to parse some ESRI shapefiles. It worked great until I simplified the shapefiles with Matt Bloch’s mapshaper. At that point I ran into this error:

    Traceback (most recent call last):
      File "parseShape.py", line 3, in <module>
        shpRecords = shpUtils.loadShapefile('shapefiles/us/county2000/county2000-polygons-94pct.shp')
      File "/Users/j-vung/Desktop/ShapefileParser/Python libs/shpUtils.py", line 27, in loadShapefile
        shp_record = createRecord(fp)
      File "/Users/j-vung/Desktop/ShapefileParser/Python libs/shpUtils.py", line 47, in createRecord
        for i in range(0,len(db[record_number+1])):
    IndexError: cannot fit 'long' into an index-sized integer

    My knowledge of Python is pretty limited. I have tried to look it up online, but couldn’t find the answer.

    Any suggestions would be great. Thank you for your time.

    Vu Nguyen
    Posted June 17, 2009 at 5:13 pm | Permalink
  5. these have been incredibly useful, thanks so much!

    Matt B
    Posted May 5, 2010 at 10:23 pm | Permalink
  6. Thanks a lot for this! However, there seems to be a bug in dbfUtils.py. It reads something before the EOF which is blank (for ‘value’) and tries to process it which in turn throws an exception. It was fixed by testing the value first before proceeding and doing a ‘continue’ if value turned out to be blank. Your sample program is a big help too!

    Noel Quintos
    Posted July 1, 2010 at 12:27 am | Permalink
  7. Thank-you so much!!! I’ve been looking for a way to parse shp in python forever.

    Ethan

    Ethan Van den Berg
    Posted August 28, 2010 at 2:54 pm | Permalink
  8. I had the same issue Vu Nguyen had. My shapefile was built in QGIS by merging two different shapefiles and deleting and editing several features. This error happens because, somehow, a feature in your shapefile gets a ‘long’ type automatic record number. I had to redraw the shapefile from scratch (as I had a small one). For larger ones, maybe creating a new topology in ArcGIS would reset the shapefile records… Still not sure about this…

    camivic
    Posted October 16, 2010 at 1:56 am | Permalink
  9. This was exactly what I needed for some quick-and-dirty plotting. Thank you very much!

    Will Chang
    Posted April 6, 2011 at 4:01 pm | Permalink
  10. Saved as a favorite, I love your blog!

    Cleta
    Posted December 28, 2013 at 12:53 pm | Permalink

3 Trackbacks

  1. [...] folks. So, just 1.5 months ago I posted my own Python script to load shapefile data. Well, recently — while doing some pretty hardcore shapefile loading + cartogram generation [...]

  2. [...] first hurdle in map making is getting the data in, for this I used the shapefile reader that Zachary Forest Johnson put together for his excellent blog ‘IndieMaps.com‘. This [...]

  3. By A Thematic Map in Python on February 20, 2011 at 9:41 am

    [...] first hurdle in map making is getting the data in, for this I used the shapefile reader that Zachary Forest Johnson put together for his excellent blog ‘IndieMaps.com‘. [...]
