My recent work on creating a more flexible circular cartogram algorithm meant that I needed (well, wanted) to load shapefiles into a Python application. After many searches and scouring of message boards, I settled on the OGR/GDAL libraries. And 3 hours later I had it installed on my MacBook…
I’m sure that OGR (vector) and GDAL (raster; pronounced, apparently, like ‘goodle’) are great for their intended purpose: providing a jack-of-all-trades library for doing pretty much anything with geospatial data. But for users who need to do only simple tasks, they’re overkill. And slow. Loading a small shapefile took a few seconds, and accessing points to, say, compute centroids took a few more. This was too much. I needed a simple and quick module to load shapefiles and populate a list of their contents. I couldn’t find one, so I wrote my own. Have a gander:
[UPDATE: The following file has been modified so that it actually works]
shpUtils.py
Here’s a simple use example:
import shpUtils

# load the shapefile, populating a list of dictionaries
shpRecords = shpUtils.loadShapefile('shapefiles/gb.shp')

# now do whatever you want with the resulting data
# i'm going to just print out the first feature in this shapefile
print shpRecords[0]['dbf_data']
for part in shpRecords[0]['shp_data']:
    print part, shpRecords[0]['shp_data'][part]
The above code, tried out on a shapefile of British counties, produces the following output (copy-pasted to a file).
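As for the centroid computation mentioned earlier, here is a minimal sketch of how it might look once the points are loaded. It is not part of shpUtils.py: the function name polygon_centroid is my own, and it assumes you have already pulled one ring of a feature out of the loaded record as a plain list of (x, y) tuples. How you get there depends on the record layout, which isn't spelled out here.

```python
def polygon_centroid(points):
    """Return the (x, y) area centroid of a simple polygon ring.

    `points` is a list of (x, y) tuples (an assumed input format, not
    the layout shpUtils returns). The ring may be open or closed.
    """
    # Drop a duplicated closing vertex so open and closed rings behave alike.
    if len(points) > 1 and points[0] == points[-1]:
        points = points[:-1]
    n = len(points)
    if n == 0:
        raise ValueError('need at least one point')

    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        cross = x0 * y1 - x1 * y0  # shoelace term for this edge
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    if twice_area == 0:
        # Degenerate ring (a point or a line): fall back to the vertex average.
        return (sum(p[0] for p in points) / float(n),
                sum(p[1] for p in points) / float(n))

    # Area A = twice_area / 2, and the centroid sums are divided by 6A.
    return (cx / (3.0 * twice_area), cy / (3.0 * twice_area))


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
center = polygon_centroid(square)  # (1.0, 1.0)
```

The signed area cancels out of the final division, so the ring's winding direction doesn't matter. Features with several parts or holes would need the per-ring results combined by area, which this sketch leaves out.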
After working on the module for a while, I realized Python doesn’t have a standard utility for loading DBF files (like PHP’s dbase_open). A few such libraries exist for Python, but all were overkill for my purposes. Then I stumbled onto this on ASPN — a very simple DBF reader/writer module by Raymond Hettinger. I copy-pasted it into the following file:
You’ll need that file, too, if you want to use my shapefile module. Enjoy!
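If you're curious what a bare-bones DBF reader has to do first, it's parsing the fixed-layout header that describes the fields. Here's a rough sketch of that step using only the struct module. It is not Hettinger's recipe or the code in my file: the function name read_dbf_header is made up for illustration, and it assumes a plain dBASE III style file of the kind that ships alongside a shapefile.

```python
import io
import struct


def read_dbf_header(f):
    """Read a dBASE III style header from binary file object `f`.

    Returns (record_count, fields), where fields is a list of
    (name, type_code, size, decimals) tuples.
    """
    # First 32 bytes: bytes 4-7 hold the record count and bytes 8-9 the
    # header length (both little-endian); the rest is skipped here.
    record_count, header_length = struct.unpack('<4xLH22x', f.read(32))

    # Each field descriptor is 32 bytes; one terminator byte follows them.
    field_count = (header_length - 33) // 32
    fields = []
    for _ in range(field_count):
        name, type_code, size, decimals = struct.unpack(
            '<11sc4xBB14x', f.read(32))
        name = name.split(b'\0', 1)[0].decode('ascii')  # strip null padding
        fields.append((name, type_code.decode('ascii'), size, decimals))

    if f.read(1) != b'\r':
        raise ValueError('missing DBF header terminator')
    return record_count, fields


# Build a tiny two-field header in memory to try it out (made-up data).
sample = (
    struct.pack('<BBBBLH22x', 3, 108, 1, 1, 5, 32 + 2 * 32 + 1)
    + struct.pack('<11sc4xBB14x', b'NAME', b'C', 20, 0)
    + struct.pack('<11sc4xBB14x', b'AREA', b'N', 12, 3)
    + b'\r'
)
record_count, fields = read_dbf_header(io.BytesIO(sample))
```

After the header, each record is a deletion-flag byte followed by the field values as fixed-width text, so reading the rows is mostly slicing and stripping strings.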
9 Comments
I’m working on a new module, based on this, which will let you convert the shapefile into geoJSON (shp2json) or kml (shp2kml), in addition to the current dictionary output.
Michael Geary uses shpUtils.py in his recent Mapping the Vote work. He’s fixed a few bugs and posted his version here.
And there’s a cool video up on YouTube of Michael’s Google talk on his Mapping the Vote work.
Hi Zachary,
I have been trying to use your lib to parse some ESRI shapefiles. It worked great until I simplified the shapefiles with Matt Bloch’s mapshaper. At that point I ran into this error:
Traceback (most recent call last):
  File "parseShape.py", line 3, in <module>
    shpRecords = shpUtils.loadShapefile('shapefiles/us/county2000/county2000-polygons-94pct.shp')
  File "/Users/j-vung/Desktop/ShapefileParser/Python libs/shpUtils.py", line 27, in loadShapefile
    shp_record = createRecord(fp)
  File "/Users/j-vung/Desktop/ShapefileParser/Python libs/shpUtils.py", line 47, in createRecord
    for i in range(0,len(db[record_number+1])):
IndexError: cannot fit 'long' into an index-sized integer
My knowledge of Python is pretty limited. I have tried to look it up online, but couldn’t find the answer.
Any suggestions would be great. Thank you for your time.
these have been incredibly useful, thanks so much!
Thanks a lot for this! However, there seems to be a bug in dbfUtils.py. It reads something before the EOF which is blank (for ‘value’) and tries to process it which in turn throws an exception. It was fixed by testing the value first before proceeding and doing a ‘continue’ if value turned out to be blank. Your sample program is a big help too!
Thank-you so much!!! I’ve been looking for a way to parse shp in python forever.
Ethan
I had the same issue Vu Nguyen had. My shapefile was built in QGIS by merging two different shapefiles and deleting and editing several features. This error happens because, somehow, a feature in your shapefile gets a ‘long’ type automatic record number. I had to redraw the shapefile from scratch (as I had a small one). For larger ones, maybe by creating a new topology in ArcGIS, the shapefile records might be reset… Still not sure about this…
This was exactly what I needed for some quick-and-dirty plotting. Thank you very much!
3 Trackbacks
[...] folks. So, just 1.5 months ago I posted my own Python script to load shapefile data. Well, recently — while doing some pretty hardcore shapefile loading + cartogram generation [...]
[...] first hurdle in map making is getting the data in, for this I used the shapefile reader that Zachary Forest Johnson put together for his excellent blog ‘IndieMaps.com‘. This [...]
[...] first hurdle in map making is getting the data in, for this I used the shapefile reader that Zachary Forest Johnson put together for his excellent blog ‘IndieMaps.com‘. [...]