Ann Arbor’s Year in Crime

Data available from the city of Ann Arbor's website

The map below depicts the better part of the year in crime for Ann Arbor in 2008. The first two weeks of January and the last two weeks of December are not included, so the data presented here should not be considered complete or official. The zoom slider allows a closer examination of individual neighborhoods. Clicking on a marker brings up a balloon that shows the date and category of the crime. The map itself appears after the jump. Here’s a larger interactive crime map with the same data. Discussion of data sources and the city of Ann Arbor’s system for pushing information appears below the map.

The Chronicle was alerted to a revision to the city website’s crime data archive via the automatic email system we’ve signed up for. Anyone can subscribe to the system, and it’s customizable by topic. Readers who don’t wish to receive notifications of crime updates can simply opt out of that topic at sign-up. But The Chronicle subscribes to the complete package of notifications. So we received the following note by email yesterday:

-
From: City of Ann Arbor, MI
To: dave.askins@annarborchronicle.com
Date: Tue, Dec 23, 2008 at 3:21 PM
Subject: Ann Arbor Police Crime Statistics – Dec. 14-20, 2008

You are subscribed to Neighborhood Watch for City of Ann Arbor, MI. This information has recently been updated, and is now available.

Update your subscriptions, modify your password or e-mail address, or stop subscriptions at any time on your Subscriber Preferences Page. You will need to use your e-mail address to log in. If you have questions or problems with the subscription service, please contact support@govdelivery.com.

This service is provided to you at no charge by City of Ann Arbor, MI.

GovDelivery, Inc. sending on behalf of City of Ann Arbor, MI · 100 N. Fifth Avenue · Ann Arbor MI 48104 · 734-994-2700

The link embedded in the email leads to the city of Ann Arbor’s neighborhood watch page, which includes a PDF archive of crime data divided into chunks of about two weeks per PDF. Yesterday’s notification was the second one we’d received; a similar notice arrived two weeks ago. Until now, we hadn’t allocated time and resources at The Chronicle to opening up the crime data PDFs, partly because many of the PDFs generated by the city contain scanned images of pages, as opposed to the “native text” that allows for easy use of the data. We imagined that the crime data PDFs probably fit the scanned image pattern.

However, Ed Vielmetti forwarded to us by email the contents of the PDF, which he’d copied-and-pasted from the file. From that we concluded that the PDFs contained text, not just images of pages, and from there, we were able to assemble a single file of text containing the data, delimit it, geocode it, and map it. The key concept for mapping – a KML file – came from Vielmetti and a second-degree connection to him, Andrew Turner.

If time and resources permit, in 2009 we may provide maps of crime data from the city’s website as they become available. We’d first want to develop an efficient work process for generating them, as well as discuss with the police department the categories we’ve lumped together based on a lay person’s understanding (e.g., burglary and larceny from a vehicle are given the same color marker in the map we’ve generated above).

Technical Details

Why don’t we just use Google’s MyMaps instead of GPSVisualizer? There’s an apparent limit of 200 placemarks for Google’s MyMaps, which the year’s data easily exceeded. The 2009 strategy of creating one map for each two-week span covered by the city’s PDFs, plus the ability to assign log-in privileges to others for help in doing the work, points towards adoption of Google’s MyMaps.

Where do you get the longitude and latitude for the addresses? We submitted them to a for-now free service: BatchGeoCode. GPSVisualizer provides batch geocoding as well, but not at the level of precision required for this work. For presentation of the data, however, GPSVisualizer provides a whole range of options, including the ability to present quantified data – like campaign donations displayed on a map by donor address with a marker scaled relative to the size of the donation.

How do you get from the set of geocoded crime data to a KML file? In our case, we did a mail merge using MS Word and an MS Excel table as a data source. We’ll pause for a moment while our geek readers cough up their collective lungs laughing. It does work, although it requires using placeholder text like “LEFTANGLE” and “RIGHTANGLE” for the ubiquitous left and right angle brackets of KML, which are swapped back in after the merge, so that MS Word doesn’t try to interpret the document as something it cannot read and then crash. See “develop efficient work process” above.
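
For readers who would rather skip the mail merge, here is a minimal sketch in Python of the same idea: one KML placemark per spreadsheet row. It is not the process we used. The column names (date, category, address, latitude, longitude) and the two sample rows are invented for illustration, and the real spreadsheet may be laid out differently. Note that KML lists longitude before latitude.

```python
import csv
import io
from xml.sax.saxutils import escape

# Invented sample rows; column names are assumptions, not the real spreadsheet layout.
SAMPLE_CSV = """date,category,address,latitude,longitude
2008-03-14,Burglary,100 Example St,42.2820,-83.7450
2008-06-02,Larceny from vehicle,200 Sample Ave,42.2790,-83.7400
"""

# Template for one marker. KML coordinates are written longitude,latitude.
PLACEMARK = """  <Placemark>
    <name>{category}</name>
    <description>{date}, {address}</description>
    <Point><coordinates>{longitude},{latitude}</coordinates></Point>
  </Placemark>"""


def rows_to_kml(csv_text):
    """Return a KML document string with one Placemark per CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    placemarks = []
    for row in reader:
        # Escape &, < and > so that stray characters cannot break the XML.
        safe = {key: escape(value.strip()) for key, value in row.items()}
        placemarks.append(PLACEMARK.format(**safe))
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "<Document>\n"
        + "\n".join(placemarks)
        + "\n</Document>\n</kml>\n"
    )


kml_text = rows_to_kml(SAMPLE_CSV)
print(kml_text)
```

Because the angle brackets live in an ordinary text template, no LEFTANGLE or RIGHTANGLE placeholders are needed, and escaping the cell values guards against an ampersand in a street name producing an invalid file.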

How do you delimit the text, once you’ve got it out of the PDF? We used a text editor called TextWrangler, which supports grep searches, making it fairly easy to deal with records with no spaces between address numbers and street names.
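
The kind of grep-style fix described above can be sketched in Python as well. This is an illustration under assumptions, not the exact pattern we ran in TextWrangler: it assumes each record begins with a house number fused directly to the street name, and the sample strings are invented.

```python
import re

# Leading digits followed immediately by a letter: a fused number/street boundary.
# Anchoring at the start leaves numbers later in the line (like "5TH") alone.
FUSED_NUMBER = re.compile(r"^(\d+)(?=[A-Za-z])")


def split_address(text):
    """Insert a space between a leading house number and a fused street name."""
    return FUSED_NUMBER.sub(r"\1 ", text)


print(split_address("1200EXAMPLE ST"))
print(split_address("300 S 5TH AVE"))
```

A street name that itself begins with a digit (a bare “5TH AVE” with no house number) would be split incorrectly by this pattern, so records like that would still need a manual check.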

Files:

MS Excel Spreadsheet of 2008 Crime Data [not complete or official]

KML file with data from Excel Spreadsheet of 2008 Crime Data [not complete or official]

4 Comments

  1. December 25, 2008 at 11:29 am | permalink

    Hey Dave,

    I just sucked your spreadsheet into http://umichcrime.org so your canned 2008 AAPD incidents now show up alongside those pulled on demand from DPS. :-)

    Seems that you’re missing a lot of data. I tried asking the AAPD for a clean, realtime dataset years ago to no avail (same with the AATA). Those PDFs look clean enough to scrape, though – maybe I’ll take a look when I get some time…

  2. By Virgil
    January 7, 2009 at 2:03 pm | permalink

    Not showing the theft of my car from 1/11/08. I guess I must have lent it to a complete stranger without my knowledge. Or the keys.

  3. By Dave Askins
    January 7, 2009 at 2:13 pm | permalink

    Virgil,

    The set of PDFs on the city’s site had some gaps, which included the period during which your car was stolen. “The first two weeks of January and the last two weeks of December are not included.” Depending on where it was stolen, it might show up on Dug Song’s UmichCrime.org mentioned in the first comment.

  4. January 13, 2009 at 9:49 am | permalink

    I uploaded the CSV to GeoCommons and quickly built this map: http://maker.geocommons.com/maps/2294

    It would be interesting to pull in other local information such as police stations, and other demographics.

    We’re also adding some features to the GeoCommons Maker maps that should make rendering faster.

    A year ago I built a quick PDF scraper of the neighborhood watch reports – but the problem was the format changed every other week, so it was difficult/impossible to automate. I tried working some channels then, and there was interest, but then due to “budget” or other reasons people weren’t willing to discuss collaborating.