Learning About A Community: Data Analysis vs. Listening to People

(Note: This post was written by Jim Craner, 2012 CfA fellow. This aggregation system automatically lists Abhi Nemani as the author for all Code for America posts. We are working on a fix.)

As one of three Code for America Fellows working with the City of Santa Cruz, I am fortunate to spend five weeks in this beautiful city learning about the people, businesses, organizations, and institutions that make it such a great place to live and work. Of course, the beautiful weather and jaw-dropping scenery contribute as well — how’s that snow shoveling working out for you, Team Chicago? ;-)

As Fellows, our primary method of learning during our Residency month is simply meeting people and listening to them. We’re listening to the participants in our focus groups share their stories, we’re listening to the conversations between local business owners and dedicated civil servants in the City’s Planning Department, and we’re listening to the people of Santa Cruz tell us why they love this city and how we might be able to help make it even better. We’re also listening carefully as folks we meet tell us the best locations for burritos, bicycling, and beach-going.

Our focus this year is to help the City of Santa Cruz improve the experience of local businesses as they obtain permits and licenses. We’re helping Santa Cruz businesses “get down to business,” so to speak: every hour that an entrepreneur doesn’t spend waiting in line for a form at City Hall is an hour he or she can spend innovating, building his or her business, and hiring employees!

But as Fellows, we also want to make decisions informed by data and we like to geek out sometimes. So when I heard about the City of Santa Cruz Business License Database, a city-published data file containing a list of every individual and company conducting business in the city, I knew I had to spend some time taking a look and seeing what else I could learn about the community.

I used Google Refine, a free data analysis tool provided by Google, to take a closer look at the 5,399 business licenses held in the City as of February 2012 (the City updates this file each month). Google Refine offers several easy methods for sorting and filtering large sets of data, letting me analyze the business licenses by location, type, founding date, and many other aspects.

Using Google Refine to take a closer look at the data makes it easy to get quick insights and interesting perspectives. With just a few clicks, I was able to see a list of the city’s businesses sorted by number of employees.

Screenshot of Google Refine interface showing sort
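
For readers who would rather script the same sort than click through Refine, here is a minimal Python sketch. It is not what was used for this post. The column names `business_name` and `employees` and the sample rows are made up for illustration, so the real file's layout will likely differ.

```python
import csv
import io

# Made-up sample rows; the real file's column names and values may differ.
SAMPLE_CSV = """business_name,employees
Example Cafe,4
Example Manufacturing,350
Example Consulting,1
Example Surf Shop,12
"""


def sort_by_employees(csv_text):
    """Return license rows sorted by employee count, largest first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return sorted(rows, key=lambda row: int(row["employees"]), reverse=True)


largest_first = sort_by_employees(SAMPLE_CSV)
for row in largest_first:
    print(row["business_name"], row["employees"])
```

To try it on a real download, read the file's text and pass it in place of `SAMPLE_CSV`, adjusting the column names to match.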

Hmmm, looks like the big employers include Plantronics, Costco, the Boardwalk. Aha! We’ve got a meeting scheduled later this month to interview the management of the Boardwalk, one of the city’s largest employers and biggest tourist attractions, about their thoughts on the permitting process. (We’ll also probably play Laser Tag and Skee Ball just to make sure we understand the Boardwalk as fully as possible.)

Really, though, there are only a couple of dozen businesses in the city that employ over 100 people. We need to re-sort the data so we can easily see that the vast majority of businesses in Santa Cruz employ just one or two people. These are the professionals, contractors, moonlighters, the people who invested in a rental property or side business to supplement their income. These are the people that make up the small business community; these are the people that collectively spend thousands of hours every year filling out business permit applications and renewal forms; these are the people we’ve been seeking out so we can listen to their stories and learn from their experiences. These are the people we want to help when we begin building the city’s application with our partners in March.

Small Businesses make up the majority of Santa Cruz businesses
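
The same picture can be approximated in a few lines of Python by grouping licenses into size buckets and counting them. This is only a sketch: the employee counts below are invented, and the bucket boundaries (2 and 100) are assumptions chosen to mirror the "one or two people" and "over 100" groups described above.

```python
from collections import Counter

# Made-up employee counts, one per license; not taken from the real file.
employee_counts = [1, 2, 1, 350, 4, 1, 12, 2, 1, 120]


def size_bucket(employees):
    """Map an employee count to a coarse bucket (2 or fewer lands in "1-2")."""
    if employees <= 2:
        return "1-2"
    if employees <= 100:
        return "3-100"
    return "over 100"


bucket_totals = Counter(size_bucket(n) for n in employee_counts)
for bucket, total in bucket_totals.most_common():
    share = total / len(employee_counts)
    print(f"{bucket}: {total} licenses ({share:.0%})")
```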

Let’s see how many new businesses have started in Santa Cruz so far this year (since our data was released the first week of February, we’ll basically be looking for January business license issuances). With Google Refine, we use the “Timeline Facet” and sort by the date the business was started. It’s easy to see the dozens of businesses that have launched this year: veterinarians, cycle shops, surf shops, jewelry stores, building contractors. Of course it’s great to see that activity happening, even if those entrepreneurs missed out on the chance to use the great permitting application we’ll be building this year! Hopefully they’ll enjoy renewing next year just as much!

Screenshot of Date Filter in Google Refine
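
Outside of Refine, a rough equivalent of the date filter is to parse each start date and keep the rows on or after a cutoff. The sketch below assumes a `start_date` column in `MM/DD/YYYY` format. Both the column name and the format are guesses, and the rows are made up.

```python
from datetime import date, datetime

# Made-up rows; the real file's column names and date format may differ.
licenses = [
    {"business_name": "Example Veterinary", "start_date": "01/09/2012"},
    {"business_name": "Example Deli", "start_date": "06/15/1959"},
    {"business_name": "Example Cycles", "start_date": "01/23/2012"},
]


def started_on_or_after(rows, cutoff):
    """Keep rows whose start date falls on or after the cutoff date."""
    kept = []
    for row in rows:
        started = datetime.strptime(row["start_date"], "%m/%d/%Y").date()
        if started >= cutoff:
            kept.append(row)
    return kept


new_this_year = started_on_or_after(licenses, date(2012, 1, 1))
for row in new_this_year:
    print(row["business_name"], row["start_date"])
```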

It’s hard to tell with a data set like this whether a “Business Start Date” of 1900 actually means the business is 112 years old or is just a data entry or program error, but the Timeline shows that there are quite a few businesses with legitimate histories going back into the mid-20th century. Hmmm, Zoccoli’s Deli downtown, with a business license history dating back to 1959. That sounds familiar: I just ran into Chief of Police Vogel there on Thursday, where I listened to his advice on the best sandwich to get (I went with the chicken parm). By the way, SCPD is one of the most technologically advanced police forces in the US, so of course they release their own data sets for public analysis, but that’s a topic for another blog post.
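
One way to approach the 1900 question in code is to count how many licenses share each very early start year. A pile-up on a single year hints at a placeholder value rather than a real founding date. This is a sketch with invented years, and the `earliest_plausible` threshold is an assumption. It only flags records for a human to review and cannot say which dates are wrong.

```python
from collections import Counter

# Made-up start years, one per license; not taken from the real file.
start_years = [1900, 1959, 1900, 2012, 1987, 1900, 2011]


def years_to_review(years, earliest_plausible=1901):
    """Count licenses per start year that falls before the plausibility threshold."""
    counts = Counter(years)
    return {year: n for year, n in counts.items() if year < earliest_plausible}


flagged = years_to_review(start_years)
print(flagged)
```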

I’ve just scratched the surface of Google Refine this week, but it’s obviously a great tool to help analyze and visualize large data sets. Of course, even the best data and the niftiest tools are no substitute for sitting down and actively listening to your future users and the community that you’re working to serve. For instance, even though you can tie Google Refine to a GIS system and run a filter for alcohol licenses, there is no app that can beat just asking the local geeks where to find the closest bar to your house that has a great beer list and decent Wi-Fi (it’s the Seabright Brewery in our case :-) ). Thankfully the geek community in Santa Cruz has welcomed us with open arms and answered those and many other questions for us already!



Andrew Krzmarzick

Great story – and excellent outline of your process for learning the needs of a community.

Good tip on Google Refine, too!

James Miceli

This is also a GREAT use of a great tool from Google. Wish my agency would allow me to access it… the perils of the firewall.

Brian Dowling

First, thanks for the introduction to Google Refine. Looks like a great tool. Also very impressed with what Santa Cruz is doing with their data, especially in terms of transparency. Making their Business License database accessible is a great step. Many cities sell the list for a fast buck rather than allowing it to be ‘mined’ for its full value. I don’t think, though, that you really have a “vs.” situation. When you listen to people you get a unique individual perspective, but it should still be possible to tie that perspective to the insights brought through data analysis. Even when the results are surprising, if the methodology and data are made transparent, it should be possible to map out the insights derived so that most will see the truth of them. There may still be a need to get people comfortable with the idea, but Santa Cruz making their Business License data open is a step in that direction.