How Do You Get in on the Ground Floor of Big Data?

How do you get in on the ground floor of big data? Programming? Statistics? Research experience?

Last week I questioned the idea of leaving behind our traditional models of data analysis in favor of big data. Much like being the last kid in your grade school class to get the famed New Kids on the Block lunch box, the ubiquitous Adidas sneakers or the trendy North Face jacket, I sometimes feel like I’m the last kid in the class to lead a big data project.

Obviously, most employees in the workforce today were out of school before the advent of big data courses (or the New Kids on the Block, for that matter). If we are lucky enough to have experience with big data, it has often come about through our own bootstrapping. Some career fields lend themselves to this more than others. So if you are one of those who haven’t yet had a big data project, how do you get in on the ground floor of big data?

In fact, who says that anyone isn’t qualified to start looking at big data? What does it mean to be a “data scientist” anyway? The field is new enough that its definition still seems malleable, so what attributes should today’s data scientist have? I see four possible backgrounds that could make someone a great candidate for looking at big data:

1. A programming background. The whole idea behind big data is that the data set is too large for anyone to parse through by hand. This necessitates the use of data-mining algorithms and software that will parse through the data for us. While programming may be more user-friendly than it used to be, that’s not to say that the programming required is self-evident or even easily acquired on the job. Often a programming background would be hugely beneficial in this arena.

2. A statistics background. Having literally millions of data points can be somewhat useless if you can’t calculate statistics based on these data points. A background in statistics is necessary for handling the large sample sizes inherent in big data and allows one to control for a variety of factors. A statistics background also gives one the tools to assess whether differences between data sets, or points within a data set, are statistically significant or just simply different.

3. A mathematics background. Unlike smaller data sets, which require expertise in the field being studied, larger data sets require skills like linear algebra, calculus, and probability theory (maybe even some graph theory), regardless of the field being analyzed. While these skills can be self-taught, it’s much easier to learn linear algebra or calculus in a classroom setting than to struggle through these tough subjects on your own. This makes mathematics majors that much more valuable as big data hunters.

4. A research background. At its core, data analysis is about using the information available to discover new trends and draw conclusions. This is also the heart and soul of what a researcher does. Those with a background in research also possess strengths in experimental design and data organization, allowing them to keep everything straight through the planning, execution, and analysis phases. They are also (hopefully) adept at reporting their conclusions in peer-reviewed publications, a hallmark of good research.
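
To make the statistics point (number 2 above) a little more concrete, here is a minimal sketch of one way to ask whether two groups of numbers are "statistically significant or just simply different." It uses a permutation test, which is only one of several reasonable choices and is my illustration rather than a recommended method. The function name `permutation_p_value` and the response-time figures are made up for the example.

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for the difference in means of a and b.

    Idea: if the group labels do not matter, shuffling them should produce
    differences at least as large as the observed one fairly often.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical response times (in days) for two offices; invented data.
    office_a = [10.1, 10.3, 9.9, 10.2, 10.0, 10.4, 9.8, 10.1]
    office_b = [12.0, 12.2, 11.9, 12.1, 12.3, 11.8, 12.0, 12.1]
    print("estimated p-value:", permutation_p_value(office_a, office_b))
```

A small p-value suggests the gap between the two groups would be unlikely to arise from chance relabeling alone, while a large one suggests the groups may be "just simply different" in the way any two samples are.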

Of course, background isn’t enough. There must be an element of in-the-right-place-at-the-right-time to be in a field that not only embraces big data philosophically, but also has the capabilities, both technological and financial, to do so. And for government employees, we have an additional caveat: it must fit with the mission of your department/agency/office.

So what do you think? What’s needed to get in on the ground floor of big data? This is the classic Catch-22 in which many new college graduates find themselves: prospective employers want you to have experience, but without a job, how can you get experience? Likewise, how can we bootstrap ourselves when it comes to big data? How can we seek out the right projects that will broaden our data analysis horizons while still being within our reach? Or is it the other way around: if you have the right background, will big data eventually come to you?

Erika Bakota is part of the GovLoop Featured Blogger program, where we feature blog posts by government voices from all across the country (and world!).
