This blog post is an excerpt from GovLoop’s recent guide, How You Can Use Data Analytics to Change Government. Download the full guide here.
There’s no denying that big data analytics has the power to transform the way government operates and serves its citizens. There are countless stories of agencies using analytics to speed emergency response times, improve online services and better track internal metrics.
But the path to a successful big data analytics implementation is one that requires proper long-term planning, said Gus Horn, Global Consulting System Engineer at NetApp.
“The reason is there are many variables involved in acquiring and ingesting data into big data analytics platforms, and then there’s also the logic around the analytic component, how to write it and gain insight into the data,” Horn said. “The biggest stumbling block is typically around ingesting the data into an analytic platform.”
It’s common for organizations to start out with small development and testing environments and a handful of servers to demonstrate a proof of concept. These small environments are a great starting point for agencies to demonstrate the power of big data analytics, but they must be designed from the outset to accommodate future data growth and larger projects.
“What we find is that these very small environments are extremely powerful,” Horn said. “So it can be very alluring for our customers worldwide, regardless of sector, to start down that path and say, ‘This works well, I’m just going to repeat and do more of the same.’ They become trapped in an architecture that is extremely rigid in its design, and two or three years down the road they’re unable to evolve, respond to newer technologies, or implement new methodologies in these platforms.”
The problems typically don’t surface until later, when the system has grown exponentially, servers have aged and must be replaced, and disk failures are a constant issue. By that time, it’s very difficult to move the data to a new hardware infrastructure, and the employees who originally built the system have long since left the organization. At that point, keeping the system up and running becomes very expensive.
To avoid these issues, Horn recommended that agencies plan accordingly in these key areas:
Growth. “Organizations always have to plan for growth when they start talking about a big data platform,” Horn said. “Big data platforms by their nature are designed to accommodate the velocity, variety and variability of content, and because of those three V’s these systems are never static. So while agencies start out small, they should always be cognizant of the fact that technology evolves.”
The central processing units that agencies use are rapidly evolving and so are the storage technologies. But the challenge is these two technologies that play integral roles in supporting big data systems are on two different growth paths. Horn suggested that agencies first understand the problem they’re trying to address, as well as their growth demands each month, each year and even further out.
“NetApp’s architecture is based on directly attaching the storage to the compute, but it gives agencies the flexibility to decouple it and change out the computational element without having to migrate or rebalance that storage,” Horn said.
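The monthly and yearly growth demands mentioned above can be estimated with simple compound-growth arithmetic. The sketch below is illustrative only: the function names, the starting size and the growth rate are assumptions for the example, not figures from Horn or NetApp.

```python
def project_capacity(current_tb, monthly_growth_rate, months):
    """Projected data size in TB after compounding monthly growth."""
    return current_tb * (1 + monthly_growth_rate) ** months


def months_until_full(current_tb, monthly_growth_rate, provisioned_tb):
    """Number of whole months until data reaches the provisioned capacity.

    Returns 0 if the data already fills the capacity, and None if the
    data is not growing and will therefore never fill it.
    """
    if current_tb >= provisioned_tb:
        return 0
    if monthly_growth_rate <= 0:
        return None
    months = 0
    size = current_tb
    while size < provisioned_tb:
        size *= 1 + monthly_growth_rate
        months += 1
    return months


if __name__ == "__main__":
    # Hypothetical pilot cluster: 100 TB today, growing 5% per month.
    for horizon in (1, 12, 36):
        size = project_capacity(100, 0.05, horizon)
        print(f"after {horizon:2d} months: {size:.1f} TB")
    print("months until 500 TB is full:", months_until_full(100, 0.05, 500))
```

Running the projection at several horizons (one month, one year, three years) makes it easier to see when a proof-of-concept cluster will outgrow its original hardware.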
Flexibility. Agencies that don’t consider flexibility early on may find themselves locked into what Horn calls “rigid architectural designs.” In that scenario, agencies will find it harder to replace their servers when needed without also migrating content. With NetApp, agencies can keep their data in place when updating their technology clusters.
Agility. Not only are flexible compute and storage important, but so is agility in providing different kinds of storage subsystems. One example is the ability to move different types of data to the most economical storage options, depending on how often the data is accessed and its importance. “There’s no point in keeping data that’s 10 years old and not very relevant in your hot bucket, when you can move that into a cold or glacier-like storage device and still have access to it should you ever need it. You can reserve the high-performance flash technologies for more meaningful data,” he said.
One of the perks NetApp provides its customers is the ability to automate data migration from higher-priced storage to lower-priced storage. NetApp works with agencies to tag the data so they can track its usage and determine which storage option is most appropriate.
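The idea of tagging data and moving it to cheaper storage as it cools can be sketched as a small policy function. This is a minimal illustration, not NetApp’s tooling: the tier names, the `DataSet` fields and the age thresholds are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DataSet:
    name: str
    last_accessed: date
    important: bool = False  # hypothetical tag set by the agency


def choose_tier(ds, today, hot_days=90, cold_days=3 * 365):
    """Pick a storage tier from access recency and importance.

    Thresholds are illustrative assumptions, not vendor defaults.
    """
    age_days = (today - ds.last_accessed).days
    if ds.important or age_days <= hot_days:
        return "hot"      # high-performance flash
    if age_days <= cold_days:
        return "cold"     # cheaper, slower storage
    return "archive"      # glacier-like storage, still retrievable


if __name__ == "__main__":
    today = date(2024, 1, 1)
    datasets = [
        DataSet("current-cases", date(2023, 12, 15)),
        DataSet("2021-reports", date(2022, 6, 1)),
        DataSet("2013-logs", date(2013, 3, 1)),
        DataSet("2013-audit", date(2013, 3, 1), important=True),
    ]
    for ds in datasets:
        print(f"{ds.name}: {choose_tier(ds, today)}")
```

Keeping the importance tag separate from access recency means decade-old data that still matters can stay on fast storage, while old and rarely used data moves to the cheapest tier.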
Evolution. “Oftentimes, agencies get stuck repeating the same mistakes because the task of moving to a newer technology is perceived to be too heavy of a lift,” Horn said. NetApp’s data lifecycle management tool lets the company integrate into an agency’s existing technology cluster, migrate the data to new hardware and decommission older technology gracefully.