
Big Data: Remember Me?

Accounts of who coined the phrase Big Data are mixed. Some indicate it was first used in the late 1990s, then resurfaced in 2005; however, it wasn’t until 2010 that the term entered common use. Regardless, the most interesting aspect to understand is how much data the phrase represents.

In 2010, as the phrase gained momentum, the worldwide volume of data was estimated at two zettabytes (or two trillion gigabytes). That’s a proverbial ton of data. Yet fast-forward 15 years to 2025, and the latest figures estimate the worldwide volume of data at 181 zettabytes. For the analytically inclined, that growth is not linear but exponential. Analysts trying to ascertain a compound annual growth rate (CAGR) for Big Data’s increase have derived estimates ranging from 26% to as high as 58%. Building on those estimates and projecting the amount of data five years out, we could see worldwide volumes ranging from roughly 574 to 1,782 zettabytes, depending on which growth rate you think is more accurate. In short, Big Data has grown up.
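For the curious, the five-year projection above is simple compound-growth arithmetic. A minimal sketch (the function name is my own; the 181 ZB base and the 26%–58% CAGR range come from the estimates discussed above):

```python
def project_volume(base_zb: float, cagr: float, years: int) -> float:
    """Compound a starting data volume (in zettabytes) forward
    at a constant annual growth rate for a number of years."""
    return base_zb * (1 + cagr) ** years

# 2025 baseline of 181 ZB, projected to 2030.
low = project_volume(181, 0.26, 5)   # 26% CAGR, the low analyst estimate
high = project_volume(181, 0.58, 5)  # 58% CAGR, the high analyst estimate

print(f"Low estimate (26% CAGR):  {low:,.0f} ZB")
print(f"High estimate (58% CAGR): {high:,.0f} ZB")
```

Running this gives roughly 575 and 1,782 zettabytes; the small difference from the 574 figure cited above comes down to rounding.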

However it is calculated, who or what is generating all this data? Current thinking suggests the major sources are social media, eCommerce transactions, and Internet of Things (IoT) sensors. Given the number of social media platforms, the first one is quite understandable. eCommerce transactions include banking, online payments, business transactions, etc., so that one seems quite plausible as well.

But what about IoT? Is that really a major contributor? Considering that sensors are now placed in numerous devices, it starts to make sense. While many electronic devices have sensors (e.g., TVs, thermostats, smart home hubs, etc.), vehicles are becoming major contributors as well. At a recent conference, one manufacturer related that a single long-haul, heavy-duty truck can generate roughly 20 GB of data every minute. Meanwhile, a passenger car can generate between 5 and 25 GB of data per hour. Another ton of data.

So, while the variety of devices that generate data may surprise you, the bulk of that data is still unstructured (e.g., images, videos, email, instant messages, presentations, etc.).

Why should you care? Your traditional relational database does not “play well” with unstructured (or semi-structured) data. If you want to derive value from that data, you need to consider another database system; such systems do exist. You also may need to consider increasing your analytical staff to effectively mine the data (AI may be able to help). There is value in the data, but it takes analytical skill to recognize it.

Back to the phrase Big Data and understanding the amount of data it initially represented: The question to ask is, should we use another phrase? Perhaps Enormous Data or Gargantuan Data or Stupendous Data? Gargantuan Data is a mouthful, but maybe it is the next phrase to use. AI needs data. Given the amount we generate, there is likely enough to feed it. We just need to understand what is collected and how to safely use it.


Dan Kempton is the Sr. IT Advisor at North Carolina Department of Information Technology. An accomplished IT executive with over 35 years of experience, Dan has worked nearly equally in the private sector, including startups and mid-to-large scale companies, and the public sector. His Bachelor’s and Master’s degrees in Computer Science fuel his curiosity about adopting and incorporating technology to reach business goals. His experience spans various technical areas including system architecture and applications. He has served on multiple technology advisory boards, ANSI committees, and he is currently an Adjunct Professor at the Industrial & Systems Engineering school at NC State University. He reports directly to the CIO for North Carolina, providing technical insight and guidance on how emerging technologies could address the state’s challenges.

Photo by Steve Johnson on Unsplash
