NASA has a lot of data.
Understatement, right? In pursuit of its exploration mission, NASA has generated, collected and compiled a vast amount of data through the eyes of satellites, telescopes, robots, spacecraft, wind tunnels, laboratories and the cameras of astronauts. That data has helped us better understand Earth, other planets, and the depths of space. In the time it took you to read this sentence, NASA gathered more than 2 gigabytes of data from our nearly 100 currently active missions. And that’s just the current remote sensing data flow.
“The goal is to turn data into information, and information into insight.” — Carly Fiorina, former President and Chair of Hewlett-Packard
This data — from enormous technical and scientific datasets, to records, reports, simulations, videos, images and personal stories — is one of our Agency’s most valuable assets and must be understood and managed accordingly. For NASA, mission success requires us to make sense of all this data and to find new ways to effectively access, manage, scale, interpret and analyze this data for insights, decisions and discoveries. It’s not as easy as it sounds, though.
A wicked problem
The management of all this data is what we call a wicked problem. A wicked problem is defined as something that is difficult or impossible to solve because of incomplete, contradictory, and changing requirements. In government, there are many wicked problems, such as balancing the federal budget or delivering everyday services, but the management of data may be the most wicked of them all.
No clear definition
Wicked problems have no clear definition and their root causes are complex and tangled. The average engineer at NASA regularly interfaces with data stored on any number of devices such as servers, desktops, laptops, and mobile devices; data embedded in labs, facilities, and space vehicles; data used in software, databases, file systems, websites, and application-specific local storage; or data generated by everyday business operations such as communication messages, network packets, and vehicle commands/telemetry.
Radically different worldviews
The many different people involved in solving a wicked problem usually have radically different worldviews and approaches, making the situation even more difficult. At NASA, we have four (diverse) mission directorates, eleven Distributed Active Archive Centers, hundreds of active SBIRs, and more than 100 active missions collecting terabytes of data. This data is generally collected and stored at the project level (sometimes on the laptop of the principal scientist) and often according to the practices of their home institution. Even the vocabulary we use to talk about data depends on whether you are a scientist, an engineer, a business unit, an operations team or any number of other institutional stakeholders. Essentially, everyone is involved in one way or another.
An under-equipped workforce
Many organizations simply lack the internal expertise to address their data needs, which limits their ability both to address the underlying systemic issues and to develop that expertise and capacity in-house. Without the understanding that comes with data science expertise, it’s common to view data as synonymous with information. And because the processes required to turn the former into the latter are invisible to people without that expertise, insight is often “independently acquired.”
Constantly changing technology
Making sense of the data universe is particularly challenging in more mature organizations because they are often working with huge, outdated systems that have never evolved, they have no budget to update those systems, and many of their decision makers do not see what the issue is. Think about flying a spacecraft using 1990s technology, without administrative rights on your machine, while my four-year-old at home can fly a drone using an iPhone.
The need for ongoing experimentation and learning
Ultimately, a solution to a wicked problem depends on how the problem is framed and vice versa. Wicked problems are never really solved definitively and require ongoing experimentation and learning. This is difficult in the context of large organizations like government agencies because of the continuously changing constraints and resources.
It is no wonder that it is so hard to build and maintain a stable capacity to access, manage, scale, interpret and analyze data for insights, decisions and discoveries. However, we believe the fundamental reason that most agencies are well behind in addressing their wicked data problem is not that they are unable to procure the right tools, or hire the right contractors, or write a compelling policy… but that they fundamentally neglect to develop and implement a system-wide data strategy.
We need a data strategy?
Gary Pisano, in a recent article in the Harvard Business Review, noted that “a strategy is nothing more than a commitment to a set of coherent, mutually reinforcing policies or behaviors aimed at achieving a specific competitive goal. Good strategies promote alignment among diverse groups within an organization, clarify objectives and priorities, and help focus efforts around them.” Most government agencies have mission and even business strategies; we regularly line up various functions to support those strategic objectives. Operations always present trade-offs, and strategy defines how we make those decisions and how we pursue the same priorities.
“Having a strategy suggests an ability to look up from the short term and the trivial to view the long term and the essential, to address causes rather than symptoms, to see woods rather than trees.” — Lawrence Freedman, Strategy: A History
Any organization’s fundamental capacity to use data is derived from a data system. This system is a coherent set of interdependent processes and structures that dictates how the organization organizes and searches its resources, defines its problems and solutions, synthesizes ideas into business concepts and mission designs, and selects which projects get funded. Approaching these challenges necessitates both an understanding of the data lifecycle and a unified strategy that reaches across centers, missions, technical disciplines, business functions and the broader scientific community.
If the system isn’t sustainable, the agency will never develop a true data capability. A successful strategy reshapes the whole system to have impact: it identifies relevant stakeholders, gains their support, enables their communication and collaboration, and makes the subsequent processes flexible enough to adapt when mission needs, technical inputs and requested outputs change.
Your data strategy should begin by answering these questions:
- What data does your agency currently have? What data do you wish you had?
- Where is your data located?
- How will this data create value for each level and sector of the organization?
- What types of data will allow the agency to create and capture value, and what resources should each type receive?
- How will the agency apply and operationalize the value its data creates?
- Are there emerging analytic capabilities within your organization that you can leverage for the good of the whole?
- How conducive are your standards and architecture to supporting your data lifecycle and meeting your analytics needs?
- What partnerships can the agency leverage to permit collaborative development?
- Do you have the right skill set within your agency to create value from this data? If not, what will it take to attract it?
“Data is a precious thing and will last longer than the systems themselves.” — Tim Berners-Lee, inventor of the World Wide Web
Most critical to developing a data strategy is establishing the mission focus of your agency — whether exploring space, balancing budgets, or delivering other services — as the driving force to explore new data techniques and derive increasing value from the data that you are producing. Your mission strategy and data strategy must be intertwined through a shared goal that ties back to strategic initiatives of the missions and programs.
Your data strategy should help you manage the day-to-day trade-offs of governance and support improved decision-making. It should focus first on the data, then on shaping a flexible system around that data, not the other way around. The system will need to evolve as the inputs and stakeholders shift, even though government isn’t always conducive to developing flexibility.
The volume, variety and velocity of data generation are all enormous and growing, and current strategies and approaches are clearly unable to cope with the challenges. Your organization, like ours, will need to change the way it acquires, plans, manages, shares and governs its data to fully realize that data’s potential, and we need that potential if we want to achieve our goal of pioneering the future.
The key to developing a data strategy for your organization is to start somewhere and do something.