Federal agencies are wrestling with a fundamental Big Data problem: the volume of data they generate grows faster every year.
Tommy Gardner, Chief Technology Officer for HP Federal, said this is a key challenge facing data scientists. “You can restrict your data in some sense, or you can run multiple calculations on multiple parts of data, almost in parallel processing mode,” he said. “But you need to find some way to get your data processed.”
Sending data to the cloud can alleviate the problem, but it is an expensive approach: it is bandwidth-intensive, and the added latency undermines time-sensitive analysis.
Some examples of huge data flows in government include:
- Climate readings such as temperature, humidity, precipitation amounts, wind speeds and cloud formation
- Satellite imagery for everything from weather forecasting to military activities and intelligence-gathering
- Generated results, such as applying algorithms to real-world information to create multiple hypothetical scenarios based on changing sets of assumptions
At the same time, agencies can use that data to meet their unique mission requirements if they can harness data science techniques and emerging technologies such as artificial intelligence and virtual reality. But they face the limitations of legacy IT infrastructure and must carefully allocate scarce resources, such as supercomputer time.
The global pandemic has added another layer of complexity to these challenges by forcing many in the federal government to work remotely. They have not had seamless access to the computing resources integral to doing their jobs. IT teams throughout the government have spent the past year scrambling to find ways to securely deliver capabilities so agencies can continue to meet their mission requirements.
The government now plans to support an elevated level of remote working in the future, even once the pandemic subsides and workers can return to their offices – which means continued emphasis on distributed high-end computing capabilities.
The Solution: High-Performance Workstations
Agencies are finding that high-performance workstations offer a powerful alternative. While a workstation can’t handle the volume of data that a supercomputer can, high-end workstations can split a large dataset into much smaller subsets, process those subsets in parallel and recombine the results into a single answer, all from the comfort of one’s office or home.
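That description boils down to a split-process-recombine pattern: divide a large dataset across the workstation’s cores, analyze each piece independently, then merge the partial results. The sketch below illustrates the pattern in Python; the simulated dataset, the number of worker processes and the per-chunk statistic are illustrative assumptions, not details from the article.

```python
# Minimal sketch of the split -> process -> recombine pattern on a multi-core workstation.
# The simulated readings, worker count and per-chunk statistic are illustrative assumptions.
from multiprocessing import Pool
import random

def summarize_chunk(chunk):
    """Analyze one subset of the data; a simple mean stands in for real analysis."""
    return sum(chunk) / len(chunk)

def split(data, n_chunks):
    """Divide the full dataset into equal-sized subsets, one per worker."""
    size = max(1, len(data) // n_chunks)
    return [data[i:i + size] for i in range(0, len(data), size)]

if __name__ == "__main__":
    # Stand-in for a large collection of readings (e.g., temperature observations).
    readings = [random.gauss(15.0, 5.0) for _ in range(1_000_000)]
    chunks = split(readings, n_chunks=8)

    # Each core processes its own subset in parallel...
    with Pool(processes=8) as pool:
        partial_means = pool.map(summarize_chunk, chunks)

    # ...and the partial results are recombined into a single answer.
    # (Averaging the chunk means is valid here because the chunks are equal-sized.)
    overall_mean = sum(partial_means) / len(partial_means)
    print(f"Overall mean across {len(chunks)} chunks: {overall_mean:.2f}")
```

Real workloads would swap in the agency’s own data and analysis code, but the shape of the workflow stays the same: the workstation’s cores do the heavy lifting locally instead of shipping raw data to the cloud or waiting for supercomputer time.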
“The high-end workstation is probably one of the most valuable tools that HP makes for the world to do their jobs and to get things done correctly, accurately, systematically and safely,” said Gardner. “And at the same time, they can be utilized to make decisions that in the past we just weren’t able to make.”
These new high-end workstations offer benefits such as massive storage, which lets users keep large quantities of data on hand for analysis; very high-end graphics suitable for image-intensive applications such as design, architecture and video editing; and significantly faster processing through high-end microprocessors and multiple cores. Those benefits translate into lower operating costs by minimizing the use of cloud resources and improving workflow, and they can deliver useful insights far sooner than waiting for supercomputer cycles.
Depending on the user and application, this also opens up the prospect of shifting analysis to the point of collection. For instance, connecting field sensors to a workstation lets researchers process many different types of data as they are recorded, enabling timely analysis and the ability to spot trends in real time. Only high-end workstations can support the level of distributed, powerful computing that such demanding scientific applications require.
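As a rough illustration of processing sensor data as it is recorded, the sketch below watches a simulated sensor stream and flags when a short-term rolling average drifts away from a longer-term baseline. The feed, window sizes and drift threshold are all assumptions made for the example, not anything prescribed by the article.

```python
# Minimal sketch of at-the-point-of-collection processing: analyze readings as they arrive.
# The simulated feed, window sizes and drift threshold are illustrative assumptions.
from collections import deque
import random

def sensor_feed(n_readings=500):
    """Stand-in for a field sensor; yields one temperature-like reading at a time."""
    for i in range(n_readings):
        drift = 0.02 * i  # a slow upward trend baked in so the example has something to find
        yield 15.0 + drift + random.gauss(0, 0.5)

def monitor(readings, short_window=20, long_window=200, threshold=1.0):
    """Flag a trend when the short-term average drifts from the long-term baseline."""
    short = deque(maxlen=short_window)
    long_term = deque(maxlen=long_window)
    for i, value in enumerate(readings):
        short.append(value)
        long_term.append(value)
        if len(long_term) == long_window:
            recent = sum(short) / len(short)
            baseline = sum(long_term) / len(long_term)
            if abs(recent - baseline) > threshold:
                print(f"Reading {i}: trend detected "
                      f"(recent {recent:.2f} vs baseline {baseline:.2f})")
                return

if __name__ == "__main__":
    monitor(sensor_feed())
```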
“The edge is really where data is being collected,” Gardner said. Moving processing closer to it is critical for the future.
This article is an excerpt from GovLoop’s report, “Power Tools to Step Up to the Federal Data Challenges.” Download the full report here.