This blog post is an excerpt from GovLoop’s recent guide, How You Can Use Data Analytics to Change Government. Download the full guide here.
Imagine that you’re an analyst at the Defense Information Systems Agency. Your job is to detect and investigate anomalous network traffic that may be nefarious, including malicious insiders and nation states seeking to steal classified data and wreak havoc on defense systems.
Keep in mind that in fiscal 2014 alone, federal agencies reported nearly 70,000 security incidents. But that number doesn’t include the billions of unsuccessful hack attempts that agencies had to thwart each month.
As if an analyst’s work weren’t hard enough, imagine having to monitor myriad systems, looking for red flags that may or may not be the work of bad actors. Not that long ago, DISA analysts used a manual process to spot anomalous behavior and draw connections between various system alerts.
“In a lot of cases, it’s either missed or they get tips and cues from something else,” alerting them to dig deeper, said Dan Bart, Chief of Cyber Situational Awareness Systems. For example, if a firewall log alerted an analyst to anomalous activity, he or she could then open an investigation and look at other sensors to try to draw connections from the data. That was the old way of doing business.
About 18 months ago, DISA deployed a new anomaly detection system that in some respects provides those connections for analysts. Rather than sending alerts that something may not be right, the new system lets analysts know when something is not right, Bart said.
Consider how many end users likely log on to their computers outside of normal business hours. With the rise of telework and flexible schedules, it’s hard to say what a “normal” workday looks like for any given employee. But analysts have to distinguish anomalous activity that is benign from activity that is truly suspicious.
What’s far more valuable for analysts is the ability to easily correlate multiple data sets, such as what time users are logging on to their computers, how that compares with their normal computer use, and what they were downloading or uploading, said Christopher Paczkowski, Cyber Situational Awareness & Analytics Division Chief.
“You can start to add A, B, and C together to say, ‘Alright, this indicator is a lot higher priority than the rest of them,’” Paczkowski added. “So now you’re giving the analyst an ordered list of priorities to look into,” instead of following up on every user who logged in after regular business hours.
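The “add A, B, and C together” idea can be sketched in a few lines of Python. This is an illustrative example only, not DISA’s actual system: the indicators, weights, and scoring function below are all hypothetical, chosen to show how correlating several weak signals per user yields one ordered triage list instead of a flood of individual alerts.

```python
# Hypothetical sketch: correlate several indicators per user into one
# priority score, then hand the analyst a ranked list. All field names
# and weights are illustrative assumptions, not a real system's values.
from dataclasses import dataclass


@dataclass
class UserActivity:
    user: str
    after_hours_login: bool    # indicator A: logged in outside business hours
    unusual_volume_mb: float   # indicator B: transfer volume above baseline
    new_device: bool           # indicator C: connected from an unseen device


# Assumed weights; a real system would tune these continuously.
WEIGHTS = {"after_hours": 1.0, "volume": 2.5, "device": 1.5}


def priority_score(a: UserActivity) -> float:
    """Combine the individual indicators into a single score."""
    score = 0.0
    if a.after_hours_login:
        score += WEIGHTS["after_hours"]
    # Large transfers relative to baseline weigh more heavily, capped
    # so one factor cannot drown out the others entirely.
    score += WEIGHTS["volume"] * min(a.unusual_volume_mb / 100.0, 3.0)
    if a.new_device:
        score += WEIGHTS["device"]
    return score


def triage(activities: list[UserActivity]) -> list[tuple[str, float]]:
    """Return (user, score) pairs, highest-priority first."""
    return sorted(((a.user, priority_score(a)) for a in activities),
                  key=lambda pair: pair[1], reverse=True)
```

The point of the sketch is the sort at the end: no single indicator triggers an alert on its own, but a user who trips all three rises to the top of the analyst’s queue.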
Analysts can work more efficiently because they aren’t flooded with a sea of alerts that realistically can’t all be addressed at the same time. This system has helped DISA prioritize anomalies and understand what data is leaving the network in an unauthorized way, Bart said.
The genesis of the program has some ties to the devastating leaks by former government contractor Edward Snowden about the National Security Agency’s controversial data collection processes. But it isn’t just insiders that the system can detect. The same anomaly detection can be used to identify a nation state worming its way into the network, or even a machine misconfiguration. “So if you have a number of different log alerts or data sets that are flagged, then you’re able to correlate those and start answering heuristic questions,” or noting patterns in behavior, Bart said. For example, if an analyst sees the same pattern with a user logging in at an odd time or from an odd place, the system provides an analytic tip or alert that will ultimately lead to a countermeasure.
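The odd-time login pattern Bart describes can be illustrated with a minimal baseline check. This is a hypothetical sketch, not the agency’s method: it simply flags a login hour that sits far outside a user’s own historical pattern, the kind of per-user heuristic that could feed an analytic tip.

```python
# Hypothetical sketch: flag a login whose hour deviates sharply from a
# user's own history. Thresholds and the simple z-score test are
# illustrative assumptions; a real system would be far more involved
# (e.g. circular statistics, since hour 23 and hour 0 are adjacent).
from statistics import mean, stdev


def is_anomalous_hour(history: list[int], login_hour: int,
                      threshold: float = 2.0) -> bool:
    """Return True if login_hour is more than `threshold` standard
    deviations from the user's mean historical login hour."""
    if len(history) < 2:
        return False  # not enough baseline to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return login_hour != mu  # perfectly regular user; any change stands out
    return abs(login_hour - mu) / sigma > threshold
```

For a user who consistently logs in around 9 a.m., a 3 a.m. login is flagged; for a user whose history is already scattered, the same 3 a.m. login might not be, which is exactly the benign-versus-suspicious distinction the analysts need.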
Humans decide what follow-up measures to take — not the system. “It doesn’t solve it,” Bart said. “It doesn’t catch the guy, but it does tip and cue to say, ‘Hey, this should be looked at.’”
Since its rollout about 18 months ago, the system has continued to evolve. The agency is working to fine-tune the algorithm that defines and prioritizes threat factors. Think back to the earlier example of the individual who repeatedly logs in after hours, at different times. While it may raise eyebrows, on its own that behavior is not uncommon or particularly suspicious.
If analysts combine that algorithm with different priorities such as data exfiltration, where the person logged in, what devices they used to connect to the network, and other behavior, then they can combine that information to get a prioritized rank of anomalous behavior. “Alerts used to be more compartmentalized, based on specific behaviors,” Paczkowski said. But now the focus is on tying that picture together to give analysts a more complete threat profile.