The human challenges of dealing with security alert backlogs [Q&A]
Most security teams experience some level of alert overload, struggling to separate the issues that demand immediate attention from those that are less pressing. This can lead to a backlog of problems to be dealt with and consequent stress on team members.
We talked to Yoav Nathaniel, CEO and co-founder of Silk Security, about why alert backlog is a people problem rather than a technological problem and how IT and security teams can overcome this challenge.
BN: Why do alert backlogs build up in the first place?
YN: The simple answer is that the number of alerts that detection tools generate far outpaces the ability of security teams to triage and resolve them. The broader picture is that the expanded attack surface also plays a role here, as does the growing number of vulnerabilities. However, there are compounding factors, especially duplicate alerts -- sometimes a single tool fires multiple alerts about the same issue, or overlapping tools report the same issue in different formats. Because security teams contend with a tool-centric view, rather than a set of de-duped, prioritized, and contextualized findings, they have to manually validate each alert. While alerts grow exponentially, security teams can only scale linearly by adding more analysts. The outcome is alert backlogs that reach well over a million, and often multiple millions.
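As a rough sketch of what that de-duplication step involves -- not Silk's actual implementation, and with field names invented purely for illustration -- alerts normalized to a common schema can be collapsed into one finding per asset-and-issue pair:

```python
from collections import defaultdict

def deduplicate(alerts):
    """Collapse alerts from multiple tools that describe the same issue.

    Each alert is assumed to be a dict already normalized to a common
    schema with 'asset', 'issue_id' (e.g. a CVE), and 'source_tool' keys;
    these field names are illustrative only.
    """
    findings = defaultdict(lambda: {"sources": set(), "alerts": []})
    for alert in alerts:
        key = (alert["asset"], alert["issue_id"])   # one finding per asset/issue pair
        findings[key]["sources"].add(alert["source_tool"])
        findings[key]["alerts"].append(alert)
    return findings

alerts = [
    {"asset": "web-01", "issue_id": "CVE-2023-1234", "source_tool": "scanner_a"},
    {"asset": "web-01", "issue_id": "CVE-2023-1234", "source_tool": "scanner_b"},
    {"asset": "db-02",  "issue_id": "CVE-2023-9999", "source_tool": "scanner_a"},
]
print(len(alerts), "alerts ->", len(deduplicate(alerts)), "findings")
```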
BN: How can increased risk awareness play a part in addressing backlogs?
YN: It's important to distinguish between reducing the backlog through consolidation, de-duplication, and correlation of alerts, and taking a risk-centric approach to yield prioritized findings. We have seen the consolidation process produce a 50-to-one reduction in the alert backlog, because of the volume of duplicate alerts. However, a reduced backlog doesn't immediately translate into better risk awareness. More layers of context are needed to prioritize based on risk, which means understanding severity, likelihood of exploitation, the asset's environmental context and profile, and how the asset relates to other assets as a component of the application. The goal should be to automate a scalable process that turns detection-tool output into findings that point to the most urgent risks for the team's environment and business.
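A simplified illustration of how those layers of context might combine into a single priority score -- the weights, tiers, and field names below are invented for illustration and are not Silk's model -- could look like this:

```python
def risk_score(finding):
    """Combine severity, exploit likelihood, and asset context into one
    priority score. Weights and factors are illustrative only."""
    base = finding["cvss"] / 10.0                        # normalized severity
    exploit = 1.0 if finding["known_exploited"] else finding.get("epss", 0.1)
    exposure = {"internet": 1.0, "internal": 0.6, "isolated": 0.3}[finding["exposure"]]
    criticality = {"crown_jewel": 1.0, "standard": 0.7, "dev": 0.4}[finding["asset_tier"]]
    return round(base * exploit * exposure * criticality, 3)

finding = {"cvss": 9.8, "known_exploited": False, "epss": 0.7,
           "exposure": "internet", "asset_tier": "crown_jewel"}
print(risk_score(finding))  # higher scores surface first in the remediation queue
```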
The other dimension to bear in mind is that security teams still rely on IT, engineering, development, and other stakeholders to perform remediation. Well-understood remediation workflows and integrations into those teams' daily toolsets to facilitate task communication are critical; otherwise, security teams are shifting -- rather than fixing -- the bottleneck for risk reduction.
BN: Is this more about people than technology?
YN: The two elements are intertwined. Without the right technology in place to make sense of alerts and translate them into findings that are meaningful in terms of contextualized risk, the 'people problem' will remain unresolved. We frequently encounter poor organizational dynamics because security teams struggle to establish credibility, often stemming from their inability to effectively communicate a set of prioritized remediation requests. However, once the security team is in a position to pass off remediation requests with the right level of context and remediation guidance, the dynamics shift. Most teams recognize the importance of security and understand that they share responsibility for managing risk -- they just need a better partner in security to help them do the right thing.
From an operationalization perspective, part of helping them do the right thing is also making sure that the remediation requests align with the teams' daily functions. Using the same tools for remediation workflows while centralizing reporting also helps security teams better identify where the process is efficient and effective, and where more direct collaboration is needed to drive improvements.
BN: What can management do to improve the culture around handling alerts?
YN: It takes two to tango: security teams need to better prioritize, contextualize, and operationalize if they want to transform the culture and dynamics. However, with the right set of processes in place to identify and communicate risk (as opposed to throwing alerts 'over the wall'), there is an opportunity to build better accountability for fixing the risk. Accountability is a critical lever for shifting the culture. Accurate and transparent reporting of assets, risks, ownership, controls, and processes can point out cybersecurity deficiencies and operational bottlenecks that may be preventing timely progress. Many organizations maintain a company-wide 'wall of shame' to highlight which organizational units are underperforming compared to their peers -- this can create a sense of competition and humility among the different units.
BN: What role does AI have to play in prioritizing backlogs?
YN: AI can play a significant role in the overall risk remediation process by automating manual tasks, improving the accuracy of prioritization output by learning from how humans interact with that output, and aiding in the effective communication of remediation steps. For Silk, this means using AI as an integral tool in our normalization and alert de-duplication process -- helping the system learn to better identify duplicate alerts, as well as assets that different tools represent in differing ways.
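Silk has not published the details of its matching models, but the underlying problem -- recognizing that two tools are reporting the same asset under different names -- can be illustrated with a basic string-similarity check (the normalization rules and threshold below are arbitrary, for illustration only):

```python
from difflib import SequenceMatcher

def same_asset(name_a, name_b, threshold=0.6):
    """Guess whether two tools are reporting the same asset under
    different names. A production system would learn from analyst
    feedback; this threshold is an arbitrary illustration."""
    a = name_a.lower().split(".")[0].replace("-", "")
    b = name_b.lower().split(".")[0].replace("-", "")
    return SequenceMatcher(None, a, b).ratio() >= threshold

print(same_asset("web-server-01", "webserver01.prod.internal"))  # True
print(same_asset("web-server-01", "db-cluster-07"))              # False
```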
One of the areas where we see security teams struggle and burn cycles in the remediation process is identifying who is responsible for a specific remediation task. Silk uses AI first to predict who the owner is, and then to refine the ownership assignment for remediation responsibility based on feedback.
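As a toy illustration of that feedback loop -- hypothetical names, and not Silk's actual model -- an initial guess can be refined as analysts confirm or reassign ownership:

```python
from collections import Counter, defaultdict

class OwnerPredictor:
    """Toy feedback loop: start from a fallback owner and refine the
    guess as analysts confirm who actually fixed each class of task.
    Purely illustrative; not Silk's implementation."""

    def __init__(self):
        self.history = defaultdict(Counter)   # tag -> Counter of confirmed owners

    def predict(self, tag, default="security-team"):
        counts = self.history[tag]
        return counts.most_common(1)[0][0] if counts else default

    def feedback(self, tag, confirmed_owner):
        self.history[tag][confirmed_owner] += 1

predictor = OwnerPredictor()
print(predictor.predict("payments-service"))          # no history yet -> fallback
predictor.feedback("payments-service", "team-payments")
predictor.feedback("payments-service", "team-payments")
print(predictor.predict("payments-service"))          # learned assignment
```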
We have also seen a positive impact from leveraging generative AI to provide remediation guidance. Silk generates a prompt for ChatGPT (removing any identifying data or sensitive information) for the specific vulnerability or required update, so the stakeholder responsible for remediation has clear guidance on the next steps.
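The exact prompt Silk builds is not described, but a minimal sketch of scrubbing identifying details before composing such a request -- with invented redaction rules, field names, and wording -- might look like this; the sanitized string is what would then be sent to ChatGPT:

```python
import re

def sanitized_prompt(finding):
    """Build a remediation-guidance prompt with internal identifiers
    scrubbed before it leaves the environment. Redaction rules and
    prompt wording are illustrative, not Silk's implementation."""
    description = finding["description"]
    # Strip obvious identifying details: IP addresses and internal hostnames.
    description = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", "<ip-redacted>", description)
    description = re.sub(r"\b[\w-]+\.internal\.example\.com\b", "<host-redacted>", description)
    return (
        f"Provide step-by-step remediation guidance for {finding['cve']} "
        f"on {finding['platform']}.\nContext: {description}"
    )

finding = {
    "cve": "CVE-2023-1234",
    "platform": "Ubuntu 22.04",
    "description": "Detected on 10.1.2.3 (app01.internal.example.com) running nginx 1.18.",
}
print(sanitized_prompt(finding))
```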
Image credit: solarseven/depositphotos.com