GenAI and its hallucinations: A guide for developers and security teams


With the rapid proliferation of Generative AI (GenAI), developers are increasingly integrating tools like ChatGPT, Copilot, Bard, and Claude into their workflows. According to OpenAI, over 80 percent of Fortune 500 companies are already using GenAI tools to some extent, whilst a separate report shows that 83 percent of developers are using AI tools to speed up coding.

However, this enthusiasm for GenAI needs to be balanced with a note of caution, as it also brings a wave of security challenges that are easily overlooked. For many organizations, the rapid adoption of these tools has outpaced the enterprise's understanding of their inherent security vulnerabilities. That gap tends to produce blanket blocking policies -- Italy, for example, temporarily banned ChatGPT earlier this year -- which is never the answer.

This misalignment could not only compromise an organization’s data integrity but also impact its overall cyber resilience. So, how should AppSec teams, developers, and business leaders respond to the security challenges that accompany the widespread use of GenAI?

The risks of GenAI for developers

For developers and AppSec teams, identifying and addressing the critical concerns of GenAI should be the first step in its adoption, with the proviso that we're still in the very early stages of understanding all the intricacies of the risks it poses.

A prevalent issue is "AI hallucinations," where Large Language Models (LLMs) generate false information with high confidence. This stems from the inherent nature of how these LLMs are built and trained. Some LLMs are trained on open-source data, and while that training data undergoes some degree of cleaning to remove biased or inaccurate material, it's a challenging problem to solve completely due to the subtlety and context-dependency of language.

These hallucinations can manifest in multiple ways, such as generating inaccurate code snippets, recommending incorrect security protocols, or even creating data that appears real but is, in fact, fictitious. If developers trust these outputs, they may unwittingly integrate them into the larger codebase, creating vulnerabilities in applications and networks and opening the door to supply-chain attacks. Additionally, machine-generated code creates a false sense of security: people tend to assume that if the AI generated the code, it must be more secure.
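One concrete way this plays out is with hallucinated dependencies: an assistant suggests a package name that does not actually exist, and an attacker can later register that name to poison the supply chain. The snippet below is a minimal, illustrative sketch (not a complete defense) of one sanity check -- verifying AI-suggested package names against PyPI's public JSON endpoint before installing anything. It uses only the Python standard library, and the example package names are hypothetical.

import sys
import urllib.error
import urllib.request

def package_exists_on_pypi(name: str) -> bool:
    # PyPI's public JSON endpoint returns 200 for registered packages and 404 otherwise.
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False

if __name__ == "__main__":
    # Usage: python check_deps.py <package> [<package> ...]
    # e.g. python check_deps.py requests totally-made-up-ai-lib  (second name is hypothetical)
    for pkg in sys.argv[1:]:
        found = package_exists_on_pypi(pkg)
        print(f"{pkg}: {'found on PyPI' if found else 'NOT on PyPI -- verify before installing'}")

A check like this does not catch typosquatted or malicious-but-real packages, so it complements, rather than replaces, normal dependency review.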

Additionally, privacy and data integrity are of paramount concern. Take the recent case where Samsung employees inadvertently leaked proprietary source code by pasting it into ChatGPT, prompting the company to restrict internal use of GenAI tools. Once data is fed into an LLM, extricating it is no small feat.

This complexity is highlighted by the findings of a Stack Overflow survey, which reports that 82.55 percent of developers are using AI tools for code writing, that 42 percent trust GenAI outputs, and that 32 percent reported productivity gains. The balance between benefits and risks is a fine line, and over-reliance on, and under-scrutiny of, GenAI can have severe consequences for developers and AppSec teams alike. In the end, technology will always find a way to advance, and it's our responsibility to make sure we understand and reduce the risks along the way.

Mitigating GenAI risks -- a balanced approach

Completely blocking GenAI is an ineffective strategy, much like shutting the barn door after the horse has bolted. Many CISOs have opted for a nuanced, "risk-driven" approach that involves diversifying their GenAI toolset, but there's no one-size-fits-all solution. For example, development teams often prefer a blend of Copilot and Copilot Chat for coding tasks, while broader organizational or business use cases may better align with ChatGPT. A multi-GenAI approach allows enterprises to maximize efficiency without pigeonholing various departments into using unsuitable tools.

Privacy remains a significant challenge for CISOs, and more enterprises are turning to "private instances" of GenAI solutions. Azure OpenAI was a pioneer in this realm, and OpenAI has now followed suit with its "ChatGPT Enterprise" offering. These private instances provide an extra layer of assurance that sensitive data won't be exploited.
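For developers, the practical difference is mostly a matter of where requests are routed. The sketch below is a hedged illustration of calling a private Azure OpenAI deployment rather than the public ChatGPT service; it assumes the openai Python SDK (v1+), and the endpoint, key, deployment name, and API version shown are placeholders, not prescribed values.

import os
from openai import AzureOpenAI

# Point the client at the organization's private Azure OpenAI resource instead of
# the public API. Endpoint, key, deployment name, and API version are placeholders.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # e.g. https://<your-resource>.openai.azure.com
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="my-private-gpt-deployment",  # the Azure deployment name, not a public model name
    messages=[{"role": "user", "content": "Review this function for input-validation issues."}],
)
print(response.choices[0].message.content)

The application code barely changes; what changes is that prompts and data stay within an endpoint the enterprise controls.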

In fact, GenAI is so disruptive that it's impacting the entire organization, not only the technology side. It has implications for the CFO, the CRO, and every other C-level role in an enterprise. If my financial controllers are uploading sensitive data to ChatGPT, for example, that creates a risk for the CFO. If my sellers are uploading sales opportunities to Bard for advice, that creates a risk for the CRO. Everyone is affected by it.

Reviewing the OWASP Top 10 for LLM Applications can provide valuable insight into the risks and into protective measures that already exist. Regulatory frameworks governing the use of GenAI are still in the very early stages of being formulated, but developers must still work within the scope and structures of existing guidelines such as GDPR. Note, however, that the OWASP Top 10 focuses mostly on the technological aspects of using GenAI, and, as stated already, GenAI has an impact on every department in the organization.

In Europe, the proposed EU AI Act aims to set out specific provisions on what should be considered unacceptable-risk, high-risk, and limited-risk systems, and will give developers and users 'clear requirements and obligations' on the use of AI.

Best practices for secure GenAI adoption

The key to navigating this complex landscape lies in robust DevSecOps processes. While there may be a sense that machine-generated code is inherently more secure, this couldn't be further from the truth. Existing DevSecOps processes must remain in place and be rigorously followed, regardless of the source of the code.
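In practice, this means AI-assisted code goes through the same gates as everything else. The following is a minimal sketch, assuming the open-source Bandit SAST tool is installed and the code lives in a "src" directory (both assumptions for illustration); it simply fails the pipeline when the scan reports findings, regardless of who or what wrote the code.

import subprocess
import sys

def run_sast_gate(path: str = "src") -> int:
    # Run the same static-analysis scan on every change, human- or AI-authored.
    # Bandit exits non-zero when it reports findings, which fails the pipeline step.
    result = subprocess.run(["bandit", "-r", path])
    return result.returncode

if __name__ == "__main__":
    sys.exit(run_sast_gate())

The specific scanner matters less than the principle: the gate applies uniformly, so machine-generated code earns no exemption from review.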

Also, business leaders and CISOs should put more emphasis on training developers on the security vulnerabilities and gaps of LLMs. GenAI undeniably has an immediate impact on the speed of development, enhancing efficiency almost instantly, but to ensure this doesn't create bigger long-term risks, education is critical. Developers must be equipped with the skills to recognize the limitations of GenAI, such as the potential for AI hallucinations and other biases.

With no industry standards in sight, the onus is on individual organizations to ensure responsible GenAI usage. Multiple tools exist to suit varying needs, but best practices involve a balanced blend of tool selection, employee education, and existing protective measures.

Ori Bendet is VP of Product Management at Checkmarx, where he oversees the entire AST portfolio, serving thousands of customers worldwide. He brings more than 17 years of senior-level experience to the role.
