


How To Mitigate The Enterprise Security Risks Of LLMs

Forbes Technology Council

Christopher Savoie, PhD, is the CEO & founder of Zapata AI. He is a published scholar in medicine, biochemistry and computer science.

Since ChatGPT came out last year, Large Language Models (LLMs) have been on the tip of every enterprise leader’s tongue. These AI-powered tools have promised to dramatically increase productivity by automating or assisting with the creation of marketing content, sales materials, regulatory documents, legal contracts and more—while transforming customer service with more responsive, human-like chatbots.

However, as these LLMs become increasingly integrated into business operations, enterprises should be aware of several potential security risks.

There are three layers to the security issues of LLMs:

1. Sharing sensitive data with an external LLM provider.

2. The security of the model itself.

3. Unauthorized access to sensitive data that LLMs are trained on.

Sharing Sensitive Data With External LLM Services

Back in May, Samsung was in the news for banning the use of ChatGPT and other AI chatbots after sensitive internal source code was shared with the service. Samsung feared the code could be stored on the servers of OpenAI, Microsoft, Google or other LLM service providers and potentially be used to train their models.

By default, ChatGPT saves users’ chat history and repurposes it to further train its underlying models. It’s possible this data could then be exposed to other users of the tool. If you use an external model provider, be sure to find out how your prompts and replies can be used, whether they are used for training, and how and where they are stored.

Many enterprises, particularly in regulated industries like healthcare or finance, have strict policies about sharing their sensitive data with external services. Sharing data with an externally hosted LLM provider is no exception. Even if data isn’t inadvertently shared with other users of these tools, customers have no recourse if the data they share with external LLM providers is hacked.

To avoid these risks entirely, enterprises should consider training and running their AI chatbot tools within their own secure environment: private cloud, on-premises—whatever the enterprise considers secure. This approach not only ensures LLM applications abide by the same security policies as the rest of the enterprise’s IT stack, but it also gives enterprises more control over the cost and performance of their models.
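For teams evaluating this route, the sketch below shows how little code a self-hosted setup can require. It is a minimal sketch, not a production serving stack: it assumes an open-weight model (the model name is illustrative) and the Hugging Face transformers library.

    from transformers import pipeline

    # Minimal sketch: run an open-weight chat model entirely inside your own
    # infrastructure, so prompts and replies never leave the enterprise network.
    # The model name is illustrative; use any open-weight model you are licensed to run.
    generator = pipeline(
        "text-generation",
        model="mistralai/Mistral-7B-Instruct-v0.2",  # downloaded once, then served offline
        device_map="auto",                           # use local GPUs where available
    )

    prompt = "Draft a two-sentence summary of our Q3 marketing plan."
    reply = generator(prompt, max_new_tokens=120)[0]["generated_text"]
    print(reply)

In practice, most enterprises would put a setup like this behind an internal API gateway so the model is subject to the same network policies and access logging as any other internal service.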

Model Security

Data security is one thing, but LLMs trained on your proprietary data are more valuable than the data itself. Once trained, these models are effectively a blueprint for the inner workings of your company and its strategy. Models contain not only your proprietary data but also the valuable insights hidden in that data.

To illustrate, imagine a company that builds an LLM tool to help create marketing content, training it on its customer data and business strategy documents. Such an LLM would be immensely valuable to a competitor and a strong motivator for corporate espionage aimed at exfiltrating the model. Furthermore, unlike exfiltrating petabytes of corporate data, exfiltrating the much smaller trained model is a far easier task for a hacker.

Model security is thus as important as, if not more important than, data security. Keeping your models in-house gives you more control over the security measures that protect them. You may also want to consider model obfuscation: in other words, making your models unintelligible without a separate decoding key. Think of it like encryption, but for LLMs.
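A simpler, widely available variant of this idea is encrypting model weights at rest so the file is useless without a separately stored key. The sketch below uses the Python cryptography library; the file paths are placeholders, and a real deployment would keep the key in a secrets manager rather than in the application.

    from cryptography.fernet import Fernet

    # Minimal sketch: encrypt a trained model's weights at rest so the file is
    # unintelligible without a separately stored key. File paths are illustrative.
    key = Fernet.generate_key()  # generate once; store in a secrets manager, not beside the model
    cipher = Fernet(key)

    with open("model_weights.bin", "rb") as f:
        encrypted = cipher.encrypt(f.read())

    with open("model_weights.bin.enc", "wb") as f:
        f.write(encrypted)

    # At load time, the serving process fetches the key and decrypts in memory only.
    with open("model_weights.bin.enc", "rb") as f:
        weights = cipher.decrypt(f.read())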

Unauthorized Access To Sensitive Data

When LLM-based chatbots are used in public-facing applications, anybody can become a hacker just by asking the chatbot a simple question if the right safeguards aren’t in place.

For example, NPR recently reported a case in which a hacker could access another person’s credit card information through an AI chatbot. All they had to do was tell the chatbot that their name was the credit card number they were looking for, then ask the chatbot what their name was.
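One basic safeguard for public-facing chatbots is to screen model output before it ever reaches the user. The sketch below redacts anything that looks like a valid payment card number from a reply; the function names are illustrative, and output filtering should complement, not replace, proper access controls upstream.

    import re

    # Minimal sketch of an output-side safeguard: redact anything that looks like a
    # payment card number from a chatbot reply before it reaches the user.
    def luhn_valid(digits: str) -> bool:
        """Standard Luhn checksum used to validate card numbers."""
        total, parity = 0, len(digits) % 2
        for i, ch in enumerate(digits):
            d = int(ch)
            if i % 2 == parity:
                d *= 2
                if d > 9:
                    d -= 9
            total += d
        return total % 10 == 0

    CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,19}\b")

    def redact_card_numbers(reply: str) -> str:
        def _mask(match: re.Match) -> str:
            digits = re.sub(r"\D", "", match.group())
            return "[REDACTED]" if luhn_valid(digits) else match.group()
        return CARD_PATTERN.sub(_mask, reply)

    print(redact_card_numbers("Your name is 4111 1111 1111 1111."))
    # -> "Your name is [REDACTED]."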

Even for internal-facing applications, you don’t want to give employees access to sensitive information or aggregated data they would not otherwise be able to see. If your data normally sits behind internal access controls, those controls can be bypassed when the sensitive data is used to train an LLM that any employee can query.
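One way to preserve those controls is to keep sensitive documents out of the model’s training data entirely and instead retrieve them at query time, filtered by the requesting employee’s permissions, in a retrieval-augmented setup. The sketch below illustrates the idea; the data structures, roles and matching logic are simplified placeholders, not a production design.

    from dataclasses import dataclass

    # Minimal sketch: enforce existing access controls *before* the LLM sees any data,
    # by retrieving only documents the requesting employee is cleared for.
    # The in-memory store, roles and keyword matching are illustrative; a real system
    # would use your existing identity provider and document store.
    @dataclass
    class Document:
        text: str
        allowed_roles: set[str]

    DOCUMENT_STORE = [
        Document("Q3 revenue forecast by region...", {"finance", "executive"}),
        Document("Public product FAQ...", {"finance", "executive", "support", "sales"}),
    ]

    def retrieve_for_user(query: str, user_roles: set[str]) -> list[str]:
        """Return only documents the user is already authorized to read."""
        return [
            doc.text
            for doc in DOCUMENT_STORE
            if user_roles & doc.allowed_roles and query.lower() in doc.text.lower()
        ]

    def build_prompt(query: str, user_roles: set[str]) -> str:
        context = "\n".join(retrieve_for_user(query, user_roles)) or "No accessible documents."
        return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

    # A support agent asking about revenue gets no finance documents in the prompt.
    print(build_prompt("revenue", {"support"}))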

Generally, if sensitive information is used to train an LLM, that information will be retrievable by whoever uses the LLM tool with the right prompt engineering. To mitigate this kind of unauthorized access, we need strategies and architectures that compartmentalize information and control access to it within the LLM framework.

LLMs can be a powerful tool for enterprises across business functions and industries. But without the right security and data protection measures in place, they could quickly become a liability. Don’t be distracted by the dazzle of a hot new technology: When building enterprise LLM applications, security should be front of mind, as it would be with any other new technology adoption.



