Why keeping old customer records could cost millions [Q&A]

The modern world thrives on data, but what happens when that data has outlived its usefulness? Legacy data can become a weak link in corporate security. These records don’t just take up space; they expand the exposure surface in a breach and can damage both finances and reputations.
We spoke with Rob Shavell, co-founder and CEO of DeleteMe, about why companies can’t afford to ignore legacy data and what they can do to address it.
BN: How does keeping outdated and unnecessary records increase the financial and reputational damage of a data breach?
RS: Think of old records as dead weight. They don’t generate value for the business, but they still create risk. Hackers don’t care if a database has been untouched for a decade. They just want to know if it contains Social Security numbers, financial histories, addresses, and the like. By hanging onto those outdated files, companies increase the amount of sensitive information available to steal, which magnifies both the financial and reputational fallout when a breach occurs.
Another problem is liability. The cost of a breach usually depends on how many individual records are exposed, so the more unnecessary data you keep, the bigger the bill when something goes wrong. What might have been a few hundred thousand dollars in damages can quickly turn into millions simply because no one bothered to delete outdated files.
And from the customer’s point of view, there’s no difference between ‘old’ and ‘new’ data. If their personal information gets exposed, they see it as a failure to protect them. That loss of trust often hurts more than the fine itself.
BN: Are there any examples of how legacy data has contributed to a major compliance fine?
RS: Absolutely, and we’ve seen how this plays out more than once, especially under the EU’s General Data Protection Regulation (GDPR). GDPR gives regulators the authority to levy steep fines when companies fail to protect personal information or retain more of it than they need.
A good example is the 2018 British Airways breach. Attackers gained access to the data of about 400,000 customers, and regulators noted that much of that information should never have been stored in the first place. Marriott faced a similar situation in 2018, when a breach exposed records it had inherited through its acquisition of Starwood Hotels: along with the business came massive stores of unmanaged, outdated data that turned into a liability when hackers broke in.
That’s the problem with legacy data: it’s often inherited, forgotten, and unsecured. And when regulators show up after a breach, they’re not just asking, ‘How did this happen?’ They’re also asking, ‘Why were you keeping all of this in the first place?’ If you don’t have a good answer, the penalties can escalate quickly.
BN: What is ‘data minimization’ and should it be considered a fundamental principle of modern data governance and risk management?
RS: Data minimization sounds complicated, but it’s really straightforward: collect only what you need, keep it only as long as you need it, and delete it once it’s no longer useful. In practice, that could mean asking for an email address instead of a full mailing address, setting automatic expiration dates for customer files, or building retention rules into HR and financial systems so that records don’t linger indefinitely.
The problem is, a lot of companies still operate on a ‘more is better’ mindset. They hoard data because they think it might be valuable someday. That might have made sense twenty years ago, but today it’s outdated and dangerous. The more you hold onto, the bigger your attack surface, and the harder it is to protect.
Companies should treat data minimization as a core principle of governance. It lowers your exposure, reduces compliance headaches, and even saves on storage costs. But the bigger shift is in how you think about data. It’s not always an asset. Sometimes it’s a liability, and that’s a lesson too many companies learn the hard way.
BN: What role do artificial intelligence and machine learning have to play in identifying and managing legacy data?
RS: AI can definitely help here, but let’s be clear: it’s not magic. Most companies don’t even know how much legacy data they have, let alone where it’s sitting. It’s usually scattered across old databases, forgotten file servers, cloud archives, even employee laptops. Trying to audit all of that manually is almost impossible.
What AI and machine learning can do is help sort through the mess. They can automatically scan systems to locate forgotten datasets, classify records by type, and flag duplicates so they don’t keep multiplying across platforms. Natural language processing tools can read through unstructured files -- like PDFs and emails -- to spot sensitive details such as Social Security numbers or credit card data. Usage analytics can highlight which records employees are actually accessing versus which have sat untouched for years.
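The pattern-matching piece of this can be illustrated in a few lines of Python. This is a hedged sketch, not a production classifier: the regular expressions below are simplified assumptions, and real discovery tools pair patterns like these with context checks and validation to keep false positives down.

```python
import re

# Illustrative patterns for a first-pass scan of unstructured text.
# Real tools add checksum validation and surrounding-context rules.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def scan_text(text: str) -> dict[str, list[str]]:
    """Return every match of each sensitive-data pattern found in text."""
    return {label: rx.findall(text) for label, rx in PATTERNS.items()}
```

Running a scanner like this over exported files gives a rough map of where sensitive details are hiding, which is exactly the kind of triage that makes a manual audit tractable.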
But, and this is important, technology isn’t going to fix the problem by itself. If the culture inside the company is still ‘save everything forever,’ then AI just becomes another layer of complexity. You still need strong policies, executive buy-in, and a real commitment to minimizing data. Otherwise, you’re just automating bad habits.
BN: Are there practical steps that organizations can take to start addressing their legacy data?
RS: Absolutely. The key is to start small and stay consistent. The first step is just knowing what you’ve got. You can’t secure or delete data you don’t even know exists, so an inventory is essential. Many companies start with a data map or use automated discovery tools that scan across databases, file shares, and cloud systems to identify what’s out there.
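As a rough illustration of automated discovery, a first-pass inventory can be as simple as walking a file share and recording anything that looks like a data store. The extension list here is an assumption made for the sketch; commercial discovery tools inspect file contents, not just names.

```python
import os

def inventory(root: str,
              extensions: tuple[str, ...] = (".csv", ".db", ".pst", ".bak")
              ) -> list[tuple[str, int]]:
    """Walk a directory tree and list files whose extensions suggest
    stored records, along with their sizes -- a crude starting point
    for a data map, nothing more."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(extensions):
                path = os.path.join(dirpath, name)
                found.append((path, os.path.getsize(path)))
    return found
```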
Once you have that picture, you need clear retention policies. Work with compliance and legal to decide how long certain types of information should stick around -- maybe three years for customer support tickets, seven for tax records, and so on. Then build those rules directly into your systems so data automatically expires instead of piling up.
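A retention schedule like the one described above can be sketched as a small rules table with computed expiry dates. The record types and periods here are illustrative examples, not recommended policy; real schedules come from legal and compliance review.

```python
from datetime import date, timedelta

# Illustrative retention periods keyed by record type.
RETENTION_DAYS = {
    "support_ticket": 3 * 365,  # e.g. three years for support tickets
    "tax_record": 7 * 365,      # e.g. seven years for tax records
}

def expiry_date(record_type: str, created: date) -> date:
    """Date after which a record should be automatically purged."""
    return created + timedelta(days=RETENTION_DAYS[record_type])

def due_for_deletion(record_type: str, created: date, today: date) -> bool:
    """True when a record has outlived its retention period."""
    return today > expiry_date(record_type, created)
```

Wiring a check like `due_for_deletion` into a scheduled job is what turns a written policy into data that actually expires instead of piling up.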
When it’s time to delete, do it properly. Too many companies just shuffle data into cold storage -- low-cost systems where files sit unused but still accessible -- and assume that’s good enough. It isn’t. Sensitive records should be permanently and verifiably erased when they’re no longer needed. That might mean using secure erase software for digital files, wiping or shredding old hard drives, or working with certified providers to destroy physical media. Automation can help here too: modern tools can classify files, apply retention timers, and even generate audit trails so you can prove compliance if regulators ask.
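A minimal sketch of deletion with an audit trail, assuming plain files on a traditional disk: overwrite the file before unlinking it, then append a log entry. Note the caveat in the comments: on SSDs and copy-on-write filesystems an in-place overwrite does not guarantee the old blocks are destroyed, which is why certified tooling and media destruction exist.

```python
import hashlib
import json
import os
from datetime import datetime, timezone

def secure_delete(path: str, audit_log: str, passes: int = 3) -> None:
    """Overwrite a file with random bytes, unlink it, and append an
    entry to a JSON-lines audit trail. Illustrative only: on SSDs and
    copy-on-write filesystems, overwriting in place may leave old
    blocks recoverable; certified erasure tools are the safer route."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))
            f.flush()
            os.fsync(f.fileno())
    os.remove(path)
    with open(audit_log, "a") as log:
        log.write(json.dumps({
            "path": path,
            "sha256_before": digest,
            "passes": passes,
            "deleted_at": datetime.now(timezone.utc).isoformat(),
        }) + "\n")
```

The audit line is the piece regulators care about: it lets you show not just that data is gone, but when it was removed and under what policy.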
And that brings us to culture. Employees need to understand that hanging onto excess data is a risk, not a safety net. Training sessions, regular reminders, and even simple dashboards that show how much data has been deleted versus stored can reinforce that point. Ultimately, minimizing data has to become part of everyday habits, not just an IT project.
Legacy data is one of the most overlooked risks that companies face. It doesn’t show up on balance sheets, but it can determine whether a breach becomes a manageable incident or a major crisis. If you don’t get rid of what you don’t need, you’re basically storing future liabilities on your own servers. The choice is simple: deal with it now, or deal with the consequences later.
Image credit: Andreus/depositphotos.com
