How fake data can help to combat breaches [Q&A]
September this year marked five years since the notorious Equifax data breach, which exposed the Social Security numbers, birthdates, credit card details, and other personal information of millions of customers.
But how much has the industry learned from this breach? And what measures can be used to help avoid similar issues in the future? We spoke to Ian Coe, co-founder at Tonic.ai, to find out why fake data might be the answer.
BN: How has the financial industry responded in the wake of the Equifax breach?
IC: It seems more financial institutions are putting safeguards in place to protect their customer data. We've seen significant moves made by businesses to protect sensitive data, whether it be network infrastructure upgrades or steps to improve data protection within their data pipelines. In our case, the customers we work with on a daily basis are improving their data privacy habits by integrating masked or synthetic data into their development and testing processes. It's a critical solution for minimizing the exposure of customer data to internal teams and reducing the risks of a leak or breach. At the same time, it keeps development teams running smoothly, and often more efficiently, thanks to the ease with which they can source quality, safe data.
BN: Exactly what is 'fake data'?
IC: Fake data (also known as synthetic data) can mean many things, depending on the techniques used to create it. Ideally, fake data is created to meet the degrees of privacy and quality required by a given use case. It can be as simple as dummy data that has no connection to real-world values, or as complex as high-fidelity synthetic data created using AI and deep neural networks. The former is very safe but not very useful where realism is required; the latter achieves the best of both worlds -- high-fidelity realism and strong data privacy. In the world of software, developers need access to safe, realistic test data that looks and behaves like their production data, to ensure they're catching bugs, accounting for edge cases, and delivering quality software with each release. This has been our focus at Tonic since day one. With our platform, developers can connect to their real-world data and transform it into fake or synthetic data that looks and feels like production data. Sourcing realistic data has long been a bottleneck for developers, so this solution eliminates data pipeline overhead and expedites the development process by enabling developers to rapidly source the realistic data they need to build and test their products.
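To make the two ends of that spectrum concrete, here is a minimal Python sketch. It is an illustration only and not Tonic's method: the function names `dummy_value` and `mask_preserving_format` are hypothetical, and the example value is made up. The first function produces dummy data with no connection to any real value; the second replaces each character with a random one of the same kind, so the result keeps the shape of the original (for example, the dashes in a Social Security number).

```python
import random
import string


def dummy_value(length=9, rng=random):
    """Dummy data: random lowercase letters unrelated to any real value."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def mask_preserving_format(value, rng=random):
    """Swap each digit or letter for a random one of the same kind.

    Punctuation and length are kept, so the output still looks like
    the original field even though the content is random.
    """
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)  # keep separators such as "-" or "@"
    return "".join(out)


# A seeded generator keeps the illustration repeatable; a real tool
# would more likely use a cryptographically secure source.
rng = random.Random(0)
placeholder = dummy_value(9, rng)
fake_ssn = mask_preserving_format("123-45-6789", rng)  # made-up input
```

Code that validates the field format can still be tested against `fake_ssn`, which is not true of `placeholder`. High-fidelity synthetic data goes much further than this, preserving statistical relationships across columns and tables.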
BN: How can it help prevent and deal with data breaches?
IC: When companies use fake data in pre-production environments, they minimize their data footprint and ensure that their sensitive customer data isn't sitting on a developer's laptop just waiting to be leaked or hacked. It removes a significant weakness in the chain of data security. When de-identification is done well, the data cannot be re-identified, so if a development team using masked data ever does experience a breach or leak, the fake data is what is at risk and not actual customer data. It's also worth noting that regulations like GDPR and CCPA have made using fake data in software development a requirement, wherever sensitive data is at risk. So regulatory compliance is a huge business driver for having fake data in place in software development processes.
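As a rough illustration of masking a record before it reaches a pre-production environment, the Python sketch below replaces sensitive fields with keyed-hash tokens. Everything in it is an assumption made for the example: the field names, the `tok_` prefix, the key, and the sample row are hypothetical, and keyed hashing is only one simple technique, not a description of any particular product. On its own it does not guarantee that data cannot be re-identified.

```python
import hashlib
import hmac

# Hypothetical set of fields treated as sensitive in this example.
SENSITIVE_FIELDS = {"name", "ssn", "email"}


def pseudonymize(value, key):
    """Return a deterministic token for value using HMAC-SHA256.

    The same input and key always give the same token, so joins
    between tables still line up in the masked copy.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:16]


def mask_record(record, key):
    """Copy a record, replacing sensitive fields with tokens."""
    return {
        field: pseudonymize(str(value), key) if field in SENSITIVE_FIELDS else value
        for field, value in record.items()
    }


# Made-up key and row, for illustration only.
key = b"example-key-not-for-production"
production_row = {"id": 42, "name": "Jane Doe", "ssn": "123-45-6789", "plan": "premium"}
test_row = mask_record(production_row, key)
```

If `test_row` leaks from a developer's laptop, the tokens are exposed rather than the name or Social Security number. The key has to stay out of the development environment, since anyone holding it can confirm guesses about the original values.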
BN: Is this applicable to sectors other than finance?
IC: Yes, absolutely. We have many customers across a range of industries, including ecommerce, healthcare, insurance, HR, marketing and ad tech, edtech, retail, and more.
BN: As we approach the peak season for online shopping, what should businesses be doing to protect their customers?
IC: The end of the year is a common time for companies to experience website crashes, breaches, and leaks. Businesses need to consider investing in preventative measures before the peak season for breaches and leaks arrives, whether it be upgrading their network security or using fake data to keep actual customer data from being exposed.