The many privacy concerns surrounding contact tracing efforts for COVID-19
Contact tracing is being touted as the best way to keep the novel coronavirus under control and avoid yet another country-wide shutdown, at least until a vaccine can be developed. The process has many benefits and was seen as integral to the success in staunching the spread of COVID-19 in a few different cities in Asia. However, there are quite a few hurdles to overcome in the process of implementing successful contact tracing efforts in the United States.
For starters, manual contact tracing requires a large workforce as well as time and money. Apps, on the other hand, can be implemented quickly and for a relatively low cost. This option seems simple and unproblematic enough at first glance but, upon further inspection, poses a threat to the privacy of any and all users. Despite their promise of aiding in the eradication of COVID-19, contact tracing apps, if left unregulated, put your personal information at risk and open up potential abuses of that information for decades to come.
When it comes to matters of healthcare, personal privacy protections have been in place since 1996, thanks to HIPAA. But because contact tracing is more a matter of public interest and doesn’t necessarily involve your health insurance provider or primary care physician, those protections don’t apply. Agreeing to an app’s terms and conditions without reading them closely, thinking you’re contributing to the good of your community by being open with your information, could leave all kinds of personal information compromised.
The fact of the matter is, contact tracing systems are most effective when they see widespread adoption. They see widespread adoption when there is an element of public trust, when those using the service can have faith that the information collected during the process will be ethically and appropriately used towards pandemic response management. To mishandle that collected data would be an infringement on an individual’s civil liberties and could have serious repercussions.
In order to gain the trust of potential users and ensure the effectiveness of the effort, contact tracing providers need to make data privacy the key focus of any solution, in a way that can be easily explained to the user. However, most solutions that espouse the privacy of the data they collect claim to have "de-identified" said data when, in fact, they haven’t fully de-identified anything. While it’s true that the traces of personally identifying information have been removed from the data, a study from MIT found that these solutions often pair the supposedly anonymous data with key metrics that can help re-identify which users the information comes from.
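To make the re-identification risk concrete, here is a minimal Python sketch of a linkage attack. It is an illustration only, not the method used in the study mentioned above. Every record, field name and the `reidentify` helper are hypothetical: the sketch assumes a "de-identified" data set that still carries ZIP code, birth year and sex, and a separate public list that contains those same fields alongside names.

```python
# Hypothetical "de-identified" contact tracing records: names are removed,
# but quasi-identifiers (ZIP code, birth year, sex) are still present.
deidentified = [
    {"zip": "27601", "birth_year": 1984, "sex": "F", "exposure": "2020-06-01"},
    {"zip": "27601", "birth_year": 1990, "sex": "M", "exposure": "2020-06-03"},
    {"zip": "27513", "birth_year": 1984, "sex": "F", "exposure": "2020-06-04"},
]

# Hypothetical public auxiliary data that includes names.
public_records = [
    {"name": "Alice Example", "zip": "27601", "birth_year": 1984, "sex": "F"},
    {"name": "Bob Example", "zip": "27601", "birth_year": 1990, "sex": "M"},
    {"name": "Carol Example", "zip": "27513", "birth_year": 1971, "sex": "F"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def key(record):
    """Build a lookup key from the fields both data sets share."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(deidentified_rows, public_rows):
    """Return (name, exposure) pairs for rows that match exactly one person."""
    index = {}
    for person in public_rows:
        index.setdefault(key(person), []).append(person["name"])

    matches = []
    for row in deidentified_rows:
        candidates = index.get(key(row), [])
        if len(candidates) == 1:  # a unique match re-identifies the row
            matches.append((candidates[0], row["exposure"]))
    return matches


matches = reidentify(deidentified, public_records)
for name, exposure in matches:
    print(f"{name} was likely exposed on {exposure}")
```

In this toy data, two of the three "anonymous" rows map back to a single named person, which is the kind of pairing the paragraph above warns about.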
That isn’t to say that collecting and using data is bad. The data acquired from contact tracing apps can support a myriad of functions, from helping first responders understand where potential pandemic 'hot spots' exist when answering emergency calls, to helping local, state and even federal disease response teams learn more about how the virus spreads. The data collected could be of great assistance in planning response strategies for hard-hit areas and in determining resource requirements and management.
Contact tracing could be seen as a civic duty to be undertaken and the best way to do your part to bring about the end of the pandemic. Trust in the technology being used is the only way to inspire that sentiment and instill a sense of responsibility in the public. If everyone could be part of the virus management, we’d all be working towards a solution to the problem at hand.
If the only way to ensure both widespread adoption of contact tracing technology and accurate pandemic response management is to ensure total data privacy, then the answer isn’t de-identified data. Instead, those looking to implement contact tracing apps should seek to adopt synthetic data. Utilizing synthetic data completely anonymizes any personally identifying information. The anonymization that comes from synthetic data is different from de-identification. Where a de-identification step simply scrubs the personal information from collected data, a synthetic data set consists of entirely different records that share the same statistical properties as the real data.
Synthetic data creates a digital twin of the original data, allowing any important statistics collected to be shared among doctors, hospitals, government officials or first responders. With accurate data showing where and when COVID-19 transmissions are occurring, these groups can accurately plan for surges in ICUs, vaccine demand by area or crowds at testing sites. Since it’s an entirely new set of data, and therefore cannot be reverse engineered, there’s no risk of exposing location information or other sensitive details about where, when and from whom the data was collected.
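The difference between scrubbing records and generating new ones can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not a description of any vendor’s product: the sample rows (hour of day, contact duration in minutes) are made up, the helper names are hypothetical, and the generator is a simple two-column Gaussian model. Real synthetic data tools use far richer models, and how much privacy they provide depends on the method.

```python
import math
import random
import statistics

# Hypothetical real records: (hour of day, contact duration in minutes).
real_rows = [
    (8.0, 12.0), (9.5, 20.0), (12.0, 35.0), (13.5, 30.0),
    (17.0, 45.0), (18.5, 50.0), (20.0, 15.0), (21.0, 10.0),
]


def fit_gaussian(rows):
    """Estimate means, variances and covariance of the two columns."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    var_x, var_y = statistics.variance(xs), statistics.variance(ys)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (len(rows) - 1)
    return mean_x, mean_y, var_x, var_y, cov_xy


def sample_synthetic(params, count, rng):
    """Draw new rows from the fitted model; no real row is copied."""
    mean_x, mean_y, var_x, var_y, cov_xy = params
    # Cholesky factor of the 2x2 covariance matrix, written out by hand.
    l11 = math.sqrt(var_x)
    l21 = cov_xy / l11
    l22 = math.sqrt(max(var_y - l21 ** 2, 0.0))
    rows = []
    for _ in range(count):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        rows.append((mean_x + l11 * z1, mean_y + l21 * z1 + l22 * z2))
    return rows


params = fit_gaussian(real_rows)
synthetic = sample_synthetic(params, 5000, random.Random(0))

print("real mean duration:     ", round(params[1], 2))
print("synthetic mean duration:", round(statistics.mean(r[1] for r in synthetic), 2))
```

The synthetic rows are drawn from the fitted model rather than copied from the originals, so aggregate figures such as the average contact duration stay close to the real ones while no individual record is passed along.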
We have the technology to do this and do this right. Contact tracing will be key in stopping the spread of the coronavirus, especially as we continue to see case numbers climbing with no vaccine in the near future. By putting contact tracing systems in place that have data privacy at their core, we can create a sense of trust in users and come together to bring an end to the pandemic.
Image credit: Robert Avgustin / Shutterstock
Dr. Mike Capps is the co-founder and CEO of Diveplane, an understandable AI startup committed to developing ethical and transparent AI technology to empower unbiased decision-making. Prior to Diveplane, Dr. Capps was the President at Epic Games, the company behind billion-dollar entertainment franchises like Gears of War and Fortnite. Mike holds master’s degrees in computer science and electrical engineering from UNC-Chapel Hill and MIT, and a doctorate in computer science from the Naval Postgraduate School.