Facing the business challenge of open data [Q&A]
We're all familiar with the term 'big data' but perhaps not with 'open data'. Open data is information that can be freely reused and distributed; it's often linked with mixing datasets too.
But what challenges does dealing with open data present for enterprises? We spoke to Mo Ladha, product manager at content services platform Hyland, to find out.
BN: What sorts of data are open, and how do they become so?
ML: There are a number of meaningful datasets that are being used by intelligence services, such as AI, to help organizations understand patterns and problems within larger domains, giving them the ability to better assess information and then solve specific problems. While there are a lot of different instances where open data is making an impact, the most notable for me are the industries and situations where it's truly helping solve challenges.
The first example that comes to mind is within healthcare, where we see a lot of research -- from genome projects through vaccination research -- available as datasets and published by companies (with the right anonymity). Open data allows these medical organizations to collaborate better, build solutions faster and solve some of the biggest challenges they are facing.
Open data is also fueling data management within academia. There is currently a lot of intelligence research coming out of PhD programs at established universities and incubators that are building algorithms based on open datasets. An example is information classification for content management and natural language processing (NLP). Several PhD programs and university incubators have been using open data to fuel algorithmic innovations from health tech to image recognition -- it's a fascinating space to watch as we see large investments being made by venture capital in these programs.
A final example that comes to mind, and an area that has been interesting, is the finance and banking industry, where open data is fueling initiatives such as financial trading through large-scale economic modelling, helping uncover patterns in large datasets.
BN: How can businesses understand the amount of open data they're collecting?
ML: This is a great question because it is a major challenge many organizations face due to the amount of data they ingest, manage and store daily. 'Dark data', or information an organization has stored but doesn't utilize and that isn't bringing it any value, is a huge problem and hard to quantify. Many organizations have approached us with the need to understand company information and how to apply governance. Businesses must have the right governance in place to review and understand the information they are collecting and storing.
The best way to address this problem is to better understand the data you're creating. Under ideal circumstances this happens at the point of data ingestion -- knowing what is being collected and the value it brings to the organization. Businesses should have a schematic of what data is important and where it should be stored so that it is easily retrievable when needed. Certain technologies, like content modelling with governance capabilities, can then help organizations understand the content and apply the correct compliance policies.
There are various approaches to information management with open data, but I think the best middle ground includes a solution that has analytics and reporting to see the business value each piece of content brings, and an open policy that allows an organization's customers to know what information is stored and how they can access it.
BN: Can the collection of open data lead to businesses falling foul of privacy laws?
ML: Yes, it can, especially when it comes to the ability to govern breaches of content. We are currently at the dawn of privacy in terms of information management. Right now data privacy is moving at a slow pace compared to the pace of content creation. Organizations must contribute to the governance conversation willingly and not just to meet privacy law. I believe data security should be front and center, led by organizations, because fines for non-compliance alone don't incentivize innovation in data protection.
Consumers are demanding access to their data. The right to be forgotten is an important initiative, but if you don't know where you (or your data) live, it's hard to manage. As a society we're definitely taking a step in the right direction, but there is a lot more we have to do to ensure businesses care about data management as an obligation to consumers.
BN: Looking ahead, are we going to see open data increase in importance?
ML: Absolutely. Information and data are growing at an exponential rate. There was a statistic shared a few years ago that stated 90 percent of the world's information was created in the last two years. The volume of information and content that has been created since is phenomenal, and growing every day.
Open data will fuel the next generation of technology advancement, and enable a better understanding of how society functions, not just what applications consumers use. We see data mining as a way to help analyze people's behaviors, from voter trends to online shopping to social interactions. That understanding comes through open data -- and understanding specific patterns, threats and innovation potential. Now that we have 'cheap' cloud compute and strong AI, we need clean and open data to solve real-world problems.
BN: Are businesses failing to recognize the value of open data?
ML: I think there is still a disconnect between the value of open data and its potential. Businesses need to have conversations about their ability to collaborate, emulating academia, to solve common problems and challenges. Open datasets provide a good bridge to guide these discussions, with the right level of governance of course. I don't believe these conversations are happening at the executive level yet. I'm seeing those discussions geared more toward IP ownership and the commercial advantage of open data rather than how you use that information to foster generational change.