Enterprise AI, ground truth, and the 'corona effect'
Nothing in our lifetimes has prepared us for what's happening in our world today. We've certainly had our share of major catastrophes in the past 100 years -- both natural and man-made -- but nothing matches the impact of the COVID-19 pandemic. We are living in a time when fundamental assumptions about how our societies function are being thrown out and rewritten with blinding speed.
The degree of global disruption is unprecedented in scope and scale, and we're still in the early phases. Given the confluence of medical, social, political, and economic factors, we have not yet reached the peak of the impact, and the world we'll inherit as the storm tide recedes will be significantly changed, and changeable. This is not to suggest that "the end is nigh" or that all changes wrought by the pandemic will be bad. But the undeniable truth is that we are experiencing an unexpected and extreme test of our AI technologies and their capacity to automate and improve our ability to make good decisions quickly in increasingly complex situations. With respect to AI, we are entering an especially critical phase.
The "truth" I'm focusing on here is what data scientists call "ground truth," whose dictionary definition is "factual data as ascertainable through direct observation rather than through inference from remote sensing." In data science circles, the term generally refers to the reality that underlies the data being fed into AI models in production, and the concern is any difference between the current ground truth and the ground truth reflected in the data on which machine learning models were trained.
For AI systems to produce meaningful and valuable results, it's essential that the AI algorithms be trained to discern patterns in the underlying data that are stable enough to inform appropriate actions across the broadest range of anticipated operating conditions. A simple example: Suppose your model for scheduling airline capacity was trained with data from a time when approximately 4 percent of international tickets matching certain criteria were cancelled in the 24 hours prior to departure. It's fair to assume that, absent a broader set of training data obtained through collection or simulation, the model is not going to make very good predictions today, because the ground truth today barely resembles what it was when the model was released to production.
In our simple example, if 80 percent of flights are being cancelled but the model still predicts 4 percent, that indicates a massive shift in ground truth, or in the "operating regime" in which the AI system is expected to perform. Data scientists have many techniques for detecting so-called "model drift" and its underlying causes, and where ground truth has shifted substantially they can start to address the model's deterioration by retraining with broader data more reflective of currently observable statistical parameters. Perhaps a retrained model will provide more accurate predictions, or perhaps the model needs more significant changes, such as new inputs like the number of and trends in COVID-19 cases reported in the departure and arrival cities and the actions taken by governments in the affected countries. In either case, time is of the essence, because until a retrained or updated model is deployed into production, the prior version keeps working away, making increasingly bad predictions.
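One simple way to monitor for the kind of shift described above is a proportion test that compares live outcomes against the training-time baseline. The sketch below is purely illustrative -- the function name, thresholds, and flight counts are assumptions, not part of the article's example beyond the 4 percent and 80 percent figures:

```python
import math

def drift_check(cancelled: int, total: int,
                baseline_rate: float = 0.04,
                z_threshold: float = 3.0) -> bool:
    """Flag drift when the observed cancellation rate deviates from the
    training-time baseline by more than z_threshold standard errors
    (a one-sample z-test for a proportion)."""
    observed_rate = cancelled / total
    std_err = math.sqrt(baseline_rate * (1 - baseline_rate) / total)
    z = (observed_rate - baseline_rate) / std_err
    return abs(z) > z_threshold

# The article's scenario: the model was trained when ~4% of flights
# were cancelled; suppose 400 of the last 500 flights (80%) were cancelled.
print(drift_check(cancelled=400, total=500))  # True -- drift flagged
```

In practice, monitoring tools track many such statistics over rolling windows and across input features, but the principle is the same: quantify how far current observations sit from the distribution the model was trained on, and alert when the gap is too large to ignore.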
Here's the rub: Suppose your data scientists have developed a new, retrained model that is delivering predictions much more in line with the new ground truth. Of course, you want to get that model into production as fast as possible -- in days or even hours if possible. But the reality is that most models never make it from the data science lab into production. And those that do can take weeks or months -- and in some regulated industries, up to a year -- to get there. There are many reasons for the delays, some of them technical, but most stem from inefficient business processes and organizational friction.
In short, most enterprises have not yet organized themselves around the principles of Enterprise AI, in which traditional business, actuarial, and optimization models are modernized to be driven by ML/AI algorithms and operationalized, automated, and governed at enterprise scale. The notion of Enterprise AI highlights a certain "ground truth" of its own: Models are very different from conventional software, and companies need to adjust accordingly if they're going to use AI effectively in a fast-changing world.
A new discipline is emerging in the large enterprise called ModelOps that, in ways analogous to (but different from) DevOps, combines process, technology, and organizational alignment to enable models to move quickly from data science into production -- without compromising visibility, operational control, or governance. When implemented as an enterprise-wide capability accountable to the CIO, ModelOps enables organizations to get new and updated models into production as fast as the ground truth is changing -- which, as we now know, can be much faster than we'd previously imagined. The alternative is to see AI investments squandered, or worse, to drive business decisions with models that no longer reflect the world we live in. This is the "Corona Effect," and those of us in the business of developing and using AI in the real world need to take heed.
For the moment, consider where the ground truth in your business has shifted (and will continue to shift) as the pandemic peaks, ebbs, and returns us to a "new normal." The ability to respond to these unanticipated and potentially dramatic shifts in your business's operating conditions as they occur is the ultimate goal of Enterprise AI.
Stu Bailey is the Co-Founder and Chief AI Architect of ModelOp. He is a technologist and entrepreneur who has been focused on analytic and data intensive distributed systems for over two decades. Stu is the founder and most recently Chief Scientist of Infoblox (NYSE:BLOX). While bringing successful products to market over the years, Stu has received several patents and awards and has helped drive emerging standards in analytics and distributed systems control. During his six years as technical lead for the National Center for Data Mining, Stu pioneered some of the very first analytic applications to utilize model interchange formats.