The future role of AI in fact checking
As an analyst, I’d like to have a universal fact checker. Something like the carbon monoxide detectors on each level of my home. Something that would sound an alarm when there’s danger of intellectual asphyxiation from choking on the baloney put forward by certain salespeople, news organizations, governments, and educators, for example.
For most of my life, we would simply have turned to academic literature for credible truth. There is now enough legitimate doubt to make us seek out a new model or, at a minimum, augment the academic one.
I don’t want to be misunderstood: I’m not suggesting that all news and education is phony baloney. And I’m not suggesting that the people speaking untruths are always doing so intentionally.
The fact is, we don’t have anything close to a recognizable database of facts on which to base such analysis. For most of us, this was supposed to be the school system, but sadly it has become increasingly politicized.
But even if we had the universal truth database, could we actually use it? For instance, how would we tap into the right facts at the right time? The relevant facts?
If I’m looking into the sinking of the Titanic, is it relevant to study the facts behind the ship’s manifest? It might be interesting, but would it prove to be relevant? Does it have anything to do with the iceberg? Would that focus on the manifest impede my path to insight on the sinking?
It would be great to have Artificial Intelligence advising me on these matters. I’d make the ultimate decision, but it would be awesome to have something like the Star Trek computer sifting through the sea of facts for that which is relevant.
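For illustration only, here’s a minimal sketch of that kind of relevance sifting, using plain TF-IDF similarity from scikit-learn to rank candidate facts against a research question. The tiny "facts" corpus is a stand-in; a real assistant would work at a vastly larger scale with far more sophisticated ranking.

```python
# Minimal sketch: ranking candidate "facts" by relevance to a question.
# TF-IDF cosine similarity stands in for whatever a real Star Trek-style
# assistant would use; the corpus here is purely illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

facts = [
    "The Titanic's manifest listed roughly 2,224 passengers and crew.",
    "The Titanic struck an iceberg at 11:40 p.m. on April 14, 1912.",
    "The lookouts in the crow's nest had no binoculars that night.",
]
question = "Why did the Titanic sink?"

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(facts + [question])

# Compare the question (last row) against each candidate fact.
scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
for score, fact in sorted(zip(scores, facts), reverse=True):
    print(f"{score:.2f}  {fact}")
```

Even this toy version would score the iceberg ahead of the manifest for that question, which is exactly the judgment call I’d want the machine to surface rather than make for me.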
Is AI ready? IBM recently showed that it’s certainly coming along.
Is the sea of facts ready? That’s a lot less certain.
Debater holds its own
In June 2018, IBM unveiled the latest in Artificial Intelligence, Project Debater, at a small event featuring two debates: "we should subsidize space exploration" and "we should increase the use of telemedicine". The opponents were credentialed experts, and Debater argued from a position established by "reading" a large volume of academic papers.
The result? From what we can tell, the humans were more persuasive while the computer was more thorough. Hardly surprising, perhaps. I’d like to watch the full debates but haven’t located them yet.
Debater is intended to help humans enhance their ability to persuade. According to IBM researcher Ranit Aharonov, "We are actually trying to show that a computer system can add to our conversation or decision making by bringing facts and doing a different kind of argumentation."
So this is an example of AI. I’ve been trying to distinguish between automation and AI, machine learning, deep learning, etc. I don’t need to nail that down today, but I’m pretty sure that my definition of AI includes genuine cognition: the ability to identify facts, comb out the opinions and misdirection, incorporate the right amount of intention bias, and form decisions and opinions with confidence while remaining watchful for one’s own errors. I’ll set aside any obligation to admit and react to one’s own errors, choosing to assume that intelligence includes the interest in, and awareness of, one’s ability to err.
Mark Klein, Principal Research Scientist at M.I.T., helped with that distinction between computing and AI. "There needs to be some additional ability to observe and modify the process by which you make decisions. Some call that consciousness, the ability to observe your own thinking process."
Project Debater represents an incredible leap forward in AI. It was given access to a large volume of academic publications, and it developed its debating chops through machine learning. Its capability in those debates resembled what a human would get from reading all those papers, assuming you could conceive of a way for a human to consume and retain that much knowledge.
Beyond spinning away on publications, are computers ready to interact intelligently?
Artificial? Yes. But intelligent?
According to Dr. Klein, we’re still far away from that outcome. "Computers still seem to be very rudimentary in terms of being able to 'understand' what people say. They (people) don't follow grammatical rules very rigorously. They leave a lot of stuff out and rely on shared context. They're ambiguous or they make mistakes that other people can figure out. There's a whole list of things like irony that are completely flummoxing computers now."
Dr. Klein’s PhD in Artificial Intelligence from the University of Illinois leaves him particularly well positioned for this area of study. He’s primarily focused on using computers to enable better knowledge sharing and decision making among groups of humans. Hence the potentially debilitating question: what constitutes knowledge, and what separates fact from opinion from conjecture?
His field of study focuses on the intersection of AI, social computing, and data science. A central theme involves responsibly working together in a structured collective intelligence lifecycle: Collective Sensemaking, Collective Innovation, Collective Decision Making, and Collective Action.
One of the key outcomes of Klein’s research is "The Deliberatorium", a collaboration engine that adds structure to mass participation via social media. The system ensures that contributors create a self-organized, non-redundant summary of the full breadth of the crowd’s insights and ideas. This model avoids the risks of ambiguity and misunderstanding that impede the success of AI interacting with humans.
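The Deliberatorium’s internals aren’t reproduced here, but the core idea, forcing every contribution to attach to a structured argument map rather than a free-form thread, can be sketched. The node kinds and attachment rules below are a hypothetical simplification of that kind of system:

```python
# Hypothetical sketch of an argument map in the spirit of the Deliberatorium:
# every post must attach to the tree as an issue, an idea, or a pro/con
# argument, which is what keeps the crowd's input organized and non-redundant.
from dataclasses import dataclass, field
from typing import List

KINDS = {"issue", "idea", "pro", "con"}

@dataclass
class Node:
    kind: str                      # "issue", "idea", "pro", or "con"
    text: str
    children: List["Node"] = field(default_factory=list)

    def attach(self, child: "Node") -> "Node":
        if child.kind not in KINDS:
            raise ValueError(f"unknown node kind: {child.kind}")
        # Simple structural rules: ideas answer issues; pros and cons
        # attach to ideas or to other arguments, never directly to issues.
        if child.kind == "idea" and self.kind != "issue":
            raise ValueError("ideas must respond to an issue")
        if child.kind in ("pro", "con") and self.kind == "issue":
            raise ValueError("arguments must attach to an idea or argument")
        self.children.append(child)
        return child

root = Node("issue", "Should we trust a centralized fact database?")
idea = root.attach(Node("idea", "Use a decentralized, openly audited model"))
idea.attach(Node("pro", "No single party controls what counts as true"))
idea.attach(Node("con", "Harder to hold anyone accountable for errors"))
```

The structure does the disambiguating work up front, which is precisely the risk-avoidance move described above: the machine never has to guess what a contribution meant or where it belongs.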
Klein provided a deeper explanation of the massive gap between AI and genuine intellectual interaction. "It's a much bigger problem than being able to parse the words, make a syntax tree, and use the standard Natural Language Processing approaches."
"Natural Language Processing breaks up the problem into several layers. One of them is syntax processing, which is to figure out the nouns and the verbs and figure out how they're related to each other. The second level is semantics, which is having a model of what the words mean. That 'eat' means 'ingesting some nutritious substance in order to get energy to live'. For syntax, we're doing OK. For semantics, we're doing kind of OK. But the part where it seems like Natural Language Processing still has light years to go is in the area of what they call 'pragmatics', which is understanding the meaning of something that's said by taking into account the cultural and personal contexts of the people who are communicating. That's a huge topic. Imagine that you're talking to a North Korean. Even if you had a good translator there would be lots of possibility of huge misunderstandings because your contexts would be so different, the way you try to get across things especially if you're trying to be polite, it's just going to fly right over each other's head."
To make matters much worse, our communications are filled with cases where we ought not to be taken quite literally. Sarcasm, irony, idioms, and the like are difficult enough for humans to parse, given the incredible reliance on context. I can just imagine the computer trying to validate something that starts with "John just started World War 3…", or "Bonnie has an advanced degree in…", or "That’ll help…"
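The syntax layer Klein describes is the part that’s routinely available off the shelf. A minimal sketch using spaCy (one NLP library among many) makes the gap visible: the parse below comes out identical whether the sentence is sincere or sarcastic, because nothing in it models pragmatics.

```python
# Syntax layer only: spaCy tags parts of speech and dependency relations,
# but nothing here models the speaker's intent or cultural context.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("John just started World War 3.")

for token in doc:
    print(f"{token.text:<8} {token.pos_:<6} {token.dep_:<10} head={token.head.text}")

# The tree is the same whether John launched missiles or burned the toast;
# the "pragmatics" layer Klein describes is simply absent.
```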
A few weeks ago, I wrote that I’d won $60 million in the lottery. I was being sarcastic, and (if you ask me) humorous in talking about how people decide what’s true. Would that research interview be labeled as fake news? Technically, I suppose it was. Now that would be ironic.
Klein summed it up with, "That's the kind of stuff that computers are really terrible at and it seems like that would be incredibly important if you're trying to do something as deep and fraught as fact checking."
Centralized vs. Decentralized Fact Model
It’s self-evident that we have to be judicious in our management of the knowledge base behind an AI fact checking model, and it’s reasonable to assume that AI will retain and project any subjective bias embedded in the underlying body of 'facts'.
We’re facing competing models for the future of truth, based on the question of centralization. Do you trust yourself to deduce the best answer to challenging questions, or do you prefer to simply trust the authoritative position? Well, consider that there are centralized models with obvious bias behind most of our sources. The tech giants are all filtering our news and likely having more impact than powerful media editors. Are they unbiased? The government is dictating most of the educational curriculum in our model. Are they unbiased?
That centralized truth model should be raising alarm bells for anyone paying attention. Instead, consider a truly decentralized model where no corporate or government interest is influencing the ultimate decision on what’s true. And consider that the truth is potentially unstable. Establishing the initial position on facts is one thing, but the ability to change that view in the face of more information is likely the bigger benefit.
A decentralized fact model without commercial or political interest would openly seek out corrections. It would critically evaluate new knowledge and objectively re-frame the previous position whenever warranted. It would communicate those changes without concern for timing, or for the social or economic impact. It quite simply wouldn’t consider or care whether or not you liked the truth.
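One way to picture that "unstable truth" is as a confidence score that gets revised whenever new evidence arrives, rather than a fixed database entry. Here’s a minimal sketch using a simple Bayesian update; the likelihood figures are invented purely for illustration:

```python
# Hypothetical sketch: a "fact" as a probability that gets revised as new
# evidence arrives, rather than a fixed entry in a truth database.
def update(prior: float, likelihood_if_true: float, likelihood_if_false: float) -> float:
    """Bayes' rule: P(claim is true | new evidence)."""
    numerator = prior * likelihood_if_true
    denominator = numerator + (1 - prior) * likelihood_if_false
    return numerator / denominator

confidence = 0.50  # initial position on some claim

# Invented evidence stream: (likelihood of seeing it if the claim is true,
# likelihood of seeing it if the claim is false)
evidence = [(0.9, 0.3), (0.8, 0.4), (0.2, 0.7)]

for lt, lf in evidence:
    confidence = update(confidence, lt, lf)
    print(f"revised confidence: {confidence:.2f}")  # 0.75, 0.86, then back to 0.63
```

The third observation pulls the score back down, which is the whole point: the model re-frames its position when warranted, with no regard for whether anyone liked the previous answer.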
The model proposed by Trive appears to meet those objectivity criteria, and it’s getting noticed as more people tire of left-versus-right politics and the preservation of corporatocracy.
IBM Debater seems like it would be able to engage in critical thinking that would shift influence towards a decentralized model. Hopefully, Debater would view the denial of truth as subjective and illogical. With any luck, the computer would confront that conduct directly.
IBM’s AI machine can examine tactics and style. In a recent debate, it coldly scolded the opponent with: "You are speaking at the extremely fast rate of 218 words per minute. There is no need to hurry."
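The measurement behind that quip is trivial to reproduce; a throwaway sketch (the word count and timing below are invented):

```python
# Back-of-the-envelope version of Debater's speaking-rate check;
# the word count and duration are invented for illustration.
def words_per_minute(word_count: int, duration_seconds: float) -> float:
    return word_count * 60 / duration_seconds

# An 872-word rebuttal delivered in four minutes works out to 218 wpm.
print(words_per_minute(872, 240))  # -> 218.0
```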
Debater can obviously play the debate game while managing massive amounts of information and determining relevance. As it evolves, it will need to rely on the veracity of that information.
So far, Trive and Debater seem to complement each other.
Barry Cousins is a Research Lead at Info-Tech Research Group (www.infotech.com), the world’s fastest growing IT research and advisory firm. Founded in 1997, Info-Tech produces unbiased and highly relevant IT research solutions. Since 2010 McLean & Company, a division of Info-Tech, has provided the same, unmatched expertise to HR professionals worldwide.