Combating misinformation with AI document management [Q&A]

Many organizations rush to implement AI chatbots without first addressing their document management issues, but when these systems deliver incorrect information, they can create significant risks.

But while AI is part of the problem, it can also be part of the solution. We spoke to Stéphan Donzé, CEO of AODocs, to find out more.

BN: Why can AI chatbots be such a problem when it comes to finding documents?

SD: AI chatbots are very good at finding information, and at presenting this information in natural language in response to a user's question. However, AI chatbots have no way to assess which of the documents available to them are valid and up to date, and which are drafts, obsolete, expired or incomplete. When asked a question, AI chatbots will find 'any' document that is relevant to the question, and respond in an 'assertive' and 'confident' tone, leading the user to believe that the answer is authoritative and trustworthy.

Let's take an example: a sales representative is on the phone with a customer, and needs to provide a pricing quote. The salesperson asks the AI chatbot, "What is the price for product XYZ?", and the chatbot confidently answers "$49.99 per unit, with a minimum order of 10 units". The salesperson has no way to know that the chatbot has, in fact, found this information in a price list that expired a year ago, and she communicates this price to the customer, putting the company at risk of losing money.

AI's biggest strength is also its main weakness: it makes information (the good and the bad) very easy to find, and thus it dramatically increases the risk of users misusing incorrect content if the AI chatbots are fed with uncontrolled documents.

BN: There's a lot of focus on AI model capabilities, but much less discussion about the quality of information these models access. Why do companies overlook document management as a critical foundation for AI success?

SD: Most companies built chatbot proofs of concept in 2024 in controlled environments, running their AI models on a limited number of manually picked documents. They saw great results with these demonstrators, and great improvements brought by newer AI models such as GPT-4o, OpenAI o1, or Gemini 2.0. Seeing that their AI pilot chatbots were working so well, they 'naively' assumed that they could simply feed these chatbots with all of their documents (because 'the more information, the better', right?).

Indeed, it is hard to comprehend that AI can be so good at finding the most relevant answer to a complex question, and yet completely unable to 'understand' that a spreadsheet named '2021 price list' is obviously outdated and should not be used.

BN: You've mentioned that AI can actually amplify existing document management problems. Could you explain how AI makes these issues worse rather than better, and what organizations should do about it?

SD: The traditional search user experience presents the user with a list of relevant results, and lets the user pick which of the search results seems the most relevant. By contrast, AI chatbots provide direct answers to the user's question, so users don't realize that there might be more than one document matching their question, and that some of these documents might be incorrect. Let's take again the example of a sales rep who has to provide her customer with a quote for 15 units of product XYZ.

In a pre-AI world, the rep would go to the search interface, search for 'product XYZ pricing' and she would see multiple results, including a spreadsheet named 'Price list - 2025', another one named 'Special promotion schedule - Christmas 2023' and another one named 'LATAM pricing - 2022 (EXPIRED DO NOT USE)'. Looking at the list, she obviously picks 'Price list - 2025' since it is the one that seems the most up to date.

With an AI chatbot, she would ask "What is the price for 15 units of XYZ?" and the chatbot would answer something like "The price for XYZ is $49 per unit with a minimum order of 10 units". Great: this saves our rep a lot of digging in spreadsheets, and she has the answer directly in the chat window. Happy to be able to reply to her customer on the phone without waiting, she communicates the price to the customer. Since the chatbot 'confidently' provided an answer to the question, it is very unlikely that the sales rep will double-check which version of the price list the chatbot used to reply. She automatically assumes that the chatbot knows what it is doing (why would she not? The chatbot is an official tool provided by her company).

BN: For companies that want to ensure their AI systems provide reliable information, what practical steps can they take to validate and structure their document repositories before implementing AI solutions?

SD: The most important step is to ensure AI chatbots only run on controlled sets of information, where the documents are known to be valid and up to date. AI chatbots work very well on vertical systems such as a CRM, or a technical documentation repository managed by a publication and review workflow. By contrast, AI chatbots running on top of file repositories that can contain several versions of the same information (for example, a mix of work-in-progress drafts, current documents and obsolete versions, as is typically the case in any company's shared folders) have a very high probability of answering user questions with incorrect information, resulting in costly mistakes.

As a practical step, companies should 'start small' when deploying AI chatbots: begin with the most 'reliable' sets of documents, such as internal knowledge bases, curated folders containing only your up-to-date sales materials, or your official, customer-facing technical documentation. Then progressively expand your AI agents to new document repositories, using metadata to single out the documents that are 'valid' for the chatbot to use.
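The metadata filtering Donzé describes can be sketched in a few lines. This is a minimal, hypothetical illustration (the field names `status` and `expires`, and the sample documents, are assumptions, not AODocs APIs): only documents marked approved and not past their expiry date are allowed into the corpus the chatbot retrieves from.

```python
from datetime import date

# Hypothetical document records; in practice these fields would come
# from a document management system that tracks status and expiry.
documents = [
    {"name": "Price list - 2025", "status": "approved", "expires": date(2025, 12, 31)},
    {"name": "LATAM pricing - 2022", "status": "approved", "expires": date(2022, 12, 31)},
    {"name": "Price list - 2026 (draft)", "status": "draft", "expires": None},
]

def valid_for_chatbot(doc, today):
    """Return True only for approved documents that have not expired."""
    if doc["status"] != "approved":
        return False
    if doc["expires"] is not None and doc["expires"] < today:
        return False
    return True

# Only documents passing the filter are indexed for the chatbot;
# drafts and expired price lists never enter the retrieval corpus.
corpus = [d["name"] for d in documents if valid_for_chatbot(d, today=date(2025, 6, 1))]
print(corpus)  # ['Price list - 2025']
```

The point is that the filtering happens before retrieval: the chatbot cannot 'confidently' quote an expired price list if that document was never made available to it in the first place.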

Image credit: limbi007/depositphotos.com

© 1998-2025 BetaNews, Inc. All Rights Reserved.