Beyond words: What AI is really learning -- and what it knows that we never taught it

Imagine you had to finish every sentence in every book ever written -- with just your best guess of the next word. That’s how large language models (LLMs) like GPT-4 start learning. 

LLMs use self-supervised learning, meaning no one needs to label or explain the data for them. Instead, they learn by reading vast amounts of text -- books, code, academic papers, Wikipedia (and its 57 million+ articles), Reddit forums, news articles, and billions of other documents -- and then predicting what word comes next in a sentence, over and over again.

Although this may sound simplistic, it is the core of how LLMs can do the amazing things they do. Let’s say the model reads: 

“In 1492, Christopher Columbus set sail from Spain to ___.”

It has to guess the next word based on everything it’s seen before. Maybe it says “explore,” “America,” or “discover,” depending on the context. If it gets it right, great; if not, it adjusts, learning how people talk and how ideas like timelines, historical figures, cause and effect, and more relate to one another.

This next-word prediction task is repeated billions of times during training, using models that can include hundreds of billions of parameters (the “brain cells” of the model). The more examples it sees, the more easily it recognizes language patterns -- grammar, spelling, style, and even humor -- but it doesn’t stop there. Because it learns from everything from Wikipedia to movie scripts to software documentation, it doesn’t just learn language; it learns how the world works through language.
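To make this concrete, here is a minimal sketch of what “predict the next word” looks like in code. It uses the small, open GPT-2 model and the Hugging Face transformers library as stand-ins -- illustrative choices for this sketch, not what any frontier model actually runs on:

```python
# Minimal next-word-prediction sketch using the small, open GPT-2 model.
# Assumes the transformers and torch packages are installed.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "In 1492, Christopher Columbus set sail from Spain to"
inputs = tokenizer(prompt, return_tensors="pt")

# Forward pass: the model scores every token in its vocabulary as a candidate
# for the next word.
with torch.no_grad():
    logits = model(**inputs).logits          # shape: (1, prompt_length, vocab_size)

next_token_logits = logits[0, -1]            # scores for the word after the prompt
probs = torch.softmax(next_token_logits, dim=-1)

# The model's top guesses -- the equivalent of "explore," "America," and so on.
top = torch.topk(probs, k=5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(idx.item()):>12}  {p.item():.3f}")
```

Training is the feedback loop around this step: whenever the model’s guesses disagree with the word that actually came next, its parameters get nudged so the right word scores a little higher next time.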

LLMs learn much more than just language

Even though the model’s goal is “just” to predict words, it ends up learning a whole lot more.

How things work

As humans, we know that if you drop a glass, it might break, or if it’s cloudy, it could rain. When an LLM picks up that kind of knowledge, it is building what researchers call a world model -- not just language, but cause and effect and how the world fits together.

How people think and feel

LLMs can learn to pick up that “I’m fine” might sometimes mean the opposite. They learn tone, emotion, and even sarcasm -- all from how people write.

How to solve problems

LLMs see many examples of math, logic puzzles, code, and advice columns. Over time, they get better at not just saying things but figuring things out. Some newer models even learn to plan, breaking a tricky question into smaller steps and solving them one at a time.

How to use tools

Some LLMs learn how to write computer code or help build spreadsheets, templates, and APIs just by seeing enough examples of how people do those things in writing.

Learning how to learn

This is called “meta-learning.” For LLMs, it means figuring out what they don’t know -- and how to fix it -- such as recognizing their own confusion or the need to ask a clarifying question. If a model needs more information, it can tell the user that it could give a better answer with additional details and ask for them. For example: “My response will differ depending on whether this treatment is for children or adults. What age range should I consider?”

Thanks to the transformer architecture, LLMs can track relationships between words across long distances in text and do this learning in parallel, much faster than older models. It’s like giving a model access to a giant library and watching it learn how the world works by reading everything inside -- except that instead of reading for years, it learns from billions of examples in days or weeks on infrastructure spanning thousands of specialized processors. So, even though they’re “just predicting words,” LLMs develop something that looks like generalized, human-like cognition -- not real consciousness, but the ability to simulate understanding remarkably well.
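Under the hood, the mechanism that tracks those long-distance relationships is called attention. The toy sketch below shows scaled dot-product attention, the core transformer operation, with made-up sizes and random numbers standing in for real learned embeddings:

```python
# Toy illustration of scaled dot-product attention: every token weighs its
# relevance to every other token, however far apart they sit in the text.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
seq_len, d_model = 6, 8              # 6 tokens, 8-dimensional embeddings (toy sizes)
x = torch.randn(seq_len, d_model)    # stand-in for token embeddings

# In a real transformer, the query/key/value projections are learned weights.
W_q, W_k, W_v = (torch.randn(d_model, d_model) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v

scores = Q @ K.T / (d_model ** 0.5)  # (seq_len, seq_len) relevance scores
weights = F.softmax(scores, dim=-1)  # each row sums to 1
output = weights @ V                 # context-aware representation of each token

print(weights.round(decimals=2))     # which tokens "pay attention" to which
```

Because every token attends to every other token in a single matrix operation, this step parallelizes across many processors at once, which is what makes training on billions of examples feasible in weeks rather than years.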

New models that actually “think”

Until recently, most LLMs were great at sounding smart through pattern recognition, but they often struggled with logic, made up (hallucinated) facts, or failed at multi-step reasoning. We’re now in the Reasoning Model Era: new models don’t just guess what to say; they think about how to say it and why.

Examples of reasoning-first models:

  • OpenAI’s o1 and o1-pro: Designed for better planning and fewer mistakes.
  • Perplexity’s Sonar Pro-Reasoning: Combines AI with a live internet search to give answers with verifiable sources and logic chains.
  • Anthropic’s Claude 3.x: Focused on ethical, helpful responses with strong step-by-step thinking.
  • Google’s Gemini: Can understand text, images, code, and audio -- then connect them.

These models can break problems into parts, use tools, look up sources, and even revise their own answers -- like a researcher with internet access, memory, and a whiteboard. These advancements move us closer to AI systems that can engage in planning, problem-solving, and even ethical deliberation -- not just mimicry.
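From the developer’s side, “using tools” often takes the form of function calling. Here is a minimal sketch using OpenAI’s Python SDK; the model name and the weather-lookup function are placeholders for illustration, not features of any specific product:

```python
# Function-calling sketch: the model is told a tool exists and may ask to use it.
# Assumes the openai package is installed and OPENAI_API_KEY is set; the model
# name and the get_weather tool are placeholders.
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any tool-capable model works similarly
    messages=[{"role": "user", "content": "Do I need an umbrella in Seattle today?"}],
    tools=tools,
)

# Instead of guessing, the model can return a structured request to call the tool.
print(response.choices[0].message.tool_calls)
```

The application runs the requested function, feeds the result back to the model, and the model folds that real data into its final answer -- the “look up sources” step described above.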

The multilingual and cultural challenge

If AI is to be helpful to everyone, it must understand more than just English or “textbook” English. LLMs work best for people who speak like the books and articles the models were trained on (over 80 percent of LLM pretraining datasets are English-dominant), which can leave out key groups of users, including:

  • Speakers of Indigenous languages, regional dialects, or languages that don’t have much online data to train on.
  • People who speak in non-standard grammar or slang (e.g., Spanglish, Hinglish).
  • People using assistive communication tools.
  • Non-Latin scripts (e.g., Arabic, Hindi) that suffer from inefficient text segmentation (see the sketch after this section).
  • Idioms, metaphors, and customs that don’t always translate directly.

These are real ways people talk, but they often get left out of AI training. This isn’t just about fairness -- it’s also about performance, as a model that misunderstands what you say can give wrong, biased, or even potentially dangerous advice.
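To see the segmentation problem called out above in practice, here is a rough sketch using OpenAI’s open-source tiktoken tokenizer. The sample sentences and exact counts are only illustrative, but non-Latin scripts routinely split into several times more tokens than equivalent English, which means higher cost and less usable context for those users:

```python
# Rough token-count comparison: the same short greeting costs far more tokens
# in some scripts than in English. Assumes the tiktoken package is installed.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # tokenizer used by several OpenAI models

samples = {
    "English": "Hello, how are you today?",
    "Hindi":   "नमस्ते, आज आप कैसे हैं?",
    "Arabic":  "مرحبا، كيف حالك اليوم؟",
}

for language, text in samples.items():
    print(f"{language:<8} {len(enc.encode(text)):>3} tokens")
```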

What’s next: From language to logic

AI is everywhere. By some estimates, more than half of the content online is now AI-generated, and some industry leaders predict that as much as 90 percent of new code will soon be written by AI.

With the advent of reasoning models, what began as text prediction now touches software engineering, scientific discovery, personal coaching, and even legal drafting and arguments.

Yet challenges remain:

  • Hallucination: Even advanced models occasionally generate convincing but false statements.
  • Bias: Models reflect the data they’re trained on, often reinforcing stereotypes or biases.
  • Interpretability: It remains difficult to understand why a model made a given prediction or decision, though reasoning models are getting much better at showing their work.

Today’s LLMs are powerful simulators of thought, learning far more than words -- they internalize logic, context, and human-like reasoning. The shift from fluency to reasoning is real, but to realize their full potential ethically and effectively, we must ensure these models are not just intelligent but also trained on robust data (and not disproportionately on data they themselves generated, as usage expands), transparent, equitable across languages, and grounded in fact. That is how they will keep improving and serve a growing number of people without devolving into self-reinforced groupthink.

Image Credit: Wanniwat Roumruk / Dreamstime.com

Keryn Gold, PhD, MBA, is an ex-FAANG leader, executive advisor, AI strategist, and author of multiple books, including “The Leadership Playbook, Revisited: How To Accelerate Success, Guide & Empower Your People, Navigate Change, & Build Good AI Habits to Take Control of Your Work Life & Future-Proof Your Career, Today”
