The AI model behind tools like ChatGPT that generates text — and, increasingly, search answers.
What an LLM actually does
Strip away the marketing and a large language model is a statistical engine for predicting the next chunk of text. During training it reads an enormous slice of the internet, books, and code, and learns the probability that any given word (technically a token) follows the words before it. When you prompt it, the model isn't looking up a stored answer — it's generating one token at a time, each choice conditioned on everything written so far. That single mechanism, scaled up across billions of parameters, is enough to produce fluent prose, working code, and plausible answers to questions it was never explicitly taught.
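To make the "one token at a time" idea concrete, here is a minimal sketch in Python. It is a toy, not how a production LLM is built: it counts which word follows which in a tiny made-up corpus and conditions only on the single previous token, whereas a real model uses a neural network over the whole preceding context. The corpus and the function names (`train_bigram`, `next_token_probs`, `generate`) are illustrative assumptions.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up corpus; a real model trains on vastly more text.
CORPUS = "the cat sat on the mat . the dog sat on the rug ."


def train_bigram(text):
    """Count how often each token follows each preceding token."""
    tokens = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Turn raw follow-counts into a probability distribution."""
    total = sum(counts[prev].values())
    return {tok: n / total for tok, n in counts[prev].items()}


def generate(counts, start, max_tokens=8, seed=0):
    """Generate one token at a time, sampling each from the learned probabilities."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(max_tokens):
        probs = next_token_probs(counts, out[-1])
        if not probs:  # no known continuation, so stop
            break
        tokens, weights = zip(*probs.items())
        out.append(rng.choices(tokens, weights=weights)[0])
    return " ".join(out)


model = train_bigram(CORPUS)
print(next_token_probs(model, "the"))  # four equally likely continuations
print(generate(model, "the"))
```

Nothing in this loop checks whether the output is true; it only follows the statistics of the text it was trained on, which is the point the next paragraph builds on.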
This is why an LLM can sound authoritative and still be wrong. Because it optimises for the most likely continuation rather than verified truth, it will sometimes invent citations, statistics, or facts — the behaviour known as hallucination. OpenAI researchers argued in a 2025 paper that the training and evaluation process itself rewards confident guessing over admitting uncertainty, which helps explain why hallucination has proven so stubborn even as models improve. The takeaway for anyone publishing online: the model's confidence is not evidence, and that has direct consequences for how your content gets used.
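The incentive behind confident guessing can be shown with simple arithmetic. The sketch below is a simplified illustration, not the paper's formal analysis: it assumes a benchmark that awards 1 point for a correct answer and 0 for "I don't know", and the `wrong_penalty` parameter is a hypothetical knob for comparing grading schemes.

```python
def expected_score(p_correct, wrong_penalty=0.0, abstain_score=0.0):
    """Expected score on one question for guessing versus abstaining.

    p_correct: the model's chance of guessing right.
    wrong_penalty: points deducted for a wrong answer (hypothetical knob).
    """
    guess = p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty
    return {"guess": guess, "abstain": abstain_score}


# Accuracy-only grading: a wrong answer costs nothing, so guessing always wins.
print(expected_score(0.2))

# Hypothetical grading that deducts a point for wrong answers:
# with only a 20% chance of being right, abstaining now scores higher.
print(expected_score(0.2, wrong_penalty=1.0))
```

Under accuracy-only scoring, even a long-shot guess has a higher expected score than admitting uncertainty, so a model tuned to do well on such tests is nudged toward guessing.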
Why LLMs matter for getting found
The reason marketers now care about LLMs is that they sit between your content and a growing share of searchers. Google's AI Overviews and the conversational AI Mode are powered by its Gemini models; ChatGPT Search, Perplexity, and Gemini all use an LLM to read live web pages and write a synthesised answer on the spot. Increasingly the user reads that generated answer instead of clicking ten blue links — so the question shifts from "do I rank?" to "am I one of the handful of sources the model chose to summarise and cite?"
A concrete example: ask Perplexity a question and it doesn't just answer from what the model absorbed in training. It runs a retrieval-augmented loop, fetching pages, ranking them, and citing a small set inline. Industry analyses suggest it may visit on the order of ten pages per query while citing only three or four, so being retrievable, well-structured, and clearly factual is what gets you into that shortlist. The discipline of earning those mentions is what people now call generative engine optimization or, framed around the answer itself, answer engine optimization. Our practical guide to ranking in ChatGPT and Perplexity walks through what that involves.
The common misconception is that the LLM "knows" things the way a database does. It doesn't. A model's weights are frozen at its training cutoff, and on its own it has no live view of the web. The up-to-date answers you see in AI search come from retrieval: the system fetches current pages and feeds them to the model as context, then the model writes its answer grounded in that context. That distinction matters because it tells you where your leverage is: you generally can't change what a model memorised, but you absolutely can influence what it retrieves and cites today by being the clearest, most credible source on a topic.
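The retrieve-then-write pattern can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than any product's actual pipeline: the `PAGES` dictionary stands in for freshly fetched web pages, ranking is plain word overlap (real systems use far more sophisticated signals), and the final call to a model is left out, so the sketch stops at the grounded prompt.

```python
import re

# Hypothetical pages standing in for freshly fetched web content.
PAGES = {
    "https://example.com/a": "An LLM predicts the next token from prior text.",
    "https://example.com/b": "Our bakery sells sourdough bread and pastries.",
    "https://example.com/c": "Retrieval feeds current pages to the LLM as context.",
}


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, pages, top_k=2):
    """Rank pages by word overlap with the query and keep the top_k matches."""
    q = tokenize(query)
    scored = [(len(q & tokenize(body)), url) for url, body in pages.items()]
    scored = [(score, url) for score, url in scored if score > 0]
    scored.sort(key=lambda item: (-item[0], item[1]))  # best score first, URL breaks ties
    return [url for _, url in scored[:top_k]]


def build_prompt(query, pages, top_k=2):
    """Assemble the context an LLM would be asked to answer from."""
    sources = retrieve(query, pages, top_k)
    context = "\n".join(
        f"[{i}] {url}: {pages[url]}" for i, url in enumerate(sources, start=1)
    )
    prompt = (
        "Answer using only the numbered sources and cite them.\n"
        f"{context}\n\nQuestion: {query}"
    )
    return prompt, sources


prompt, sources = build_prompt("How does an LLM use retrieval?", PAGES)
print(prompt)
```

The off-topic bakery page never reaches the prompt, so the model cannot cite it. Pages that fail the retrieval and ranking step are invisible to the answer, however good they are.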
One honest caveat: this landscape is moving fast. Which models power which products, how answers are grounded, how citations are displayed, and even whether ads appear inside AI answers are all shifting from quarter to quarter. Treat specific behaviours as snapshots rather than permanent rules — but the underlying principle, that an LLM rewards clear and trustworthy sources it can safely build on, has held steady so far.