AI Terminology Cheat Sheet for Product Teams


At your next product meeting you will hear it: “We can put an agent around it, or fine-tune later and let RAG handle the rest.” You will find yourself nodding along with the room, even as you ask yourself if you have just committed to a chatbot, a search engine or some kind of autonomous employee.

It is an expensive form of confusion. Stanford’s 2026 AI Index puts organizational AI adoption at 88%, yet when MIT’s NANDA study looked at 300 enterprise GenAI rollouts, about 95% of them had nothing to show for it on the P&L. The gap often comes down to scope, decided by people who were not clear on what the jargon meant.

We put together this cheat sheet for those kinds of decisions. Not to make you sound clever at a conference, but so you understand what the terms do to your build, your bottom line and where things might go wrong.

## Why Most AI Cheat Sheets Are Useless

There is an acronym index making the rounds among practitioners in 2026 that runs to over 250 terms. In reality they are 30 ideas at most: retrieval, orchestration, sampling and so on. The buzzwords multiply every month while the concepts stay put.

You can see why the usual references don’t work. An executive list reduces each term to a marketing slogan; an engineer’s copy is full of textbook definitions without any feel for the problem. Neither one is much use to someone who has to make a call by Friday. We prefer a narrower filter. If a term doesn’t affect your cost, your data or your reliability, it is probably just décor. There are good glossaries out there for backup, like our longer 2024 AI terminology piece or a more technical working developer’s version. But they are no replacement for your own judgment.

Clarity is what you need, not perfect vocabulary. Can you answer what the term changes about the product?

## Three Buckets That Actually Matter

Think of the vocabulary as a map. Three buckets will get you through almost any product talk.

### Core concepts: what kind of problem

People will throw around AI, machine learning, deep learning and generative AI. They are not all peers. They nest.

  • AI is the umbrella term for software that performs tasks usually requiring human judgment.
  • Machine learning is how you build some of that AI, letting the system learn from data rather than being hand-coded.
  • Deep learning is a subset of ML using layered neural networks; most modern generative systems are of this type.
  • Generative AI covers systems that produce new content, such as text, images or code, based on patterns learned from their training data.

When a vendor comes to you with “AI” and does not specify which of these they mean, their proposal is incomplete.

### Models and architecture: what kind of engine

This bucket covers LLMs, transformers, foundation models and multimodal models. It dictates what is possible. A text-only large language model can put together a decent support response, but don’t expect it to read your scanned PDFs. If your product has to deal with screenshots or invoices, you want a multimodal model that can look at an image and a question and give you an answer. Figure out the engine before you spec the work or you will be in for a painful rebuild.

### Interaction and application: what your users will feel

Prompting, temperature, RAG, tool calling and agents. This is where the behavior is shaped and the decisions are made. You have to decide if the answer is to come from the model’s training, your own documents or an action in another system. Each makes for a different product.

## The Terms That Get Confused Most Often

### GPT, LLM and transformer are not three siblings

Some glossaries will put them side by side as though they are options.

  • Transformer is the math, the architecture.
  • Large language model is what you end up with after you train a transformer on vast amounts of text.
  • GPT is OpenAI’s particular brand of transformer-based LLM.

To say “we’ll use GPT or an LLM” is fuzzy thinking. It is like saying you will buy either a Toyota or a car.

### Hallucination is not a bug to turn off

A hallucination is a plausible output that happens to be wrong. You can’t patch it because it is part of probabilistic generation. Some would say the word is a misnomer, making the behavior seem more exotic than it is. You might be better off calling it “fabrication” or simply “confidently wrong output.” That is what is really going on.

Think of hallucination as a constant pressure on your system. Any product decision where you cannot afford a factual error will require a counter-measure, be it retrieval, citations, human oversight or some other constraint. But make no mistake, none of them eliminates it entirely.

### Tokens, context window and temperature

Teams are often caught off guard by how these three affect cost and behavior down the line.

* **Tokens:** The model’s way of reading text (about 3-4 English characters) and what you get billed for.
* **Context window:** The system’s short-term memory, i.e. the number of tokens it can hold in view at one time.
* **Temperature:** A dial for randomness. Turn it up for variety, down for consistency.
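To make the token and context-window arithmetic concrete, here is a back-of-the-envelope sketch. It assumes roughly 4 characters per token for English text, and that the prompt and the reply share one context window, which is how most model APIs count it. The function names and the prices in the usage note are hypothetical placeholders; real tokenizers and real price sheets will give different numbers.

```python
import math

CHARS_PER_TOKEN = 4  # assumption: rough average for English text


def estimate_tokens(text: str) -> int:
    """Very rough token estimate; a real tokenizer will differ."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_context(prompt_tokens: int, reply_budget: int, context_window: int) -> bool:
    """Check that the prompt plus the expected reply fit in one window."""
    return prompt_tokens + reply_budget <= context_window


def monthly_cost(tokens_in: int, tokens_out: int, calls_per_month: int,
                 price_in_per_million: float, price_out_per_million: float) -> float:
    """Estimated monthly spend, given prices quoted per million tokens."""
    cost_units = tokens_in * price_in_per_million + tokens_out * price_out_per_million
    return cost_units * calls_per_month / 1_000_000
```

For example, `monthly_cost(1000, 500, 10_000, 1.0, 2.0)` models 10,000 calls a month with a 1,000-token prompt and a 500-token reply at made-up prices of 1.00 and 2.00 per million tokens. The point is the shape of the calculation: cost scales with prompt length times call volume, so a long system prompt or a large retrieved context is paid for on every call.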

Your choice of temperature should follow the feature. You want it low for extraction, policy summaries or customer support. Let it run higher for ad concepts or when you are brainstorming. But if you need the same question to yield the same answer every time, creativity is not your friend.
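One common way temperature works under the hood is to divide the model’s raw scores (logits) by the temperature before converting them to probabilities. The sketch below illustrates that idea with made-up scores for three candidate words; it is not any provider’s actual implementation, and many APIs treat a temperature of 0 as “always pick the top choice” rather than dividing by zero.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next words.
scores = [2.0, 1.0, 0.5]

low = softmax_with_temperature(scores, 0.2)   # nearly all weight on the top choice
high = softmax_with_temperature(scores, 2.0)  # weight spread across the options
```

At the low setting the top candidate takes almost all of the probability, which is why extraction and support features behave consistently. At the high setting the alternatives get a real chance of being picked, which is what you want for brainstorming.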

## Prompting, Fine-Tuning, and RAG: When Each One Earns Its Place

In the end, most AI product work comes down to one of three patterns. Go with the wrong one and you have the usual explanation for why a project has gone over budget.

| Approach | Best for | Strengths | Watch-outs |
| :--- | :--- | :--- | :--- |
| **Prompting** | Early MVPs, simple tasks | Fast and cheap; easy to iterate | Gets brittle with complexity |
| **Fine-tuning** | Repeated task or a particular style | Consistent structure and tone | Slow to update and requires clean labeled data |
| **RAG** | Answers from your own documents | Change the source and the model follows | Quality is a function of retrieval, not the model |

Most teams would do well to resist the temptation to fine-tune. Fine-tuning is for shaping behavior: making the model write in your voice or adhere to a schema. It does not reliably put new facts into the model’s head. If you need the model to know the terms of a customer’s contract, use retrieval.

### RAG in plain language

Retrieval-Augmented Generation is the open-book exam. The system goes and finds the relevant pieces in your content and puts them before the model as context before it generates anything. Under the hood you have embeddings, the numerical fingerprints of the text, and a vector database to keep them where similar meaning can be found.
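The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy under stated assumptions: `embed` is a stand-in that counts words rather than a real embedding model, the “vector database” is a plain list, and `DOCS`, `retrieve` and `build_prompt` are hypothetical names for illustration. A production system would swap in a learned embedding model and a proper vector store, but the shape is the same.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda chunk: cosine(q, embed(chunk)), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Put the retrieved chunks in front of the model as context."""
    context = "\n".join(f"- {chunk}" for chunk in retrieve(question, chunks, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Hypothetical approved content, already split into chunks.
DOCS = [
    "Refunds are available within 30 days of purchase.",
    "Support hours are 9am to 5pm on weekdays.",
    "Annual plans renew automatically unless cancelled.",
]

prompt = build_prompt("Are refunds available after purchase?", DOCS, k=1)
```

Notice that the model never appears in this code. Everything that decides which text the model sees happens in `retrieve`, which is why stale or badly chunked content shows up later as what looks like a model problem.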

For a support bot or any internal tool where the response has to come from approved material, RAG is usually the way to go. Yet this is where projects tend to fail in silence. People will point the finger at the model, but the problem is upstream: stale indexes, poor chunking or weak retrieval. In production, what looks like a generation failure is more often an information architecture one.

For a longer look at how these fit together in terms of scope, data and evaluation, see our guide to building an AI model.

## Agents, MCP and the 2026 Vocabulary Shift

The words we used in 2024 don’t cut it anymore. Through 2025 and 2026 the talk has shifted to agentic systems. A cheat sheet limited to “prompt” and “RAG” is already lacking.

* **Agent.** An LLM wrapped in logic that lets it call tools and chain steps. A chatbot answers; an agent acts.
* **Tool calling.** How an agent gets an external API to do something, such as booking an appointment rather than telling you how to book one.
* **MCP (Model Context Protocol).** A standard introduced by Anthropic for linking agents to their data and tools through a common schema. Some call it “USB-C for AI”. It matters because it reduces vendor lock-in at the integration level.
* **Agentic RAG.** The agent itself makes the call on retrieval, re-running and refining the search based on what it uncovers. It is a slower process, but you get better results on an ambiguous question.
* **Agent memory.** Think of the context window as short-term memory. Anything long-term that the agent needs to read or write typically lives in an external store such as a vector database.
* **Multi-agent.** Several agents working together, perhaps with distinct roles. In theory it is useful, but in practice it is expensive and you will see reliability erode as you add more of them.

Agents alter your risk profile. A generative tool might put out a poor sentence; an agent can do damage by sending off a bad email, updating the wrong record or refunding the wrong invoice. The job then becomes less about whether the output is any good and more about what this thing is permitted to do and who is watching it. Our automated news pipeline case study is an example: a deliberately narrow agent workflow that shows the value of curation over full autonomy.
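The question of what an agent is permitted to do, and who is watching it, can be sketched as a permission gate that sits between the model’s requested tool call and the real system. Everything here is hypothetical: the tool names, the `needs_approval` flag and the `approve` callback are illustrative, not part of any particular agent framework or of MCP.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Tool:
    run: Callable[[dict], str]
    needs_approval: bool  # True for actions that change money or records


def lookup_order(args: dict) -> str:
    return f"order {args['order_id']}: delivered"


def issue_refund(args: dict) -> str:
    return f"refunded {args['amount']} on order {args['order_id']}"


# The allowlist: the agent can only ever reach tools registered here.
TOOLS: Dict[str, Tool] = {
    "lookup_order": Tool(lookup_order, needs_approval=False),
    "issue_refund": Tool(issue_refund, needs_approval=True),
}


def execute(call: dict, approve: Callable[[dict], bool]) -> str:
    """Run a tool call requested by the model, subject to the permission rules."""
    tool = TOOLS.get(call["name"])
    if tool is None:
        return "blocked: tool not on the allowlist"
    if tool.needs_approval and not approve(call):
        return "held: waiting for human review"
    return tool.run(call["args"])
```

With this shape, a read-only lookup runs immediately, a refund waits until a person approves it, and anything the model invents that is not in `TOOLS` is refused. The design choice is that permissions live in ordinary code you can review, not in the prompt.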

## Hype Terms to Label Clearly

You will find some words on cheat sheets because they are intriguing, not because they inform product decisions.

  • AGI. Artificial general intelligence. There is no definition everyone agrees on, so plan for what your models can do today rather than some AGI timeline.
  • ASI and singularity. Leave the philosophy out of your feature spec.
  • “Cognitive computing” and “AI-powered.” This is vendor marketing. Open the box and see what is really there.
  • Alignment. It is both a technical matter (RLHF, DPO, ORPO) and a corporate comms term. The word does double duty.

Put those next to your engineering vocabulary and mark them as speculative or contested. If you don’t, you will be encouraging science-fiction thinking in business meetings and end up paying for the scope creep.

## Tailor the Vocabulary to the Reader

An engineer’s cheat sheet is no use to an executive and vice versa. Do not try to please both and end up with a document that pleases neither.

  • Growth leaders and operators. Talk about cost (tokens, fine-tuning versus prompting), where humans remain in the loop and how to handle hallucination.
  • Product managers. They need to know RAG, embeddings, evaluation and agent boundaries. Can they tell a generation problem from a workflow one?
  • Engineering leads. Give them the details on sampling parameters, MCP, eval infrastructure and the trade-offs in fine-tuning and retrieval design.

If you want a primer on the operational terms that come up in tool work for PMs and engineers alike, our prompt engineering glossary has you covered. Or if you are looking to pitch definitions to a specific audience, the accountant’s AI terminology cheat sheet from Karbon is a good model to follow.

## From Vocabulary to Product Decisions

Words are helpful, but they are not what builds the product. The questions that matter sit one layer below the terminology.

  • Is a model even called for here, or will plain rules and search do?
  • Assuming you need one, what kind of task is it: generation, classification, action or retrieval?
  • RAG or fine-tuning comes down to the stability of your knowledge. Is it in flux or fixed?
  • What are the consequences of an error from the model? If an error could mean a wrong refund, you can’t ship without some human oversight.
  • Who owns the eval set, and how will you tell that quality is improving?
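The eval-set question above can be made concrete with very little code. The sketch below assumes the simplest possible check, that a good answer must contain a required phrase; real evaluations usually need richer scoring. `EVAL_SET`, `baseline` and `candidate` are hypothetical stand-ins for your own test cases and for two versions of the system being compared.

```python
from typing import Callable, List, Tuple

# Each case: (question, phrase a correct answer must contain).
EvalCase = Tuple[str, str]


def pass_rate(answer_fn: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Share of cases where the answer contains the required phrase."""
    if not cases:
        raise ValueError("the eval set is empty")
    passed = sum(
        1 for question, required in cases
        if required.lower() in answer_fn(question).lower()
    )
    return passed / len(cases)


# Hypothetical eval set, owned and maintained by the product team.
EVAL_SET: List[EvalCase] = [
    ("What is the refund window?", "30 days"),
    ("When is support open?", "weekdays"),
]


def baseline(question: str) -> str:
    """Stand-in for the current system."""
    return "Please contact support."


def candidate(question: str) -> str:
    """Stand-in for the proposed change."""
    if "refund" in question.lower():
        return "Refunds are accepted within 30 days."
    return "Support is open on weekdays."
```

Running `pass_rate` on both versions against the same `EVAL_SET` gives you a number to compare before and after a change, which is what lets you say quality is improving instead of guessing.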

These aren’t vocabulary questions so much as product ones. MIT’s research suggests that is where the rub lies: an AI project with a vendor on board has about twice the success rate of a home-grown effort, because the partner makes you put these answers on the table before any code is written. For a view on how to decide what to build first, our write-up on how to prioritize product features is right on point for scoping an AI project.

If you are looking to put together an agent or assistant in the near term, our AI chatbot development guide and multimodal AI examples show how these terms become something you can ship. Or if you are still trying to map out the opportunity, the generative AI startups guide gets into the structural stuff that has to be sorted out first.

A cheat sheet is not for winning arguments. Its purpose is to get you past the polite nodding and on to the questions that make progress. Should you want help making a build plan out of all this, Refact’s product design and discovery is meant to settle those early scoping issues, and our AI development practice takes it from there.


## Commonly asked questions


### What is the difference between AI, machine learning, and deep learning?

They are nested, not separate. AI is the umbrella for software that performs tasks usually requiring human judgment. Machine learning is one approach to building AI, where the system learns patterns from data. Deep learning is a subset of machine learning that uses layered neural networks. Most modern generative systems are deep learning models.

### What is RAG and when should we use it instead of fine-tuning?

RAG, or Retrieval-Augmented Generation, lets a model answer using your documents by retrieving relevant content before generating a reply. Use RAG when the knowledge changes often or must stay tied to approved content. Use fine-tuning when you want consistent style, tone, or output structure. Fine-tuning does not reliably teach the model new facts.

### What is MCP (Model Context Protocol)?

MCP is a vendor-neutral protocol introduced by Anthropic in 2024 for connecting AI agents to tools, data sources, and file systems through a shared schema. It is often described as USB-C for AI because it standardizes how an agent plugs into external systems. It matters mainly to reduce lock-in on the integration layer between models and tools.

### Is GPT an LLM or a transformer?

It is a transformer-based LLM. Transformer is the architecture, the underlying math. Large language model describes what you get when you train a transformer on huge text datasets. GPT is a specific family of transformer-based LLMs from OpenAI. Listing them as alternatives is a common glossary mistake.

### What is a hallucination and can it be turned off?

A hallucination is a plausible-sounding but factually wrong output. It cannot be turned off because it is a property of probabilistic generation, not a bug. You manage it through retrieval, citations, narrow scope, and human review. Some practitioners prefer terms like fabrication or confidently wrong output because hallucination makes the behavior sound rarer than it is.

### Does my product need an agent or just a chatbot?

If users want answers, a chatbot is usually enough. If users want actions taken on their behalf across systems, you are looking at an agent. Agents introduce a different risk profile because they can do things, not just say things, so the work shifts toward defining what the system is allowed to do and who reviews its actions.
