If these words all sound the same
LLM, model, token, prompt, context, agent, MCP, harness. If you’ve heard these in a meeting and quietly pretended to follow along, this session is for you.
The AI vocabulary around software grew fast, but the problem isn’t the number of terms. It’s that people use the same words for different things. A PM says “model” meaning the product (ChatGPT). A dev says “model” meaning the neural network (GPT-4o). Here’s the minimum you need to join those conversations without getting lost.
The terms that matter
AI fundamentals
LLM
Simple: A program trained to read and write text. You send a question, it generates an answer.
Technical: Large Language Model. A neural network with billions of parameters trained on large text corpora. Generates output token by token based on conditional probability.
Example: When you ask ChatGPT to explain a bug, the LLM behind it processes your text and generates the answer piece by piece.
Common mistake: Treating LLM as a synonym for ChatGPT. ChatGPT is a product; GPT-4o is the LLM inside it.
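The "token by token based on conditional probability" idea can be sketched with a toy next-token table. This is an illustration only, not a real LLM: the probabilities are hand-written and decoding is greedy.

```typescript
// Toy illustration (not a real LLM): pick the most likely next token
// from a tiny hand-written conditional probability table.
const nextTokenProbs: Record<string, Record<string, number>> = {
  "the": { "cat": 0.6, "dog": 0.3, "bug": 0.1 },
  "cat": { "sat": 0.7, "ran": 0.3 },
  "sat": { ".": 1.0 },
};

function generate(start: string, maxTokens: number): string[] {
  const output = [start];
  for (let i = 0; i < maxTokens; i++) {
    const probs = nextTokenProbs[output[output.length - 1]];
    if (!probs) break; // no known continuation: stop
    // Greedy decoding: always take the highest-probability token.
    const next = Object.entries(probs).sort((a, b) => b[1] - a[1])[0][0];
    output.push(next);
  }
  return output;
}

console.log(generate("the", 5).join(" ")); // "the cat sat ."
```

Real models do the same loop with a learned probability distribution over tens of thousands of tokens, and usually sample instead of always picking the top choice.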
Model
Simple: The trained “brain” that processes text. Different models have different capabilities, speeds, and costs.
Technical: The weights and architecture produced by training. Each model has its own profile for reasoning, speed, context window, and cost.
Example: Claude Sonnet is a model. Claude is the product. You interact with the product, but the model generates the response.
Common mistake: Confusing model and product. When someone says “let’s use Claude”, ask whether they mean the chat, the API, or a specific model.
Inference
Simple: Asking the model to generate a response. Every message you send triggers an inference.
Technical: Running the model over an input prompt to produce a completion. It consumes compute and is usually billed per token.
Example: You ask “how does async/await work in TypeScript?” and get an explanation. That generation step is inference.
Common mistake: Assuming inference is instant and free. It costs time and money. Larger models cost more.
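"Billed per token" is easy to make concrete. The prices below are hypothetical placeholders; check your provider's actual rates.

```typescript
// Hypothetical pricing (placeholders, not real rates):
// dollars per 1M input tokens and per 1M output tokens.
const PRICE_PER_M_INPUT = 3.0;
const PRICE_PER_M_OUTPUT = 15.0;

function inferenceCost(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * PRICE_PER_M_INPUT +
    (outputTokens / 1_000_000) * PRICE_PER_M_OUTPUT
  );
}

// A 2,000-token prompt with a 500-token answer:
console.log(inferenceCost(2_000, 500).toFixed(4)); // "0.0135"
```

Fractions of a cent per call, but multiplied by thousands of calls a day it becomes a real line item, which is why teams care which model handles which task.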
Token
Simple: A chunk of text the model processes: a word, part of a word, or a special character.
Technical: The atomic processing unit of an LLM. Text is split by a tokenizer. Roughly 1 token ≈ 4 characters in English.
Example: A 500-line code file can be thousands of tokens. This matters because every model has token limits.
Common mistake: Ignoring token count. If you paste a huge file and the model “forgets” the start, you’ve probably hit the context window.
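The "1 token ≈ 4 characters" rule of thumb above gives a quick back-of-the-envelope estimate. Real tokenizers vary by model and language, so treat this as a rough heuristic, not a measurement.

```typescript
// Rough heuristic from the rule of thumb above: 1 token ≈ 4 characters
// of English. Real tokenizers vary by model and language.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

const file = "x".repeat(20_000); // roughly a 500-line file
console.log(estimateTokens(file)); // 5000
```

Five thousand tokens for one file: paste a few of those plus chat history and you can see why context windows fill up.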
Context
Simple: Everything the model can see when generating an answer: your message, chat history, system instructions, attached files.
Technical: The full token sequence given as input, including system prompt, conversation history, and current message. Bounded by the context window.
Example: If you ask for a code review with three files attached, the context is your request plus those files. If it doesn’t fit, something gets cut.
Common mistake: Thinking the model remembers past conversations. It only knows what’s in the current context.
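What "something gets cut" looks like in practice can be sketched as a trimming step. The types and function names here are made up for illustration; real tools have their own strategies (summarizing, dropping middles, etc.).

```typescript
// Sketch: the context is just the sequence we send. If the history
// exceeds the window, a simple strategy drops the oldest messages first.
// Names and the trimming policy are illustrative, not any tool's API.
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

function fitContext(
  system: Message,
  history: Message[],
  limit: number,
  countTokens: (m: Message) => number,
): Message[] {
  let budget = limit - countTokens(system); // system prompt always stays
  const kept: Message[] = [];
  // Walk backwards: newest messages are kept, oldest get cut.
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = countTokens(history[i]);
    if (cost > budget) break;
    budget -= cost;
    kept.unshift(history[i]);
  }
  return [system, ...kept];
}
```

This is also why the model "forgets": the early messages were never sent in the current request, so from the model's point of view they never existed.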
Working with AI
Prompt
Simple: The text you send to the model: question, request, instruction, or all three.
Technical: Textual input provided to generate a completion. Can include instructions, examples, constraints, and output format.
Example: Bad prompt: “make a component”. Better: “create a React button with primary/secondary variants, CSS Modules, a disabled prop, and tests”.
Common mistake: Thinking the prompt is just “the question”. It also includes context, constraints, examples, and format.
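The "instructions, examples, constraints, and output format" parts can be assembled explicitly. This builder is a hypothetical sketch of that structure, not a standard API.

```typescript
// Hypothetical prompt builder: a prompt is more than the question.
// Task, constraints, and output format all ride along as plain text.
function buildPrompt(opts: {
  task: string;
  constraints?: string[];
  outputFormat?: string;
}): string {
  const parts = [opts.task];
  if (opts.constraints && opts.constraints.length > 0) {
    parts.push("Constraints:\n" + opts.constraints.map((c) => `- ${c}`).join("\n"));
  }
  if (opts.outputFormat) {
    parts.push(`Output format: ${opts.outputFormat}`);
  }
  return parts.join("\n\n");
}

console.log(
  buildPrompt({
    task: "Create a React button component.",
    constraints: ["primary/secondary variants", "CSS Modules", "disabled prop"],
    outputFormat: "a single .tsx file plus tests",
  }),
);
```

The "better" prompt from the example above is exactly this structure written by hand: a task plus explicit constraints and an expected output shape.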
System prompt
Simple: Initial instructions that shape how the model behaves, like a briefing before the conversation starts.
Technical: A message with the system role that defines behavior, tone, constraints, and capabilities. Takes priority over user messages.
Example: “You’re a senior code reviewer. Point out security issues first.” That shapes every response.
Common mistake: Skipping system instructions and then blaming the model for not understanding the job.
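In chat-style APIs, the system prompt travels as a message with the `system` role alongside user messages. The shape below is illustrative; field names vary slightly by provider.

```typescript
// How system and user prompts typically travel in a chat-style API.
// The exact request shape varies by provider; this shows the roles.
const messages = [
  {
    role: "system",
    content: "You're a senior code reviewer. Point out security issues first.",
  },
  {
    role: "user",
    content: "Review this function:\nfunction login(u, p) { /* ... */ }",
  },
];
// The system message shapes every response in the conversation;
// the user message is just the prompt for this turn.
```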
Tools and interfaces
IDE
Simple: The code editor where you write software. VS Code, Cursor, and Windsurf are examples.
Technical: Integrated Development Environment. In the AI context, modern IDEs embed LLMs directly into editing and code actions.
Example: Cursor has AI built in. VS Code with Copilot becomes an AI IDE through an extension.
Common mistake: Assuming an AI IDE is automatically better than a terminal tool. They’re different interfaces for different workflows.
CLI
Simple: Command-line interface. You type in the terminal, get text back. Claude Code runs this way.
Technical: A text interface where you interact through commands. AI CLIs can read/edit files, run tests, and use system tools.
Example: Instead of clicking menus, you ask a terminal agent to review src/api.ts. It reads the file and responds inline.
Common mistake: Thinking CLI means “only for hardcore devs”. Modern AI CLIs are conversational and approachable.
Autocomplete
Simple: AI suggests the next piece of code while you type. Like phone autocomplete, but for code.
Technical: Code completion powered by LLMs, using the current file and open files to predict useful code in real time.
Example: You start typing function validate and the tool suggests the full body based on surrounding code.
Common mistake: Accepting every suggestion without reading it. Autocomplete predicts what is statistically likely, not what is necessarily correct.
Code context
Simple: Project information AI uses to understand your work: open files, folder structure, types, dependencies.
Technical: Repository information included in the model’s context: file tree, relevant contents, imports, types, tests, configs.
Example: Cursor uses imports and nearby files to suggest code that fits your project better than a generic chat answer.
Common mistake: Assuming AI knows the whole project automatically. Every tool has context limits.
Agents and systems
Agent
Simple: A program that uses AI to make decisions and take actions, not just answer questions.
Technical: A system combining an LLM, tools, and control logic to execute tasks autonomously. It plans, acts, observes, and iterates.
Example: You ask for a refactor. The agent reads files, edits code, runs tests, and reports back.
Common mistake: Confusing chatbot and agent. A chatbot responds; an agent does work through tools.
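The "plans, acts, observes, and iterates" loop can be sketched in a few lines. Everything here is a stand-in: `callModel` and the tool set are hypothetical, not any real agent framework's API.

```typescript
// Minimal agent loop sketch: plan -> act -> observe -> iterate.
// `callModel` and the tools are stand-ins for illustration only.
type Tool = (input: string) => string;

interface Decision {
  tool?: string;   // which tool to call next, if any
  input?: string;  // input for that tool
  answer?: string; // final answer, if the agent is done
}

function runAgent(
  task: string,
  tools: Record<string, Tool>,
  callModel: (transcript: string) => Decision,
  maxSteps = 5,
): string {
  let transcript = `Task: ${task}`;
  for (let step = 0; step < maxSteps; step++) {
    const decision = callModel(transcript); // the LLM decides the next action
    if (decision.answer !== undefined) return decision.answer; // done
    const tool = tools[decision.tool ?? ""];
    const observation = tool ? tool(decision.input ?? "") : "unknown tool";
    // Feed the observation back so the next decision sees it.
    transcript += `\n${decision.tool}(${decision.input}) -> ${observation}`;
  }
  return "step limit reached";
}
```

The step limit is the safety rail: without it, an agent that keeps choosing tools would loop forever, which is one reason real harnesses add checks around this loop.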
Coding agent
Simple: An agent specialized in writing, editing, and reviewing code, with direct access to your project.
Technical: An AI agent with developer tools: file read/write, terminal commands, repo navigation, test execution.
Example: Claude Code and Codex CLI are coding agents. Cursor’s agent mode works the same way.
Common mistake: Giving full autonomy without review. Output still needs validation before reaching production.
MCP
Simple: A standard that lets AI tools connect to external services cleanly. Like USB for AI.
Technical: Model Context Protocol. An open protocol for connecting AI apps to data sources and tools through shared server interfaces.
Example: An MCP server for GitHub lets any compatible tool access issues, PRs, and repos without building its own integration.
Common mistake: Treating MCP as fully settled. It’s useful, but the ecosystem is still evolving.
Harness
Simple: The complete work system around an AI agent: instructions, tools, rules, context. The environment where it operates.
Technical: The orchestration layer around a coding agent: system prompt, tools, project rules, persistent context, automated checks, acceptance criteria.
Example: A harness might include AGENTS.md (CLAUDE.md in Claude Code), pre-commit checks, MCP tools, and a prompt defining project rules.
Common mistake: Treating harness as something you install. It’s a practice you build over time.
Why this matters
Shared vocabulary isn’t decoration. It’s communication infrastructure.
When a PM says “use the faster model”, they might mean cheaper, lower latency, or the one they saw on social media. Without shared vocabulary, everyone interprets differently and the result is rework.
- Multidisciplinary teams decide on AI together. Dev, QA, PM, and design all need enough vocabulary to participate.
- Tools change fast. What started as autocomplete became agents; stale vocabulary leads to stale decisions.
- Vendors use jargon on purpose. If you don’t understand the terms, you can’t evaluate what they’re selling.
Real example
A real meeting between PM, QA, and dev:
PM: “We need AI to speed up testing.”
QA: “What kind? Generate cases? Run automated tests? Analyze failures?”
Dev: “We can use a coding agent to generate tests from the spec, but it needs good code context. If the project has weak types, the model will guess.”
PM: “So the model needs to see our code?”
Dev: “Yes. The code becomes context. The better organized the project, the better the output. But there are token limits, so we can’t send everything at once.”
QA: “And if it generates tests that pass but test nothing?”
Dev: “Exactly. Inference gives us a draft. Validation is still our job.”
Where this breaks
- Jargon as a wall: Technical terms used to impress instead of communicate create barriers, not vocabulary.
- Definitions too rigid: These terms are still moving. Don’t treat this glossary as law.
- Memorizing without understanding: Knowing the technical definition doesn’t help unless you grasp why it matters in practice.
Takeaway
- Shared vocabulary is a prerequisite for AI collaboration, not a nice-to-have
- Match the explanation to the audience: simple for PMs/stakeholders, technical for devs
- Product and model are different things. Separating them avoids confusion about cost and capability
- Context is limited; the model only knows what fits in the window
- Agent isn’t chatbot; the difference is autonomy and tool access
- Revisit these terms regularly. Meanings evolve fast