Glossary

AI terms, clearly explained.

A practical glossary for teams evaluating AI agents, RAG systems, and LLM integrations.

AI Agent

An autonomous system that receives a goal, plans steps, executes actions through external tools, and adapts when something fails. Unlike chatbots that answer questions, agents complete tasks — querying databases, sending emails, updating CRMs, making decisions.
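The goal, plan, act, adapt cycle can be sketched as a small loop. This is a minimal illustration with stubbed tools and a hard-coded plan, not a real LLM-driven agent; the tool names and failure handling are assumptions for the example.

```python
# Minimal agent loop sketch: execute a plan step by step through tools,
# and adapt (record and continue) when a step fails instead of crashing.

def send_email(to: str) -> str:
    return f"email sent to {to}"  # stub; a real tool would hit an email API

def query_db(table: str) -> str:
    if table == "missing":
        raise KeyError(table)
    return f"3 rows from {table}"  # stub; a real tool would run SQL

TOOLS = {"send_email": send_email, "query_db": query_db}

def run_agent(goal: str, plan: list) -> list:
    """Execute each planned (tool, argument) step, logging outcomes."""
    log = []
    for tool_name, arg in plan:
        try:
            log.append(("ok", TOOLS[tool_name](arg)))
        except Exception as exc:
            # Adapt: note the failure so the planner can retry or reroute.
            log.append(("retry_needed", f"{tool_name} failed: {exc}"))
    return log

trace = run_agent("notify customers",
                  [("query_db", "customers"), ("send_email", "ops@example.com")])
```

In a production agent, the plan itself would come from the model and be revised after each step; the loop structure stays the same.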

RAG (Retrieval-Augmented Generation)

A pattern that lets AI models answer accurately about your private data. Instead of training a model, you search for relevant documents at query time and inject them into the prompt. This gives sourced, accurate answers without expensive fine-tuning.
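The retrieve-then-inject pattern can be shown in a few lines. Retrieval here is naive keyword overlap purely for illustration; a real RAG system would use embeddings and a vector database, and the documents are made up.

```python
# RAG sketch: find relevant documents at query time and inject them
# into the prompt, so the model answers from your data, not its memory.

DOCS = [
    "Refunds are processed within 5 business days.",
    "Support is available Monday through Friday.",
    "Shipping to Europe takes 7 to 10 days.",
]

def retrieve(query: str, docs: list, k: int = 1) -> list:
    """Rank docs by shared words with the query (stand-in for vector search)."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query, DOCS))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("How long do refunds take?")
```

The prompt that reaches the model now carries the refund policy verbatim, which is what makes the answer sourced rather than guessed.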

Embeddings

Numerical representations of text (or images) as high-dimensional vectors. Similar content produces similar vectors, enabling semantic search — finding documents by meaning rather than exact keyword matches. Generated by embedding models from providers such as OpenAI or Cohere.
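"Similar content produces similar vectors" is usually measured with cosine similarity. The 3-dimensional vectors below are toy values for illustration; real embeddings have hundreds or thousands of dimensions.

```python
import math

# Cosine similarity between two embedding vectors: values near 1.0
# mean the underlying texts are semantically close.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

dog = [0.9, 0.1, 0.2]      # toy embedding of "dog"
puppy = [0.85, 0.15, 0.25] # toy embedding of "puppy" (points the same way)
invoice = [0.05, 0.9, 0.1] # toy embedding of "invoice" (points elsewhere)

# "dog" is far closer to "puppy" than to "invoice":
assert cosine_similarity(dog, puppy) > cosine_similarity(dog, invoice)
```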

Vector Database

A database optimized for storing and searching embeddings. Examples: Pinecone, Weaviate, pgvector, Chroma. Essential for RAG systems — you store document embeddings and query them with the user's question embedding to find relevant content.
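What a vector database does can be sketched as an in-memory store with brute-force nearest-neighbor search. Real systems like Pinecone or pgvector add approximate indexes, filtering, and persistence; the document IDs and vectors below are invented for the example.

```python
import math

# Tiny in-memory vector store: add (id, vector) pairs, then query with
# the embedding of the user's question to find the closest document.

class TinyVectorStore:
    def __init__(self):
        self.items = []  # list of (doc_id, vector)

    def add(self, doc_id, vector):
        self.items.append((doc_id, vector))

    def query(self, vector, k=1):
        def sim(a, b):  # cosine similarity
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(y * y for y in b))
            return dot / (na * nb)
        ranked = sorted(self.items, key=lambda it: sim(vector, it[1]),
                        reverse=True)
        return [doc_id for doc_id, _ in ranked[:k]]

store = TinyVectorStore()
store.add("pricing.md", [0.1, 0.9])
store.add("onboarding.md", [0.8, 0.2])
best = store.query([0.75, 0.3])  # toy embedding of the user's question
```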

Tool Use (Function Calling)

The ability of an AI model to invoke external functions during a conversation. You define tools as JSON schemas, and the model decides when to call them. This turns AI from a text generator into an active component that can query APIs, run calculations, or trigger actions.
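The define-schema, model-chooses, you-execute flow looks like this. The model's tool choice is hard-coded below because there is no live API in the example; in production it comes back in the chat completion response, and the `get_weather` tool is a made-up stub.

```python
import json

# Tool-use sketch: a tool declared as a JSON schema, plus a dispatcher
# that executes whichever call the model decided to make.

TOOL_SCHEMAS = [{
    "name": "get_weather",
    "description": "Get current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

def get_weather(city: str) -> str:
    return f"18°C and cloudy in {city}"  # stub; a real tool calls a weather API

REGISTRY = {"get_weather": get_weather}

# Simulated model response requesting a tool call:
model_call = {"name": "get_weather", "arguments": json.dumps({"city": "Madrid"})}

result = REGISTRY[model_call["name"]](**json.loads(model_call["arguments"]))
```

The result is then sent back to the model as a tool message so it can compose its final answer.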

Structured Outputs

Forcing an AI model to respond in a specific JSON schema instead of free text. Eliminates fragile parsing and makes AI responses predictable for automated pipelines. In practice, parsing reliability jumps from roughly 90% with free-text prompts to near 99% with strict schemas.
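The "predictable for pipelines" part comes from validating the reply against an expected shape before anything downstream touches it. The `sentiment`/`confidence` fields below are an invented example schema, and the validation is hand-rolled for brevity.

```python
import json

# Structured-output sketch: parse and validate a model reply against the
# expected fields, instead of regex-scraping free text.

EXPECTED_FIELDS = {"sentiment": str, "confidence": float}

def parse_reply(raw: str) -> dict:
    data = json.loads(raw)  # raises ValueError on malformed JSON
    for field, ftype in EXPECTED_FIELDS.items():
        if not isinstance(data.get(field), ftype):
            raise ValueError(f"bad or missing field: {field}")
    return data

reply = parse_reply('{"sentiment": "positive", "confidence": 0.93}')
```

A reply that fails validation is rejected and retried rather than silently corrupting the pipeline.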

Prompt Engineering

The practice of crafting instructions for AI models to get reliable, accurate outputs. Includes system prompts, few-shot examples, chain-of-thought reasoning, and constraint specification. Good prompt engineering is the difference between a demo and a production system.
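A few-shot prompt is just careful string assembly: a system instruction, labeled examples, then the new input. The ticket-classification task and labels below are invented for illustration.

```python
# Few-shot prompt construction sketch: system instruction + labeled
# examples + the new input, assembled into one prompt string.

SYSTEM = ("Classify each support ticket as 'billing' or 'technical'. "
          "Reply with one word.")

EXAMPLES = [
    ("I was charged twice this month", "billing"),
    ("The app crashes when I log in", "technical"),
]

def build_prompt(ticket: str) -> str:
    shots = "\n".join(f"Ticket: {t}\nLabel: {label}" for t, label in EXAMPLES)
    return f"{SYSTEM}\n\n{shots}\n\nTicket: {ticket}\nLabel:"

prompt = build_prompt("My invoice shows the wrong amount")
```

Ending the prompt with `Label:` constrains the model to complete with just the answer, one of the small tricks that separates reliable prompts from flaky ones.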

Chunking

The process of splitting documents into smaller pieces for RAG systems. Naive chunking (fixed token count) loses context. Smart chunking respects document structure — headings, paragraphs, code blocks — and carries metadata from the parent document.
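Structure-aware chunking can be sketched for markdown: split on headings and carry each heading along as metadata, rather than cutting at a fixed token count. The sample document is invented.

```python
# Structure-aware chunking sketch: split a markdown document at headings
# and attach the heading as metadata on each chunk.

def chunk_by_heading(doc: str) -> list:
    chunks, heading, lines = [], "intro", []
    for line in doc.splitlines():
        if line.startswith("#"):
            if lines:  # close out the previous section
                chunks.append({"heading": heading, "text": "\n".join(lines)})
            heading, lines = line.lstrip("# "), []
        elif line.strip():
            lines.append(line)
    if lines:
        chunks.append({"heading": heading, "text": "\n".join(lines)})
    return chunks

doc = "# Pricing\nPlans start at $10.\n# Support\nEmail us anytime."
chunks = chunk_by_heading(doc)
```

Each chunk now retrieves with its heading attached, so "Plans start at $10." is findable under "Pricing" instead of floating context-free.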

Re-ranking

A second-pass relevance scoring after initial retrieval. A cross-encoder model evaluates each query-document pair for actual relevance, not just vector similarity. Dramatically improves precision in RAG systems, especially for ambiguous queries.
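The two-stage shape, retrieve candidates then re-score them, looks like this. The scorer below is simple word overlap standing in for a real cross-encoder model, and the candidate documents are made up.

```python
# Re-ranking sketch: a second-pass scorer re-orders first-pass candidates.
# Word overlap here is a toy stand-in for a cross-encoder's relevance score.

def rerank(query: str, candidates: list, k: int = 2) -> list:
    q = set(query.lower().split())
    def score(doc: str) -> int:
        return len(q & set(doc.lower().split()))
    return sorted(candidates, key=score, reverse=True)[:k]

candidates = [  # imagine these came back from a vector search
    "Our office hours are 9 to 5.",
    "Reset your password from the login page.",
    "Password rules: 12 characters minimum.",
]
top = rerank("how do I reset my password", candidates, k=1)
```

The point is the pipeline shape: the first stage optimizes recall over millions of documents, the second stage optimizes precision over a handful.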

Human-in-the-Loop

A design pattern where an AI agent pauses and escalates to a human when confidence is low or the action has high impact. Critical for production agents handling financial, legal, or customer-facing processes. The agent knows when to ask for help.
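The escalation decision reduces to a small policy: act only when confidence is high and the action is low-impact. The action names and the 0.8 threshold below are illustrative assumptions, not recommendations.

```python
# Human-in-the-loop sketch: escalate when the action is high-impact
# or the agent's confidence is below a threshold.

HIGH_IMPACT = {"issue_refund", "delete_account"}

def decide(action: str, confidence: float) -> str:
    if action in HIGH_IMPACT or confidence < 0.8:
        return "escalate_to_human"
    return "execute"

# High confidence, low impact: the agent proceeds on its own.
assert decide("send_receipt", 0.95) == "execute"
# High impact always goes to a human, regardless of confidence.
assert decide("issue_refund", 0.99) == "escalate_to_human"
# Low confidence goes to a human even for routine actions.
assert decide("send_receipt", 0.6) == "escalate_to_human"
```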

Multi-Agent System

An architecture where multiple specialized AI agents coordinate to complete complex tasks. Each agent has a defined role, budget, and chain of command. Agents check out tasks, communicate via comments, and escalate blockers — similar to a human engineering team.
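The task check-out and comment mechanics can be sketched with a shared task list. The roles, task IDs, and comment text are invented; real systems add locking, budgets, and an actual message bus.

```python
# Multi-agent coordination sketch: agents claim open tasks matching their
# role, leave comments for other agents, and mark work done.

tasks = [
    {"id": 1, "role": "researcher", "status": "open", "comments": []},
    {"id": 2, "role": "writer", "status": "open", "comments": []},
]

def check_out(role: str):
    """Claim the first open task assigned to this role, if any."""
    for task in tasks:
        if task["role"] == role and task["status"] == "open":
            task["status"] = "in_progress"
            return task
    return None

task = check_out("researcher")
task["comments"].append("researcher: found 3 relevant sources")
task["status"] = "done"
```

Because status changes are visible to every agent, the writer can see the researcher's comment before starting, the same hand-off a human team does in a ticket tracker.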

Hallucination

When an AI model generates confident-sounding but factually incorrect information. RAG systems reduce hallucinations by grounding responses in real documents. Good systems also detect when they don't have enough information and say so explicitly.
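The "say so explicitly" behavior can be sketched as a grounding guard: if retrieval finds nothing relevant, refuse rather than let the model guess. Retrieval is stubbed as keyword overlap and the documents are invented.

```python
# Grounding guard sketch: answer only when retrieval finds supporting
# text; otherwise admit the gap instead of hallucinating.

def answer(query: str, docs: list) -> str:
    q = set(query.lower().split())
    relevant = [d for d in docs if q & set(d.lower().split())]
    if not relevant:
        return "I don't have enough information to answer that."
    return f"Based on our docs: {relevant[0]}"

docs = ["Refunds are processed within 5 business days."]
grounded = answer("when are refunds processed", docs)
ungrounded = answer("what colour is the CEO's car", docs)
```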

Ready to build?

Now that you know the terms, let's talk about your project.

Book a discovery call