I wrote the book on developer marketing. Literally. Picks and Shovels hit #1 on Amazon.


Retrieval-augmented generation

RAG (rhymes with bag)

Fetching relevant data and feeding it to an LLM so the response is grounded in real, current information instead of training data alone.

RAG is a pattern where you fetch relevant documents first, then pass them to an LLM along with the user's question. The model generates its answer based on the retrieved content, not just its training data. This solves two problems: the model stays current (training data has a cutoff), and the model stays accurate (it answers from your actual documents, not its memory, reducing hallucinations).

The typical RAG pipeline works like this. A user asks a question. Your system converts the question into an embedding. It searches a vector database for the most similar documents. It passes those documents plus the question to the LLM. The LLM generates an answer grounded in the retrieved content.
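The pipeline above can be sketched in a few lines. This is a toy, assuming a bag-of-words "embedding" and an in-memory document list; a production system would call a real embedding model and a vector database, and would send the final prompt to an LLM.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: word counts. Real systems call an embedding model here.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    # Vector-search step: rank documents by similarity to the question.
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "Row Level Security restricts which rows a user can read or write.",
    "Edge Functions let you run server-side code close to users.",
    "To enable Row Level Security, alter the table and add a policy.",
]
question = "How do I set up Row Level Security?"
context = retrieve(question, docs)
prompt = "Answer using only this context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"
# `prompt` is what gets sent to the LLM; the answer is grounded in `context`.
```

The retrieval step returns the two Row Level Security documents and leaves the irrelevant Edge Functions page out, which is exactly the grounding effect the pipeline is for.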

Every company building an AI-powered support bot, documentation assistant, or internal knowledge tool is using some version of RAG. Vercel's AI SDK provides embedding primitives for building RAG. LangChain and LlamaIndex are frameworks designed specifically for RAG pipelines. If your documentation is well-structured, RAG makes your product easier for AI to recommend accurately.

Examples

An AI-powered documentation assistant.

Supabase built an AI assistant that answers questions about their platform. When a developer asks "How do I set up Row Level Security?", the system retrieves the relevant docs pages, passes them to the LLM, and generates a step-by-step answer with links to the original documentation.
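A docs assistant like this hinges on how the retrieved pages are packed into the prompt. A minimal sketch, assuming hypothetical page titles, URLs, and content (these are illustrative placeholders, not Supabase's actual retrieval output):

```python
# Hypothetical retrieved docs pages; titles, URLs, and content are placeholders.
pages = [
    {
        "title": "Row Level Security",
        "url": "https://example.com/docs/row-level-security",
        "content": "Enable RLS on the table, then add policies with CREATE POLICY.",
    },
    {
        "title": "Policies",
        "url": "https://example.com/docs/policies",
        "content": "A policy defines which rows a role can select, insert, or update.",
    },
]

question = "How do I set up Row Level Security?"

# Each page is labeled with its source link so the model can cite it.
context = "\n\n".join(f"Source: {p['title']} <{p['url']}>\n{p['content']}" for p in pages)
prompt = (
    "Answer the question step by step using only the documentation below, "
    "and link to the sources you used.\n\n"
    f"{context}\n\nQuestion: {question}"
)
```

Including the source URL next to each snippet is what lets the generated answer link back to the original documentation instead of citing from memory.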

A sales team knowledge base.

A sales rep asks the internal AI: "What is our competitive positioning against Datadog?" RAG retrieves the latest battle card, recent win/loss reports, and pricing comparisons. The LLM synthesizes a concise answer from current internal documents, not outdated training data.

A customer support chatbot with current data.

Without RAG, the bot only knows what was in its training data. With RAG, it retrieves the customer's account info, recent support tickets, and current product docs before answering. The response is specific, current, and grounded in real data.
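Retrieval here mixes structured records with prose, so the context has to carry both. A minimal sketch, assuming made-up account fields, ticket text, and doc snippets (all illustrative):

```python
import json

# Illustrative customer state; field names and values are assumptions.
account = {"plan": "Pro", "region": "us-east-1"}
tickets = ["#1042: webhook retries failing (open)"]
doc_snippets = ["Webhooks retry three times with exponential backoff."]

question = "Why are my webhooks failing?"

# Structured data is serialized alongside retrieved prose so the model
# answers from the customer's actual state, not its training data.
context_blocks = [
    "Account: " + json.dumps(account),
    "Recent tickets:\n" + "\n".join(tickets),
    "Docs:\n" + "\n".join(doc_snippets),
]
prompt = "\n\n".join(context_blocks) + f"\n\nQuestion: {question}"
```

The same assembly step works whatever the sources are; the key design choice is fetching fresh account and ticket data at question time rather than baking it into the model.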


Frequently asked questions

When should I use RAG versus fine-tuning?

Use RAG when you need current, factual answers from specific documents. Use fine-tuning when you need the model to learn a new style, format, or domain-specific behavior. RAG is for knowledge. Fine-tuning is for skill. Many production systems use both.
