Foundation model
fown-DAY-shun MOD-ul
A large AI model trained on broad data that can be adapted for many tasks. It is the base layer that companies like OpenAI and Anthropic build.
A foundation model is a large model trained on a broad dataset, then adapted for specific tasks. Think of it as a general-purpose brain. OpenAI builds GPT. Anthropic builds Claude. Google builds Gemini. Meta builds Llama. These are foundation models. Other companies take them and build products on top.
The word "foundation" is deliberate. You do not build a foundation model to do one thing. You build it to do many things. Then you specialize it through fine-tuning, prompt engineering, or retrieval-augmented generation. A single foundation model can power a coding assistant, a customer support bot, a content generator, and a data analyst.
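To make the "one model, many products" idea concrete, here is a minimal sketch of specialization through prompts alone. Everything in it is an assumption made for illustration: `fake_model` is a hypothetical stand-in for a real foundation model API call, and the `Product` class and prompt texts are invented. It only shows the shape of the pattern, not any provider's actual interface.

```python
from dataclasses import dataclass
from typing import Callable


def fake_model(system_prompt: str, user_message: str) -> str:
    """Hypothetical stand-in for a foundation model call.

    A real product would send both strings to a provider's API.
    Here we just echo them so the example runs offline.
    """
    return f"[{system_prompt}] {user_message}"


@dataclass
class Product:
    """A thin product layer: same underlying model, different instructions."""
    name: str
    system_prompt: str
    model: Callable[[str, str], str]

    def ask(self, user_message: str) -> str:
        # The only thing that differs between products is the system prompt.
        return self.model(self.system_prompt, user_message)


# Two products that share one (stand-in) foundation model.
coding_assistant = Product(
    "coding assistant", "You are a careful code reviewer.", fake_model
)
support_bot = Product(
    "support bot", "You answer customer questions politely.", fake_model
)

print(coding_assistant.ask("Why does this loop never end?"))
print(support_bot.ask("How do I reset my password?"))
```

Both products call the same function. Swapping `fake_model` for a real API client would change the quality of the answers, not the structure of the product layer.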
For marketers, this matters because the foundation model layer is where brand awareness now lives. When a developer asks Claude or GPT for a recommendation, the answer is shaped largely by what the model saw in its training data. Your product documentation, blog posts, and community content are the inputs. The model's response is the output.
Examples
A startup builds on top of a foundation model.
Cursor took foundation models from OpenAI and Anthropic and built a code editor around them. They did not train their own model from scratch. They built a product layer on top of existing foundation models.
Comparing foundation model providers.
OpenAI's GPT-4o lists at $2.50 per million input tokens, Anthropic's Claude 3.5 Sonnet at $3.00, and Google's Gemini 1.5 Pro at $1.25. Pricing, performance, and safety profiles vary, so many companies use more than one provider.
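The per-million-token prices quoted above translate into a bill with simple arithmetic. The sketch below uses only those three figures. The monthly volume of 40 million input tokens is a hypothetical number chosen for illustration, and output-token pricing, which providers bill separately, is left out.

```python
# Input prices in USD per million tokens, as quoted in this entry.
PRICE_PER_MILLION_INPUT_TOKENS = {
    "GPT-4o": 2.50,
    "Claude 3.5 Sonnet": 3.00,
    "Gemini 1.5 Pro": 1.25,
}


def input_cost_usd(model: str, input_tokens: int) -> float:
    """Cost of sending `input_tokens` tokens to `model`, input side only."""
    return PRICE_PER_MILLION_INPUT_TOKENS[model] * input_tokens / 1_000_000


# Hypothetical workload: 40 million input tokens per month.
monthly_input_tokens = 40_000_000

for model in PRICE_PER_MILLION_INPUT_TOKENS:
    cost = input_cost_usd(model, monthly_input_tokens)
    print(f"{model}: ${cost:.2f}")
# GPT-4o: $100.00
# Claude 3.5 Sonnet: $120.00
# Gemini 1.5 Pro: $50.00
```

At this volume the spread between the cheapest and the most expensive option is $70 a month, which is one reason teams route different workloads to different providers.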
Open-source versus proprietary foundation models.
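Meta released Llama 3 as an open-weights model under its own community license, which is commonly described as open source. Companies can download it, run it on their own infrastructure, and fine-tune it without sending data to a third party. Proprietary models from OpenAI and Anthropic often lead on performance, but they are accessed through an API rather than run on your own hardware.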
Related terms
Large language model (LLM): A neural network trained on massive text data to generate and understand language. The technology behind ChatGPT, Claude, and Gemini.
Fine-tuning: Training an existing model on your specific data to improve its performance on your tasks. Customization without building from scratch.
Prompt engineering: Writing instructions that get the best output from an AI model. The difference between a useless response and a useful one.
Embeddings: Numerical representations of text that capture semantic meaning. Two similar sentences produce similar numbers, enabling AI-powered search.
