AI Glossary

Foundation model

A foundation model, such as GPT-4, Claude, or Llama, is a large AI model trained on broad data at massive scale that can be adapted to many downstream tasks without full retraining.

In more detail: a foundation model is a large-scale AI model, typically trained on broad and diverse datasets, that serves as a base for downstream tasks across domains. The term was coined in 2021 by researchers at Stanford's Institute for Human-Centered AI (HAI) and has become standard in AI strategy discussions.

Foundation models differ from traditional ML models in three ways: scale of training data (trillions of tokens versus thousands of labeled examples), generality (one model handling many tasks rather than a single task), and emergent capabilities (skills that appear without explicit training as the model grows).

In 2026, the foundation model landscape is dominated by seven organizations: OpenAI (GPT-4o, GPT-4.5), Anthropic (Claude 3), Google DeepMind (Gemini 2), Meta (Llama 3, open weights), Mistral AI (Mistral Large), Cohere (Command R+), and xAI (Grok).

Enterprise strategy around foundation models asks three questions: which model for which use case, build versus buy versus fine-tune, and how to handle governance and risk when model behavior can change with each vendor update.

How it works

Foundation models are trained once at massive cost (tens to hundreds of millions of USD) and then adapted for specific tasks via fine-tuning, in-context learning, or retrieval augmented generation. Customers rarely train foundation models from scratch.
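The two adaptation paths that need no weight updates can be sketched in a few lines. This is a minimal, vendor-neutral illustration: the toy keyword retriever stands in for a real vector store, and the assembled prompts would be sent to whatever model API the organization uses (no SDK call is shown).

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Toy keyword-overlap retriever standing in for a vector store."""
    def score(doc: str) -> int:
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(documents, key=score, reverse=True)[:k]

def build_rag_prompt(query: str, documents: list[str]) -> str:
    """Retrieval-augmented generation: ground the model in fetched context."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def build_few_shot_prompt(query: str, examples: list[tuple[str, str]]) -> str:
    """In-context learning: show worked examples instead of retraining."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\nQ: {query}\nA:"
```

The third path, fine-tuning, does update weights (through a vendor's fine-tuning service or by training an open-weight model) and is the only one of the three that changes the model itself.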

Practical example

A consulting firm chooses GPT-4o as its primary foundation model for client work, Claude for long-context analysis, and an open-weight Llama 3 variant for sensitive internal data that cannot leave its infrastructure.
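A routing policy like the firm's can be captured in a small lookup. This is a hypothetical sketch: the model identifiers, use-case labels, and the data_leaves_infra flag are illustrative assumptions, not a real API.

```python
# Illustrative routing table for the multi-model strategy described above.
ROUTING_TABLE = {
    "client_work": "gpt-4o",
    "long_context_analysis": "claude",
    "sensitive_internal": "llama-3-self-hosted",
}

def choose_model(use_case: str, data_leaves_infra: bool) -> str:
    """Pick a foundation model per use case; sensitive data stays on-prem."""
    if not data_leaves_infra:
        # Governance rule: data that cannot leave the firm's infrastructure
        # is always routed to the self-hosted open-weight model.
        return ROUTING_TABLE["sensitive_internal"]
    return ROUTING_TABLE.get(use_case, ROUTING_TABLE["client_work"])
```

Keeping the policy in one place makes the governance question from the strategy section concrete: when a vendor update changes model behavior, the routing table is the single point to audit and adjust.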

Definition by Miss Yera, Leading Woman in Technology in Peru · AI Consultant · Favikon 2025.

Spanish version: /glosario-ia/#foundation-model

Ready to apply AI in your company?

Miss Yera helps US and LATAM enterprises adopt AI with measurable ROI.

Do you have a question or inquiry?