MICROSOFT FOUNDRY — AI APP & AGENT FACTORY

Build, Optimize
& Govern

Microsoft's unified AI platform — formerly Azure AI Foundry — for building, deploying, and managing AI apps and agents across 11,000+ models, with enterprise-grade observability, security, and governance.

11K+
MODELS
80K+
ENTERPRISES
80%
OF FORTUNE 500
3B+
QUERIES/DAY

Six Integrated Components

Everything you need to ship production AI — natively integrated, with one resource provider, one portal, and one governance layer.

FOUNDRY ARCHITECTURE

The Stack at a Glance

Foundry is a layered platform: models at the bottom, agents and tools in the middle, knowledge and governance wrapping everything. A single Azure resource provider unifies it all.

FOUNDRY RESOURCE · Microsoft.CognitiveServices
06
⚖️
Control Plane
Governance, RBAC, networking, policies, observability — wrapped around everything below.
Entra ID Purview Azure Monitor Content Safety
05
🤖
Agent Service · Workflows
Managed runtime for prompt, workflow, and hosted agents. Handles scaling, identity, and observability.
Agent Framework LangGraph Multi-agent A2A · MCP
04
🧠
Foundry IQ · Knowledge Layer
Agentic retrieval over enterprise data. Permission-aware, multi-source, reflective. Built on Azure AI Search.
SharePoint OneLake Blob Storage Web
03
🔧
Foundry Tools · Toolbox
Built-in tools (web search, code interp, file search) plus custom tools and remote MCP servers, all behind a unified MCP endpoint.
MCP OpenAPI Azure Functions Toolbox versions
02
🧬
Foundry Models · 11,000+
Direct-from-Azure (OpenAI, Anthropic Claude, MAI) and Partners & Community (Meta, Mistral, DeepSeek, Cohere, xAI, HuggingFace).
Serverless API Managed compute Model Router Fine-tuning
01
💻
Foundry Local
On-device inference for offline, sovereign, or low-latency workloads. Same model catalog, runs on your hardware.
Edge Offline Data sovereignty
01 / FOUNDRY MODELS

The Model Catalog

A unified catalog of more than 11,000 models — flagship frontier models from Microsoft and partners, plus open and specialized models. Browse, benchmark, fine-tune, and deploy through one consistent inference API.

11,000+
TOTAL MODELS
From foundational to industry-specific
15+
PROVIDERS
Microsoft, Anthropic, OpenAI, Meta, & more
60M+
PHI DOWNLOADS
Microsoft's small language models
2
DEPLOYMENT MODES
Serverless or managed compute
OpenAI
GPT-5 · o3 · o-series
Anthropic
Claude Opus · Sonnet · Haiku
Microsoft
Phi · MAI · industry models
Meta
Llama family
Mistral AI
Mistral · Mixtral · Codestral
DeepSeek
R1 · V3 reasoning models
xAI
Grok family
Cohere
Command · Embed · Rerank
NVIDIA
NIM microservices
HuggingFace
Open community models
DEPLOYMENT · SERVERLESS API

Pay-per-token API

Call models through a managed endpoint without provisioning infrastructure. Single inference API across providers — switch models with a config change, not a rewrite.

No infra to manage
Pay only for what you use
Unified API across providers
Built-in content safety
DEPLOYMENT · MANAGED COMPUTE

Dedicated Capacity

Deploy to dedicated GPU clusters when you need predictable performance, custom fine-tunes, or model formats that don't fit serverless. Provisioned throughput available.

Predictable latency & throughput
Custom fine-tuned models
Bring-your-own-model imports
Network isolation options
⚡ MODEL ROUTER · GA · OPTIMIZE COST & PERFORMANCE AT RUNTIME
User Query "summarize this 50-page contract"
Cheap fast model simple Q · 0.2¢
Mid-tier model summary · 1.4¢
Frontier model ✓ complex reasoning · 8.2¢
Reasoning model multi-step · 24¢
02 / AGENT SERVICE

Build, Deploy, Scale Agents

A fully managed runtime for AI agents. Choose your build path — no-code, declarative workflow, or fully custom code — and let Foundry handle hosting, scaling, identity, observability, and enterprise security.

TYPE 01
📝

Prompt Agents

GENERALLY AVAILABLE

Defined entirely through configuration — instructions, model selection, and tools. Build no-code in the Foundry portal in minutes, or via SDK / REST API.

Best for: Rapid prototyping, internal tools, agents that don't need custom orchestration logic.
TYPE 02
🔀

Workflow Agents

PUBLIC PREVIEW

Orchestrate sequences of actions or coordinate multiple agents using declarative definitions. Visual designer or code-first API. Power Fx expressions for control flow.

Best for: Multi-step business processes, multi-agent coordination, deterministic flows.
TYPE 03
🐳

Hosted Agents

PUBLIC PREVIEW

Code-based agents built with Agent Framework, LangGraph, or your own framework, deployed as containers. You write logic; Foundry manages runtime and scaling.

Best for: Complex workflows, custom tool integrations, full control over agent behavior.
EVERY AGENT COMBINES THREE CORE COMPONENTS
🧬 Model

From the Foundry catalog. Provides reasoning and language capabilities.

  • GPT, Claude, Llama, etc.
  • Switch via config change
  • Reasoning models supported
📋 Instructions

Define goals, constraints, and behavior — prompt-based, workflow-defined, or code.

  • System prompt + persona
  • Prompt Optimizer (preview)
  • Task Adherence guardrails
🔧 Tools

Provide access to data and actions — search, files, code, custom APIs, MCP servers.

  • Built-in: web, code, files
  • Foundry IQ knowledge bases
  • Remote MCP & OpenAPI
📊
M365 Copilot & Teams
Publish agents to Microsoft 365 Copilot and Teams via OpenResponses and Activity Protocols. Reach users in the apps they already use.
DISTRIBUTION
🔐
Entra Agent Registry
Centrally publish, discover, and govern agents across the organization via Microsoft Entra identity.
REGISTRY
🔌
Invocations Protocol
Flexible endpoint integration with custom apps and services. Use any framework that speaks A2A or AG-UI.
INTEGRATION
03 / FOUNDRY IQ

The Knowledge Layer

RAG, reimagined as a reasoning task. Foundry IQ treats retrieval as a multi-step plan — query decomposition, source selection, parallel search, and reflection — instead of a one-shot vector lookup. Permission-aware by default.

KNOWLEDGE BASES → AGENTIC RETRIEVAL → AGENTS
Knowledge Sources
📂 SharePoint
🪣 Azure Blob
🏞️ OneLake
🌐 Web
IQ Engine
Plan
Retrieve
Reflect
Cite
Agents
Grounded answers
Inline citations
ACL-aware
Auditable
1
Query Plan

An LLM analyzes the question and decomposes it into optimal sub-queries.

2
Source Select

Routes each sub-query to the right knowledge source — multi-source by default.

3
Parallel Search

Runs hybrid search (vector + keyword + semantic rerank) across selected sources at once.

4
Permission Filter

Honors document ACLs and Purview sensitivity labels under caller's Entra identity.

5
Reflect & Iterate

Evaluates results — re-queries if context is insufficient. Reasoning-style retrieval.

6
Aggregate & Cite

Returns extractive content with citations so agents can trace answers to source documents.

MICROSOFT'S INTELLIGENCE LAYER — THREE COMPLEMENTARY IQ WORKLOADS
📚
Foundry IQ

Organizational knowledge — documents, files, web content. The general-purpose knowledge base for any agent.

SOURCES · SHAREPOINT · BLOB · ONELAKE · WEB
🏞️
Fabric IQ

Semantic layer for Microsoft Fabric — ontologies, semantic models, and graphs over business data.

SOURCES · ONELAKE · POWER BI · FABRIC SEMANTIC MODELS
💼
Work IQ

Contextual layer for Microsoft 365 — collaboration signals from documents, meetings, chats, workflows.

SOURCES · M365 · TEAMS · OUTLOOK · COPILOT
04 / FOUNDRY TOOLS

Beyond Text Generation

Tools turn LLMs into agents that can act. Foundry provides ready-to-use built-in tools and a unified Toolbox that exposes any custom tool — including remote MCP servers — through a single endpoint to any compatible agent runtime.

CATEGORY · BUILT-IN

Ready Out of the Box

Configured in minutes through the Foundry portal. Some are GA, others in preview — most agents need only basic configuration to start.

🌐Web Search
🐍Code Interpreter
📁File Search
🧠Memory
📊Fabric Data
🔍AI Search
🖥️Computer Use
🛡️Content Safety
CATEGORY · CUSTOM

Bring Your Own Capabilities

Wire in any external API, internal service, or remote MCP server. Foundry handles auth, identity, and routing through unified endpoints.

🔌Remote MCP
Azure Functions
📜OpenAPI Spec
🔗Logic Apps
🛠️Custom MCP
📡Webhook Tools
🧰 TOOLBOX — CURATE ONCE, EXPOSE EVERYWHERE VIA MCP
📝 Prompt Agents
🔀 Workflows
🐳 Hosted Agents
🌍 Any MCP client
🧰 Toolbox

Single MCP-compatible
endpoint · versioned

crm_lookup
create_ticket
query_warehouse
send_notification
05 / CONTROL PLANE

Enterprise Governance

Foundry separates management from development. IT teams configure security and policy at the Foundry resource level; development teams build inside project containers. One unified portal, one resource provider, one set of controls.

🔐
Identity & Access

Unified RBAC across models, agents, tools, and knowledge bases. Microsoft Entra-backed identity end-to-end.

Role-based access control
Managed identities for resources
Caller identity propagation
🌐
Networking

Virtual network injection, private endpoints, public network disable for sensitive workloads.

Private endpoints
Subnet injection for agents
Bring-your-own-VNet
🛡️
Safety & Guardrails

Built-in content safety, prompt injection mitigation (XPIA), and Task Adherence guardrails for agentic workflows.

Azure AI Content Safety
Cross-prompt injection defense
Task Adherence (preview)
📊
Observability

Tracing, monitoring, and evaluation — all under Azure Monitor. LangChain & LangGraph traces supported natively.

Azure Monitor metrics
Run tracing & replay
Eval-driven optimization
💾
Data Residency

Bring your own storage, Azure SQL, Cosmos DB, AI Search. Customer-managed encryption keys supported.

Bring-your-own storage
Customer-managed keys (CMK)
Purview sensitivity labels
📜
Compliance

Inherits Azure's compliance posture — 50+ region-specific certifications. Responsible AI guidance built in.

SOC · ISO · HIPAA · FedRAMP
Responsible AI standards
Transparency reports per model
MANAGEMENT vs. DEVELOPMENT — CLEAR SEPARATION OF SCOPE
SCOPE
OWNED BY
RESPONSIBILITIES
Foundry Resource
IT & Platform Eng
Networking, security policies, model deployment governance, RBAC at resource level, compliance, monitoring config
Project Container
Dev & ML Teams
Build agents, define workflows, register tools, run evaluations, manage project assets and connections
Project Assets
Individual Builders
Files, prompts, evaluation datasets, agent configs, knowledge base configs, tool credentials
06 / FOUNDRY LOCAL

On-Device Inference

Run Foundry models on your own hardware — laptops, edge servers, sovereign clouds. Same model catalog, same APIs, no data leaves your environment. For when latency, privacy, or sovereignty rules out the cloud.

FOUNDRY CLOUD

Managed in Azure

  • Hosted in Microsoft data centers
  • Pay-per-token serverless or managed compute
  • Auto-scaling and global distribution
  • Full agent service runtime
  • Latency depends on network
FOUNDRY LOCAL

On Your Hardware

  • Runs on local CPU, GPU, or NPU
  • Data never leaves the device or network
  • Works offline — no internet required
  • Same model catalog & SDK surface
  • Sub-millisecond local latency
WHEN TO REACH FOR FOUNDRY LOCAL
🏥
Regulated Industries

Healthcare, defense, finance — where data sovereignty is non-negotiable.

📡
Edge & Offline

Field operations, retail kiosks, manufacturing floors with intermittent connectivity.

Ultra-Low Latency

Real-time interactions — voice agents, live coding assistants, gaming NPCs.

💰
Predictable Cost

Heavy local workloads where per-token cloud pricing breaks the budget.

🔬
Dev & Prototyping

Iterate on agents and prompts without round-tripping the cloud or burning quota.

🌍
Sovereign Cloud

Deploy in sovereign or air-gapped clouds where Azure isn't an option.

All Six Components at a Glance

Microsoft Foundry is the AI app and agent factory — six natively integrated components, one Azure resource, one portal.

COMPONENT
PURPOSE
KEY CAPABILITIES
WHEN IT MATTERS
🧬 Foundry Models
Discover, deploy, fine-tune from a unified catalog
11K+ models · serverless or managed · model router · benchmarking · fine-tuning
Any AI workload — start of every Foundry project
🤖 Agent Service
Build, deploy, scale single & multi-agent systems
Prompt · workflow · hosted agents · M365 publish · MCP · A2A
Production agents that need scaling, identity, observability
🧠 Foundry IQ
Permission-aware knowledge layer for grounded agents
Agentic retrieval · multi-source · ACL sync · Purview labels · citations
Agents that need to ground responses in enterprise data
🔧 Foundry Tools
Extend agents with built-in & custom capabilities
Web · code · file · MCP · OpenAPI · Toolbox versioning
Agents that need to take actions or call external systems
⚖️ Control Plane
Enterprise governance across all of the above
RBAC · networking · safety · observability · CMK · compliance
Always — production AI without governance is not production
💻 Foundry Local
On-device inference for offline / sovereign workloads
Edge runtime · same model catalog · same SDK · CPU/GPU/NPU
Regulated industries, edge, ultra-low-latency, sovereign clouds