AI has made its way into the business world, and it’s transforming things across the board. Every department, every job role, and every process is in the middle of a major overhaul. Meanwhile, a whole new breed of companies is coming into view — ones that will look nothing like the businesses that dominated the previous generation. The companies that come out on top won’t be the ones with the flashiest demonstrations. They’ll be the ones that turn AI into a well-managed, constantly evolving system that handles actual day-to-day operations.
And this goes well beyond simple chatbots. Sure, those tools are handy, but they don’t fundamentally change how a big organization runs. The real game-changer is deploying teams of AI agents that carry out ongoing tasks across key areas like software development, customer support, finance, HR, and operations — backed by proper identity management, contextual awareness, policy controls, and human supervision so you can actually trust them in real-world use.
To pull this off, companies need more than just access to a strong AI model or enough computing power. What really sets winners apart is the infrastructure surrounding the AI: how engineering teams build and roll out agents, how those agents are tailored to the organization, how they’re monitored and kept in check once live, and how they’re refined over time without introducing risk. Without that infrastructure, AI stays scattered, unreliable, and hard to trust at any meaningful scale.
We’re charting a completely different course. We’re developing a full-scale agent platform — one that works with a wide variety of models, stays open, and lets you pick and choose what works best at every part of the technology stack. And we’re deliberately making it developer-first. Right now, the next major building blocks of that platform are falling into place.
Designing a system for the AI-driven enterprise
To thrive in this new landscape, an agent platform has to clear a much higher hurdle. It needs to handle genuine production workloads, reflect the actual complexity of an organization, and carry real business accountability.
We’re centering our approach around three core principles:
First, it needs to be a unified, fully integrated system that supports a broad mix of models.
Businesses simply can’t piece together their agent strategy bit by bit. Cobbling together separate tools after the fact only slows people down and creates avoidable risk. Building, customizing, running, monitoring, and refining agents should all happen inside one coherent environment. That’s exactly why we’re unifying Azure, GitHub, Microsoft IQ, Fabric, Foundry, Windows, Microsoft Security, and Microsoft 365 into one connected system that lets you deploy agents at enterprise scale. Companies also need the freedom to pick the best model for each job — factoring in performance, speed, and cost — whether that’s a Microsoft model, a partner model, or an open-source option.
Second, it needs to be secure and governed from the ground up.
Everyone talks about governance, but actually delivering on it is an entirely different matter. Making it real means starting with one unified stack that covers everything from development through to production, anchored in the identity, access, compliance, and security foundations organizations already rely on. By extending Entra, Purview, Defender, Agent 365, and the broader Microsoft Security stack, governance is woven directly into the system rather than tacked on afterward — giving an AI-first enterprise the room to grow without sacrificing oversight or control.
Third, it needs to get better over time.
Enterprise AI systems can’t just be set and forgotten. How agents perform, what results they produce, and what feedback humans give need to flow back into the system so it can evolve safely with proper human involvement. As the platform runs day to day, models, workflows, and agents all become sharper and more finely tuned to a company’s specific processes. What you end up with is a system that grows more valuable the longer you use it.
These capabilities are quickly becoming non-negotiable, and companies that map their AI ambitions to these three principles will gain a measurable edge in quarters, not years.
So how does a system like this actually come together inside a real organization? It starts where work starts — with how agents are created. Let’s walk through what that process looks like on the platform we’ve built.
1. Build in GitHub
GitHub is where your developers are already doing their work. That’s where your project dependencies sit, where your code and application context are stored, where you contribute to and collaborate with the open-source community you lean on, and where you push innovation forward. Building agents anywhere else means abandoning all of that.
Agents should be developed the same way production-grade software is. You use GitHub Copilot to write code more quickly. You gather the key resources that matter: code repositories, work items, agent capabilities, and tools. And since agents aren’t just about code, you pair your evaluation and observability assets with them — all version-controlled just like any production system should be.
Agents need to follow a lifecycle: source, test, deploy, observe, and refine. GitHub establishes that lifecycle and puts the essential controls in place right from the start. The outcome is a workflow designed for creating agents with the right safety guardrails in place from the beginning. And you can handle all of this in a single place, inside a new application purpose-built for this system.
2. Contextualize with Microsoft IQ
Code is only half the picture for an agent. To truly deliver value, an agent also needs to understand your business: who your customers are, what you sell, what your contracts say, and how your processes actually work. Without reliable enterprise intelligence and context, even the most powerful model is just taking stabs in the dark.
Companies need access to a diverse range of models and the ability to match the right model to the right task, but simply choosing a model isn’t where the job ends. Microsoft IQ roots agents in real enterprise context by tapping into your business data wherever it sits: across Microsoft 365, your core business systems (like customer and revenue records), and other platforms your company already depends on, such as knowledge bases and your website. With Web IQ, the newest addition to the IQ platform, agents can also pull in relevant information from the wider web when it makes sense.
Tailoring agents to enterprise data isn’t just about access. Simply pointing AI at a mountain of raw data is a clunky and unreliable strategy. Microsoft IQ structures, secures, and delivers the right information in formats agents can actually work with, so they can arrive at accurate insights without getting buried in irrelevant information or generating made-up answers.
Once agents are anchored in the right context, companies can push further. With Frontier Tuning, you’re not just calling on AI models as static tools. You’re actively improving how those models behave using your own data and real-world workflows.
That includes Microsoft’s seven new MAI models, covering image processing, voice, transcription, coding, and reasoning. This entire model family is engineered to handle the kinds of tasks that matter in real-world business settings, and critically, these models aren’t fixed endpoints. They’re designed to learn from how work actually flows through your organization.
Our reinforcement learning environments allow our models to be reinforced through actual outcomes.
Think of them as specialized training environments for AI. Within these setups, the agent learns your unique workflows, standards, and operational methods. It becomes tailored to your needs, providing a clear and improved return on investment.
Additionally, your custom or post-trained models remain within your own infrastructure. Your intellectual property, confidential data, and actual work processes are integrated into how your agents think and operate. The resulting intelligence functions within your environment, under your supervision, and the insights gained remain exclusively yours.
Without context and Frontier Tuning, agents are versatile but generic. With them, agents become dedicated partners that truly understand the business landscape they are navigating.
3. Run in Foundry
Once agents are developed and given context, they require a platform to operate. Not as a trial, but in a live production setting.
Agents and agent teams have distinct requirements compared to standard applications. They must be able to reason, take action, utilize tools, collaborate with other agents, and evolve over time, all while adhering to enterprise-level controls. Foundry is the runtime built specifically for these demands.
- The widest selection of models: Various agents require different capabilities at varying price points. Regardless of the task or budget, Foundry offers access to the appropriate model, and an intelligent model router assists you in balancing quality, speed, and cost for each agent.
- Enhanced performance for open models: Through Fireworks AI on Foundry, businesses achieve quicker, more efficient inference directly within the platform.
- Compatibility with any agent, even those not built on our framework: Integrate agents developed using the Microsoft Agent Framework, LangGraph, GitHub Copilot SDK, Claude Agent SDK, or a bespoke setup.
- Tools and actions: Agents interact with enterprise systems via MCP, connectors, APIs, and workflows, with secure execution as the default.
- Evals and traces: Evaluations and traces make agent performance quantifiable. If you can’t measure it, you can’t improve it.
- Ongoing optimization: Foundry allows for the continuous tuning of models, harnesses, IQs, tools, and actions, boosting performance as agents function within your specific environment.
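To make the model-router idea from the first bullet concrete, here is a minimal sketch in Python. It is not Foundry’s router or its API. The model names, quality scores, latencies, and prices are invented for illustration, and the selection rule (filter by a quality floor, then minimize a weighted sum of normalized latency and cost) is just one simple way to balance quality, speed, and cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical per-model stats; every value below is illustrative."""
    name: str
    quality: float       # 0..1, higher is better (e.g. from offline evals)
    latency_ms: float    # typical response latency, assumed > 0
    cost_per_1k: float   # cost per 1,000 tokens, assumed > 0


def route(models, min_quality, w_latency=1.0, w_cost=1.0):
    """Pick the model with the lowest latency/cost penalty above a quality floor.

    Models below min_quality are dropped. The rest are ranked by a weighted
    sum of latency and cost, each normalized by the worst eligible candidate.
    """
    eligible = [m for m in models if m.quality >= min_quality]
    if not eligible:
        raise ValueError("no model meets the quality floor")
    max_latency = max(m.latency_ms for m in eligible)
    max_cost = max(m.cost_per_1k for m in eligible)

    def penalty(m):
        return (w_latency * m.latency_ms / max_latency
                + w_cost * m.cost_per_1k / max_cost)

    return min(eligible, key=penalty)


# Made-up catalog used only to exercise the sketch.
CATALOG = [
    ModelProfile("small-fast", quality=0.70, latency_ms=200, cost_per_1k=0.1),
    ModelProfile("mid-balanced", quality=0.85, latency_ms=600, cost_per_1k=0.5),
    ModelProfile("large-reasoning", quality=0.95, latency_ms=2000, cost_per_1k=3.0),
]

print(route(CATALOG, min_quality=0.8).name)  # mid-balanced
```

Raising the quality floor per agent is the knob here: a triage agent might accept 0.6 and get the small model, while an agent drafting contract language might require 0.9 and pay for the large one.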
A comprehensive trust, security, and policy framework encompasses the entire runtime. Policies are applied uniformly across context access, tool calls, optimization updates, traces, and response delivery. The agent doesn’t just function; it functions in the manner your enterprise demands.
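One way to picture a policy being applied uniformly to tool calls is a single checkpoint that every call must pass through. The sketch below is an assumption-laden illustration, not the platform’s policy engine: `Policy`, `PolicyDenied`, `call_tool`, and the tool names are all hypothetical, and a real policy would cover far more than a tool allowlist.

```python
from dataclasses import dataclass, field


class PolicyDenied(Exception):
    """Raised when an agent attempts a tool call its policy does not allow."""


@dataclass
class Policy:
    """Hypothetical policy: a tool allowlist plus an in-memory audit log."""
    allowed_tools: frozenset
    audit_log: list = field(default_factory=list)

    def check(self, agent_id, tool_name):
        allowed = tool_name in self.allowed_tools
        # Record every attempt, allowed or not, so it can be audited later.
        self.audit_log.append((agent_id, tool_name, allowed))
        if not allowed:
            raise PolicyDenied(f"{agent_id} may not call {tool_name}")


def call_tool(policy, agent_id, tool_name, tool_fn, *args, **kwargs):
    """Run tool_fn only after the policy check passes."""
    policy.check(agent_id, tool_name)
    return tool_fn(*args, **kwargs)


# Illustrative tools; a real deployment would call enterprise systems.
def lookup_order(order_id):
    return {"order_id": order_id, "status": "shipped"}


def issue_refund(order_id):
    return {"order_id": order_id, "refunded": True}


policy = Policy(allowed_tools=frozenset({"lookup_order"}))
result = call_tool(policy, "support-agent", "lookup_order", lookup_order, "A1")
```

Because the check and the audit entry live in one place, adding a new tool does not require each agent to reimplement the rule.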
This is the stage where your agent transitions from a project to a fully operational production system.
4. Govern with Agent 365
Now imagine scaling from one agent to hundreds, then thousands. That is what happens as teams across an enterprise build their own agents. Some are well-constructed, others are not. Some have inappropriate access, while others perform valuable tasks that the rest of the organization doesn’t benefit from.
Enterprise governance is essential. Enterprises require a method to monitor active agents, understand their access, track task compliance, and enforce policies across their entire agent network.
Agent 365, integrated with Entra, Purview, Defender, and the broader Microsoft Security suite, provides this capability. And if you’re interested in AI for security in addition to securing your AI, there’s “MDASH.”
Every agent within your organization appears in a unified catalog, regardless of whether it was built in Foundry or elsewhere. IT can see who deployed an agent, what data and tools it can access, its operational behavior, and its associated costs. They can enforce policies or take corrective action as needed.
A single location. Complete visibility. Genuine control over your agents’ actions and limitations.
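As a rough illustration of what a unified catalog record and a policy sweep could look like, here is a small Python sketch. It does not reflect Agent 365’s actual data model or API. The `AgentRecord` fields mirror the attributes listed above (who deployed the agent, its data and tools, its cost), while the field names, the example agents, and the two rules checked are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    """Hypothetical catalog entry for one agent."""
    agent_id: str
    deployed_by: str
    built_with: str                          # e.g. "Foundry" or "other"
    data_scopes: set = field(default_factory=set)
    tools: set = field(default_factory=set)
    monthly_cost: float = 0.0
    enabled: bool = True


def violations(agent, allowed_scopes, cost_cap):
    """Return human-readable policy violations for one agent."""
    found = []
    extra = agent.data_scopes - allowed_scopes
    if extra:
        found.append(f"unapproved data scopes: {sorted(extra)}")
    if agent.monthly_cost > cost_cap:
        found.append(f"cost {agent.monthly_cost} exceeds cap {cost_cap}")
    return found


def enforce(catalog, allowed_scopes, cost_cap):
    """Disable every violating agent and return {agent_id: violations}."""
    report = {}
    for agent in catalog:
        problems = violations(agent, allowed_scopes, cost_cap)
        if problems:
            agent.enabled = False            # corrective action
            report[agent.agent_id] = problems
    return report


# Made-up agents, one built in Foundry and one built elsewhere.
catalog = [
    AgentRecord("invoice-bot", "finance-team", "Foundry",
                {"invoices"}, {"erp_lookup"}, 120.0),
    AgentRecord("hr-helper", "hr-team", "other",
                {"hr_records", "payroll"}, {"search"}, 900.0),
]
report = enforce(catalog, allowed_scopes={"invoices", "hr_records"}, cost_cap=500.0)
```

The point of the sketch is that both agents sit in the same list regardless of where they were built, so one sweep can see and correct all of them.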
5. Improve continuously
Agents cannot remain static. Every action an agent takes generates data: trajectories, outcomes, feedback. The system captures this, refines it, and feeds it back. Observe. Assess. Enhance. Deploy safely. Repeat.
This learning cycle operates continuously, in a live production environment.
Most improvements begin with eval-driven enhancements to the agent itself: prompts, context, skills, and tools. As distinct patterns emerge, learning can expand to model routing across multiple models, fine-tuning, or reinforcement learning. However, it always remains grounded in evaluation, elevating agent quality and ROI to meet business requirements.
The loop is governed, not closed. Enterprises need to audit it, correct it, and manage how changes are implemented. The system becomes more capable over time, guided by human oversight and increasingly autonomous, but always within your control.
This is the hill-climbing model in action: system-level enhancement, occurring continuously while the system is running.
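A single step of that loop can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the platform’s implementation: agents are modeled as plain functions from prompt to answer, quality is exact-match accuracy on a tiny made-up eval set, and the `approve` callback stands in for whatever human review gate an enterprise puts in front of a rollout.

```python
def accuracy(agent, cases):
    """Fraction of eval cases where the agent's answer equals the expected one."""
    if not cases:
        raise ValueError("need at least one eval case")
    hits = sum(1 for prompt, expected in cases if agent(prompt) == expected)
    return hits / len(cases)


def hill_climb_step(current, candidate, cases, min_gain=0.0,
                    approve=lambda old_score, new_score: True):
    """Promote the candidate only if it beats the current agent and is approved.

    Returns (agent_to_run, its_score). The approve callback is a stand-in
    for a human sign-off; by default it always approves.
    """
    base = accuracy(current, cases)
    trial = accuracy(candidate, cases)
    if trial - base > min_gain and approve(base, trial):
        return candidate, trial
    return current, base


# Made-up eval set and two toy "agents".
CASES = [("2+2", "4"), ("3+3", "6"), ("capital of France", "Paris")]


def baseline(prompt):
    return {"2+2": "4"}.get(prompt, "")


def improved(prompt):
    return {"2+2": "4", "3+3": "6"}.get(prompt, "")


chosen, score = hill_climb_step(baseline, improved, CASES)
```

Here `improved` answers two of three cases against one for `baseline`, so it is promoted. Passing an `approve` function that returns `False` keeps the baseline in place, which is how the sketch models a governed loop.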
6. Surface where people work, and scale on Azure
Of course, none of this is impactful if it doesn’t reach the individuals performing the work.
Agents are integrated directly into the workflow, within Teams, across Microsoft 365, and within your own applications and experiences. Identity, security, and compliance are inherent from the outset, ensuring that the agents your teams depend on daily inherit the same trust model as the rest of your environment.
We support multiple platforms, and Windows is optimized for developing and running your agents securely. You can run models both in the cloud and locally on your device, and sandboxing lets you run always-on agents safely.
When you require compute optimized for AI, global and sovereign infrastructure, or a path to market, the system scales on Azure, the same enterprise foundation customers have relied on for decades.
The system compounds
Every leading enterprise will eventually adopt this model: a central AI platform that orchestrates work across the business, integrating data, models, agents, and human judgment into a continuously improving and secure system.
As this system operates, its value compounds. Velocity increases, and the bottleneck shifts from effort to human creativity and coordination. People can accomplish more independently, guided by shared context and fewer handoffs, while the business accelerates without introducing friction.
We are in an era of significant disruption. The enterprises that excel in this period will be those that adapt to evolving conditions, streamline how work is coordinated across the business, and consistently convert intelligence into tangible outcomes. Microsoft’s agent platform is engineered to achieve precisely this: it enables the building, contextualizing, running, governing, and improving of agents as a unified, integrated system.
At this juncture, the platform transcends being merely a build layer. It evolves into the operating system for enterprise AI at scale, where intelligence and trust are foundational by design.



