Build an Enterprise AI Agent with Azure AI Foundry and Power BI MCP

 

How I Built an Enterprise AI Agent with Azure AI Foundry That Unlocked Hidden Insights Across an Entire Organization

A practical architecture guide for data engineers and AI practitioners building enterprise-grade AI agents on the Microsoft stack.


The problem every data-heavy organization faces

Enterprise organizations invest heavily in data infrastructure — lakehouse architectures, semantic models, and Power BI dashboards. The data is clean, governed, and trusted.

But access to insights remains bottlenecked.

Stakeholders ask questions. Someone opens three dashboards, exports data, cross-references numbers, writes a summary, and sends it back. Thirty minutes for a question that should take thirty seconds. Multiply that across every executive, every meeting, every ad-hoc request — and the cost isn't just time. It's the questions that never get asked because the friction is too high.

The most valuable insights in any organization are the ones that require looking across data domains simultaneously — correlating production trends with revenue, or headcount changes with cost variances. These cross-domain insights are theoretically available but practically inaccessible because they require navigating multiple dashboards and manually synthesizing the data.

I built an AI agent that eliminates this bottleneck entirely.


AI Agent Architecture


The entire solution uses three managed Azure services. No VMs. No containers. No App Service. No custom APIs. No Streamlit. No Flask. No infrastructure to manage, patch, or scale.

Azure AI Foundry Agent Service — The orchestration layer. It manages conversations, plans tool calls, passes data between the LLM and external services, and maintains conversational memory. The agent service itself is free — you pay only for the LLM tokens it consumes.

Power BI Remote MCP Server — The data access layer. MCP (Model Context Protocol) is an open standard that defines how AI agents connect to external data sources. Microsoft hosts a remote MCP server specifically for Power BI at a public endpoint. It gives the agent three capabilities: discover the schema of any semantic model, execute DAX queries, and return structured results — all using the authenticated user's existing workspace permissions. No additional cost.
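Under the hood, MCP is a JSON-RPC 2.0 protocol: the agent lists the server's tools, then invokes them with a `tools/call` request. The Foundry Agent Service handles all of this wiring automatically; the sketch below only illustrates the message shape. The tool name `execute_dax` and its argument names are hypothetical placeholders, not the Power BI server's actual tool schema.

```python
import json

def build_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP tools/call request (JSON-RPC 2.0)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(
    "execute_dax",
    {
        "dataset_id": "<semantic-model-id>",
        "query": 'EVALUATE ROW("Revenue", [Total Revenue])',
    },
)
print(request)
```

The point of the open standard is exactly this uniformity: any MCP-capable agent can discover and call these tools without Power-BI-specific client code.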

Microsoft Teams — The user interface. The agent is deployed as a Teams bot. Users interact with it in the same app they already use for everything else. No new logins, no new apps, no training. Included in existing Microsoft 365 licensing.

Underneath these three components sits the data infrastructure that most enterprises already have: a Microsoft Fabric Lakehouse with medallion architecture (Bronze → Silver → Gold) and Power BI semantic models containing DAX measures that encode all the business logic. The agent doesn't bypass this infrastructure — it sits on top of it, querying the same measures and returning the same numbers that appear in the trusted dashboards.


Why this architecture works for an enterprise

Same numbers, different interface. The agent queries the exact same Power BI semantic models that stakeholders already trust. There's no separate data pipeline, no data copy, no sync lag. If the dashboard shows $4,584K revenue, the agent shows $4,584K revenue. Data trust is inherited, not earned from scratch.

Cross-domain in one conversation. A stakeholder asks about revenue, follows up with headcount, then asks about production efficiency — all in one Teams chat. No switching between dashboards. No context lost between questions. The agent queries multiple semantic models in sequence and synthesizes the results into a single narrative.

Zero new infrastructure cost. The only new variable cost is LLM token consumption — pure pay-per-use. No fixed monthly fees, no capacity reservations, no minimum spend. Everything else (Foundry Agent Service, MCP server, Teams bot) is included in existing Microsoft licensing. The agent adds intelligence on top of infrastructure the organization was already paying for.

Model-swappable. GPT-4.1 today, Claude Sonnet tomorrow, an open-source model next quarter. The Foundry Agent Service supports models from OpenAI, Anthropic, Meta, DeepSeek, xAI, and others. Switching is a dropdown change — no code changes, no architecture changes. No vendor lock-in.

Security by default. Authentication flows through Microsoft Entra ID. The user's identity is passed through the MCP server to Power BI, where workspace-level permissions are enforced automatically. Users can only query data they already have access to in Power BI. No custom access control code needed.


The key design decision: zero Copilot dependency

The Power BI MCP server includes a tool that uses Microsoft Copilot to generate DAX queries from natural language. I disabled it.

Two reasons:

Cost isolation. The Copilot tool consumes Fabric capacity units (CUs) — the same pool used for running pipelines, processing data, and rendering reports. By having the LLM generate DAX directly from schema metadata, all AI compute runs on the Azure OpenAI pay-per-token model. The existing Fabric capacity stays completely untouched by the agent.
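Concretely, "the LLM generates DAX directly from schema metadata" means folding the schema returned by the MCP discovery tool into the model's prompt. A minimal sketch, assuming an illustrative schema shape; the helper name and field names are mine, not the MCP server's:

```python
def build_dax_prompt(schema: dict, question: str) -> str:
    """Fold semantic-model metadata into a DAX-generation prompt.

    The schema dict shape here is an assumption for illustration.
    """
    tables = "\n".join(
        f"- '{name}' columns: {', '.join(cols)}"
        for name, cols in schema["tables"].items()
    )
    measures = ", ".join(f"[{m}]" for m in schema["measures"])
    return (
        "You write DAX queries against this semantic model.\n"
        f"Tables:\n{tables}\n"
        f"Measures: {measures}\n"
        "Return only an EVALUATE statement that uses existing measures.\n\n"
        f"Question: {question}"
    )

# Hypothetical model metadata, the kind the MCP schema tool would return.
schema = {
    "tables": {"Date": ["Date", "Month", "Year"], "Sales": ["OrderID", "Amount"]},
    "measures": ["Total Revenue", "Revenue YoY %"],
}
prompt = build_dax_prompt(schema, "What was revenue by month in 2024?")
print(prompt)
```

Because the generation step runs entirely on Azure OpenAI tokens, nothing here touches Fabric capacity until the finished DAX query is executed.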

Full control over accuracy. When the LLM writes the DAX, its behavior is controlled entirely through the system prompt. I can add domain-specific query patterns, correct recurring mistakes, encode business rules, and tune accuracy — all by editing text. Every wrong answer becomes a new rule. The agent gets smarter every day, and I have full visibility into why it generates the queries it does.


The real challenges (the parts tutorials don't cover)

The right Foundry experience matters

Azure AI Foundry has two portal experiences: Classic and New. MCP tool support only exists in the New experience. If you create an agent in Classic, you'll see tool options for Code Interpreter and OpenAPI — but no MCP. The fix is simple (switch to the New experience and create the agent there), but the missing MCP option is baffling if you don't know the two experiences differ.

Authentication type is the make-or-break decision

The MCP tool supports multiple authentication types. For agents that run in the Foundry playground (where most development happens), "Project Managed Identity" is the correct choice — it uses the Foundry project's service principal. "Agent Identity" (which passes the user's personal token) does not work in the playground context. Getting this wrong produces cryptic "entity not found" errors that look like permission issues but are actually authentication mismatches.

The project identity, not the resource identity

When granting the Foundry agent access to Power BI workspaces, you must add the PROJECT-level service principal — not the resource-level one. The project identity follows a specific naming pattern that includes both the resource name and the project name. Adding only the resource identity produces "not found" errors even when the semantic model ID is correct. This was the single most important discovery in the entire build.

Quota pools are separate

Azure OpenAI offers two deployment types: Global Standard and Standard. They use separate quota pools. If one type shows zero available quota in your region, the other may have capacity. During initial deployment, Standard had available quota when Global Standard didn't. Also, set the tokens-per-minute limit high enough (50K+) for production use — AI agents make multiple API calls per question (schema retrieval + query generation + execution + narrative), and low limits cause rate-limit errors during normal conversations.
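That fan-out is also why client-side resilience helps: a briefly exhausted quota mid-conversation should trigger a retry, not an error to the user. A minimal exponential-backoff sketch; `RateLimited` and `call_model` are stand-ins for a real Azure OpenAI client and its 429 responses:

```python
import time

class RateLimited(Exception):
    """Stand-in for an HTTP 429 from the model endpoint."""

def with_backoff(call, max_retries=4, base_delay=0.01):
    """Retry a callable on RateLimited, doubling the delay each attempt."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt))

attempts = {"n": 0}

def call_model():
    # Stub model call: fails twice, then succeeds, simulating a
    # tokens-per-minute quota that recovers after a short wait.
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimited()
    return "answer"

result = with_backoff(call_model)
print(result)  # prints "answer" after two retried failures
```

Backoff is a complement to, not a substitute for, a sensible TPM limit: if every question legitimately needs four model calls, the quota has to cover that steady-state load.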

DAX generation needs domain-specific guidance

The LLM generates syntactically valid DAX out of the box, but domain-accurate DAX requires explicit guidance in the system prompt. Date filtering patterns, aggregation logic, and measure references — these are areas where generic LLM knowledge produces queries that look right but return wrong results. The fix is always the same: identify the error pattern, add a specific rule to the system prompt, and retest. After enough iterations, the agent generates accurate queries consistently.
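As one illustration of the kind of rule this produces (the table, column, and measure names below are hypothetical): date filters should go through the date dimension rather than the fact table, so the measures' filter context stays intact. The correction lives in the system prompt as plain text:

```python
# A hypothetical system-prompt rule fixing a recurring date-filter mistake.
# The names are illustrative; the LLM follows the text when writing DAX.
DATE_FILTER_RULE = (
    "When filtering by year, filter the 'Date' dimension, not the fact table:\n"
    "  WRONG: FILTER(Sales, YEAR(Sales[OrderDate]) = 2024)\n"
    "  RIGHT: CALCULATETABLE(<expression>, 'Date'[Year] = 2024)\n"
)

SYSTEM_PROMPT = (
    "You answer questions by generating DAX queries.\n\n" + DATE_FILTER_RULE
)
print(SYSTEM_PROMPT)
```

Each rule is cheap to add and immediately testable, which is what makes the identify-rule-retest loop converge quickly.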


The system prompt is 80% of the project

This is the part that separates a useful agent from a toy demo.

The system prompt isn't "you are a helpful assistant." It's a comprehensive encoding of domain intelligence: which data sources to query for which types of questions, how to aggregate metrics correctly at different organizational levels, what thresholds indicate a concern versus normal performance, how to structure analytical narratives, and what formatting the audience expects.

Every wrong answer the agent produces becomes a new rule in the prompt. After enough iterations, the agent doesn't sound like a generic chatbot — it sounds like someone who deeply understands the business. It flags concerns proactively, compares against targets, and structures its analysis the way the audience actually thinks about the data.
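The iteration loop can be sketched as a growing rules list from which the prompt is rebuilt; the structure and the rule text below are illustrative assumptions, not the production prompt:

```python
BASE_PROMPT = "You are a data analyst agent for <org>. Answer using DAX query results."

rules: list[str] = []

def add_rule(rule: str) -> None:
    """Distill one observed wrong answer into a permanent prompt rule."""
    rules.append(rule)

def build_system_prompt() -> str:
    numbered = "\n".join(f"{i}. {r}" for i, r in enumerate(rules, 1))
    return f"{BASE_PROMPT}\n\nRules learned from past mistakes:\n{numbered}"

# Example rules of the kind described above (hypothetical wording).
add_rule("Report revenue in $K, matching the dashboard format.")
add_rule("Compare every KPI against its target before calling it healthy.")
print(build_system_prompt())
```

Versioning this rules list (and the prompt built from it) is what turns ad-hoc fixes into an auditable, steadily improving asset.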

This prompt is intellectual property. It encodes domain expertise that took months or years to develop. I store it in Azure Key Vault, where access is tightly controlled. The prompt is what makes the agent valuable — the LLM is a commodity, but the domain intelligence encoded in the prompt is the competitive advantage.


The impact: unlocking hidden potential

Before the agent, cross-domain insights required opening multiple dashboards, exporting data, and manually analyzing the intersection. Nobody did it routinely because the friction was too high.

The agent makes it a single question. It queries multiple data domains, synthesizes results, flags patterns, and writes a narrative — in seconds.

The organization didn't get new data. It got new visibility into data that already existed. Insights that were always theoretically available became practically accessible to anyone with a Teams window. That's the hidden potential that was unlocked — not new information, but new access to existing information.

The result: leadership adopted the agent for daily decision-making. Cross-domain analysis that previously required dedicated analyst time became self-service. And the agent keeps getting smarter as more domain rules are encoded into the system prompt.


The technical stack

  • Cloud: Microsoft Azure (AI Foundry, Azure OpenAI, Key Vault, Entra ID)
  • Data platform: Microsoft Fabric Lakehouse, Delta Lake, PySpark
  • Data architecture: Bronze → Silver → Gold medallion pattern
  • BI layer: Power BI semantic models with DAX measures
  • AI orchestration: Azure AI Foundry Agent Service
  • Data access protocol: Model Context Protocol (MCP) — Power BI Remote MCP Server
  • LLM: GPT-4.1 via Azure OpenAI (swappable to Claude, DeepSeek, Llama, or any Foundry-supported model)
  • Interface: Microsoft Teams (Bot Framework)
  • Security: Microsoft Entra ID SSO, Power BI workspace RBAC, Azure Key Vault
  • Core methodology: System prompt engineering — iterative domain-specific accuracy tuning

Key takeaways for practitioners

Leverage what already exists. The most powerful AI agents don't build new data infrastructure — they sit on top of existing, trusted data layers. If your organization has Power BI semantic models, the MCP server gives your agent instant access to governed, business-logic-enriched data. Don't rebuild what's already built.

The prompt is the product. Architecture takes a day. Prompt engineering takes weeks. The iterative process of testing, finding failures, adding rules, and retesting is where the actual value is created. Budget your time accordingly.

Start with the portal, not the SDK. Azure AI Foundry's portal lets you prototype a working agent in minutes. Move to code-first development only when you need programmatic control, automated testing, or CI/CD pipelines.

Test with real stakeholder questions. The gap between "Hello, how are you" and "Why did our southeast region's productivity decline while headcount increased?" is enormous. Your test suite should be composed entirely of questions that come up in real meetings.
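A minimal regression harness for that idea, with `ask_agent` as a stub standing in for a call to the deployed Foundry agent, and the questions and expected substrings purely illustrative:

```python
# Each case pairs a real stakeholder question with substrings the answer
# must contain. Both are hypothetical examples here.
TEST_CASES = [
    ("What was revenue last month?", ["$"]),
    ("Why did productivity decline while headcount increased?", ["headcount"]),
]

def ask_agent(question: str) -> str:
    # Stub: a real harness would call the deployed agent endpoint here.
    return f"Analysis of: {question} (revenue $4,584K; headcount up 3%)"

def run_suite() -> list:
    """Return (question, missing substrings) pairs for every failing case."""
    failures = []
    for question, expected in TEST_CASES:
        answer = ask_agent(question)
        missing = [e for e in expected if e not in answer]
        if missing:
            failures.append((question, missing))
    return failures

print(run_suite())  # an empty list means every expectation was met
```

Rerunning this suite after every system-prompt change catches regressions the playground's one-off chats never would.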

Ship early, iterate continuously. The first version will be rough. The version stakeholders adopt will be polished. The difference is iteration — and iteration requires production feedback, not playground testing.

MCP changes the enterprise AI equation. Connecting an AI agent to trusted BI data with zero custom code eliminates the biggest blocker to enterprise AI adoption: data trust. When the agent returns the same numbers as the dashboard, adoption follows naturally.


What's next

The same architecture extends without re-architecting: anomaly detection (statistical models in Fabric notebooks, surfaced by the agent), predictive analytics (ML models forecasting KPIs, queried through natural language), what-if scenarios (simulation on demand), and multi-user rollout (workspace permissions automatically control data visibility per user).

Same agent. Same MCP connection. Same Teams interface. New capabilities layered on top.

The hardest part isn't the technology. It's encoding domain expertise into a system prompt that makes the agent genuinely useful. That's the work that only someone who understands both the data and the business can do — and it's the work that creates lasting competitive advantage.


I specialize in Microsoft Fabric, Azure AI, and building enterprise AI agents that transform how organizations access and act on their data. Currently focused on the intersection of data engineering and generative AI — where data platforms become intelligent platforms.

Connect with me on LinkedIn.


Tags: #AzureAI #AIFoundry #PowerBI #MCP #ModelContextProtocol #AIAgents #DataEngineering #MicrosoftFabric #GenAI #EnterpriseAI #SystemPromptEngineering #DAX #LLM #GPT4


