Category Archives: Cloud & Datacenter Management (2025-2026)

Is Your AI Safe? Protect It in Hybrid and Multicloud Environments with Microsoft Defender for Cloud

Security in hybrid and multicloud environments is no longer a marginal topic: it’s a strategic priority. The numbers are clear: the average cost of a breach has reached $4.44 million; 86% of decision-makers believe their cybersecurity strategy isn’t keeping pace with multicloud complexity; over 40% expect a skills shortage precisely in security administration roles. In this scenario, the attack surface expands, dependencies multiply, and SecOps teams must interpret fragmented signals coming from different platforms—often with limited resources.

A shift in perspective is needed, and AI itself makes it possible: an approach that combines real-time visibility, shared context, and intelligent automation, capable of keeping up with the speed of the cloud and the evolution of threats.

This article provides an overview of the evolutions of Microsoft Defender for Cloud and how the solution helps strengthen AI security in hybrid and multicloud environments.

How AI Enables a Paradigm Shift

AI is not simply a new tool: even in security, if adopted judiciously, it becomes an operational amplifier capable of transforming posture assessment, incident analysis, and collaboration across teams. In particular, it enables you to:

Continuously assess and improve security posture, with real-time visibility and context at “hyper-cloud” scale, thanks to automatic correlations between assets, identities, configurations, and risks.
Investigate and respond to threats with unprecedented speed and expertise, with AI-driven detections and strategies, risk-based prioritization, automated playbooks, and operational guidance.
Increase productivity and collaboration through natural-language workflows, using, for example, Copilot for triage, research, queries, runbooks, and reporting.

AI Attack Surface: Where Risks Lurk

Before implementing any controls, it’s essential to map the most exposed areas across the entire lifecycle of AI solutions—identities, network, data, models, supply chain, and operations—because that’s where risks accumulate and often go unnoticed.

Identity & access. Threats arise from unprotected keys, excessive privileges that pile up over time, and the absence of JIT/PIM mechanisms to limit access and permission duration.
Network. AI endpoints exposed to the internet, uncontrolled egress, and the lack of Private Endpoints open avenues an attacker can probe.
Data. In RAG architectures with unclassified sources, risk increases: loss of ACLs during indexing and leakage in prompts or logs can expose sensitive information.
Models. The use of unapproved families/versions, absence of content safety, and lack of anti-abuse testing expose you to harmful responses, jailbreaking, and non-compliant outputs.
ML supply chain. Dataset poisoning, unverified dependencies, and unsigned container images compromise upstream integrity, contaminating the entire training and release process.
Cost masking. Anomalous token/RPM usage, key scraping, and abuse by bots/scripts generate unexpected expenses and can mask fraudulent activity.
Operations. The lack of SLOs, absence of effective rollbacks, and weak BC/DR strategies make service continuity fragile and extend recovery times.

Mapping these weaknesses is not a theoretical exercise: it’s the prerequisite for designing targeted, measurable, and sustainable controls over time. It’s also about balancing costs and the level of security you aim to achieve.

How Microsoft Defender for Cloud Intervenes

To reduce risk and gain visibility in hybrid and multicloud environments, Defender for Cloud acts on multiple levels:

CSPM (Cloud Security Posture Management). It starts with posture: evaluates configurations, maps assets and dependencies, highlights deviations, and proposes concrete remediations. All with a unified multicloud view to compare criteria and priorities across different providers.
Workload protection (CWPP). Extends coverage to workloads—VMs, containers/Kubernetes, and PaaS services (databases, storage, app services)—combining hardening recommendations and detections on runtime and configurations.
AI detections and recommendations. Makes AI workloads visible and flags risks across configurations, identities, network, and logging, aligning with emerging best practices for AI security and governance.
SecOps integration. Closes the loop with operations: forwards events and alerts to Microsoft Sentinel and Defender XDR, enables automated playbooks, and supports guided investigations to reduce MTTD/MTTR.

The result is coordinated defense: from prevention to detection to response, with ready-to-use insights that speak the same language across all clouds.

AI Security Posture Management (CSPM): “Code-to-Cloud” Visibility for Generative AI

With the Defender Cloud Security Posture Management (CSPM) plan in Microsoft Defender for Cloud, security spans enterprise on-premises environments and hybrid/multicloud scenarios (Azure, AWS, Google Cloud), covering the entire lifecycle of generative AI applications: from code, to pipelines, to production runtime.

AI Bill of Materials (AI BOM)

Defender for Cloud discovers AI workloads and reconstructs the AI BOM: application components, data, and AI artifacts, from code to cloud. This end-to-end visibility makes it possible to identify vulnerabilities, prioritize risks, and protect generative applications with targeted interventions.

Continuous discovery of AI workloads is available for major services:

Azure OpenAI Service
Azure AI Foundry
Azure Machine Learning
Amazon Bedrock
Google Vertex AI (Preview)

In addition, Defender for Cloud detects vulnerabilities in dependencies of generative AI libraries (e.g., TensorFlow, PyTorch, LangChain) by analyzing source code (IaC misconfigurations) and container images (vulnerabilities).

Contextual Insights and Recommendations

Defender CSPM provides recommendations on identities, data security, and internet exposure, helping identify and prioritize critical issues.

DevOps security & IaC misconfigurations intercept misconfigurations that expose generative apps (excessive permissions, unintentionally published services), reducing breaches, unauthorized access, and compliance problems.

Examples of IaC controls for AI

Use of Private Endpoints for Azure AI Service.
Restricting Azure AI Service Endpoints.
Managed Identity for Azure AI service accounts.
Identity-based authentication for Azure AI service accounts.

In addition, the attack path analysis feature detects and helps mitigate risks to AI workloads, even when data and compute are distributed across Azure, AWS, and GCP.

What’s New: Defender for AI Services (Runtime Protection for Azure AI Services)

Defender for AI Services introduces runtime protection for Azure AI services (formerly threat protection for AI workloads). It is designed for risks specific to generative AI and combines Microsoft Threat Intelligence and Azure AI Content Safety (Prompt Shields) with real-time analytics to detect data leakage, data poisoning, jailbreaks, credential theft, wallet abuse, suspicious access patterns, and other malicious behaviors.

Overview — Protection Against AI Threats

The solution makes it possible to identify threats to generative AI applications in real time and assists in response with context-rich alerts and recommendations. It provides coverage for endpoints and AI resources present in subscriptions, highlighting risks that can impact applications.

Integration with Defender XDR

Protection for AI services integrates with Defender XDR, allowing you to centralize alerts related to AI workloads in the XDR portal and correlate alerts and incidents with identities, endpoints, network, and applications along the entire kill chain.

Evidence from User Prompts

With the protection plan active, it is optionally possible to include in alerts suspicious segments of user prompts and/or model responses originating from apps or AI resources. This evidence is customer data and helps with triage, classification, and intent analysis. It is available in the Azure portal, Defender portal, and via specific integrations.

Application and User Context in Alerts

To maximize actionability, the solution propagates to API calls to Azure AI the context of the user and application (e.g., userId, userIp, sessionId, appId, environment, requestId). This makes it possible to block users, correlate incidents, prioritize, and distinguish suspicious activity from expected behavior for a specific app.

Data and AI Security Dashboard: Unified View, Faster Decisions

The Data and AI Security Dashboard in Microsoft Defender for Cloud offers a centralized platform to monitor and manage data and AI resources, associated risks, and protection status. It highlights critical issues, resources requiring attention, and internet-exposed assets, enabling proactive mitigation. It also provides insights on sensitive data within data services and AI workloads.

Key Benefits

Unified view of all data and AI resources in a single interface.
Insights into data location and the types of resources that host it.
Assessment of protection coverage for data and AI resources.
Attack paths, recommendations, and data threat analysis in one place.
Mitigation of critical risks and continuous posture improvement.
Security explorer highlighting useful queries to uncover insights.
Identification and synthesis of sensitive data in cloud resources and AI assets.

Data Security with Microsoft Purview

To rigorously manage data used in AI applications, you can enable integration with Microsoft Purview. This feature requires a Microsoft Purview license and is not included in the Microsoft Defender for Cloud plan for AI services.

By enabling Purview, you allow the platform to access, process, and store request and response data—including associated metadata—originating from Azure AI services. In this way, you enable key data security and compliance scenarios, such as:

Sensitive Information Type (SIT) classification.
Analysis and reporting with Microsoft Purview DSPM for AI.
Insider risk management.
Communications compliance.
Microsoft Purview auditing.
Data lifecycle management.
Electronic discovery (eDiscovery).

In practice, this integration makes it possible to govern and monitor AI-generated data in alignment with corporate policies and regulatory requirements, fostering responsible, traceable, and compliant use of AI throughout the entire information lifecycle.

Conclusions

AI security in hybrid and multicloud environments requires a continuous, measurable, risk-oriented posture. Microsoft Defender for Cloud provides the tools to move from visibility to operational protection: discovery of workloads and AI BOM, contextual recommendations and attack path analysis, through to runtime protection with Defender for AI Services and incident correlation in Defender XDR and Microsoft Sentinel. Integration with Microsoft Purview makes it possible to govern the data that fuel models, ensuring traceability and compliance throughout the entire lifecycle.

The recommended path is clear: map the AI attack surface; enable CSPM and essential IaC controls; extend coverage to key workloads (VMs, containers, PaaS); activate runtime protection for Azure AI services; and centralize detection and response. Only then does AI become a multiplier of resilience rather than a new vector of risk. Finally, remember that absolute security in IT does not exist (except for systems that are powered off and completely isolated): it is therefore essential to balance costs, operational impact, and the desired level of protection, based on the value of assets and acceptable risk.

The 7 Pillars of AI Governance on Azure PaaS — A Practical Guide

AI is no longer theory; it’s everyday practice: pilot projects, enterprise chatbots, and new customer-facing features. Adoption is accelerating—often faster than an organization’s ability to govern it. In the midst of this race, Azure’s AI PaaS offerings provide a fast track to experiment and move services into production. But speed without guardrails comes at a cost: data exposure, unpredictable spend, opaque decision-making, and compliance risks that can slow innovation precisely when it should be accelerating.

Governance isn’t a brake on creativity—it’s the structure that lets AI become repeatable, safe, and measurable value. It means aligning investments with business goals, clarifying accountability, and defining controls, observability, and lifecycles; it means knowing where models live, who uses them, with what data, and at what cost. In Azure, where many capabilities are just “an API call away,” the line between a brilliant idea and an operational incident often comes down to the quality of your governance choices.

This article turns the Cloud Adoption Framework guidance into practical recommendations for governing Azure’s AI PaaS services. The journey is organized into seven complementary domains that together build a responsible AI posture: governing platforms, models, costs, Security, operations, regulatory compliance, and data.

In the chapters that follow, we’ll dive into each domain with an operational focus. The goal is simple: to lay the foundation for a governance framework that unlocks innovation, reduces risk, and keeps AI aligned with the business—today and as it evolves.

Governing AI Platforms

If the foundation isn’t consistent, every team ends up “doing its own thing.” Platform governance exists precisely to prevent that: to apply uniform policies and controls to Azure AI services so security, compliance, and operations stay aligned as architectures evolve.

Put this into practice:

Leverage built-in policies. With Azure Policy you’re not starting from scratch: there are ready-made definitions covering common needs—security setup, spending limits, compliance requirements—without custom development. Assign these policies to Azure AI Foundry, Azure AI Services, and Azure AI Search to standardize identity, Networking, logging, and required baseline configurations.
Enable Azure Landing Zone policy sets. Landing zones include curated, tested initiatives for AI workloads, already aligned with Microsoft recommendations. During deployment, select the Workload Specific Compliance category and apply the dedicated initiatives (e.g., Azure Openai, Azure Machine Learning, Azure AI Search, Azure Bot Service) to achieve broad, consistent coverage across environments.

Governing AI Models

A powerful but ungoverned model produces unpredictable results. Model governance ensures safe, reliable, and ethical outputs by setting clear rules for model inputs, outputs, and usage. Here’s what to implement:

Inventory agents and models.
Use Microsoft Entra Agent ID to maintain a centralized view of AI agents created with Azure AI Foundry and Copilot Studio. A complete inventory enables access enforcement and compliance monitoring.
Restrict approved models.
With Azure Policy, limit which model families/versions can be used in Azure AI Foundry. Apply model-specific policies to meet your organization’s standards and requirements.
Establish continuous risk detection. Before release and on a recurring basis:
- Enable AI workload discovery in Defender for Cloud to identify workloads and assess risks pre-deployment.
- Schedule regular red-team exercises on generative models to uncover weaknesses.
- Document and track identified risks to ensure accountability and continuous improvement.
- Update policies based on findings so controls stay effective and aligned with current risks.
Apply content-safety controls everywhere.
Configure Azure AI Content Safety to filter harmful content on both inputs and outputs. Consistent application reduces legal exposure and maintains uniform standards.
Ground your models.
Steer outputs with system messages and RAG (retrieval-augmented generation); validate effectiveness with tools like PyRIT, including regression tests for consistency, safety, and answer relevance.

Governing AI Costs

AI can burn through budget quickly if you don’t govern consumption, capacity, and usage patterns. The goal is predictable performance, controlled spend, and alignment with business objectives. Here’s what to put into practice:

Choose the right billing model for the workload.
For steady workloads, use commitment tiers / provisioned throughput. With Azure OpenAI, Provisioned Throughput Units (Ptus) offer more predictable costs than pay-as-you-go when usage is consistent. Combine PTU endpoints as primaries with consumption-based endpoints for spikes, ideally behind a gateway that routes traffic intelligently.
Select appropriately sized models—avoid overkill.
Model choice directly impacts cost; less expensive models are often sufficient. In Azure AI Foundry, review pricing and billing mechanics, and use Azure Policy to allow only models that meet your cost and capacity targets.
Set quotas and limits to prevent overruns.
Define per-model/per-environment quotas based on expected load and monitor dynamic quotas. Apply API limits (max tokens, max completions, concurrency) to avoid anomalous consumption.
Pick deployment options that are cost-effective and compliant.
Models in Azure AI Foundry support different deployment modes; prefer those that optimize both cost and regulatory requirements for your use case.
Govern client-side usage patterns.
Uncontrolled access makes spend explode: enforce network controls, keys, and RBAC; impose API limits; use batching where possible; and keep prompts lean (only the necessary context) to reduce tokens.
Auto-shut down non-production resources.
Enable auto-shutdown for VMs and compute in Azure AI Foundry and Azure Machine Learning for dev/test (and in production when feasible) to avoid costs during idle periods.
Introduce a generative gateway for centralized control.
A generative AI gateway enforces limits and circuit breakers, tracks token usage, throttles, and load-balances across endpoints (PTU/consumption) to optimize costs.
Apply cost best practices for each service.
Every Azure AI service has its own levers and pricing. Follow the service-specific guidance (e.g., for Azure AI Foundry) to choose the most efficient option for each workload.
Monitor consumption patterns and billing breakpoints.
Keep an eye on TPM (tokens per minute) and RPM (requests per minute) to tune models and architecture. Use fixed-price thresholds (e.g., image generation, hourly fine-tuning) and consider commitment plans when usage is steady.
Automate budgets and alerts.
In Azure Cost Management, set budgets and multi-threshold alerts to catch anomalies before they impact projects, maintaining financial control over AI initiatives.

Governing AI Security

Protecting data, models, and infrastructure requires consistent controls across identity, Networking, and runtime. The goal: reduce attack surface and preserve the reliability of your solutions. Here’s what to put into practice:

Enable end-to-end threat detection.
Turn on Microsoft Defender for Cloud on your subscriptions and enable protection for AI workloads. The service surfaces weak configurations and risks before they become vulnerabilities, with actionable recommendations.
Apply least privilege with RBAC.
Start everyone at Reader and elevate to Contributor only when truly needed. When built-in roles are too permissive, create custom roles that limit access to only the required actions.
Use managed identities for service authentication.
Avoid secrets in code or config. Assign a Managed Identity to every service that accesses model endpoints and grant only the minimum permissions required on application resources.
Enable just-in-time access for admin operations.
With Privileged Identity Management (PIM), elevation is temporary, justified, and approved—reducing privileged account exposure and improving traceability.
Isolate AI endpoint networking.
Prefer Private Endpoints and VNet integration to avoid Internet exposure. Where supported, use service endpoints or firewalls/allow-lists to permit access only from approved networks, and disable public network access on endpoints.

Governing AI Operations

Operations are what keep AI stable over time: without controls on lifecycle, continuity, and observability, even the best model stalls at the first hiccup. The objectives: reliability, clear recovery times, and steady business value.

Define model lifecycle policies.
Standardize versioning and compatibility with mandatory pre-rollout tests (functional, performance, and safety). Plan release strategies (shadow/canary/blue-green), rollback procedures, and deprecation/retirement rules valid across platforms (Azure AI Foundry, Azure Openai, Azure AI Services). Document dependencies, feature flags, and the version compatibility matrix.
Plan business continuity and disaster recovery.
Set RTO/RPO and configure baseline DR for resources exposing model endpoints: replicate across paired regions, use Infrastructure as Code (Bicep/Terraform) for rebuild, and place a gateway in front for failover and cross-instance/region routing. Where possible, enable zone redundancy; snapshot/backup configurations (prompts, safety settings, embeddings/vector stores); and run periodic tests to validate plans.
Configure monitoring and alerting for AI workloads.
Enable Azure Monitor / Log Analytics / Application Insights and set recommended alerts for Azure AI Search, Azure AI Foundry Agent Service deployments, and individual Azure AI Services. Track key SLIs (latency, 4xx/5xx error rates, timeouts, throughput, HTTP 429) and surface degradation before it impacts users. Centralize logs, Define slis, and create intervention runbooks with escalation paths and automated actions where feasible.

Governing Regulatory Compliance for AI

Regulatory compliance isn’t bureaucracy: it defines what’s acceptable, reduces legal risk, and builds trust. It requires a continuous, automated, and demonstrable process. Here’s what to put into practice:

Automate assessments and management.
Use Microsoft Purview Compliance Manager to centralize assessments and tracking, assign remediation actions, and maintain evidence. In Azure Policy, apply the Regulatory Compliance initiatives relevant to your sector to enforce controls and continuously monitor for deviations.
Build frameworks specific to your industry/country.
Rules differ by industry and geography: create targeted checklists and control mappings (privacy, Security, transparency, human oversight). Adopt standards such as ISO/IEC 23053:2022 to audit policies applied to machine learning workloads, and define a cadence for periodic reviews.
Make compliance auditable by design.
Define responsibilities (COOL), exception handling with expirations (waivers), and an evidence repository (policy assignments, change history, RBAC logs). Tie compliance KPIs to shared dashboards to demonstrate alignment and continuous improvement.

Governing AI Data

Without clear data rules, risks, costs, and inconsistent results grow. Data governance protects sensitive information and intellectual property, and underpins output quality. Here’s what to activate:

Centralized discovery and classification.
Use Microsoft Purview to scan, catalog, and classify data across the organization (data lakes, databases, storage, M365). Define consistent taxonomies/labels and leverage Purview SDKs to enforce policies directly in pipelines (e.g., block ingestion of “Confidential” data into noncompliant endpoints).
Maintain security boundaries across AI systems.
Indexing can decouple native source controls: require a security review before data flows into models, vector indexes, or prompts. Preserve and enforce ACLs/access metadata at the chunk level, limit exposure with Private Endpoints/VNet, and apply least privilege to indexing workflows. Accept only data that’s already classified and meets internal standards.
Prevent copyright violations.
Apply filters with Azure AI Content Safety — Protected Material Detection — on generative inputs and outputs. For training/fine-tuning, use only lawful sources and appropriate licenses, maintaining provenance and evidence (contracts, terms of use) for audits and disputes.
Version training and grounding (RAG) data.
Treat datasets like code: Snapshots, immutable versions, changelogs, and rollback. Align each model/endpoint version with the corresponding data version (documents, embeddings, filtering policies) to ensure consistency across environments and over time.

Conclusions

AI creates value when delivery speed is channeled within clear, measurable rules. Governance here doesn’t mean braking; it means scaling what works, knowing why it works, and proving it at every audit, incident, or business decision. The path is pragmatic: define a minimal, uniform baseline (identity, Networking, policy, logging), measure outcomes with a small set of shared indicators, automate as much as possible, and evolve controls at the same cadence as models and data. You don’t need perfection on the first try: you need short cycles, explicit accountability, and infrastructure as code to quickly replicate choices that prove effective. In this context, Azure’s PaaS platforms become reliable accelerators because they operate within predictable boundaries: rapid experimentation, yes—but with guardrails, observability, and continuity plans already built in. The result is innovation that stays aligned with the business, reduces risk and reliance on chance, and turns AI into a repeatable, sustainable enterprise asset.

RAG on Azure Local: the evolution of generative AI in hybrid environments

In the era of Artificial Intelligence, companies are required to combine computational power with distributed data management, as data is increasingly located across cloud environments, on-premises infrastructures, and edge settings. In this context, Azure Local emerges as a strategic solution, capable of extending the benefits of cloud computing directly into local data centers—where the most sensitive and critical workloads reside. After exploring this topic in the previous article, “AI from Cloud to Edge: Innovation Enabled by Azure Local and Azure Arc," this new piece focuses on a particularly significant evolution: the adoption of RAG Capabilities (Retrieval-Augmented Generation) within Azure Local environments. Thanks to Microsoft’s adaptive cloud approach, it is now possible to design, deploy, and scale AI solutions consistently and in a controlled manner, even in hybrid and multicloud scenarios. Azure Local thus becomes the enabler of a tangible transformation, bringing generative AI capabilities closer to the data, with clear benefits: reduced latency, preservation of data sovereignty, and greater accuracy and relevance of the generated results.

A Consistent AI Ecosystem from Cloud to Edge

Microsoft is building a consistent and distributed Artificial Intelligence ecosystem, designed to enable the development, deployment, and management of AI models wherever they are needed: in the cloud, on-premises environments, or at the edge.

This approach is structured into four key layers, each designed to address specific needs:

Application Development: With Azure AI Studio, developers can easily design and build intelligent agents and conversational assistants using pre-trained models and customizable modules. The development environment offers integrated tools and a modern interface, simplifying the entire AI application lifecycle.
AI Services: Azure offers a wide range of advanced AI services — including language models (based on OpenAI), machine translation, computer vision, and semantic search — which, until now, were limited to the cloud environment. With the introduction of RAG in Azure Local, these capabilities can now also be executed directly in local environments.
Machine Learning and MLOps: Azure Machine Learning Studio allows for efficient creation, training, optimization, and management of ML models. Thanks to the AML Arc Extension, all these features are now also available on local and edge infrastructures.
AI Infrastructure: Supporting all these layers is a solid and scalable technology foundation. Azure Local, together with Azure’s global infrastructure, provides the ideal environment for running AI workloads through containers and optimized virtual machines, ensuring high performance, Security, and compliance.

Microsoft’s goal is clear: to eliminate the boundary between the cloud and the edge, enabling organizations to harness the power of AI where the data actually resides.

What is Retrieval-Augmented Generation (RAG)

Within the unified AI ecosystem Microsoft is building, one of the most impactful innovations is Retrieval-Augmented Generation (RAG) — an advanced technique poised to revolutionize the approach to generative AI in the enterprise space. Unlike traditional models that rely solely on knowledge learned during training, RAG enriches model responses by dynamically retrieving up-to-date and relevant content from external sources such as documents, databases, or vector indexes.

RAG operates in two distinct but synergistic phases:

Retrieve: The system searches and selects the most relevant information from external sources, often built using enterprise data.
Generate: The retrieved content is used to generate more accurate responses, consistent with the context and aligned with domain-specific knowledge.

This architecture helps reduce hallucinations, increase response accuracy, and work with updated and specific data without retraining the model, thereby ensuring greater flexibility and reliability.

RAG on Azure Local: Generative AI Serving On-Premises Data

With the introduction of RAG Capabilities in Azure Local environments, organizations can now bring the power of generative AI directly to their data—wherever it resides: in the cloud, on-premises, or across multicloud infrastructures—without needing to move or duplicate it. This approach roots artificial intelligence in enterprise data and enables the native integration of advanced capabilities into local operational workflows.

The solution is available as a native Azure Arc extension for Kubernetes, providing a complete infrastructure for data ingestion, vector index creation, and querying based on language models. Everything is managed through a local portal, which offers essential tools for prompt engineering, monitoring, and response evaluation.

The experience is designed in a No-Code/Low-Code fashion, with an intuitive interface that allows even non-specialized teams to develop, deploy, and manage RAG applications.

Key Benefits

Data Privacy and Compliance: Sensitive data remains within corporate and jurisdictional boundaries, allowing the model to operate securely and in compliance with regulations.
Reduced Latency: Local data processing enables fast responses, which are crucial in real-time scenarios.
Bandwidth Efficiency: No massive data transfers to the cloud, resulting in optimized network usage.
Scalability and Flexibility: Thanks to Azure Arc, Kubernetes clusters can be deployed, monitored, and managed on local or edge infrastructures with the same operational experience as the cloud.
Seamless Integration with Existing Environments: RAG capabilities can be directly connected to document repositories, databases, or internal applications, enabling scenarios such as enterprise chatbots, intelligent search engines, or vertical digital assistants—natively and without invasive infrastructure changes.

This capability represents a fundamental element in Microsoft’s strategy: to make Azure the most open, extensible, and distributed AI platform, capable of enabling innovation wherever data resides and transforming it into a true strategic asset for the digital growth of organizations.

Advanced RAG Capabilities on Azure Local

The RAG capabilities available in Azure Local environments go beyond simply bringing generative AI closer to enterprise data—they represent a comprehensive set of advanced tools designed to deliver high performance, maximum flexibility, and full control, even in the most demanding scenarios. Thanks to continuous evolution, the platform is equipped to support complex and dynamic use cases, while keeping quality, Security, and responsibility at the forefront.

Here are the main advanced features available:

Hybrid Search and Lazy Graph RAG (coming soon): The combination of hybrid search with the upcoming support for Lazy Graph RAG enables the creation of efficient, fast, and low-cost indexes, providing accurate and contextual responses regardless of the nature or complexity of the query.
Performance Evaluation: Native evaluation pipelines allow structured testing and measurement of RAG system effectiveness. Multiple experimentation paths are supported—helpful for comparing different approaches in parallel, optimizing prompts, and improving response quality over time.
Multimodality: The platform natively supports text, images, documents, and—soon—videos. By leveraging the best parsers for each format, RAG on Azure Local can process unstructured data located on NFS shares, offering a unified and in-depth view across various content types.
Multilingual Support: Over 100 languages are supported during both ingestion and model interactions, making the solution ideal for organizations with a global presence or diverse language requirements.
Always-Up-to-Date Language Models: Each update of the Azure Arc extension provides automatic access to the latest models, ensuring optimal performance, enhanced security, and alignment with the latest advancements in generative AI.
Responsible and Compliant AI by Design: The platform includes built-in capabilities for managing security, regulatory compliance, and AI ethics. Generated content is monitored and filtered, helping organizations comply with internal policies and external regulations—without placing additional burden on developers.

Key Use Cases of RAG on Azure Local

The integration of RAG into Azure Local environments delivers tangible benefits across several sectors:

Financial Services: in the financial sector, RAG can analyze sensitive data that must remain on-premises due to regulatory constraints. It can automate compliance checks on documents and transactions, provide personalized customer support based on financial data, and create targeted business proposals by analyzing individual profiles and preferences.
Manufacturing: for manufacturing companies, RAG is a valuable ally for enhancing operational efficiency. It can offer real-time assistance in problem resolution through analysis of local production data, help identify process inefficiencies, and support predictive maintenance by anticipating failures through historical data analysis.
Public Sector: public administrations can leverage RAG to gain insights from the confidential data they manage. It’s useful for summarizing large volumes of information to support quick and informed decision-making, creating training materials from existing documentation, and enhancing public safety through predictive analysis of potential threats based on local data.
Healthcare: in the healthcare sector, RAG enables secure handling of clinical data, delivering value across multiple areas. It can support the development of personalized treatment plans based on patient data, facilitate medical research through clinical information analysis, and optimize hospital operations by analyzing patient flow and resource usage.
Retail: in the retail sector, RAG can enhance customer experiences and streamline business operations. It is effective for creating personalized marketing campaigns based on purchasing habits, optimizing inventory management through sales data analysis, and gaining deeper insights into customer behavior to refine product and service offerings.

Conclusion

The integration of RAG capabilities within Azure Local environments marks a significant milestone in the maturity of distributed Artificial Intelligence solutions. With an open, extensible, and cloud-connected architectural approach, Microsoft enables organizations to leverage the benefits of generative AI consistently—even in hybrid and on-premises scenarios. RAG capabilities, in particular, allow advanced language models to connect with the contextual knowledge stored in enterprise systems—without compromising governance, Security, or performance. This evolution makes it possible to create intelligent, secure, and customized applications across any operational context, accelerating the time-to-value of AI across multiple industries. Azure Local with RAG represents a strategic opportunity for businesses that want to govern Artificial Intelligence where data is born, lives, and generates value.

SQL Server Licensing: How Azure Arc Can Change the Rules

In my previous article, I explored how Azure Arc enables organizations to harness the power of the Azure cloud in managing SQL Servers, regardless of where the databases reside: on-premises, at the edge, or in other cloud environments. This extension of the Azure platform allows for centralized governance, enhanced security, and advanced features without requiring a full migration to the cloud.

But once this new management approach is enabled, what services are available, and how is licensing handled? What models are available, and how do they differ from traditional SQL Server licensing?

In this article, we’ll answer these questions by delving into the SQL Server licensing model enabled by Azure Arc and comparing the different approaches to help organizations choose the solution that best fits their needs.

Features Included at No Additional Cost

Azure Arc for SQL Server provides many features at no extra charge, depending on the type of license held. If the organization already has a SQL Server license with Software Assurance (L+that) or opts for the PAYG (Pay-As-You-Go) model, it can access advanced tools for free, such as:

Best practices assessment
Automated patching
Automated local backups
Point-in-time restore
TDE encryption via Azure Key Vault

For customers with a License-only (L-only) model, even without SA, key governance features are still included—such as resource inventory, failover cluster management, and support for Always-On Availability Groups.

These capabilities allow for a cloud-like management experience, even while keeping databases on local infrastructure.

Figures 1 – SQL Server enabled by Azure Arc pricing model

Value-Added Advanced Services

Naturally, Azure Arc also enables the extension of feature sets through optional paid services, which can be activated selectively based on need:

Microsoft Defender for SQL Server, for advanced protection
Log Analytics and Azure Monitor, for deep monitoring
Azure Policy, for configuration and compliance management
Purview, for data governance
Cluster-aware patching and long-term backups to Azure or Amazon S3, for resilient and modern operations

This modularity allows organizations to scale their management capabilities based on actual needs while maintaining control over costs.

A New Perspective on Licensing Management

Traditionally, SQL Server licensing has been based mainly on Enterprise Agreements and Software Assurance contracts, binding companies to three-year purchases and requiring accurate forecasting of future usage. However, this approach doesn’t align well with modern IT environments, which are marked by workload fluctuations, hybrid adoption, and the need for more dynamic cost optimization.

Limitations of Traditional Licensing

In the face of this new flexibility, it’s worth highlighting the shortcomings of the traditional model. In addition to rigid contracts and lack of flexibility for workloads, organizations often face:

Difficulty tracking actual usage
Risk of under- or over-provisioning
Unexpected and costly true-ups
Complexities in managing across multiple teams and locations

In hybrid and distributed scenarios, these limitations can slow down processes and increase costs.

This is exactly where Azure Arc comes in—not only to extend management functionalities but also to introduce new licensing models that overcome past limitations.

The PAYG Model: Licensing That Fits

To meet these needs, Azure Arc offers a Pay-As-You-Go (PAYG) model for SQL Server, allowing organizations to pay strictly for what they use—hourly or monthly.

The benefits are significant:

No upfront costs: Ideal for temporary environments, testing, or seasonal workloads.
Adaptability: Licensing follows actual usage, reducing waste.
Targeted billing: Costs can be broken down by project, department, or individual server.
Visibility and control: The Azure portal enables continuous monitoring, compliance checks, and role-based access.
Cost-saving opportunities: PAYG licenses can be included in MACC agreements and treated as OpEx, making spending more predictable.

Conclusion

The true value of Azure Arc for SQL Server lies not only in its technical capabilities but in the innovative operating model it enables: greater visibility, centralized control, process automation, and cost optimization.

Whether it’s environments under strict regulatory requirements, intermittent workloads, or gradual modernization journeys, Azure Arc offers a flexible licensing approach that aligns perfectly with real business needs.

Azure Arc truly revolutionizes SQL Server license management, moving beyond a traditional, often rigid and complex model, to embrace a dynamic, transparent model that is natively integrated with Azure cloud tools.

This evolution allows organizations to respond more agilely to the challenges of an increasingly distributed IT landscape, making the most of existing infrastructure and accelerating digital transformation.

On-Premises GPUs for AI and Virtual Desktop: How Azure Local is Changing the Game

In recent years, the adoption of technologies based on Artificial Intelligence, machine learning, and desktop virtualization has accelerated dramatically. However, behind the innovation visible to end users lies a fundamental requirement: high-performance IT infrastructures capable of efficiently handling complex workloads. While the cloud often appears to be the go-to solution, it is not always the only — or the most suitable — option for every need.

In certain scenarios, organizations are increasingly demanding local solutions that deliver high performance, low latency, and strict data control. This need is often driven by concerns related to security, regulatory compliance, and specific architectural constraints that require workloads to run directly in on-premises environments.

This is where Azure Local comes into play — a Microsoft offering that redefines the concept of hybrid infrastructure. Combining the power of Azure cloud with the flexibility of local deployment, Azure Local enables full utilization of GPUs directly within your datacenter, empowering AI and Virtual Desktop Infrastructure (VDI) scenarios with high performance and full operational control.

Why Use On-Premises GPUs?

Let’s start with a key point: enterprise-grade GPUs — those built for datacenters and heavy workloads—have become essential for handling complex tasks such as:

Training and inference of AI models
Real-time video and visual processing
Virtualization of graphic-intensive desktops and compute-heavy environments

But what happens when these workloads need to run in environments where:

Internet connectivity is absent, unstable, or unsuitable for continuous traffic
The data being processed is too sensitive or regulated to be moved to the public cloud
Applications require immediate response times without delays from round trips to Azure

In these cases, having on-premises GPUs fully integrated into your local infrastructure is not just a strategic choice — it’s often a necessity. This is where Azure Local steps in, enabling organizations to harness the power of enterprise GPUs right in their datacenter, with the simplicity and scalability of the Azure experience.

What is Azure Local?

Azure Local is essentially the extension of Microsoft’s cloud services directly into your own datacenter. It delivers a selection of Azure services, the same APIs, and the same management model—but with the option to run everything locally, wherever it’s needed: on-premises or at the edge.

With Azure Local, you can deploy applications, virtual desktops, and AI models within your own infrastructure while retaining complete data control — benefiting from the flexibility, scalability, and operational consistency of the cloud. No need to move sensitive data. No compromise on the Azure experience. Just the resources you need, right where you need them.

Azure Local + GPUs: A Powerful Combination

One of Azure Local’s most compelling features is its native GPU support, allowing you to tackle AI workloads and VDI environments with high performance and operational efficiency. You can choose between two usage modes:

DDA – Discrete Device Assignment: The GPU is exclusively assigned to a single virtual machine. This is the most powerful mode, ideal for scenarios requiring maximum compute power, such as AI model training, deep learning, or advanced rendering.
GPU-P – GPU Partitioning: In this mode, the GPU is divided into multiple virtual partitions, each assignable to a VM. Perfect for maximizing efficiency and supporting multi-user environments like VDI.

Both modes are fully compatible with NVIDIA drivers and support major compute and graphics libraries, including CUDA, OpenGL, and DirectX.

Which GPUs Are Supported?

Supported models currently include:

NVIDIA A2 and A16 – Supported in both DDA and GPU-P modes
Nvidia A10, A40, L4, L40, L40S – Supported in GPU-P mode

All are centrally manageable through Azure Arc, ensuring full control and visibility — even in the most distributed environments.

What About Virtual Desktops?

This is where things get even more interesting.

With Azure Virtual Desktop on Azure Local, you can deliver modern, high-performance, and secure desktop experiences directly within your on-premises environments. This means bringing the benefits of cloud-native VDI to where it truly matters, with session hosts physically close to end users.

The result? A significantly improved user experience, thanks to:

Ultra-low latency, ideal for on-site users or limited connectivity environments
Optimized performance for graphics applications and compute-intensive workloads
Data that remains on-premises, ensuring security and compliance
Full compatibility with Windows 11 and 10 in multi-session mode
Native integration with traditional Active Directory and Microsoft Entra ID

All orchestrated via the Azure portal, with the same provisioning, monitoring, and management tools—simplifying administration and ensuring operational consistency between cloud and datacenter.

AI Where It Truly Matters

When it comes to Artificial Intelligence, Azure Local is a game changer. You can now train, deploy, and manage AI models directly on-premises or at the edge, without relying on the cloud for every step of the process.

How? With two key technologies:

Edge RAG (Retrieval-Augmented Generation): Enhances generative models by integrating your local data—without ever moving it out of your environment. An ideal solution for highly secure and confidential use cases such as healthcare, government, or regulated industries.
Azure Machine Learning with Azure Arc: A unified platform for managing the entire lifecycle of AI models — from training to deployment — whether in the cloud or on-premises, using the same tools, APIs, and capabilities.

The result? In hybrid, secure, scalable, and fully localized AI ecosystem, designed to bring intelligence right where it’s needed: close to your data, your users, and your business-critical processes.

But Is It Complicated to Set Up?

Absolutely not. One of Microsoft’s main goals has been to simplify the configuration and management experience of Azure Local — even in GPU scenarios.

To get started, you simply need:

GPU-compatible physical hosts (over 100 validated models available)
VMs configured according to recommended technical requirements for DDA or GPU-P
Connection to Azure Arc for centralized, consistent, and secure management

Once the environment is up and running, you can operate just as you would in the Azure cloud — but with your own data, your own network, and full infrastructure control. No added complexity—just greater operational flexibility.

Conclusion

In a constantly evolving tech landscape, where performance, Security, and compliance demands are increasingly strict, Azure Local stands out as a true game changer. The ability to bring enterprise GPUs directly into local datacenters—while preserving the experience, scalability, and consistency of Azure cloud — empowers organizations to effectively tackle AI, VDI, and high-performance workloads.

Whether it’s about achieving ultra-low latency, protecting sensitive data, or operating in limited-connectivity environments, Azure Local offers a modern and tangible solution, with a flexible, manageable, and most importantly, accessible hybrid approach.

This is not simply about “bringing the cloud on-premises.” It’s about redefining how IT infrastructure supports core business processes, enabling advanced scenarios without compromise in control, performance, or security.

Ultimately, Azure Local is an excellent choice for those looking to bring innovation exactly where it’s needed most: close to the data, the users, and the everyday operational needs.