Local AI Inferencing
In many situations, analyzing data in real time directly at the edge (e.g., within a retail store or an industrial facility) significantly lowers latency and reduces bandwidth usage. Azure Local enables on-site data processing, eliminating the need to transfer all data to the cloud before performing critical analyses. Here are some examples:
- Retail Loss Prevention: With AI integrated locally, suspicious behaviors and potential thefts can be identified in real time, allowing retailers to act immediately and reduce losses.
- Smart Self-Checkout: Video surveillance and visual analysis facilitate automatic item recognition, improving customer experience and reducing wait times.
- Pipeline Monitoring: In sectors like oil & gas, real-time video monitoring of infrastructure helps detect anomalies and leaks, reducing environmental risks and ensuring timely interventions.
Operational Continuity in Mission-Critical Environments
Business continuity during network or power outages is critical in many environments. With Azure Local, systems can be designed to keep operating even when cloud connectivity is limited or unavailable. Examples include:
- Factory and Warehouse Operations: Production and inventory management cannot stop; having a local solution ensures that production lines and management systems continue functioning despite network disruptions.
- Stadiums and Event Venues: Services like security, ticketing, and lighting must remain operational to safeguard both spectator experience and safety.
- Transport Hubs: Constant operation of ticketing systems, scheduling, and communications is essential for passenger flow and safety in large transit hubs.
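One common way to keep local systems running through a network outage is a store-and-forward buffer: events are always accepted locally and forwarded to the cloud once the link returns. The sketch below is only an illustration of that pattern, not an Azure Local API. The class name `StoreAndForwardBuffer`, the `send` callback, and the `max_events` limit are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict

Event = Dict[str, object]


class StoreAndForwardBuffer:
    """Queue events locally and forward them when the uplink is available."""

    def __init__(self, send: Callable[[Event], bool], max_events: int = 10_000) -> None:
        # `send` is a hypothetical uplink callback that returns True on success.
        self._send = send
        # Bounded queue: when full, the oldest event is dropped.
        self._pending: Deque[Event] = deque(maxlen=max_events)

    def record(self, event: Event) -> None:
        """Always accept the event locally, then try to drain the queue."""
        self._pending.append(event)
        self.flush()

    def flush(self) -> int:
        """Send queued events in order and stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break  # uplink is down: keep the event for the next attempt
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        """Number of events still waiting to be forwarded."""
        return len(self._pending)
```

In this sketch, local operations never block on the cloud: `record` succeeds whether or not `send` does, and ordering is preserved because the queue drains from the front. A production design would also need durable storage so that queued events survive a restart.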
Control Systems and Near Real-Time Processing
Some industrial, financial, and healthcare environments demand extremely low response times to avoid errors, ensure safety, or maximize performance. Azure Local, combined with Azure Arc, can meet these latency requirements:
- Manufacturing Execution Systems (MES): Continuous synchronization and monitoring of production machinery optimize processes and minimize downtime.
- Industrial Quality Assurance (QA): Immediate quality checks and verifications identify defects before they reach the final stage of production, increasing compliance and reducing waste.
- Financial Infrastructures: Low-latency transaction processing and rapid risk assessment are critical for market competitiveness and stability.
Regulatory Compliance and DDIL Connectivity (Disconnected, Degraded, Intermittent, Limited)
For many organizations (governmental, military, or those operating critical infrastructures), data protection and secure management, even in the absence of reliable connectivity, are top priorities. Azure Local supports the need for on-premises data and control:
- Government and Military Sectors: Data confidentiality is paramount, requiring local management to ensure continuous access even in compromised network scenarios.
- Energy Infrastructures: The stability of distribution networks and control of pipelines and refineries require resilience under limited connectivity conditions, while adhering to stringent regulations.
Azure’s Adaptive Cloud Approach
Microsoft’s adaptive cloud approach, enabled by Azure Arc, helps organizations unify hybrid, multicloud, and edge infrastructures within Azure. With Azure Arc, the same cloud-native experiences and capabilities—such as security, updates, management, and scalability—can be extended anywhere, from on-premises data centers to distributed locations.
Azure Local, connected to the cloud through Azure Arc, enables:
- Operating and scaling distributed infrastructure via the Azure portal and the same APIs.
- Running fundamental compute, network, storage, and application services locally, choosing hardware from the preferred vendor.
- Strengthening the security of apps and data with Azure technologies, protecting them against advanced threats.
A key feature is the presence of Azure Kubernetes Service (AKS), Microsoft’s managed Kubernetes solution. On Azure Local, AKS can be configured and updated automatically, providing everything needed (storage drivers, container images for Linux and Windows, etc.) to support containerized applications. Moreover, each cluster is automatically enabled with Azure Arc, allowing integration with services like Microsoft Defender for Containers, Azure Monitor, and GitOps for continuous delivery.
New Azure AI Services with Azure Local and Azure Arc
On-Premises Data Search with Generative AI
In recent years, generative AI has made significant strides, driven by language models (like GPT) capable of interpreting and generating natural language text. Public tools like ChatGPT work well for general knowledge queries but cannot answer questions about private business data on which they have not been trained. Retrieval-Augmented Generation (RAG) bridges this gap: at query time, it retrieves relevant passages from proprietary data and supplies them to the language model as context, enabling more advanced and customized use cases without retraining the model.
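To make the RAG idea concrete, here is a minimal sketch of its two core steps: retrieving the most relevant private documents for a question, then building a prompt that contains them. It is a toy under stated assumptions, not the Edge RAG implementation. The bag-of-words cosine similarity stands in for a real embedding-based search, the sample documents in `DOCS` are invented, and the call to a locally hosted language model is left out.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: Dict[str, str], k: int = 2) -> List[Tuple[str, float]]:
    """Return up to k (doc_id, score) pairs with a non-zero score, best first."""
    q = Counter(tokenize(query))
    scored = [(doc_id, cosine(q, Counter(tokenize(text)))) for doc_id, text in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(doc_id, score) for doc_id, score in scored[:k] if score > 0]


def build_prompt(query: str, documents: Dict[str, str], k: int = 2) -> str:
    """Augment the question with the retrieved passages as context."""
    hits = retrieve(query, documents, k)
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id, _ in hits)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"


# Invented sample data standing in for on-premises business documents.
DOCS = {
    "returns-policy": "Store returns are accepted within 30 days with a receipt.",
    "pump-maintenance": "Pump P-101 requires a seal inspection every 500 operating hours.",
    "shift-schedule": "The night shift starts at 22:00 and ends at 06:00.",
}

prompt = build_prompt("How often does pump P-101 need a seal inspection?", DOCS)
```

The resulting `prompt` would then be sent to a language model. Because the private data reaches the model only as context in the prompt, the model itself does not need to be retrained on that data.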
Within the Azure Local framework, Microsoft has announced a new service that brings generative AI and RAG directly to the edge, where the data resides. Within minutes, organizations can deploy (via an Azure Arc extension) everything needed to query their on-premises data, including:
- Small and large language models (SLM/LLM) running locally, with support for both CPUs and GPUs.
- An end-to-end data ingestion and RAG pipeline that keeps all information on-premises, with RBAC (Role-Based Access Control) ensuring secure access.
- An integrated tool for prompt engineering and result evaluation to optimize model settings and performance.
- APIs and interfaces aligned with Azure standards, facilitating integration into enterprise applications, plus a preconfigured UI for immediate service use.
This feature is now available in private preview for Azure Local customers, with Microsoft planning to expand availability to other Arc-enabled platforms in the near future.
“Edge RAG”: The Local Retrieval-Augmented Generation Ecosystem
This new service, known as “Edge RAG”, integrates with the Azure ecosystem and works with components such as:
- Azure AI Search: Provides document search and indexing functionality, enabling quick identification of relevant content within large datasets.
- Azure OpenAI: Offers advanced AI models (like GPT) capable of generating, understanding, and summarizing text in natural language.
- Azure AI Studio: A platform for developing and managing AI assets (datasets, models, pipelines) centrally.
Together, these components power an integrated flow—from data ingestion to inference and result presentation via chat or other development interfaces. This enables the creation of chatbots, knowledge discovery tools, and other AI-driven solutions that leverage internal business data in a secure, customizable, and compliant environment.
Deploying Open-Source AI Models via Azure Arc
Another key feature of Azure AI is the availability of a catalog of AI models tested, validated, and guaranteed by Microsoft. These models are ready for deployment and provide consistent inference endpoints. This functionality is now extended to the edge, where Microsoft makes selected models available directly from the Azure portal:
- Phi-3.5 Mini (language model with 3.8 billion parameters)
- Mistral 7B (language model with 7.3 billion parameters)
- MMDetection YOLO (object detection)
- OpenAI Whisper Large (speech-to-text recognition)
- Google T5 Base (automatic translation)
These models can be deployed in just a few steps on an Arc-enabled AKS cluster running on-premises. Most models require only a CPU, but Phi-3.5 Mini and Mistral 7B also support GPUs for enhanced performance in intensive inference scenarios.
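The parameter counts above give a rough idea of how much memory the two language models need for their weights alone: parameters multiplied by bytes per parameter. The sketch below works through that arithmetic. The precision table is a general assumption, since the source does not say which numeric formats the packaged models use, and the estimate excludes activations, the key-value cache, and runtime overhead.

```python
# Bytes used to store one parameter at common numeric precisions (assumed, not from the source).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Parameter counts as listed in the model catalog above.
MODELS = {"Phi-3.5 Mini": 3.8e9, "Mistral 7B": 7.3e9}


def weight_memory_gb(params: float, precision: str) -> float:
    """Estimate the memory needed for model weights only, in gigabytes (10^9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


for name, params in MODELS.items():
    for precision in ("fp16", "int4"):
        print(f"{name} @ {precision}: {weight_memory_gb(params, precision):.2f} GB")
```

At 16-bit precision this comes to roughly 7.6 GB for Phi-3.5 Mini and 14.6 GB for Mistral 7B, which suggests why GPU support matters most for these two models in intensive inference scenarios.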
Azure AI Offerings: From Cloud to Edge
Microsoft’s approach spans the full spectrum of AI capabilities, offering services and tools that can be delivered in the Azure cloud or extended to on-premises and edge environments via Azure Arc. The offering consists of four main pillars:
- Application Development
  - Azure AI Studio: A development environment for AI applications (e.g., chatbots, virtual agents) with a complete set of APIs and interfaces for seamless AI integration.
- AI Services
  - Azure AI Language and Model Services: Preconfigured services for NLP, computer vision, and other AI functionalities.
  - Solutions like Edge RAG, Video Indexer, and Managed AI Containers for local deployment of “ready-to-use” AI models.
- Machine Learning & ML Ops
  - Azure Machine Learning Studio: A comprehensive platform for creating, training, optimizing, and managing machine learning models.
  - With Azure Arc, ML Ops capabilities can extend to the edge via extensions like the AML Arc Extension, enabling Azure ML tools on on-premises and edge infrastructures.
- Infrastructure
  - Azure Global Infrastructure: Azure’s cloud foundation, including compute, storage, and networking resources.
  - Arc-Enabled Edge Infrastructure: Extends Azure capabilities to data centers or edge devices, managed as if they were cloud resources.
Conclusion
Microsoft’s strategy is built on delivering the best of the cloud “anywhere.” Azure Local epitomizes this vision: a solution that brings all the benefits of the cloud—agility, scalability, security—directly to local environments, meeting the needs for low latency, operational continuity, and regulatory compliance.
Thanks to Azure Arc, organizations can leverage Azure AI services such as advanced language models, Retrieval-Augmented Generation (RAG) pipelines, and ML Ops tools in a hybrid mode. Applications range from factory quality control to retail theft prevention, from critical government data centers to energy infrastructure monitoring.
In a world where data continues to grow exponentially and the need for on-site analysis becomes increasingly urgent, solutions like Azure Local represent the next step toward a new generation of distributed infrastructures. This is how Microsoft meets the challenge of uniting cloud potential with on-premises reality, creating opportunities for innovation and growth across all sectors.