Category Archives: Azure Machine Learning

AI from Cloud to Edge: Innovation Powered by Azure Local and Azure Arc

In the era of Artificial Intelligence, which is significantly transforming business models, the adoption of local and distributed infrastructures is crucial for managing specific and mission-critical workloads. In this context, Azure Local emerges as an innovative solution capable of bridging the gap between cloud and edge computing, delivering applications, data, and AI services exactly where they are needed. This article will explore real-world scenarios where Azure Local, combined with Azure Arc, enables real-time data processing “at the source” and the deployment of advanced AI solutions. We will also delve into the new Azure AI services designed for Azure Local, focusing on maximizing the potential of on-premises data.

Real-World Scenarios of Local and Distributed Infrastructure with Azure Local

In the following sections, we will examine concrete examples that demonstrate how Azure Local, in synergy with Azure Arc, effectively addresses the needs of distributed infrastructure, ensuring low latency, security, and operational continuity across various business and industrial contexts.

Figure 1 – Real-World Scenarios for Local and Distributed Infrastructure with Azure Local

Local AI Inferencing

In many situations, analyzing data in real-time directly at the edge (e.g., within a retail store or an industrial facility) provides significant advantages in terms of latency and reduced bandwidth usage. Azure Local enables on-site data processing, eliminating the need to transfer all data to the cloud before performing critical analyses. Here are some examples:

  • Retail Loss Prevention: With AI integrated locally, suspicious behaviors and potential thefts can be identified in real-time, allowing retailers to act immediately and reduce losses.
  • Smart Self-Checkout: Video surveillance and visual analysis facilitate automatic item recognition, improving customer experience and reducing wait times.
  • Pipeline Monitoring: In sectors like oil & gas, real-time video monitoring of infrastructure helps detect anomalies and leaks, reducing environmental risks and ensuring timely interventions.

Operational Continuity in Mission-Critical Environments

The ability to ensure business continuity during network or power outages is a crucial aspect. With Azure Local, robust systems can be implemented to preserve operations even when cloud connectivity is limited or unavailable. Examples include:

  • Factory and Warehouse Operations: Production and inventory management cannot stop; having a local solution ensures that production lines and management systems continue functioning despite network disruptions.
  • Stadiums and Event Venues: Services like security, ticketing, and lighting must remain operational to safeguard both spectator experience and safety.
  • Transport Hubs: Constant operation of ticketing systems, scheduling, and communications is essential for passenger flow and safety in large transit hubs.

Control Systems and Near Real-Time Processing

Some industrial, financial, and healthcare environments demand extremely low response times to avoid errors, ensure safety, or maximize performance. Azure Local, combined with Azure Arc, can meet these latency requirements:

  • Manufacturing Execution Systems (MES): Continuous synchronization and monitoring of production machinery optimize processes and minimize downtime.
  • Industrial Quality Assurance (QA): Immediate quality checks and verifications identify defects before they reach the final stage of production, increasing compliance and reducing waste.
  • Financial Infrastructures: Low-latency transaction processing and rapid risk assessment are critical for market competitiveness and stability.

Regulatory Compliance and DDIL Connectivity (Disconnected, Degraded, Intermittent, Limited)

For many organizations (governmental, military, or those operating critical infrastructures), data protection and secure management, even in the absence of reliable connectivity, are top priorities. Azure Local supports the need for on-premises data and control:

  • Government and Military Sectors: Data confidentiality is paramount, requiring local management to ensure continuous access even in compromised network scenarios.
  • Energy Infrastructures: The stability of distribution networks and control of pipelines and refineries require resilience under limited connectivity conditions, while adhering to stringent regulations.

Azure’s Adaptive Cloud Approach

Microsoft’s adaptive cloud approach, enabled by Azure Arc, helps organizations unify hybrid, multicloud, and edge infrastructures within Azure. With Azure Arc, the same cloud-native experiences and capabilities—such as security, updates, management, and scalability—can be extended anywhere, from on-premises data centers to distributed locations.

Figure 2 – Adaptive Cloud Approach

Azure Local, connected to the cloud through Azure Arc, enables:

  • Operating and scaling distributed infrastructure via the Azure portal and the same APIs.
  • Running fundamental compute, network, storage, and application services locally, choosing hardware from the preferred vendor.
  • Strengthening the security of apps and data with Azure technologies, protecting them against advanced threats.

A key feature is the presence of Azure Kubernetes Service (AKS), Microsoft’s managed Kubernetes solution. On Azure Local, AKS can be configured and updated automatically, providing everything needed (storage drivers, container images for Linux and Windows, etc.) to support containerized applications. Moreover, each cluster is automatically enabled with Azure Arc, allowing integration with services like Microsoft Defender for Containers, Azure Monitor, and GitOps for continuous delivery.

Figure 3 – Bring Azure Apps, Data, and AI Anywhere

New Azure AI Services with Azure Local and Azure Arc

On-Premises Data Search with Generative AI

In recent years, generative AI has made significant strides, driven by the introduction of language models (like GPT) capable of interpreting and generating natural language text. Public tools like ChatGPT work well for general knowledge queries but cannot address questions about private business data on which they have not been trained. To bridge this gap, the concept of Retrieval Augmented Generation (RAG) was introduced, a technique that “enhances” language models with proprietary data, enabling more advanced and customized use cases.

Within the Azure Local framework, Microsoft has announced a new service that brings generative AI and RAG directly to the edge, where the data resides. Within minutes, organizations can deploy (via an Azure Arc extension) everything needed to query their on-premises data, including:

  • Small and large language models (SLM/LLM) running locally, with support for both CPUs and GPUs.
  • An end-to-end data ingestion and RAG pipeline that keeps all information on-premises, with RBAC (Role-Based Access Control) ensuring secure access.
  • An integrated tool for prompt engineering and result evaluation to optimize model settings and performance.
  • APIs and interfaces aligned with Azure standards, facilitating integration into enterprise applications, plus a preconfigured UI for immediate service use.

This feature is now available in private preview for Azure Local customers, with Microsoft planning to expand availability to other Arc-enabled platforms in the near future.

“Edge RAG”: The Local Retrieval-Augmented Generation Ecosystem

This new service, known as “Edge RAG”, integrates seamlessly into the Azure ecosystem and supports various input components, such as:

  • Azure AI Search: Provides document search and indexing functionality, enabling quick identification of relevant content within large datasets.
  • Azure OpenAI: Offers advanced AI models (like GPT) capable of generating, understanding, and summarizing text in natural language.
  • Azure AI Studio: A platform for developing and managing AI assets (datasets, models, pipelines) centrally.

Together, these components power an integrated flow—from data ingestion to inference and result presentation via chat or other development interfaces. This enables the creation of chatbots, knowledge discovery tools, and other AI-driven solutions that leverage internal business data in a secure, customizable, and compliant environment.

Deploying Open-Source AI Models via Azure Arc

Another key feature of Azure AI is the availability of a catalog of AI models tested, validated, and guaranteed by Microsoft. These models are ready for deployment and provide consistent inference endpoints. This functionality is now extended to the edge, where Microsoft makes selected models available directly from the Azure portal:

  • Phi-3.5 Mini (language model with 3.8 billion parameters)
  • Mistral 7B (language model with 7.3 billion parameters)
  • MMDetection YOLO (object detection)
  • OpenAI Whisper Large (speech-to-text recognition)
  • Google T5 Base (automatic translation)

These models can be deployed in just a few steps on an Arc AKS cluster running on-premises. Most models require only a CPU, but Phi-3.5 and Mistral 7B also support GPUs for enhanced performance in intensive inference scenarios.

Azure AI Offerings: From Cloud to Edge

Microsoft’s approach spans the full spectrum of AI capabilities, offering services and tools that can be delivered in the Azure cloud or extended to on-premises and edge environments via Azure Arc. The offering consists of four main pillars:

  • Application Development
    • Azure AI Studio: A development environment for AI applications (e.g., chatbots, virtual agents) with a complete set of APIs and interfaces for seamless AI integration.
  • AI Services
    • Azure AI Language and Model Services: Preconfigured services for NLP, computer vision, and other AI functionalities.
    • Solutions like Edge RAG, Video Indexer, and Managed AI Containers for local deployment of “ready-to-use” AI models.
  • Machine Learning & ML Ops
    • Azure Machine Learning Studio: A comprehensive platform for creating, training, optimizing, and managing machine learning models.
    • With Azure Arc, ML Ops capabilities can extend to the edge via extensions like the AML Arc Extension, enabling Azure ML tools on on-premises and edge infrastructures.
  • Infrastructure
    • Azure Global Infrastructure: Azure’s cloud foundation, including compute, storage, and networking resources.
    • Arc-Enabled Edge Infrastructure: Extends Azure capabilities to data centers or edge devices, managed as if they were cloud resources.

Conclusion

Microsoft’s strategy is built on delivering the best of the cloud “anywhere.” Azure Local epitomizes this vision: a solution that brings all the benefits of the cloud—agility, scalability, security—directly to local environments, meeting the needs for low latency, operational continuity, and regulatory compliance.

Thanks to Azure Arc, organizations can leverage Azure AI services such as advanced language models, Retrieval-Augmented Generation (RAG) pipelines, and ML Ops tools in a hybrid mode. Applications range from factory quality control to retail theft prevention, from critical government data centers to energy infrastructure monitoring.

In a world where data continues to grow exponentially and the need for on-site analysis becomes increasingly urgent, solutions like Azure Local represent the next step toward a new generation of distributed infrastructures. This is how Microsoft meets the challenge of uniting cloud potential with on-premises reality, creating opportunities for innovation and growth across all sectors.

Building modern IT architectures for Machine Learning

For most companies, the ability to continuously provide and integrate artificial intelligence solutions within their own applications and business workflows, is considered a particularly complex evolution. In the rapidly evolving artificial intelligence landscape, machine learning (ML) plays a fundamental role together with "data science". Therefore, to increase the successes of certain artificial intelligence projects, organizations must have modern and efficient IT architectures for machine learning. This article describes how these architectures can be built anywhere thanks to the integration between Kubernetes, Azure Arc ed Azure Machine Learning.

Azure Machine Learning

Azure Machine Learning (AzureML) is a cloud service that you can use to accelerate and manage the life cycle of machine learning projects, bringing ML models into a secure and reliable production environment.

Kubernetes as a compute target for Azure Machine Learning

Azure Machine Learning recently introduced the ability to activate a new target for computing: AzureML Kubernetes compute. In fact,, it is possible to use an Azure Kubernetes Service cluster (AKS) existing or an Azure Arc-enabled Kubernetes cluster as a compute target for Azure Machine Learning and use it to validate and deploy ML models.

Figure 1 - Overview on how to take Azure ML anywhere thanks to K8s and Azure Arc

AzureML Kubernetes compute supports two types of Kubernetes clusters:

  • Cluster AKS (in Azure environment). Using an Azure Kubernetes Service managed cluster (AKS), you can get a flexible environment, secure and capable of meeting compliance requirements for ML workloads.
  • Arc-enabled Kubernetes Cluster (in environments other than Azure). Thanks to Azure Arc-enabled Kubernetes it is possible to manage Kubernetes running in different environments from Azure clusters (on-premises or on other clouds) and use them to deploy ML models.

To enable and use a Kubernetes cluster to run AzureML workloads you need to follow the following steps:

  1. Activate and configure an AKS cluster or an Arc-enabled Kubernetes cluster. In this regard it is also recalled the possibility of activate AKS in Azure Stack HCI environment.
  2. Distribute the extension AzureML on the cluster.
  3. Connect the Kubernetes cluster to the Azure ML workspace.
  4. Use the Kubernetes compute target from CLI v2, SDK v2 and the Studio UI.

Figure 2 - Step to enable and use a K8s cluster for AzureML workloads

Infrastructure management for ML workloads can be complex and Microsoft recommends that it be done by the IT-operations team, so that the data science team can focus on the efficiency of the ML models. In light of this consideration, the division of roles can be as follows:

  • The IT-operation Team is responsible for the former 3 steps above. Furthermore, typically performs the following activities for the data science team:
    • Make configurations of aspects related to networking and security
    • Create and manage instance types for different ML workload scenarios in order to achieve efficient use of compute resources.
    • It deals with troubleshooting the workload of Kubernetes clusters.
  • The Data science Team, completed the activation activities in charge of IT-operation Team , can locate a list of compute targets and instance types available in the AzureML workspace. These compute resources can be used for training or inference workloads. The compute target is chosen by the team using specific tools such as AzureML CLI v2, Python SDK v2 or Studio UI.

Usage scenarios

The ability to use Kubernetes as a compute target for Azure Machine Learning, combined with the potential of Azure Arc, allows you to create, train and deploy ML models in any on-premises infrastructure or on different clouds.

This possibility activates different new usage scenarios, previously unthinkable using only the cloud environment. The following table provides a summary of the use scenarios made possible by Azure ML Kubernetes compute, specifying where the data resides, the motivation that drives each usage model and how it is implemented at the infrastructure and Azure ML level.

Table 1 - New usage scenarios made possible by Azure ML Kubernetes compute

Conclusions

Gartner expects that by 2025, due to the rapid spread of AI initiatives, the 70% of organizations will have operationalized IT architectures for artificial intelligence. Microsoft, thanks to the integration between different solutions, offers a series of possibilities to activate flexible and cutting-edge architectures for Machine Learning, an integral part of artificial intelligence.