AI is moving closer to where data is created.
In telecom and infrastructure, workloads run far from traditional data centers. In manufacturing, decisions need to happen on the production line. In retail, insights come from in-store systems. Sending everything back to a central cloud is often too slow, too expensive, or simply not reliable enough.
Kubernetes enables applications to run consistently across these environments. But once you move beyond a single cluster, a new problem appears: how do you actually operate AI workloads across edge sites, regional infrastructure, and cloud environments?
This is where a combined approach from Vultr, Supermicro, and SUSE becomes practical.
What actually runs at the edge
At the edge, the priority is simple: keep things running, no matter what.
You are dealing with limited space, limited power, and sometimes unreliable connectivity. That is where Supermicro's systems fit in. They are designed to run in these environments and handle real-time workloads, such as computer vision or sensor data processing, directly at the source.
In this architecture, the edge breaks down further into the metro edge, near edge, and far edge. The metro edge is where Supermicro systems are typically deployed, providing localized compute in secondary regional locations. The near edge is where Vultr operates, delivering regional cloud infrastructure close to users and edge sites. The far edge refers to the devices themselves – such as sensors, cameras, or on-site systems – where data is initially generated.
A typical setup might be a small Kubernetes cluster using K3s, running a local inference service. If the network drops, the system continues to operate. That’s non-negotiable in most edge scenarios.
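As a minimal sketch, that local service is just an ordinary Kubernetes Deployment and Service – the image, registry, and namespace below are illustrative placeholders, not part of any specific product:

```yaml
# Sketch of a local inference service on a K3s edge cluster.
# Image, registry, and namespace are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vision-inference
  namespace: edge-ai
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vision-inference
  template:
    metadata:
      labels:
        app: vision-inference
    spec:
      containers:
        - name: inference
          image: registry.local:5000/vision-inference:1.4  # pulled from an on-site registry, no WAN dependency
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "500m"
              memory: 512Mi
            limits:
              cpu: "2"
              memory: 2Gi
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
---
# Cluster-local Service so on-site producers (cameras, sensors) reach the model directly.
apiVersion: v1
kind: Service
metadata:
  name: vision-inference
  namespace: edge-ai
spec:
  selector:
    app: vision-inference
  ports:
    - port: 80
      targetPort: 8080
```

Because the image comes from an on-site registry and the Service is cluster-local, inference keeps working even when the uplink is down.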
But running one site is easy. Running hundreds or thousands is where things break down.
The real challenge: operating at scale
Once you scale beyond a handful of locations, the problem shifts from deployment to operations.
You need a way to:
- Roll out new models safely
- Keep clusters updated
- Enforce consistent policies
- Avoid configuration drift
This is where SUSE Edge comes in.
With K3s, Rancher, and Fleet, you can manage the full lifecycle of Kubernetes across distributed environments. Instead of manually touching each site, updates are pushed through a GitOps workflow. The same model version, the same configuration, the same policies – applied consistently everywhere.
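As a sketch, the core of that workflow is a single Fleet GitRepo resource – the repository URL, path, and cluster labels here are illustrative assumptions:

```yaml
# Fleet GitRepo: Rancher/Fleet watches this repository and keeps every
# matching downstream cluster in sync with what is committed to it.
apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: edge-inference
  namespace: fleet-default
spec:
  repo: https://git.example.com/platform/edge-deployments  # placeholder repo
  branch: main
  paths:
    - inference/production   # manifests or Helm charts for the model service
  targets:
    - name: all-edge-sites
      clusterSelector:
        matchLabels:
          site-type: edge     # label assigned when each K3s cluster registers
```

Promoting a new model version becomes a Git commit; Fleet reconciles every labeled cluster, which is what removes the need to touch sites one by one.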
That’s what makes large-scale edge deployments manageable.
Why cloud and near-edge infrastructure matter
Not everything should run at the edge – or in a single cloud region.
As deployments scale, data residency rules, regulatory requirements, and latency constraints mean workloads need to stay close to the regions they serve. At the same time, building a separate environment in every location is impractical.
A near-edge cloud layer solves this.
With infrastructure spanning 33 global data centers, Vultr enables teams to deploy regional Kubernetes clusters close to edge sites and users while keeping deployments consistent across locations. Using the Cluster API, teams can define cluster configurations once and programmatically replicate, scale, and manage them across regions – without rebuilding each environment from scratch.
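As an illustration, a Cluster API definition might look like the sketch below. The names are placeholders, and the VultrCluster reference follows general Cluster API provider conventions rather than a verbatim Vultr configuration:

```yaml
# Cluster API cluster definition – one instance per region (fra, ams, ...).
# VultrCluster stands in for the provider-specific infrastructure object;
# its exact kind and API version depend on the provider in use.
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: near-edge-fra
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["10.244.0.0/16"]
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: near-edge-fra-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: VultrCluster
    name: near-edge-fra
```

Creating near-edge-ams or near-edge-lon is then a matter of templating the same definition with a different name and region, rather than rebuilding each environment by hand.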
This also enables a layered regional approach, where primary regions extend out to metro-edge and far-edge locations. For example, a primary near-edge region such as Frankfurt can be extended to a metro-edge location like Bonn, and further out to the far edge, where devices and data sources operate. This brings compute closer to where workloads execute while keeping those sites consistent with the broader cloud environment.
These regional clusters handle:
- Model storage and distribution
- API and application services
- Telemetry within regional boundaries
- GPU-based inference when edge capacity is limited, powered by high-performance AMD GPUs (sketched below)
This keeps edge environments focused, while the cloud provides regional scale, alignment with compliance requirements, and consistency.
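For that last case, a hedged sketch of GPU-backed inference in a regional cluster – assuming the AMD GPU device plugin is installed, which exposes GPUs as the amd.com/gpu resource; the image is a placeholder:

```yaml
# Regional inference Deployment requesting one AMD GPU per replica.
# Assumes the AMD GPU device plugin advertises the amd.com/gpu resource.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: regional-inference
spec:
  replicas: 2
  selector:
    matchLabels:
      app: regional-inference
  template:
    metadata:
      labels:
        app: regional-inference
    spec:
      containers:
        - name: inference
          image: registry.example.com/llm-inference:2.0  # placeholder image
          resources:
            limits:
              amd.com/gpu: 1   # schedules each replica onto a GPU node
```

Edge sites can then call this regional endpoint when a request exceeds local capacity, keeping heavy models off constrained hardware.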
From architecture to real-world deployment
This model works because each layer has a clear role.
Vultr provides the cloud layer, delivering global infrastructure with regional deployment options, scalable compute, and GPU access – while ensuring data is aggregated within geographic boundaries. At the edge, Supermicro systems run AI workloads close to where data is created, ensuring real-time performance and resilience. Across those environments, SUSE provides the control layer – managing Kubernetes, applications, and updates consistently at scale.
Together, this creates a system that is both distributed and manageable. Edge environments stay focused, regional infrastructure supports scale, and operations remain consistent across locations.
As this approach continues to evolve, discussions at events like SUSECON highlight how cloud, edge, and Kubernetes ecosystems are converging to support real-world AI deployments at scale.
Running AI from cloud to edge is not just about where workloads run – it’s about making them work together as a single, reliable system.

