Mainsail Industries
Starlight

Edge AI

Edge-native AI Infrastructure

Run AI close to the data and close to the mission. Keep workloads running when networks degrade and sites disconnect, on infrastructure you already control.

01 AI supply chain

Models delivered as OCI containers.

AI models are infrastructure artifacts, not loose blobs. Pull them from a public or private registry the same way you pull any container image. The runtime starts. The endpoint comes up. Async pull with progress and cancellation, resource-aware scheduling, the same lifecycle as any container workload.

For airgapped sites, pull from on-premises mirrors.
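
What a registry-native pull looks like from the operator side, as a minimal sketch using the Docker SDK for Python to stream progress events. The registry path and tag are illustrative, and this is not Starlight's own pull API.

```python
# Sketch: pull a model artifact from a registry with streamed progress.
# Registry path and tag are hypothetical; this is not Starlight's API.
import docker

IMAGE = "registry.example.internal/models/llama-3-8b"  # on-prem mirror
TAG = "q4"

api = docker.APIClient()  # the low-level client exposes streaming pulls

# stream=True yields one decoded JSON event per layer update, enough to
# drive a progress bar, or to cancel mid-pull by breaking out of the loop.
for event in api.pull(IMAGE, tag=TAG, stream=True, decode=True):
    detail = event.get("progressDetail") or {}
    if detail.get("total"):
        pct = 100 * detail["current"] // detail["total"]
        print(f"{event.get('id', '')} {event['status']}: {pct}%")
    else:
        print(event.get("status", ""))
```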

Give AI models the same enterprise treatment as container workloads.

02 Hardware discovery

One model. Heterogeneous edge hardware.

The platform handles the discovery. Operators handle the mission. A single OCI artifact resolves to a GPU runtime on a GPU host, a mixed CPU+GPU runtime where layers are partially offloaded, or a CPU-only runtime with quantization chosen to fit available memory.

When capacity is insufficient, the platform returns a clear, typed error rather than a half-loaded model.
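
As an illustration of that resolution step, here is a sketch of the decision under assumed inputs; the class, field names, thresholds, and quantization size ratios are hypothetical, not Starlight's scheduler.

```python
# Illustrative only: how one artifact could resolve to a GPU, mixed, or
# CPU-only runtime. Names, ratios, and thresholds are hypothetical.
from dataclasses import dataclass

@dataclass
class HostCapacity:
    gpu_vram_gb: float    # 0 on CPU-only hosts
    system_ram_gb: float

class InsufficientCapacityError(Exception):
    """Typed failure: refuse to start rather than half-load a model."""

def resolve_runtime(model_fp16_gb: float, host: HostCapacity) -> dict:
    if host.gpu_vram_gb >= model_fp16_gb:
        return {"backend": "vLLM", "placement": "gpu"}
    if host.gpu_vram_gb > 0:
        # Offload as many layers as fit in VRAM; run the rest on CPU.
        fraction = round(host.gpu_vram_gb / model_fp16_gb, 2)
        return {"backend": "llama.cpp", "placement": "mixed",
                "gpu_layer_fraction": fraction}
    # CPU-only: pick the largest quantization that fits in system RAM.
    # Rough size ratios relative to FP16 weights.
    for quant, ratio in (("Q8", 0.55), ("Q5", 0.40), ("Q4", 0.30)):
        if model_fp16_gb * ratio <= host.system_ram_gb:
            return {"backend": "llama.cpp", "placement": "cpu",
                    "quant": quant}
    raise InsufficientCapacityError(
        f"needs ~{model_fp16_gb * 0.30:.0f} GB even at Q4; "
        f"host has {host.system_ram_gb} GB RAM")
```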

03 Inference backends

vLLM for production GPU. llama.cpp for the edge.

vLLM brings PagedAttention, continuous batching, and tensor parallelism for high-throughput GPU serving. llama.cpp covers CPU and mixed CPU+GPU paths with Q4, Q5, and Q8 quantization for constrained hosts.

Both expose OpenAI-compatible endpoints. Applications written against cloud inference APIs run unchanged against a local Starlight endpoint.

One endpoint shape. The platform picks the backend; the client code does not change.
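
In practice that means the standard OpenAI Python client works as-is; only the base URL changes. The endpoint URL and model name below are assumptions, not fixed Starlight values.

```python
# The same client code that talks to a cloud inference API, pointed at a
# local endpoint instead. URL and model name are illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # hypothetical local endpoint
    api_key="not-needed-locally",         # placeholder; no cloud key
)

resp = client.chat.completions.create(
    model="llama-3-8b",  # whatever model the platform resolved and started
    messages=[{"role": "user", "content": "Summarize today's sensor log."}],
)
print(resp.choices[0].message.content)
```

Whether vLLM or llama.cpp is serving behind that URL is invisible to the caller.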

04 OS foundation

Drivers in the image. Atomic updates. No driver hell.

Edge AI deployments fail on driver pairing: kernel and driver updated independently, Container Toolkit out of sync, CDI broken on the next reboot.

StarlightOS pairs the kernel, GPU driver, container toolkit, and CDI configuration in a single signed image. Updates are atomic. Failed updates roll back.

GPU drivers and kernel in sync and signed.

05 Network policy

Declarative network intent for AI endpoints.

Operators describe intent once. The platform compiles L3 and L4 firewall rules and L7 gateway configuration from the same artifact. When an AI workload starts, gateway routes and firewall openings are configured automatically. No hand-edits.

Validate, preview, and preflight before any traffic-affecting change. A failed reload rolls back atomically.
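
A sketch of what compiling from one artifact could look like: a single declarative intent expanded into L4 firewall rules and an L7 gateway route. The intent schema and output formats are hypothetical.

```python
# Hypothetical intent document: one declaration drives both the L3/L4
# firewall and the L7 gateway. Field names and outputs are illustrative.
intent = {
    "workload": "llm-endpoint",
    "listen_port": 8000,
    "allow_from": ["10.20.0.0/16"],
    "route": {"host": "ai.site.local", "path": "/v1"},
}

def compile_firewall(i: dict) -> list[str]:
    # nftables-style L4 rules derived from the intent.
    return [
        f"add rule inet filter input ip saddr {cidr} "
        f"tcp dport {i['listen_port']} accept"
        for cidr in i["allow_from"]
    ]

def compile_gateway(i: dict) -> dict:
    # L7 route: external host/path forwarded to the workload port.
    return {"match": i["route"],
            "forward_to": f"{i['workload']}:{i['listen_port']}"}

print("\n".join(compile_firewall(intent)))
print(compile_gateway(intent))
```

Because both outputs derive from the same artifact, validation and preview can run against the intent itself before any rule touches live traffic.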

06 Workload runtime

Built-in runtime protection across VMs, containers, and AI.

The same enforcement layer applies across the entire workload surface. Process-level visibility and control. Filesystem and network egress policy applied in real time. eBPF and LSM enforcement points in the host kernel.

Audit event for every policy decision. Hardened defaults with customer-extensible policy. Ships with the platform, with no separate agent install.

07 Hardware security

Confidential compute. Hardware-isolated inference.

AMD SEV-SNP and Intel TDX provide the workload-level confidential computing path. AI workloads get data-in-use encryption: memory is protected while the workload is running, not just at rest or in transit.

GPU confidential compute on NVIDIA H100 and newer. CoCo for confidential containers. Capability discovery at host registration. No software license cost; the protection is a CPU and GPU feature.
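
One illustrative probe of the kind capability discovery performs, reading CPU feature flags on Linux. The flag names match what recent kernels expose, but treat the probe as a sketch rather than Starlight's discovery mechanism.

```python
# Illustrative capability probe: read CPU feature flags from /proc/cpuinfo.
# "sev", "sev_es", and "sev_snp" appear on capable AMD hosts; "tdx_guest"
# appears inside an Intel TDX guest.
def cpu_flags() -> set[str]:
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags()
print("SEV-SNP capable host:", "sev_snp" in flags)
print("Running as a TDX guest:", "tdx_guest" in flags)
```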

08 Resilience

Edge AI that works when the network doesn't.

Disconnected, intermittent, and limited-bandwidth (DDIL) conditions are first-class operating states. When the cloud is reachable, the fleet is synced. When connectivity is partial, traversal and relay fallback take over. When it is gone entirely, offline authorization bundles carry user, license, and feature state into the disconnected environment.

Local workloads keep running across all three states. Identity, access management, policy, and audit stay unchanged. The fleet reconverges on reconnect.

The host is the source of truth. Connectivity is a feature, not a dependency.
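
As a sketch of the offline-bundle idea under stated assumptions: a signed bundle carrying user, license, and feature state, verified locally against a pre-shared key with no network dependency. The structure, fields, and HMAC scheme are hypothetical.

```python
# Hypothetical offline authorization bundle: verify integrity and the
# validity window entirely on the host. Field names are illustrative.
import hashlib, hmac, json, time

SITE_KEY = b"provisioned-before-disconnect"  # placeholder pre-shared key

def sign(payload: dict) -> str:
    blob = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(SITE_KEY, blob, hashlib.sha256).hexdigest()

def verify_bundle(bundle: dict) -> dict:
    if not hmac.compare_digest(sign(bundle["payload"]), bundle["signature"]):
        raise ValueError("bundle signature mismatch")
    p = bundle["payload"]
    if not (p["not_before"] <= time.time() <= p["not_after"]):
        raise ValueError("bundle outside its validity window")
    return p  # user, license, and feature state, usable while disconnected

payload = {"user": "operator-7", "features": ["inference"],
           "not_before": time.time() - 60, "not_after": time.time() + 86400}
print(verify_bundle({"payload": payload, "signature": sign(payload)})["user"])
```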

09 Governance

MCP and agents. Same identity. Same policy. Same audit.

Human operators through desktop UI and CLI. Automation through Terraform, Ansible, and CI/CD. Agentic clients through standard agent protocol endpoints. All three connect to the same API surface.

Read-only modes, allowlists, and endpoint filtering apply to agents. Audit events distinguish agent-initiated actions from human and automation actions.
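
A toy version of that gate: agents pass through an allowlist and a read-only check, and every decision lands in the audit stream tagged with the actor type. Endpoint names and the actor taxonomy are illustrative.

```python
# Illustrative agent gating: allowlist plus read-only enforcement, with
# the actor type recorded on every audit event. Names are hypothetical.
ALLOWED_AGENT_ENDPOINTS = {"list_workloads", "get_status"}  # read-only set

def audit(actor: str, endpoint: str, allowed: bool) -> None:
    # Audit events distinguish agent, human, and automation actors.
    print({"actor": actor, "endpoint": endpoint, "allowed": allowed})

def authorize(actor: str, endpoint: str, mutating: bool) -> bool:
    if actor == "agent" and (mutating or
                             endpoint not in ALLOWED_AGENT_ENDPOINTS):
        audit(actor, endpoint, allowed=False)
        return False
    audit(actor, endpoint, allowed=True)
    return True

authorize("agent", "get_status", mutating=False)      # permitted
authorize("agent", "delete_workload", mutating=True)  # denied, audited
authorize("human", "delete_workload", mutating=True)  # permitted, audited
```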

10 AI security

Starlight AI-SPM.

Posture management across models, datasets, agents, and pipelines. Prompt firewall for injection defense with PII and secrets redaction. Automated red teaming for jailbreaks and supply chain risk. Behavioral detection and response with automated remediation. Agent protection with least-privilege non-human identity (NHI) controls. Governance mapped to NIST AI RMF, MITRE ATLAS, the EU AI Act, ISO 42001, and OWASP.
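
A toy prompt-firewall pass shows the shape of the injection and redaction steps; the patterns are deliberately simple, and a production prompt firewall uses far richer detection than a handful of regexes.

```python
# Toy prompt firewall: pattern-based injection screening plus secret and
# PII redaction before a prompt reaches the model. Patterns illustrative.
import re

INJECTION_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"reveal your system prompt",
]
REDACTIONS = [
    (re.compile(r"AKIA[0-9A-Z]{16}"), "[REDACTED_AWS_KEY]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[REDACTED_SSN]"),
]

def screen(prompt: str) -> str:
    for pat in INJECTION_PATTERNS:
        if re.search(pat, prompt, re.IGNORECASE):
            raise ValueError("blocked: possible prompt injection")
    for pat, repl in REDACTIONS:
        prompt = pat.sub(repl, prompt)
    return prompt

print(screen("Summarize this config. Key: AKIAABCDEFGHIJKLMNOP"))
```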

On-prem on Starlight or via partner SaaS.

Detect and block AI-specific threats via model red-teaming, prompt filtering, and secure ML supply-chain controls.
