Local AI | May 14, 2026 | 3 min read

Local-first AI is an architecture, not a privacy label

What local-first AI really requires across execution, storage, model routing, synchronization, permissions, and user control.

“Runs locally” has become an attractive label for AI products. Sometimes it means the model executes on the user's device. Sometimes only a small interface runs there while prompts, files, memory, and telemetry still travel through a cloud service.

Local-first is more demanding than local inference. It is an architecture in which the user's device remains the primary home of state and capability, while remote services are optional, explicit participants.

Ask where every layer lives

A serious local-first claim should answer where each part executes or persists:

Layer	Questions to ask
Model inference	Which model runs locally, and when is a remote model used?
Context assembly	Where are files read, filtered, and transformed?
Memory	Where are preferences, summaries, and histories stored?
Tool execution	Which actions run on the device, and which reach the network?
Embeddings and indexes	Can private source material leave the machine?
Telemetry	What events, content, and identifiers are transmitted?
Synchronization	What is copied between devices and how is it encrypted?

If these answers are unavailable, “local” is mostly branding.
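
One way to make those answers checkable is to record them as data. The following is a minimal sketch, not a standard format: the `LayerDisclosure` type, the `Location` values, and the `undisclosed_layers` helper are hypothetical names chosen for illustration, and the layer names are taken from the table above.

```python
from dataclasses import dataclass
from enum import Enum


class Location(Enum):
    DEVICE = "device"
    REMOTE = "remote"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class LayerDisclosure:
    layer: str          # e.g. "Model inference"
    location: Location  # where the layer executes or persists
    note: str = ""      # free-text detail, such as when a remote model is used


# Layer names from the table; the tuple itself is an illustrative assumption.
LAYERS = (
    "Model inference",
    "Context assembly",
    "Memory",
    "Tool execution",
    "Embeddings and indexes",
    "Telemetry",
    "Synchronization",
)


def undisclosed_layers(disclosures: list[LayerDisclosure]) -> list[str]:
    """Return layers with no answer at all, or only an UNKNOWN answer."""
    answered = {d.layer for d in disclosures if d.location is not Location.UNKNOWN}
    return [layer for layer in LAYERS if layer not in answered]
```

A product that can only fill in one or two rows would get back a long list from `undisclosed_layers`, which is a concrete version of the "mostly branding" test.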

Local and cloud are routing choices

The useful design is not a religious choice between local and cloud. Different tasks have different constraints.

A small local model may classify files, redact sensitive fields, select context, or handle autocomplete with low latency. A larger remote model may be chosen for difficult reasoning. A regulated workflow may require an on-premises model. A user may choose a European provider for one project and an offline model for another.

The routing policy should be visible and controllable. Before a request leaves the device, the system should know which data is included, which provider receives it, and why local execution was insufficient.
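
A routing decision of this kind can be sketched as a small function that returns the provider, the data included, and the reason together. Everything here is an assumption made for illustration: the `Request` and `RoutingDecision` types, the `LOCAL_TASKS` set, and the rule that sensitive data stays on the device are one possible policy, not a prescribed one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    task: str                    # e.g. "autocomplete" or "reasoning"
    data_items: tuple[str, ...]  # labels of the data the request would carry
    contains_sensitive: bool


@dataclass(frozen=True)
class RoutingDecision:
    provider: str                # "local" or the name of a remote provider
    data_included: tuple[str, ...]
    reason: str                  # recorded so the user can inspect the choice


# Hypothetical policy: tasks the small local model is assumed to handle.
LOCAL_TASKS = {"classify", "redact", "select_context", "autocomplete"}


def route(request: Request, remote_provider: str, allow_remote: bool) -> RoutingDecision:
    """Decide where a request runs before anything leaves the device."""
    if request.task in LOCAL_TASKS:
        return RoutingDecision("local", request.data_items, "task is within local model scope")
    if not allow_remote:
        return RoutingDecision("local", request.data_items, "remote routing disabled by user")
    if request.contains_sensitive:
        return RoutingDecision("local", request.data_items, "sensitive data stays on device")
    return RoutingDecision(
        remote_provider,
        request.data_items,
        f"task '{request.task}' exceeds local model scope",
    )
```

Because the decision is a plain value, it can be shown to the user for confirmation or written to a log before the request is sent.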

One local core, multiple interfaces

Running intelligence separately inside every editor, desktop app, and mobile client fragments memory and permissions.

A stronger pattern is a single user-controlled core that owns context, policy, memory, connections, and audit history. Interfaces become clients of that core. The editor can contribute open files, the desktop workspace can manage documents, and a phone can request an action without each interface becoming a separate brain.

This architecture creates difficult but solvable problems: authentication between interfaces, concurrent state updates, secure remote access, version compatibility, and offline synchronization. It also gives the user one place to inspect and govern the system.
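
To illustrate the concurrent-update problem, here is a minimal sketch of a core that several interfaces write to, using optimistic version checks. It assumes a single in-process store; the `Core` class, the `StaleWrite` exception, and the interface names are hypothetical, and a real core would also need authentication and persistence.

```python
from dataclasses import dataclass, field
from typing import Optional


class StaleWrite(Exception):
    """Raised when an interface writes against an outdated version."""


@dataclass
class Core:
    memory: dict[str, str] = field(default_factory=dict)
    versions: dict[str, int] = field(default_factory=dict)
    # Audit history as (interface, key, new_version) tuples.
    history: list[tuple[str, str, int]] = field(default_factory=list)

    def read(self, key: str) -> tuple[Optional[str], int]:
        """Return the value and its version; unknown keys are at version 0."""
        return self.memory.get(key), self.versions.get(key, 0)

    def write(self, interface: str, key: str, value: str, expected_version: int) -> int:
        """Apply a write only if the caller saw the latest version."""
        current = self.versions.get(key, 0)
        if expected_version != current:
            raise StaleWrite(f"{key}: expected v{expected_version}, core has v{current}")
        self.memory[key] = value
        self.versions[key] = current + 1
        self.history.append((interface, key, current + 1))
        return current + 1
```

If the editor and the phone both read version 0 of a key and the editor writes first, the phone's write raises `StaleWrite` and must re-read before retrying, so neither interface silently overwrites the other.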

Local software still needs security boundaries

On-device execution can access highly sensitive capabilities. Filesystem access, terminals, browser sessions, credentials, and local applications should never become one undifferentiated permission.

Use scoped capabilities, operating-system isolation where possible, short-lived credentials, explicit network rules, and per-action logs. A local agent should be able to say which interface requested an action, which model proposed it, which tool executed it, and what changed.
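
Scoped capabilities and per-action logs can be sketched together. The example below is an assumption-heavy illustration: `Capability`, `ActionRecord`, and `Gatekeeper` are hypothetical names, scopes are modeled as POSIX path prefixes, and the check is purely lexical, so a real implementation would also need to resolve symlinks and rely on operating-system isolation.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Capability:
    tool: str   # e.g. "fs.read"
    scope: str  # path prefix the tool may touch


@dataclass(frozen=True)
class ActionRecord:
    interface: str  # which interface requested the action
    model: str      # which model proposed it
    tool: str       # which tool would execute it
    target: str
    change: str     # short description of what changes
    allowed: bool


@dataclass
class Gatekeeper:
    grants: list[Capability]
    log: list[ActionRecord] = field(default_factory=list)

    def authorize(self, interface: str, model: str, tool: str,
                  target: str, change: str) -> bool:
        """Check the request against granted scopes and log it either way."""
        path = PurePosixPath(target)
        # Reject ".." segments: PurePosixPath does not collapse them, so a
        # prefix check alone could be bypassed by path traversal.
        traversal = ".." in path.parts
        allowed = not traversal and any(
            g.tool == tool and path.is_relative_to(g.scope) for g in self.grants
        )
        self.log.append(ActionRecord(interface, model, tool, target, change, allowed))
        return allowed
```

Denied requests are logged as well as granted ones, so the log can answer who asked for what even when nothing was executed.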

Privacy without accountability is incomplete.

The update and recovery problem

Cloud applications can ship a fix by patching servers they control. Local-first systems must handle many operating systems, hardware profiles, versions, and partially offline devices.

That requires signed updates, schema migration, crash recovery, model compatibility checks, and exportable state. Users need a way to back up or move their data without surrendering ownership to a central account.
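
Schema migration and exportable state, two of the items above, can be sketched with stepwise migrations over a plain dictionary. The schema versions, field names, and the two migration steps are invented for illustration; signed updates are left out because they need a real signature scheme.

```python
import json

CURRENT_SCHEMA = 3


def _v1_to_v2(state: dict) -> dict:
    # Hypothetical change: "history" was renamed to "conversations".
    state = dict(state)
    state["conversations"] = state.pop("history", [])
    return state


def _v2_to_v3(state: dict) -> dict:
    # Hypothetical change: a "preferences" section was introduced.
    state = dict(state)
    state.setdefault("preferences", {})
    return state


# Each entry upgrades a state from version N to version N + 1.
MIGRATIONS = {1: _v1_to_v2, 2: _v2_to_v3}


def migrate(state: dict) -> dict:
    """Upgrade a stored state one version at a time to CURRENT_SCHEMA."""
    version = state.get("schema_version", 1)
    if version > CURRENT_SCHEMA:
        raise ValueError(f"state is v{version}, this build only supports v{CURRENT_SCHEMA}")
    while version < CURRENT_SCHEMA:
        state = MIGRATIONS[version](state)  # each step works on a copy
        version += 1
        state["schema_version"] = version
    return state


def export_state(state: dict) -> str:
    """Serialize the migrated state as JSON the user can back up or move."""
    return json.dumps(migrate(state), indent=2, sort_keys=True)
```

Applying migrations one version at a time lets a device that has been offline for several releases catch up without a dedicated upgrade path for every pair of versions, and refusing newer schemas avoids corrupting state written by a later build.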

Local-first is partly a product promise and partly operational discipline.

Our opinion

The future is neither entirely local nor entirely cloud. It is user-controlled routing across models and compute locations.

The valuable platform is the layer that keeps context, permissions, memory, and history coherent while models remain replaceable. Users should be able to choose where intelligence runs without rebuilding their workflow around every provider.

That is the standard local-first AI should be held to.