Syntax Docs

A high-level walkthrough of Syntax's three planes, the Bridge, and what happens to a request from your editor to a model and back.

Syntax is built around three planes that work together but stay clearly separated. You don't have to understand every layer to use Syntax — but the mental model below is enough to predict what will happen for any given configuration.

The three planes

Plane	What it owns
Control	Identity, organization policy, secrets, budgets, audit logs.
Execution	Your sessions, the harness lifecycle, the local proxy, approvals, tool orchestration.
Inference	The model catalog, hardware detection, engine selection, autotuning, model lifecycle.

The three planes are deliberately decoupled. The control plane never sees the content of your sessions; the execution plane never has to think about how a model is autotuned for your specific GPU; the inference plane never reaches into your editor.

First-class inter-compatibility

Every supported coding assistant talks to a single OpenAI- and Anthropic-compatible endpoint on localhost. That endpoint is the Bridge — the piece of Syntax that accepts requests in the format your harness already speaks and routes them to the right backend.

What happens to a request

Your harness sends a chat request to its configured endpoint, which is actually Syntax's local Bridge.
The Bridge resolves the requested model name against your active model policy (alias resolution, tier overrides, budget caps).
The Bridge picks a backend — local engine, remote self-hosted engine, dUX-managed remote, or a hosted provider — based on what's deployed and what your policy allows.
The chosen backend serves the request. Local serving uses the most efficient engine for your hardware (see Multi-engine inference).
Tokens stream back to your harness in the wire format it expects, so streaming, tool calls, reasoning, and multimodal content all render correctly.

What you control

Syntax exposes a small set of high-leverage knobs:

Model policy — which models are allowed for which tiers, with aliases and per-deployment overrides.
Routing strategy — Latency vs Throughput, Performance vs Economy on multi-host deploys, public vs private endpoint exposure.
Approvals — what tool calls your harness is allowed to run without asking, and how risky operations get gated.
Budgets — hard caps and soft warnings for tokens or compute, per user and per organization.

Where this plays out

The Bridge — what it is and why every harness talks to it — is covered in Differentiators → First-class inter-compatibility.
The catalog and inference engines are covered in Inference.
The dUX-backed managed remote story is in Syntax × dUX.

How Syntax works

The three planes

First-class inter-compatibility

What happens to a request

What you control

Where this plays out

On this page