Syntax Docs

Run models on your own remote box — your server, your GPU, your SSH — with Syntax handling the lifecycle.

Remote self-hosted is for users who already have a GPU server (a beefy home tower, a colo box, a cloud VM you provisioned yourself) and want Syntax to drive it without giving up control of the machine.

What "remote self-hosted" means

You provide:

An SSH-reachable host with the right hardware (GPU, RAM, disk).
A user account and SSH key that Syntax can use to log in.

Syntax handles:

Engine installation on the remote host from curated images. You install the GPU driver yourself; Syntax handles everything above it.
Model weight delivery.
Engine lifecycle (start, stop, health checks).
Wiring the resulting endpoint into the local Bridge so your harness reaches the remote model the same way it reaches a local one.

Setting up a remote target

From Settings → Remote Targets in the desktop app:

Add the host (hostname, port, username).
Provide an SSH key that's authorized on the host.
Test the connection. Syntax verifies that it can reach the host, can access the paths it needs, and can probe the GPU.
Save.
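
If the connection test fails, it can help to reproduce its checks by hand. The following is a minimal sketch, not Syntax's own probe: it assumes an OpenSSH client on your local machine and an NVIDIA GPU with `nvidia-smi` on the remote host (an AMD host would need a different query tool). `RemoteTarget`, `ssh_command`, `parse_gpu_csv`, and `probe` are illustrative names, not part of Syntax.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class RemoteTarget:
    # Hypothetical container mirroring the fields the settings form asks for.
    hostname: str
    port: int
    username: str
    key_path: str


# Assumes an NVIDIA GPU; prints one "name, memory" line per GPU.
GPU_QUERY = "nvidia-smi --query-gpu=name,memory.total --format=csv,noheader"


def ssh_command(target: RemoteTarget, remote_cmd: str) -> list[str]:
    """Build a non-interactive ssh invocation that runs remote_cmd on the target."""
    return [
        "ssh",
        "-p", str(target.port),
        "-i", target.key_path,
        "-o", "BatchMode=yes",      # fail instead of prompting for a password
        "-o", "ConnectTimeout=10",  # give up quickly on unreachable hosts
        f"{target.username}@{target.hostname}",
        remote_cmd,
    ]


def parse_gpu_csv(output: str) -> list[tuple[str, str]]:
    """Turn 'name, memory' lines into (name, memory) pairs, skipping blank lines."""
    gpus = []
    for line in output.splitlines():
        if not line.strip():
            continue
        name, _, memory = line.rpartition(",")
        gpus.append((name.strip(), memory.strip()))
    return gpus


def probe(target: RemoteTarget) -> list[tuple[str, str]]:
    """Run the GPU query over SSH; raises CalledProcessError if ssh or the query fails."""
    result = subprocess.run(
        ssh_command(target, GPU_QUERY),
        capture_output=True, text=True, check=True, timeout=30,
    )
    return parse_gpu_csv(result.stdout)
```

Calling `probe(RemoteTarget("gpu-box.example", 22, "alice", "/home/alice/.ssh/id_ed25519"))` with your own values either returns the GPUs the host reports or raises, which tells you whether the problem is reachability, the key, or the driver.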

Once a remote target is saved, deploying a model to it is the same flow as a local deployment — just pick Self-Managed Remote as the target.

Disk layout on the remote host

Syntax keeps remote artifacts under a small set of well-known paths in your home directory on the remote host. Weights, engine binaries, and log files all live in predictable locations so they're easy to clean up if you ever decide to remove Syntax.

Engine selection on the remote host

The same multi-engine inference logic that runs locally also runs on the remote host: Syntax picks the right engine for the model and the remote hardware. Apart from the GPU driver itself, you don't have to install or manage CUDA, ROCm, attention backends, or quantization toolchains by hand.

Multi-host remote deployments

Some models are too large to fit on a single host. For multi-host deployments, you provide multiple remote targets and pick a Strategy:

Performance — one model per host (lowest latency).
Economy — pack onto the fewest hosts (lowest cost).

The strategy applies to multi-model parties only; single-host targets ignore it.
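
To make the difference between the two strategies concrete, here is a toy placement sketch. It is not Syntax's scheduler: it assumes each model has a known memory footprint and each host a known free capacity (both in GB), and the function name `place` and its greedy largest-first approach are illustrative choices.

```python
def place(models: dict[str, float], hosts: dict[str, float], strategy: str) -> dict[str, str]:
    """Assign each model to a host name under a 'performance' or 'economy' strategy.

    models maps model name -> required GB; hosts maps host name -> free GB.
    Both are hypothetical inputs for illustration.
    """
    free = dict(hosts)      # remaining capacity per host, without mutating the input
    used: set[str] = set()  # hosts that already hold at least one model
    placement: dict[str, str] = {}

    # Place the largest models first so they are not stranded by smaller ones.
    for model, size in sorted(models.items(), key=lambda kv: kv[1], reverse=True):
        empty = [h for h in free if h not in used and free[h] >= size]
        if strategy == "performance":
            # One model per host: only hosts with nothing placed on them qualify.
            candidates = empty
        elif strategy == "economy":
            # Reuse an occupied host when it fits; open a new host only if needed.
            occupied = [h for h in free if h in used and free[h] >= size]
            candidates = occupied or empty
        else:
            raise ValueError(f"unknown strategy: {strategy}")

        if not candidates:
            raise ValueError(f"no host can fit {model} under {strategy}")

        host = max(candidates, key=lambda h: free[h])  # the roomiest candidate
        free[host] -= size
        used.add(host)
        placement[model] = host
    return placement
```

With three small models and three hosts, `"performance"` spreads the models across all three hosts, while `"economy"` keeps them on one host for as long as they fit.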

When to use remote self-hosted

You already own a GPU server.
You want full control of the OS and drivers.
You want SSH-level visibility into the running process.
You don't want managed cloud GPU pricing or vendor lock-in.

When to use managed remote (dUX) instead

You don't have hardware and don't want to provision and maintain it.
You want autoscaling without writing it yourself.
Your team needs shared deployments behind a single endpoint.

→ Managed remote
