Syntax

Managed remote (dUX)

dUX-backed cloud GPU. Pick a model, pick a tier, deploy. dUX handles placement, autoscaling, drivers, and ingress.

Managed remote is the path for teams that want cloud GPU without managing infrastructure. It is available to all users. dUX orchestrates the hardware inside your own cloud accounts, and you remain the sole admin of the underlying machines.

How it works (at a glance)

  1. You pick a model (or a party) and a target tier in the desktop app.
  2. Syntax submits the deployment intent to dUX.
  3. dUX handles the cloud-side work: GPU placement, autoscaling, driver compatibility, ingress, isolation.
  4. dUX returns the endpoint(s).
  5. Syntax wires those endpoints into the Bridge.
  6. Your harness sees the managed-remote deployment as a normal model in its model list and routes transparently.
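
The six steps above can be sketched as a small in-memory model. This is only an illustration of the control flow, not the Syntax or dUX API: `DeploymentIntent`, `FakeOrchestrator`, `Bridge`, and `deploy_managed_remote` are hypothetical names, and the endpoint URL is a placeholder, since the real endpoint format is not described here.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    LATENCY = "latency"
    THROUGHPUT = "throughput"


@dataclass(frozen=True)
class DeploymentIntent:
    """Steps 1-2: the model and tier the desktop app submits (hypothetical shape)."""
    model: str
    tier: Tier


class FakeOrchestrator:
    """Stand-in for dUX (steps 3-4). The real service handles placement,
    autoscaling, drivers, ingress, and isolation; this stub only returns
    a placeholder endpoint."""

    def deploy(self, intent: DeploymentIntent) -> list[str]:
        return [f"https://endpoint.invalid/{intent.model}/{intent.tier.value}"]


@dataclass
class Bridge:
    """Steps 5-6: maps model names to endpoints so a harness can list and route."""
    routes: dict[str, list[str]] = field(default_factory=dict)

    def wire(self, model: str, endpoints: list[str]) -> None:
        self.routes[model] = list(endpoints)

    def model_list(self) -> list[str]:
        return sorted(self.routes)

    def route(self, model: str) -> str:
        # Pick the first endpoint; a real router may balance across replicas.
        return self.routes[model][0]


def deploy_managed_remote(model: str, tier: Tier,
                          orchestrator: FakeOrchestrator,
                          bridge: Bridge) -> list[str]:
    intent = DeploymentIntent(model=model, tier=tier)   # steps 1-2
    endpoints = orchestrator.deploy(intent)             # steps 3-4
    bridge.wire(model, endpoints)                       # step 5
    return endpoints                                    # step 6: now routable
```

After `deploy_managed_remote("example-model", Tier.LATENCY, FakeOrchestrator(), bridge)`, the model name appears in `bridge.model_list()` alongside any other wired model, which is the sense in which the harness sees the deployment as a normal model.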

Target tiers

Two managed-remote tiers map to two optimization profiles:

Tier          Optimized for
Latency       Lowest time-to-first-token, lowest per-request latency.
Throughput    Highest tokens-per-second under load, best cost-per-token.

The exact placement strategy and replica policy live in dUX's orchestration layer; from your perspective, you choose Latency or Throughput and dUX handles the rest.

Saved remote targets

The first time you deploy a model managed remotely, you can choose to save the target: its name, tier, exposure, replica policy, and any other settings you configured. Later deploys to the same logical target then take one click.
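
To make the idea of a saved target concrete, here is a minimal sketch of what such a record could look like if you tracked it yourself. The field names mirror the list above, but the `SavedTarget` class, the string values for tier and exposure, and the JSON layout are all assumptions; the actual storage format used by Syntax is not documented here, and the replica policy is treated as an opaque dictionary.

```python
import json
from dataclasses import asdict, dataclass, field

VALID_TIERS = {"latency", "throughput"}
VALID_EXPOSURES = {"private", "public"}


@dataclass
class SavedTarget:
    """Hypothetical record of a saved remote target."""
    name: str
    tier: str
    exposure: str
    # Opaque on purpose: the real replica-policy schema is not specified.
    replica_policy: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        if self.tier not in VALID_TIERS:
            raise ValueError(f"unknown tier: {self.tier!r}")
        if self.exposure not in VALID_EXPOSURES:
            raise ValueError(f"unknown exposure: {self.exposure!r}")


def dump_target(target: SavedTarget) -> str:
    """Serialize a target to JSON so it can be reused for a later deploy."""
    return json.dumps(asdict(target), sort_keys=True)


def load_target(payload: str) -> SavedTarget:
    """Rebuild a target from JSON; validation runs again in __post_init__."""
    return SavedTarget(**json.loads(payload))
```

Validating on load as well as on creation means a hand-edited or stale record fails early instead of producing a deploy with an unrecognized tier or exposure.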

Public vs private endpoints

When you deploy managed remote, you can set:

  • Expose private — the endpoint is reachable from your other Syntax tools but not from the public internet.
  • Expose public — the endpoint is reachable from anywhere by any client that presents the bearer token, which makes it suitable for sharing with non-Syntax tools.

Both exposure modes issue a per-deployment bearer token (sk-syntax-…) that is scoped to that deployment and can be revoked at any time. See Concepts → Exposed endpoints for the revocation flow.
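
For a non-Syntax tool, presenting the token would typically mean sending it in an `Authorization: Bearer` header. The sketch below assumes exactly that, plus a JSON body sent with POST; neither the header scheme, the HTTP method, nor the request schema is confirmed by this page. `build_request` is a hypothetical helper, and the URL and token are supplied by the caller rather than hard-coded.

```python
import json
import urllib.request


def build_request(endpoint_url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated request for a deployed endpoint.

    Assumes standard bearer-token authentication and a JSON request body.
    """
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Usage (values are placeholders; read the real token from a secret store):
#   req = build_request(url, token, {"prompt": "hello"})
#   with urllib.request.urlopen(req) as resp:
#       body = resp.read()
```

Because the token is scoped to one deployment, keeping it out of source code and passing it in at call time makes revoking and reissuing it a configuration change rather than a code change.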

What you don't have to think about

  • GPU drivers and CUDA / ROCm versions.
  • Autoscaler configuration (KEDA, DCGM, etc.).
  • Kubernetes namespaces.
  • Ingress and load balancing.
  • Multi-replica weight distribution.
  • Node-pool capacity planning.

dUX orchestrates all of that for you, inside your own cloud accounts, with you as the sole admin. Syntax stays your control surface.

Multi-model parties on managed remote

The same Party Builder that composes parties for local deployment can also deploy them to managed remote. dUX returns placements and Syntax wires every model in the party into the Bridge so the Main Agent can call its specialists transparently.
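
As a rough illustration of wiring every model in the party, the sketch below builds a routing table from a set of returned placements and fails if any party member was left unplaced. `Party`, `wire_party`, and the mapping of model name to endpoint are hypothetical; the real placement data returned by dUX may look quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Party:
    """Hypothetical party: one Main Agent plus its specialists."""
    main_agent: str
    specialists: tuple[str, ...]

    def members(self) -> tuple[str, ...]:
        return (self.main_agent, *self.specialists)


def wire_party(party: Party, placements: dict[str, str]) -> dict[str, str]:
    """Return a model -> endpoint table covering every member of the party.

    Raises KeyError if a member has no placement, so the Main Agent never
    ends up with a specialist it cannot reach.
    """
    missing = [m for m in party.members() if m not in placements]
    if missing:
        raise KeyError(f"no placement returned for: {missing}")
    return {m: placements[m] for m in party.members()}
```

Checking for missing members before wiring anything keeps the party all-or-nothing, which matches the idea that the Main Agent should be able to call each of its specialists transparently.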

Where to go next