Multi-model parties
One main agent, one sub-agent, up to six specialists — composed into a single deployment with capability scoring and a unified plan.
Most agent stacks pretend a single LLM is enough. A real coding workflow needs a strong main model, a cheap sub-agent for routine work, and the option to invoke specialists when the task calls for it. Syntax's Party Builder is the answer.
What a party gives you
A party is a single deployment that exposes:
- A Main Agent — the model your harness primarily talks to.
- A Default Sub-Agent — the cheaper model the main agent delegates routine tasks to.
- Up to six Specialists — each a distinct model with a specific capability (e.g., image understanding, OCR, image generation, segmentation, TTS, or time-series forecasting).
Specialists are exposed to the main agent as tools. The main agent decides when to invoke them, just like any other tool call. The response is folded back into the conversation transparently.
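The shape described above can be sketched as a small data structure. This is a minimal illustration, not Syntax's actual schema: the class names, field names, model names, and the tool-spec layout are all assumptions. Only the three roles and the six-specialist limit come from the text.

```python
from dataclasses import dataclass, field

MAX_SPECIALISTS = 6  # "up to six Specialists", per the description above


@dataclass(frozen=True)
class Specialist:
    model: str       # hypothetical model identifier
    capability: str  # e.g. "ocr", "tts"


@dataclass
class Party:
    main_agent: str
    default_sub_agent: str
    specialists: list[Specialist] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Enforce the specialist limit at construction time.
        if len(self.specialists) > MAX_SPECIALISTS:
            raise ValueError(
                f"a party allows at most {MAX_SPECIALISTS} specialists, "
                f"got {len(self.specialists)}"
            )

    def tool_specs(self) -> list[dict]:
        """Describe each specialist as a tool the main agent could call."""
        return [
            {
                "name": f"specialist_{s.capability}",
                "description": f"Delegate {s.capability} work to {s.model}",
            }
            for s in self.specialists
        ]


# Placeholder model names, not real models.
party = Party(
    main_agent="big-coder",
    default_sub_agent="small-coder",
    specialists=[Specialist("ocr-model", "ocr"), Specialist("tts-model", "tts")],
)
print([t["name"] for t in party.tool_specs()])
```

Modeling specialists as plain tool descriptions mirrors the idea that the main agent invokes them like any other tool call.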
Why this beats one big model
- Cost. A strong main model is expensive per token. If a cheap sub-agent handles the routine share of the work (roughly 80% in this example), the main model is only billed for the hard 20%, so total spend drops without losing capability where it matters.
- Latency. A small specialist answers faster than a main model that is asked to do everything.
- Specialization. Some tasks (image segmentation, OCR, TTS) are not LLM tasks at all. Specialists let you reach the right tool for each job.
- Visibility. The party UI shows which model is carrying which capability and where there are coverage gaps before you deploy.
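To make the cost argument concrete, here is a rough back-of-the-envelope calculation. The prices are invented for illustration, and the sketch assumes 80% of tokens are routed to the sub-agent; real savings depend on your models and workload.

```python
def blended_cost(main_price: float, sub_price: float, routine_share: float) -> float:
    """Average price per token when `routine_share` of tokens go to the sub-agent."""
    if not 0.0 <= routine_share <= 1.0:
        raise ValueError("routine_share must be between 0 and 1")
    return routine_share * sub_price + (1.0 - routine_share) * main_price


# Hypothetical prices in arbitrary units per million tokens.
main_price = 10.0
sub_price = 1.0

cost = blended_cost(main_price, sub_price, routine_share=0.8)
savings = 1.0 - cost / main_price
print(f"blended: {cost:.2f}, savings: {savings:.0%}")
# prints "blended: 2.80, savings: 72%"
```

With a 10:1 price gap, routing 80% of tokens to the cheaper model leaves the main model's share as the dominant cost, which is why the size of the hard remainder matters more than the sub-agent's price.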
Capability scoring & plan generation
When you compose a party, the Party Builder shows:
- Which capabilities the chosen models cover and where there are gaps.
- A per-model strength bar so you can see who's doing what.
- A predicted hardware footprint — how the party will fit on your local GPU, on a self-hosted box, or on managed remote.
You see all of that before you deploy.
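One way to picture the coverage and gap view is the sketch below. It is not the Party Builder's scoring method: the 0-to-1 scores, capability names, and model names are assumptions chosen for illustration.

```python
def coverage_report(
    required: set[str],
    models: dict[str, dict[str, float]],
) -> tuple[dict[str, tuple[str, float]], list[str]]:
    """Map each required capability to its strongest model and list the gaps."""
    assigned: dict[str, tuple[str, float]] = {}
    gaps: list[str] = []
    for cap in sorted(required):
        # Collect (score, model) pairs for every model that claims this capability.
        candidates = [
            (scores[cap], name) for name, scores in models.items() if cap in scores
        ]
        if candidates:
            score, name = max(candidates)
            assigned[cap] = (name, score)
        else:
            gaps.append(cap)
    return assigned, gaps


# Hypothetical per-model strength scores.
models = {
    "big-coder": {"code": 0.9, "reasoning": 0.8},
    "small-coder": {"code": 0.6},
    "ocr-model": {"ocr": 0.95},
}

assigned, gaps = coverage_report({"code", "ocr", "tts"}, models)
print(assigned)  # which model carries which capability
print(gaps)      # capabilities nobody in the party covers
```

Here `tts` shows up as a gap, which is the kind of signal you would want before deploying rather than after.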
Presets
If composing a party from scratch is more work than you want, the Presets tab gives you ready-to-deploy party definitions: schema-versioned templates for common workflows that you pick and deploy directly. Presets are also a clean way to share standard party configurations across a team or organization.
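As a loose illustration of what "schema-versioned" can mean in practice, the sketch below loads a preset and rejects versions it does not understand. The JSON layout, every field name, and the supported-version set are hypothetical; the real preset format is not shown in this page.

```python
import json

SUPPORTED_SCHEMA_VERSIONS = {1}  # assumption: versions this loader understands

# Hypothetical preset document; every field name here is illustrative.
PRESET_TEXT = """
{
  "schema_version": 1,
  "name": "example-coding-party",
  "main_agent": "big-coder",
  "default_sub_agent": "small-coder",
  "specialists": [{"model": "ocr-model", "capability": "ocr"}]
}
"""


def load_preset(text: str) -> dict:
    """Parse a preset and reject schema versions this code does not know."""
    preset = json.loads(text)
    version = preset.get("schema_version")
    if version not in SUPPORTED_SCHEMA_VERSIONS:
        raise ValueError(f"unsupported preset schema_version: {version!r}")
    return preset


preset = load_preset(PRESET_TEXT)
print(preset["name"], "schema", preset["schema_version"])
```

Checking the version up front lets a shared template fail loudly when it was written for a different schema, instead of being silently misread.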
Where it deploys
A party deploys to any of the same targets as a single model:
- Local — multiple models on your own machine (subject to fit).
- Self-Managed Remote — your own SSH-reachable GPU box(es).
- Managed Remote — dUX handles placement.
The deployment surface is the same in each case; only the underlying hardware changes.
Where to start
- Multi-engine inference: hardware-aware engine selection across a large compatibility matrix. Syntax owns the optimization work so you don't.
- Agent Handoff (vs compaction): long sessions don't degrade. At the long-context threshold, Syntax does a structured handoff to a fresh context instead of in-place compaction.