Party Builder & Specialists

Compose a strong main agent, a cheaper sub-agent, and up to six specialists into a single deployment.

Real coding workflows rarely fit a single model. You want a strong main agent for hard problems, a cheap and fast sub-agent for everything else, and sometimes one or more specialists for things like image understanding, OCR, image generation, time-series forecasting, or other non-text tasks.

The Party Builder is the UI and runtime that lets you compose those together as a single deployment.

The shape of a party

A party has up to eight slots:

Role	Count	Purpose
Main Agent	1 (required)	The model the harness primarily talks to.
Default Sub-Agent	1 (required — can re-use main agent's model)	The cheaper model the main agent delegates routine work to.
Specialist	up to 6 (optional)	A model with a specific capability, exposed as a tool the main agent can call.

Specialists can be any model in the catalog. Each gets an optional custom instruction that the main agent sees when deciding whether to call it.

How specialists are called

When you deploy a party, every specialist is registered with the main agent as a tool, along with a structured description the main agent can use to decide when to invoke it. The agent calls the tool; the call is forwarded to the specialist; the specialist's response is folded back into the conversation.

Presets

The Party Builder ships with a Presets tab — schema-versioned ready-to-deploy party definitions you can pick instead of composing one yourself. Presets are useful for common workflows ("a coding party", "a vision-and-coding party", "a document-processing party") and for sharing standard configurations across a team.

Capability scoring and plan generation

Before you deploy, the Party Builder shows:

Coverage: which capabilities the chosen models cover (text, reasoning, image understanding, image generation, audio, etc.) and where there are gaps.
Strength: a per-model strength bar so you can see which model is carrying which capability.
A deployment plan: the expected hardware footprint of the party on local GPU, on managed remote, or on self-hosted remote.

For local and self-hosted deployments, the plan is computed by the inference plane's autotuning logic, which knows how each model fits on your hardware and where to put it relative to the others. For managed remote, the plan is sent to dUX, which returns the placement.

Where to deploy

A party deploys to any of the same targets as a single model:

Local — one or more models on your own machine.
Self-Managed Remote — your own SSH-reachable GPU box(es).
Managed Remote — dUX-backed cloud.

The deployment process is the same in each case; only the underlying hardware changes.

When to build a custom party vs use a preset

Use a preset if your workflow lines up with a common template.
Build a custom party if you have a specific main model you trust, cheaper specialists you want to lean on for routine work, and capability requirements that aren't covered by presets.

Where this connects

The main and sub-agent slots are two of the three reasons for Multi-model parties — see Differentiators → Multi-model parties.
The deployment targets are described in Inference → Overview.
The capability scoring system reuses the Models → Purposes taxonomy.