LLM Stack
How TRUE actually uses LLMs — model routing, system prompts, the tool registry, structured-output enforcement, safety guards, and circuit breakers.
What this is
TRUE runs Claude as the LLM substrate across every AI surface — chat, agent prompt parsing, signal generation, staking rationale, and admin formatting. Routing, structured output, and safety are enforced outside the model: deterministic rules in code that intercept the model’s input and validate its output before anything reaches a user or a transaction.
Models in use
| Surface | Model | Job |
|---|---|---|
| Chat (DARS dispatcher) | claude-haiku-4-5 (escalates to Sonnet on complexity) | User-facing conversation, tool selection, response synthesis. |
| Agent prompt parsing | claude-haiku-4-5 | Natural-language strategy → structured CreateAgentSchema JSON. |
| Portfolio analysis | claude-haiku-4-5 (with failover) | Wallet-aware strategy suggestions. |
| Signal generation | claude-haiku-4-5 | 6 picks per slot + per-locale thesis. |
| Staking plan | claude-haiku-4-5 | Plan rationale streamed via SSE. |
| Stake translation | claude-haiku-4-5 | Localised rationale. |
| Admin AI formatting | claude-haiku-4-5 | Raw text → sanitised HTML. |
| Documentation translation | claude-haiku-4-5 | These docs in 7 locales. |
Haiku is the workhorse for cost and latency. The chat dispatcher escalates to Sonnet when the orchestrator’s complexity score crosses the configured threshold (long context, multi-step reasoning, ambiguous intent across multiple tools).
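The escalation decision can be sketched as a small scoring function. This is illustrative only: the weights, keyword heuristics, and threshold are assumptions, and the real logic lives in agent.routing.ts with its own configured values.

```typescript
// Hypothetical sketch of the dispatcher's model-escalation decision.
// Scoring weights and the default threshold are assumptions, not the
// values used in agent.routing.ts.
interface Turn {
  message: string;
  candidateTools: string[]; // tools the router considers plausible for this turn
}

function complexityScore(turn: Turn): number {
  let score = 0;
  if (turn.message.length > 2000) score += 2;            // long context
  if (turn.candidateTools.length > 2) score += 2;        // ambiguous intent across tools
  if (/\b(then|after|compare)\b/i.test(turn.message)) score += 1; // multi-step phrasing
  return score;
}

function pickModel(turn: Turn, threshold = 3): string {
  // "sonnet" stands in for the configured escalation model id.
  return complexityScore(turn) >= threshold ? "sonnet" : "claude-haiku-4-5";
}
```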
How chat is routed
The chat-api does not call the model directly. It dispatches to the agent service over HTTP and streams the response back to the client over SSE. Two layers shape the model’s behaviour:
Authoritative routing rules (excerpts)
These are not prompt-engineering hints; they are deterministic rules in agent.routing.ts that override model judgment.
- Memecoin vs equities disambiguation. Queries that look like memecoin lookups MUST use the getMemecoin* tools (Price Engine memecoin path). Fortune-500 tickers (AMD, NVDA, AAPL) are guarded — they route to equities providers unless a crypto keyword is present, to prevent the model from hallucinating a same-letter memecoin.
- Agent management. Creating, listing, and toggling agents always uses agents.create, agents.list, and agents.toggle. The model is not allowed to invent its own equivalent.
- Trade outcome reporting. The model is explicitly forbidden from claiming a swap or agent execution succeeded or failed without a system-emitted signal. Outcome strings come from the executor, not the LLM.
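The ticker guard can be pictured as a deterministic pre-routing check. A minimal sketch, assuming a set of guarded tickers and a crypto-keyword pattern; the function name and keyword list are illustrative, and the authoritative rule lives in agent.routing.ts.

```typescript
// Illustrative stock-ticker guard. GUARDED_TICKERS and CRYPTO_KEYWORDS are
// assumptions; the real rule set lives in agent.routing.ts.
const GUARDED_TICKERS = new Set(["AMD", "NVDA", "AAPL"]);
const CRYPTO_KEYWORDS = /\b(memecoin|token|on-?chain|solana|dex)\b/i;

function routeSymbol(symbol: string, query: string): "equities" | "memecoin" {
  // A Fortune-500 ticker with no crypto signal in the query goes to the
  // equities providers, never to a same-letter memecoin lookup.
  if (GUARDED_TICKERS.has(symbol.toUpperCase()) && !CRYPTO_KEYWORDS.test(query)) {
    return "equities";
  }
  // Otherwise the memecoin path (getMemecoin* tools) handles the lookup.
  return "memecoin";
}
```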
Structured output is enforced, not requested
Anywhere the system needs structured JSON (agent draft, signal pick), it uses Anthropic tool-use with tool_choice forced to a specific tool. The model cannot return free-form text in those flows; the only path forward is a tool call whose arguments are validated against a Zod schema before persistence.
Example: agent prompt parsing forces tool_choice: { type: "tool", name: "build_agent" } and validates the arguments against triggerSchema and actionSchema. A schema failure causes a hard reject, not a soft retry on the next user turn.
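The validate-then-persist step can be sketched as follows. The schema shapes here are deliberately thin stand-ins (the real triggerSchema and actionSchema in artifacts/api-agents/src/mcp/haiku.ts are richer, and production uses Zod's safeParse); the hand-rolled check below only illustrates the hard-reject behaviour.

```typescript
// The Anthropic request itself forces the tool, so free-form text cannot
// come back:
//   tool_choice: { type: "tool", name: "build_agent" }
//
// Shapes below are illustrative stand-ins for triggerSchema/actionSchema;
// production validates with Zod.
type BuildAgentArgs = {
  trigger: { type: "recurring" | "price-target" };
  action: { side: "buy" | "sell"; amount: number };
};

function parseBuildAgent(block: { type: string; name: string; input: any }): BuildAgentArgs {
  if (block.type !== "tool_use" || block.name !== "build_agent") {
    throw new Error("unexpected content block"); // only a build_agent call is accepted
  }
  const input = block.input;
  const ok =
    input &&
    ["recurring", "price-target"].includes(input.trigger?.type) &&
    ["buy", "sell"].includes(input.action?.side) &&
    typeof input.action?.amount === "number" &&
    input.action.amount > 0;
  // Hard reject on schema failure — nothing is persisted, no soft retry.
  if (!ok) throw new Error("schema reject");
  return input as BuildAgentArgs;
}
```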
System prompts and policy text
System prompts are versioned alongside the code that uses them.
- Agent parser — long, deterministic prompt that classifies trigger types (recurring, price-target, price-move, sentiment, macro, on-chain, cross-asset) and applies smart defaults. Lives in artifacts/api-agents/src/mcp/haiku.ts.
- Signals analyst — short identity prompt followed by candidate JSON; the model returns six picks via tool-use. Lives in artifacts/api-server/src/jobs/signals-generation.ts.
- Admin formatter — formatter persona, sanitised HTML output, tight allow-list on tags. Lives in artifacts/api-server/src/routes/ai-format.ts.
- Doc translator — 12 absolute rules covering MDX preservation, frontmatter, code blocks, proper nouns. Lives in artifacts/true-docs/scripts/translate.mjs.
Tool registry exposed to the chat model
| Tool | Purpose |
|---|---|
| agents.create | Create a draft agent from a natural-language prompt + wallet context. |
| agents.list | List the user’s agents, filtered by status/asset class. |
| agents.toggle | Pause / resume / stop an agent. |
| getPrice, getMultiPrice | Read from the unified Price Engine. |
| getMemecoin* | Memecoin-specific reads routed through the Birdeye-backed router. |
| getOhlcv | Candlestick reads. |
| getHolders, getTopTraders, getSecurity | Long-tail token analytics. |
| getSignals | Read the active signals set. |
| getStakeOptions | Read available staking providers and live APYs. |
| searchToken, getTrending, getNewListings | Discovery surfaces. |
Read tools have no side effects. Mutating tools (agents.create, agents.toggle) require walletAddress and are validated server-side; the user must still sign the permit at activation, which the LLM cannot do on their behalf.
Memory and context
The chat-api dispatch is stateless per turn. The model does not see the entire chat history on every turn; it sees:
- The current user message.
- The injected routing instructions.
- A wallet context object (totals, holdings) when the routing implies the model should size something.
- Tool results, inline, as they return.
Long-term context (saved signals, agent state, wallet history) is fetched on-demand through tool calls, not packed into the prompt. This keeps token cost predictable and prevents stale context from drifting answers.
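The per-turn payload can be sketched like this. Field names here are assumptions, not the actual dispatcher types; the point is what is absent — no replayed chat history, and long-term state only arrives through tool calls.

```typescript
// Illustrative per-turn context assembly; names are hypothetical.
interface WalletContext {
  totalUsd: number;
  holdings: Record<string, number>;
}

function buildTurn(userMessage: string, routingRules: string, wallet?: WalletContext) {
  return {
    system: routingRules,                         // injected routing instructions
    messages: [{ role: "user", content: userMessage }], // current message only
    ...(wallet ? { wallet } : {}),                // only when routing implies sizing
    // Tool results are appended inline as they stream back; saved signals,
    // agent state, and wallet history are fetched on demand via tool calls.
  };
}
```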
Safety guards
Every signing path requires a user-signed permit (for agents) or an explicit user signature (for swaps and stakes). The LLM has no path to spend funds. Tools that touch user state are validated against the schema, the permit, and the anomaly detector before they take effect.
Specific guards in code:
- Stock-ticker confusion guard. Ambiguous tickers like AMD route to equities by default to prevent memecoin hallucinations on a Fortune-500 query.
- High-risk confirmation. agents.create rejects agents whose amount exceeds 10% of wallet equity unless confirmHighRisk: true is present.
- Outcome honesty. The chat model is forbidden from asserting that a swap succeeded or failed without a system-emitted signal.
- Schema-tight tool args. Zod validation on every mutating tool call. A bad tool call is dropped, not “best-effort” applied.
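The high-risk gate above reduces to one server-side check. A minimal sketch, assuming USD-denominated amounts; the function name and error text are illustrative, and the real validation runs inside agents.create.

```typescript
// Sketch of the 10%-of-equity gate on agents.create; names are hypothetical.
function checkHighRisk(
  amountUsd: number,
  walletEquityUsd: number,
  confirmHighRisk = false,
): void {
  // Reject unless the caller explicitly confirmed the oversized allocation.
  if (amountUsd > 0.1 * walletEquityUsd && !confirmHighRisk) {
    throw new Error(
      "high-risk: amount exceeds 10% of wallet equity; set confirmHighRisk: true",
    );
  }
}
```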
Reliability — failover and circuit breakers
createMessageWithFailover wraps every Anthropic call with a primary breaker and a fallback breaker:
- Primary breaker trips on sustained 5xx / timeout patterns from the primary provider.
- Fallback breaker wraps the fallback path so a degraded fallback doesn’t keep getting hit either.
- Health probes half-open the breakers periodically to recover automatically.
For surfaces that can tolerate degraded behaviour (admin formatting, translation), failure surfaces a structured error to the caller. For surfaces that cannot (agent activation, signal generation), the operation is aborted and retried at the next slot.
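The two-breaker arrangement can be sketched as below. Thresholds, probe cadence, and class names are assumptions; the real createMessageWithFailover in artifacts/api-server/src/lib/anthropic.ts has its own tuning.

```typescript
// Minimal two-breaker failover sketch; thresholds are illustrative.
type BreakerState = "closed" | "open" | "half-open";

class Breaker {
  private state: BreakerState = "closed";
  private failures = 0;
  constructor(private threshold = 3) {}

  canCall(): boolean {
    return this.state !== "open";
  }
  onSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }
  onFailure(): void {
    // Trips on sustained 5xx / timeout patterns.
    if (++this.failures >= this.threshold) this.state = "open";
  }
  probe(): void {
    // A periodic health probe half-opens the breaker to allow recovery.
    if (this.state === "open") this.state = "half-open";
  }
  current(): BreakerState {
    return this.state;
  }
}

// Primary breaker first, then the fallback breaker, else a structured error.
function pickPath(primary: Breaker, fallback: Breaker): "primary" | "fallback" | "error" {
  if (primary.canCall()) return "primary";
  if (fallback.canCall()) return "fallback";
  return "error";
}
```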
Where to find this in the repo
| Concern | File |
|---|---|
| Chat dispatch and SSE → Redis fan-out | artifacts/chat-api/src/modules/agent/agent.dispatcher.ts |
| Routing rules and tool-use policy | artifacts/chat-api/src/modules/agent/agent.routing.ts |
| Agent prompt parser (Haiku, tool-choice forced) | artifacts/api-agents/src/mcp/haiku.ts |
| MCP tool router and validators | artifacts/api-agents/src/mcp/router.ts, tools.ts |
| Signal generation prompts and slot logic | artifacts/api-server/src/jobs/signals-generation.ts |
| Failover / breakers | artifacts/api-server/src/lib/anthropic.ts |
| Admin formatter prompt | artifacts/api-server/src/routes/ai-format.ts |
Calling the chat API
The chat surface is consumed via the streaming endpoint documented on the Chat page. Tool calls and results surface as discrete SSE frames so a custom client can render reasoning, tool invocations, and tokens distinctly.
Safety, limits, failure modes
- Token-budget overruns. A turn that exceeds the budget is truncated and surfaced as a structured warning, not silently shortened.
- Tool-call hallucination. Calls to unknown tools are dropped at the dispatcher; they never reach the network.
- Schema rejection. A tool call that fails Zod validation is dropped and the user-facing turn explains the failure rather than fabricating success.
- Provider outage. Failover engages automatically; if both paths are down, the surface returns a structured error and clients should retry with backoff.
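Client-side, the retry-with-backoff advice can be implemented as a small wrapper. Delay values and attempt counts below are assumptions, not documented limits.

```typescript
// Illustrative exponential-backoff retry for the structured-error case.
// attempts and baseMs are placeholders, not documented values.
async function withBackoff<T>(
  fn: () => Promise<T>,
  attempts = 4,
  baseMs = 250,
): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      // Exponential delay: 250ms, 500ms, 1000ms, ...
      await new Promise((r) => setTimeout(r, baseMs * 2 ** i));
    }
  }
  throw lastErr;
}
```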
See also
- Agents (DARS) — the orchestration layer above the dispatcher.
- Chat — the streaming runtime that consumes the LLM stack.
- True Agents — where parsed prompts become enforced strategies.
- MCP — the tool surface exposed to external LLM clients.