LLM Stack

How TRUE actually uses LLMs — model routing, system prompts, the tool registry, structured-output enforcement, safety guards, and circuit breakers.

What this is

TRUE runs Claude as the LLM substrate across every AI surface — chat, agent prompt parsing, signal generation, staking rationale, and admin formatting. Routing, structured output, and safety are enforced outside the model: deterministic rules in code that intercept the model’s input and validate its output before anything reaches a user or a transaction.

Models in use

| Surface | Model | Job |
| --- | --- | --- |
| Chat (DARS dispatcher) | claude-haiku-4-5 (escalates to Sonnet on complexity) | User-facing conversation, tool selection, response synthesis. |
| Agent prompt parsing | claude-haiku-4-5 | Natural-language strategy → structured CreateAgentSchema JSON. |
| Portfolio analysis | claude-haiku-4-5 (with failover) | Wallet-aware strategy suggestions. |
| Signal generation | claude-haiku-4-5 | 6 picks per slot + per-locale thesis. |
| Staking plan | claude-haiku-4-5 | Plan rationale streamed via SSE. |
| Stake translation | claude-haiku-4-5 | Localised rationale. |
| Admin AI formatting | claude-haiku-4-5 | Raw text → sanitised HTML. |
| Documentation translation | claude-haiku-4-5 | These docs in 7 locales. |

Haiku is the workhorse for cost and latency. The chat dispatcher escalates to Sonnet when the orchestrator’s complexity score crosses the configured threshold (long context, multi-step reasoning, ambiguous intent across multiple tools).
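
As a rough illustration, the escalation decision reduces to a score-and-threshold check. The weights, threshold, and the Sonnet model ID below are placeholders, not the configured production values:

```typescript
// Sketch of the model-escalation decision. Scoring weights, the threshold,
// and the Sonnet model ID are illustrative assumptions.
const ESCALATION_THRESHOLD = 0.6; // assumed; the real threshold is configured

interface TurnSignals {
  contextTokens: number;   // size of the assembled prompt
  candidateTools: number;  // tools the intent could plausibly map to
  multiStep: boolean;      // does the request imply chained reasoning?
}

function complexityScore(s: TurnSignals): number {
  let score = 0;
  if (s.contextTokens > 8_000) score += 0.3; // long context
  if (s.candidateTools > 2) score += 0.3;    // ambiguous intent across tools
  if (s.multiStep) score += 0.4;             // multi-step reasoning
  return score;
}

function pickModel(s: TurnSignals): string {
  return complexityScore(s) >= ESCALATION_THRESHOLD
    ? "claude-sonnet-4-5"  // assumed Sonnet ID
    : "claude-haiku-4-5";
}
```

A short, single-tool query stays on Haiku; a long, multi-step, multi-tool turn crosses the threshold and escalates.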

How chat is routed

The chat-api does not call the model directly. It dispatches to the agent service over HTTP and streams the response back to the client over SSE. Two layers shape the model’s behaviour:

1. Routing instructions. agent.routing.ts injects authoritative rules into the system prompt: which tools to use for which intent, when to refuse, when to defer.
2. Tool selection. The dispatcher exposes a fixed MCP tool registry. The model can call agents.create / agents.list / agents.toggle plus the read-only data tools.
3. Streaming dispatch. HttpAgentDispatcher consumes the upstream JSONL/SSE stream and republishes onto Redis pub/sub channels scoped to userId and publishId.
4. Persistence. aiContent is accumulated as it streams and written as a single ai-role message to MongoDB on completion.
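
The streaming-dispatch and persistence steps can be sketched as a small consume loop. The frame shape, channel-naming convention, and stand-in publish callback are assumptions, not the real HttpAgentDispatcher API:

```typescript
// Sketch: consume upstream JSONL frames, fan out to a scoped channel, and
// accumulate assistant text into one persisted message. Frame shape and
// channel naming are assumptions.
interface Frame { type: "token" | "tool_call" | "done"; text?: string }

function channelFor(userId: string, publishId: string): string {
  return `chat:${userId}:${publishId}`; // assumed naming convention
}

function consume(
  jsonl: string,
  publish: (channel: string, frame: Frame) => void, // Redis pub/sub in the real system
  userId: string,
  publishId: string,
): string {
  let aiContent = "";
  for (const line of jsonl.split("\n").filter(Boolean)) {
    const frame: Frame = JSON.parse(line);
    publish(channelFor(userId, publishId), frame); // republish every frame
    if (frame.type === "token" && frame.text) aiContent += frame.text;
  }
  return aiContent; // written as one ai-role message to MongoDB on completion
}
```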

Authoritative routing rules (excerpts)

These are not prompt-engineering hints; they are deterministic rules in agent.routing.ts that override model judgment.

  • Memecoin vs equities disambiguation. Queries that look like memecoin lookups MUST use getMemecoin* tools (Price Engine memecoin path). Fortune-500 tickers (AMD, NVDA, AAPL) are guarded — they route to equities providers unless a crypto keyword is present, to prevent the model from hallucinating a same-letter memecoin.
  • Agent management. Creating, listing, and toggling agents always uses agents.create, agents.list, agents.toggle. The model is not allowed to invent its own equivalent.
  • Trade outcome reporting. The model is explicitly forbidden from claiming a swap or agent execution succeeded or failed without a system-emitted signal. Outcome strings come from the executor, not the LLM.
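
The memecoin-vs-equities guard amounts to a deterministic routing function. A minimal sketch, with an illustrative keyword list and ticker set standing in for the real rules in agent.routing.ts:

```typescript
// Sketch of the stock-ticker disambiguation guard. The guarded-ticker list
// and crypto-keyword set are illustrative assumptions.
const GUARDED_TICKERS = new Set(["AMD", "NVDA", "AAPL"]); // equities by default
const CRYPTO_KEYWORDS = ["memecoin", "token", "solana", "on-chain"]; // assumed

type Route = "equities" | "memecoin";

function routeTicker(query: string, ticker: string): Route {
  const q = query.toLowerCase();
  const cryptoIntent = CRYPTO_KEYWORDS.some((k) => q.includes(k));
  // Guarded Fortune-500 tickers go to equities providers unless the user
  // signalled crypto intent, preventing a same-letter memecoin hallucination.
  if (GUARDED_TICKERS.has(ticker.toUpperCase()) && !cryptoIntent) return "equities";
  return "memecoin"; // memecoin-looking lookups use the getMemecoin* tools
}
```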

Structured output is enforced, not requested

Anywhere the system needs structured JSON (agent draft, signal pick), it uses Anthropic tool-use with tool_choice forced to a specific tool. The model cannot return free-form text in those flows; the only path forward is a tool call whose arguments are validated against a Zod schema before persistence.

Example: agent prompt parsing forces tool_choice: { type: "tool", name: "build_agent" } and validates the arguments against triggerSchema and actionSchema. A schema failure causes a hard reject, not a soft retry on the next user turn.
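
A minimal sketch of that hard-reject path, with a plain predicate standing in for the real Zod triggerSchema/actionSchema (the field names here are assumptions):

```typescript
// Sketch of the enforce-don't-request pattern: the only accepted model output
// is a build_agent tool call whose arguments pass validation.
interface ToolUse { type: "tool_use"; name: string; input: Record<string, unknown> }

function isValidAgentDraft(input: Record<string, unknown>): boolean {
  // Stand-in for triggerSchema/actionSchema; field names are assumptions.
  return typeof input.trigger === "object" && input.trigger !== null &&
         typeof input.action === "object" && input.action !== null;
}

function acceptAgentDraft(block: ToolUse): Record<string, unknown> {
  if (block.name !== "build_agent") throw new Error("unexpected tool"); // tool_choice was forced
  if (!isValidAgentDraft(block.input)) {
    // Hard reject: nothing is persisted, no soft retry on the next user turn.
    throw new Error("schema validation failed");
  }
  return block.input; // safe to persist as a draft agent
}
```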

System prompts and policy text

System prompts are versioned alongside the code that uses them.

  • Agent parser — long, deterministic prompt that classifies trigger types (recurring, price-target, price-move, sentiment, macro, on-chain, cross-asset) and applies smart defaults. Lives in artifacts/api-agents/src/mcp/haiku.ts.
  • Signals analyst — short identity prompt followed by candidate JSON; the model returns six picks via tool-use. Lives in artifacts/api-server/src/jobs/signals-generation.ts.
  • Admin formatter — formatter persona, sanitised HTML output, tight allow-list on tags. Lives in artifacts/api-server/src/routes/ai-format.ts.
  • Doc translator — 12 absolute rules covering MDX preservation, frontmatter, code blocks, proper nouns. Lives in artifacts/true-docs/scripts/translate.mjs.

Tool registry exposed to the chat model

| Tool | Purpose |
| --- | --- |
| agents.create | Create a draft agent from a natural-language prompt + wallet context. |
| agents.list | List the user’s agents, filtered by status/asset class. |
| agents.toggle | Pause / resume / stop an agent. |
| getPrice, getMultiPrice | Read from the unified Price Engine. |
| getMemecoin* | Memecoin-specific reads routed through the Birdeye-backed router. |
| getOhlcv | Candlestick reads. |
| getHolders, getTopTraders, getSecurity | Long-tail token analytics. |
| getSignals | Read the active signals set. |
| getStakeOptions | Read available staking providers and live APYs. |
| searchToken, getTrending, getNewListings | Discovery surfaces. |

Read tools have no side effects. Mutating tools (agents.create, agents.toggle) require walletAddress and are validated server-side; the user must still sign the permit at activation, which the LLM cannot do on their behalf.

Memory and context

The chat-api dispatch is stateless per turn. The model does not see the entire chat history on every turn; it sees:

  • The current user message.
  • The injected routing instructions.
  • A wallet context object (totals, holdings) when the routing implies the model should size something.
  • Tool results, inline, as they return.

Long-term context (saved signals, agent state, wallet history) is fetched on demand through tool calls, not packed into the prompt. This keeps token cost predictable and prevents stale context from skewing answers.
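
The per-turn assembly can be sketched as follows; the message shapes and the wallet-context injection format are assumptions:

```typescript
// Sketch of stateless per-turn prompt assembly: routing instructions plus the
// current message, with wallet context attached only when needed.
interface Msg { role: "system" | "user"; content: string }

function buildTurn(
  userMessage: string,
  routingInstructions: string,
  walletContext?: { totalUsd: number; holdings: string[] },
): Msg[] {
  // Wallet context is appended only when routing implies the model must size something.
  const system = walletContext
    ? `${routingInstructions}\n\nWallet context: ${JSON.stringify(walletContext)}`
    : routingInstructions;
  return [
    { role: "system", content: system },    // injected every turn
    { role: "user", content: userMessage }, // current message only; no chat history
  ];
}
```

Tool results are then appended inline as they return, rather than being pre-packed into the prompt.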

Safety guards

The model never holds keys, never signs, never exfiltrates secrets

Every signing path requires a user-signed permit (for agents) or an explicit user signature (for swaps and stakes). The LLM has no path to spend funds. Tools that touch user state are validated against the schema, the permit, and the anomaly detector before they take effect.

Specific guards in code:

  • Stock-ticker confusion guard. Ambiguous tickers like AMD route to equities by default to prevent memecoin hallucinations on a Fortune-500 query.
  • High-risk confirmation. agents.create rejects agents whose amount exceeds 10% of wallet equity unless confirmHighRisk: true is present.
  • Outcome honesty. The chat model is forbidden from asserting that a swap succeeded or failed without a system-emitted signal.
  • Schema-tight tool args. Zod validation on every mutating tool call. A bad tool call is dropped, not “best-effort” applied.
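
The high-risk confirmation guard, for instance, reduces to a simple server-side check. Only the 10% threshold comes from the rule above; the argument names are assumptions:

```typescript
// Sketch of the high-risk confirmation guard on agents.create. The 10% cap is
// from the documented rule; field names are illustrative.
interface CreateAgentArgs { amountUsd: number; confirmHighRisk?: boolean }

function checkHighRisk(args: CreateAgentArgs, walletEquityUsd: number): void {
  const cap = walletEquityUsd * 0.10;
  if (args.amountUsd > cap && args.confirmHighRisk !== true) {
    // Rejected server-side; the model cannot bypass this with prose.
    throw new Error("amount exceeds 10% of wallet equity; confirmHighRisk required");
  }
}
```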

Reliability — failover and circuit breakers

createMessageWithFailover wraps every Anthropic call with a primary breaker and a fallback breaker:

  • Primary breaker trips on sustained 5xx / timeout patterns from the primary provider.
  • Fallback breaker wraps the fallback path so a degraded fallback doesn’t keep getting hit either.
  • Health probes half-open the breakers periodically to recover automatically.

For surfaces that can tolerate degraded behaviour (admin formatting, translation), failure surfaces a structured error to the caller. For surfaces that cannot (agent activation, signal generation), the operation is aborted and retried at the next slot.
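
A compact sketch of the two-breaker pattern, with simplified trip and half-open logic; the thresholds and call signatures are assumptions, not the real createMessageWithFailover:

```typescript
// Sketch of primary/fallback breakers around an LLM call. Trip thresholds,
// probe timing, and signatures are illustrative assumptions.
class Breaker {
  private failures = 0;
  constructor(private threshold: number) {}
  get open(): boolean { return this.failures >= this.threshold; }
  record(ok: boolean): void { this.failures = ok ? 0 : this.failures + 1; }
  halfOpen(): void { this.failures = this.threshold - 1; } // health probe lets one call through
}

async function callWithFailover<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
  primaryBreaker: Breaker,
  fallbackBreaker: Breaker,
): Promise<T> {
  if (!primaryBreaker.open) {
    try {
      const out = await primary();
      primaryBreaker.record(true);
      return out;
    } catch {
      primaryBreaker.record(false); // sustained 5xx / timeouts eventually trip it
    }
  }
  if (fallbackBreaker.open) throw new Error("both providers unavailable");
  try {
    const out = await fallback();
    fallbackBreaker.record(true);
    return out;
  } catch (err) {
    fallbackBreaker.record(false); // a degraded fallback stops being hit too
    throw err;
  }
}
```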

For Developers

Where to find this in the repo

| Concern | File |
| --- | --- |
| Chat dispatch and SSE → Redis fan-out | artifacts/chat-api/src/modules/agent/agent.dispatcher.ts |
| Routing rules and tool-use policy | artifacts/chat-api/src/modules/agent/agent.routing.ts |
| Agent prompt parser (Haiku, tool-choice forced) | artifacts/api-agents/src/mcp/haiku.ts |
| MCP tool router and validators | artifacts/api-agents/src/mcp/router.ts, tools.ts |
| Signal generation prompts and slot logic | artifacts/api-server/src/jobs/signals-generation.ts |
| Failover / breakers | artifacts/api-server/src/lib/anthropic.ts |
| Admin formatter prompt | artifacts/api-server/src/routes/ai-format.ts |

Calling the chat API

The chat surface is consumed via the streaming endpoint documented on the Chat page. Tool calls and results surface as discrete SSE frames so a custom client can render reasoning, tool invocations, and tokens distinctly.

Safety, limits, failure modes

  • Token-budget overruns. A turn that exceeds the budget is truncated and surfaced as a structured warning, not silently shortened.
  • Tool-call hallucination. Calls to unknown tools are dropped at the dispatcher; they never reach the network.
  • Schema rejection. A tool call that fails Zod validation is dropped and the user-facing turn explains the failure rather than fabricating success.
  • Provider outage. Failover engages automatically; if both paths are down, the surface returns a structured error and clients should retry with backoff.
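
The unknown-tool drop described above can be sketched as a registry check at the dispatcher boundary; the abbreviated registry below is illustrative:

```typescript
// Sketch of the dispatcher-side tool gate: a call to a name outside the fixed
// registry is dropped before any network I/O. Registry abbreviated here.
const TOOL_REGISTRY = new Set([
  "agents.create", "agents.list", "agents.toggle",
  "getPrice", "getMultiPrice", "getOhlcv", "getSignals", "getStakeOptions",
]);

interface ToolCall { name: string; args: unknown }

function gateToolCalls(calls: ToolCall[]): { accepted: ToolCall[]; dropped: string[] } {
  const accepted: ToolCall[] = [];
  const dropped: string[] = [];
  for (const call of calls) {
    if (TOOL_REGISTRY.has(call.name)) accepted.push(call);
    else dropped.push(call.name); // hallucinated tool: never reaches the network
  }
  return { accepted, dropped };
}
```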

See also

  • Agents (DARS) — the orchestration layer above the dispatcher.
  • Chat — the streaming runtime that consumes the LLM stack.
  • True Agents — where parsed prompts become enforced strategies.
  • MCP — the tool surface exposed to external LLM clients.