LLM Stack

How TRUE actually uses LLMs — model routing, system prompts, the tool registry, structured-output enforcement, safety guards, and circuit breakers.

What this is

TRUE runs Claude as the LLM substrate across every AI surface — chat, agent prompt parsing, signal generation, staking rationale, and admin formatting. Routing, structured output, and safety are enforced outside the model: deterministic rules in code that intercept the model’s input and validate its output before anything reaches a user or a transaction.

Models in use

| Surface | Model | Job |
| --- | --- | --- |
| Chat (DARS dispatcher) | claude-haiku-4-5 (escalates to Sonnet on complexity) | User-facing conversation, tool selection, response synthesis. |
| Agent prompt parsing | claude-haiku-4-5 | Natural-language strategy → structured CreateAgentSchema JSON. |
| Portfolio analysis | claude-haiku-4-5 (with failover) | Wallet-aware strategy suggestions. |
| Signal generation | claude-haiku-4-5 | 6 picks per slot + per-locale thesis. |
| Staking plan | claude-haiku-4-5 | Plan rationale streamed via SSE. |
| Stake translation | claude-haiku-4-5 | Localised rationale. |
| Admin AI formatting | claude-haiku-4-5 | Raw text → sanitised HTML. |
| Documentation translation | claude-haiku-4-5 | These docs in 7 locales. |

Haiku is the workhorse for cost and latency. The chat dispatcher escalates to Sonnet when the orchestrator’s complexity score crosses the configured threshold (long context, multi-step reasoning, ambiguous intent across multiple tools).
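
As a rough illustration, the escalation decision reduces to a score-and-threshold check. The weights, threshold, and the Sonnet model ID below are placeholders, not the configured production values:

```typescript
// Sketch of the model-escalation decision. Scoring weights, the threshold,
// and the Sonnet model ID are illustrative assumptions.
const ESCALATION_THRESHOLD = 0.6; // assumed; the real threshold is configured

interface TurnSignals {
  contextTokens: number;   // size of the assembled prompt
  candidateTools: number;  // tools the intent could plausibly map to
  multiStep: boolean;      // does the request imply chained reasoning?
}

function complexityScore(s: TurnSignals): number {
  let score = 0;
  if (s.contextTokens > 8_000) score += 0.3; // long context
  if (s.candidateTools > 2) score += 0.3;    // ambiguous intent across tools
  if (s.multiStep) score += 0.4;             // multi-step reasoning
  return score;
}

function pickModel(s: TurnSignals): string {
  return complexityScore(s) >= ESCALATION_THRESHOLD
    ? "claude-sonnet-4-5"  // assumed Sonnet ID
    : "claude-haiku-4-5";
}
```

A short, single-tool query stays on Haiku; a long, multi-step, multi-tool turn crosses the threshold and escalates.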

How chat is routed

The chat-api does not call the model directly. It dispatches to the agent service over HTTP and streams the response back to the client over SSE. Two layers shape the model’s behaviour:

1. Routing instructions. agent.routing.ts injects authoritative rules into the system prompt: which tools to use for which intent, when to refuse, when to defer.
2. Tool selection. The dispatcher exposes a fixed MCP tool registry. The model can call agents.create / agents.list / agents.toggle plus the read-only data tools.
3. Streaming dispatch. HttpAgentDispatcher consumes the upstream JSONL/SSE stream and republishes onto Redis pub/sub channels scoped to userId and publishId.
4. Persistence. aiContent is accumulated as it streams and written as a single ai-role message to MongoDB on completion.
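
The streaming-dispatch and persistence steps can be sketched as a small consume loop. The frame shape, channel-naming convention, and stand-in publish callback are assumptions, not the real HttpAgentDispatcher API:

```typescript
// Sketch: consume upstream JSONL frames, fan out to a scoped channel, and
// accumulate assistant text into one persisted message. Frame shape and
// channel naming are assumptions.
interface Frame { type: "token" | "tool_call" | "done"; text?: string }

function channelFor(userId: string, publishId: string): string {
  return `chat:${userId}:${publishId}`; // assumed naming convention
}

function consume(
  jsonl: string,
  publish: (channel: string, frame: Frame) => void, // Redis pub/sub in the real system
  userId: string,
  publishId: string,
): string {
  let aiContent = "";
  for (const line of jsonl.split("\n").filter(Boolean)) {
    const frame: Frame = JSON.parse(line);
    publish(channelFor(userId, publishId), frame); // republish every frame
    if (frame.type === "token" && frame.text) aiContent += frame.text;
  }
  return aiContent; // written as one ai-role message to MongoDB on completion
}
```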

Authoritative routing rules (excerpts)

These are not prompt-engineering hints; they are deterministic rules in agent.routing.ts that override model judgment.

  • Memecoin vs equities disambiguation. Queries that look like memecoin lookups MUST use getMemecoin* tools (Price Engine memecoin path). Fortune-500 tickers (AMD, NVDA, AAPL) are guarded — they route to equities providers unless a crypto keyword is present, to prevent the model from hallucinating a same-letter memecoin.
  • Agent management. Creating, listing, and toggling agents always uses agents.create, agents.list, agents.toggle. The model is not allowed to invent its own equivalent.
  • Trade outcome reporting. The model is explicitly forbidden from claiming a swap or agent execution succeeded or failed without a system-emitted signal. Outcome strings come from the executor, not the LLM.
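
The memecoin-vs-equities guard amounts to a deterministic routing function. A minimal sketch, with an illustrative keyword list and ticker set standing in for the real rules in agent.routing.ts:

```typescript
// Sketch of the stock-ticker disambiguation guard. The guarded-ticker list
// and crypto-keyword set are illustrative assumptions.
const GUARDED_TICKERS = new Set(["AMD", "NVDA", "AAPL"]); // equities by default
const CRYPTO_KEYWORDS = ["memecoin", "token", "solana", "on-chain"]; // assumed

type Route = "equities" | "memecoin";

function routeTicker(query: string, ticker: string): Route {
  const q = query.toLowerCase();
  const cryptoIntent = CRYPTO_KEYWORDS.some((k) => q.includes(k));
  // Guarded Fortune-500 tickers go to equities providers unless the user
  // signalled crypto intent, preventing a same-letter memecoin hallucination.
  if (GUARDED_TICKERS.has(ticker.toUpperCase()) && !cryptoIntent) return "equities";
  return "memecoin"; // memecoin-looking lookups use the getMemecoin* tools
}
```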

Structured output is enforced, not requested

Anywhere the system needs structured JSON (agent draft, signal pick), it uses Anthropic tool-use with tool_choice forced to a specific tool. The model cannot return free-form text in those flows; the only path forward is a tool call whose arguments are validated against a Zod schema before persistence.

Example: agent prompt parsing forces tool_choice: { type: "tool", name: "build_agent" } and validates the arguments against triggerSchema and actionSchema. A schema failure causes a hard reject, not a soft retry on the next user turn.
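
A minimal sketch of that hard-reject path, with a plain predicate standing in for the real Zod triggerSchema/actionSchema (the field names here are assumptions):

```typescript
// Sketch of the enforce-don't-request pattern: the only accepted model output
// is a build_agent tool call whose arguments pass validation.
interface ToolUse { type: "tool_use"; name: string; input: Record<string, unknown> }

function isValidAgentDraft(input: Record<string, unknown>): boolean {
  // Stand-in for triggerSchema/actionSchema; field names are assumptions.
  return typeof input.trigger === "object" && input.trigger !== null &&
         typeof input.action === "object" && input.action !== null;
}

function acceptAgentDraft(block: ToolUse): Record<string, unknown> {
  if (block.name !== "build_agent") throw new Error("unexpected tool"); // tool_choice was forced
  if (!isValidAgentDraft(block.input)) {
    // Hard reject: nothing is persisted, no soft retry on the next user turn.
    throw new Error("schema validation failed");
  }
  return block.input; // safe to persist as a draft agent
}
```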

System prompts and policy text

System prompts are versioned alongside the code that uses them.

  • Agent parser — long, deterministic prompt that classifies trigger types (recurring, price-target, price-move, sentiment, macro, on-chain, cross-asset) and applies smart defaults. Lives in artifacts/api-agents/src/mcp/haiku.ts.
  • Signals analyst — short identity prompt followed by candidate JSON; the model returns six picks via tool-use. Lives in artifacts/api-server/src/jobs/signals-generation.ts.
  • Admin formatter — formatter persona, sanitised HTML output, tight allow-list on tags. Lives in artifacts/api-server/src/routes/ai-format.ts.
  • Doc translator — 12 absolute rules covering MDX preservation, frontmatter, code blocks, proper nouns. Lives in artifacts/true-docs/scripts/translate.mjs.

Tool registry exposed to the chat model

| Tool | Purpose |
| --- | --- |
| agents.create | Create a draft agent from a natural-language prompt + wallet context. |
| agents.list | List the user’s agents, filtered by status/asset class. |
| agents.toggle | Pause / resume / stop an agent. |
| getPrice, getMultiPrice | Read from the unified Price Engine. |
| getMemecoin* | Memecoin-specific reads routed through the Birdeye-backed router. |
| getOhlcv | Candlestick reads. |
| getHolders, getTopTraders, getSecurity | Long-tail token analytics. |
| getSignals | Read the active signals set. |
| getStakeOptions | Read available staking providers and live APYs. |
| searchToken, getTrending, getNewListings | Discovery surfaces. |

Read tools have no side effects. Mutating tools (agents.create, agents.toggle) require walletAddress and are validated server-side; the user must still sign the permit at activation, which the LLM cannot do on their behalf.

Memory and context

The chat-api dispatch is stateless per turn. The model does not see the entire chat history on every turn; it sees:

  • The current user message.
  • The injected routing instructions.
  • A wallet context object (totals, holdings) when the routing implies the model should size something.
  • Tool results, inline, as they return.

Long-term context (saved signals, agent state, wallet history) is fetched on demand through tool calls, not packed into the prompt. This keeps token cost predictable and prevents stale context from skewing answers.
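
The per-turn assembly can be sketched as follows; the message shapes and the wallet-context injection format are assumptions:

```typescript
// Sketch of stateless per-turn prompt assembly: routing instructions plus the
// current message, with wallet context attached only when needed.
interface Msg { role: "system" | "user"; content: string }

function buildTurn(
  userMessage: string,
  routingInstructions: string,
  walletContext?: { totalUsd: number; holdings: string[] },
): Msg[] {
  // Wallet context is appended only when routing implies the model must size something.
  const system = walletContext
    ? `${routingInstructions}\n\nWallet context: ${JSON.stringify(walletContext)}`
    : routingInstructions;
  return [
    { role: "system", content: system },    // injected every turn
    { role: "user", content: userMessage }, // current message only; no chat history
  ];
}
```

Tool results are then appended inline as they return, rather than being pre-packed into the prompt.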

Safety guards

The model never holds keys, never signs, never exfiltrates secrets

Every signing path requires a user-signed permit (for agents) or an explicit user signature (for swaps and stakes). The LLM has no path to spend funds. Tools that touch user state are validated against the schema, the permit, and the anomaly detector before they take effect.

Specific guards in code:

  • Stock-ticker confusion guard. Ambiguous tickers like AMD route to equities by default to prevent memecoin hallucinations on a Fortune-500 query.
  • High-risk confirmation. agents.create rejects agents whose amount exceeds 10% of wallet equity unless confirmHighRisk: true is present.
  • Outcome honesty. The chat model is forbidden from asserting that a swap succeeded or failed without a system-emitted signal.
  • Schema-tight tool args. Zod validation on every mutating tool call. A bad tool call is dropped, not “best-effort” applied.
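
The high-risk confirmation guard, for instance, reduces to a simple server-side check. Only the 10% threshold comes from the rule above; the argument names are assumptions:

```typescript
// Sketch of the high-risk confirmation guard on agents.create. The 10% cap is
// from the documented rule; field names are illustrative.
interface CreateAgentArgs { amountUsd: number; confirmHighRisk?: boolean }

function checkHighRisk(args: CreateAgentArgs, walletEquityUsd: number): void {
  const cap = walletEquityUsd * 0.10;
  if (args.amountUsd > cap && args.confirmHighRisk !== true) {
    // Rejected server-side; the model cannot bypass this with prose.
    throw new Error("amount exceeds 10% of wallet equity; confirmHighRisk required");
  }
}
```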

Reliability — failover and circuit breakers

createMessageWithFailover wraps every Anthropic call with a primary breaker and a fallback breaker:

  • Primary breaker trips on sustained 5xx / timeout patterns from the primary provider.
  • Fallback breaker wraps the fallback path so a degraded fallback doesn’t keep getting hit either.
  • Health probes half-open the breakers periodically to recover automatically.

For surfaces that can tolerate degraded behaviour (admin formatting, translation), failure surfaces a structured error to the caller. For surfaces that cannot (agent activation, signal generation), the operation is aborted and retried at the next slot.
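
A compact sketch of the two-breaker pattern, with simplified trip and half-open logic; the thresholds and call signatures are assumptions, not the real createMessageWithFailover:

```typescript
// Sketch of primary/fallback breakers around an LLM call. Trip thresholds,
// probe timing, and signatures are illustrative assumptions.
class Breaker {
  private failures = 0;
  constructor(private threshold: number) {}
  get open(): boolean { return this.failures >= this.threshold; }
  record(ok: boolean): void { this.failures = ok ? 0 : this.failures + 1; }
  halfOpen(): void { this.failures = this.threshold - 1; } // health probe lets one call through
}

async function callWithFailover<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
  primaryBreaker: Breaker,
  fallbackBreaker: Breaker,
): Promise<T> {
  if (!primaryBreaker.open) {
    try {
      const out = await primary();
      primaryBreaker.record(true);
      return out;
    } catch {
      primaryBreaker.record(false); // sustained 5xx / timeouts eventually trip it
    }
  }
  if (fallbackBreaker.open) throw new Error("both providers unavailable");
  try {
    const out = await fallback();
    fallbackBreaker.record(true);
    return out;
  } catch (err) {
    fallbackBreaker.record(false); // a degraded fallback stops being hit too
    throw err;
  }
}
```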

For Developers

Where to find this in the repo

| Concern | File |
| --- | --- |
| Chat dispatch and SSE → Redis fan-out | artifacts/chat-api/src/modules/agent/agent.dispatcher.ts |
| Routing rules and tool-use policy | artifacts/chat-api/src/modules/agent/agent.routing.ts |
| Agent prompt parser (Haiku, tool-choice forced) | artifacts/api-agents/src/mcp/haiku.ts |
| MCP tool router and validators | artifacts/api-agents/src/mcp/router.ts, tools.ts |
| Signal generation prompts and slot logic | artifacts/api-server/src/jobs/signals-generation.ts |
| Failover / breakers | artifacts/api-server/src/lib/anthropic.ts |
| Admin formatter prompt | artifacts/api-server/src/routes/ai-format.ts |

Calling the chat API

The chat surface is consumed via the streaming endpoint documented on the Chat page. Tool calls and results surface as discrete SSE frames so a custom client can render reasoning, tool invocations, and tokens distinctly.

Safety, limits, failure modes

  • Token-budget overruns. A turn that exceeds the budget is truncated and surfaced as a structured warning, not silently shortened.
  • Tool-call hallucination. Calls to unknown tools are dropped at the dispatcher; they never reach the network.
  • Schema rejection. A tool call that fails Zod validation is dropped and the user-facing turn explains the failure rather than fabricating success.
  • Provider outage. Failover engages automatically; if both paths are down, the surface returns a structured error and clients should retry with backoff.
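
The unknown-tool drop described above can be sketched as a registry check at the dispatcher boundary; the abbreviated registry below is illustrative:

```typescript
// Sketch of the dispatcher-side tool gate: a call to a name outside the fixed
// registry is dropped before any network I/O. Registry abbreviated here.
const TOOL_REGISTRY = new Set([
  "agents.create", "agents.list", "agents.toggle",
  "getPrice", "getMultiPrice", "getOhlcv", "getSignals", "getStakeOptions",
]);

interface ToolCall { name: string; args: unknown }

function gateToolCalls(calls: ToolCall[]): { accepted: ToolCall[]; dropped: string[] } {
  const accepted: ToolCall[] = [];
  const dropped: string[] = [];
  for (const call of calls) {
    if (TOOL_REGISTRY.has(call.name)) accepted.push(call);
    else dropped.push(call.name); // hallucinated tool: never reaches the network
  }
  return { accepted, dropped };
}
```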

See also

  • Agents (DARS) — the orchestration layer above the dispatcher.
  • Chat — the streaming runtime that consumes the LLM stack.
  • True Agents — where parsed prompts become enforced strategies.
  • MCP — the tool surface exposed to external LLM clients.