Model API

The model adapter converts internal messages, tools, thinking settings, system prompt blocks, and cache metadata into an Anthropic Messages API request. It then turns streaming protocol events back into assistant messages and raw stream events for the query loop.

Adapter flow

queryModelWithStreaming()
  wrap queryModel() in streaming VCR

queryModel()
  resolve model and beta flags
  map Tool objects to API schemas
  normalize internal messages
  build system prompt blocks
  paramsFromContext()
  anthropic.beta.messages.create({ stream: true })
  for each stream event:
    collect text, thinking, signatures, tool input JSON
    on content_block_stop: yield AssistantMessage
    on message_delta: update usage and stop reason
    yield raw stream_event

The wrapper keeps streaming observable

queryModelWithStreaming is a thin wrapper around queryModel that records streaming behavior through a VCR layer. The query loop depends on the adapter yielding useful events while the response is still in progress.

Code reference: queryModelWithStreaming
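
To make the wrapper idea concrete, here is a minimal sketch of a pass-through recorder around an async generator. The names AdapterEvent, Recorder, withRecording, and fakeQueryModel are hypothetical and do not come from the codebase; the real VCR layer is not shown here and may work differently. The point is only that each event is recorded and then yielded immediately, so the consumer still sees events while the response is in progress.

```typescript
// Hypothetical event shape; the real adapter yields richer message types.
type AdapterEvent = { type: string; [key: string]: unknown };

// Hypothetical recorder interface standing in for the VCR layer.
interface Recorder {
  record(event: AdapterEvent): void;
}

// Record each event, then pass it through right away instead of
// buffering until the stream ends.
async function* withRecording(
  source: AsyncIterable<AdapterEvent>,
  recorder: Recorder,
): AsyncGenerator<AdapterEvent> {
  for await (const event of source) {
    recorder.record(event);
    yield event;
  }
}

// Stand-in for queryModel(): emits one raw event and one assistant message.
async function* fakeQueryModel(): AsyncGenerator<AdapterEvent> {
  yield { type: "stream_event" };
  yield { type: "assistant" };
}

async function demoRecording(): Promise<{ seen: string[]; taped: string[] }> {
  const taped: string[] = [];
  const seen: string[] = [];
  const recorder: Recorder = {
    record(event) {
      taped.push(event.type);
    },
  };
  for await (const event of withRecording(fakeQueryModel(), recorder)) {
    seen.push(event.type);
  }
  return { seen, taped };
}
```

In this sketch the consumer and the recorder observe the same events in the same order, which is the property the query loop relies on.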

Internal tools become API tool schemas

The adapter resolves the model, beta flags, previous request state, advisor/tool-search settings, and tool schemas. Built-in and MCP tools have already been normalized to the same local Tool shape, so this layer can map them into Anthropic-compatible tool schema objects.

External references: Anthropic Messages API and Anthropic tool use.
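
As an illustration of the mapping step, the sketch below converts a simplified local tool into the name, description, and input_schema fields that the Messages API expects for a custom tool. LocalTool, inputJSONSchema, and toApiToolSchema are hypothetical names chosen for this example; the real Tool type has more fields and the real mapping may add options not shown here.

```typescript
// Hypothetical, simplified local tool shape.
interface LocalTool {
  name: string;
  description: string;
  inputJSONSchema: Record<string, unknown>;
  // Local-only behavior that is never sent to the API.
  call: (input: unknown) => Promise<unknown>;
}

// Fields of a custom tool definition in a Messages API request.
interface ApiToolSchema {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
}

// Keep only what the API needs; drop local behavior such as call().
function toApiToolSchema(tool: LocalTool): ApiToolSchema {
  return {
    name: tool.name,
    description: tool.description,
    input_schema: tool.inputJSONSchema,
  };
}

const readFileTool: LocalTool = {
  name: "read_file",
  description: "Read a file from the workspace.",
  inputJSONSchema: {
    type: "object",
    properties: { path: { type: "string" } },
    required: ["path"],
  },
  call: async (input) => input,
};

const apiTools: ApiToolSchema[] = [readFileTool].map(toApiToolSchema);
```

Because built-in and MCP tools share one local shape, a single function like this can serve both.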

System prompt blocks and cache control are explicit

The adapter builds system prompt blocks separately from user messages. The request builder can attach cache-control metadata to those blocks and include thinking settings, output config, metadata, speed options, and context management configuration in the request.
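
A rough sketch of what explicit system blocks can look like follows. The block shape (type, text, optional cache_control of type "ephemeral") matches the Messages API. The helper buildSystemBlocks, the choice to mark only the last block as a cache breakpoint, the placeholder model name, and the token numbers are all assumptions for illustration, not the adapter's actual policy.

```typescript
// A system prompt text block as accepted by the Messages API.
interface SystemTextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

// Hypothetical helper: mark only the final block as a cache breakpoint,
// so the prefix up to and including that block can be cached.
function buildSystemBlocks(parts: string[], enableCaching: boolean): SystemTextBlock[] {
  return parts.map((text, index) => {
    const block: SystemTextBlock = { type: "text", text };
    if (enableCaching && index === parts.length - 1) {
      block.cache_control = { type: "ephemeral" };
    }
    return block;
  });
}

const systemBlocks = buildSystemBlocks(
  ["You are a coding assistant.", "Project conventions: use tabs."],
  true,
);

// The system blocks travel in their own field, apart from the messages array.
const requestParams = {
  model: "example-model", // placeholder, not a real model id
  max_tokens: 4096,
  system: systemBlocks,
  thinking: { type: "enabled", budget_tokens: 1024 },
  messages: [{ role: "user", content: "Hello" }],
  stream: true,
};
```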

The API call is a streamed Messages request

The adapter creates an Anthropic client and calls anthropic.beta.messages.create with stream: true, an abort signal, and headers. The call is wrapped in retry handling.

Code reference: streamed API call with retry

External reference: Anthropic streaming.
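
The retry wrapping can be pictured with a generic helper like the one below. This is a sketch under stated assumptions: withRetry, its options, and the exponential backoff are hypothetical, and the source does not say which errors are retried, how many attempts are made, or what delays are used. In the adapter, the operation passed in would be the anthropic.beta.messages.create call; here a fake operation stands in so the example runs on its own.

```typescript
// Hypothetical retry options; the real policy is not described in the source.
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  signal?: { aborted: boolean }; // minimal stand-in for an abort signal
}

// Retry an async operation with exponential backoff between attempts.
async function withRetry<T>(
  operation: (attempt: number) => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    // Stop early if the caller has cancelled the request.
    if (options.signal?.aborted) throw new Error("aborted");
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt === options.maxAttempts) break;
      const delayMs = options.baseDelayMs * 2 ** (attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}

// Fake "open the stream" operation: fails twice, then succeeds.
async function demoRetry(): Promise<{ result: string; calls: number }> {
  let calls = 0;
  const result = await withRetry(
    async () => {
      calls++;
      if (calls < 3) throw new Error("overloaded");
      return "stream-opened";
    },
    { maxAttempts: 5, baseDelayMs: 1 },
  );
  return { result, calls };
}
```

Note that this sketch only retries opening the stream. Whether and how a failure in the middle of a stream is retried is a separate design question that the helper does not address.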

Stream parsing yields assistant messages incrementally

The stream handler accumulates text, thinking, signatures, tool_use JSON input, and server tool_use input. When a content block stops, it creates and yields an assistant message. Message deltas mutate usage and stop reason on the last yielded message, while raw stream events are also yielded for telemetry and UI consumers.
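
The accumulation logic can be sketched as a small state machine over the documented streaming event types (content_block_start, content_block_delta, content_block_stop, message_delta). The class StreamAccumulator and the simplified AssistantMessage shape are hypothetical; the real handler tracks more state, including server tool use, and its message type is richer.

```typescript
type Delta =
  | { type: "text_delta"; text: string }
  | { type: "thinking_delta"; thinking: string }
  | { type: "signature_delta"; signature: string }
  | { type: "input_json_delta"; partial_json: string };

type ContentBlock =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string; signature?: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

type RawStreamEvent =
  | { type: "content_block_start"; index: number; content_block: ContentBlock }
  | { type: "content_block_delta"; index: number; delta: Delta }
  | { type: "content_block_stop"; index: number }
  | { type: "message_delta"; delta: { stop_reason: string | null }; usage: { output_tokens: number } };

// Hypothetical simplified message shape.
interface AssistantMessage {
  content: ContentBlock[];
  stopReason: string | null;
  outputTokens: number;
}

class StreamAccumulator {
  private blocks = new Map<number, ContentBlock>();
  private partialJson = new Map<number, string>();
  private lastMessage: AssistantMessage | undefined;

  // Returns a finished message on content_block_stop, otherwise undefined.
  handle(event: RawStreamEvent): AssistantMessage | undefined {
    switch (event.type) {
      case "content_block_start":
        this.blocks.set(event.index, { ...event.content_block });
        this.partialJson.set(event.index, "");
        return undefined;
      case "content_block_delta": {
        const block = this.blocks.get(event.index);
        const delta = event.delta;
        if (!block) return undefined;
        if (delta.type === "text_delta" && block.type === "text") {
          block.text += delta.text;
        } else if (delta.type === "thinking_delta" && block.type === "thinking") {
          block.thinking += delta.thinking;
        } else if (delta.type === "signature_delta" && block.type === "thinking") {
          block.signature = (block.signature ?? "") + delta.signature;
        } else if (delta.type === "input_json_delta" && block.type === "tool_use") {
          // Tool input arrives as JSON fragments; parse only when the block ends.
          const soFar = this.partialJson.get(event.index) ?? "";
          this.partialJson.set(event.index, soFar + delta.partial_json);
        }
        return undefined;
      }
      case "content_block_stop": {
        const block = this.blocks.get(event.index);
        if (!block) return undefined;
        if (block.type === "tool_use") {
          const json = this.partialJson.get(event.index) ?? "";
          block.input = json === "" ? {} : JSON.parse(json);
        }
        this.lastMessage = { content: [block], stopReason: null, outputTokens: 0 };
        return this.lastMessage;
      }
      case "message_delta":
        // Usage and stop reason arrive late, so patch the last yielded message.
        if (this.lastMessage) {
          this.lastMessage.stopReason = event.delta.stop_reason;
          this.lastMessage.outputTokens = event.usage.output_tokens;
        }
        return undefined;
    }
  }
}

const sampleEvents: RawStreamEvent[] = [
  { type: "content_block_start", index: 0, content_block: { type: "text", text: "" } },
  { type: "content_block_delta", index: 0, delta: { type: "text_delta", text: "Hel" } },
  { type: "content_block_delta", index: 0, delta: { type: "text_delta", text: "lo" } },
  { type: "content_block_stop", index: 0 },
  { type: "content_block_start", index: 1, content_block: { type: "tool_use", id: "toolu_1", name: "read_file", input: {} } },
  { type: "content_block_delta", index: 1, delta: { type: "input_json_delta", partial_json: "{\"path\":" } },
  { type: "content_block_delta", index: 1, delta: { type: "input_json_delta", partial_json: "\"a.txt\"}" } },
  { type: "content_block_stop", index: 1 },
  { type: "message_delta", delta: { stop_reason: "tool_use" }, usage: { output_tokens: 12 } },
];

const accumulator = new StreamAccumulator();
const yielded: AssistantMessage[] = [];
for (const event of sampleEvents) {
  const message = accumulator.handle(event);
  if (message) yielded.push(message);
}
```

In this sketch the message_delta only reaches the second message, because that is the last one yielded when it arrives. A caller that also forwards every raw event alongside these messages would match the two outputs described above.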