Canonical token-usage shape — single source of truth across providers.
Phase A of svar 0.6.0. Replaces the hybrid pre-0.6 shape that emitted
`:prompt_tokens` with provider-dependent semantics (Anthropic
additive, OpenAI inclusive) under the same key. Every provider
normalizer now produces the SAME shape; downstream consumers read
one set of keys regardless of which model served the call.
Canonical shape — INVARIANT: `regular + cache-write + cache-read = input-tokens`:
    {:input-tokens          <long>   ;; TOTAL prompt tokens (always inclusive)
     :output-tokens         <long>   ;; TOTAL completion tokens
     :input-tokens-details  {:regular     <long>   ;; not from cache, not written
                             :cache-write <long>   ;; written this request (1.25× input rate, anthropic; 0 else)
                             :cache-read  <long>}  ;; served from cache (0.1× input rate, anthropic; ~10-50% off, openai)
     :output-tokens-details {:reasoning <long>}    ;; subset of output-tokens
     :total-tokens          <long>   ;; convenience = input-tokens + output-tokens
     :raw                   <map>}   ;; original provider envelope (debug / forensics)
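
The invariant and the `:total-tokens` convenience sum stated above can be checked mechanically. The following is a minimal sketch; `usage-invariants-hold?` and `example-usage` are hypothetical names used for illustration and are not part of svar.

```clojure
;; Hypothetical helper (not part of svar): checks the two sums that the
;; canonical shape promises.
(defn usage-invariants-hold?
  [{:keys [input-tokens output-tokens total-tokens input-tokens-details]}]
  (let [{:keys [regular cache-write cache-read]} input-tokens-details]
    (and
     ;; regular + cache-write + cache-read = input-tokens
     (= input-tokens (+ regular cache-write cache-read))
     ;; total-tokens = input-tokens + output-tokens
     (= total-tokens (+ input-tokens output-tokens)))))

;; Illustrative values only.
(def example-usage
  {:input-tokens          1000
   :output-tokens         200
   :input-tokens-details  {:regular 100 :cache-write 300 :cache-read 600}
   :output-tokens-details {:reasoning 50}
   :total-tokens          1200
   :raw                   {}})

(usage-invariants-hold? example-usage)
```

A check like this is cheap enough to run in tests against every provider normalizer, since all of them are documented to emit the same shape.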
Provider differences:
- Anthropic Messages API (`:anthropic` api-style): RAW
`input_tokens` excludes cached AND cache-creation. Canonical
`:input-tokens` adds all three so the value is TOTAL.
- OpenAI Chat / Responses (`:openai-compatible-*` api-styles): RAW
`prompt_tokens` / `input_tokens` IS the total. Cached subset lives
under `prompt_tokens_details.cached_tokens` /
`input_tokens_details.cached_tokens`. No native `cache-write`
concept (server-managed implicit caching), so `:cache-write` is
always 0 here UNLESS the provider proxies Anthropic via OpenRouter
and surfaces `cache_creation_input_tokens` as a pydantic extra
field.
- Z.ai (GLM coding-plan / OpenAI-compatible-chat): same as OpenAI.
Industry alignment (May 2026):
- Vercel AI SDK V3 spec (vercel/ai#9921): `inputTokens` always TOTAL
with `inputTokensDetails {regular, cacheWrite, cacheRead}`.
- OpenTelemetry `gen_ai.usage.input_tokens` (≥ v1.37): SHOULD be
inclusive (all kinds of input tokens).
- Claude Code official statusline JSON:
`context_window.total_input_tokens = input + cache_creation + cache_read`.
Reject: the additive convention (e.g. litellm PR #23342) leaves
`total_tokens` inconsistent with `prompt_tokens` and breaks naive
aggregation downstream.

(anthropic-canonical usage)

Anthropic Messages API → canonical shape.
Anthropic is ADDITIVE: `input_tokens` excludes cached AND
cache_creation. To get a TOTAL we sum all three.
`usage` is the raw `:usage` object from an Anthropic response
(`message_start.message.usage`, top-level `:usage` on non-stream,
or a `message_delta.usage` delta). Missing fields are treated as 0.
Returns nil for nil input.
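
The additive-to-inclusive conversion described above might look roughly like the sketch below. It is not svar's actual implementation: the name `anthropic-canonical-sketch` is hypothetical, the raw usage map is assumed to have keywordized keys, and `:reasoning` is assumed to be 0 because the docstring does not say how reasoning tokens are derived for Anthropic.

```clojure
;; Hypothetical sketch of an Anthropic normalizer (not svar's real code).
;; Assumes the raw usage map has keywordized keys such as :input_tokens.
(defn anthropic-canonical-sketch
  [usage]
  (when usage                                   ; nil in, nil out
    (let [regular     (or (:input_tokens usage) 0)
          cache-write (or (:cache_creation_input_tokens usage) 0)
          cache-read  (or (:cache_read_input_tokens usage) 0)
          output      (or (:output_tokens usage) 0)
          ;; Anthropic is additive, so the TOTAL is the sum of all three.
          input       (+ regular cache-write cache-read)]
      {:input-tokens          input
       :output-tokens         output
       :input-tokens-details  {:regular     regular
                               :cache-write cache-write
                               :cache-read  cache-read}
       ;; Assumption: no separate reasoning count is available here.
       :output-tokens-details {:reasoning 0}
       :total-tokens          (+ input output)
       :raw                   usage})))

(anthropic-canonical-sketch
 {:input_tokens 100
  :cache_creation_input_tokens 300
  :cache_read_input_tokens 600
  :output_tokens 200})
```

With these illustrative numbers, the raw `input_tokens` of 100 becomes a canonical `:input-tokens` of 1000, which is the value a naive aggregator would expect to sum.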
(canonical->tokens canonical)

Project the canonical shape to the FLAT `:tokens` map returned by
`ask!` / `ask-code!`. Backward-compatible field names where
possible:
    {:input         <input-tokens, TOTAL>
     :output        <output-tokens>
     :reasoning     <output-tokens-details.reasoning>
     :total         <total-tokens>
     :cached        <input-tokens-details.cache-read>
     :cache-created <input-tokens-details.cache-write>
     :input-regular <input-tokens-details.regular>}
This is the shape downstream consumers (vis loop, vis TUI footer,
CLI bracket, Telegram tagline) read. `:input` now ALWAYS means
TOTAL prompt tokens (not anthropic's pre-cache RAW).
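
The projection is a plain key-renaming step, and a minimal sketch of it follows. The name `canonical->tokens-sketch` is hypothetical, and returning nil for nil input is an assumption carried over from the normalizers; the docstring above does not state it for this function.

```clojure
;; Hypothetical sketch of the flat projection (not svar's real code).
(defn canonical->tokens-sketch
  [canonical]
  (when canonical   ; assumption: nil-safe like the provider normalizers
    {:input         (:input-tokens canonical)
     :output        (:output-tokens canonical)
     :reasoning     (get-in canonical [:output-tokens-details :reasoning])
     :total         (:total-tokens canonical)
     :cached        (get-in canonical [:input-tokens-details :cache-read])
     :cache-created (get-in canonical [:input-tokens-details :cache-write])
     :input-regular (get-in canonical [:input-tokens-details :regular])}))

;; Illustrative canonical map.
(def canonical-example
  {:input-tokens          1000
   :output-tokens         200
   :input-tokens-details  {:regular 100 :cache-write 300 :cache-read 600}
   :output-tokens-details {:reasoning 50}
   :total-tokens          1200
   :raw                   {}})

(canonical->tokens-sketch canonical-example)
```

Note that `:raw` is deliberately left out of the flat map in this sketch, since the documented `:tokens` shape lists no such key.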

(openai-canonical usage)

OpenAI Chat / Responses API → canonical shape.
OpenAI is INCLUSIVE: `prompt_tokens` / `input_tokens` IS the total.
Cached subset lives under `prompt_tokens_details.cached_tokens` or
`input_tokens_details.cached_tokens`. OpenRouter-proxied Anthropic
may surface `cache_creation_input_tokens` as a pydantic-extra; we
subtract both from the total to compute `regular`.
Accepts EITHER `:prompt_tokens` (Chat API) or `:input_tokens`
(Responses API). Cache subkey is `:prompt_tokens_details` /
`:input_tokens_details`; we read whichever is present.
Returns nil for nil input.
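
For the inclusive case the arithmetic runs the other way: the total is taken as given and `regular` is derived by subtraction. The sketch below illustrates this under several assumptions: the name `openai-canonical-sketch` is hypothetical, keys are keywordized, `cache_creation_input_tokens` is assumed to sit at the top level of the usage map, and reasoning tokens are read from `completion_tokens_details` / `output_tokens_details`, which the docstring does not mention.

```clojure
;; Hypothetical sketch of an OpenAI-style normalizer (not svar's real code).
(defn openai-canonical-sketch
  [usage]
  (when usage                                   ; nil in, nil out
    (let [;; Chat API uses prompt_/completion_, Responses API uses input_/output_.
          input       (or (:prompt_tokens usage) (:input_tokens usage) 0)
          output      (or (:completion_tokens usage) (:output_tokens usage) 0)
          in-details  (or (:prompt_tokens_details usage)
                          (:input_tokens_details usage))
          out-details (or (:completion_tokens_details usage)
                          (:output_tokens_details usage))
          cache-read  (or (:cached_tokens in-details) 0)
          ;; Assumption: the proxied Anthropic extra field is top-level.
          cache-write (or (:cache_creation_input_tokens usage) 0)
          reasoning   (or (:reasoning_tokens out-details) 0)
          ;; OpenAI is inclusive, so regular is what remains of the total.
          regular     (- input cache-read cache-write)]
      {:input-tokens          input
       :output-tokens         output
       :input-tokens-details  {:regular     regular
                               :cache-write cache-write
                               :cache-read  cache-read}
       :output-tokens-details {:reasoning reasoning}
       :total-tokens          (+ input output)
       :raw                   usage})))

;; Chat API style payload.
(openai-canonical-sketch
 {:prompt_tokens 1000
  :completion_tokens 200
  :prompt_tokens_details {:cached_tokens 600}})
```

Because `regular` is computed as a remainder, the invariant `regular + cache-write + cache-read = input-tokens` holds by construction in this sketch.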