The model registry powers sdk/list-models, sdk/model-info, sdk/model-capabilities, sdk/model-context-length, and cost estimation. It is designed to answer useful questions offline while still allowing live provider catalogs and caller overrides.
When more than one tier has data for the same model, the tier listed higher in the table wins:
| Tier | Source | Purpose |
|---|---|---|
| Override | sdk/register-model-info | Caller data for private deployments or newly released models. |
| Live | Provider /models endpoint | What the current account can see right now. |
| LiteLLM snapshot | resources/litellm-snapshot.json | Broad pricing/context coverage for providers the SDK supports. |
| models.dev | resources/models-dev-snapshot.json plus optional cache | Large public model catalog fallback. |
The returned model entry carries :model/source so callers can see which tier answered.
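The precedence rule can be pictured with a small, self-contained sketch. It is not the SDK's implementation: the tier keywords, the `tiers` data shape, the `resolve-model` function, and the sample context lengths are all hypothetical, and it assumes the winning tier supplies the whole entry rather than merging fields across tiers.

```clojure
;; Hypothetical sketch: tier names and data shapes are illustrative only.
(def tier-order
  "Highest-precedence tier first, mirroring the tier table."
  [:override :live :litellm-snapshot :models-dev])

(defn resolve-model
  "Returns the entry from the highest tier that knows [provider model-id],
  tagged with :model/source, or nil when no tier has it.
  `tiers` maps tier keyword -> {[provider model-id] entry}."
  [tiers provider model-id]
  (some (fn [tier]
          (when-let [entry (get-in tiers [tier [provider model-id]])]
            (assoc entry :model/source tier)))
        tier-order))

;; Sample data (made-up values): the override and the snapshot disagree
;; about :acme "magic-7", so the override should win.
(def example-tiers
  {:override         {[:acme "magic-7"] {:model/context-length 32000}}
   :litellm-snapshot {[:openai "gpt-4o"] {:model/context-length 128000}
                      [:acme "magic-7"]  {:model/context-length 8000}}})

(resolve-model example-tiers :acme "magic-7")
;; => {:model/context-length 32000, :model/source :override}
```

A model known only to a lower tier, such as `[:openai "gpt-4o"]` in the sample data, resolves with that tier as its source.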
(sdk/list-models)
(sdk/list-models :openai)
(sdk/model-info :openai "gpt-4o")
(sdk/model-capabilities :openai "gpt-4o")
(sdk/model-context-length :openai "gpt-4o")
Single-argument lookup scans providers in a stable preference order. Prefer two-argument lookup when the model id is ambiguous across providers.
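To make the ambiguity concrete, here is a minimal sketch of a preference-ordered scan. The preference order, the `:azure` provider, the catalog shape, and the `lookup` function are assumptions made for illustration; the SDK's actual order and return shape are not described here.

```clojure
;; Hypothetical preference order; the SDK's real order is not documented here.
(def provider-preference [:openai :azure :acme])

(defn providers-with-model
  "Providers from `provider-preference`, in order, whose catalog contains
  model-id. `catalog` maps provider -> {model-id entry}."
  [catalog model-id]
  (filterv #(contains? (get catalog %) model-id) provider-preference))

(defn lookup
  "One-argument style lookup: the first provider in preference order wins.
  :ambiguous? flags ids that more than one provider offers."
  [catalog model-id]
  (let [providers (providers-with-model catalog model-id)]
    (when-let [provider (first providers)]
      {:provider   provider
       :entry      (get-in catalog [provider model-id])
       :ambiguous? (> (count providers) 1)})))

;; Sample data with made-up values.
(def example-catalog
  {:openai {"gpt-4o" {:model/context-length 128000}}
   :azure  {"gpt-4o" {:model/context-length 128000}}
   :acme   {"magic-7" {:model/context-length 32000}}})

(lookup example-catalog "gpt-4o")
;; the :openai entry is returned, with :ambiguous? true
```

When an id is flagged as ambiguous in a sketch like this, naming the provider explicitly removes any dependence on the scan order.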
(sdk/refresh-models! :provider :openai)
(sdk/refresh-models!)
Live refresh requires provider credentials and only runs for providers with supported model-list endpoints. Failures are returned per provider instead of aborting the whole refresh.
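The per-provider failure behavior can be sketched as follows. This is an illustration of the pattern, not the return shape of sdk/refresh-models!: the `refresh-all` function, the `fetch-fn` argument, and the `:status`, `:models`, and `:message` keys are hypothetical.

```clojure
(defn refresh-all
  "Calls (fetch-fn provider) for each provider and collects the outcomes.
  A thrown exception becomes a failure entry for that provider only, so
  one bad provider does not abort the others."
  [fetch-fn providers]
  (into {}
        (map (fn [provider]
               [provider
                (try
                  {:status :ok :models (fetch-fn provider)}
                  (catch Exception e
                    {:status :error :message (ex-message e)}))]))
        providers))

;; A stand-in fetcher: :acme fails, every other provider succeeds.
(defn fake-fetch [provider]
  (if (= provider :acme)
    (throw (ex-info "missing credentials" {:provider provider}))
    ["model-a" "model-b"]))

(refresh-all fake-fetch [:openai :acme])
;; => {:openai {:status :ok, :models ["model-a" "model-b"]}
;;     :acme   {:status :error, :message "missing credentials"}}
```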
Use overrides for private models, self-hosted endpoints, or pricing data that has not reached public catalogs yet:
(sdk/register-model-info
 :acme "magic-7"
 {:model/context-length 32000
  :model/max-output-tokens 4096
  :model/capabilities #{:chat :tools}
  :model/cost {:input-per-million 0.5
               :output-per-million 2.0
               :cache-read-per-million 0.1
               :cache-write-per-million 0.6
               :request-cost 0.005}})
Overrides are held in memory and do not survive a process restart. Applications that need durable custom catalogs should register them during process startup.
(sdk/estimate-cost
 :openai "gpt-4o"
 {:usage/input-tokens 1000
  :usage/output-tokens 500
  :usage/cached-input-tokens 200})
Cost results are explicit about uncertainty:
{:cost/usd 0.00625M
 :cost/estimated? true
 :cost/pricing-source "litellm-snapshot"
 :cost/breakdown {...}}
If pricing is unavailable or incomplete for a reported billable dimension,
:cost/usd is :unknown. The SDK never substitutes zero for unknown
cost and never double counts cached tokens: normalized input tokens are
uncached input, while cache reads and cache writes are priced from their
own fields when the registry has those rates.
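The two rules (unknown is never zero, and cached tokens are never counted twice) can be sketched as a small pure function. This is an assumption-laden illustration rather than the body of sdk/estimate-cost: the `estimate` function and the `dimension->rate` table are hypothetical, cache writes and per-request cost are left out for brevity, and the rates are borrowed from the `:acme "magic-7"` override example.

```clojure
(def dimension->rate
  "Usage key -> per-million rate key (token dimensions only, for brevity)."
  {:usage/input-tokens        :input-per-million
   :usage/output-tokens       :output-per-million
   :usage/cached-input-tokens :cache-read-per-million})

(defn estimate
  "Prices each reported dimension from its own rate. Input tokens are
  treated as uncached input, so cached tokens are priced exactly once.
  Returns :unknown when any reported dimension has no rate."
  [pricing usage]
  (let [reported (filter (fn [[usage-key _]] (pos? (get usage usage-key 0)))
                         dimension->rate)
        missing? (some (fn [[_ rate-key]] (nil? (get pricing rate-key)))
                       reported)]
    (if (or (nil? pricing) missing?)
      {:cost/usd :unknown :cost/estimated? true}
      {:cost/usd (reduce + 0M
                         (map (fn [[usage-key rate-key]]
                                (/ (* (bigdec (get usage usage-key))
                                      (bigdec (get pricing rate-key)))
                                   1000000M))
                              reported))
       :cost/estimated? true})))

(def acme-pricing
  {:input-per-million 0.5
   :output-per-million 2.0
   :cache-read-per-million 0.1})

(def usage
  {:usage/input-tokens 1000
   :usage/output-tokens 500
   :usage/cached-input-tokens 200})

;; 1000 * 0.5/1e6 + 500 * 2.0/1e6 + 200 * 0.1/1e6
;;   = 0.0005 + 0.001 + 0.00002 = 0.00152 USD
(estimate acme-pricing usage)

;; Cached tokens were reported but no cache-read rate exists: :unknown, not 0.
(estimate (dissoc acme-pricing :cache-read-per-million) usage)
```

Using BigDecimal arithmetic in the sketch keeps small per-token amounts exact, which matches the `M` suffix on the `:cost/usd` value in the sample result.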
The registry can carry token, cache, request, image, transcription, text-to-speech, and search-query price fields:
{:input-per-million 2.5
 :output-per-million 10.0
 :cache-read-per-million 1.25
 :cache-write-per-million 3.75
 :request-cost 0.005
 :image-per-image 0.04
 :image-per-megapixel 0.02
 :transcription-per-minute 0.006
 :tts-per-million-chars 15.0
 :search-per-call 0.005}
sdk/estimate-cost covers token, cache, and per-request chat costs.
Use the modality helpers in llm.sdk.pricing for image, transcription,
and text-to-speech attribution when the response path does not include
chat-style usage.
Provider usage normalizers emit cache counters only when the provider actually reports them. sdk/complete uses those counters to stamp cache metadata such as:
{:cache/status :hit
 :cache/cached-tokens 200
 :cache/cache-write-tokens :unknown}
If a provider is silent about cache statistics, :cache/status is :unknown. This lets applications distinguish "the provider said zero cached tokens" from "the provider did not report cache information."
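The distinction between "zero" and "not reported" comes down to checking whether a counter is present at all. A minimal sketch, assuming normalized usage simply omits counters the provider did not report; the `cache-stamp` function, the `:usage/cache-write-tokens` key, and the `:miss` status are hypothetical names used only for illustration:

```clojure
(defn cache-stamp
  "Builds cache metadata from normalized usage. A counter that is absent
  stays :unknown instead of being defaulted to 0."
  [usage]
  (let [cached  (get usage :usage/cached-input-tokens :unknown)
        written (get usage :usage/cache-write-tokens :unknown)] ; hypothetical key
    {:cache/status (cond
                     (= :unknown cached) :unknown
                     (pos? cached)       :hit
                     :else               :miss) ; hypothetical status value
     :cache/cached-tokens cached
     :cache/cache-write-tokens written}))

;; The provider reported 200 cached tokens but nothing about cache writes.
(cache-stamp {:usage/input-tokens 1000 :usage/cached-input-tokens 200})
;; => {:cache/status :hit
;;     :cache/cached-tokens 200
;;     :cache/cache-write-tokens :unknown}

;; The provider reported nothing about caching.
(cache-stamp {:usage/input-tokens 1000})
;; => {:cache/status :unknown
;;     :cache/cached-tokens :unknown
;;     :cache/cache-write-tokens :unknown}
```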
The LiteLLM snapshot is rebuilt by fetching a single upstream file
(model_prices_and_context_window.json) directly over HTTPS, so no local
LiteLLM checkout is required:
python3 scripts/build_litellm_snapshot.py
To pin a version or work offline, pass an override (a URL, a path to that
JSON file, or a directory containing it) as the first argument or via the
LITELLM_SOURCE environment variable:
python3 scripts/build_litellm_snapshot.py /path/to/litellm
The script filters LiteLLM data down to providers this SDK can actually address. It preserves token pricing plus available request, image, transcription, and text-to-speech pricing fields for cost-aware applications.