com.blockether.svar.internal.router

Liking cljdoc? Tell your friends :D

Clojure only.

Router: provider/model registry, circuit breakers, rate limiting, budget tracking, and routing resolution.

Extracted from defaults.clj (provider/model metadata) and llm.clj (routing logic) to provide a single cohesive namespace for all routing concerns.

Router: provider/model registry, circuit breakers, rate limiting, budget tracking,
and routing resolution.

Extracted from defaults.clj (provider/model metadata) and llm.clj (routing logic)
to provide a single cohesive namespace for all routing concerns.

raw docstring

check-context-limit^clj

(check-context-limit model messages)

(check-context-limit model
                     messages
                     {:keys [output-reserve throw? context-limits]
                      :or {output-reserve DEFAULT_OUTPUT_RESERVE throw? false}})

Checks if messages fit within model context limit.

Checks if messages fit within model context limit.

source raw docstring

context-limit^clj

(context-limit model)

(context-limit model context-limits)

Returns provider-agnostic conservative context window size for a model.

Params: model - String. Model name. context-limits - Map, optional. Override map (merged defaults from config).

Returns: Integer. Conservative context tokens.

Returns provider-agnostic conservative context window size for a model.

Params:
`model` - String. Model name.
`context-limits` - Map, optional. Override map (merged defaults from config).

Returns:
Integer. Conservative context tokens.

source raw docstring

count-and-estimate^clj

(count-and-estimate model messages output-text)

(count-and-estimate model
                    messages
                    output-text
                    {:keys [pricing input-tokens api-usage api-style
                            cache-creation-ttl]
                     :as opts})

Counts tokens and estimates cost in one call. Cost map separates uncached input, cached-read input, cache-write input, output, and total.

Counts tokens and estimates cost in one call. Cost map separates
uncached input, cached-read input, cache-write input, output, and total.

source raw docstring

count-messages^clj

(count-messages model messages)

Counts tokens for a chat completion message array.

Counts tokens for a chat completion message array.

source raw docstring

count-tokens^clj

(count-tokens model text)

Counts tokens for a given text string using the specified model's encoding.

Counts tokens for a given text string using the specified model's encoding.

source raw docstring

DEFAULT_IDLE_TIMEOUT_MS^clj

Default idle-stream timeout (ms) for streaming HTTP responses. If no SSE bytes arrive for this long the underlying InputStream is closed and the call surfaces :svar.core/stream-idle-timeout. Distinct from DEFAULT_TIMEOUT_MS (whole-request cap): the idle watchdog tolerates arbitrarily long total durations as long as the stream keeps emitting bytes (content deltas, SSE : ping keepalives, or even blank separators — the watchdog resets on every .readLine, so it's ping-aware for free).

2 minutes (120000 ms) is the considered sweet spot:

Matches Anthropic's own SDK proposal (anthropics/anthropic-sdk- typescript#867 suggests 120_000 per-request, #959 ships 90s default with ping-reset).
~4× Anthropic's published ping interval (15-30 s) for safety.
Catches real hangs (e.g. z.ai glm streams that simply stop sending body frames after headers) in 2 minutes instead of forever — the original 5-minute DEFAULT_TIMEOUT_MS doesn't reliably fire on JDK 25 + HTTP/2 streaming.
Anthropic's documented worst case for legitimate extended thinking on Opus 4.5 is ~185 s with zero events (see anthropics/claude-agent-sdk-typescript#44). Callers running extended-thinking workloads should bump this to 240-300 s, or pass :idle-timeout-ms nil to disable.

Disable per-call: (svar/ask-code! router {... :idle-timeout-ms nil}).

45 s default. Catches genuinely hung streams (no SSE bytes, no keepalive pings) in well under a minute while still allowing the provider to take up to ~40 s between tokens during extended reasoning. The original 120 s default let timeouts blow the whole per-task budget.

Default idle-stream timeout (ms) for streaming HTTP responses. If no
SSE bytes arrive for this long the underlying `InputStream` is closed
and the call surfaces `:svar.core/stream-idle-timeout`. Distinct from
`DEFAULT_TIMEOUT_MS` (whole-request cap): the idle watchdog tolerates
arbitrarily long total durations as long as the stream keeps emitting
bytes (content deltas, SSE `: ping` keepalives, or even blank
separators — the watchdog resets on every `.readLine`, so it's
ping-aware for free).

2 minutes (120000 ms) is the considered sweet spot:
  - Matches Anthropic's own SDK proposal (anthropics/anthropic-sdk-
    typescript#867 suggests `120_000` per-request, #959 ships 90s
    default with ping-reset).
  - ~4× Anthropic's published `ping` interval (15-30 s) for safety.
  - Catches real hangs (e.g. z.ai glm streams that simply stop
    sending body frames after headers) in 2 minutes instead of
    forever — the original 5-minute `DEFAULT_TIMEOUT_MS` doesn't
    reliably fire on JDK 25 + HTTP/2 streaming.
  - Anthropic's documented worst case for legitimate extended
    thinking on Opus 4.5 is ~185 s with zero events (see
    anthropics/claude-agent-sdk-typescript#44). Callers running
    extended-thinking workloads should bump this to 240-300 s, or
    pass `:idle-timeout-ms nil` to disable.

Disable per-call: `(svar/ask-code! router {... :idle-timeout-ms nil})`.

45 s default. Catches genuinely hung streams (no SSE bytes, no
keepalive pings) in well under a minute while still allowing the
provider to take up to ~40 s between tokens during extended
reasoning. The original 120 s default let timeouts blow the whole
per-task budget.

source raw docstring

DEFAULT_OUTPUT_RESERVE^clj

Default number of tokens to reserve for model output. 0 means no reservation — let the API handle overflow naturally.

Default number of tokens to reserve for model output.
0 means no reservation — let the API handle overflow naturally.

source raw docstring

DEFAULT_RATE_LIMIT_ROUTING^clj

Default router-owned 429 policy.

:same-provider-delays-ms — sleep schedule for same-provider retries. :fallback-after-ms — hard cap on wall time the 429 phase can consume (measured from the FIRST 429 caught). When the configured delay vector is exhausted OR elapsed ≥ budget, the router falls back to the next provider. Each delay clamps to remaining budget so the loop never overshoots. :respect-retry-after? — honor server Retry-After header value in place of the configured delay; the clamp to remaining budget still applies. :fallback-provider? — when budget is exhausted, fall back to the next provider/model. Set false to surface the rate-limit error to the caller instead.

The 60 000 ms (60 s) default tolerates Anthropic / OpenAI / z.ai Retry-After values that can land between 30-60 s under quota pressure on reasoning-heavy workloads, while still bounding the wait so a single user request cannot hang for minutes.

Default router-owned 429 policy.

`:same-provider-delays-ms` — sleep schedule for same-provider retries.
`:fallback-after-ms`       — hard cap on wall time the 429 phase can
                             consume (measured from the FIRST 429
                             caught). When the configured delay vector
                             is exhausted OR elapsed ≥ budget, the
                             router falls back to the next provider.
                             Each delay clamps to remaining budget so
                             the loop never overshoots.
`:respect-retry-after?`    — honor server `Retry-After` header value
                             in place of the configured delay; the
                             clamp to remaining budget still applies.
`:fallback-provider?`      — when budget is exhausted, fall back to
                             the next provider/model. Set false to
                             surface the rate-limit error to the
                             caller instead.

The 60 000 ms (60 s) default tolerates Anthropic / OpenAI / z.ai
Retry-After values that can land between 30-60 s under quota
pressure on reasoning-heavy workloads, while still bounding the
wait so a single user request cannot hang for minutes.

source raw docstring

DEFAULT_RETRY^clj

Default retry policy for transient HTTP errors.

Default retry policy for transient HTTP errors.

source raw docstring

DEFAULT_SEMANTIC_TIMEOUT_MS^clj

Default semantic-stream timeout (ms) for streaming HTTP responses. Nil by default: disabled unless caller opts in. If enabled and bytes keep arriving (SSE pings/comments) but no model/progress event arrives for this long, the stream is closed and surfaced as :svar.core/stream-semantic-timeout.

Distinct from DEFAULT_IDLE_TIMEOUT_MS: idle watches transport liveness; semantic watches model progress. Enable per-call with e.g. :semantic-timeout-ms 180000.

Default semantic-stream timeout (ms) for streaming HTTP responses.
Nil by default: disabled unless caller opts in. If enabled and bytes
keep arriving (SSE pings/comments) but no model/progress event arrives
for this long, the stream is closed and surfaced as
`:svar.core/stream-semantic-timeout`.

Distinct from `DEFAULT_IDLE_TIMEOUT_MS`: idle watches transport
liveness; semantic watches model progress. Enable per-call with e.g.
`:semantic-timeout-ms 180000`.

source raw docstring

DEFAULT_TIMEOUT_MS^clj

Default HTTP request timeout in milliseconds (5 minutes). Reasoning models (e.g. glm-5-turbo) may need extended time for chain-of-thought.

Default HTTP request timeout in milliseconds (5 minutes).
Reasoning models (e.g. glm-5-turbo) may need extended time for chain-of-thought.

source raw docstring

DEFAULT_TTFT_TIMEOUT_MS^clj

Default time-to-first-token timeout (ms) for streaming HTTP responses. Bounds the wait between sending the HTTP request and receiving response headers. On fire, raises :svar.core/stream-ttft-timeout and the caller thread's interrupt unparks the underlying CompletableFuture.get.

30 s default. Tight enough to surface stuck provider connections inside one iteration (the original 90 s default sometimes wasted a whole autoresearch iter waiting for headers), generous enough for real reasoning cold starts — z.ai glm-5.1 has been observed sending first headers between 8 and 22 s. Disable per-call with :ttft-timeout-ms nil; pass a larger value for slow reasoning models with long pre-stream queues.

Default time-to-first-token timeout (ms) for streaming HTTP responses.
Bounds the wait between sending the HTTP request and receiving response
headers. On fire, raises `:svar.core/stream-ttft-timeout` and the caller
thread's interrupt unparks the underlying `CompletableFuture.get`.

30 s default. Tight enough to surface stuck provider connections
inside one iteration (the original 90 s default sometimes wasted a
whole autoresearch iter waiting for headers), generous enough for
real reasoning cold starts — z.ai glm-5.1 has been observed sending
first headers between 8 and 22 s. Disable per-call with
`:ttft-timeout-ms nil`; pass a larger value for slow reasoning
models with long pre-stream queues.

source raw docstring

estimate-cost^clj

(estimate-cost model input-tokens output-tokens)

(estimate-cost model input-tokens output-tokens pricing-map)

(estimate-cost model input-tokens output-tokens pricing-map opts)

Estimates USD cost with separate uncached input, cached input, cache creation, and output components. Rates are USD per 1M tokens.

input-tokens normally means provider prompt/input tokens. For OpenAI, Z.ai, Gemini, and OpenRouter this includes cached-read tokens, so cached reads are subtracted before normal input cost. For Anthropic normalized usage, input_tokens excludes cache read/write tokens; pass {:cache-tokens-in-input? false} to keep base input intact.

Estimates USD cost with separate uncached input, cached input, cache
creation, and output components. Rates are USD per 1M tokens.

`input-tokens` normally means provider prompt/input tokens. For OpenAI,
Z.ai, Gemini, and OpenRouter this includes cached-read tokens, so cached
reads are subtracted before normal input cost. For Anthropic normalized
usage, `input_tokens` excludes cache read/write tokens; pass
`{:cache-tokens-in-input? false}` to keep base input intact.

source raw docstring

format-cost^clj

(format-cost cost)

Formats a cost value as a human-readable USD string.

Formats a cost value as a human-readable USD string.

source raw docstring

infer-model-metadata^clj

(infer-model-metadata {:keys [name] :as model-map})

Returns provider-independent model metadata. Looks up KNOWN_MODEL_METADATA first. Falls back to regex inference for unknown models. Explicit fields in model-map override inferred values.

Returns provider-independent model metadata.
Looks up KNOWN_MODEL_METADATA first. Falls back to regex inference for unknown models.
Explicit fields in model-map override inferred values.

source raw docstring

KNOWN_MODEL_METADATA^clj

Per-model static metadata. :reasoning? flags a model whose provider accepts a reasoning-depth parameter. :reasoning-style (optional) pins the wire shape to emit — see REASONING_LEVELS keys. When omitted, the style is inferred from the provider's :api-style (:anthropic → anthropic thinking, everything else → openai-effort).

Per-model static metadata. `:reasoning?` flags a model whose provider
accepts a reasoning-depth parameter. `:reasoning-style` (optional) pins the
wire shape to emit — see `REASONING_LEVELS` keys. When omitted, the style
is inferred from the provider's `:api-style` (`:anthropic` → anthropic
thinking, everything else → openai-effort).

source raw docstring

KNOWN_PROVIDER_MODELS^clj

source

KNOWN_PROVIDERS^clj

source

make-router^clj

(make-router providers)

(make-router providers opts)

Creates a router from a vector of provider maps.

Vector order = priority (first provider is highest priority). First model in provider vector = root model. Provider :base-url auto-resolved from KNOWN_PROVIDERS for known IDs. Model metadata auto-inferred from :name and merged with provider-scoped pricing/context. Duplicate provider :id values are a hard error.

opts - Optional map: :network - {:timeout-ms N :max-retries N ...} router-level network defaults :tokens - {:check-context? bool :pricing {} :context-limits {}} token defaults :budget - {:max-tokens N :max-cost N} spend limits (nil = no limit) :rate-limit - {:same-provider-delays-ms [...] :fallback-after-ms N ...} :failure-threshold - Int. Failures before circuit opens (default: 5) :recovery-ms - Int. Ms before open→half-open (default: 60000)

Example: (make-router [{:id :blockether :api-key <key> :models [{:name <model-a>} {:name <model-b>}]} {:id :openai :api-key <key> :models [{:name <model-a>} {:name <model-b>}]}] {:budget {:max-tokens 1000000 :max-cost 5.0}})

Creates a router from a vector of provider maps.

Vector order = priority (first provider is highest priority).
First model in provider vector = root model.
Provider :base-url auto-resolved from KNOWN_PROVIDERS for known IDs.
Model metadata auto-inferred from :name and merged with provider-scoped pricing/context.
Duplicate provider :id values are a hard error.

`opts` - Optional map:
  :network   - {:timeout-ms N :max-retries N ...} router-level network defaults
  :tokens    - {:check-context? bool :pricing {} :context-limits {}} token defaults
  :budget    - {:max-tokens N :max-cost N} spend limits (nil = no limit)
  :rate-limit - {:same-provider-delays-ms [...] :fallback-after-ms N ...}
  :failure-threshold - Int. Failures before circuit opens (default: 5)
  :recovery-ms       - Int. Ms before open→half-open (default: 60000)

Example:
  (make-router [{:id :blockether :api-key <key>
                 :models [{:name <model-a>} {:name <model-b>}]}
                {:id :openai :api-key <key>
                 :models [{:name <model-a>} {:name <model-b>}]}]
               {:budget {:max-tokens 1000000 :max-cost 5.0}})

source raw docstring

max-input-tokens^clj

(max-input-tokens model)

(max-input-tokens model {:keys [output-reserve trim-ratio context-limits]})

Calculates maximum input tokens for a model, reserving space for output.

Calculates maximum input tokens for a model, reserving space for output.

source raw docstring

MODEL_CONTEXT_LIMITS^clj

Best-effort flattened model context limits for legacy token utilities. When a model exists on multiple providers with different contexts, the most conservative context is used. Provider-aware code should use provider-model-context instead.

Best-effort flattened model context limits for legacy token utilities.
When a model exists on multiple providers with different contexts, the most
conservative context is used. Provider-aware code should use
provider-model-context instead.

source raw docstring

MODEL_PRICING^clj

Best-effort flattened model pricing for legacy token utilities. When a model exists on multiple providers, the lowest total pricing is chosen. Provider-aware code should NOT use this — use provider-model-pricing instead.

Best-effort flattened model pricing for legacy token utilities.
When a model exists on multiple providers, the lowest total pricing is chosen.
Provider-aware code should NOT use this — use provider-model-pricing instead.

source raw docstring

normalize-model^clj

(normalize-model model-map)

Normalizes a model entry: {:name "gpt-4o"} -> full provider-independent model metadata.

Normalizes a model entry: {:name "gpt-4o"} -> full provider-independent model metadata.

source raw docstring

normalize-provider^clj

(normalize-provider idx provider-map)

Normalizes a provider entry:

resolves :base-url from KNOWN_PROVIDERS if not provided
derives :priority from vector index
derives :root from first model
merges provider-independent model metadata with provider-scoped pricing/context

Normalizes a provider entry:
- resolves :base-url from KNOWN_PROVIDERS if not provided
- derives :priority from vector index
- derives :root from first model
- merges provider-independent model metadata with provider-scoped pricing/context

source raw docstring

normalize-reasoning-level^clj

(normalize-reasoning-level v)

Coerce any accepted spelling to a canonical :quick|:balanced|:deep keyword. Accepts:

:quick / :balanced / :deep (keywords, case-insensitive)
"quick" / "balanced" / "deep" (strings, case-insensitive)
OpenAI-style aliases :low→:quick, :medium→:balanced, :high→:deep (so :reasoning_effort migrations don't break). Returns nil for unknown input.

Coerce any accepted spelling to a canonical :quick|:balanced|:deep keyword.
Accepts:
  - :quick / :balanced / :deep (keywords, case-insensitive)
  - "quick" / "balanced" / "deep" (strings, case-insensitive)
  - OpenAI-style aliases :low→:quick, :medium→:balanced, :high→:deep
    (so `:reasoning_effort` migrations don't break).
Returns nil for unknown input.

source raw docstring

provider-excluded-model?^clj

(provider-excluded-model? provider-id model-name)

True when a provider-scoped catalog marks a model unavailable. Provider config may add :exclude-models as exact model names and/or :min-gpt-version such as [5 3] to hide older GPT family models.

True when a provider-scoped catalog marks a model unavailable.
Provider config may add `:exclude-models` as exact model names and/or
`:min-gpt-version` such as [5 3] to hide older GPT family models.

source raw docstring

provider-model-context^clj

(provider-model-context provider-id model-name)

Returns provider-scoped context window for provider/model, falling back to flattened MODEL_CONTEXT_LIMITS.

Returns provider-scoped context window for provider/model, falling back to flattened MODEL_CONTEXT_LIMITS.

source raw docstring

provider-model-entry^clj

(provider-model-entry provider-id model-name)

Returns provider-scoped entry for a provider/model, or nil if excluded.

Composition:

Catalog entry from models.dev (pricing, context, modalities, capability flags, family, knowledge cutoff, release dates) — looked up under :pricing-source if declared, else :id.
svar overlay from KNOWN_PROVIDER_MODELS (wire/policy keys: :api-style, :reasoning-style, :json-object-mode?, :extra-body, plus any pricing/context overrides).

Overlay wins on conflicts. Pricing maps deep-merge so an overlay can override a single rate without dropping :cache-read / :cache-write from the catalog.

Returns provider-scoped entry for a provider/model, or nil if excluded.

Composition:
  1. Catalog entry from models.dev (pricing, context, modalities,
     capability flags, family, knowledge cutoff, release dates) —
     looked up under `:pricing-source` if declared, else `:id`.
  2. svar overlay from KNOWN_PROVIDER_MODELS (wire/policy keys:
     `:api-style`, `:reasoning-style`, `:json-object-mode?`,
     `:extra-body`, plus any pricing/context overrides).

Overlay wins on conflicts. Pricing maps deep-merge so an overlay
can override a single rate without dropping `:cache-read` /
`:cache-write` from the catalog.

source raw docstring

provider-model-pricing^clj

(provider-model-pricing provider-id model-name)

Returns provider-scoped pricing for provider/model, falling back to flattened MODEL_PRICING.

Returns provider-scoped pricing for provider/model, falling back to flattened MODEL_PRICING.

source raw docstring

provider-model-visible?^clj

(provider-model-visible? provider-id model-name)

True when provider-scoped model filters allow model-name.

True when provider-scoped model filters allow `model-name`.

source raw docstring

reasoning-extra-body^clj

(reasoning-extra-body api-style model-map level)

(reasoning-extra-body api-style model-map level {:keys [preserved-thinking?]})

Translates an abstract reasoning level into provider-specific extra-body. Returns nil when:

level is nil / unknown
the selected model is not flagged :reasoning?
the reasoning-style has no mapping in REASONING_LEVELS.

Dispatches on the model's :reasoning-style first (explicit pin), falling back to inference from api-style when the model doesn't declare one.

Callers pass the returned map through merge into their extra-body; silent nil keeps non-reasoning models untouched.

Four-arity form takes an opts map: :preserved-thinking? — Z.ai-only. Emits clear_thinking: false inside the :thinking block, asking the server to retain reasoning_content from previous assistant turns (Preserved Thinking, GLM-5 / GLM-4.7). Callers using this MUST echo the complete, unmodified reasoning_content back to the API in subsequent assistant turns, otherwise cache hit rates and model quality degrade. No-op on non-z.ai reasoning styles and on the Coding Plan endpoint (which has preserved thinking on server-side by default, but setting the flag explicitly is harmless).

Translates an abstract reasoning level into provider-specific extra-body.
Returns nil when:
  - `level` is nil / unknown
  - the selected model is not flagged `:reasoning?`
  - the reasoning-style has no mapping in REASONING_LEVELS.

Dispatches on the model's `:reasoning-style` first (explicit pin), falling
back to inference from `api-style` when the model doesn't declare one.

Callers pass the returned map through merge into their extra-body; silent
nil keeps non-reasoning models untouched.

Four-arity form takes an opts map:
  `:preserved-thinking?` — Z.ai-only. Emits `clear_thinking: false` inside
     the `:thinking` block, asking the server to retain reasoning_content
     from previous assistant turns (Preserved Thinking, GLM-5 / GLM-4.7).
     Callers using this MUST echo the complete, unmodified reasoning_content
     back to the API in subsequent assistant turns, otherwise cache hit
     rates and model quality degrade. No-op on non-z.ai reasoning styles
     and on the Coding Plan endpoint (which has preserved thinking on
     server-side by default, but setting the flag explicitly is harmless).

source raw docstring

REASONING_LEVELS^clj

Abstract reasoning levels translated per reasoning-style. Vocabulary is intentionally provider-neutral — callers pass :quick|:balanced|:deep and svar picks the right on-the-wire shape for the selected model.

Sub-key semantics: :openai-effort → flat top-level :reasoning_effort string. Used by GPT-5.x, o-series, Gemini 2.5 via OpenAI gateway, DeepSeek Reasoner, Copilot, and most OpenAI-compatible reasoners. :anthropic-thinking → Claude thinking controls. Claude Opus 4.7 / Opus 4.6 / Sonnet 4.6 use adaptive thinking + output_config.effort. Older Claude 4 models use manual budget_tokens. :zai-thinking → binary :thinking {:type "enabled"|"disabled"} on Z.ai / GLM-4.6+. No budget_tokens — thinking is on/off. :quick disables, :balanced/:deep enable. See also :preserved-thinking? below for the clear_thinking: false flag that keeps reasoning across assistant turns.

Abstract reasoning levels translated per reasoning-style.
Vocabulary is intentionally provider-neutral — callers pass :quick|:balanced|:deep
and svar picks the right on-the-wire shape for the selected model.

Sub-key semantics:
  `:openai-effort`      → flat top-level `:reasoning_effort` string.
                          Used by GPT-5.x, o-series, Gemini 2.5 via OpenAI gateway,
                          DeepSeek Reasoner, Copilot, and most OpenAI-compatible reasoners.
  `:anthropic-thinking` → Claude thinking controls.
                          Claude Opus 4.7 / Opus 4.6 / Sonnet 4.6 use
                          adaptive thinking + output_config.effort. Older
                          Claude 4 models use manual budget_tokens.
  `:zai-thinking`       → binary `:thinking {:type "enabled"|"disabled"}` on
                          Z.ai / GLM-4.6+. No budget_tokens — thinking is on/off.
                          `:quick` disables, `:balanced`/`:deep` enable.
                          See also `:preserved-thinking?` below for the
                          `clear_thinking: false` flag that keeps reasoning
                          across assistant turns.

source raw docstring

reset-budget!^clj

(reset-budget! router)

Resets the router's token/cost budget counters to zero.

Resets the router's token/cost budget counters to zero.

source raw docstring

reset-provider!^clj

(reset-provider! router provider-id)

Manually resets a provider's circuit breaker to :closed.

Manually resets a provider's circuit breaker to :closed.

source raw docstring

resolve-effective-model^clj

(resolve-effective-model router)

(resolve-effective-model router overrides)

Resolves the effective routed model from the router, optionally applying routing overrides.

Returns a model descriptor map (or nil when no provider is available): {:name :reasoning? :provider :api-style :pricing :context :intelligence :speed ...}

overrides (optional map): :optimize — :cost | :speed | :intelligence (translated to :prefer) :provider — explicit provider id keyword :model — explicit model name string :reasoning — reasoning level keyword (implies reasoning-capable model)

(resolve-effective-model router) ;; root model (resolve-effective-model router {:optimize :cost}) ;; cheapest (resolve-effective-model router {:optimize :intelligence}) ;; frontier (resolve-effective-model router {:provider :openai :model "gpt-5-mini"}) ;; exact

Resolves the effective routed model from the router, optionally applying
routing overrides.

Returns a model descriptor map (or nil when no provider is available):
{:name :reasoning? :provider :api-style :pricing :context :intelligence :speed ...}

`overrides` (optional map):
  :optimize  — :cost | :speed | :intelligence (translated to :prefer)
  :provider  — explicit provider id keyword
  :model     — explicit model name string
  :reasoning — reasoning level keyword (implies reasoning-capable model)

(resolve-effective-model router)                                         ;; root model
(resolve-effective-model router {:optimize :cost})                       ;; cheapest
(resolve-effective-model router {:optimize :intelligence})               ;; frontier
(resolve-effective-model router {:provider :openai :model "gpt-5-mini"}) ;; exact

source raw docstring

resolve-routing^clj

(resolve-routing router routing-opts)

Resolves :routing opts to prefs for with-provider-fallback. Returns {:prefs prefs-map :error-strategy kw}. Throws on invalid provider/model combinations.

:reasoning in the routing opts (abstract level — :quick/:balanced/:deep or strings/aliases) implies :require-reasoning? true in prefs, which filters model selection to :reasoning? true models in resolve-model. This makes {:optimize :cost :reasoning :deep} pick the cheapest reasoning-capable model rather than silently dropping :deep when the cost-cheapest model happens to be non-reasoning.

Resolves :routing opts to prefs for with-provider-fallback.
Returns {:prefs prefs-map :error-strategy kw}.
Throws on invalid provider/model combinations.

`:reasoning` in the routing opts (abstract level — :quick/:balanced/:deep
or strings/aliases) implies `:require-reasoning? true` in prefs, which
filters model selection to `:reasoning? true` models in `resolve-model`.
This makes `{:optimize :cost :reasoning :deep}` pick the cheapest
*reasoning-capable* model rather than silently dropping `:deep` when the
cost-cheapest model happens to be non-reasoning.

source raw docstring

router-stats^clj

(router-stats router)

Returns cumulative + windowed stats for the router.

Returns cumulative + windowed stats for the router.

source raw docstring

select-provider^clj

(select-provider router prefs)

Returns [provider model-map] or nil. Read-only.

Cross-provider ranking: models are scored by :prefer first, provider priority second. So :optimize :intelligence picks the frontier model across the whole fleet; ties are broken by provider vector order.

Returns [provider model-map] or nil. Read-only.

Cross-provider ranking: models are scored by `:prefer` first, provider
priority second. So `:optimize :intelligence` picks the frontier model
across the whole fleet; ties are broken by provider vector order.

source raw docstring

truncate-text^clj

(truncate-text model text max-tokens)

(truncate-text model
               text
               max-tokens
               {:keys [truncation-marker from] :or {from :end}})

Truncates text to fit within a token limit. Uses proper tokenization to ensure accurate truncation.

Truncates text to fit within a token limit.
Uses proper tokenization to ensure accurate truncation.

source raw docstring

with-provider-fallback^clj

(with-provider-fallback router prefs f)

source

cljdoc builds & hosts documentation for Clojure/Script libraries

Keyboard shortcuts

`Ctrl`+`k`	Jump to recent docs
`←`	Move to previous article
`→`	Move to next article
`Ctrl`+`/`	Jump to the search field

Raise an issue Browse cljdoc source Chat on Slack

× close