Router: provider/model registry, circuit breakers, rate limiting, budget tracking, and routing resolution.
Extracted from defaults.clj (provider/model metadata) and llm.clj (routing logic) to provide a single cohesive namespace for all routing concerns.
(check-context-limit model messages)
(check-context-limit model messages {:keys [output-reserve throw? context-limits] :or {output-reserve DEFAULT_OUTPUT_RESERVE throw? false}})
Checks if messages fit within model context limit.
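A usage sketch. The `svar.router` namespace name, the `router` alias, and the return shape are assumptions; only the signatures above come from the documentation.

```clojure
(require '[svar.router :as router]) ; namespace name assumed

;; Two-arity form: default output reserve, returns instead of throwing.
(router/check-context-limit "gpt-4o"
                            [{:role "user" :content "Summarize this document ..."}])

;; Map-arity form: reserve 4096 tokens for output and throw on overflow.
(router/check-context-limit "gpt-4o" messages
                            {:output-reserve 4096 :throw? true})
```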
(context-limit model)
(context-limit model context-limits)
Returns the maximum context window size for a model.
Params:
model - String. Model name.
context-limits - Map, optional. Override map (merged defaults from config).
Returns: Integer. Maximum context tokens.
(count-and-estimate model messages output-text)
(count-and-estimate model messages output-text {:keys [pricing input-tokens api-usage]})
Counts tokens and estimates cost in one call.
(count-messages model messages)
Counts tokens for a chat completion message array.
(count-tokens model text)
Counts tokens for a given text string using the specified model's encoding.
Default number of tokens to reserve for model output. 0 means no reservation — let the API handle overflow naturally.
Default retry policy for transient HTTP errors.
Default HTTP request timeout in milliseconds (5 minutes). Reasoning models (e.g. glm-5-turbo) may need extended time for chain-of-thought.
(estimate-cost model input-tokens output-tokens)
(estimate-cost model input-tokens output-tokens pricing-map)
Estimates the cost in USD for a given token count.
(format-cost cost)
Formats a cost value as a human-readable USD string.
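A sketch composing the two cost helpers, assuming a `router` alias for this namespace. Token counts are illustrative, and the dollar figure depends on the configured pricing map, so no exact output is shown.

```clojure
;; Estimate the cost of 1200 input + 300 output tokens, then
;; format it as a USD string.
(router/format-cost
 (router/estimate-cost "gpt-4o" 1200 300))
```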
(infer-model-metadata {:keys [name] :as model-map})
Returns provider-independent model metadata. Looks up KNOWN_MODEL_METADATA first. Falls back to regex inference for unknown models. Explicit fields in model-map override inferred values.
Per-model static metadata. `:reasoning?` flags a model whose provider accepts a reasoning-depth parameter. `:reasoning-style` (optional) pins the wire shape to emit — see `REASONING_LEVELS` keys. When omitted, the style is inferred from the provider's `:api-style` (`:anthropic` → anthropic thinking, everything else → openai-effort).
(make-router providers)
(make-router providers opts)
Creates a router from a vector of provider maps.
Vector order = priority (first provider is highest priority).
First model in provider vector = root model.
Provider :base-url auto-resolved from KNOWN_PROVIDERS for known IDs.
Model metadata auto-inferred from :name and merged with provider-scoped pricing/context.
Duplicate provider :id values are a hard error.
`opts` - Optional map:
:network - {:timeout-ms N :max-retries N ...} router-level network defaults
:tokens - {:check-context? bool :pricing {} :context-limits {}} token defaults
:budget - {:max-tokens N :max-cost N} spend limits (nil = no limit)
:failure-threshold - Int. Failures before circuit opens (default: 5)
:recovery-ms - Int. Ms before open→half-open (default: 60000)
Example:
(make-router [{:id :blockether :api-key <key>
:models [{:name <model-a>} {:name <model-b>}]}
{:id :openai :api-key <key>
:models [{:name <model-a>} {:name <model-b>}]}]
{:budget {:max-tokens 1000000 :max-cost 5.0}})

(max-input-tokens model)
(max-input-tokens model {:keys [output-reserve trim-ratio context-limits]})
Calculates maximum input tokens for a model, reserving space for output.
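A sketch pairing `max-input-tokens` with `context-limit`, assuming a `router` alias for this namespace; the concrete numbers returned depend on the configured context limits.

```clojure
;; Room left for input after reserving 4096 tokens for output.
(router/max-input-tokens "gpt-4o" {:output-reserve 4096})

;; Compare with the model's raw context window.
(router/context-limit "gpt-4o")
```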
Best-effort flattened model context limits for legacy token utilities. When a model exists on multiple providers with different contexts, the maximum is used.
Best-effort flattened model pricing for legacy token utilities. When a model exists on multiple providers, the lowest total pricing is chosen. Provider-aware code should NOT use this — use provider-model-pricing instead.
(normalize-model model-map)
Normalizes a model entry: {:name "gpt-4o"} -> full provider-independent model metadata.
(normalize-provider idx provider-map)
Normalizes a provider entry:
- resolves :base-url from KNOWN_PROVIDERS if not provided
- derives :priority from vector index
- derives :root from first model
- merges provider-independent model metadata with provider-scoped pricing/context
(normalize-reasoning-level v)
Coerce any accepted spelling to a canonical :quick|:balanced|:deep keyword.
Accepts:
- :quick / :balanced / :deep (keywords, case-insensitive)
- "quick" / "balanced" / "deep" (strings, case-insensitive)
- OpenAI-style aliases :low→:quick, :medium→:balanced, :high→:deep
(so `:reasoning_effort` migrations don't break).
Returns nil for unknown input.

(provider-model-context provider-id model-name)
Returns provider-scoped context window for provider/model, falling back to flattened MODEL_CONTEXT_LIMITS.
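The level normalization described under `normalize-reasoning-level` can be sketched as follows (a `router` alias is assumed; the results in the comments restate what the docstring specifies):

```clojure
(router/normalize-reasoning-level :deep)      ; => :deep
(router/normalize-reasoning-level "Balanced") ; => :balanced (case-insensitive)
(router/normalize-reasoning-level :high)      ; => :deep (OpenAI-style alias)
(router/normalize-reasoning-level :unknown)   ; => nil
```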
(provider-model-entry provider-id model-name)
Returns provider-scoped entry {:pricing ... :context ...} for a provider/model, or nil.
(provider-model-pricing provider-id model-name)
Returns provider-scoped pricing for provider/model, falling back to flattened MODEL_PRICING.
(reasoning-extra-body api-style model-map level)
(reasoning-extra-body api-style model-map level {:keys [preserved-thinking?]})
Translates an abstract reasoning level into provider-specific extra-body.
Returns nil when:
- `level` is nil / unknown
- the selected model is not flagged `:reasoning?`
- the reasoning-style has no mapping in REASONING_LEVELS.
Dispatches on the model's `:reasoning-style` first (explicit pin), falling
back to inference from `api-style` when the model doesn't declare one.
Callers pass the returned map through merge into their extra-body; silent
nil keeps non-reasoning models untouched.
Four-arity form takes an opts map:
`:preserved-thinking?` — Z.ai-only. Emits `clear_thinking: false` inside
the `:thinking` block, asking the server to retain reasoning_content
from previous assistant turns (Preserved Thinking, GLM-5 / GLM-4.7).
Callers using this MUST echo the complete, unmodified reasoning_content
back to the API in subsequent assistant turns, otherwise cache hit
rates and model quality degrade. No-op on non-z.ai reasoning styles
and on the Coding Plan endpoint (which has preserved thinking on
server-side by default, but setting the flag explicitly is harmless).
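The merge pattern described for `reasoning-extra-body` can be sketched like this. A `router` alias is assumed, and the model map is a hypothetical example whose shape is inferred from the docstrings:

```clojure
;; A reasoning-capable model map (shape assumed from the docstrings).
(def model-map {:name "claude-sonnet-4" :reasoning? true})

;; Merge the translated shape into an outgoing request body; a nil
;; result merges as a no-op, so non-reasoning models stay untouched.
(merge {:model "claude-sonnet-4" :messages messages}
       (router/reasoning-extra-body :anthropic model-map :deep))
;; Per the docs, the :anthropic style nests a :thinking block with a
;; :budget_tokens value taken from REASONING_LEVELS.
```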
Abstract reasoning levels translated per reasoning-style.
Vocabulary is intentionally provider-neutral — callers pass :quick|:balanced|:deep
and svar picks the right on-the-wire shape for the selected model.
Sub-key semantics:
`:openai-effort` → flat top-level `:reasoning_effort` string.
Used by GPT-5.x, o-series, Gemini 2.5 via OpenAI gateway,
DeepSeek Reasoner, and most OpenAI-compatible reasoners.
`:anthropic-thinking` → nested `:thinking {:type "enabled" :budget_tokens N}`.
Budget magnitudes fit within 200k-context Claude 4.x
max_tokens windows; tune if you hit ceilings.
`:zai-thinking` → binary `:thinking {:type "enabled"|"disabled"}` on
Z.ai / GLM-4.6+. No budget_tokens — thinking is on/off.
`:quick` disables, `:balanced`/`:deep` enable.
See also `:preserved-thinking?` below for the
`clear_thinking: false` flag that keeps reasoning
across assistant turns.

(reset-budget! router)
Resets the router's token/cost budget counters to zero.
(reset-provider! router provider-id)
Manually resets a provider's circuit breaker to :closed.
(resolve-routing router routing-opts)
Resolves :routing opts to prefs for with-provider-fallback.
Returns {:prefs prefs-map :error-strategy kw}.
Throws on invalid provider/model combinations.
`:reasoning` in the routing opts (abstract level — :quick/:balanced/:deep
or strings/aliases) implies `:require-reasoning? true` in prefs, which
filters model selection to `:reasoning? true` models in `resolve-model`.
This makes `{:optimize :cost :reasoning :deep}` pick the cheapest
*reasoning-capable* model rather than silently dropping `:deep` when the
cost-cheapest model happens to be non-reasoning.

(router-stats router)
Returns cumulative + windowed stats for the router.
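A sketch tying `resolve-routing`, `router-stats`, and `reset-budget!` together. `r` is a placeholder for a router built with `make-router`, and the exact keys inside `:prefs` are not specified beyond what the docstrings state:

```clojure
;; :reasoning :deep implies :require-reasoning? true in prefs, so only
;; :reasoning? true models qualify for selection.
(router/resolve-routing r {:optimize :cost :reasoning :deep})
;; => {:prefs {...} :error-strategy ...}

(router/router-stats r)   ; cumulative + windowed stats
(router/reset-budget! r)  ; zero the token/cost budget counters
```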
(select-provider router prefs)Returns [provider model-map] or nil. Read-only.
Cross-provider ranking: models are scored by :prefer first, provider
priority second. So :optimize :intelligence picks the frontier model
across the whole fleet; ties are broken by provider vector order.
(truncate-text model text max-tokens)
(truncate-text model text max-tokens {:keys [truncation-marker from] :or {from :end}})
Truncates text to fit within a token limit. Uses proper tokenization to ensure accurate truncation.
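A usage sketch, assuming a `router` alias; `long-text` is a placeholder, and the `:from :start` value is an assumption mirroring the documented default of `:from :end`:

```clojure
;; Keep the tail of the text within 8000 tokens and mark the cut point.
(router/truncate-text "gpt-4o" long-text 8000
                      {:from :start :truncation-marker "[...]"})
```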