Router: provider/model registry, circuit breakers, rate limiting, budget tracking, and routing resolution.
Extracted from defaults.clj (provider/model metadata) and llm.clj (routing logic) to provide a single cohesive namespace for all routing concerns.
(check-context-limit model messages)
(check-context-limit model messages {:keys [output-reserve throw? context-limits] :or {output-reserve DEFAULT_OUTPUT_RESERVE throw? false}})
Checks if messages fit within model context limit.
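The arithmetic behind such a check can be sketched as follows. This is only an illustration: `fits-context?`, its arguments, and the keys of the returned map are hypothetical, and the return shape of the real `check-context-limit` is not documented here.

```clojure
;; Hypothetical helper, not part of the library: compares an input token
;; count against the context limit minus the tokens reserved for output.
(defn fits-context?
  [input-tokens context-limit output-reserve]
  (let [available (- context-limit output-reserve)]
    {:ok?       (<= input-tokens available)
     :available available
     :overflow  (max 0 (- input-tokens available))}))

(fits-context? 120000 128000 4096)
;; => {:ok? true, :available 123904, :overflow 0}
```

With an output reserve of 0 the whole context window is available for input.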
(context-limit model)
(context-limit model context-limits)
Returns the maximum context window size for a model.
Params:
`model` - String. Model name.
`context-limits` - Map, optional. Override map (merged defaults from config).
Returns: Integer. Maximum context tokens.
(count-and-estimate model messages output-text)
(count-and-estimate model messages output-text {:keys [pricing input-tokens api-usage]})
Counts tokens and estimates cost in one call.
(count-messages model messages)
Counts tokens for a chat completion message array.
(count-tokens model text)
Counts tokens for a given text string using the specified model's encoding.
DEFAULT_OUTPUT_RESERVE
Default number of tokens to reserve for model output. 0 means no reservation; the API handles overflow naturally.
Default retry policy for transient HTTP errors.
Default HTTP request timeout in milliseconds (5 minutes). Reasoning models (e.g. glm-5-turbo) may need extended time for chain-of-thought.
(estimate-cost model input-tokens output-tokens)
(estimate-cost model input-tokens output-tokens pricing-map)
Estimates the cost in USD for a given token count.
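A cost estimate of this kind usually multiplies token counts by per-token prices. The sketch below assumes a pricing map of the form `{:input usd-per-1M-tokens :output usd-per-1M-tokens}`; that shape and the name `estimate-cost-sketch` are assumptions for illustration, not the library's documented format.

```clojure
;; Hypothetical sketch: prices are assumed to be USD per one million tokens.
(defn estimate-cost-sketch
  [{:keys [input output]} input-tokens output-tokens]
  (+ (* (/ input-tokens 1e6) input)
     (* (/ output-tokens 1e6) output)))

(estimate-cost-sketch {:input 2.5 :output 10.0} 1000000 500000)
;; => 7.5
```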
(format-cost cost)
Formats a cost value as a human-readable USD string.
(infer-model-metadata {:keys [name] :as model-map})
Returns provider-independent model metadata. Looks up KNOWN_MODEL_METADATA first. Falls back to regex inference for unknown models. Explicit fields in model-map override inferred values.
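The lookup order described above (known table, then regex inference, then explicit overrides) can be sketched with a plain `merge`. The table contents, the metadata keys `:reasoning?` and `:vision?`, and the regexes are all hypothetical; the real KNOWN_MODEL_METADATA and inference rules are not shown in this reference.

```clojure
;; Hypothetical stand-in for KNOWN_MODEL_METADATA.
(def known-metadata
  {"model-a" {:reasoning? false :vision? true}})

;; Hypothetical regex inference for names missing from the table.
(defn infer-by-regex [model-name]
  {:reasoning? (boolean (re-find #"(?i)reason|think" model-name))
   :vision?    (boolean (re-find #"(?i)vision" model-name))})

(defn infer-metadata-sketch
  [{:keys [name] :as model-map}]
  ;; merge is right-biased, so explicit fields in model-map win.
  (merge (or (get known-metadata name) (infer-by-regex name))
         model-map))

(infer-metadata-sketch {:name "foo-thinking" :vision? true})
;; => {:reasoning? true, :vision? true, :name "foo-thinking"}
```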
(make-router providers)
(make-router providers opts)
Creates a router from a vector of provider maps.
Vector order = priority (first provider is highest priority).
First model in provider vector = root model.
Provider :base-url auto-resolved from KNOWN_PROVIDERS for known IDs.
Model metadata auto-inferred from :name and merged with provider-scoped pricing/context.
Duplicate provider :id values are a hard error.
`opts` - Optional map:
:network - {:timeout-ms N :max-retries N ...} router-level network defaults
:tokens - {:check-context? bool :pricing {} :context-limits {}} token defaults
:budget - {:max-tokens N :max-cost N} spend limits (nil = no limit)
:failure-threshold - Int. Failures before circuit opens (default: 5)
:recovery-ms - Int. Ms before open→half-open (default: 60000)
Example:
(make-router [{:id :blockether :api-key <key>
               :models [{:name <model-a>} {:name <model-b>}]}
              {:id :openai :api-key <key>
               :models [{:name <model-a>} {:name <model-b>}]}]
             {:budget {:max-tokens 1000000 :max-cost 5.0}})
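To make the :failure-threshold and :recovery-ms options more concrete, here is a minimal circuit-breaker sketch over plain maps. The function names, the breaker map shape, and the choice to derive :half-open at read time are assumptions for illustration; the router's internal representation is not documented here.

```clojure
;; Hypothetical breaker state: {:state :closed|:open, :failures n, :opened-at ms}

(defn record-failure
  "Counts a failure; opens the circuit once failure-threshold is reached."
  [breaker now-ms failure-threshold]
  (let [failures (inc (:failures breaker 0))]
    (if (>= failures failure-threshold)
      (assoc breaker :state :open :failures failures :opened-at now-ms)
      (assoc breaker :failures failures))))

(defn current-state
  "Read-only view: an :open circuit reads as :half-open after recovery-ms."
  [{:keys [state opened-at]} now-ms recovery-ms]
  (if (and (= state :open)
           (>= (- now-ms opened-at) recovery-ms))
    :half-open
    state))

;; Five failures at times 0..4 ms trip a breaker with threshold 5.
(def tripped
  (reduce (fn [b t] (record-failure b t 5))
          {:state :closed :failures 0}
          (range 5)))

(current-state tripped 1000 60000)   ;; => :open
(current-state tripped 60004 60000)  ;; => :half-open
```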
(max-input-tokens model)
(max-input-tokens model {:keys [output-reserve trim-ratio context-limits]})
Calculates maximum input tokens for a model, reserving space for output.
MODEL_CONTEXT_LIMITS
Best-effort flattened model context limits for legacy token utilities. When a model exists on multiple providers with different contexts, the maximum is used.
MODEL_PRICING
Best-effort flattened model pricing for legacy token utilities. When a model exists on multiple providers, the lowest total pricing is chosen. Provider-aware code should NOT use this; use provider-model-pricing instead.
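The two flattening rules (maximum context, lowest total pricing) can be sketched as below. The provider IDs, model name, and numbers are made up, and reading "total pricing" as input price plus output price is an assumption. The sketch also assumes the model exists on at least one provider.

```clojure
;; Hypothetical provider-scoped data; not the library's actual tables.
(def provider-entries
  {:provider-a {"model-x" {:pricing {:input 3.0 :output 15.0} :context 128000}}
   :provider-b {"model-x" {:pricing {:input 2.5 :output 10.0} :context 200000}}})

(defn entries-for [model-name]
  (keep #(get % model-name) (vals provider-entries)))

(defn flattened-pricing
  "Picks the pricing map with the lowest input + output total."
  [model-name]
  (->> (entries-for model-name)
       (map :pricing)
       (apply min-key #(+ (:input %) (:output %)))))

(defn flattened-context
  "Picks the largest context window across providers."
  [model-name]
  (apply max (map :context (entries-for model-name))))

(flattened-pricing "model-x")  ;; => {:input 2.5, :output 10.0}
(flattened-context "model-x")  ;; => 200000
```

Because the two values may come from different providers, the flattened view can describe a combination no single provider offers, which is why provider-aware code is steered toward the provider-scoped lookups.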
(normalize-model model-map)
Normalizes a model entry: {:name "gpt-4o"} -> full provider-independent model metadata.
(normalize-provider idx provider-map)
Normalizes a provider entry:
- resolves :base-url from KNOWN_PROVIDERS if not provided
- derives :priority from vector index
- derives :root from first model
- merges provider-independent model metadata with provider-scoped pricing/context
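The first three normalization steps can be sketched with `merge` and `map-indexed`. Everything here is illustrative: `known-providers` is a hypothetical stand-in for KNOWN_PROVIDERS with a placeholder URL, storing the model name under :root is an assumption, and the metadata merge step is left out.

```clojure
;; Hypothetical stand-in for KNOWN_PROVIDERS.
(def known-providers
  {:openai {:base-url "https://example.invalid/v1"}})

(defn normalize-provider-sketch
  [idx {:keys [id models] :as provider}]
  (merge {:base-url (get-in known-providers [id :base-url])} ; default
         provider                                            ; explicit wins
         {:priority idx
          :root     (:name (first models))}))

(defn normalize-providers-sketch
  "Normalizes a vector of providers; duplicate :id values throw."
  [providers]
  (let [ids (map :id providers)]
    (when (and (seq ids) (not (apply distinct? ids)))
      (throw (ex-info "Duplicate provider :id" {:ids ids})))
    (vec (map-indexed normalize-provider-sketch providers))))

(normalize-providers-sketch
 [{:id :custom :base-url "https://example.invalid/custom" :models [{:name "m1"}]}
  {:id :openai :models [{:name "m2"} {:name "m3"}]}])
;; first entry:  :priority 0, :root "m1", explicit :base-url kept
;; second entry: :priority 1, :root "m2", :base-url from known-providers
```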
(provider-model-context provider-id model-name)
Returns provider-scoped context window for provider/model, falling back to flattened MODEL_CONTEXT_LIMITS.
(provider-model-entry provider-id model-name)
Returns provider-scoped entry {:pricing ... :context ...} for a provider/model, or nil.
(provider-model-pricing provider-id model-name)
Returns provider-scoped pricing for provider/model, falling back to flattened MODEL_PRICING.
(reset-budget! router)
Resets the router's token/cost budget counters to zero.
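Budget tracking with resettable counters can be sketched with a single atom. The function names and the state shape below are hypothetical; only the :max-tokens and :max-cost limit keys, and the rule that nil means no limit, come from the make-router options.

```clojure
;; Hypothetical budget state held in an atom.
(defn make-budget [limits]
  (atom {:limits limits :tokens 0 :cost 0.0}))

(defn record-usage! [budget tokens cost]
  (swap! budget #(-> %
                     (update :tokens + tokens)
                     (update :cost + cost))))

(defn exhausted?
  "True once either configured limit is reached; a nil limit never trips."
  [budget]
  (let [{:keys [limits tokens cost]} @budget
        {:keys [max-tokens max-cost]} limits]
    (boolean (or (and max-tokens (>= tokens max-tokens))
                 (and max-cost (>= cost max-cost))))))

(defn reset-budget-sketch! [budget]
  (swap! budget assoc :tokens 0 :cost 0.0))

(def b (make-budget {:max-tokens 100 :max-cost nil}))
(record-usage! b 110 0.02)
(exhausted? b)            ;; => true
(reset-budget-sketch! b)
(exhausted? b)            ;; => false
```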
(reset-provider! router provider-id)
Manually resets a provider's circuit breaker to :closed.
(resolve-routing router routing-opts)
Resolves :routing opts to prefs for with-provider-fallback.
Returns {:prefs prefs-map :error-strategy kw}.
Throws on invalid provider/model combinations.
(router-stats router)
Returns cumulative + windowed stats for the router.
(select-provider router prefs)
Returns [provider model-map] or nil. Read-only.
(truncate-text model text max-tokens)
(truncate-text model text max-tokens {:keys [truncation-marker from] :or {from :end}})
Truncates text to fit within a token limit. Uses proper tokenization to ensure accurate truncation.