All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
ask-code! / ask-code!* return maps and :done? true chunks now
carry three extra observation keys alongside the existing
:blocks:
:all-blocks - the pre-select-blocks vec (every fence
extracted, regardless of lang); callers diagnose wrong-lang or
untagged drops via (> (count :all-blocks) (count :blocks)).
:saw-fence? - boolean; true when the raw response contained at
least one fence-shaped line. Lets callers distinguish the
fenceless-fallback path (:saw-fence? false plus one :lang nil block) from a clean fenced response.
:malformed? - boolean; true when the fence parser flagged a
torn boundary (glued close+open or unclosed terminal fence). Use
this to attach a more specific recovery hint than "parse failed".

codes/extract-code-blocks-detail - new internal helper that
returns the full parser observation {:blocks :saw-fence? :malformed?}. The public codes/extract-code-blocks keeps its
bare-vec contract by delegating.

code-tail-pointer-text shrunk from a 4-bullet Rules: block to a
single-line directive: "Reply with ```lang … ``` fenced blocks;
untagged or other-lang fences are DROPPED.". The retired bullets (opener/closer on own line, blank line between blocks, no prose, no glued boundaries) were either over-prescription or already handled by the FenceNormalizer. Saves ~40 tokens per ask-code!
call without weakening the strict-lang contract; the warning that
untagged or wrong-lang fences are silently dropped stays explicit.

DEFAULT_RATE_LIMIT_ROUTING :fallback-after-ms default bumped from
30 000 ms to 60 000 ms. Anthropic, OpenAI, and z.ai routinely emit
Retry-After headers in the 30-60 s range under quota pressure on
reasoning-heavy workloads; a 30 s budget clamped these to ~30 s and
forced cross-provider fallback when the same provider was about to
clear. 60 s lets the same-provider retry schedule complete in most
real-world quota windows while still bounding the wait so a single
user request cannot hang for minutes. Callers that need the prior
behavior set {:router {:rate-limit {:fallback-after-ms 30000}}}
explicitly.

:router :rate-limit policy (:same-provider-delays-ms,
:fallback-after-ms, :respect-retry-after?, :fallback-provider?)
is now a hard cap on the same-provider 429 phase, not a wait floor.
Each configured delay clamps to remaining budget so the loop never
overshoots; once the schedule is exhausted OR elapsed ≥ budget,
the router falls back immediately. The previous "pad to boundary"
reading would have stalled requests deliberately past the budget
and is gone.

:llm.routing/provider-retry events now carry :elapsed-ms and
:error alongside :attempt / :delay-ms; :llm.routing/provider-fallback
events carry :elapsed-ms measured from the first 429 of the
same-provider phase. Persistence + TUI consumers can render the
wait reason without re-deriving from :at-ms diffs.

:on-chunk is threaded from caller opts into resolve-routing prefs
so routing events fire live alongside streaming content for every
routed entrypoint (ask!, ask-code!, abstract!, eval!,
refine!, sample!, routed-chat-completion). Previously they
landed only in the final :routed/trace and the TUI saw nothing
during multi-second 429 retry sleeps.

core-test/abstract!-integration-test (entire defdescribe, 8 cases
across 4 describe blocks) and router-zai-live-test/":deep + preserved succeeds — clear_thinking:false accepted" removed. Every
case asserted on the exact content shape returned by live LLM calls
via the Blockether LiteLLM proxy (gpt-4o) or z.ai (glm-4.7); under
load the proxy intermittently truncates JSON, returns HTTP 500 from
a downstream Copilot auth hiccup, or emits empty content. Each
flake reproduced in CI was an upstream infra problem, never an svar
regression. abstract!-baseline-test and router-zai-live-test's
:quick + preserved variant cover the same code paths without the
flake; once we have a deterministic recording fixture for Blockether
One we can put the integration coverage back behind it.

Time-to-first-token watchdog as a sibling of the idle-stream watchdog.
New :ttft-timeout-ms option on chat-completion / ask! / ask-code!
/ abstract! / eval! / refine! / sample! / routed-chat-completion
bounds the pre-headers phase (before http/post returns and the body
InputStream becomes available). On fire it interrupts the calling
thread inside HttpClient.send -> CompletableFuture.get and surfaces
typed :svar.core/stream-ttft-timeout. Default 90 s
(router/DEFAULT_TTFT_TIMEOUT_MS), matching Anthropic SDK PR #959.
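Two distinct phases, two distinct watchdogs: :ttft-timeout-ms bounds the
pre-headers phase, while :idle-timeout-ms bounds the gaps between bytes
once the body stream is being read.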
Both keys flow through the same precedence chain (caller opts >
router :network > package default) thanks to the new LLM_PASSTHROUGH_KEYS
membership; passing an explicit nil per call disables each
independently. routed-chat-completion now reads router defaults via
the same helper, so direct callers get unified handling without
re-implementing the precedence chain.
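
The precedence chain described above could be sketched roughly as follows. This is a minimal illustration rather than the library's code: resolve-timeout and package-defaults are hypothetical names. The only behaviours taken from this changelog are the ordering (caller opts > router :network > package default), the 90 s and 120 s defaults, and the rule that an explicit nil disables a watchdog.

```clojure
;; Hypothetical stand-in for the package defaults named in this changelog.
(def package-defaults
  {:ttft-timeout-ms 90000
   :idle-timeout-ms 120000})

(defn resolve-timeout
  "Resolves key k with precedence: caller opts > router :network > package default.
  Uses `contains?` rather than `get` so that an explicit nil from the caller
  is returned as nil (watchdog disabled) instead of falling through."
  [caller-opts router k]
  (let [network (:network router)]
    (cond
      (contains? caller-opts k) (get caller-opts k)
      (contains? network k)     (get network k)
      :else                     (get package-defaults k))))

;; Caller disables the TTFT watchdog but still inherits the idle default.
(resolve-timeout {:ttft-timeout-ms nil} {} :ttft-timeout-ms)
(resolve-timeout {:ttft-timeout-ms nil} {} :idle-timeout-ms)
```

The point of the `contains?` check is that each key is resolved independently, so disabling one watchdog per call leaves the other untouched.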
Watchdog tick-resolution capped at 5 s. Previously
start-idle-stream-watchdog! used (quot idle-timeout-ms 4) as the
per-tick sleep, which meant a 120 s default idle timeout parked the
daemon thread for 30 s between checks — callers waiting on shutdown
could sit for up to 30 s after the read loop ended. Clamped to
[100, 5000] ms so caller-side shutdown latency is at most 5 s while
the firing precision still tracks the configured deadline closely.
start-ttft-watchdog! converted from a single Thread/sleep ttft-timeout-ms to a poll loop with the same clamp, eliminating a
long-lived thread parked past caller completion.
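
The clamp can be pictured with a small helper. This is a sketch under an assumption: that the per-tick sleep is still derived from a quarter of the timeout before clamping. The entry only confirms the old (quot idle-timeout-ms 4) formula and the [100, 5000] ms bounds, and watchdog-tick-ms is a hypothetical name.

```clojure
(defn watchdog-tick-ms
  "Per-tick sleep for a polling watchdog: a quarter of the configured
  timeout, clamped to the range [100, 5000] ms.
  ASSUMPTION: the quarter-of-timeout base is carried over from the old formula."
  [timeout-ms]
  (-> (quot timeout-ms 4)
      (max 100)
      (min 5000)))

;; The 120 s default no longer parks the thread for 30 s between checks.
(watchdog-tick-ms 120000) ; => 5000
;; Very short timeouts do not degrade into a busy loop.
(watchdog-tick-ms 200)    ; => 100
```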
Idle-stream watchdog for streaming HTTP responses. New
:idle-timeout-ms option on chat-completion / ask! / ask-code!
closes the SSE InputStream if no bytes arrive within the window and
surfaces a typed :svar.core/stream-idle-timeout ex-info. Default is
router/DEFAULT_IDLE_TIMEOUT_MS (120s / 2 min, matching Anthropic's
own SDK proposal anthropics/anthropic-sdk-typescript#867 for the
per-request override). Pass :idle-timeout-ms nil to disable; bump
to 240-300 s for Opus extended-thinking workloads (anthropics/claude-
agent-sdk-typescript#44 documents ~185 s legitimate silences). Distinct
from the existing :timeout-ms (whole-request cap): the idle watchdog
tolerates arbitrarily long total durations as long as the stream keeps
emitting bytes (content deltas, SSE : ping keepalives, or blank
separators — every .readLine resets the timer, so the watchdog is
ping-aware for free). Motivation: on JDK 25 + HTTP/2 streaming bodies
HttpRequest.Builder.timeout doesn't reliably fire when the upstream
sends headers and then stalls without body frames (real repro: 11-minute
hang on z.ai glm-5.1 past the 5-min request timeout, never raised).
The watchdog is signal-driven so stalls surface in seconds regardless
of the JDK timer's mood.
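
The "every .readLine resets the timer" behaviour can be illustrated with a small sketch. All names here (idle-expired?, read-lines-tracking-activity, the injected clock) are hypothetical and chosen for illustration; the real watchdog also closes the InputStream and raises a typed error, which this sketch leaves out.

```clojure
(import '(java.io BufferedReader StringReader))

(defn idle-expired?
  "True when idle-timeout-ms is set and at least that long has passed
  since the last successful read. A nil timeout means the check is disabled."
  [last-read-ms now-ms idle-timeout-ms]
  (and (some? idle-timeout-ms)
       (>= (- now-ms last-read-ms) idle-timeout-ms)))

(defn read-lines-tracking-activity
  "Reads every line from rdr, storing (clock) into the last-read atom after
  each .readLine. Blank separators and ': ping' keepalives are lines too,
  so they reset the timer just like content deltas."
  [^BufferedReader rdr last-read clock]
  (loop [lines []]
    (if-let [line (.readLine rdr)]
      (do (reset! last-read (clock))
          (recur (conj lines line)))
      lines)))

;; Usage with a fake clock that advances 10 ms per call.
(let [ticks     (atom 0)
      last-read (atom 0)
      rdr       (BufferedReader. (StringReader. "data: a\n\n: ping\n"))]
  (read-lines-tracking-activity rdr last-read #(swap! ticks + 10)))
```

A separate polling thread would then call idle-expired? against the shared atom, which is why total request duration does not matter as long as bytes keep arriving.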
::stream-started and ::stream-headers trove logs around the
streaming HTTP boundary, paired with the existing ::stream-finalized.
Closes the observability gap that made mid-call stalls invisible
(previously only :stop-shaped events were emitted).
ask-code! :lang is now REQUIRED. The previous "clojure" default is
gone; callers must pass an explicit non-blank string. Throws
:svar.core/invalid-lang otherwise.

select-blocks now drops :lang nil (untagged) blocks unconditionally
instead of treating them as a wildcard match. Models MUST tag their
Markdown code block with the requested lang. The lenient+ fenceless-
fallback path in extract-code-blocks still produces a :lang nil
block, but it no longer survives select-blocks.

ask-code! return map no longer contains :result (the concatenated
source string). :blocks is the single source of truth; callers that
want a concatenated string call codes/concat-sources themselves. The
streaming on-chunk payload also drops :result.

code-tail-pointer-text rewritten as a compact 5-line Rules: list.
Now spells out the strict-lang contract ("Untagged or other-lang blocks
are DROPPED") so models learn the rule in-context instead of being
punished invisibly. Says "Markdown code blocks" instead of "fences"
for precision. ~280 chars vs the previous ~480.

ask-code!*
used to invoke codes/extract-code-blocks on every SSE delta over the
full accumulated buffer. Combined with (?m)…$ regex normalizers,
total work was quadratic in stream size. Live repro on glm-5.1 wedged a
Vis TUI virtual thread in java.util.regex.Pattern$BmpCharPropertyGreedy.match
for minutes (Vis conversation 0c8188ac). Streaming on-chunk now
signals progress only (:raw, :reasoning, :done?); the final
parse happens exactly once after the stream closes and is surfaced in
the :done? true chunk and the return value. Mid-stream :result /
:blocks are nil, a deliberate contract change.

com.blockether.svar.FenceNormalizer (Java). Single linear scan
over the input. Three fence transforms (opener-split,
inline-boundary-split, closer-split) folded into one fused per-line
emit. Fast paths for inputs without carriage returns and without
backticks. Replaces the prior (?m)… regex pipeline in
codes/normalize-*.

com.blockether.svar.FenceBlocksParser (Java). Line-based fence
block parser. Body materialised with one String.substring per block
(bodies are contiguous slices of the normalized input). Replaces the
Clojure parse-fenced-blocks loop.

bench/ with :bench alias (criterium
0.4.6). Reproduces the Vis hang scenario and tracks throughput of
the new Java parsers.

extract-code-blocks end-to-end on a 0.87 MB / 12 000-line fenced
response: 2.80 ms (~310 MB/s). Pre-fix: minutes of CPU.
FenceNormalizer/normalize: 1.02 ms on the same input.
FenceBlocksParser/parse: 1.54 ms on the same input.
pathological-line (78 KB single line with stray backticks +
glued closer): 166 µs.

codes/normalize-fence-openers, normalize-fence-closers,
normalize-inline-fence-boundaries, parse-fenced-blocks,
malformed-fence-fragment? and the FENCE_LINE_RE constant are
gone. Their behaviour now lives in the Java parsers behind the
existing public codes/extract-code-blocks API.

resources/models.dev.json (1.9 MB, 118 providers) drives pricing,
context, modalities, cache-read/write, family, capability flags,
knowledge cutoff, and release dates for every known provider model.
Refresh with make refresh-models.

com.blockether.svar.internal.modelsdev exposing catalog,
provider-models, provider-meta, normalize-model, resolve-models,
and merge-overlay.

:pricing-source overlay key on KNOWN_PROVIDERS redirects catalog
lookup to a different provider id. :openai-codex and
:anthropic-coding-plan now meter at retail OpenAI / Anthropic
rates (honest accounting once plan quota is exceeded).

:modalities, :cache-read, :cache-write, :input-limit,
:output-limit, :family, :knowledge-cutoff, :release-date,
:tool-call?, :attachment?, :open-weights?, :temperature?
surface on every provider-model-entry / (:models provider).

KNOWN_PROVIDER_MODELS slimmed to wire/policy-only overlays.
Pricing/context flow from the catalog by default; overlays add only
what the catalog can't express (Anthropic 5m/1h cache tiers,
OpenAI long-context tiers, GLM :json-object-mode?, Copilot
per-model :extra-body + :reasoning-style).

provider-model-entry returns catalog ⊕ overlay merge (overlay wins
on wire keys, pricing maps deep-merge so overlay rate overrides keep
catalog :cache-read / :cache-write).

MODEL_CONTEXT_LIMITS and MODEL_PRICING now union catalog + overlay
via the new private merged-provider-models, so legacy
tokens/estimate-cost and context-limit see the full catalog.

:zai-coding no longer duplicates :zai's GLM table; inherits via
new :provider-model-source :zai (overlay) +
:pricing-source :zai (retail metering for subscription overage).

resources/ is now on :paths in deps.edn and copied into the jar
by build.clj so the bundled catalog ships with every release.

Built-in :blockether provider removed (KNOWN_PROVIDERS entry, model table,
BLOCKETHER_LLM_DEFAULT_MODEL env fallback, README built-in mention).
Still fully usable as a user-supplied custom provider — callers pass
:base-url and per-model :pricing / :reasoning? like any other
custom provider.

models! is now OAuth-aware and platform-agnostic. The internal http-get!
routes through the same make-llm-headers dispatcher chat does, so
Anthropic OAuth tokens (Claude Code subscription) attach
anthropic-version, anthropic-beta, user-agent, x-app; Anthropic
API keys attach x-api-key; everything else falls back to bearer auth.
Provider :llm-headers (e.g. Codex chatgpt-account-id) merge on top.

:models-path + :models-query-params hooks in
KNOWN_PROVIDERS. :openai-codex now points at
/codex/models?client_version=1.0.0 to surface the live Codex
inference fleet (gpt-5.5, gpt-5.4, gpt-5.3-codex, ...). The bare
/models route on the same host returns chatgpt.com product
metadata, not inference models.

normalize-models-response accepts both OpenAI/Anthropic
{:data [...]} and ChatGPT-backend {:models [{:slug ...}]}
shapes; :slug promotes to :id so downstream filters work
unchanged.

models! returns [] on HTTP failure by default; pass
{:strict? true} to surface the underlying ex-info.

/v1/models no longer 400s on
"anthropic-version: header is required" — the OAuth header set
flows through the same code path used for /v1/messages.

ask-code! / ask-code!*: plain-text completion + fenced code-block
extraction. Sibling of ask! for callers that want raw source (e.g.
Clojure for an RLM agent loop) instead of a structured JSON envelope.
No spec, no schema-prompt inlining, no JSON-mode tricks. Sends
:messages verbatim, parses the assistant response, filters by
:lang (default "clojure"), and returns the concatenated source.
Returns {:result :blocks :raw :reasoning :tokens :cost :duration-ms};
empty :result (no matching blocks) is a valid success — the caller
decides what to do with it. Throws on transport-level failures only
(:svar.llm/empty-content, HTTP errors).

ask-code! / ask-code!*: :code-tail-pointer? option (default
true). Mirrors the ask!* schema-tail-pointer feature for the
fenced-code path: appends a short, lang-aware reminder as the LAST
text block of the LAST user message ("Reply with lang source inside
lang … fences. …"). Restores recency-driven format adherence on
long transcripts without burning a cache breakpoint. Set to false to
opt out (e.g. quirky local models that double-emit on reminders).

extract-code-blocks (re-exported on svar.core): pure utility that
parses fenced code blocks from a raw text string. Returns a vector of
{:lang <str-or-nil> :source <str>}. Lenient+: matches ```clojure and ``` (untagged) fences, and falls back to treating the entire
input as one untagged block when no fence is present. Multi-block aware.

ask! / ask!*: :format-retries option (default 0). When the
provider returns content that fails schema parsing
(:svar.spec/schema-rejected, :svar.spec/required-field-missing),
svar can re-prompt the model with a tiny FORMAT-RETRY turn and try
again locally instead of bubbling the failure to the caller. Each
attempt is recorded in the result map under :format-attempts (only
surfaced when retries actually happen) and in the terminal exception's
ex-data. Lets agent loops absorb provider-format noise without burning
user-visible iteration budget. Configurable via :format-retry-on
(default set: #{:svar.spec/schema-rejected :svar.spec/required-field-missing};
callers can opt in to retrying :svar.llm/empty-content too).

ask! / ask!*: :json-object-mode? option. When true on :openai
api-style providers, auto-injects response_format: {type: "json_object"}
into the request body. Hardens models that historically leak prose into
content under :deep reasoning. Defaults to model metadata: GLM family
(glm-5.1, glm-4.7, glm-5-turbo, glm-4.6, glm-4.6v) is opted in
by default across :zai, :zai-coding, and :blockether providers.
Caller's explicit :extra-body :response_format always wins.

ask!: :on-format-error routing strategy. :fail (default) preserves
current behavior; :fallback-provider treats schema/format-typed errors
as transient and tries the next provider/model in the fleet, excluding
the offender. The terminal exception (when all providers fail) carries
the LAST format error's full envelope plus :routed/trace and
:format-failed.

:svar.llm/empty-content exceptions now carry the
full forensic envelope (:model, :api-style, :chat-url,
:duration-ms, :api-usage, :reasoning, :content, :http-response,
:format-attempts) verbatim in ex-data — no truncation. Lets callers
reproduce / persist / display the failing call without scraping logs.

The schema prompt (SCHEMA_ENFORCEMENT_BANNER rendered into
every spec prompt) now explicitly states the top-level value must be a
JSON object (not a JSON string) and the first non-whitespace character
must be {. Reduces the GLM prose-leak rate without auto-injection.

Helpers that call ask!* internally (abstract!,
abstract!*, eval!, eval!*, refine!, refine!*, sample!)
now propagate the new options (:format-retries, :format-retry-on,
:json-object-mode?, :on-format-error, :cache-system?,
:extra-body, :timeout-ms, :check-context?, :output-reserve)
through every internal LLM call. Previously these were silently
dropped at the select-keys boundary inside helpers — e.g.
(svar/abstract! router {:format-retries 2 ...}) would have ignored
the retry budget on every CoD iteration. Centralised through a new
private LLM_PASSTHROUGH_KEYS constant + llm-passthrough helper so
future svar-level options stay consistent across the public surface.

Specs referenced only transitively (e.g. Group.items: item[] when only
:group is in the main spec's :refs) used to be partitioned as
"unused" and dropped from the rendered prompt; the LLM saw
items: item[] with no item { … } definition and degraded into
positional arrays. Adds a regression test under
spec->prompt-test → transitive ref usage.

::values vector shorthand: enum fields can now declare values as a
plain vector ["high" "medium" "low"] instead of a {value desc}
map. spec->prompt emits the inline "a" or "b" type union but
skips the per-value comment block, a meaningful saving on system
prompts that declare self-explanatory enums (confidence, model class, answer-type, etc.). The validator treats both shapes
identically. Back-compatible: existing {value desc} maps keep
emitting per-value docs unchanged.

Hook system v3: per-tool :before/:after/:wrap chains and a global lifecycle :hooks map, both wired through register-env-fn! and query-env!. Policy (deny / transform / recover) lives in per-tool chains; observation (logging / metrics / UI streaming) lives in global hooks. See rlm.tools/normalize-hooks / execute-tool / wrap-tool-for-sci for the full engine.
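
As a rough picture of ring-style :wrap composition, where the last element of the vector ends up outermost (the convention this entry states for :wrap), consider the sketch below. It is generic Clojure and not the rlm.tools engine: compose-wraps and tag are hypothetical helpers written only for this illustration.

```clojure
(defn compose-wraps
  "Applies ring-style middleware to handler. Each wrap is
  (fn [handler] wrapped-handler). Reducing left to right means the LAST
  element of wraps becomes the outermost wrapper, like (-> handler inner outer)."
  [handler wraps]
  (reduce (fn [h wrap] (wrap h)) handler wraps))

(defn tag
  "Hypothetical middleware that records enter/exit events in the log atom."
  [label log]
  (fn [handler]
    (fn [inv]
      (swap! log conj [:enter label])
      (let [out (handler inv)]
        (swap! log conj [:exit label])
        out))))

;; :outer is last in the vector, so it is entered first and exited last.
(let [log     (atom [])
      wrapped (compose-wraps inc [(tag :inner log) (tag :outer log)])]
  (wrapped 1)
  @log)
```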
Per-tool chains (attached via register-env-fn! tool-def):
:before - each hook receives the invocation map, may return {:args v} (transform args), {:skip v} (short-circuit, :after still runs), {:error e} (short-circuit to error), or nil.
:after - receives the outcome map, may return {:result v}, {:error e}, {:result v :error nil} (recover from failure), or nil. Independent sequential chain, NOT paired setup/teardown.
:wrap - ring-style middleware, vector is vec-LAST = outermost (matches (-> handler inner outer) convention).
Calling register-env-fn! twice on the same symbol merges hooks by :id: same id replaces in place, new ids append, old ids are preserved.
(:invoke inv-map) - call another registered tool through its own per-tool chain, bypassing global observers. Depth-tracked via explicit query-ctx (:depth), cap MAX_HOOK_DEPTH = 8.

Global lifecycle hooks (query-env! :hooks {:on-iteration ... :on-cancel ...}), all pure observers, return ignored, exceptions swallowed:
:on-iteration - fires after store-iteration! with {:iteration :status :thinking :executions :final-result :error :duration-ms}. Status ∈ #{:error :empty :success :final}. Replaces the top-level :on-iteration opt.
:status-id namespaced keywords (for example :rlm.status/error, :rlm.status/cancelled) alongside legacy :status.
:on-cancel - fires when the cancel-atom is observed true.
:on-chunk - migrated from top-level :on-chunk opt into :hooks.
:on-tool-invoked / :on-tool-completed - fire around the per-tool pipeline for tools registered via register-env-fn!.

query-env! opts:
:hooks {...} - canonical entry point for global lifecycle hooks.
:cancel-atom (atom false) - caller-owned atom. Flip from any thread to cancel the in-progress query; the iteration loop finishes the current cycle and returns {:status :cancelled}. If omitted, query-env! creates a fresh one locally.
:eval-timeout-ms - unchanged. Clamped [1s, 30min].

Inspection / maintenance:
rlm/list-tool-hooks env sym → {:before [{:id :position :fn-name}] :after [...] :wrap [...]}
rlm/list-registered-tools env → [{:sym :hook-counts {:before :after :wrap}}]
rlm/unregister-hook! env sym stage id → true/false

Other additions (unchanged from prior unreleased shipping):
rlm.schema/*eval-timeout-ms* dynamic var replacing the former hardcoded EVAL_TIMEOUT_MS constant.
rlm.schema/MIN_EVAL_TIMEOUT_MS (1s) and rlm.schema/MAX_EVAL_TIMEOUT_MS (30min) - hard bounds to prevent runaway SCI futures.
search-entities, get-entity, list-relationships bound from existing rlm.db fns.
pageindex.vision/extract-text-from-pdf: :extraction-strategy enum validation (throws on non-#{:vision :ocr}).

rlm/ingest-git! opens git repos via JGit (no shell-out), reads commits, stores them as :event + :person + :file entities with relationships, and attaches each open Repository to the env keyed by repo-name. Multi-repo by default: call ingest-git! multiple times with distinct :repo-name values and every attached repo remains queryable. .gitignore does not affect JGit, so repos living inside gitignored subdirectories (e.g. external-repos/foo/.git) work transparently. The SCI sandbox exposes seven git query tools, all prefixed git-, that remain invisible unless ingest-git! has been called:
:document-id): git-search-commits, git-commit-history, git-commits-by-ticket
:repo opt needed): git-file-history, git-blame, git-commit-diff, git-commit-parents
:rlm/no-repo-for-path with :reason :relative-path otherwise).
HEAD are ambiguous in multi-repo mode and throw :rlm/ambiguous-ref.
GIT REPO context block per attached repo (name, path, head short-sha, branch, commits-ingested). In multi-repo mode the tool-doc list advertises "pass ABSOLUTE path" guidance.
dispose-env! closes every attached Repository and clears both atoms.
org.eclipse.jgit/org.eclipse.jgit 6.10.0.
rlm.schema commit attrs: :commit/parents (many strings, for graph walking) and :commit/author-email (denormalized for db-search-commits queries by author).
rlm.db/db-search-commits - query commit entities by :category, :since, :until, :ticket, :path, :author-email, :document-id, :limit. Backs the search-commits / commit-history / commits-by-ticket SCI tools.
rlm.db/db-commit-by-sha - SHA-prefix lookup of a single commit entity.

rlm/query-env! opt changes. Top-level :on-chunk and :on-iteration are removed. Move them into the :hooks map: {:hooks {:on-chunk ... :on-iteration ...}}.
rlm/cancel-query! removed. Cancellation is caller-owned now: create an (atom false), pass it via :cancel-atom opt to query-env!, and (reset! the-atom true) from any thread to cancel.
:cancel-atom. Callers that reached into the env to flip cancellation must pass their own atom via the :cancel-atom query-env! opt.
rlm.tools hook engine refactor. Replaced transient hook dynamic bindings with explicit per-query context threading (query-ctx) plus caller-owned atoms (:cancel-atom, :current-iteration-atom) where mutation is required.
:custom-bindings-atom for fns - register-env-fn! writes to the new :tool-registry-atom instead. :custom-bindings-atom is still used by register-env-def! for constants/values.
rlm.core into rlm.data helper functions (store-extraction-results! + normalized entity/relationship tx builders).
internal/llm.clj shared-http-client: HTTP/1.1 pin now documented as load-bearing for OCR. Mirror note added in pageindex/vision.clj above the OCR section.
rlm/git.clj rewritten: replaced clojure.java.shell/sh with JGit interop. Pure parsers (parse-commit-message, extract-ticket-refs, prefix->category, commit->entity, ingest-commits!) preserved and unit-tested. New IO surface: open-repo, git-available?, read-commits, head-info, blame, commit-diff, file-history, commit-parents. The old read-git-log / custom git log --format=sha:%H%n... string parser is gone.

.omc/plans/autopilot-impl.md)
query-env!.
:iteration.var/schema-version + re-exec mode - schema change; needs review.
:extract-entities? default to true - breaking; needs explicit consent.