
com.blockether.svar.core

LLM interaction utilities for structured and unstructured outputs.

SVAR = Structured Validated Automated Reasoning

Provides main functions:
- `ask!` - Structured output using the spec DSL
- `abstract!` - Text summarization using Chain of Density prompting
- `eval!` - LLM self-evaluation for reliability and accuracy assessment
- `refine!` - Iterative refinement using decomposition and verification
- `models!` - Fetch available models from the LLM API
- `sample!` - Generate test data samples matching a spec

Guardrails:
- `static-guard` - Pattern-based prompt injection detection
- `moderation-guard` - LLM-based content moderation
- `guard` - Run one or more guards on input

Humanization:
- `humanize-string` - Strip AI-style phrases from text
- `humanize-data` - Humanize string values in data structures
- `humanizer` - Create a reusable humanizer function

Re-exports the spec DSL (`field`, `spec`, `str->data`, `str->data-with-spec`,
`data->str`, `validate-data`, `spec->prompt`, `build-ref-registry`), and
`make-router`, so users can require only this namespace.

Configuration:
LLM calls route automatically via the default router.

Example:
  (ask! {:spec my-spec
         :messages [(system "Help the user.")
                    (user "What is 2+2?")]
         :model "gpt-4o"})

References:
- Chain of Density: https://arxiv.org/abs/2309.04269
- LLM Self-Evaluation: https://learnprompting.org/docs/reliability/lm_self_eval
- DuTy: https://learnprompting.org/docs/advanced/decomposition/duty-distinct-chain-of-thought
- CoVe: https://learnprompting.org/docs/advanced/self_criticism/chain_of_verification

com.blockether.svar.extension

Extension-facing helpers for router limits and parse diagnostics.

Preserved-thinking handoff is provider-agnostic via the
`:assistant-message` field on `ask!` / `ask-code!` results. The
value is a canonical svar message — `{:role "assistant" :content
[<canonical-blocks>]}` — which callers append to `:messages` on
the next call. Canonical `{:type "thinking"}` content blocks
carry the per-provider preserved-reasoning state under
`:thinking-signature`; svar's wire serializers transform them into
native shapes (Anthropic signed thinking blocks, z.ai
`reasoning_content` field, OpenAI Responses reasoning input items).
Plain chat models without preserved thinking just don't surface
`:assistant-message`, so the same caller pipeline
`(keep :assistant-message results)` works uniformly across every
provider with zero per-provider branching.
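
Example (illustrative sketch; `ask-with-thinking` is a hypothetical
caller-side helper, not part of svar, and the `ask!` option shape
follows the core example):

(require '[com.blockether.svar.core :as svar])

(defn ask-with-thinking
  "Runs ask! once and returns [result next-messages], threading any
  preserved-thinking assistant message into the next turn."
  [messages opts]
  (let [result (svar/ask! (assoc opts :messages messages))]
    [result (cond-> messages
              (:assistant-message result)
              (conj (:assistant-message result)))]))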

com.blockether.svar.internal.codes

Fenced code-block extraction from raw LLM text responses.

Pure parsing. No HTTP, no provider knowledge. Used by `ask-code!` to turn
a plain-text completion into a vector of tagged code blocks the caller
reads/evals directly.

Extraction rules — `extract-code-blocks` recognizes three shapes:
  1. Tagged fence:        ```clojure\n…\n```   →  {:lang "clojure" :source …}
  2. Untagged fence:      ```\n…\n```          →  {:lang nil       :source …}
  3. No fence at all:     entire response      →  {:lang nil       :source …}

`select-blocks` then enforces strict lang matching: ONLY blocks whose
`:lang` equals the caller-supplied target survive. Untagged blocks
(`:lang nil`) — including the fenceless-fallback case — are DROPPED.
Models that want their code accepted MUST tag their fence with the
requested lang.
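
Example (illustrative; the argument order of `select-blocks` is an
assumption based on the description above):

(require '[com.blockether.svar.internal.codes :as codes])

(codes/extract-code-blocks "Sure:\n```clojure\n(+ 1 2)\n```")
;; => [{:lang "clojure" :source "(+ 1 2)"}]

;; Strict matching: an untagged block is dropped even though it parsed.
(codes/select-blocks [{:lang nil :source "(+ 1 2)"}] "clojure")
;; => []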

com.blockether.svar.internal.guard

Input guardrails for LLM interactions.

Provides factory functions that create guards to validate user input:
- `static` - Pattern-based detection of prompt injection attempts
- `moderation` - LLM-based content policy violation detection (requires :ask-fn)
- `guard` - Runs one or more guards on input

Guards are functions that take input and return it unchanged on success,
or throw ExceptionInfo on violation.

Usage:
(require '[com.blockether.svar.core :as svar])
(def my-guards [(static) 
                (moderation {:ask-fn svar/ask! :policies #{:hate}})])
(-> user-input
    (guard my-guards)
    (svar/ask! ...))

com.blockether.svar.internal.humanize

AI response humanization module.

Removes AI-style phrases and patterns from LLM outputs to make responses
sound more natural and human-like.

Two tiers of patterns:
- SAFE_PATTERNS (default): AI identity, refusal, knowledge, punctuation.
  Unambiguously AI-generated; safe for arbitrary text.
- AGGRESSIVE_PATTERNS (opt-in): hedging, overused verbs/adjectives/nouns,
  opening/closing cliches. May match valid English in non-AI text.
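
Example (illustrative, via the `humanize-string` helper re-exported
from core; the rewritten output shown is an assumption, not verified
behavior):

(require '[com.blockether.svar.core :as svar])

(svar/humanize-string "As an AI language model, I believe the answer is 4.")
;; => "I believe the answer is 4."   (illustrative output)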

com.blockether.svar.internal.jsonish

Wrapper for the JsonishParser Java class.

Provides SAP (Schemaless Adaptive Parsing) for malformed JSON from LLMs.
Handles unquoted keys/values, trailing commas, markdown code blocks, etc.
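
Example (illustrative, shown via the re-exported `str->data`, which
presumably builds on this parser; the exact output shape, e.g. keyword
vs. string keys, is an assumption):

(require '[com.blockether.svar.core :as svar])

;; Unquoted keys/values, trailing comma, markdown fence: all tolerated by SAP.
(svar/str->data "```json\n{name: \"Ada\", tags: [a, b],}\n```")
;; => {:name "Ada" :tags ["a" "b"]}   (illustrative)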

com.blockether.svar.internal.llm

LLM client layer: HTTP transport, message construction, and all LLM interaction
functions (ask!, abstract!, eval!, refine!, models!, sample!).

com.blockether.svar.internal.modelsdev

models.dev catalog loader.

Reads the bundled `resources/models.dev.json` snapshot (refreshed via
`make refresh-models`) and exposes a normalized view that downstream
router code merges with `KNOWN_PROVIDERS` wire/policy overlay.

Catalog wins for: pricing, context, modalities, capability flags,
release dates, family.
svar overlay wins for: api-style, reasoning-style, llm-headers,
env-keys, base-url, paths, extra-body, exclude-models, rate budgets,
default-models.

Plan-vs-retail pricing — per-provider entries on models.dev already
reflect plan zeros (e.g. `github-copilot`, `zai-coding-plan` ship
{input:0, output:0}). For svar's `:openai-codex` and
`:anthropic-coding-plan` we explicitly want **retail** pricing
(the user pays at API rates once metered), so the overlay declares
`:pricing-source` to redirect catalog lookup to the retail provider.
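
Example (hypothetical overlay entry; only `:api-style` and
`:pricing-source` are named above, and both values shown are
assumptions):

{:openai-codex
 {:api-style      :openai-responses   ; overlay wins for wire/policy keys
  :pricing-source :openai}}           ; pricing looked up from the retail provider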

com.blockether.svar.internal.router

Router: provider/model registry, circuit breakers, rate limiting, budget tracking,
and routing resolution.

Extracted from defaults.clj (provider/model metadata) and llm.clj (routing logic)
to provide a single cohesive namespace for all routing concerns.

com.blockether.svar.internal.spec

Structured output specification system for LLM responses.

This namespace provides a DSL for defining expected output structures,
converting specs to LLM prompts, and parsing LLM responses back to Clojure data.

Primary functions:
- `field` - Define a field with name, type, cardinality, and description
- `spec` - Create a spec from field definitions
- `build-ref-registry` - Build a registry of referenced specs for nested types
- `spec->prompt` - Generate LLM prompt text from a spec (sent to LLM)
- `str->data` - Parse LLM response string to Clojure data (schemaless)
- `str->data-with-spec` - Parse LLM response with spec-based type coercion
- `validate-data` - Validate parsed data against a spec
- `data->str` - Serialize Clojure data to JSON string

Data Flow:
1. Define spec with `spec` and `field` functions
2. Generate prompt with `spec->prompt` (sent to LLM)
3. Parse response with `str->data-with-spec` (LLM response -> typed Clojure map)
4. Optionally validate with `validate-data`
5. Optionally serialize with `data->str`
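
Example (a sketch of the full flow; the argument shapes of `field`,
`spec`, `str->data-with-spec`, and `validate-data` are assumptions
inferred from the descriptions above):

(require '[com.blockether.svar.core :as svar])

;; 1. Define the spec
(def answer-spec
  (svar/spec
    [(svar/field :answer :string :one "The final answer")
     (svar/field :confidence :double :one "Confidence from 0.0 to 1.0")]))

;; 2. Prompt text to send to the LLM
(svar/spec->prompt answer-spec)

;; 3.-5. Parse, optionally validate, optionally serialize
(let [data (svar/str->data-with-spec
             "{\"answer\": \"4\", \"confidence\": 0.98}" answer-spec)]
  (svar/validate-data answer-spec data)
  (svar/data->str data))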

com.blockether.svar.internal.util

Shared internal utilities.

