llm.sdk.providers — net.clojars.deadmeme5441/clojure-llm-sdk 0.1.0

llm.sdk.providers.anthropic.chat

Anthropic Messages API transport adapter. Supports thinking blocks, cache_control, tool use, streaming deltas. Preserves provider-specific replay state (reasoning_details, signatures).

llm.sdk.providers.bedrock-image

Compatibility shim. Implementation lives in llm.sdk.providers.bedrock.image.

llm.sdk.providers.bedrock.converse

AWS Bedrock Converse API transport adapter.

Auth: AWS Signature V4 — sdk/complete dispatches on :profile/auth-strategy :aws-sigv4 and signs the request via llm.sdk.aws-sigv4 just before the HTTP send.

Streaming: Bedrock's /converse-stream emits binary event-stream frames (vnd.amazon.eventstream). sdk/complete reads the raw InputStream via llm.sdk.aws-eventstream/frame-seq and hands each parsed frame to parse-stream-event-bedrock as a map.

Model-id mapping: canonical short ids (e.g. claude-sonnet-4-5, nova-pro) are mapped to Bedrock's region-versioned id format (e.g. anthropic.claude-sonnet-4-5-20250101-v1:0); unknown ids pass through verbatim so callers can provide explicit ARNs.
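
The model-id mapping described above amounts to a lookup with a pass-through default. A minimal sketch, assuming a hypothetical `short->bedrock-id` table and `resolve-model-id` function (neither name is taken from the SDK); the single table entry is the example pair quoted in the docstring:

```clojure
;; Hypothetical lookup table; the SDK's real table and names may differ.
(def short->bedrock-id
  {"claude-sonnet-4-5" "anthropic.claude-sonnet-4-5-20250101-v1:0"})

(defn resolve-model-id
  "Maps a canonical short id to its Bedrock id. Unknown ids (for example
  explicit ARNs) are returned unchanged."
  [model-id]
  (get short->bedrock-id model-id model-id))
```

Using the input itself as the `get` default is what lets explicit ARNs pass through without a separate branch.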

llm.sdk.providers.bedrock.image

Bedrock image-generation adapter (Titan Image Generator + Stability
SD3 / SDXL). All use bedrock-runtime /model/{id}/invoke with SigV4.
Each model has a different body shape:

  amazon.titan-image-generator-v1 / -v2:0
    {:taskType "TEXT_IMAGE"
     :textToImageParams {:text "..."}
     :imageGenerationConfig {:numberOfImages N :width W :height H :cfgScale 8 :seed 0}}
    response {:images ["b64", ...]}

  stability.stable-diffusion-xl-v1
    {:text_prompts [{:text "..." :weight 1.0}]
     :cfg_scale N :seed N :steps 30}
    response {:artifacts [{:base64 "..."}]}

We dispatch on a substring match against the model id and route to
the matching builder/parser pair.
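
The substring dispatch can be sketched as follows. The function names, the canonical request keys (`:prompt`, `:n`, `:width`, `:height`, `:cfg-scale`, `:seed`), and the exact substrings matched are assumptions for illustration; only the two body shapes come from the docstring above.

```clojure
(require '[clojure.string :as str])

(defn titan-body
  "Builds the Titan Image Generator body shape shown above."
  [{:keys [prompt n width height]}]
  {:taskType "TEXT_IMAGE"
   :textToImageParams {:text prompt}
   :imageGenerationConfig {:numberOfImages n :width width :height height
                           :cfgScale 8 :seed 0}})

(defn sdxl-body
  "Builds the Stability SDXL body shape shown above."
  [{:keys [prompt cfg-scale seed]}]
  {:text_prompts [{:text prompt :weight 1.0}]
   :cfg_scale cfg-scale :seed seed :steps 30})

(defn build-image-body
  "Picks a builder by substring match on the model id (hypothetical substrings)."
  [model-id request]
  (cond
    (str/includes? model-id "titan-image-generator") (titan-body request)
    (str/includes? model-id "stable-diffusion-xl")   (sdxl-body request)
    :else (throw (ex-info "Unsupported image model" {:model-id model-id}))))
```

A response parser would be selected the same way, reading `:images` for Titan and `:artifacts` for SDXL.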

llm.sdk.providers.bedrock.rerank

Bedrock Agent Runtime /rerank adapter.

Canonical SDK rerank requests follow Cohere-style inputs: {model, query, documents, top-n}. Bedrock expects an Agent Runtime request containing queries, sources, and a Bedrock reranking configuration. The request is signed by llm.sdk.rerank via SigV4.
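
As a rough illustration of that translation, the sketch below maps a canonical request onto the queries / sources / reranking-configuration layout. The function name and canonical keys are hypothetical, and the Bedrock field names reflect one reading of the public Agent Runtime Rerank API, so they should be checked against the AWS reference before use. Signing is out of scope here.

```clojure
(defn bedrock-rerank-body
  "Hypothetical translation from a Cohere-style canonical request to a
  Bedrock Agent Runtime rerank body. :model is assumed to be a model ARN."
  [{:keys [model query documents top-n]}]
  {:queries [{:type "TEXT" :textQuery {:text query}}]
   :sources (mapv (fn [doc]
                    {:type "INLINE"
                     :inlineDocumentSource {:type "TEXT"
                                            :textDocument {:text doc}}})
                  documents)
   :rerankingConfiguration
   {:type "BEDROCK_RERANKING_MODEL"
    :bedrockRerankingConfiguration
    ;; numberOfResults is only sent when the caller supplied top-n.
    (cond-> {:modelConfiguration {:modelArn model}}
      top-n (assoc :numberOfResults top-n))}})
```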

llm.sdk.providers.codex

Compatibility shim. Implementation lives in llm.sdk.providers.codex.responses.

llm.sdk.providers.codex.responses

OpenAI Responses API (Codex) transport adapter. Covers both the standard OpenAI Responses API (api.openai.com) and the Codex backend (chatgpt.com/backend-api/codex).

For the Codex backend, auth is read from ~/.codex/auth.json (shared with the official OpenAI Codex CLI).

llm.sdk.providers.cohere-chat

Compatibility shim. Implementation lives in llm.sdk.providers.cohere.chat.

llm.sdk.providers.cohere-embed

Compatibility shim. Implementation lives in llm.sdk.providers.cohere.embeddings.

llm.sdk.providers.cohere-rerank

Compatibility shim. Implementation lives in llm.sdk.providers.cohere.rerank.

llm.sdk.providers.cohere.chat

Cohere /v2/chat native transport adapter.

Cohere is OpenAI-compat-ish but differs enough to need its own adapter: it has a typed message-content array, a documents field, a citation_options control, citations on the response, and a streaming event taxonomy with separate content-start / content-delta / content-end plus tool-plan-delta and citation-* events.

Reference: litellm-ref/llms/cohere/chat/v2_transformation.py.

llm.sdk.providers.cohere.embeddings

Cohere embed adapter — POST {base}/embed.

Cohere's wire shape diverges from OpenAI's in three places:

- Request uses :texts (vector) instead of :input.
- Request carries a required :input_type (search_document / search_query / classification / clustering), which lives in the canonical request as :embed/provider-options :input-type. It defaults to "search_document" when omitted, the safest fallback for general-purpose retrieval.
- Response embeddings live under :embeddings.float (newer API with multi-format support) or :embeddings (legacy single format). Usage is in :meta.billed_units.input_tokens.

Live smoke is env-gated under COHERE_API_KEY.
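
The three divergences can be sketched in a few lines. This is an illustration under assumptions: the function names and the canonical `:model` / `:inputs` keys are hypothetical, while `:texts`, `:input_type`, the `:embed/provider-options :input-type` path, and the response paths come from the description above.

```clojure
(defn cohere-embed-body
  "Builds a Cohere-style embed body from a canonical request (hypothetical keys)."
  [{:keys [model inputs] :as request}]
  {:model model
   :texts (vec inputs)
   ;; Falls back to "search_document" when the caller gave no input type.
   :input_type (get-in request [:embed/provider-options :input-type]
                       "search_document")})

(defn extract-embeddings
  "Reads vectors from either the multi-format or the legacy response shape."
  [response]
  (let [embeddings (:embeddings response)]
    (if (map? embeddings)
      (:float embeddings)   ; newer API: {:embeddings {:float [[...]]}}
      embeddings)))         ; legacy API: {:embeddings [[...]]}

(defn input-tokens
  "Reads billed input tokens, or nil when the field is absent."
  [response]
  (get-in response [:meta :billed_units :input_tokens]))
```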

llm.sdk.providers.cohere.rerank

Cohere /rerank transport. The wire shape is also used by Jina — both accept {model, query, documents, top_n, return_documents} and return {results [{index, relevance_score, document {text}}]}.

Cohere additionally returns :meta.billed_units.search_units for usage; Jina returns :usage {total_tokens}. Both are surfaced through the canonical :response/usage where possible.
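
A sketch of parsing the shared response shape while tolerating either usage form. The function name and the output keys (`:score`, `:text`, `:search-units`, `:total-tokens`) are hypothetical and do not claim to match the SDK's canonical :response/usage layout.

```clojure
(defn parse-rerank-response
  "Normalizes a Cohere- or Jina-style rerank response into one map."
  [response]
  (let [search-units (get-in response [:meta :billed_units :search_units])
        total-tokens (get-in response [:usage :total_tokens])]
    {:results (mapv (fn [{:keys [index relevance_score document]}]
                      (cond-> {:index index :score relevance_score}
                        ;; :document is only present when return_documents was set.
                        document (assoc :text (:text document))))
                    (:results response))
     :usage (cond-> {}
              search-units (assoc :search-units search-units)
              total-tokens (assoc :total-tokens total-tokens))}))
```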

llm.sdk.providers.elevenlabs

Compatibility shim. Implementation lives in llm.sdk.providers.elevenlabs.tts.

llm.sdk.providers.elevenlabs.tts

ElevenLabs TTS adapter — POST /v1/text-to-speech/:voice_id with xi-api-key header. Voice id is part of the URL; model id and text live in the JSON body. Returns audio bytes (mp3 by default).

Reference: litellm-ref/llms/elevenlabs/ + ElevenLabs API docs.
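
The request layout described above (voice id in the path, key in a header, model id and text in the body) might be assembled like this. The function name, the option keys, and the request-map shape are hypothetical; `model_id` and `text` are assumed to be the JSON field names used by the ElevenLabs API. No HTTP call is made here.

```clojure
(defn tts-request
  "Builds a request map for an HTTP client of your choice (hypothetical shape)."
  [{:keys [base-url api-key voice-id model-id text]}]
  {:method  :post
   ;; The voice id is part of the URL path, not the body.
   :url     (str base-url "/v1/text-to-speech/" voice-id)
   :headers {"xi-api-key"   api-key
             "Content-Type" "application/json"}
   :body    {:model_id model-id :text text}})
```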

llm.sdk.providers.fake

Compatibility shim. Implementation lives in llm.sdk.providers.fake.chat.

make-fake-transport

llm.sdk.providers.fake.chat

Fake/test provider that returns deterministic responses. Conforms to the Transport protocol.

make-fake-transport

llm.sdk.providers.gemini-native

Compatibility shim. Implementation lives in llm.sdk.providers.gemini.native.

llm.sdk.providers.gemini.imagen

Vertex AI Imagen 3 / 4 image-generation adapter.

Endpoint:
  POST {host}/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:predict
Body:
  {:instances [{:prompt "..."}]
   :parameters {:sampleCount N :aspectRatio "1:1" :seed ...}}
Response:
  {:predictions [{:bytesBase64Encoded "..." :mimeType "image/png"}]}

Auth: same GCP OAuth as vertex-gemini — token from
:request provider-options.vertex.access-token or
GOOGLE_OAUTH_ACCESS_TOKEN.

Models surfaced under :vertex-imagen include imagen-3.0-generate-002,
imagen-3.0-fast-generate-001, imagen-4.0-generate-001.
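
The endpoint, body, and response shapes above can be sketched as three small functions. All function names and the canonical keys (`:prompt`, `:n`, `:aspect-ratio`, `:seed`) are hypothetical, and defaulting the sample count to 1 is an assumption of this sketch.

```clojure
(defn imagen-url
  "Fills in the :predict endpoint template."
  [{:keys [host project location model]}]
  (str host "/v1/projects/" project "/locations/" location
       "/publishers/google/models/" model ":predict"))

(defn imagen-body
  "Builds the instances/parameters body; optional parameters are omitted when nil."
  [{:keys [prompt n aspect-ratio seed]}]
  {:instances [{:prompt prompt}]
   :parameters (cond-> {:sampleCount (or n 1)}
                 aspect-ratio (assoc :aspectRatio aspect-ratio)
                 seed         (assoc :seed seed))})

(defn prediction-images
  "Extracts base64 payloads and MIME types from a :predict response."
  [response]
  (mapv (fn [{:keys [bytesBase64Encoded mimeType]}]
          {:b64 bytesBase64Encoded :mime-type mimeType})
        (:predictions response)))
```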

llm.sdk.providers.gemini.native

Gemini Native API transport adapter. Handles thought signatures, streaming deltas, safety metadata. Preserves provider-specific replay state.

llm.sdk.providers.gemini.vertex

Vertex AI Gemini transport adapter.

Builds on Gemini native with different auth (GCP OAuth) and endpoint
structure. Auth resolution follows the standard GCP ADC chain via
llm.sdk.gcp-auth: request opts → GOOGLE_OAUTH_ACCESS_TOKEN env →
`gcloud auth print-access-token` → GOOGLE_APPLICATION_CREDENTIALS
service-account JSON (RS256-signed JWT exchanged at
oauth2.googleapis.com/token).

Project resolution: request opts → profile quirks →
GOOGLE_CLOUD_PROJECT env → SA JSON project_id.
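
Both resolution chains follow a "first source that yields a value wins" pattern. A minimal sketch of the project chain, assuming hypothetical function names and a hypothetical `:project` key on the request options and profile quirks; the environment is passed in as a map so the example stays pure:

```clojure
(defn resolve-first
  "Returns {:source label :value v} for the first source whose thunk
  yields a non-nil value, or nil when every source is empty."
  [sources]
  (some (fn [[label thunk]]
          (when-some [v (thunk)]
            {:source label :value v}))
        sources))

(defn resolve-project
  "Project chain: request opts, then profile quirks, then env, then SA JSON."
  [request-opts profile-quirks env sa-json]
  (resolve-first
    [[:request-opts    #(:project request-opts)]
     [:profile-quirks  #(:project profile-quirks)]
     [:env             #(get env "GOOGLE_CLOUD_PROJECT")]
     [:service-account #(:project_id sa-json)]]))
```

Wrapping each source in a thunk keeps later, more expensive steps (such as shelling out to gcloud in the auth chain) from running when an earlier source already produced a value.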

llm.sdk.providers.ollama-native

Compatibility shim. Implementation lives in llm.sdk.providers.ollama.native.

llm.sdk.providers.ollama.native

Native Ollama adapter — /api/chat (chat) and /api/embed (embeddings).

Ollama also exposes an OpenAI-compat /v1/chat/completions endpoint that the existing :ollama profile (registered) targets. This namespace registers a sibling :ollama-native profile for callers who want the native shape — older Ollama versions, vision input via the native :images field, or workflows that need the native :options keys (e.g. :num_ctx, :num_predict, :mirostat).

Streaming: Ollama uses NDJSON (one JSON object per line), NOT SSE. We re-use the http/sse-request line reader and parse each line as a raw JSON object instead of stripping a 'data: ' prefix.
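
To illustrate the difference in framing, the sketch below extracts JSON payload strings from a sequence of lines in each style. Both function names are hypothetical, and JSON decoding is left to whichever parser the caller uses.

```clojure
(require '[clojure.string :as str])

(defn ndjson-payloads
  "NDJSON: every non-blank line is itself a complete JSON document."
  [lines]
  (->> lines
       (map str/trim)
       (remove str/blank?)))

(defn sse-payloads
  "SSE: only lines carrying a 'data: ' prefix hold a payload, and the
  prefix is stripped before parsing."
  [lines]
  (->> lines
       (filter #(str/starts-with? % "data: "))
       (map #(subs % (count "data: ")))))
```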

llm.sdk.providers.openai-chat

Compatibility shim. Implementation lives in llm.sdk.providers.openai.chat.

llm.sdk.providers.openai-compat.aliases

Data-only OpenAI-compatible provider alias specs.

These providers share the OpenAI chat-completions wire shape. Adapter code may still apply provider quirks from the profile, but the registry should not need one hand-written register-provider call per alias.

llm.sdk.providers.openai-embed

Compatibility shim. Implementation lives in llm.sdk.providers.openai.embeddings.

llm.sdk.providers.openai-image

Compatibility shim. Implementation lives in llm.sdk.providers.openai.image.

llm.sdk.providers.openai-moderation

Compatibility shim. Implementation lives in llm.sdk.providers.openai.moderation.

llm.sdk.providers.openai-speak

Compatibility shim. Implementation lives in llm.sdk.providers.openai.speak.

llm.sdk.providers.openai-transcribe

Compatibility shim. Implementation lives in llm.sdk.providers.openai.transcribe.

llm.sdk.providers.openai.audio

OpenAI audio provider family namespace.

llm.sdk.providers.openai.chat

OpenAI Chat Completions transport adapter. Covers OpenAI, OpenRouter, DeepSeek, and other OpenAI-compatible providers.

llm.sdk.providers.openai.embeddings

OpenAI embeddings adapter — POST {base}/embeddings.

Same auth and base-url plumbing as the chat adapter; we share the profile and register an additional :profile/embed-transport-constructor on it. Other OpenAI-compat hosts that offer embeddings (Mistral, Together, Voyage, Jina, etc.) can reuse this transport by attaching the same constructor.

llm.sdk.providers.openai.image

OpenAI image generation adapter.

POST {base}/images/generations. Covers DALL-E 3, DALL-E 2, and the gpt-image-1 family. The wire body differs subtly across them (gpt-image-1 takes :quality :low|:medium|:high|:auto and returns b64_json only; DALL-E 3 takes :quality :standard|:hd and :style :vivid|:natural). The adapter passes canonical fields straight through — provider-specific values are the caller's responsibility, and the same provider-options :extra_body hatch as elsewhere covers anything we haven't surfaced.

llm.sdk.providers.openai.moderation

OpenAI Moderations adapter.

POST {base}/moderations. omni-moderation-latest (the default since
Nov 2024) accepts multi-modal input — a vector of {:type :text|:image_url}
maps as well as plain strings. text-moderation-* models are
text-only.

Response shape per the OpenAI Moderations API:
  {:id :model
   :results [{:flagged bool
              :categories {category-name bool}
              :category_scores {category-name float}
              :category_applied_input_types {category-name ["text"|"image"]}}]}
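
Given a response of that shape, already decoded into Clojure maps with keyword keys, a caller might collect the triggered categories per input as follows. The function name and output shape are hypothetical.

```clojure
(defn flagged-categories
  "Returns, per result, the overall flag and the set of category names
  whose boolean is true."
  [response]
  (mapv (fn [{:keys [flagged categories]}]
          {:flagged    (boolean flagged)
           :categories (into #{} (comp (filter val) (map key)) categories)})
        (:results response)))
```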

llm.sdk.providers.openai.speak

OpenAI /audio/speech adapter — POST {model, voice, input, response_format} returns raw audio bytes.

llm.sdk.providers.openai.transcribe

OpenAI /audio/transcriptions adapter. Wire shape is shared by Groq's /openai/v1/audio/transcriptions endpoint (same field names, same verbose_json output), so the same transport class powers both profiles.

llm.sdk.providers.openrouter

Compatibility shim. Implementation lives in llm.sdk.providers.openrouter.chat.

llm.sdk.providers.openrouter.chat

OpenRouter transport adapter.
Builds on OpenAI Chat Completions with OpenRouter-specific quirks:
- provider preferences routing in extra_body
- Pareto Code router plugin
- reasoning config in extra_body (not top-level)
- special model naming and error handling.

llm.sdk.providers.openrouter.image

OpenRouter image generation transport.

OpenRouter image models generate images through chat completions, not OpenAI's /images/generations endpoint. This adapter mirrors that wire shape and extracts images from choices[].message.images[].image_url.url.
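
The extraction path choices[].message.images[].image_url.url can be walked with a nested `for`. A minimal sketch, assuming the response has been decoded with keyword keys; the function name is hypothetical.

```clojure
(defn extract-image-urls
  "Collects every image URL across all choices, skipping entries without one."
  [response]
  (vec (for [choice (:choices response)
             image  (get-in choice [:message :images])
             :let   [url (get-in image [:image_url :url])]
             :when  url]
         url)))
```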

llm.sdk.providers.perplexity

Compatibility shim. Implementation lives in llm.sdk.providers.perplexity.chat.

llm.sdk.providers.perplexity.chat

Perplexity transport — OpenAI-shape body + citation/search-results surfacing.

Request building is identical to openai-chat. Response parsing extends the OpenAI parser with two extractions:

- :search_results [{:url :title :snippet}, ...] → richer CitationPart per result
- :citations ["url", ...] → URL-only CitationPart when search_results isn't present

Usage normalization delegates to normalize-openai-usage, which already picks up Perplexity's :citation_tokens and :num_search_queries when present.

Streaming: the final SSE chunk on /chat/completions carries :citations alongside :usage and :finish_reason. parse-stream-event returns a vector of events in that case — sdk/complete flattens multi-event return values.
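
The two response extractions described above, with :search_results taking precedence over :citations, can be sketched like this. The function name and the map used to stand in for a CitationPart are hypothetical; the SDK's actual part type may carry different keys.

```clojure
(defn citation-parts
  "Prefers :search_results (url, title, snippet); falls back to the
  URL-only :citations vector when no search results are present."
  [response]
  (if-let [results (seq (:search_results response))]
    (mapv (fn [{:keys [url title snippet]}]
            {:type :citation :url url :title title :snippet snippet})
          results)
    (mapv (fn [url] {:type :citation :url url})
          (:citations response))))
```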

llm.sdk.providers.vertex-gemini

Compatibility shim. Implementation lives in llm.sdk.providers.gemini.vertex.

llm.sdk.providers.vertex-imagen

Compatibility shim. Implementation lives in llm.sdk.providers.gemini.imagen.

llm.sdk.providers.voyage-rerank

Compatibility shim. Implementation lives in llm.sdk.providers.voyage.rerank.

llm.sdk.providers.voyage.rerank

Voyage /rerank transport. Differs from Cohere/Jina on field names
only:
  request : top_k  (not top_n)
  response: data   (not results)
Document representation is also slightly different — Voyage returns
:document as a plain string when :return_documents=true.

Voyage usage shape: {:usage {:total_tokens N}}.
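
A sketch of the field-name differences, assuming hypothetical function names and canonical keys (`:top-n`, `:return-documents`); the wire names `top_k`, `data`, and the plain-string `:document` come from the description above.

```clojure
(defn voyage-rerank-body
  "Builds a Voyage-style rerank body; the result limit goes out as top_k."
  [{:keys [model query documents top-n return-documents]}]
  (cond-> {:model model :query query :documents documents}
    top-n                    (assoc :top_k top-n)
    (some? return-documents) (assoc :return_documents return-documents)))

(defn parse-voyage-response
  "Reads results from :data; :document, when present, is already a string."
  [response]
  {:results (mapv (fn [{:keys [index relevance_score document]}]
                    (cond-> {:index index :score relevance_score}
                      document (assoc :text document)))
                  (:data response))
   :total-tokens (get-in response [:usage :total_tokens])})
```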
