Compatibility shim. Implementation lives in llm.sdk.providers.anthropic.chat.
Anthropic Messages API transport adapter. Supports thinking blocks, cache_control, tool use, streaming deltas. Preserves provider-specific replay state (reasoning_details, signatures).
Compatibility shim. Implementation lives in llm.sdk.providers.bedrock.converse.
Compatibility shim. Implementation lives in llm.sdk.providers.bedrock.image.
AWS Bedrock Converse API transport adapter.
Auth: AWS Signature V4 — sdk/complete dispatches on :profile/auth-strategy :aws-sigv4 and signs the request via llm.sdk.aws-sigv4 just before the HTTP send.
Streaming: Bedrock's /converse-stream emits binary event-stream frames (vnd.amazon.eventstream). sdk/complete reads the raw InputStream via llm.sdk.aws-eventstream/frame-seq and hands each parsed frame to parse-stream-event-bedrock as a map.
Model-id mapping: canonical short ids (e.g. claude-sonnet-4-5, nova-pro) are mapped to Bedrock's region-versioned id format (e.g. anthropic.claude-sonnet-4-5-20250101-v1:0); unknown ids pass through verbatim so callers can provide explicit ARNs.
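The model-id mapping with verbatim pass-through amounts to a table lookup with a default. The following is a minimal Python sketch rather than the SDK's code: `BEDROCK_MODEL_IDS` and `to_bedrock_model_id` are hypothetical names, and the table holds only the one mapping quoted in the docstring.

```python
# Hypothetical lookup table; the single entry is the example from the docstring.
BEDROCK_MODEL_IDS = {
    "claude-sonnet-4-5": "anthropic.claude-sonnet-4-5-20250101-v1:0",
}


def to_bedrock_model_id(model_id: str) -> str:
    """Map a canonical short id to a Bedrock id; unknown ids pass through."""
    return BEDROCK_MODEL_IDS.get(model_id, model_id)
```

Because unknown ids are returned unchanged, a caller can hand in a full ARN and it reaches the request untouched.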
Bedrock image-generation adapter (Titan Image Generator + Stability SD3 / SDXL). All use bedrock-runtime /model/{id}/invoke with SigV4. Each model has a different body shape:
  amazon.titan-image-generator-v1 / -v2:0
    {:taskType "TEXT_IMAGE"
     :textToImageParams {:text "..."}
     :imageGenerationConfig {:numberOfImages N :width W :height H :cfgScale 8 :seed 0}}
    response {:images ["b64", ...]}
  stability.stable-diffusion-xl-v1
    {:text_prompts [{:text "..." :weight 1.0}]
     :cfg_scale N :seed N :steps 30}
    response {:artifacts [{:base64 "..."}]}
We dispatch on a substring match against the model id and route to the matching builder/parser pair.
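The substring dispatch described above can be sketched as an ordered table of (needle, builder, parser) entries. This is an illustrative Python sketch, not the adapter's implementation: every function name, the `ROUTES` table, and the default width, height, and cfg_scale values are assumptions; only the body and response shapes come from the docstring.

```python
def build_titan_body(prompt, n=1, width=1024, height=1024, seed=0):
    # Width/height defaults are assumed for illustration.
    return {
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {"text": prompt},
        "imageGenerationConfig": {
            "numberOfImages": n, "width": width, "height": height,
            "cfgScale": 8, "seed": seed,
        },
    }


def parse_titan_response(resp):
    return list(resp["images"])


def build_sdxl_body(prompt, cfg_scale=7, seed=0):
    # The cfg_scale default is assumed for illustration.
    return {
        "text_prompts": [{"text": prompt, "weight": 1.0}],
        "cfg_scale": cfg_scale, "seed": seed, "steps": 30,
    }


def parse_sdxl_response(resp):
    return [artifact["base64"] for artifact in resp["artifacts"]]


# Hypothetical routing table: first substring match wins.
ROUTES = [
    ("titan-image-generator", build_titan_body, parse_titan_response),
    ("stable-diffusion-xl", build_sdxl_body, parse_sdxl_response),
]


def route_for(model_id):
    """Return the (builder, parser) pair whose needle occurs in model_id."""
    for needle, builder, parser in ROUTES:
        if needle in model_id:
            return builder, parser
    raise ValueError(f"no image route for model id: {model_id}")
```

Matching on a substring rather than the exact id means versioned variants such as `-v1` and `-v2:0` share one builder/parser pair.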
Bedrock Agent Runtime /rerank adapter.
Canonical SDK rerank requests follow Cohere-style inputs: {model, query, documents, top-n}. Bedrock expects an Agent Runtime request containing queries, sources, and a Bedrock reranking configuration. The request is signed by llm.sdk.rerank via SigV4.
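The translation from the Cohere-style canonical request to the Bedrock shape might look like the sketch below. Treat it as an assumption-laden illustration: `to_bedrock_rerank` is a hypothetical name, and the nested field names reflect one reading of the Bedrock Agent Runtime Rerank request (queries, sources, rerankingConfiguration) that should be checked against the AWS API reference before use.

```python
def to_bedrock_rerank(req):
    """Translate {model, query, documents, top-n} into a Bedrock-style body.

    Field names inside the body are assumptions to verify against AWS docs.
    """
    bedrock_config = {"modelConfiguration": {"modelArn": req["model"]}}
    if req.get("top-n") is not None:
        bedrock_config["numberOfResults"] = req["top-n"]
    return {
        "queries": [{"type": "TEXT", "textQuery": {"text": req["query"]}}],
        "sources": [
            {
                "type": "INLINE",
                "inlineDocumentSource": {
                    "type": "TEXT",
                    "textDocument": {"text": doc},
                },
            }
            for doc in req["documents"]
        ],
        "rerankingConfiguration": {
            "type": "BEDROCK_RERANKING_MODEL",
            "bedrockRerankingConfiguration": bedrock_config,
        },
    }
```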
Compatibility shim. Implementation lives in llm.sdk.providers.codex.responses.
OpenAI Responses API (Codex) transport adapter. Covers both the standard OpenAI Responses API (api.openai.com) and the Codex backend (chatgpt.com/backend-api/codex).
For the Codex backend, auth is read from ~/.codex/auth.json (shared with the official OpenAI Codex CLI).
Compatibility shim. Implementation lives in llm.sdk.providers.cohere.chat.
Compatibility shim. Implementation lives in llm.sdk.providers.cohere.embeddings.
Compatibility shim. Implementation lives in llm.sdk.providers.cohere.rerank.
Cohere /v2/chat native transport adapter.
Cohere is OpenAI-compat-ish but differs enough to need its own adapter: it has a typed message-content array, a documents field, a citation_options control, citations on the response, and a streaming event taxonomy with separate content-start / content-delta / content-end plus tool-plan-delta and citation-* events.
Reference: litellm-ref/llms/cohere/chat/v2_transformation.py.
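To show why the separate event taxonomy needs its own handling, here is a rough Python sketch that folds a stream of already-decoded events into text, tool plan, and citations. The event type names come from the docstring; the nested payload layout (`delta.message.content.text`, `delta.message.tool_plan`, `delta.message.citations`) and the function name `accumulate` are assumptions for illustration.

```python
def accumulate(events):
    """Fold Cohere-style stream events into one summary dict."""
    text, plan, citations = [], [], []
    for event in events:
        kind = event.get("type")
        message = event.get("delta", {}).get("message", {})
        if kind == "content-delta":
            text.append(message.get("content", {}).get("text", ""))
        elif kind == "tool-plan-delta":
            plan.append(message.get("tool_plan", ""))
        elif kind == "citation-start":
            citations.append(message.get("citations"))
        # content-start / content-end only mark block boundaries here.
    return {"text": "".join(text), "tool_plan": "".join(plan), "citations": citations}
```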
Cohere embed adapter: POST {base}/embed.
Cohere's wire shape diverges from OpenAI's in three places:
- Request uses :texts (vector) instead of :input.
- Request carries a required :input_type (search_document / search_query / classification / clustering), which lives in the canonical request as :embed/provider-options :input-type. Defaults to "search_document" when omitted, the safest fallback for general-purpose retrieval.
- Response embeddings live under :embeddings.float (newer API with multi-format support) or :embeddings (legacy single format). Usage is in :meta.billed_units.input_tokens.
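The three divergences listed above can be illustrated with a small request builder and response parser. This is a sketch under stated assumptions, not the adapter itself: `build_embed_body` and `parse_embed_response` are hypothetical names, and the canonical option is modelled as a plain `"input-type"` dictionary key.

```python
def build_embed_body(model, texts, provider_options=None):
    """Build a Cohere-style embed body: :texts plus a defaulted :input_type."""
    options = provider_options or {}
    return {
        "model": model,
        "texts": list(texts),
        "input_type": options.get("input-type", "search_document"),
    }


def parse_embed_response(resp):
    """Accept both the multi-format and the legacy embeddings layouts."""
    embeddings = resp["embeddings"]
    vectors = embeddings["float"] if isinstance(embeddings, dict) else embeddings
    tokens = resp.get("meta", {}).get("billed_units", {}).get("input_tokens")
    return {"vectors": vectors, "input_tokens": tokens}
```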
Live smoke is env-gated under COHERE_API_KEY.
Cohere /rerank transport. The wire shape is also used by Jina: both accept {model, query, documents, top_n, return_documents} and return {results [{index, relevance_score, document {text}}]}.
Cohere additionally returns :meta.billed_units.search_units for usage; Jina returns :usage {total_tokens}. Both are surfaced through the canonical :response/usage where possible.
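Surfacing both usage shapes through one canonical structure could look like the following Python sketch. The output keys (`results`, `score`, `text`, `usage`) and the name `normalize_rerank` are hypothetical; the input field names are the ones given in the docstring.

```python
def normalize_rerank(resp):
    """Normalize a Cohere- or Jina-style rerank response."""
    results = [
        {
            "index": item["index"],
            "score": item["relevance_score"],
            "text": (item.get("document") or {}).get("text"),
        }
        for item in resp["results"]
    ]
    usage = {}
    search_units = resp.get("meta", {}).get("billed_units", {}).get("search_units")
    if search_units is not None:  # Cohere-style usage
        usage["search_units"] = search_units
    total_tokens = resp.get("usage", {}).get("total_tokens")
    if total_tokens is not None:  # Jina-style usage
        usage["total_tokens"] = total_tokens
    return {"results": results, "usage": usage}
```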
Compatibility shim. Implementation lives in llm.sdk.providers.elevenlabs.tts.
ElevenLabs TTS adapter — POST /v1/text-to-speech/:voice_id with xi-api-key header. Voice id is part of the URL; model id and text live in the JSON body. Returns audio bytes (mp3 by default).
Reference: litellm-ref/llms/elevenlabs/ + ElevenLabs API docs.
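As an illustration of the split between URL and body, the sketch below assembles the request without sending it. `build_tts_request` is a hypothetical helper, and the JSON field names `text` and `model_id` are stated here as an assumption about the ElevenLabs wire format, since the docstring only says that model id and text live in the body.

```python
import json
from urllib.parse import quote


def build_tts_request(base_url, api_key, voice_id, model_id, text):
    """Describe the HTTP request: voice id in the path, the rest in JSON."""
    return {
        "method": "POST",
        "url": f"{base_url}/v1/text-to-speech/{quote(voice_id, safe='')}",
        "headers": {"xi-api-key": api_key, "Content-Type": "application/json"},
        # Field names below are assumed, not taken from the docstring.
        "body": json.dumps({"text": text, "model_id": model_id}),
    }
```

The response body would then be treated as raw audio bytes rather than parsed as JSON.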
Compatibility shim. Implementation lives in llm.sdk.providers.fake.chat.
Fake/test provider that returns deterministic responses. Conforms to the Transport protocol.
Compatibility shim. Implementation lives in llm.sdk.providers.gemini.native.
Vertex AI Imagen 3 / 4 image-generation adapter.
Endpoint:
  POST {host}/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:predict
Body:
  {:instances [{:prompt "..."}]
   :parameters {:sampleCount N :aspectRatio "1:1" :seed ...}}
Response:
  {:predictions [{:bytesBase64Encoded "..." :mimeType "image/png"}]}
Auth: same GCP OAuth as vertex-gemini — token from :request provider-options.vertex.access-token or GOOGLE_OAUTH_ACCESS_TOKEN.
Models surfaced under :vertex-imagen include imagen-3.0-generate-002, imagen-3.0-fast-generate-001, imagen-4.0-generate-001.
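The endpoint, body, and response shapes above can be tied together in a short Python sketch. It is illustrative only: the function names are hypothetical, the `Authorization: Bearer` header is an assumption about how the OAuth token is attached, and the token lookup covers just the two sources named in the docstring.

```python
import base64
import os


def build_imagen_request(host, project, location, model, prompt,
                         sample_count=1, aspect_ratio="1:1", seed=None,
                         access_token=None, env=None):
    """Assemble a :predict request; token comes from the argument or the env."""
    env = os.environ if env is None else env
    token = access_token or env.get("GOOGLE_OAUTH_ACCESS_TOKEN")
    if not token:
        raise ValueError("no access token available")
    parameters = {"sampleCount": sample_count, "aspectRatio": aspect_ratio}
    if seed is not None:
        parameters["seed"] = seed
    url = (f"{host}/v1/projects/{project}/locations/{location}"
           f"/publishers/google/models/{model}:predict")
    return {
        "method": "POST",
        "url": url,
        "headers": {"Authorization": f"Bearer {token}"},  # assumed header form
        "json": {"instances": [{"prompt": prompt}], "parameters": parameters},
    }


def decode_predictions(resp):
    """Return (mime_type, image_bytes) pairs from a predict response."""
    return [
        (pred.get("mimeType"), base64.b64decode(pred["bytesBase64Encoded"]))
        for pred in resp["predictions"]
    ]
```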
Gemini Native API transport adapter. Handles thought signatures, streaming deltas, safety metadata. Preserves provider-specific replay state.
Vertex AI Gemini transport adapter.
Builds on Gemini native with different auth (GCP OAuth) and endpoint
structure. Auth resolution follows the standard GCP ADC chain via
llm.sdk.gcp-auth: request opts → GOOGLE_OAUTH_ACCESS_TOKEN env →
gcloud auth print-access-token → GOOGLE_APPLICATION_CREDENTIALS
service-account JSON (RS256-signed JWT exchanged at
oauth2.googleapis.com/token).
Project resolution: request opts → profile quirks → GOOGLE_CLOUD_PROJECT env → SA JSON project_id.
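Both resolution chains follow the same "first source that yields a value wins" pattern. The Python sketch below shows it for project resolution only; `resolve_project` and the `"project"` option key are hypothetical, while `GOOGLE_CLOUD_PROJECT` and `project_id` are the names given in the docstring.

```python
def resolve_project(request_opts, profile_quirks, env, sa_json):
    """Return the first non-empty project id, in the documented order."""
    candidates = (
        request_opts.get("project"),      # 1. request opts (assumed key)
        profile_quirks.get("project"),    # 2. profile quirks (assumed key)
        env.get("GOOGLE_CLOUD_PROJECT"),  # 3. environment variable
        sa_json.get("project_id"),        # 4. service-account JSON
    )
    return next((value for value in candidates if value), None)
```

Passing the environment and service-account data in as dictionaries keeps the ordering easy to test without touching real credentials.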
Compatibility shim. Implementation lives in llm.sdk.providers.ollama.native.
Native Ollama adapter — /api/chat (chat) and /api/embed (embeddings).
Ollama also exposes an OpenAI-compat /v1/chat/completions endpoint that the existing :ollama profile (registered) targets. This namespace registers a sibling :ollama-native profile for callers who want the native shape — older Ollama versions, vision input via the native :images field, or workflows that need the native :options keys (e.g. :num_ctx, :num_predict, :mirostat).
Streaming: Ollama uses NDJSON (one JSON object per line), NOT SSE. We re-use the http/sse-request line reader and parse each line as a raw JSON object instead of stripping a 'data: ' prefix.
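The difference from SSE is that each line is already a complete JSON object, with no `data: ` prefix to strip. A minimal Python sketch of that parsing step follows; the function names are hypothetical, and the `message.content` and `done` fields are assumed from Ollama's /api/chat streaming output.

```python
import json


def parse_ndjson_lines(lines):
    """Yield one decoded object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def collect_text(lines):
    """Concatenate streamed message content until a chunk reports done."""
    parts = []
    for chunk in parse_ndjson_lines(lines):
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break
    return "".join(parts)
```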
Compatibility shim. Implementation lives in llm.sdk.providers.openai.chat.
Data-only OpenAI-compatible provider alias specs.
These providers share the OpenAI chat-completions wire shape. Adapter code may still apply provider quirks from the profile, but the registry should not need one hand-written register-provider call per alias.
Compatibility shim. Implementation lives in llm.sdk.providers.openai.embeddings.
Compatibility shim. Implementation lives in llm.sdk.providers.openai.image.
Compatibility shim. Implementation lives in llm.sdk.providers.openai.moderation.
Compatibility shim. Implementation lives in llm.sdk.providers.openai.speak.
Compatibility shim. Implementation lives in llm.sdk.providers.openai.transcribe.
OpenAI audio provider family namespace.
OpenAI Chat Completions transport adapter. Covers OpenAI, OpenRouter, DeepSeek, and other OpenAI-compatible providers.
OpenAI embeddings adapter — POST {base}/embeddings.
Same auth and base-url plumbing as the chat adapter; we share the profile and just register an additional :profile/embed-transport-constructor on it. Other OpenAI-compat hosts that offer embeddings (Mistral, Together, Voyage, Jina, etc.) can reuse this transport by attaching the same constructor.
OpenAI image generation adapter.
POST {base}/images/generations. Covers DALL-E 3, DALL-E 2, and the gpt-image-1 family. The wire body differs subtly across them (gpt-image-1 takes :quality :low|:medium|:high|:auto and returns b64_json only; DALL-E 3 takes :quality :standard|:hd and :style :vivid|:natural). The adapter passes canonical fields straight through; provider-specific values are the caller's responsibility, and the same provider-options :extra_body escape hatch used elsewhere covers anything we haven't surfaced.
OpenAI Moderations adapter.
POST {base}/moderations. omni-moderation-latest (the default since Nov 2024) accepts multi-modal input — a vector of {:type :text|:image_url} maps as well as plain strings. text-moderation-* models are text-only.
Response shape per the OpenAI Moderations API:
  {:id :model
   :results [{:flagged bool
              :categories {category-name bool}
              :category_scores {category-name float}
              :category_applied_input_types {category-name ["text"|"image"]}}]}
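Given the response shape above, a caller typically wants the categories that fired together with their scores. The following Python sketch shows one way to extract them; `flagged_categories` is a hypothetical helper, not part of the adapter.

```python
def flagged_categories(resp):
    """Summarize each result: its flag, the categories set to true, and their scores."""
    summaries = []
    for result in resp["results"]:
        hits = sorted(name for name, hit in result["categories"].items() if hit)
        summaries.append({
            "flagged": result["flagged"],
            "categories": hits,
            "scores": {name: result["category_scores"][name] for name in hits},
        })
    return summaries
```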
OpenAI /audio/speech adapter: POST {model, voice, input, response_format} returns raw audio bytes.
OpenAI /audio/transcriptions adapter. Wire shape is shared by Groq's /openai/v1/audio/transcriptions endpoint (same field names, same verbose_json output), so the same transport class powers both profiles.
Compatibility shim. Implementation lives in llm.sdk.providers.openrouter.chat.
OpenRouter transport adapter. Builds on OpenAI Chat Completions with OpenRouter-specific quirks:
- provider preferences routing in extra_body
- Pareto Code router plugin
- reasoning config in extra_body (not top-level)
- special model naming and error handling.
OpenRouter image generation transport.
OpenRouter image models generate images through chat completions, not OpenAI's /images/generations endpoint. This adapter mirrors that wire shape and extracts images from choices[].message.images[].image_url.url.
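The extraction path `choices[].message.images[].image_url.url` can be walked defensively, since any level may be missing on a text-only reply. A small Python sketch, with `extract_image_urls` as a hypothetical name:

```python
def extract_image_urls(resp):
    """Collect every image URL found under choices[].message.images[]."""
    urls = []
    for choice in resp.get("choices", []):
        images = choice.get("message", {}).get("images") or []
        for image in images:
            url = image.get("image_url", {}).get("url")
            if url:
                urls.append(url)
    return urls
```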
Compatibility shim. Implementation lives in llm.sdk.providers.perplexity.chat.
Perplexity transport: OpenAI-shape body + citation/search-results surfacing.
Request building is identical to openai-chat. Response parsing extends the OpenAI parser with two extractions:
- :search_results [{:url :title :snippet}, ...] → richer CitationPart per result
- :citations ["url", ...] → URL-only CitationPart when search_results isn't present
Usage normalization delegates to normalize-openai-usage, which already picks up Perplexity's :citation_tokens and :num_search_queries when present.
Streaming: the final SSE chunk on /chat/completions carries :citations alongside :usage and :finish_reason. parse-stream-event returns a vector of events in that case; sdk/complete flattens multi-event return values.
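The preference for :search_results over :citations in the Perplexity parser can be sketched as a simple fallback. This Python version is illustrative only: `extract_citations` is a hypothetical name, and plain dictionaries stand in for the SDK's CitationPart.

```python
def extract_citations(resp):
    """Prefer rich search_results; fall back to URL-only citations."""
    search_results = resp.get("search_results")
    if search_results:
        return [
            {"url": r.get("url"), "title": r.get("title"), "snippet": r.get("snippet")}
            for r in search_results
        ]
    return [{"url": url} for url in resp.get("citations") or []]
```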
Compatibility shim. Implementation lives in llm.sdk.providers.gemini.vertex.
Compatibility shim. Implementation lives in llm.sdk.providers.gemini.imagen.
Compatibility shim. Implementation lives in llm.sdk.providers.voyage.rerank.
Voyage /rerank transport. Differs from Cohere/Jina on field names only:
  request:  top_k (not top_n)
  response: data (not results)
Document representation is also slightly different: Voyage returns :document as a plain string when :return_documents=true.
Voyage usage shape: {:usage {:total_tokens N}}.
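Because only the field names differ, the Voyage adapter can be pictured as two small renaming steps around the Cohere/Jina shape. The Python sketch below is an assumption-labelled illustration: both function names are hypothetical, and wrapping the plain-string document as `{"text": ...}` is one possible way to align it with the Cohere/Jina representation.

```python
def to_voyage_request(req):
    """Rename top_n to top_k; every other field is passed through."""
    body = {key: value for key, value in req.items() if key != "top_n"}
    if "top_n" in req:
        body["top_k"] = req["top_n"]
    return body


def from_voyage_response(resp):
    """Rename data to results and wrap string documents as {text: ...}."""
    results = []
    for item in resp["data"]:
        document = item.get("document")
        if isinstance(document, str):
            document = {"text": document}
        results.append({
            "index": item["index"],
            "relevance_score": item["relevance_score"],
            "document": document,
        })
    return {"results": results, "usage": resp.get("usage", {})}
```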