Provider Shape Audit

This document is the control surface for provider wire-shape correctness. The SDK should not rely on a provider being "OpenAI-compatible" without request, response, stream, or live evidence that the adapter still matches the provider surface.

Coverage markers:

  • request-golden - tests assert method, URL, headers, query params, and body.
  • response-fixture - tests parse provider-shaped response fixtures or stubs.
  • stream-fixture - tests replay SSE/stream/eventstream chunks.
  • live-smoke - env-gated live test calls the real provider.
  • unchecked - adapter behavior is inferred from shared code or docs only.
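
As an illustration of what a request-golden check pins, here is a minimal sketch. The builder `build-chat-request`, the option keys, and the `api.example.invalid` URL are hypothetical stand-ins, not the SDK's actual adapter code; the point is only that all five request fields are compared against a stored golden map.

```clojure
;; Hypothetical request builder standing in for a real adapter.
(defn build-chat-request
  [{:keys [base-url api-key model messages]}]
  {:method       :post
   :url          (str base-url "/chat/completions")
   :headers      {"authorization" (str "Bearer " api-key)
                  "content-type"  "application/json"}
   :query-params {}
   :body         {:model model :messages messages}})

;; Hypothetical inputs used to produce the request under test.
(def sample-opts
  {:base-url "https://api.example.invalid/v1"
   :api-key  "test-key"
   :model    "example-model"
   :messages [{:role "user" :content "hi"}]})

;; The golden map: the exact request the test expects.
(def golden-request
  {:method       :post
   :url          "https://api.example.invalid/v1/chat/completions"
   :headers      {"authorization" "Bearer test-key"
                  "content-type"  "application/json"}
   :query-params {}
   :body         {:model    "example-model"
                  :messages [{:role "user" :content "hi"}]}})

(defn golden-mismatches
  "Returns the set of request fields whose values differ from the golden map."
  [golden actual]
  (set (remove #(= (get golden %) (get actual %))
               [:method :url :headers :query-params :body])))

(golden-mismatches golden-request (build-chat-request sample-opts))
;; => #{}
```

Reporting the mismatching field names, rather than a bare true/false, makes it easier to see which part of the wire shape drifted.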

Chat

| Provider | ID | Adapter family | Coverage | High-risk gaps |
| --- | --- | --- | --- | --- |
| OpenAI | :openai | OpenAI Chat Completions | request-golden, response-fixture, stream-fixture, live-smoke | Broaden live tool/JSON-schema coverage after refactor. |
| Anthropic | :anthropic | Anthropic Messages | request-golden, response-fixture, stream-fixture, live-smoke | Keep thinking signatures, OAuth headers, tool replay, and cache markers pinned. |
| Gemini Native | :gemini-native | Gemini REST | request-golden, response-fixture, stream-fixture, live-smoke | Thought signatures, tool responses, cachedContent, and safety blocks need explicit fixtures. |
| Vertex Gemini | :vertex-gemini | Vertex REST wrapper over Gemini | request-golden, response-fixture, stream-fixture, live-smoke | Region/global URL routing, ADC headers, and provider id preservation are critical. |
| OpenRouter | :openrouter | OpenAI-wire custom wrapper | request-golden, response-fixture, stream-fixture, live-smoke | Provider routing, plugins, reasoning, envelope cache, embeddings, and provider-specific usage need fixtures. |
| Codex Responses | :codex | OpenAI Responses | request-golden, response-fixture, stream-fixture, live-smoke | Encrypted reasoning/provider replay state must stay intact. |
| Codex Backend | :codex-backend | ChatGPT Codex backend | request-golden, response-fixture, stream-fixture, live-smoke | SSE-first non-streaming behavior, auth headers, and backend usage parsing are load-bearing. |
| Perplexity | :perplexity | OpenAI-wire custom wrapper | request-golden, response-fixture, stream-fixture, live-smoke | Citation/search result extraction and final stream chunk flattening are critical. |
| Cohere | :cohere | Native Cohere chat | request-golden, response-fixture, stream-fixture, live-smoke | Documents, citations, tool calls, and v2 response variants need pinned fixtures. |
| Bedrock | :bedrock | AWS Bedrock Converse | request-golden, response-fixture, stream-fixture | Live proof depends on AWS env; eventstream and cachePoint fixtures are mandatory. |
| Ollama Native | :ollama-native | Ollama /api/chat | request-golden, response-fixture, stream-fixture | Local runtime live smoke is optional; NDJSON stream shape must be pinned. |
| Fake/Test | :fake | Deterministic test transport | request-golden, response-fixture | Keep as SDK-internal test fixture only; never count as provider parity. |
| DeepSeek | :deepseek | OpenAI-compatible alias | request-golden, response-fixture | Live smoke and provider-specific reasoning output drift. |
| Kimi / Moonshot | :kimi | OpenAI-compatible alias | request-golden, response-fixture, live-smoke | Thinking fields and model catalog separation. |
| Kimi Code | :kimi-code | OpenAI-compatible alias | request-golden, response-fixture, live-smoke | Required KimiCLI identity headers and prompt cache key. |
| Mistral | :mistral | OpenAI-compatible alias | request-golden, response-fixture, live-smoke | Dropped penalty fields and JSON-schema compatibility. |
| Groq | :groq | OpenAI-compatible alias | request-golden, response-fixture, live-smoke | Reasoning format and transcription alias interactions. |
| Cerebras | :cerebras | OpenAI-compatible alias | request-golden, response-fixture, live-smoke | Model names and reasoning fields. |
| Together | :together | OpenAI-compatible alias | request-golden, response-fixture, live-smoke | Chat and embedding model/provider ids. |
| xAI | :xai | OpenAI-compatible alias | request-golden, response-fixture, live-smoke | Reasoning/cache routing fields. |
| HuggingFace Router | :huggingface | OpenAI-compatible alias | request-golden, response-fixture, live-smoke | Router model ids and tool support are model-dependent. |
| Aggregator aliases | :sambanova, :deepinfra, :lambda, :nebius, :hyperbolic, :novita, :friendliai, :featherless, :cloudflare, :dashscope, :volcengine | OpenAI-compatible aliases | request-golden | Mostly unchecked live behavior; keep claims conservative. |

Embeddings

| Provider | ID | Adapter family | Coverage | High-risk gaps |
| --- | --- | --- | --- | --- |
| OpenAI | :openai | OpenAI /embeddings | request-golden, response-fixture, live-smoke | Dimensions override and multi-input ordering. |
| Cohere | :cohere | Cohere /embed | request-golden, response-fixture, live-smoke | v3 input type and embedding type variants. |
| Voyage | :voyage | OpenAI-compatible embeddings | request-golden, response-fixture, live-smoke | Input type/provider-options behavior. |
| Mistral | :mistral | OpenAI-compatible embeddings | request-golden, live-smoke | Add fixture response coverage. |
| Together | :together | OpenAI-compatible embeddings | request-golden, live-smoke | Add fixture response coverage. |
| Jina | :jina | OpenAI-compatible embeddings | request-golden, live-smoke | Add fixture response coverage. |
| OpenRouter | :openrouter | OpenAI-compatible embeddings with OpenRouter headers | request-golden, response-fixture, live-smoke | Model prefix stripping and billing-route pricing must stay explicit. |
| Nebius | :nebius | OpenAI-compatible embeddings | request-golden, response-fixture | Live smoke depends on Nebius credentials/model availability. |
| Ollama Native | :ollama-native | Ollama /api/embed | request-golden, response-fixture | Optional local live smoke. |

Other Modalities

| Modality | Provider | ID | Coverage | High-risk gaps |
| --- | --- | --- | --- | --- |
| Moderation | OpenAI | :openai | request-golden, response-fixture, live-smoke | Multi-modal moderation shape should stay explicit. |
| Rerank | Cohere | :cohere | request-golden, response-fixture, live-smoke | v2 and v3 field compatibility. |
| Rerank | Voyage | :voyage | request-golden, response-fixture, live-smoke | Score/document response variants. |
| Rerank | Jina | :jina | request-golden, response-fixture, live-smoke | Provider id tagging and document return shape. |
| Rerank | Bedrock | :bedrock | request-golden, response-fixture | Bedrock Agent Runtime /rerank requires SigV4 and model ARN routing. |
| Image | OpenAI | :openai | request-golden, response-fixture | gpt-image-1 usage and b64-only behavior need live proof. |
| Image | OpenRouter | :openrouter | request-golden, response-fixture | Uses chat completions with message.images, not /images/generations. |
| Image | Vertex Imagen | :vertex-imagen | request-golden, response-fixture | ADC/project routing and Imagen 3/4 differences. |
| Image | Bedrock | :bedrock | request-golden, response-fixture | Titan vs Stability request variants. |
| Transcription | OpenAI | :openai | request-golden, response-fixture | Multipart boundary and verbose JSON variants. |
| Transcription | Groq | :groq | request-golden, response-fixture | Groq endpoint/base URL and model aliases. |
| TTS | OpenAI | :openai | request-golden, response-fixture | Raw byte response headers. |
| TTS | ElevenLabs | :elevenlabs | request-golden, response-fixture | Voice id URL path and output_format query. |

File Attachments

File/document attachments are SDK request-shape surfaces, distinct from file lifecycle APIs such as upload/list/delete. Canonical :part/type :file content parts are provider-native only where the underlying API has a stable attachment shape:

| Provider | ID | Wire Shape | Coverage | High-risk gaps |
| --- | --- | --- | --- | --- |
| OpenAI Chat | :openai | Chat Completions file content part with file_data or file_id | request-golden, live-smoke | File data must be a data:<mime>;base64,... URI. |
| Codex Responses | :codex, :codex-backend | Responses input_file with file_data, file_id, or file_url | request-golden, live-smoke | File data must be a data:<mime>;base64,... URI. |
| Anthropic | :anthropic | Messages document block with Files API, URL, base64, or text source | request-golden | File IDs require the files-api-2025-04-14 beta header. |
| Gemini Native | :gemini-native | fileData URI or inlineData base64 part | request-golden | URI sources must already be provider-accessible. |
| Vertex Gemini | :vertex-gemini | Gemini fileData URI or inlineData base64 part under Vertex routing | request-golden | GCS URI permissions are outside SDK serialization. |
| Bedrock | :bedrock | Converse document content block with bytes, text, or S3 source | request-golden | Document names/formats are sanitized before signing. |
| Cohere | :cohere | Textual :file/content maps to native top-level documents | request-golden, live-smoke | Cohere does not fetch file IDs or decode binary data; those fail explicitly. |
| OpenAI-compatible aliases | aliases such as :deepseek, :kimi-code, :mistral | Unsupported for canonical :file content parts | request-golden | These adapters fail explicitly instead of inheriting OpenAI-only file support. |
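
The data-URI requirement on the OpenAI Chat and Codex Responses rows can be sketched as follows. The function names `data-uri` and `data-uri?` are hypothetical and not taken from the SDK; the sketch relies only on `java.util.Base64` from the JDK.

```clojure
(import '(java.util Base64))

(defn data-uri
  "Encodes raw bytes as a data:<mime>;base64,... URI string."
  [^String mime ^bytes data]
  (str "data:" mime ";base64,"
       (.encodeToString (Base64/getEncoder) data)))

(defn data-uri?
  "True when s looks like a base64 data URI with a MIME type.
  A shape check only; it does not decode or validate the payload."
  [s]
  (boolean
   (and (string? s)
        (re-matches #"data:[^;,]+;base64,[A-Za-z0-9+/]*={0,2}" s))))

(data-uri "text/plain" (.getBytes "hello" "UTF-8"))
;; => "data:text/plain;base64,aGVsbG8="
```

A request-golden test could apply a check like `data-uri?` to the serialized file_data field so that a bare base64 string without the `data:` prefix is caught offline.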

Refactor Gate

Before moving a provider family, the corresponding rows above must have request and response coverage. For streaming providers, stream fixtures must also exist. Live tests are useful proof, but fixtures remain the required offline regression net.
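
A minimal sketch of this gate, assuming each table row is modeled as a map with a `:coverage` set of marker keywords and a `:streaming?` flag. Both keys and the function name are hypothetical; the real representation may differ.

```clojure
(require '[clojure.set :as set])

(defn refactor-gate-gaps
  "Returns the set of coverage markers a row still lacks before its
  provider family may be moved. :live-smoke never substitutes for a fixture."
  [{:keys [coverage streaming?]}]
  (let [required (cond-> #{:request-golden :response-fixture}
                   streaming? (conj :stream-fixture))]
    (set/difference required (set coverage))))

;; A streaming row with live proof but no stream fixture is still blocked.
(refactor-gate-gaps {:coverage   #{:request-golden :response-fixture :live-smoke}
                     :streaming? true})
;; => #{:stream-fixture}

;; A non-streaming row with both fixtures passes.
(refactor-gate-gaps {:coverage   #{:request-golden :response-fixture}
                     :streaming? false})
;; => #{}
```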

Capability Coverage Gate

The detailed per-provider SDK surface matrix lives in src/llm/sdk/provider_coverage.clj and is enforced by test/llm/sdk/provider_coverage_test.clj.

Every registered provider must explicitly declare coverage for:

  • public SDK surfaces: complete/chat, streaming, embeddings, moderation, rerank, image generation, transcription, and TTS as applicable
  • request shape and response shape fixtures
  • stream shape, including explicit :not-applicable or :none
  • context-cache behavior, including honest :none or :not-applicable
  • usage/metrics normalization and canonical response stamping
  • pricing/cost source, including :unknown only where intentionally unknown
  • model listing source, including live /models, snapshot-only, local-only, or unsupported
  • auth path and structured error classification
  • live smoke status
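
The "explicitly declare" requirement can be illustrated with a small check. This is a sketch under assumed names: the dimension keywords, the entry values, and the matrix shape below are hypothetical and are not copied from src/llm/sdk/provider_coverage.clj.

```clojure
;; Hypothetical dimension keys, one per bullet in the list above.
(def required-dimensions
  [:surfaces :request-shape :response-shape :stream-shape :context-cache
   :usage :pricing :model-listing :auth :errors :live-smoke])

(defn undeclared-dimensions
  "Returns the dimensions an entry leaves undeclared. An explicit :none,
  :not-applicable, or :unknown counts as declared; a missing key or a
  nil value does not."
  [entry]
  (vec (remove #(some? (get entry %)) required-dimensions)))

(defn coverage-violations
  "Maps each provider id to its undeclared dimensions, omitting
  providers that declare everything."
  [matrix]
  (into {}
        (keep (fn [[id entry]]
                (when-let [missing (seq (undeclared-dimensions entry))]
                  [id (vec missing)])))
        matrix))

;; Hypothetical matrix: one complete entry, one with a silent gap.
(def sample-matrix
  {:complete {:surfaces #{:chat} :request-shape :fixture :response-shape :fixture
              :stream-shape :not-applicable :context-cache :none
              :usage :normalized :pricing :unknown :model-listing :unsupported
              :auth :api-key :errors :classified :live-smoke :none}
   :gappy    {:surfaces #{:chat} :request-shape :fixture :response-shape :fixture
              :stream-shape :fixture :context-cache :none
              :usage :normalized :model-listing :snapshot-only
              :auth :api-key :errors :classified :live-smoke :none}})

(coverage-violations sample-matrix)
;; => {:gappy [:pricing]}
```

Treating an omitted key differently from an explicit `:unknown` or `:none` is what makes the declaration honest: a gap has to be written down instead of being inferred from silence.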

Generated provider clients are not part of the production SDK architecture. Official OpenAPI, Discovery, Smithy, or SDK-derived examples may be stored as spec snapshots and used to create or validate fixtures.

Current Logic Audit

The provider-family rewrite includes a direct implementation read across the provider owners, not only a metadata coverage pass:

  • OpenAI-compatible, OpenRouter, Perplexity, Cohere, Bedrock, Gemini, Anthropic, Codex, and Ollama chat builders were checked for request shape, stream event flattening, usage normalization, cache handling, and error classification.
  • Provider-native cache logic stays provider-specific: Anthropic uses native cache_control, OpenRouter uses envelope markers where upstream Claude/Qwen cache semantics apply, Gemini accepts explicit cachedContent, Bedrock uses Converse cachePoint, and unsupported providers explicitly report no cache strategy.
  • Cost attribution is wired in the public drivers for chat, embeddings, rerank, image generation, transcription, and TTS. Unknown pricing remains explicit and is not treated as zero.
  • Stop-sequence handling is pinned across Anthropic, Gemini, Cohere, Bedrock, and Ollama so a single string is one stop sequence rather than a vector of characters.
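
The stop-sequence pitfall in the last bullet is easy to reproduce in Clojure, because a string is seqable and `vec` splits it into characters. The helper below is a minimal sketch with a hypothetical name, not the SDK's actual normalizer.

```clojure
(defn normalize-stop
  "Coerces a stop option into a vector of stop sequences.
  A single string becomes a one-element vector, never a vector of characters."
  [stop]
  (cond
    (nil? stop)        []
    (string? stop)     [stop]
    (sequential? stop) (vec stop)
    :else (throw (ex-info "Unsupported stop value" {:stop stop}))))

(normalize-stop "END")       ;; => ["END"]
(normalize-stop ["a" "b"])   ;; => ["a" "b"]
(normalize-stop nil)         ;; => []

;; The bug being guarded against: calling vec directly on a string.
(vec "END")                  ;; => [\E \N \D]
```

Checking `string?` before any sequence coercion is the key ordering; reversing the branches would be harmless here only because `sequential?` is false for strings, but a looser predicate such as `seqable?` would reintroduce the bug.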
