A Clojure port of the popular LiteLLM library, providing a unified interface for multiple LLM providers with comprehensive observability and thread pool management.
| Provider | Status | Models | Function Calling | Streaming |
|---|---|---|---|---|
| OpenAI | ✅ Supported | GPT-3.5-Turbo, GPT-4, GPT-4o | ✅ | ✅ |
| Anthropic | ✅ Supported | Claude 3 (Opus, Sonnet, Haiku), Claude 2.x | ❌ | ✅ |
| OpenRouter | ✅ Supported | All OpenRouter models | ✅ | ✅ |
| Azure OpenAI | 🔄 Planned | - | - | - |
| Google Gemini | ✅ Supported | Gemini Pro, Gemini Pro Vision, Gemini Ultra | ❌ | ✅ |
| Cohere | 🔄 Planned | Command | - | - |
| Hugging Face | 🔄 Planned | Various open models | - | - |
| Mistral | 🔄 Planned | Mistral, Mixtral | - | - |
| Ollama | 🔄 Planned | Local models | - | - |
| Together AI | 🔄 Planned | Various open models | - | - |
| Replicate | 🔄 Planned | Various open models | - | - |
Add to your `deps.edn`:

```clojure
{:deps {tech.unravel/litellm-clj {:mvn/version "0.2.0"}}}
```
Or, with Leiningen, add to your `project.clj`:

```clojure
[tech.unravel/litellm-clj "0.2.0"]
```
```clojure
(require '[litellm.core :as litellm])

;; Start the LiteLLM system
(def system (litellm/start-system {:telemetry {:enabled true}
                                   :thread-pools {:io-pool-size 10
                                                  :cpu-pool-size 4}}))

;; Make a completion request
(def response @(litellm/completion system
                                   {:model "gpt-3.5-turbo"
                                    :messages [{:role "user" :content "Hello, how are you?"}]}))

;; Access the response
(println (-> response :choices first :message :content))

;; Stop the system when done
(litellm/stop-system system)
```
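Because the result of `completion` is dereferenced with `@` throughout these examples, you can also guard against a slow or hung provider with the three-argument `deref`. A minimal sketch, assuming the returned value supports a timed deref (the 30-second timeout and `::timeout` sentinel are illustrative choices, not library defaults):

```clojure
;; Wait at most 30 seconds, falling back to a sentinel value on timeout.
(let [result (deref (litellm/completion system
                                        {:model "gpt-3.5-turbo"
                                         :messages [{:role "user" :content "Hello!"}]})
                    30000 ::timeout)]
  (if (= result ::timeout)
    (println "Request timed out")
    (println (-> result :choices first :message :content))))
```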
```clojure
(require '[litellm.core :as litellm])

(def system (litellm/start-system {}))

;; Simple completion
(def response @(litellm/completion system
                                   {:model "gpt-3.5-turbo"
                                    :messages [{:role "user" :content "Explain quantum computing"}]
                                    :max_tokens 100}))
```
```clojure
;; Stream responses for better UX
(litellm/completion system
                    {:model "gpt-4"
                     :messages [{:role "user" :content "Write a poem"}]
                     :stream true}
                    {:on-chunk (fn [chunk]
                                 (print (-> chunk :choices first :delta :content))
                                 (flush)) ;; flush so tokens appear as they arrive
                     :on-complete (fn [response]
                                    (println "\nStream complete!"))
                     :on-error (fn [error]
                                 (println "Error:" error))})
```
Function calling lets the model request that you invoke one of your declared functions:

```clojure
(def response @(litellm/completion system
                                   {:model "gpt-4"
                                    :messages [{:role "user" :content "What's the weather in Boston?"}]
                                    :functions [{:name "get_weather"
                                                 :description "Get the current weather"
                                                 :parameters {:type "object"
                                                              :properties {:location {:type "string"
                                                                                      :description "City name"}}
                                                              :required ["location"]}}]}))
```
Set your OpenAI API key as an environment variable:

```bash
export OPENAI_API_KEY=your-api-key-here
```

```clojure
(litellm/completion system
                    {:model "gpt-4"
                     :messages [{:role "user" :content "Hello!"}]})
```
For Anthropic, set your API key:

```bash
export ANTHROPIC_API_KEY=your-api-key-here
```

```clojure
(litellm/completion system
                    {:model "claude-3-opus-20240229"
                     :messages [{:role "user" :content "Hello Claude!"}]
                     :max_tokens 1024})
```
OpenRouter provides access to multiple LLM providers through a single API:

```bash
export OPENROUTER_API_KEY=your-api-key-here
```

```clojure
;; Use OpenAI models via OpenRouter
(litellm/completion system
                    {:model "openai/gpt-4"
                     :messages [{:role "user" :content "Hello!"}]})

;; Use Anthropic models via OpenRouter
(litellm/completion system
                    {:model "anthropic/claude-3-opus"
                     :messages [{:role "user" :content "Hello!"}]})

;; Use Meta models via OpenRouter
(litellm/completion system
                    {:model "meta-llama/llama-2-70b-chat"
                     :messages [{:role "user" :content "Hello!"}]})
```
For Google Gemini, set your API key:

```bash
export GEMINI_API_KEY=your-api-key-here
```

```clojure
;; Use Gemini Pro
(litellm/completion system
                    {:model "gemini-pro"
                     :messages [{:role "user" :content "Explain quantum computing"}]})

;; Use Gemini Pro Vision with images
(litellm/completion system
                    {:model "gemini-pro-vision"
                     :messages [{:role "user"
                                 :content [{:type "text" :text "What's in this image?"}
                                           {:type "image_url" :image_url {:url "https://..."}}]}]})

;; Configure safety settings and generation params
(litellm/completion system
                    {:model "gemini-pro"
                     :messages [{:role "user" :content "Write a story"}]
                     :temperature 0.9
                     :top_p 0.95
                     :top_k 40
                     :max_tokens 1024})
```
Run Ollama locally and use local models:

```clojure
(litellm/completion system
                    {:model "ollama/llama2"
                     :messages [{:role "user" :content "Hello!"}]
                     :api_base "http://localhost:11434"})
```
```clojure
(def system (litellm/start-system
             {:telemetry {:enabled true          ;; Enable observability
                          :metrics-interval 60}  ;; Metrics collection interval (seconds)
              :thread-pools {:io-pool-size 10    ;; Thread pool for I/O operations
                             :cpu-pool-size 4}   ;; Thread pool for CPU-bound tasks
              :cache {:enabled true              ;; Enable response caching
                      :ttl 3600}                 ;; Cache TTL in seconds
              :retry {:max-attempts 3            ;; Max retry attempts
                      :backoff-ms 1000}}))       ;; Initial backoff delay (ms)
```
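Since `stop-system` releases the thread pools, it is worth guaranteeing it runs even when the process exits abnormally. A minimal sketch using a JVM shutdown hook (plain Java interop; nothing here is litellm-clj-specific):

```clojure
;; Ensure stop-system runs when the JVM exits.
(.addShutdownHook (Runtime/getRuntime)
                  (Thread. #(litellm/stop-system system)))
```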
{:model "gpt-4" ;; Model identifier
:messages [{:role "user" ;; Conversation messages
:content "Hello"}]
:max_tokens 100 ;; Maximum tokens to generate
:temperature 0.7 ;; Sampling temperature (0.0-2.0)
:top_p 1.0 ;; Nucleus sampling
:n 1 ;; Number of completions
:stream false ;; Enable streaming
:stop ["\n"] ;; Stop sequences
:presence_penalty 0.0 ;; Presence penalty (-2.0 to 2.0)
:frequency_penalty 0.0 ;; Frequency penalty (-2.0 to 2.0)
:user "user-123"} ;; User identifier for tracking
```clojure
;; Check system health
(litellm/health-check system)
;; => {:status :healthy
;;     :providers {:openai :ready
;;                 :anthropic :ready}
;;     :thread-pools {:io-pool {:active 2 :size 10}
;;                    :cpu-pool {:active 0 :size 4}}}

;; Get a cost estimate for a request
(def cost-info (litellm/estimate-cost
                {:model "gpt-4"
                 :messages [{:role "user" :content "Hello"}]}))
;; => {:estimated-tokens 50
;;     :estimated-cost-usd 0.0015}
```
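Cost estimates can gate expensive requests before they are sent. A sketch assuming `estimate-cost` returns the map shown above (the 0.01 USD budget is an arbitrary example):

```clojure
;; Only send the request if the estimated cost is within budget.
(defn completion-within-budget
  [system request max-usd]
  (let [{:keys [estimated-cost-usd]} (litellm/estimate-cost request)]
    (if (<= estimated-cost-usd max-usd)
      @(litellm/completion system request)
      {:error :over-budget :estimated-cost-usd estimated-cost-usd})))

(completion-within-budget system
                          {:model "gpt-4"
                           :messages [{:role "user" :content "Hello"}]}
                          0.01)
```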
This project is licensed under the MIT License - see the LICENSE file for details.