
benchgecko.core

Clojure SDK for BenchGecko — compare LLM benchmarks, estimate inference
costs, and explore AI model performance data across providers.

Models are represented as maps with the following keys:
  :name           - Model identifier (e.g., "gpt-4o")
  :provider       - Provider name (e.g., "OpenAI")
  :context-window - Maximum context in tokens (optional)
  :scores         - Map of benchmark category keywords to scores (0-100)
  :pricing        - Map with :input-per-mtok and :output-per-mtok in USD (optional)

Benchmark categories:
  :reasoning :coding :knowledge :instruction :multilingual
  :safety :long-context :vision :agentic
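For reference, a complete model map following this schema might look like the following (the name, scores, and prices are taken from the make-model example below and are illustrative, not real benchmark data):

```clojure
;; An illustrative model map; score and price values are examples only.
(def example-model
  {:name           "gpt-4o"
   :provider       "OpenAI"
   :context-window 128000
   :scores         {:reasoning 92.3 :coding 89.1}
   :pricing        {:input-per-mtok 2.50 :output-per-mtok 10.00}})
```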

average-score (clj)

(average-score model)

Calculate the average benchmark score for a model.
Returns nil if the model has no scores.
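A minimal sketch consistent with the documented behavior (the library's actual implementation may differ):

```clojure
;; Returns the mean of all benchmark scores, or nil when :scores is
;; empty or absent.
(defn average-score [model]
  (when-let [scores (seq (vals (:scores model)))]
    (/ (reduce + scores) (count scores))))
```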

benchmark-categories (clj)

All benchmark categories tracked by BenchGecko.


best-value (clj)

(best-value models)

Find the model with the best value score (performance per dollar).
Returns nil if no models have both pricing and scores.
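A sketch of the selection logic. The helper `value-score*` is a hypothetical inlined stand-in for this namespace's value-score, using a simple mean of the two per-mtok prices (an assumption; the library may blend prices differently):

```clojure
;; value-score*: simplified value metric so the example is self-contained.
(defn- value-score* [model]
  (when-let [{:keys [input-per-mtok output-per-mtok]} (:pricing model)]
    (when-let [scores (seq (vals (:scores model)))]
      (/ (/ (reduce + scores) (count scores))
         (/ (+ input-per-mtok output-per-mtok) 2.0)))))

;; Keep only models with a value score, then take the maximum.
(defn best-value [models]
  (when-let [scored (seq (filter value-score* models))]
    (apply max-key value-score* scored)))
```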

compare-models (clj)

(compare-models model-a model-b)

Compare two models across all mutually-scored benchmark categories.
Returns a map with:
  :model-a  - First model
  :model-b  - Second model
  :deltas   - Map of category to score difference (positive = A leads)
  :a-wins   - Categories where model A scores higher
  :b-wins   - Categories where model B scores higher
  :ties     - Categories with identical scores
  :winner   - The model with higher average score (ties favor A)

(compare-models gpt4 claude)
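The return shape above can be sketched as follows. Averaging over the mutually-scored categories for :winner is an assumption; the documentation only specifies "higher average score (ties favor A)":

```clojure
;; Sketch of compare-models; the library's exact averaging rule may differ.
(defn compare-models [model-a model-b]
  (let [cats   (filter #(contains? (:scores model-b) %)
                       (keys (:scores model-a)))
        deltas (into {}
                     (for [c cats]
                       [c (- (get-in model-a [:scores c])
                             (get-in model-b [:scores c]))]))
        ;; Average over mutually-scored categories (assumption).
        avg    (fn [m]
                 (when (seq cats)
                   (/ (reduce + (map (:scores m) cats)) (count cats))))]
    {:model-a model-a
     :model-b model-b
     :deltas  deltas
     :a-wins  (vec (keep (fn [[c d]] (when (pos? d) c)) deltas))
     :b-wins  (vec (keep (fn [[c d]] (when (neg? d) c)) deltas))
     :ties    (vec (keep (fn [[c d]] (when (zero? d) c)) deltas))
     ;; >= makes ties favor model A, per the docstring.
     :winner  (if (>= (or (avg model-a) 0) (or (avg model-b) 0))
                model-a
                model-b)}))
```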

estimate-cost (clj)

(estimate-cost model input-tokens output-tokens)

Estimate inference cost in USD for a given number of input and output tokens.
Returns nil if the model has no pricing information.

(estimate-cost gpt4 5000 2000)  ;=> 0.0325
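A minimal sketch of the cost arithmetic; with the gpt-4o pricing from the make-model example (2.50 in / 10.00 out per mtok), 5000 input and 2000 output tokens reproduce the documented 0.0325:

```clojure
;; Prices are per million tokens, so each count is scaled by 1e6.
;; Returns nil when the model has no :pricing map.
(defn estimate-cost [model input-tokens output-tokens]
  (when-let [{:keys [input-per-mtok output-per-mtok]} (:pricing model)]
    (+ (* (/ input-tokens 1e6) input-per-mtok)
       (* (/ output-tokens 1e6) output-per-mtok))))
```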

estimate-monthly (clj)

(estimate-monthly model daily-requests avg-input-tokens avg-output-tokens)

Estimate monthly cost assuming a daily request volume.

(estimate-monthly gpt4 1000 3000 1000)  ;=> monthly USD cost
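A sketch assuming a 30-day month; the day count the library actually uses is not documented here:

```clojure
;; Per-request cost (per-mtok prices scaled by 1e6), multiplied by
;; daily volume and an assumed 30 days.
(defn estimate-monthly [model daily-requests avg-input-tokens avg-output-tokens]
  (when-let [{:keys [input-per-mtok output-per-mtok]} (:pricing model)]
    (* 30 daily-requests
       (+ (* (/ avg-input-tokens 1e6) input-per-mtok)
          (* (/ avg-output-tokens 1e6) output-per-mtok)))))
```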

filter-by-tier (clj)

(filter-by-tier models tier)

Filter models by performance tier.

(filter-by-tier models :S)  ;=> sequence of S-tier models
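A sketch of the filter. The helper `model-tier*` is a hypothetical inlined stand-in for this namespace's model-tier, using the tier thresholds documented under model-tier below, so the example is self-contained:

```clojure
;; model-tier*: classify by average score per the documented thresholds.
(defn- model-tier* [model]
  (when-let [scores (seq (vals (:scores model)))]
    (let [avg (/ (reduce + scores) (count scores))]
      (cond
        (>= avg 90) :S
        (>= avg 80) :A
        (>= avg 70) :B
        (>= avg 60) :C
        :else       :D))))

;; Keep only models whose tier matches.
(defn filter-by-tier [models tier]
  (filter #(= tier (model-tier* %)) models))
```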

get-model (clj)

(get-model models name)

Look up a model by name from a collection of models.
Returns the first model matching the given name, or nil.
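A minimal sketch consistent with the documented behavior:

```clojure
;; Linear scan; returns the first model whose :name matches, else nil.
(defn get-model [models name]
  (first (filter #(= name (:name %)) models)))
```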

make-model (clj)

(make-model name provider & {:keys [context-window scores pricing]})

Create a model map with required name and provider, plus optional fields.

(make-model "gpt-4o" "OpenAI"
  :context-window 128000
  :scores {:reasoning 92.3 :coding 89.1}
  :pricing {:input-per-mtok 2.50 :output-per-mtok 10.00})
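A sketch matching the signature above; omitting optional keys entirely (rather than storing nils) is an assumption about the library's behavior:

```clojure
;; Optional fields are only attached when supplied.
(defn make-model [name provider & {:keys [context-window scores pricing]}]
  (cond-> {:name name :provider provider}
    context-window (assoc :context-window context-window)
    scores         (assoc :scores scores)
    pricing        (assoc :pricing pricing)))
```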

model-summary (clj)

(model-summary model)

Generate a human-readable summary string for a model.

(model-summary gpt4)
;=> "gpt-4o (OpenAI) [S-Tier] avg=90.0 value=7.2"

model-tier (clj)

(model-tier model)

Classify a model into a performance tier (:S :A :B :C :D).
Returns nil if the model has no scores.

| Tier | Average Score | Description          |
|------|---------------|----------------------|
| :S   | 90+           | Elite frontier       |
| :A   | 80-89         | Strong general       |
| :B   | 70-79         | Capable mid-range    |
| :C   | 60-69         | Budget / older gen   |
| :D   | <60           | Entry-level / legacy |
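The tier table maps directly onto a cond over the average score; a minimal sketch (the library's actual implementation may differ):

```clojure
;; Average the benchmark scores, then classify by the documented bands.
;; Returns nil when the model has no scores.
(defn model-tier [model]
  (when-let [scores (seq (vals (:scores model)))]
    (let [avg (/ (reduce + scores) (count scores))]
      (cond
        (>= avg 90) :S
        (>= avg 80) :A
        (>= avg 70) :B
        (>= avg 60) :C
        :else       :D))))
```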

rank-by-category (clj)

(rank-by-category models category)

Rank models by score in a specific benchmark category (descending).
Models without a score in that category are excluded.

(rank-by-category models :coding)
;=> [{:model claude :score 93.7} {:model gpt4 :score 89.1}]
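A sketch of the ranking, consistent with the documented exclusion and descending order:

```clojure
;; Pair each scored model with its category score, drop unscored
;; models, and sort highest first.
(defn rank-by-category [models category]
  (->> models
       (keep (fn [m]
               (when-let [score (get-in m [:scores category])]
                 {:model m :score score})))
       (sort-by :score >)))
```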

value-score (clj)

(value-score model)

Compute value score: average benchmark performance per dollar of blended
token price. Higher is better. Returns nil if pricing or scores are missing.
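A sketch of the ratio. How the input and output prices are blended is not documented here; this sketch assumes a simple mean of the two per-mtok prices, which may not match the library's weighting:

```clojure
;; Average score divided by blended price (mean of input/output
;; per-mtok prices -- an assumption). Returns nil without both
;; :pricing and :scores.
(defn value-score [model]
  (when-let [{:keys [input-per-mtok output-per-mtok]} (:pricing model)]
    (when-let [scores (seq (vals (:scores model)))]
      (/ (/ (reduce + scores) (count scores))
         (/ (+ input-per-mtok output-per-mtok) 2.0)))))
```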
