Liking cljdoc? Tell your friends :D

postgres-extras-clj

A Clojure toolbox for inspecting and diagnosing PostgreSQL databases.

com.github.perrygeo/postgres-extras-clj {:mvn/version "0.1.23"}

🐘 Motivation

PostgreSQL is a fantastic database but using it in production requires some care, as do all databases. The postgres system catalogs allow us to monitor things like query performance, connection management, index efficiency, disk usage, and MVCC bloat. But accessing that information requires some arcane knowledge and reasonably hardcore SQL skills.

This project was inspired by the rustprooflabs pgdd extension and by the phoenix web framework which ships with a developer-centric, postgres-specific dashboard based on ecto_psql_extras. Both projects demonstrate that high-level database tooling can, and probably should, be built on top of the system catalogs. Understanding your database isn't optional.

postgres-extras-clj provides this missing toolkit for Clojure developers.

The SQL logic lives in resources/sql/*.sql and is formatted with pgFormat for consistency. SQL is annotated with HugSQL comments to turn them into clojure fns via macro magic. Instead of a web interface or postgres extension, postgres-extras-clj.core provides a clojure namespace with a few dozen useful functions that query system tables and return diagnostics as plain data structures. The goal is to run with limited, SELECT-only privileges of system schemas and tables (with a few noted exceptions).

All you need is a JDBC connection and a REPL.

📚 Data Dictionary

The data dictionary functionality is based on pgdd. These include COMMENTS and are helpful for understanding the structure of your database, from a data modeling lens.

Function	Scenario
`columns`	List all database column objects
`databases`	List all databases
`functions`	List all function objects in current database
`indexes`	List all index objects in current database
`schemas`	List all shemas in current database
`partition-children`	List all child partitions in current database
`partition-parents`	List all parent partition tables in current database
`tables`	List all table objects in current database
`views`	List all view objects in current database

To get a full map of data objects, use (read-data-dictionary db) which returns a map, with keywords mirroring the above functions.

🛠️ Operational Diagnostics

Diagnostic stats based on ecto_psql_extras. These are valuable for looking at your database through an operations or DBA lens.

Function	Scenario
`all-locks`	Queries with active locks
`bloat`	Table and index "bloat" in your database ordered by most wasteful
`blocking`	Queries holding locks other queries are waiting to be released
`cache-hit`	Index and table hit rate
`calls`	Queries that have the highest frequency of execution
`connections`	Returns the list of all active database connections
`db-settings`	Values of selected PostgreSQL settings
`duplicate-indexes`	Multiple indexes that have the same set of columns, same opclass, expression and predicate
`extensions`	Available and installed extensions
`health-check`	Checks the db for liveliness
`index-cache-hit`	Calculates your cache hit rate for reading indexes
`index-size`	The size of indexes, descending by size
`index-usage`	Index hit rate (effective databases are at 99% and up)
`kill-all!`	Kill all the active database connections
`locks`	Queries with active exclusive locks
`long-running-queries`	All queries longer than the threshold by descending duration
`null-indexes`	Find indexes with a high ratio of NULL values
`outliers`	Queries that have longest execution time in aggregate.
`records-rank`	All tables and the number of rows in each ordered by number of rows descending
`seq-scans`	Count of sequential scans by table descending by order
`table-cache-hit`	Calculates your cache hit rate for reading tables
`table-indexes-size`	Total size of all the indexes on each table, descending by size
`table-size`	Size of the tables (excluding indexes), descending by size
`total-index-size`	Total size of all indexes in MB
`total-table-size`	Size of the tables (including indexes), descending by size
`unused-indexes`	Unused and almost unused indexes
`vacuum-stats`	Dead rows and whether an automatic vacuum is expected to be triggered

To get a full map of diagnostic stats, use (read-stats db) which returns a map, with keywords mirroring the above functions.

Use the (diagnose (read-stats db)) and (diagnose-warnings (read-stats db)) functions to evaluate the stats according to a set of heuristics.

Usage

Check out the examples if you're looking to create a fresh namespace.

The following is a REPL demonstration of postgres-extras-clj with the next.jdbc adapter. Run clj -M:dev then evaluate the following forms

(require '[postgres-extras-clj.core :as pgex] :reload-all)
(require '[hugsql.core :as hugsql])
(require '[hugsql.adapter.next-jdbc :as next-adapter])
(require '[next.jdbc :as jdbc])

(def db
  (jdbc/get-datasource
   "jdbc:postgresql://localhost:5432/main?user=postgres&password=password"))

(hugsql/set-adapter! (next-adapter/hugsql-adapter-next-jdbc))

For this example, wer'e using a next.jdbc datasource but there are other options, see the HugSQL Adapters documentation.

Do a quick health check

(pgex/health-check db)
; {:now #inst "2024-07-05T18:04:57.506678000-00:00",
;  :version "PostgreSQL 16.1 (Debian 16.1-1.pgdg110+1) on x86_64-pc-linux-gnu..."}

Generate a data dictionary summarizing all major objects in your database.

(def dd (pgex/read-data-dictionary db))
(keys dd)
; (:databases
;  :columns
;  :functions
;  :indexes
;  :schemas
;  :tables
;  :views 
;  :partition-children
;  :partition-parents)

(rand-nth (:tables dd))
; {:size_pretty "16 kB",
;  :description nil,
;  :owned_by "postgres",
;  :size_plus_indexes "48 kB",
;  :rows 1,
;  :oid 19789,
;  :data_type "table",
;  :size_plus_indexes_bytes 49152,
;  :s_name "public",
;  :system_object false,
;  :t_name "users",
;  :size_bytes 16384,
;  :bytes_per_row 16384}

Create a full map of diagnostic stats.

(def stats (pgex/read-stats db))
(keys stats)
; (:duplicate-indexes
;  :db-settings
;  :locks
;  :vacuum-stats
;  :index-usage
;  :total-index-size
;  :cache-hit
;  :health-check
;  :records-rank
;  :null-indexes
;  :index-cache-hit
;  :all-locks
;  :outliers
;  :long-running-queries
;  :extensions
;  :total-table-size
;  :unused-indexes
;  :bloat
;  :calls
;  :table-size
;  :connections
;  :table-cache-hit
;  :table-indexes-size
;  :blocking
;  :seq-scans
;  :index-size)

(rand-nth (:connections stats))
; {:username "postgres"
;  :client_address "172.22.0.1/32"
;  :application_name "psql"}

All of the stats and data dictionary keywords mirror the name of a public function in the postgres-extras-clj.core namespace so you can invoke them selectively, instead of getting them from the full map.

(rand-nth (pgex/tables db))
; {:size_pretty "16 kB",
;  :description nil,
;  :owned_by "postgres",
;  :size_plus_indexes "48 kB",
;  :rows 1,
;  :oid 19789,
;  :data_type "table",
;  :size_plus_indexes_bytes 49152,
;  :s_name "public",
;  :system_object false,
;  :t_name "users",
;  :size_bytes 16384,
;  :bytes_per_row 16384}

(rand-nth (pgex/connections db))
; {:username "postgres"
;  :client_address "172.22.0.1/32"
;  :application_name "psql"}

Read stats and print the default diagnostics:

;; warnings only
(doseq [w (pgex/diagnose-warnings (pgex/read-stats db))]
  (println (:message w)))

The default-diagnostic-fns can be overridden. To create your own diagnostics:

(def unrealistic-expectations
  {:table-cache-hit
   {:pred #(> (:ratio %) 0.999)
    :onfalse "The cache hit ratio is not as insanely high as I'd like."
    :idfn :name}})

(doseq [w (pgex/diagnose-warnings
           (pgex/read-stats db)
           :diagnostic-fns unrealistic-expectations)]
  (println (:message w)))

; ! Warning :table-cache-hit, message_topics, The cache hit ratio is not as insanely high as I'd like.
; {:ratio 0.9806201550387597, :schema "public", :name "message_topics", :buffer_hits 253, :block_reads 5, :total_read 258}
; ... many more

Development

Test runner with coverage

clj -X:test

Run NREPL and interactive terminal REPL in one

clj -M:dev

Build a jar. Output in ./target/com.github.perrygeo/postgres-extras-clj-*.jar

clj -T:build jar

Deploy to Clojars. Set CLOJARS_USERNAME and CLOJARS_PASSWORD env vars. Assumes that clj -T:build jar has already been run.

clj -T:build deploy

License

The credit for the SQL query logic goes to the fantastic work done by these three projects:

Their licenses (all MIT) are included in the SQL files.

Can you improve this documentation? These fine people already did:
Matthew Perry & Matthew T. PerryEdit on GitHub

cljdoc is a website building & hosting documentation for Clojure/Script libraries

Keyboard shortcuts Report a problem cljdoc on GitHub

× close