Datahike Database Configuration

Datahike is highly configurable to support different deployment models and use cases. Configuration is set at database creation and cannot be changed afterward (though data can be migrated to a new configuration).

Configuration Methods

Datahike uses the environ library for configuration, supporting three methods:

  1. Environment variables (lowest priority)
  2. Java system properties (middle priority)
  3. Configuration map argument (highest priority - overrides the others)

This allows flexible deployment: hardcode a config map in development, use environment variables in containers, or set Java properties in production JVMs.
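
For illustration, here is the same backend choice expressed through each method; a sketch in which the file path and app.jar are placeholders, and the names follow the property table below:

# Environment variable (lowest priority)
export DATAHIKE_STORE_BACKEND=file

# Java system property (middle priority)
java -Ddatahike.store.backend=file -jar app.jar

;; Configuration map (highest priority; assumes (require '[datahike.api :as d]))
(d/create-database {:store {:backend :file :path "/var/db/datahike"}})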

Basic Configuration

The minimal configuration map includes:

{:store              {:backend :memory      ;keyword - storage backend
                      :id #uuid "550e8400-e29b-41d4-a716-446655440020"} ;UUID - database identifier
 :name               nil                    ;string - optional database name (auto-generated if nil)
 :schema-flexibility :write                 ;keyword - :read or :write
 :keep-history?      true                   ;boolean - enable time-travel queries
 :attribute-refs?    false                  ;boolean - use entity IDs for attributes (Datomic-compatible)
 :index              :datahike.index/persistent-set  ;keyword - index implementation
 :store-cache-size   1000                   ;number - store cache entries
 :search-cache-size  10000}                 ;number - search cache entries

Quick start with defaults (in-memory database):

(require '[datahike.api :as d])
(d/create-database)  ;; Creates memory DB with sensible defaults
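
From there you connect and use the database as usual; a minimal end-to-end sketch with an explicit config map (the UUID is an arbitrary example):

(require '[datahike.api :as d])

(def cfg {:store {:backend :memory
                  :id #uuid "550e8400-e29b-41d4-a716-446655440099"}})

(d/create-database cfg)
(def conn (d/connect cfg))

;; The default :schema-flexibility is :write, so define schema before use
(d/transact conn [{:db/ident :name
                   :db/valueType :db.type/string
                   :db/cardinality :db.cardinality/one}])
(d/transact conn [{:name "Alice"}])

(d/q '[:find ?n :where [?e :name ?n]] @conn)
;; => #{["Alice"]}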

Storage Backends

Datahike supports multiple storage backends via konserve. The choice of backend determines durability, scalability, and deployment model.

Built-in backends:

  • :memory - In-memory (ephemeral)
  • :file - File-based persistent storage

External backend libraries:

  • LMDB - High-performance local storage
  • JDBC - PostgreSQL, MySQL, H2
  • Redis - High write throughput
  • S3 - AWS cloud storage
  • GCS - Google Cloud storage
  • DynamoDB - AWS NoSQL
  • IndexedDB - Browser storage

For detailed backend selection guidance, see Storage Backends Documentation.

Environment Variable Configuration

When using environment variables or Java system properties, name them like:

Java system property           Environment variable
datahike.store.backend         DATAHIKE_STORE_BACKEND
datahike.store.username        DATAHIKE_STORE_USERNAME
datahike.schema.flexibility    DATAHIKE_SCHEMA_FLEXIBILITY
datahike.keep.history          DATAHIKE_KEEP_HISTORY
datahike.attribute.refs        DATAHIKE_ATTRIBUTE_REFS
datahike.name                  DATAHIKE_NAME

etc.

Note: Omit the leading : from keyword values in environment variables; it is added automatically.
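
For example, to select :read schema flexibility via the environment, write the bare value:

DATAHIKE_SCHEMA_FLEXIBILITY=read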

Backend Configuration Examples

Memory (Built-in)

Ephemeral storage for testing and development:

{:store {:backend :memory
         :id #uuid "550e8400-e29b-41d4-a716-446655440021"}}

Environment variables:

DATAHIKE_STORE_BACKEND=memory
DATAHIKE_STORE_CONFIG='{:id #uuid "550e8400-e29b-41d4-a716-446655440021"}'

File (Built-in)

Persistent local file storage:

{:store {:backend :file
         :path "/var/db/datahike"}}

Environment variables:

DATAHIKE_STORE_BACKEND=file
DATAHIKE_STORE_CONFIG='{:path "/var/db/datahike"}'

LMDB (External Library)

High-performance local storage via datahike-lmdb:

{:store {:backend :lmdb
         :path "/var/db/datahike-lmdb"}}

JDBC (External Library)

PostgreSQL or other JDBC databases via datahike-jdbc:

{:store {:backend :jdbc
         :dbtype "postgresql"
         :host "db.example.com"
         :port 5432
         :dbname "datahike"
         :user "datahike"
         :password "secret"}}
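
External backend libraries must be on the classpath and their namespace loaded so the backend key registers itself. For datahike-jdbc this is typically the following (namespace name per that library's README; verify against your version):

;; Loads and registers the :jdbc backend
(require '[datahike-jdbc.core])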

S3 (External Library)

AWS S3 storage via konserve-s3:

{:store {:backend :s3
         :bucket "my-datahike-bucket"
         :region "us-east-1"}}

TieredStore (Composable)

Memory hierarchy (e.g., Memory → IndexedDB for browsers):

{:store {:backend :tiered
         :id #uuid "550e8400-e29b-41d4-a716-446655440022"
         :frontend-config {:backend :memory
                          :id #uuid "550e8400-e29b-41d4-a716-446655440022"}
         :backend-config {:backend :indexeddb
                         :name "persistent-db"
                         :id #uuid "550e8400-e29b-41d4-a716-446655440022"}}}
         ;; All :id values must match for konserve validation

For complete backend options and selection guidance, see Storage Backends.

Core Configuration Options

Database Name

Optional identifier for the database. Auto-generated if not specified. Useful when running multiple databases:

{:name "production-db"
 :store {:backend :file :path "/var/db/prod"}}

Schema Flexibility

Controls when schema validation occurs:

  • :write (default): Strict schema—attributes must be defined before use. Catches errors early.
  • :read: Schema-less—accept any data, validate on read. Flexible for evolving data models.

{:schema-flexibility :read}  ;; Allow any data structure

With :read flexibility, you can still define critical schema like :db/unique, :db/cardinality, or :db.type/ref where needed.
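
A sketch of this mixed approach (the UUID and attribute names are illustrative):

(def cfg {:store {:backend :memory
                  :id #uuid "550e8400-e29b-41d4-a716-446655440098"}
          :schema-flexibility :read})

(d/create-database cfg)
(def conn (d/connect cfg))

;; Define only the constraints that matter
(d/transact conn [{:db/ident :email
                   :db/valueType :db.type/string
                   :db/unique :db.unique/identity
                   :db/cardinality :db.cardinality/one}])

;; Everything else can be transacted without prior schema
(d/transact conn [{:email "alice@example.com" :nickname "ali"}])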

See Schema Documentation for details.

Time-Travel Queries

Enable historical query capabilities:

{:keep-history? true}  ;; Default: true

When enabled, use history, as-of, and since to query past states:

(d/q '[:find ?e :where [?e :name "Alice"]] (d/as-of db #inst "2024-01-01"))
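
history and since follow the same pattern; a quick sketch (history databases expose the transaction and added flag as extra pattern elements):

;; Every assertion and retraction ever recorded for :name
(d/q '[:find ?v ?added
       :where [?e :name ?v ?tx ?added]]
     (d/history db))

;; Only facts added after a point in time
(d/q '[:find ?e :where [?e :name "Alice"]]
     (d/since db #inst "2024-01-01"))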

Disable if: You never need historical queries and want to save storage space.

See Time Variance Documentation for time-travel query examples.

Attribute References

Store attributes as entity IDs (integers) instead of keywords in datoms for performance and Datomic compatibility:

{:attribute-refs? true}  ;; Default: false

How it works:

Without attribute references (default):

;; Datoms store attribute keywords directly
#datahike/Datom [1 :name "Alice" 536870913 true]

With attribute references enabled:

;; Datoms store attribute entity IDs (integers)
#datahike/Datom [1 73 "Alice" 536870913 true]  ;; where 73 is the entity ID for :name

Benefits:

  • Better performance: Integer comparisons are significantly faster than keyword comparisons, especially with many attributes
  • Datomic compatibility: Matches Datomic's internal representation for easier migration
  • Attributes as entities: Attributes become queryable entities in the database
  • Recommended for production: Generally beneficial unless you have specific reasons to use keywords

Considerations:

  • Must use :schema-flexibility :write (not compatible with :read)
  • Requires ID ↔ keyword mapping (maintained automatically)
  • System schema is bootstrapped into the index on database creation
  • You still use keyword syntax in queries and transactions - translation is automatic

Example:

;; Create database with attribute references
(def cfg {:store {:backend :memory
                  :id #uuid "550e8400-e29b-41d4-a716-446655440000"}
          :attribute-refs? true
          :schema-flexibility :write})

(d/create-database cfg)
(def conn (d/connect cfg))

;; Use normal keyword syntax in transactions and queries
(d/transact conn [{:db/ident :name
                   :db/valueType :db.type/string
                   :db/cardinality :db.cardinality/one}])

(d/transact conn [{:name "Alice"}])

;; Queries use keywords as usual - translation happens automatically
(d/q '[:find ?n :where [?e :name ?n]] @conn)
;; => #{["Alice"]}

;; But internally, datoms store integer attribute IDs for performance

When to use:

  • Use :attribute-refs? true for production databases (recommended for performance)
  • Use :attribute-refs? false only if you need :schema-flexibility :read or have specific compatibility requirements

Index Selection

Choose the underlying index implementation:

{:index :datahike.index/persistent-set}  ;; Default (recommended)

Available indexes:

  • :datahike.index/persistent-set - Default, actively maintained, supports all features
  • :datahike.index/hitchhiker-tree - Legacy, requires explicit library and namespace loading

Most users should use the default. Hitchhiker-tree is maintained for backward compatibility with existing databases.

Advanced Configuration

Single-Writer Model (Distributed Access)

For distributed deployments, configure a single writer to handle all transactions while readers access storage directly via the Distributed Index Space (DIS).

HTTP Server Writer

{:store {:backend :file :path "/shared/db"}
 :writer {:backend :datahike-server
          :url "http://writer.example.com:4444"
          :token "secure-token"}}

Clients connect and transact through the HTTP server. Reads happen locally from shared storage.
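
From the client's perspective the API is unchanged; assuming the map above is bound to cfg (the URL and token are placeholders), usage looks like:

(def conn (d/connect cfg))

;; Writes are forwarded to the HTTP writer
(d/transact conn [{:name "Alice"}])

;; Reads resolve locally against the shared store
(d/q '[:find ?n :where [?e :name ?n]] @conn)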

Kabel WebSocket Writer (Beta)

Real-time reactive updates via WebSocket:

{:store {:backend :indexeddb :name "app-db" :id store-id}
 :writer {:backend :kabel
          :peer-id server-peer-id
          :local-peer @client-peer}}  ;; Set up via kabel/distributed-scope

Enables browser clients with live synchronization. See Distributed Architecture for setup details.

Branching (Beta)

Access specific database branches (git-like versioning):

{:store {:backend :file :path "/var/db"}
 :branch :staging}  ;; Default branch is :db

Create and merge branches for testing, staging, or experiments. See Versioning for the branching API.

Remote Procedure Calls

Send all operations (reads and writes) to a remote server:

{:store {:backend :memory :id #uuid "550e8400-e29b-41d4-a716-446655440023"}
 :remote-peer {:backend :datahike-server
               :url "http://server.example.com:4444"
               :token "secure-token"}}

Useful for thin clients or when you want centralized query execution. See Distributed Architecture for RPC vs. DIS trade-offs.

Initial Transaction

Seed the database with schema or data on creation:

{:store {:backend :memory :id #uuid "550e8400-e29b-41d4-a716-446655440024"}
 :initial-tx [{:db/ident :name
               :db/valueType :db.type/string
               :db/cardinality :db.cardinality/one}
              {:db/ident :email
               :db/valueType :db.type/string
               :db/unique :db.unique/identity
               :db/cardinality :db.cardinality/one}]}

Convenient for testing or deploying databases with predefined schema.
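
Because the schema is transacted at creation time, a fresh connection can use it immediately; a brief sketch assuming the config above is bound to cfg:

(d/create-database cfg)
(def conn (d/connect cfg))

;; :name and :email were defined by :initial-tx
(d/transact conn [{:name "Alice" :email "alice@example.com"}])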

Complete Configuration Example

{:store {:backend :file
         :path "/var/datahike/production"
         :id #uuid "550e8400-e29b-41d4-a716-446655440000"}
 :name "production-db"
 :schema-flexibility :write
 :keep-history? true
 :attribute-refs? false
 :index :datahike.index/persistent-set
 :store-cache-size 10000
 :search-cache-size 100000
 :initial-tx [{:db/ident :user/email
               :db/valueType :db.type/string
               :db/unique :db.unique/identity
               :db/cardinality :db.cardinality/one}]
 :writer {:backend :datahike-server
          :url "http://writer.example.com:4444"
          :token "secure-token"}
 :branch :db}

Migration and Compatibility

URI Scheme (Pre-0.3.0, Deprecated)

Prior to version 0.3.0, Datahike used URI-style configuration. This is still supported but deprecated in favor of the more flexible hashmap format.

Old URI format:

"datahike:memory://my-db?temporal-index=true&schema-on-read=true"

New hashmap format (equivalent):

{:store {:backend :memory :id #uuid "550e8400-e29b-41d4-a716-446655440025"}
 :keep-history? true
 :schema-flexibility :read}

Key changes:

  • :temporal-index → :keep-history?
  • :schema-on-read → :schema-flexibility (:read or :write)
  • Store parameters moved to the :store map
  • Memory backend: :host/:path → :id
  • Direct support for advanced features (writer, branches, initial-tx)

Existing URI configurations continue to work—no migration required unless you need new features.

Further Documentation

  • Storage Backends: backend options and selection guidance
  • Schema: schema definition and flexibility modes
  • Time Variance: time-travel queries (history, as-of, since)
  • Distributed Architecture: writer setup and RPC vs. DIS trade-offs
  • Versioning: the branching API