A library that adds support for simple compound indices to Datomic.
(:require [datomic-compound-index.core :as dci])
Suppose you have time-series event data. You have lots of users, and each user creates lots of events. Each event has a timestamp. You would like to query "find all events for User X on day Y".
schema:
(attribute :event/at :db.type/instant
:db/index true)
(attribute :event/user :db.type/ref
:db/index true)
A Datomic query uses only a single index. If the DB is indexed on users, the query will sequentially scan every event that user has ever created. If the DB is indexed on event time, the query will sequentially scan every event on that day, across all users.
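To make the cost concrete, here is a minimal in-memory sketch (plain Clojure, no Datomic involved). The synthetic `events` data and the helper names `on-day?` and `scan-counts` are made up for illustration. It counts how many events each single-attribute index would have to walk, compared with how many actually match.

```clojure
(def ms-per-day 86400000)

;; 3 users x 3 days x 2 events per user per day = 18 synthetic events
(def events
  (vec (for [user [1 2 3]
             day  [0 1 2]
             n    [0 1]]
         {:user user :at (+ (* day ms-per-day) (* n 1000))})))

(defn on-day?
  "True if the event's timestamp falls on the given UTC day number."
  [day event]
  (= day (quot (:at event) ms-per-day)))

(defn scan-counts
  "Counts the candidates visited via a user-only index, via a time-only
  index, and the events that actually match both conditions."
  [events user day]
  (let [by-user (filter #(= user (:user %)) events)
        by-day  (filter #(on-day? day %) events)]
    {:via-user-index (count by-user)
     :via-time-index (count by-day)
     :matches        (count (filter #(on-day? day %) by-user))}))

(scan-counts events 2 1)
;; => {:via-user-index 6, :via-time-index 6, :matches 2}
```

Either single index visits three times as many events as match, and the gap grows with the number of users and days.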
This readme spends a lot of time talking about how to solve the problem without datomic-compound-index (dci), so you'll understand what's going on under the covers.
The solution is to use a compound index: an indexed attribute that contains both values.
(attribute :event/user-at :db.type/string
:db/index true)
Let's say the current time is 1426630517988 (epoch), and the user we're interested in is user-id 1234. Then we store "1234|1426630517988|" in the attribute :event/user-at. (In reality, dci uses byte-arrays rather than strings, but the principle is the same.)
(d/transact conn [{:db/id (d/tempid :events)
:event/user 1234
:event/at (java.util.Date. 1426630517988) ; :db.type/instant values are java.util.Date
:event/user-at "1234|1426630517988|"}])
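When maintaining the compound value by hand, it helps to derive it from the component values in one place so the two cannot drift apart. A minimal sketch, assuming the string format used above; `user-at-key` and `event-map` are hypothetical helpers, not part of dci, and the caller would still add a `:db/id` before transacting.

```clojure
(import '(java.util Date))

(defn user-at-key
  "Builds the string compound key \"<user-id>|<epoch-ms>|\"."
  [user-id ^Date at]
  (str user-id "|" (.getTime at) "|"))

(defn event-map
  "Returns the component attributes plus the compound key derived from them."
  [user-id ^Date at]
  {:event/user    user-id
   :event/at      at
   :event/user-at (user-at-key user-id at)})

(event-map 1234 (Date. 1426630517988))
;; the :event/user-at value is "1234|1426630517988|"
```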
To find all events created by that user today, without dci, we'd use d/index-range:
(d/index-range db :event/user-at "1234|1426550400000|" "1234|1426636800000|")
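The two bounds in that call are the UTC midnights on either side of the event time. The sketch below shows one way to derive them, and uses a Clojure sorted set as a stand-in for the index (d/index-range treats the start as inclusive and the end as exclusive). The helpers `utc-day-bounds`, `day-range-keys` and `range-scan` and the sample keys are illustrative assumptions.

```clojure
(def ms-per-day 86400000)

(defn utc-day-bounds
  "Returns [start end) of the UTC day containing epoch-ms, in epoch ms."
  [epoch-ms]
  (let [start (* ms-per-day (quot epoch-ms ms-per-day))]
    [start (+ start ms-per-day)]))

(defn day-range-keys
  "Builds the lower and upper string keys for one user on one UTC day."
  [user-id epoch-ms]
  (let [[start end] (utc-day-bounds epoch-ms)]
    [(str user-id "|" start "|") (str user-id "|" end "|")]))

;; Stand-in for the indexed attribute: strings sorted lexicographically.
(def index
  (sorted-set "1234|1426500000000|"    ; same user, previous day
              "1234|1426630517988|"    ; same user, target day
              "1234|1426636800000|"    ; same user, next midnight (excluded)
              "1235|1426630517988|"))  ; different user

(defn range-scan
  "Half-open range scan: start inclusive, end exclusive."
  [idx start end]
  (subseq idx >= start < end))

(apply range-scan index (day-range-keys 1234 1426630517988))
;; => ("1234|1426630517988|")
```

String keys only sort in numeric order when every id and timestamp has the same number of digits, which is one reason a fixed-width binary encoding is a better fit than strings.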
This can also be extended past two attributes. Let's say we want to find all events of a specific type, by a certain user, on a specific day:
(attribute :event/type :db.type/keyword)
(attribute :event/user-at-type :db.type/string
:db/index true)
(d/index-range db :event/user-at-type "1234|1426550400000|:foo|" "1234|1426636800000|:foo|")
(Of course, if your DB has :event/user-at-type, then :event/user-at is unnecessary.)
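One caveat worth checking with string keys like these: a range scan compares whole keys lexicographically, so when the two bounds differ in the timestamp, every type that falls between them is returned as well. The sketch below uses a sorted set as a stand-in for the index (sample keys and the helper `type-is?` are made up for illustration). It shows the effect, and two possible ways around it: filter the results, or order the key so the equality-matched values come before the range-matched one.

```clojure
;; Key layout user|at|type, as in the example above.
(def index
  (sorted-set "1234|1426560000000|:foo|"
              "1234|1426600000000|:bar|"
              "1234|1426620000000|:foo|"))

(def hits
  (subseq index >= "1234|1426550400000|:foo|" < "1234|1426636800000|:foo|"))
;; hits contains all three keys, including the :bar event

(defn type-is?
  "True if the string key ends with the given type segment."
  [t ^String k]
  (.endsWith k (str "|" t "|")))

(def foo-hits (filter #(type-is? :foo %) hits))
;; foo-hits contains only the two :foo keys

;; Alternative layout user|type|at: the range now only spans one type.
(def index-by-type
  (sorted-set "1234|:foo|1426560000000|"
              "1234|:bar|1426600000000|"
              "1234|:foo|1426620000000|"))

(def typed-hits
  (subseq index-by-type >= "1234|:foo|1426550400000|" < "1234|:foo|1426636800000|"))
;; typed-hits contains only the two :foo keys
```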
dci provides functions for the use cases described above. dci/index-key is the main entry point; it creates the keys used for inserting into and querying compound indices.
Create an indexed attribute, of type bytes. Create a second attribute of type bytes with the same name, with the suffix "-metadata". If the indexed attribute is named ":user/foo", the second attribute would be named ":user/foo-metadata". In a comment/docstring, specify the type and order of values that will be indexed.
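As a rough sketch of that setup, the two attributes could be written as plain Datomic attribute maps. The helper name `compound-index-attrs` is hypothetical, and the choice of `:db.cardinality/one` is an assumption rather than something dci documents. Older Datomic versions may also require `:db/id` and `:db.install/_attribute` entries on each map.

```clojure
(defn compound-index-attrs
  "Returns schema maps for a compound-indexed attribute and its
  \"-metadata\" companion. Cardinality is an assumption."
  [attr doc]
  (let [meta-attr (keyword (namespace attr) (str (name attr) "-metadata"))]
    [{:db/ident       attr
      :db/valueType   :db.type/bytes
      :db/cardinality :db.cardinality/one
      :db/index       true
      :db/doc         doc}
     {:db/ident       meta-attr
      :db/valueType   :db.type/bytes
      :db/cardinality :db.cardinality/one}]))

(compound-index-attrs
 :event/user-at-type
 "Compound index over [user-id event-at event-type], in that order.")
;; the second map's :db/ident is :event/user-at-type-metadata
```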
When inserting new entities, use insert-index-key to create the value for the compound-indexed attribute:
(let [event-at (Date.)
user-id 1234
event-type :foo]
@(d/transact conn [(merge
{:db/id (d/tempid :events)
:event/type event-type}
(dci/insert-index-key :event/user-at-type [user-id event-at event-type]))]))
insert-index-key takes two args: the compound attribute and a vector of values. Every entity with this attribute must use values of the same type, in the same order. insert-index-key returns a map of attributes to values, so it should be merged into the entity map. Values can be String or Long (or anything that implements datomic-compound-index.core/Serialize, see below).
If you're querying for an entire compound key (not partial), you can use d/q as normal, using index-key for the value:
(d/q '[:find ?e :in $ ?v :where [?e :event/user-at-type ?v]] db (dci/index-key [:foo :bar :baz]))
When querying across a range, e.g. all events in a single day, use dci/search-range:
(dci/search-range db :event/user-at [1234 (-> (time/today-at-midnight) to-date)] [1234 (-> (time/today-at-midnight) (time/plus (time/days 1)) to-date)])
search-range takes two keys, and returns the seq of datoms between them.
(Note that index-key converts values to byte-array representations, so java.util.Date instances can be passed in. See the Serialize protocol in the source.)
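To illustrate the general idea of protocol-based conversion to bytes, here is a standalone sketch. It is not dci's Serialize protocol and does not reproduce its byte layout: the protocol `KeyPart`, its method `part-bytes`, the helper `concat-parts` and the encodings chosen are all assumptions made for this example.

```clojure
(import '(java.nio ByteBuffer)
        '(java.util Arrays Date))

;; Hypothetical protocol, for illustration only.
(defprotocol KeyPart
  (part-bytes [v] "Returns a byte-array encoding of v."))

(extend-protocol KeyPart
  Long
  ;; 8 bytes, big-endian (ByteBuffer's default byte order)
  (part-bytes [v] (.array (.putLong (ByteBuffer/allocate 8) (long v))))
  String
  (part-bytes [v] (.getBytes v "UTF-8"))
  Date
  ;; a Date is encoded as its epoch-millisecond Long
  (part-bytes [v] (part-bytes (.getTime v)))
  clojure.lang.Keyword
  ;; a keyword is encoded as its printed form, e.g. ":foo"
  (part-bytes [v] (part-bytes (str v))))

(defn concat-parts
  "Concatenates the encodings of each value into a single byte-array."
  [vs]
  (byte-array (mapcat part-bytes vs)))

(alength ^bytes (concat-parts [1234 (Date. 0) :foo]))
;; => 20  (8 bytes + 8 bytes + 4 bytes for ":foo")
```

Because a Date reduces to a Long here, a key built from a Date and one built from the same instant's epoch milliseconds encode identically.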
When searching for a partial key (i.e. you know the first two values of a 3-value compound key) use dci/search:
(dci/search db :user/type-created-at [123 :foo])
This returns all datoms where the initial part of the key is identical. dci/search can also be used for entire key lookups, but d/q might be more idiomatic.
dci/search-range also supports partial key searches:
(dci/search-range db :event/user-at-type [1234 1426550400000] [1234 1426636800000])
Under the covers, partial searches work by creating a partial index-key (i.e. a byte-array shorter than the 'full' key), and then matching on values stored in the DB. Datomic indices are used (d/seek-datoms, and d/index-range, respectively), so these are efficient.
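The prefix idea can be sketched in memory without Datomic. Below, keys are modelled as vectors of numbers sorted lexicographically, and a sorted set stands in for the index. The functions `lex-compare`, `prefix?` and `prefix-search` are illustrative helpers, not dci internals: seek to the first key at or after the partial key, then keep reading while the stored key still starts with it.

```clojure
(defn lex-compare
  "Lexicographic comparison of two vectors; a vector sorts before any
  longer vector that it is a prefix of."
  [a b]
  (let [n (min (count a) (count b))]
    (loop [i 0]
      (if (= i n)
        (compare (count a) (count b))
        (let [c (compare (nth a i) (nth b i))]
          (if (zero? c) (recur (inc i)) c))))))

(defn prefix?
  "True if vector p is a prefix of vector k."
  [p k]
  (and (<= (count p) (count k))
       (= p (subvec k 0 (count p)))))

(defn prefix-search
  "Seeks to the first key >= p, then takes keys while they start with p."
  [index p]
  (take-while #(prefix? p %) (subseq index >= p)))

(def index
  (sorted-set-by lex-compare
                 [1 10 7] [1 20 5] [1 20 9] [2 20 5]))

(prefix-search index [1 20])
;; => ([1 20 5] [1 20 9])
```

Only the matching keys plus one terminating key are visited, which is why a seek on a sorted index stays cheap even when the index is large.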
Note that dci is still very early. I'm using it in staging, but not yet in production. If you do use it in production, be able to re-create the values of your compound indices (i.e. store the component pieces in other attributes).
search and search-range
Copyright © 2015 Allen Rohner
Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.