Memento

A library for function memoization with scoped caches and tagged eviction capabilities.

Motivation

Why is there a need for another caching library? Motivation here.

Usage

With [memento.core :as m] in your require:

You can attach a cache to a function by wrapping it in a memo call:

(def my-function (m/memo #(* 2 %) {}))

The first argument is the function (or a var), and the second one is the cache conf.

If a var is specified, the root binding of the var is modified to the cached function.

(defn my-function [x] (* 2 x))
(m/memo #'my-function {})

The cache conf is a plain map. Use vars and ordinary map operations to construct it.

The {} conf results in the default cache being created, which is a cache that does no caching (this is useful for reasons listed later).

But if we want a cache that does actual caching, we can create an infinite duration cache implemented by Guava:

(m/memo #'my-function {:memento.core/type :memento.core/guava})
; or using the namespace shorthands
(m/memo #'my-function #::m {:type ::m/guava})

Such a cache works just like clojure.core/memoize: a memoization cache with unlimited duration and size.
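
To make the comparison concrete, here is a small sketch of the behaviour being matched. It uses only clojure.core/memoize and no Memento calls; the names call-count, slow-double and fast-double are illustrative assumptions.

```clojure
;; Counts how many times the underlying function really runs.
(def call-count (atom 0))

(defn slow-double
  "Doubles x and records that the real computation happened."
  [x]
  (swap! call-count inc)
  (* 2 x))

;; clojure.core/memoize keeps every result indefinitely, keyed by the argument list.
(def fast-double (memoize slow-double))

(fast-double 21) ; computed and stored
(fast-double 21) ; served from the cache, slow-double is not called again
(fast-double 5)  ; new argument, computed once
```

After these three calls the counter stands at 2, since the repeated call with 21 never reached slow-double.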

We have specified the cache implementation type as guava (instead of :memento.core/none, which is a noop cache type).

Guava is the main implementation provided by this library.

The Guava type takes additional parameters to customize behaviour:

(m/memo #'my-function #::m {:type ::m/guava
                            :ttl [40 :min]})

It can be cumbersome to remember all these properties and to type them out.

For this purpose, and for the purpose of documentation, there are special configuration namespaces whose vars are the conf keys. The docstrings explain the settings.

Here's an example of a complicated cache conf:

(ns memento.tryout
  (:require [memento.core :as m]
    ; general cache conf keys
            [memento.config :as mc]
    ; guava specific cache conf keys
            [memento.guava.config :as mcg]))

(def my-weird-cache
  "Conf for guava cache that caches up to 20 seconds and up to 30 entries, uses weak
  references and prints when keys get evicted."
  {mc/type mc/guava
   mc/size< 30
   mc/ttl 20
   mcg/weak-values true
   mcg/removal-listener #(println (apply format "Function %s key %s, value %s got evicted because of %s" %&))})

(defn my-function [x] (* 2 x))
(m/memo #'my-function my-weird-cache)

Read the docstrings in the memento.config and memento.guava.config namespaces for the available cache properties.

I suggest you collect cache configurations you commonly use in a namespace and reuse them in your code, to keep the code brief.

Create a namespace like myproject.cache and write vars like:

(def inf {mc/type mc/guava}) ; infinite cache

and then simply use it everywhere in your project:

(ns myproject.some-ns
  (:require [myproject.cache :as cache]
            [memento.core :as m]))

; simply use the var's value as the conf
(m/memo #'myfunction cache/inf)

Another example using tags, scoped caches and tagged eviction:

(ns myproject.some-ns
  (:require [myproject.cache :as cache]
            [memento.core :as m]))

(defn get-person-by-id [person-id]
  (let [person (db/get-person person-id)]
    ; tag the returned object with :person + id pair
    (m/with-tag-id person :person (:id person))))

; add a cache to the function with tags :person and :request
(m/memo #'get-person-by-id [:person :request] cache/inf)

; remove cache entries from every cache tagged :person globally, where the
; entry is tagged with :person 1
(m/memo-clear-tag! :person 1)

(m/with-caches :request (constantly (m/create cache/inf))
  ; inside this block, a fresh new cache is used (and discarded)
  ; making a scope-like functionality
  (get-person-by-id 5))

Major concepts

Back to basics. Enabling memoization of a function is composed of two distinct steps:

creating a Cache (optional, as you can use an existing cache)
binding the cache to the function (a MountPoint is used to connect a function being memoized to the cache)

A cache, an instance of memento.base/Cache, can contain entries from multiple functions and can be shared between memoized functions. Each memoized function is bound to a Cache via MountPoint.
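
The relationship between one Cache and several memoized functions can be pictured with a toy model. This is a sketch in plain Clojure, not Memento's implementation: create-cache-sketch and bind-sketch are hypothetical names, and tagging each entry with a function id is an assumption about how entries of different functions might be kept apart.

```clojure
(defn create-cache-sketch
  "A toy cache: an atom holding a map of keys to results."
  []
  (atom {}))

(defn bind-sketch
  "Returns a memoized version of f that stores its entries in the shared cache.
  The id keeps entries of different functions apart."
  [id f cache]
  (fn [& args]
    (let [k [id args]]
      (if-let [entry (find @cache k)]
        (val entry)
        (let [result (apply f args)]
          (swap! cache assoc k result)
          result)))))

;; One cache, two functions bound to it.
(def shared (create-cache-sketch))
(def double* (bind-sketch :double #(* 2 %) shared))
(def square* (bind-sketch :square #(* % %) shared))

(double* 3)
(square* 3)
```

Both calls use the argument 3, yet the shared cache ends up with two separate entries, one per function.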

Creating a cache is done by using memento.core/create, which takes a map of configuration (called cache conf). You can use the resulting Cache with multiple functions. The configuration properties (map keys) can be found in memento.config and memento.guava.config; look for "Cache setting" in the docstrings.

If memento.config/enabled? is false, create always returns memento.base/no-cache, a Cache implementation that doesn't do any caching. You can set this at start-up by specifying the Java system property -Dmemento.enabled=false.

Binding the cache

Binding the cache to a function is done by memento.core/bind. Parameters are:

a fn or a var; if it is a var, its root value is changed to the memoized version
a mount point configuration or mount conf for short
a Cache instance that you want to bind

Mount conf is either a map of mount point configuration properties or a shorthand (see below). The configuration properties (map keys) can be found in memento.config; look for "function bind" in the docstrings.

Instead of a map of properties, mount conf can be one of the following two shorthands:

[:some-keyword :another-keyword] -> {:memento.core/tags [:some-keyword :another-keyword]}
:a-keyword -> {:memento.core/tags [:a-keyword]}
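
The two shorthand rules above can be written down as a small normalizing function. This is only a sketch of the rules as stated here; expand-mount-conf is a hypothetical helper and not part of Memento.

```clojure
(defn expand-mount-conf
  "Hypothetical helper that mirrors the shorthand rules: a map is used as is,
  a sequence of keywords becomes the tags, a single keyword becomes one tag."
  [conf]
  (cond
    (map? conf)        conf
    (sequential? conf) {:memento.core/tags (vec conf)}
    (keyword? conf)    {:memento.core/tags [conf]}
    :else (throw (IllegalArgumentException.
                   (str "Unsupported mount conf: " (pr-str conf))))))

(expand-mount-conf [:person :request])
(expand-mount-conf :person)
```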

Create + bind combined

You can combine both functions into one call using memento.core/memo.

(m/memo fn-or-var mount-conf cache-conf)

To make things shorter, there's a 2-arg variant that lets you specify both configurations at once:

(m/memo fn-or-var conf)

If conf is a map, then all the properties valid for mount conf are treated as such. The rest is passed to cache create. If conf is a mount conf shorthand then cache conf is considered to be {}. E.g.

(m/memo my-fn :my-tag)

This creates a memoized function tagged with :my-tag bound to a cache that does no caching.
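
The way a combined conf map is divided can be sketched as follows. The set of mount conf keys used here is an assumption pieced together from the settings mentioned in this document (the :memento.core/evt-fn key name in particular is a guess); the authoritative list is in the memento.config docstrings. split-conf is a hypothetical helper, not a Memento function.

```clojure
(def assumed-mount-keys
  "Assumed and possibly incomplete set of mount conf keys."
  #{:memento.core/tags
    :memento.core/key-fn
    :memento.core/ret-fn
    :memento.core/evt-fn})

(defn split-conf
  "Hypothetical helper: returns [mount-conf cache-conf] for a combined conf map.
  Keys valid for mount conf go to the first map, the rest to the second."
  [conf]
  [(select-keys conf assumed-mount-keys)
   (apply dissoc conf assumed-mount-keys)])

(split-conf {:memento.core/type :memento.core/guava
             :memento.core/tags [:person]})
```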

Additional features

Changing cache key

Add :memento.core/key-fn to the cache or mount conf (or use the mc/key-fn value) to specify a function that transforms the key the cache will use for the entry. As the examples show, the function receives the sequence of arguments.

An example building on the cache/inf configuration suggested earlier:

(defn get-person-by-id [db-conn account-id person-id] {})

; db-conn? is assumed to be your own predicate that recognizes a db connection
; when creating the cache key, remove db connection
(m/memo #'get-person-by-id (assoc cache/inf mc/key-fn #(remove db-conn? %)))
; or more explicit
(m/memo #'get-person-by-id {mc/type mc/guava mc/key-fn #(remove db-conn? %)})

The key-fn removes the db connection when the cache key is created, so the cache uses [account-id person-id] as the key. Calling the function with a different db connection but the same ids therefore returns the cached value.
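
The effect of a key-fn can be reproduced with a toy memoizer in plain Clojure. This is a sketch of the idea and not how Memento is implemented; memoize-with-key-fn, the lookups counter and the map-based db-conn? predicate are all illustrative assumptions.

```clojure
(defn memoize-with-key-fn
  "Toy memoizer: the argument list is passed through key-fn before it is
  used as the cache key."
  [f key-fn]
  (let [cache (atom {})]
    (fn [& args]
      (let [k (key-fn args)]
        (if-let [entry (find @cache k)]
          (val entry)
          (let [result (apply f args)]
            (swap! cache assoc k result)
            result))))))

;; Counts real invocations of the underlying function.
(def lookups (atom 0))

(defn get-person-by-id [db-conn account-id person-id]
  (swap! lookups inc)
  {:account account-id :person person-id})

;; Connections are modelled as maps with a :conn key, purely for illustration.
(defn db-conn? [x]
  (and (map? x) (contains? x :conn)))

(def cached-get-person
  (memoize-with-key-fn get-person-by-id #(remove db-conn? %)))

(cached-get-person {:conn 1} 10 20) ; computed, stored under the key (10 20)
(cached-get-person {:conn 2} 10 20) ; different connection, same key, cache hit
```

Only the first call reaches get-person-by-id, because both calls reduce to the same key once the connection is removed.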

Another example:

(defn return-my-user-info-json [http-request]
  (load-user (-> http-request :session :user-id)))

;; clearly the cache hit is based on a deeply nested property out of a huge request map
;; so we want to use that as basis for caching
(m/memo #'return-my-user-info-json (assoc cache/inf mc/key-fn #(-> % first :session :user-id)))

key-fn is both a mount conf setting and a cache setting. The difference is scope: specifying key-fn for the Cache affects all functions using that cache, while specifying it in mount conf affects only that one function. If you use the 2-arg memo, this setting is applied to mount conf.

Prevent caching of a specific return value

If you want to prevent caching of a specific function return, you can wrap it in a special record using the memento.core/do-not-cache function. Example:

(defn get-person-by-id [db-conn account-id person-id]
  (if-let [person (db-get-person db-conn account-id person-id)]
    {:status 200 :body person}
    (m/do-not-cache {:status 404})))

404 responses won't get cached, and the function will be invoked every time for those ids.

Modifying returned value

Sticking a piece of caching logic into your function logic isn't very clean. Instead, you can add :memento.core/ret-fn to cache or mount conf (or use mc/ret-fn value) to specify a function that can modify the return value from a cached function before it is cached. This is useful when using the do-not-cache function above to do the wrapping outside the function being cached. Example:

; first argument is args, second is the returned value
(defn no-cache-error-resp [[db-conn account-id person-id :as args] resp]
  (if (<= 400 (:status resp) 599)
    (m/do-not-cache resp)
    resp))

(defn get-person-by-id [db-conn account-id person-id]
  (if (nil? person-id)
    {:status 404}
    {:status 200}))

(m/memo #'get-person-by-id (assoc cache/inf mc/ret-fn no-cache-error-resp))

This is both a mount conf setting and a cache setting, with the same consequences as for the key-fn setting above.
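
How a ret-fn and a do-not-cache marker interact can be shown with a toy memoizer in plain Clojure. This is a sketch under assumptions and not Memento's code: the DoNotCache record, do-not-cache and memoize-with-ret-fn are stand-ins defined here, and returning the unwrapped value to the caller is assumed behaviour.

```clojure
(defrecord DoNotCache [value])

(defn do-not-cache
  "Stand-in for the wrapping idea: marks a value as uncacheable."
  [v]
  (->DoNotCache v))

(defn memoize-with-ret-fn
  "Toy memoizer: ret-fn receives the args and the raw return value and may
  wrap the value with do-not-cache, in which case nothing is stored."
  [f ret-fn]
  (let [cache (atom {})]
    (fn [& args]
      (if-let [entry (find @cache args)]
        (val entry)
        (let [ret (ret-fn args (apply f args))]
          (if (instance? DoNotCache ret)
            (:value ret)
            (do (swap! cache assoc args ret)
                ret)))))))

;; Counts real invocations of the underlying function.
(def calls (atom 0))

(defn get-status [person-id]
  (swap! calls inc)
  (if (nil? person-id) {:status 404} {:status 200}))

(defn no-cache-error-resp [args resp]
  (if (<= 400 (:status resp) 599)
    (do-not-cache resp)
    resp))

(def cached-get-status (memoize-with-ret-fn get-status no-cache-error-resp))

(cached-get-status 1)   ; computed and cached
(cached-get-status 1)   ; cache hit
(cached-get-status nil) ; 404, returned but not cached
(cached-get-status nil) ; computed again
```

The function body stays free of caching concerns; the decision about what is cacheable lives entirely in no-cache-error-resp.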

Manual eviction

You can manually evict entries:

; invalidate everything, also works on MountPoint instances
(m/memo-clear! memoized-function)
; invalidate an arg-list, also works on MountPoint instances
(m/memo-clear! memoized-function arg1 arg2 ...)

You can manually evict all entries in a Cache instance:

(m/memo-clear-cache! cache-instance)

Manually adding entries

You can add entries to a function's cache at any time:

; also works on MountPoint instances
(m/memo-add! memoized-function {[arg1 arg2] result})
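
The entries map is keyed by argument lists, so each key is a collection even for a one-argument function. Below is a sketch of building such a map from data, using only clojure.core; the projects data is made up for illustration.

```clojure
;; Made-up sample data standing in for rows loaded elsewhere.
(def projects
  [{:id 1 :name "alpha"}
   {:id 2 :name "beta"}])

;; Each key is the full argument list of one call, so a one-argument
;; function gets one-element keys.
(def name-entries
  (into {}
        (map (fn [{:keys [id name]}] [[id] name]))
        projects))
```

A map of this shape is what the second argument of m/memo-add! in the example above expects.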

Additional utility

(m/as-map memoized-function) -> map of cache entries, also works on MountPoint instances
(m/memoized? a-function) -> returns true if the function is memoized
(m/memo-unwrap memoized-function) -> returns original uncached function, also works on MountPoint instances

Namespace scan

You can scan loaded namespaces for annotated vars and automatically create caches. The scan looks for Vars with :memento.core/cache key in the meta. That value is used as a cache spec.

Given require [memento.ns-scan :as ns-scan]:

(ns myproject.some-ns
  (:require
    [myproject.cache :as cache]
    [memento.core :as m]
    [memento.ns-scan :as ns-scan]))

; defn already has a nice way for adding meta
(defn test1
  "A function using built-in defn meta mechanism to specify a cache region"
  {::m/cache cache/inf}
  [arg1 arg2]
  (+ arg1 arg2))

; you can also do standard meta syntax
(defn ^{::m/cache cache/inf} test2
  "A function using normal meta syntax to add a cache to itself"
  [arg1 arg2] (+ arg1 arg2))

; this also works on def
(def ^{::m/cache cache/inf} test3 (fn [arg1 arg2] (+ arg1 arg2)))

; attach caches
(ns-scan/attach-caches)

This only works on LOADED namespaces, so beware.

Calling attach-caches multiple times attaches new caches, replacing the existing ones.

Namespaces clojure.* and nrepl.* are not scanned by default, but you can provide your own blacklists; see the docs.

Events

You can fire an event at a memoized function. The target can be a particular function (or MountPoint), or you can specify a tag (and all tagged functions get the event). Each function can configure its own handler for events. An event can be any object; I suggest you use a structure that lets event handlers distinguish between events.

An event handler is a function of two arguments: the MountPoint it has been triggered on (most core functions work on those) and the event.

The main use case is adding entries to different functions' caches from the same data. Example:

(defn get-project-name
  "Returns project name"
  [project-id])

(m/memo #'get-project-name inf)

(defn get-project-owner
  "Returns project's owner user ID"
  [project-id])

(m/memo #'get-project-owner inf)

(defn get-user-projects
  "Returns a big expensive list"
  [user-id]
  (let [project-list '...]
    project-list))

In that example, when get-user-projects is called, we might load over 100 projects, and we'd hate to waste that and not inform get-project-name and get-project-owner about the facts we've established here, especially since we might be calling these smaller functions in a loop right after fetching the big list.

Here's a way to make sure data is reused by manually pushing entries into the caches as supported by most caching libs:

(defn get-user-projects
  "Returns a big expensive list"
  [user-id]
  (let [project-list '...]
    ;; preload entries for seen projects into caches
    (m/memo-add! get-project-name
                 (zipmap (map (comp list :id) project-list)
                         (map :name project-list)))
    (m/memo-add! get-project-owner
                 (zipmap (map (comp list :id) project-list)
                         (repeat user-id)))
    project-list))

The problem with this solution is that it is an absolute nightmare to maintain:

Adding or removing data-consuming functions like get-project-name means that I also have to fix producing functions like get-user-projects.
Worse yet, the producer function has to know what the argument list of each consuming function looks like and how that function's output relates to the data at hand. For instance, if I change the arg list of get-project-owner, I must fix the get-user-projects code that pushes cache entries.
If I want additional producers like get-user-projects, then each producer must implement all of these changes, and each carries a massive block to feed all the consumers.

I can use events instead and co-locate the code that feeds the cache with the function:

(defn get-project-name
  "Returns project name"
  [project-id])

(m/memo #'get-project-name
        (assoc inf
          mc/evt-fn (m/evt-cache-add
                      :project-seen
                      (fn [{:keys [name id]}] {[id] name}))
          mc/tags [:project]))

(defn get-project-owner
  "Returns project's owner user ID"
  [project-id])

(m/memo #'get-project-owner
        (assoc inf
          mc/evt-fn (m/evt-cache-add
                      :project-seen
                      (fn [{:keys [id user-id]}] {[id] user-id}))
          mc/tags [:project]))

(defn get-user-projects
  "Returns a big expensive list"
  [user-id]
  (let [project-list '...]
    (doseq [p project-list]
      (m/fire-event! :project [:project-seen (assoc p :user-id user-id)]))
    project-list))

We're using the evt-cache-add convenience function that assumes event shape is a vector of type + payload and that the intent is to add entries to the cache.

In this case the producer function is only concerned with firing events at tagged caches. It doesn't need to consider the number or shape of consumers.

The caching declaration of the consumer functions is where the cache feeding logic is located, which makes things manageable.
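
The division of labour described above can be modelled in plain Clojure. This is a toy sketch and not Memento's implementation: evt-cache-add-sketch, fire-event-sketch! and handlers-by-tag are hypothetical names, and plain atoms stand in for the mount points of the two consumer functions.

```clojure
(defn evt-cache-add-sketch
  "Returns a handler of [target event] that reacts only to events of the given
  type and merges (->entries payload) into the target, an atom holding a map
  of argument lists to results."
  [evt-type ->entries]
  (fn [target [t payload]]
    (when (= t evt-type)
      (swap! target merge (->entries payload)))))

;; One atom per consumer plays the role of its mount point.
(def project-name-cache (atom {}))
(def project-owner-cache (atom {}))

;; Handlers registered under a tag, as [target handler] pairs.
(def handlers-by-tag
  {:project [[project-name-cache
              (evt-cache-add-sketch :project-seen
                                    (fn [{:keys [id name]}] {[id] name}))]
             [project-owner-cache
              (evt-cache-add-sketch :project-seen
                                    (fn [{:keys [id user-id]}] {[id] user-id}))]]})

(defn fire-event-sketch!
  "Delivers the event to every handler registered under the tag."
  [tag event]
  (doseq [[target handler] (get handlers-by-tag tag)]
    (handler target event)))

;; The producer only knows the tag and the event shape.
(fire-event-sketch! :project [:project-seen {:id 7 :name "site" :user-id 42}])
;; Events of another type are ignored by both handlers.
(fire-event-sketch! :project [:something-else {:id 8 :name "ignored" :user-id 42}])
```

Adding a third consumer means adding one more entry next to that consumer; the producer code does not change.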

Skip/disable caching

If you set the -Dmemento.enabled=false JVM option (or change the memento.config/enabled? var root binding), then all caches created will be memento.base/no-cache, which does no caching.

Reload guards

When you memoize a function with tags, a special object is created that cleans up the internal tag mappings when the memoized function is GCed. This matters when reloading namespaces, because it removes the mount points of the old function versions.

The guard relies on finalize, which isn't free (allocation takes extra work and the GC has to work harder), so if you don't use namespace reloading and want to optimize, you can disable reload guard objects.

Set -Dmemento.reloadable=false JVM option (or change memento.config/reload-guards? var root binding).