Describes data structures. Like validating, but less assertive. Milquetoast validator. Still in its infancy.
Novel? Interesting? I hope so - I'm releasing this because I think this approach to validation is new and somewhat fun to think about, and I'd love to find some people to think about it with me (or at least tell me it's done better elsewhere so I can use that and stop working on this). I hope that's you!
[sweet-tooth/describe "0.4.0"]
Plenty of validation libraries already exist, like spec, schema, vlad, and bouncer. Why would I, a human being with hopes and dreams and presumably better things to do, bother to write another one?
There are three things I want in a validation library that I, in all my travels on this earth, have not found in one package:
All of these are exemplified in the most common usecase on the planet: validating a username. When validating a username, I can apply these validations:
The control flow for these validations looks like this:
UCOUNT
↗ ↘
UEMPTY UTAKEN
↘ ↗
UCHAR
Meaning, "If UEMPTY applies, don't attempt to apply any other validations. If UEMPTY does not apply, attempt UCOUNT and UCHAR; it's possible for both of these to apply to the username. If either UCOUNT or UCHAR applies, don't apply UTAKEN so that you don't waste time performing an unnecessary DB call."
describe is built to accommodate this kind of scenario. It's still a baby, and just as I plan to treat my own eventual human babies, I've already put it to work but I'm not getting my hopes up yet. In particular, describe is more verbose than other libraries. If that doesn't appeal to you, vlad is another good option.
Here's some code to give you an idea of how you would use describe:
(ns examples
(:require [sweet-tooth.describe :as d]))
(defn username-taken?
[username db]
(some #(= username (:username %)) db))
(def username-empty (d/empty :username))
(def username-invalid-length (d/count-not-in-range :username 6 24))
(def username-not-alnum (d/not-alnum :username))
(def username-taken
{:pred username-taken?
:args [:username (d/context :db)]
:dscr [::username-taken]})
(def new-user-rules
[[username-empty username-invalid-length username-taken]
[username-empty username-not-alnum username-taken]])
(d/describe {} new-user-rules)
;; =>
#{[:username [:sweet-tooth.describe/empty]]}
(d/describe {:username "b3!"} new-user-rules)
;; =>
#{[:username [:sweet-tooth.describe/count-not-in-range 6 24]]
[:username [:sweet-tooth.describe/not-alnum]]}
(d/describe {:username "bubba56"}
new-user-rules
{:db [{:username "bubba56"}]})
;; =>
#{[:username [:examples/username-taken]]}
(Note for the observant, describe
returns values like #{[:username [:examples/username-taken]]}
, which obviously need to be transformed
before you can associate validation messages with invalid values. The
function map-rollup-descriptions
helps with this.)
If you're using describe to validate data, then you can treat the presence of a description to mean the value is invalid.
describe
functionYou use the sweet-tooth.describe/describe
function to describe
some value. To describe a value is to apply a set of predicate
functions to it. For each predicate that returns true, a description
is added to a set of descriptions which will constitute describe
's
return value. The following sections will cover all the details of
writing predicates and descriptions.
describe
functionThe describe
function takes three arguments:
You call it like this:
(d/describe {:blub :dub} ; thing described
#{rule-1 rule-2} ; set of rules
{:optional :context}) ; optional context
The first argument doesn't have to be a map, it can be any value. If you pass in a map, then the rules should be written to handle a map. If you pass in a sequential value, then the rules should be written to handle a sequential value, and so on.
A rule is a map with three keys:
:pred
, a predicate function that determines whether the
description should be applied:args
, a vector of functions that supply arguments to the
predicate function:dscr
, description details, which can be any data or a
function. If a function, it will be applied to the return value of
the predicate function to get a final description details.Here's a simple example:
(in-ns 'tutorial)
(require '[sweet-tooth.describe :as d])
(def username-empty
{:pred empty?
:args [:username]
:dscr [::username-empty]})
(d/describe {:username "hurmp"} #{username-empty})
;; =>
nil
(d/describe {:username nil} #{username-empty})
;; =>
#{[:username [::username-empty]]}
Here we're first describing the valaue {:username "hurmp"}
using the
username-empty
rule. username-empty
's predicate function is
empty?
, and empty?
is applied to the value returned by
:username
- in this case, "hurmp"
. Since the predicate function
returns false, no description is applied, and when no descriptions are
applied describe
returns nil
.
Next we describe {:username nil}
. Since empty?
returns true, we
add a description, [:username [::username-empty]]
, to a set of
descriptions. A description is a vector of two elements, an identifier
(:username
) and details ([::username-empty]
). The details are
specified with the :dscr
key of the rule.
:pred
, :args
, and :dscr
Above I said that :args
is "a vector of functions that supply
arguments to the predicate function." In the previous example, you
applied a rule to the map {:username "hurmp"}
. The rule's
predicate function was empty?
, and the :args
vector contained
:username
. This specifies, "call :username
as a function on the
value being described, {:username "hurmp"}
. Pass the return value,
"hurmp"
, to the predicate function empty?
.
Since :args
is a vector of functions, you can have predicate
functions that take more than value. One place you'd want to use this
is when validating that :password
and :password-confirmation
fields match:
(def passwords-dont-match
{:pred not=
:args [:password :password-confirmation]
:dscr [::passwords-dont-match]})
(d/describe {:password "secure" :password-confirmation "$ecure"}
#{passwords-dont-match})
;; =>
#{[:password [:examples/passwords-dont-match]]}
You need to take extra care when you want to pass in a keyword as a constant. Check this out:
(def missing-keys
{:pred #(empty? (select-keys %1 %&))
:args [identity (constantly :a) (constantly :b)]
:dscr [::missing-keys]})
(d/describe {:username "hurmp"} #{missing-keys})
;; =>
#{[#function[clojure.core/identity] [:examples/missing-keys]]}
We want to apply this rule to a map as a whole; we don't want it
to apply to any particular key. Therefore, the first element in
:args
is identity
- this passes in the entire map to the predicate
function. The arguments (constantly :a)
and (constantly :b)
yield
the keys that we want to check for. We have to wrap keywords because
describe
would tries to call the keywords as functions with
{:username "hurmp"}
as their argument. It's as if you wrote this:
(let [to-describe {:username "hurmp"}
pred #(empty? (select-keys %1 %&))]
(pred (identity to-describe)
((constantly :a) to-describe)
((constantly :b) to-describe)))
The description returned in the previous example is funky:
[#function[clojure.core/identity] [:examples/missing-keys]]
The identifier (first element of the description vector) is the
function identity
. This is because, by default, describe uses the
first element in args
as the identifier. You can specify your own
identifier with :as
:
(def missing-keys-custom-identifier
{:pred #(empty? (select-keys %1 %&))
:args [identity (constantly :a) (constantly :b)]
:dscr [::missing-keys]})
(d/describe {:username "hurmp"} #{missing-keys})
; =>
#{[:entire-map [:examples/missing-keys]]}
If you want to spice things up even more, you can also provide a
function for description details. That function should take one
argument, which will be the return value of the predicate
function. For example, you could take advantage of this feature to
write a rule that includes the return value of clojure.spec's
explain-data
:
(:require '[clojure.spec.alpha :as s])
(s/def ::username string?)
(def username-explain-data
{:pred (partial s/explain-data ::username)
:args [:username]
:dscr (fn [explanation]
[::username-explan-data explanation])})
(d/describe {:username 3} #{username-explain-data})
; =>
#{[:username
[:examples/username-explan-data
#:clojure.spec.alpha{:problems
[{:path [],
:pred clojure.core/string?,
:val 3,
:via [:examples/username],
:in []}],
:spec :examples/username,
:value 3}]]}
context
Sometimes validating input requires comparing it to data outside the input. A common usecase is to check whether a username is taken. Here's how you could do that with describe:
(defn username-taken?
[username db]
(some #(= username (:username %)) db))
(def username-taken
{:pred username-taken?
:args [:username (d/context :db)]
:dscr [::username-taken]})
(d/describe {:username "bubba56"}
new-user-rules
{:db [{:username "bubba56"}]})
;; #{[:username [:examples/username-taken]]}
describe
's third argument is the context. You can supply any value
you want here, but it probably makes sense to pass in a map - in this
case, we're passing in {:db [{:username "bubba56"}]}
. Rules can
access the context by using the context
function, which you can see
with :args [:username (d/context :db)]
. Hopefully you can see how
the context value is threaded from the call to describe
, through the
context
function in :args
, and finally passed to the predicate
function username-taken?
.
context
takes one argument, a function to apply to the context; the
return value is passed to the predicate function.
describe
's second argument takes a data structure that represents a
graph, and that graph is used to control when describe
should
attempt to apply a rule. The next couple sections explain the
relationship between the rule graph and control flow, and how to
define a rule graph.
You can structure rules like this:
B
↗
A
↘
C
Which means, If the rule A is applied, don't apply B or C. If A is not applied, then it's possible for B and C to both apply. You could use this when describing passwords: The A rule would check whether the password exists. If it does, then B would check its length and C would check whether it contains special characters.
You can also structure rules like this:
A
↘
C
↗
B
This means, Try to apply both A and C. If either applies, don't try to apply B. You might use this when checking usernames: A would check username length, and B would check that it doesn't contain special characters. C would check whether the username exists. Since that involves a database operation, we don't want to perform it unless we know that it's a valid username.
Rules form a directed graph, and the application of any rule prevents the application of all rules in the subgraph reachable from the parent rule.
The second argument to describe
is a sequence (preferably a vector
or set for readability) representing a graph of rules. Here, the
set #{username-empty}
is transformed into a graph with a single node:
(d/describe {:username "hurmp"} #{username-empty})
Graph syntax is as follows, with arrows representing directed edges in a digraph.
Directed edges.
#{[A B] [A C]}
;; => A → B, A → C
This set contains two vectors. Each vector represents two nodes, with
a directed edge from the first to the second. This establishes control
flow such that if rule A's predicate returns true, then its
description will be applied and describe
will not attempt to apply B
or C.
Two nodes pointing at one node.
#{[B A] [C A] D}
;; => B → A, C → A
The data structure below describes a graph where both B and C are pointing at A. If either B or C applies, describe will not attempt to apply A.
Unconnected nodes.
#{A}
Describe will always attempt to apply unconnected nodes.
Maps as digraphs.
#{{A [B C]}}
;; => A → B, A → C
Maps of the form {A [B C ...]}
will form a digraph where A points to
B, C, etc.
Vectors as hierarchy.
#{[A B C]}
;; => A → B → C
A vector of [A B C]
forms a digraph such that A points to B and B
points to C. If A is applied, describe will not attempt to apply B or
C.
Don't get too fancy. If you want to create a complex graph, don't try to get too fancy by having deeply-nested vectors and maps. Sometimes this will require you to write the same node multiple times so that you can specify all of its edges. In the very first example at the top of this README, we saw a graph defined like this:
(def new-user-rules
[[username-empty username-invalid-length username-taken]
[username-empty username-not-alnum username-taken]])
This results in the following graph:
username-invalid-length
↗ ↘
username-empty username-taken
↘ ↗
username-not-alnum
Sometimes you want to validate nested maps? Check it:
(def street-empty (d/empty :street))
(def city-empty (d/empty :city))
(def address-invalid
(d/key-rule :address #{street-empty city-empty}))
(d/describe {:address {:street "street"}} new-user-rules)
; =>
#{[:address #{[:city [:describe.core/empty]]}]}
Using key-rule
, we define a set of rules to apply to the
value associated with :address
. The description for :address
is a
set of descriptions. Indeed, the implementation for key-rule
recursively calls describe
in its predicate function:
(defn key-rule
"Treats value returned by key-fn as new subject that you're applying rules to"
[key-fn rules]
{:pred (fn [key-val ctx] (describe key-val rules ctx))
:args [key-fn (context identity)]})
Is this a good idea? Who knows?
You can also use path-rule
for nested maps:
(d/describe {:person {:address {}}}
[(d/path-rule [:person :address] [street-empty city-empty])])
;; =>
#{[:person
#{[:address
#{[:city ::d/empty]
[:street ::d/empty]}]}]}
It looks like it'd be difficult to relate the descriptions back to the
original map. That's where map-rollup-descriptions
comes in...
NOTE: I would looooove suggestions on how to do this better :)
map-rollup-descriptions
will create maps where it's possible:
(d/map-rollup-descriptions
(d/describe {:person {:address {}}}
[(d/path-rule [:person :address] [street-empty city-empty])]))
;; =>
{:person
#{{:address
#{{:city #{::d/empty}
:street #{::d/empty}}}}}}
;; compare to the un-rolled-up-version:
#{[:person
#{[:address
#{[:city ::d/empty]
[:street ::d/empty]}]}]}
The value associated with :person
is a set with one member, a map
with the :address
key. :address
's value is also a set with one
member, the map #{:city ::d/empty, :street ::d/empty}
.
This is in keeping with describe's feature of enabling multiple
descriptions per keyword. It lets you, for example, associate multiple
descriptions with :city
: you could describe it with both "contains
non-alphanumeric text" and "must be less than 500 characters". It's a
bit unwieldy, though, and I'd love ideas for improvements.
TODO: explain how to use translation to transform values like
[::d/empty]
or [::d/not-in-range 6 24]
into user-friendly text.
TODO: explain map-describe
.
I am not the world's best open source project maintainer. It often takes me weeks or months to respond to issues, PRs, and other pro-social forms of communication. I apologize for this shortcoming.
That being said, I do very much appreciate any feedback: how could this be better? Is something broken? What am I missing? Please do open issues and PRs with your ideas. Thank you in advance for deciding to engage with someone as lacking in joie de maintainership as myself!
Copyright © 2019 Daniel Higginbotham
Distributed under the MIT License
Can you improve this documentation?Edit on GitHub
cljdoc is a website building & hosting documentation for Clojure/Script libraries
× close