A Jepsen history is a list of operations--invocations and completions. A generator's job is to specify what invocations to perform, and when. In a sense, a generator becomes a history as Jepsen incrementally applies it to a real system.
Naively, we might define a history as a fixed sequence of invocations to perform at certain times, but this is impossible: we have only a fixed set of threads, and they may not be free to perform our operations. A thread must be free in order to perform an operation.
Time, too, is a dependency. When we schedule an operation to occur once per second, we mean that only once a certain time has passed can the next operation begin.
There may also be dependencies between threads. Perhaps only after a nemesis has initiated a network partition does our client perform a particular read. We want the ability to hold until a certain operation has begun.
Conceptually, then, a generator is a graph of events, some of which have not yet occurred. Some events are invocations: these are the operations the generator will provide to clients. Some events are completions: these are provided by clients to the generator. Other events are temporal: a certain time has passed.
This graph has some invocations which are ready to perform. When we have a ready invocation, we apply the invocation as an input to the graph, obtaining a new graph, and hand the operation to the relevant client.
A context is a map which provides context for generators. For instance, a generator might need to know the number of threads which will ask it for operations. It can get that number from the context. Users can add their own values to the context map, which allows two generators to share state. When one generator calls another, it can pass a modified version of the context, which allows us to write generators that, say, run two independent workloads, each with their own concurrency and thread mappings.
The standard context mappings, which are provided by Jepsen when invoking the top-level generator, and can be expected by every generator, are:
:time The current Jepsen linear time, in nanoseconds
:free-threads A collection of idle threads which could perform work
:workers A map of thread identifiers to process identifiers
We use (op gen test context)
to ask the generator for the next invocation
that we can process.
The operation can have three forms:
nil
, which means the generator is done, and
there is nothing more to do. Once a generator does this, it must never
return anything other than nil
, even if the context changes.But (op gen test context) returns more than just an operation; it also returns the subsequent state of the generator, if that operation were to be emitted. The two are bundled into a tuple.
(op gen test context) => [op gen'] ; known op [:pending gen] ; unsure nil ; exhausted
The analogous operations for sequences are (first) and (next); why do we
couple them here? Why not use the update mechanism strictly to evolve state?
Because the behavior in sequences is relatively simple: next always moves
forward one item, whereas only some updates actually cause systems to move
forward. Seqs always do the same thing in response to next
, whereas
generators may do different things depending on context. Moreover, Jepsen
generators are often branched, rather than linearly wrapped, as sequences
are, resulting in questions about which branch needs to be updated.
When I tried to strictly separate implementations of (op) and (update), it resulted in every update call attempting to determine whether this particular generator did or did not emit the given invocation event. This is remarkably tricky to do well, and winds up relying on all kinds of non-local assumptions about the behavior of the generators you wrap, and those which wrap you.
We still want the ability to respond to invocations and completions, e.g. by tracking that information in context variables. Therefore, in addition to (op) returning a new generator, we have a separate function, (update gen test context event), which allows generators to react to changing circumstances.
Updates use a context with a specific relationship to the event:
TODO: this is not true yet. Fix this.
Nil is a valid generator; it ignores updates and always yields nil for operations.
IPersistentMaps are generators which ignore updates and return operations which look like the map itself, but with default values for time, process, and type provided based on the context. This means you can write a generator like
{:f :write, :value 2}
and it will generate ops like
{:type :invoke, :process 3, :time 1234, :f :write, :value 2}
Sequences are generators which assume the elements of the sequence are themselves generators. They ignore updates, and return all operations from the first generator in the sequence, then all operations from the second, and so on. They do not synchronize.
Functions are generators which ignore updates and can take either test and context as arguments, or no args. Functions should be mostly pure, but some creative impurity is probably OK. For instance, returning randomized :values for maps is probably all right. I don't know the laws! What is this, Haskell?
Functions can return two things:
In the future, we might consider:
A Jepsen history is a list of operations--invocations and completions. A generator's job is to specify what invocations to perform, and when. In a sense, a generator *becomes* a history as Jepsen incrementally applies it to a real system. Naively, we might define a history as a fixed sequence of invocations to perform at certain times, but this is impossible: we have only a fixed set of threads, and they may not be free to perform our operations. A thread must be *free* in order to perform an operation. Time, too, is a dependency. When we schedule an operation to occur once per second, we mean that only once a certain time has passed can the next operation begin. There may also be dependencies between threads. Perhaps only after a nemesis has initiated a network partition does our client perform a particular read. We want the ability to hold until a certain operation has begun. Conceptually, then, a generator is a *graph* of events, some of which have not yet occurred. Some events are invocations: these are the operations the generator will provide to clients. Some events are completions: these are provided by clients to the generator. Other events are temporal: a certain time has passed. This graph has some invocations which are *ready* to perform. When we have a ready invocation, we apply the invocation as an input to the graph, obtaining a new graph, and hand the operation to the relevant client. ## Contexts A *context* is a map which provides context for generators. For instance, a generator might need to know the number of threads which will ask it for operations. It can get that number from the *context*. Users can add their own values to the context map, which allows two generators to share state. When one generator calls another, it can pass a modified version of the context, which allows us to write generators that, say, run two independent workloads, each with their own concurrency and thread mappings. The standard context mappings, which are provided by Jepsen when invoking the top-level generator, and can be expected by every generator, are: :time The current Jepsen linear time, in nanoseconds :free-threads A collection of idle threads which could perform work :workers A map of thread identifiers to process identifiers ## Fetching an operation We use `(op gen test context)` to ask the generator for the next invocation that we can process. The operation can have three forms: - The generator may return `nil`, which means the generator is done, and there is nothing more to do. Once a generator does this, it must never return anything other than `nil`, even if the context changes. - The generator may return :pending, which means there might be more ops later, but it can't tell yet. - The generator may return an operation, in which case: - If its time is in the past, we can evaluate it now - If its time is in the future, we wait until either: - The time arrives - Circumstances change (e.g. we update the generator) But (op gen test context) returns more than just an operation; it also returns the *subsequent state* of the generator, if that operation were to be emitted. The two are bundled into a tuple. (op gen test context) => [op gen'] ; known op [:pending gen] ; unsure nil ; exhausted The analogous operations for sequences are (first) and (next); why do we couple them here? Why not use the update mechanism strictly to evolve state? Because the behavior in sequences is relatively simple: next always moves forward one item, whereas only *some* updates actually cause systems to move forward. Seqs always do the same thing in response to `next`, whereas generators may do different things depending on context. Moreover, Jepsen generators are often branched, rather than linearly wrapped, as sequences are, resulting in questions about *which branch* needs to be updated. When I tried to strictly separate implementations of (op) and (update), it resulted in every update call attempting to determine whether this particular generator did or did not emit the given invocation event. This is *remarkably* tricky to do well, and winds up relying on all kinds of non-local assumptions about the behavior of the generators you wrap, and those which wrap you. ## Updating a generator We still want the ability to respond to invocations and completions, e.g. by tracking that information in context variables. Therefore, in addition to (op) returning a new generator, we have a separate function, (update gen test context event), which allows generators to react to changing circumstances. - We invoke an operation (e.g. one that the generator just gave us) - We complete an operation Updates use a context with a specific relationship to the event: - The context :time is equal to the event :time - The free processes and worker maps reflect those *prior* to the event taking place. This ensures that generators can examine the worker map to identify which thread performed the given operation. TODO: this is not true yet. Fix this. ## Default implementations Nil is a valid generator; it ignores updates and always yields nil for operations. IPersistentMaps are generators which ignore updates and return operations which look like the map itself, but with default values for time, process, and type provided based on the context. This means you can write a generator like {:f :write, :value 2} and it will generate ops like {:type :invoke, :process 3, :time 1234, :f :write, :value 2} Sequences are generators which assume the elements of the sequence are themselves generators. They ignore updates, and return all operations from the first generator in the sequence, then all operations from the second, and so on. They do not synchronize. Functions are generators which ignore updates and can take either test and context as arguments, or no args. Functions should be *mostly* pure, but some creative impurity is probably OK. For instance, returning randomized :values for maps is probably all right. I don't know the laws! What is this, Haskell? Functions can return two things: - nil: signifies that the function generator is exhausted. - a tuple of [op gen]: passed through directly; the gen replaces the fn - a map: the map is treated as a generator, which lets it fill in a process, time, etc. In the future, we might consider: - Returning any other generator: the function could be *replaced* by that generator, allowing us to compute generators lazily?
(all-processes context)
Given a context, returns all processes currently being executed by threads.
Given a context, returns all processes currently being executed by threads.
(all-threads context)
Given a context, returns a collection of all threads.
Given a context, returns a collection of all threads.
(any & gens)
Takes multiple generators and binds them together. Operations are taken from any generator. Updates are propagated to all generators.
Takes multiple generators and binds them together. Operations are taken from any generator. Updates are propagated to all generators.
(clients client-gen)
(clients client-gen nemesis-gen)
In the single-arity form, wraps a generator such that only clients request operations from it. In its two-arity form, combines a generator of client operations and a generator for nemesis operations into one. When the process requesting an operation is :nemesis, routes to the nemesis generator; otherwise to the client generator.
In the single-arity form, wraps a generator such that only clients request operations from it. In its two-arity form, combines a generator of client operations and a generator for nemesis operations into one. When the process requesting an operation is :nemesis, routes to the nemesis generator; otherwise to the client generator.
(delay-til dt gen)
Given a time dt in seconds, and an underlying generator gen, constructs a generator which aligns invocations to intervals of dt seconds.
Given a time dt in seconds, and an underlying generator gen, constructs a generator which aligns invocations to intervals of dt seconds.
(dissoc-vec v i)
Cut a single index out of a vector, returning a vector one shorter, without the element at that index.
Cut a single index out of a vector, returning a vector one shorter, without the element at that index.
(each-thread gen)
Takes a generator. Constructs a generator which maintains independent copies of that generator for every thread. Each generator sees exactly one thread in its free process list. Updates are propagated to the generator for the thread which emitted the operation.
Takes a generator. Constructs a generator which maintains independent copies of that generator for every thread. Each generator sees exactly one thread in its free process list. Updates are propagated to the generator for the thread which emitted the operation.
(f-map f-map g)
Takes a function f-map
converting op functions (:f op) to other functions,
and a generator g
. Returns a generator like g
, but where fs are replaced
according to f-map
. Useful for composing generators together for use with a
composed nemesis.
Takes a function `f-map` converting op functions (:f op) to other functions, and a generator `g`. Returns a generator like `g`, but where fs are replaced according to `f-map`. Useful for composing generators together for use with a composed nemesis.
(filter f gen)
A generator which filters operations from an underlying generator, passing
on only those which match (f op). Like map
, :pending and nil operations
bypass the filter.
A generator which filters operations from an underlying generator, passing on only those which match (f op). Like `map`, :pending and nil operations bypass the filter.
(free-processes context)
Given a context, returns a collection of processes which are not actively processing invocations.
Given a context, returns a collection of processes which are not actively processing invocations.
(free-threads context)
Given a context, returns a collection of threads that are not actively processing invocations.
Given a context, returns a collection of threads that are not actively processing invocations.
(op gen test context)
(update gen test context event)
Updates the generator to reflect an event having taken place
Updates the generator to reflect an event having taken place
(ignore-updates gen)
Wraps a generator. Any call to update
is ignored, returning this
generator with no changes.
It's not clear if this actually confers any performance advantage right now.
Wraps a generator. Any call to `update` is ignored, returning this generator with no changes. It's not clear if this actually confers any performance advantage right now.
(limit remaining gen)
Wraps a generator and ensures that it returns at most limit
operations.
Propagates every update to the underlying generator.
Wraps a generator and ensures that it returns at most `limit` operations. Propagates every update to the underlying generator.
(log msg)
A generator which, when asked for an operation, logs a message and yields nil.
A generator which, when asked for an operation, logs a message and yields nil.
(map f gen)
A generator which wraps another generator g, transforming operations it
generates with (f op test process), of if that fails, (f op). When the
underlying generator yields :pending or nil, this generator does too, without
calling f
. Passes updates to underlying generator.
A generator which wraps another generator g, transforming operations it generates with (f op test process), of if that fails, (f op). When the underlying generator yields :pending or nil, this generator does too, without calling `f`. Passes updates to underlying generator.
(mix gens)
A random mixture of several generators. Takes a collection of generators and chooses between them uniformly. Ignores updates; some users create broad (hundreds of generators) mixes.
To be precise, a mix behaves like a sequence of one-time, randomly selected generators from the given collection. This is efficient and prevents multiple generators from competing for the next slot, making it hard to control the mixture of operations.
A random mixture of several generators. Takes a collection of generators and chooses between them uniformly. Ignores updates; some users create broad (hundreds of generators) mixes. To be precise, a mix behaves like a sequence of one-time, randomly selected generators from the given collection. This is efficient and prevents multiple generators from competing for the next slot, making it hard to control the mixture of operations.
(nemesis nemesis-gen)
(nemesis nemesis-gen client-gen)
In the single-arity form, wraps a generator such that only the nemesis requests operations from it. In its two-arity form, combines a generator of client operations and a generator for nemesis operations into one. When the process requesting an operation is :nemesis, routes to the nemesis generator; otherwise to the client generator.
In the single-arity form, wraps a generator such that only the nemesis requests operations from it. In its two-arity form, combines a generator of client operations and a generator for nemesis operations into one. When the process requesting an operation is :nemesis, routes to the nemesis generator; otherwise to the client generator.
(next-process context thread)
When a process being executed by a thread crashes, this function returns the
next process for a given thread. You should probably only use this with the
global context, because it relies on the size of the :workers
map.
When a process being executed by a thread crashes, this function returns the next process for a given thread. You should probably only use this with the *global* context, because it relies on the size of the `:workers` map.
(on-threads f gen)
Wraps a generator, restricting threads which can use it to only those threads which satisfy (f thread). Alters the context passed to the underlying generator: it will only include free threads and workers satisfying f. Updates are passed on only when the thread performing the update matches f.
Wraps a generator, restricting threads which can use it to only those threads which satisfy (f thread). Alters the context passed to the underlying generator: it will only include free threads and workers satisfying f. Updates are passed on only when the thread performing the update matches f.
(on-threads-context f ctx)
Helper function to transform contexts for OnThreads.
Helper function to transform contexts for OnThreads.
(once gen)
Emits only a single item from the underlying generator.
Emits only a single item from the underlying generator.
(phases & generators)
Takes several generators, and constructs a generator which evaluates everything from the first generator, then everything from the second, and so on.
Takes several generators, and constructs a generator which evaluates everything from the first generator, then everything from the second, and so on.
(process->thread context process)
Takes a context and a process, and returns the thread which is executing that process.
Takes a context and a process, and returns the thread which is executing that process.
(process-limit n gen)
Takes a generator and returns a generator with bounded concurrency--it emits operations for up to n distinct processes, but no more.
Specifically, we track the set of all processes in a context's workers
map:
the underlying generator can return operations only from contexts such that
the union of all processes across all such contexts has cardinality at most
n
. Tracking the union of all possible processes, rather than just those
processes actually performing operations, prevents the generator from
"trickling" at the end of a test, i.e. letting only one or two processes
continue to perform ops, rather than the full concurrency of the test.
Takes a generator and returns a generator with bounded concurrency--it emits operations for up to n distinct processes, but no more. Specifically, we track the set of all processes in a context's `workers` map: the underlying generator can return operations only from contexts such that the union of all processes across all such contexts has cardinality at most `n`. Tracking the union of all *possible* processes, rather than just those processes actually performing operations, prevents the generator from "trickling" at the end of a test, i.e. letting only one or two processes continue to perform ops, rather than the full concurrency of the test.
(random-int-seq)
(random-int-seq seed)
Generates a reproducible sequence of random longs, given a random seed. If seed is not provided, taken from (rand-int)).
Generates a reproducible sequence of random longs, given a random seed. If seed is not provided, taken from (rand-int)).
(sleep)
Informally, pauses for dt seconds before yielding nil
for an operation.
Formally, this is sort of a weird one, because everything here is pure, and
there's no notion of blocking. We can delay operations, which have times, but
delaying nil, which does NOT have a time, isn't straightforward.
What we'll need to do, later, is invent a special type of operation map which delays the scheduler and is not passed to clients. Or just refuse to implement this at all. It's nicely readable, but it's semantically kind of weird (Who sleeps? When?) compared to delaying a subsequent operation, which has a well defined behavior.
Informally, pauses for dt seconds before yielding `nil` for an operation. Formally, this is sort of a weird one, because everything here is pure, and there's no notion of blocking. We can delay operations, which have times, but delaying *nil*, which does NOT have a time, isn't straightforward. What we'll need to do, later, is invent a special type of operation map which delays the scheduler and is *not* passed to clients. Or just refuse to implement this at all. It's nicely readable, but it's semantically kind of weird (Who sleeps? When?) compared to delaying a subsequent operation, which has a well defined behavior.
(soonest-op-vec pair1 pair2)
Takes two [op, ...] vectors, and returns the vector whose op occurs first.
Op maps occur before those which are :pending. :pending occurs before nil
.
We use vectors here because you may want to pass [op, gen'] pairs, or possibly encode additional information into the vector, so you can, for instance, identify which of several generators was the next one.
Takes two [op, ...] vectors, and returns the vector whose op occurs first. Op maps occur before those which are :pending. :pending occurs before `nil`. We use vectors here because you may want to pass [op, gen'] pairs, or possibly encode additional information into the vector, so you can, for instance, identify *which* of several generators was the next one.
(stagger dt gen)
Wraps a generator. Operations from that generator are delayed by a uniform random time between 0 to 2 * dt.
Note that unlike jepsen's original version of stagger
, this delay applies
to all operations, not to each thread independently. If your old stagger
dt is 10, and your concurrency is 5, your new stagger dt should be 2.
Wraps a generator. Operations from that generator are delayed by a uniform random time between 0 to 2 * dt. Note that unlike jepsen's original version of `stagger`, this delay applies to *all* operations, not to each thread independently. If your old stagger dt is 10, and your concurrency is 5, your new stagger dt should be 2.
(synchronize gen)
Takes a generator, and waits for all workers to be free before it begins.
Takes a generator, and waits for all workers to be free before it begins.
(then a b)
Generator A, synchronize, then generator B. Note that this takes its arguments backwards: b comes before a. Why? Because it reads better in ->> composition. You can say:
(->> (fn [] {:f :write :value 2})
(limit 3)
(then (once {:f :read})))
Generator A, synchronize, then generator B. Note that this takes its arguments backwards: b comes before a. Why? Because it reads better in ->> composition. You can say: (->> (fn [] {:f :write :value 2}) (limit 3) (then (once {:f :read})))
(time-limit dt gen)
Takes a time in seconds, and an underlying generator. Once this emits an operation (taken from that underlying generator), will only emit operations for dt seconds.
Takes a time in seconds, and an underlying generator. Once this emits an operation (taken from that underlying generator), will only emit operations for dt seconds.
(validate gen)
Validates the well-formedness of operations emitted from the underlying generator.
Validates the well-formedness of operations emitted from the underlying generator.
cljdoc is a website building & hosting documentation for Clojure/Script libraries
× close