jepsen.independent

Some tests are expensive to check--for instance, linearizability--which
requires that we verify only short histories. But if histories are short, we
may not be able to sample often or long enough to reveal concurrency errors.
This
namespace supports splitting a test into independent components--for example
taking a test of a single register and lifting it to a *map* of keys to
registers.
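The lifting described above can be sketched as follows. This is an illustrative fragment, not part of the namespace: it assumes jepsen and knossos are on the classpath, `r-w-gen` is a hypothetical single-register read/write generator, and `checker/linearizable`'s options map is from recent jepsen versions (older versions took a model directly).

```clojure
(require '[jepsen.independent :as independent]
         '[jepsen.checker :as checker]
         '[jepsen.generator :as gen]
         '[knossos.model :as model])

;; A single-register workload, lifted to a map of integer keys to registers:
;; each key gets its own short, cheaply-checkable history.
(def workload
  {:generator (independent/sequential-generator
                (range)                      ; keys 0, 1, 2, ...
                (fn [k]
                  (gen/limit 100 r-w-gen)))  ; short history per key
   :checker   (independent/checker
                (checker/linearizable {:model (model/register)}))})
```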

checker

(checker checker)

Takes a checker that operates on :values like `v`, and lifts it to a checker
that operates on histories with values of `[k v]` tuples--like those
generated by `sequential-generator`.

We partition the history into (count (distinct keys)) subhistories. The
subhistory for key k contains every element from the original history
*except* those whose values are MapEntries with a different key. This means
that every history sees, for example, un-keyed nemesis operations or
informational logging.

The checker we build is valid iff the given checker is valid for all
subhistories. Under the :results key we store a map of keys to the results
from the underlying checker on the subhistory for that key. :failures is the
subset of that :results map which were not valid.
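Per the description above, the lifted checker's output for two keys, one of which failed, might look like this. The values are illustrative only; the shape of each per-key result depends entirely on the underlying checker.

```clojure
{:valid?   false
 :results  {0 {:valid? true}
            1 {:valid? false}}
 ;; the subset of :results which were not valid
 :failures {1 {:valid? false}}}
```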

concurrent-generator

(concurrent-generator n keys fgen)

Takes a positive integer n, a sequence of keys (k1 k2 ...) and a function
(fgen k) which, when called with a key, yields a generator. Returns a
generator which runs tests on independent keys concurrently, with n threads
per key. Once a key's generator is exhausted, it obtains a new key,
constructs a new generator from that key, and moves on.

The nemesis does not run in subgenerators; only normal workers evaluate these
operations.

The concurrency model may change; I'm not sure what the best way to do it is.

One strategy is to divide the set of processes into n groups, and have each
group focus on one key. This might interact poorly with `gen/reserve`, which
limits some processes to specific generators. We can still run with
`gen/reserve`, but users will have to know that if their concurrency is, say,
100 and they have n=10, then *inside* an independent generator they only have
10 threads to work with. We also can't have n > concurrency, because there
wouldn't be enough processes, and realistically you want a process for each
node minimum, so concurrency=100 with 5 nodes could run twenty keys at most.

Another tactic is to have each process cycle through the various active keys.
This is *safe*, because once the checker strains through the history, each
subhistory consists of the full set of processes doing what it should. We can
also move to arbitrarily high concurrencies. What concerns me is *timing*: if
one key does an expensive operation, it's gonna prevent *any other* key from
scheduling an op on that process until it comes back. This could mask latency
anomalies in a subhistory, because some processes won't even be *making*
requests when they're tied up working on other keys. We could, I think,
possibly miss concurrency errors by having processes stuck doing other keys'
work, and we can't give you feedback that the test's resolving power has been
compromised. Things like `delay` would have to be rewritten to take into
account process/thread timesharing.

Worse yet, imagine we use an inter-process synchronizer, e.g. gen/phases.
Then the second strategy could actually *deadlock*.

I would rather not write my own thread scheduler, so, we're gonna do the
first strategy and keep using the Java synchronization primitives. We'll
split the process set into n distinct pools. Contiguous, because Jepsen
stripes processes across nodes mod node-count. Generators inside will run
with a reduced test :concurrency, and with a rebound value of gen/*threads*,
so barriers work independently for each key.

I think this coupling is kinda gross, and suggests that a fundamental rewrite
of jepsen.core and the generator implementation might be needed.
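A sketch of the call shape, following the arithmetic above: with a test `:concurrency` of 10 and n = 2 threads per key, five keys run concurrently. This assumes the `independent` and `gen` aliases from earlier, and `r-w-gen` is a hypothetical per-key generator.

```clojure
(independent/concurrent-generator
  2          ; n: threads per key; must not exceed :concurrency
  (range)    ; keys, claimed as each key's generator is exhausted
  (fn [k]
    (gen/limit 50 r-w-gen)))  ; a short, bounded history per key
```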

dir

What directory should we write independent results to?

history-keys

(history-keys history)

Takes a history and returns the set of keys in it.
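For example, given a history mixing keyed tuples with an un-keyed nemesis operation, only the tuple keys should appear in the result (illustrative; this assumes un-keyed values contribute no key):

```clojure
(independent/history-keys
  [{:process 0, :type :invoke, :f :write,
    :value (independent/tuple :x 1)}
   {:process :nemesis, :type :info, :f :start, :value nil}
   {:process 1, :type :invoke, :f :read,
    :value (independent/tuple :y nil)}])
;; => #{:x :y}
```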

sequential-generator

(sequential-generator keys fgen)

Takes a sequence of keys (k1 k2 ...), and a function (fgen k) which, when
called with a key, yields a generator. Returns a generator which starts with
the first key k1 and constructs a generator gen1 from (fgen k1). Returns
elements from gen1 until it is exhausted, then moves to k2.

The generator wraps each :value in the operations it generates. Let (:value
(op gen1)) be v; then the generator we construct yields the kv tuple [k1 v].

fgen must be pure and idempotent.
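Concretely: if, while working on key `:x`, the inner generator yields an op with `:value 3`, the lifted generator emits the same op with `:value [:x 3]`. A minimal sketch, where `writes` is a hypothetical generator of write ops:

```clojure
(independent/sequential-generator
  [:x :y :z]                 ; keys, consumed in order
  (fn [k]
    (gen/limit 10 writes)))  ; fgen must be pure and idempotent

;; inner op:  {:type :invoke, :f :write, :value 3}
;; emitted:   {:type :invoke, :f :write, :value [:x 3]}
```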

subhistory

(subhistory k history)

Takes a history and a key k and yields the subhistory composed of all ops in
history which do not have values with a differing key, unwrapping tuples to
their original values.
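For instance, filtering a three-op history for key `:x` keeps the `:x` op (unwrapped) and the un-keyed op, and drops the `:y` op. An illustrative call, following the contract above:

```clojure
(independent/subhistory
  :x
  [{:value (independent/tuple :x 1)}   ; kept, unwrapped to {:value 1}
   {:value (independent/tuple :y 2)}   ; dropped: a different key
   {:value :start}])                   ; kept: not a keyed tuple
```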

tuple

(tuple k v)

Constructs a kv tuple

tuple?

(tuple? value)

Is the given value generated by an independent generator?
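The `checker` docstring above mentions that keyed values are MapEntries, so a tuple prints like a two-element vector but remains distinguishable from a plain one. A sketch (the MapEntry behavior is inferred from that docstring):

```clojure
(def t (independent/tuple :x 42))
(key t)                       ;; :x
(val t)                       ;; 42
(independent/tuple? t)        ;; true
(independent/tuple? [:x 42])  ;; false: a plain vector isn't tagged
```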
