Parallelized context tree traversal.

First, some definitions (sketchy – some details omitted):

1. A _handler_ is a function taking a map and returning a seq of maps (or a symbol naming such a function).
2. A _context_ is a map that may contain a special key, `::handler`, describing a handler that you may run on it.

Now imagine that we have a root context. We can run its `::handler` on it, obtaining a series of child contexts. If these contexts in turn contain their own `::handler`s, we can invoke each on its associated context, obtaining another series of grandchild contexts. Repeatedly applying this process gives rise to a tree, called a _context tree_.

We call that tree _implicit_ because it is never reified as a whole in the process; rather, its nodes are computed individually.

This namespace implements context tree traversal parallelized using core.async, with the following provisos:

- A handler can be either synchronous (in which case it's a function taking a context and returning a seq of contexts) or asynchronous (in which case it takes a seq of contexts and a callback, should return immediately, and should arrange for that callback to be called with a list of return contexts when it's ready). Whether a handler is synchronous or asynchronous depends on a context's `::call-protocol`.
- It supports context priorities, letting you control the order in which the context tree nodes will be visited. These are specified by the `::priority` context key: the lower the number, the higher the priority.
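To make the definitions above concrete, here is a minimal sequential sketch of walking an implicit context tree. It is not this namespace's implementation: it uses no core.async, supports only synchronous handlers given as functions, and ignores `::priority`. The names `split-range` and `leaves` are hypothetical, and `::handler` here resolves to the sketch's own namespace rather than the library's.

```clojure
;; Hypothetical handler: takes a context (a map) and returns a seq of child
;; contexts. Children carrying ::handler get expanded further; children
;; without it are leaves.
(defn split-range
  [{:keys [lo hi]}]
  (if (<= (- hi lo) 1)
    [{:value lo}]                                  ; leaf: no ::handler key
    (let [mid (quot (+ lo hi) 2)]
      [{:lo lo, :hi mid, ::handler split-range}
       {:lo mid, :hi hi, ::handler split-range}])))

(defn leaves
  "Sequential, depth-first walk of the implicit context tree rooted at `root`.
  Nodes are computed one at a time by running each context's ::handler on it;
  the tree as a whole is never built. Returns a vector of leaf contexts."
  [root]
  (loop [stack (list root), acc []]
    (if (empty? stack)
      acc
      (let [ctx (peek stack)]
        (if-let [handler (::handler ctx)]
          ;; Replace the context with its children; reversing keeps the
          ;; first child on top of the stack.
          (recur (into (pop stack) (reverse (handler ctx))) acc)
          ;; No handler: the context is a leaf.
          (recur (pop stack) (conj acc ctx)))))))
```

With these definitions, `(map :value (leaves {:lo 0, :hi 4, ::handler split-range}))` evaluates to `(0 1 2 3)`.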
(close-all! channels)
Closes channels used by the traversal process. Call this function after `wait!` returns.
(launch seed options)
Launches a parallel tree traversal. Spins up a number of core.async threads that actually perform it, then immediately returns a map of channels used to orchestrate the process – most importantly, `:terminate-chan` will be closed when the process completes.

`options` is a map that may include:

- `:leaf-chan`: a channel where seqs of tree leaves will be put (default `nil`)
- `:item-chan`: a channel where seqs of tree nodes will be put (default `nil`)
- `:parallelism`: number of worker threads to create (default 4)
- `:prioritize?`: take into account `::priority` values (default `false`)

To wait until traversal is complete, use `wait!`. Also, remember to use `close-all!` to close the channels returned by this function. See `traverse!` or `chan->seq` for an example of how to put it together.
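The launch, wait, close lifecycle described above could be wired together roughly as follows. This is a sketch, not the library's own `traverse!`: the helper name `run-traversal` is hypothetical, and the three documented functions are passed in as a map so that the example does not assume how this namespace is required. `seed` and `options` are handed to `launch` untouched.

```clojure
(defn run-traversal
  "Hypothetical helper, not part of the documented API. `api` is a map holding
  the documented functions under the keys :launch, :wait! and :close-all!."
  [{:keys [launch wait! close-all!]} seed options]
  (let [channels (launch seed options)]   ; returns immediately with a map of channels
    (try
      (wait! channels)                    ; blocks until the traversal is complete
      (finally
        (close-all! channels)))))         ; close the channels once wait! is done
```

The `try`/`finally` is a design choice of this sketch: it makes sure the channels are closed even if waiting throws. Assuming the namespace were required under an alias such as `t`, a call might look like `(run-traversal {:launch t/launch, :wait! t/wait!, :close-all! t/close-all!} seed {:parallelism 8})`.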
(leaf-seq seed options)
Returns a lazy seq of leaf nodes from a tree traversal. Any channels created will be automatically closed when the seq is fully consumed.
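The "closed when the seq is fully consumed" behaviour can be illustrated with a plain lazy seq that runs a cleanup function only once its end is reached. This is an assumption-laden sketch of the general pattern, not the actual `leaf-seq` code; `seq-with-cleanup`, `closed?` and `demo-seq` are hypothetical names, and an atom stands in for the channels.

```clojure
(defn seq-with-cleanup
  "Hypothetical helper: returns a lazy seq over `coll` that calls the
  zero-argument function `cleanup!` once, when the end of `coll` is reached."
  [coll cleanup!]
  (lazy-seq
    (if-let [s (seq coll)]
      (cons (first s) (seq-with-cleanup (rest s) cleanup!))
      (do (cleanup!) nil))))               ; end of input: clean up, yield nothing

;; Stand-in for "the channels have been closed".
(def closed? (atom false))

(def demo-seq
  (seq-with-cleanup [:a :b :c] #(reset! closed? true)))
```

In this sketch, taking only the first two elements leaves `closed?` false; the cleanup runs only after the whole seq has been realized, for example with `(doall demo-seq)`. If `leaf-seq` follows a similar pattern, a partially consumed seq would leave its channels open.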
(traverse! seed options)
Traverses a tree and returns after the process is complete. Parameters are the same as in `launch`.
(wait! {:keys [terminate-chan]})
Waits until the traversal process is complete.