Tool used to crawl web pages with an arbitrary handler.
(add-worker config)
Given a config object, add a worker to the pool, returns the new worker count.
(crawl options)
Crawl a url with the given config.
(enqueue-url config url)
Enqueue the url, provided the url-count is below the limit and this url has not been seen before.
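The two conditions in that description (a limit on the url count, and a record of urls already seen) can be sketched as below. This is a minimal illustration, not the library's implementation: `make-config`, `enqueue-url-sketch`, and the `:url-limit`, `:seen`, and `:queue` keys are all hypothetical names, and the size of the seen set stands in for the url-count.

```clojure
;; Hypothetical sketch: the config shape and names are assumptions,
;; not the library's actual internals.
(defn make-config
  "Build an illustrative config holding a url limit and mutable state."
  [url-limit]
  {:url-limit url-limit
   :state     (atom {:seen  #{}
                     :queue clojure.lang.PersistentQueue/EMPTY})})

(defn enqueue-url-sketch
  "Queue url unless the limit has been reached or url was already seen.
  Returns true when the url was queued, false otherwise."
  [{:keys [url-limit state]} url]
  (let [[old new] (swap-vals! state
                              (fn [{:keys [seen] :as s}]
                                (if (or (contains? seen url)
                                        (>= (count seen) url-limit))
                                  s ; leave the state untouched
                                  (-> s
                                      (update :seen conj url)
                                      (update :queue conj url)))))]
    ;; The seen set only changes when the url was actually added.
    (not= (:seen old) (:seen new))))
```

Doing the check and the update inside a single `swap-vals!` call keeps them atomic, so two workers enqueueing the same url concurrently cannot both succeed.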
(enqueue-urls config urls)
Enqueue a collection of urls for work.
(extract-all original-url body)
Dumb URL extraction based on regular expressions. Extracts relative URLs.
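As a rough illustration of regex-based extraction that also handles relative URLs, the sketch below pulls `href` values out of a body and resolves each against the page's own URL. The function name `extract-links-sketch` and the `href` pattern are assumptions for the example; the library's actual regular expressions are not shown in this reference.

```clojure
(import '(java.net URI))

;; Hypothetical pattern: matches double-quoted href attributes only.
(def href-pattern #"href=\"([^\"]+)\"")

(defn extract-links-sketch
  "Return a vector of distinct absolute URLs found in body.
  Relative hrefs are resolved against original-url."
  [^String original-url ^String body]
  (let [base (URI. original-url)]
    (->> (re-seq href-pattern body)
         (map second) ; keep the captured href value
         (keep (fn [^String href]
                 (try
                   (str (.resolve base href))
                   ;; Unparseable hrefs are skipped rather than reported.
                   (catch IllegalArgumentException _ nil))))
         distinct
         vec)))
```

For example, with `"http://example.com/a/b.html"` as the original URL, an `href` of `c.html` resolves to `http://example.com/a/c.html` and `/d` resolves to `http://example.com/d`.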
(palm {:keys [url body]})
It's a HAND-ler.. get it? That was a terrible pun.
(remove-worker config)
Given a config object, remove a worker from the pool, returns the new worker count.
(start-worker config)
Start a worker thread for a config object, updating the config's state with the new Thread object.
(stop-workers config)
Given a config object, stop all the workers for that config.
(thread-status config)
Return a map of threadId to Thread.State for a config object.
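The shape of that return value can be sketched as follows. This example takes a plain collection of `Thread` objects rather than a config object, since the reference does not say where the config keeps its threads; `thread-status-sketch` is a hypothetical name.

```clojure
(defn thread-status-sketch
  "Return a map of thread id to Thread.State for a collection of threads."
  [threads]
  (into {}
        (map (fn [^Thread t] [(.getId t) (.getState t)]))
        threads))
```

A thread that has been created but not started reports `Thread$State/NEW`, so the map is also a quick way to spot workers that never ran.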
(valid-url? url-str)
Test whether a URL is valid, returning a map of information about it if valid, nil otherwise.
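The "map if valid, nil otherwise" convention can be sketched with `java.net.URI`. The validity rule used here (the string parses and has both a scheme and a host) and the `:scheme`, `:host`, and `:path` keys are assumptions for illustration; the reference does not specify what the library's map contains.

```clojure
(import '(java.net URI URISyntaxException))

(defn valid-url-sketch
  "Return a map describing url-str when it parses as a URI with a scheme
  and a host, nil otherwise."
  [^String url-str]
  (when url-str
    (try
      (let [uri (URI. url-str)]
        (when (and (.getScheme uri) (.getHost uri))
          {:scheme (.getScheme uri)
           :host   (.getHost uri)
           :path   (.getPath uri)}))
      ;; Strings that do not parse at all are treated as invalid.
      (catch URISyntaxException _ nil))))
```

Because the result is either a map or `nil`, it works directly in `when-let` or `if-let`, which lets the caller test validity and use the parsed parts in one step.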