Liking cljdoc? Tell your friends :D

Shelob

Shelob wraps Selenium to let you browse a website and scrape its contents.

Rationale

Selenium automates web browsing, primarily for testing and administration purposes.

Shelob wraps Selenium to make it more idiomatic and adherent to the Clojure way of coding, and exposes facilities to scrape web pages.

Example

A simple DuckDuckGo search:

  • Type "clojure" on the search field

  • Click on the magnify glass to perform the search

  • Retrieve the URLs of the visible results

(require '[shelob.core :as sh])
(require '[shelob.browser :as shb])
(require '[shelob.scraper :as shs])

(def context
  {:driver-options {:browser :firefox}
   :pool-size 2
   :init-messages [{:msg :go :url "https://duckduckgo.com/"}]}))

(defn scrape-result
  [document]
  (map shs/text (shs/select document ".result__url__domain")))

(defn example
  []
  (sh/init context)
  (let [msg [{:msg :fill
              :locator (shb/by-css-selector "#search_form_input_homepage")
              :text "Clojure"}
             {:msg :click :locator (shb/by-css-selector "#search_button_homepage")}]]
    (sh/send-message context scrape-result msg))
  (sh/stop))

Running (example) results in:

user> [https://clojure.org https://en.wikipedia.org https://www.clojure.org
https://github.com https://learnxinyminutes.com https://www.reddit.com
https://cursive-ide.com https://marketplace.visualstudio.com https://github.com
https://leiningen.org]

License

Copyright © 2019 7bridges s.r.l. — Distributed under the Apache License 2.0.

Can you improve this documentation?Edit on GitHub

cljdoc is a website building & hosting documentation for Clojure/Script libraries

× close