clojure-hadoop.examples.wordcount1

Clojure only.

mapper-map
reducer-reduce
test-wordcount-1
tool-main
tool-run

mapper-map^clj

(mapper-map this key value context)

This is our implementation of the Mapper.map method. The key and value arguments are instances of classes that implement Hadoop's Writable interface, so we have to convert them to strings or some other type before we can use them. Likewise, the objects we emit through the context (Context.write in the new MapReduce API; the older API used OutputCollector.collect) must also be Writable.
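
The conversion step described above can be sketched without Hadoop on the classpath. This is only an illustration, not the library's source: `word-pairs` is a hypothetical helper, a `StringBuilder` stands in for a Writable value such as Text, and the returned pairs stand in for the objects a real mapper would wrap in Writable types and emit through the context.

```clojure
(import '(java.util StringTokenizer))

(defn word-pairs
  "Hypothetical helper: turns any line-like object into [word 1] pairs."
  [value]
  ;; str calls toString, which is the usual way to get a plain Java
  ;; string out of a Writable such as Text.
  (mapv (fn [word] [word 1])
        (enumeration-seq (StringTokenizer. (str value)))))

;; A StringBuilder stands in for the Writable value here (assumption).
(word-pairs (StringBuilder. "to be or not to be"))
;; => [["to" 1] ["be" 1] ["or" 1] ["not" 1] ["to" 1] ["be" 1]]
```

In a real mapper each word and each count would be wrapped in a Writable (for example Text and LongWritable) before being handed to the context.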

reducer-reduce^clj

(reducer-reduce this key values context)

This is our implementation of the Reducer.reduce method. The key argument is an instance of a class that implements Hadoop's Writable, but 'values' is a Java Iterator that returns successive values. We have to use iterator-seq to get a Clojure sequence from the Iterator.

Beware, however, that Hadoop re-uses a single object for every object returned by the Iterator. So when you get an object from the iterator, you must extract its value (as we do here with the 'get' method) immediately, before accepting the next value from the iterator. That is, you cannot hang on to past values from the iterator.
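
The object re-use pitfall can be reproduced with plain JDK classes. In this sketch, which is not the library's source, `reusing-iterator` is a hypothetical stand-in for Hadoop's values iterator: it hands back the same `AtomicLong` on every call, updated in place. The sketch walks the iterator with an explicit loop rather than a seq, so that the order "read the value, then ask for the next object" is unmistakable.

```clojure
(import '(java.util Iterator)
        '(java.util.concurrent.atomic AtomicLong))

(defn reusing-iterator
  "Hypothetical stand-in for Hadoop's values iterator: every call to
  next returns the SAME AtomicLong, mutated to hold the next number."
  [nums]
  (let [holder    (AtomicLong.)
        remaining (atom (seq nums))]
    (reify Iterator
      (hasNext [_] (boolean (seq @remaining)))
      (next [_]
        (let [v (first @remaining)]
          (swap! remaining rest)
          (.set holder (long v))
          holder)))))

(defn sum-immediately
  "Correct: extracts each value with .get before requesting the next."
  [^Iterator it]
  (loop [sum 0]
    (if (.hasNext it)
      (let [^AtomicLong v (.next it)]
        (recur (+ sum (.get v))))
      sum)))

(defn collect-then-read
  "The mistake: holds on to past objects and reads them afterwards."
  [^Iterator it]
  (loop [held []]
    (if (.hasNext it)
      (recur (conj held (.next it)))
      (mapv (fn [^AtomicLong v] (.get v)) held))))

(sum-immediately (reusing-iterator [1 2 3]))   ;; => 6
(collect-then-read (reusing-iterator [1 2 3])) ;; => [3 3 3]
```

The second call returns the last value three times because all three held references point at the one shared object.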

test-wordcount-1^clj

tool-main^clj

tool-run^clj

(tool-run this args)

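This is our implementation of the Tool.run method. The args argument holds the command-line arguments as a Java array of strings. We have to create a Job object, set all the MapReduce job parameters, then submit the job and wait for it to finish (Job.waitForCompletion in the new MapReduce API; the older API used the static JobClient.runJob method with a JobConf).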

This method must return zero on success or Hadoop will report that the job failed.
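
The return-value convention can be sketched in isolation. Both names below are hypothetical and not part of the library: `submit!` stands in for whatever configures and runs the job and reports success as a truthy value (for instance the boolean returned by Hadoop's `Job.waitForCompletion`), and `run-with` maps that flag to the integer that Tool.run is expected to return.

```clojure
(defn exit-code
  "Hypothetical helper: maps a success flag to a Tool.run result.
  Zero means success; any non-zero value is reported as a failure."
  [succeeded?]
  (if succeeded? 0 1))

(defn run-with
  "submit! is a hypothetical zero-argument function that runs the job
  and returns a truthy value on success."
  [submit!]
  (exit-code (boolean (submit!))))

(run-with (fn [] true))  ;; => 0
(run-with (fn [] false)) ;; => 1
```

Returning a non-zero code on failure is a design choice of this sketch; an implementation that always returns zero after the job call would report success even when the job did not complete.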
