(create-and-apply pcoll name clj-call options)
Inputs: [pcoll :- pcollections/PCollectionType name :- s/Str clj-call :- clj-fn-call/CljCall options :- #:s{Keyword s/Any}] Returns: pcollections/PCollectionType
Create a ParDo operation from a DoFn class, and apply it to a PCollection.
It accepts an options
map parameter that describes how the ParDo is created. You can specify the following keys:
:dofn-cls - which DoFn class is created for the ParDo. For now, it is required that the class inherits from
clj_headlights.AbstractCljDoFn
, because the constructor invocation is hardcoded. Create a CljDoFn by
default.
:outputs - a map associating the (tagged) outputs of the ParDo with their respective coders. This map should always
contain at least a :main key, that is used for the main output coder.
:side-inputs - a collection of side-inputs the ParDo needs. As Beam requires, the given side-inputs need to extend
PCollectionView. Note: this only attaches the views to the ParDo; it is up to you to carry around the
same view object in your code to access the view data.
Inputs: [pcoll :- pcollections/PCollectionType name :- s/Str clj-call :- clj-fn-call/CljCall options :- #:s{Keyword s/Any}] Returns: pcollections/PCollectionType Create a ParDo operation from a DoFn class, and apply it to a PCollection. It accepts an `options` map parameter that describes how the ParDo is created. You can specify the following keys: :dofn-cls - which DoFn class is created for the ParDo. For now, it is required that the class inherits from `clj_headlights.AbstractCljDoFn`, because the constructor invocation is hardcoded. Create a CljDoFn by default. :outputs - a map associating the (tagged) outputs of the ParDo with their respective coders. This map should always contain at least a :main key, that is used for the main output coder. :side-inputs - a collection of side-inputs the ParDo needs. As Beam requires, the given side-inputs need to extend PCollectionView. Note: this only attaches the views to the ParDo; it is up to you to carry around the same view object in your code to access the view data.
(emit-main-output context value)
Emit the main output value
Emit the main output value
(emit-side-output context tag value)
Emit a value to a side output.
Emit a value to a side output.
(get-side-output pcoll tag)
Retrieve the pcollection associated with a given tag in the output of a df-map-with-side-outputs.
Retrieve the pcollection associated with a given tag in the output of a df-map-with-side-outputs.
(get-side-outputs pcoll)
Retrieve the map of Tags to PCollections.
Retrieve the map of Tags to PCollections.
(input-restructurer coder)
Inputs: [coder :- Coder] Returns: s/Any
Given the coder of the input, create a function which pulls the input value from the context and turns it into clojure data structures.
Inputs: [coder :- Coder] Returns: s/Any Given the coder of the input, create a function which pulls the input value from the context and turns it into clojure data structures.
(kv-value-restructurer coder)
Inputs: [coder :- Coder] Returns: s/Any
If the output of the previous stage was a KV, then it may have been the result of a GroupBy or CoGroupBy, which means we need to transform the output of those operations into more idiomatic clojure data structures. For a GroupBy we wrap the list of results in a seq, so that it acts like a normal list. For a CoGroupBy we create a positional vector, where the results for each pcoll that was grouped are in the same position they were sent to co-group-by-key.
Inputs: [coder :- Coder] Returns: s/Any If the output of the previous stage was a KV, then it may have been the result of a GroupBy or CoGroupBy, which means we need to transform the output of those operations into more idiomatic clojure data structures. For a GroupBy we wrap the list of results in a seq, so that it acts like a normal list. For a CoGroupBy we create a positional vector, where the results for each pcoll that was grouped are in the same position they were sent to co-group-by-key.
(make-tags-list tags)
Inputs: [tags :- [s/Keyword]] Returns: TupleTagList
Inputs: [tags :- [s/Keyword]] Returns: TupleTagList
(process-element c
window
serialized-clj-call
creation-stack
input-extractor
&
states)
(set-side-output-coder pcoll tag coder)
Inputs: [pcoll :- PCollectionTuple tag :- s/Keyword coder :- Coder] Returns: PCollection
Sets the coder for the pcollection associated with a given tag in the output of a df-map-with-side-outputs.
Inputs: [pcoll :- PCollectionTuple tag :- s/Keyword coder :- Coder] Returns: PCollection Sets the coder for the pcollection associated with a given tag in the output of a df-map-with-side-outputs.
cljdoc is a website building & hosting documentation for Clojure/Script libraries
× close