The tensor abstraction is designed to provide a mid-level math library built on top of the abstractions defined in the compute layer. The tensor system is built around a few key concepts, most notably the separation of a tensor's data from the description of that data.
Logically, a tensor is a simple map of:

```clojure
{:buffer      data-buffer
 :description description}
```
The description encodes information such as the shape (a vector of dimension sizes) and the strides (a vector of per-dimension strides into the buffer). Each backend promises to obey the rules set in the description. This means that, for example, an in-place transpose operation (one that never moves the underlying data) looks like:
```clojure
(defn transpose
  "Transpose the dimensions.  Returns a new dimensions map that will access
  memory in a transposed order."
  [{:keys [shape strides]} reorder-vec]
  ;; when-not-error throws (with the message and data map) when the condition is false.
  (when-not-error (= (count (distinct reorder-vec))
                     (count shape))
    "Every dimension must be represented in the reorder vector"
    {:shape shape
     :reorder-vec reorder-vec})
  ;; Reorder shape and strides; the underlying buffer is untouched.
  (let [shape (mapv #(get shape %) reorder-vec)
        strides (mapv #(get strides %) reorder-vec)]
    {:shape shape
     :strides strides}))
```
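As a usage sketch (hypothetical example values, not from the original document): transposing a row-major 2x3 description with the reorder vector [1 0] simply swaps the shape and stride entries.

```clojure
;; `transpose` is the function defined above; the values here are illustrative.
(transpose {:shape [2 3] :strides [3 1]} [1 0])
;; => {:shape [3 2], :strides [1 3]}
;; The same buffer is now read in transposed order, with no data movement.
```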
This is the complete code. The backend contract specifies that a backend only needs to obey the description, so we should be able to increase the number of backends without implementing transpose again for each one.
This is one direct benefit of the separation of data from description: if we agree that the description describes the precise interpretation of the data, and we modify the description in accordance with the contract, then we do not have to change backend code at all in order to change the way the data is interpreted.
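For instance, a hedged sketch using only the logical tensor map shown above (not a specific API call): reinterpreting a 6-element vector as a 2x3 matrix only requires replacing the description; the buffer and the backend are untouched.

```clojure
;; `buf` stands in for a compute-layer buffer of 6 elements.
(def buf [0.0 1.0 2.0 3.0 4.0 5.0])

(def vec-tensor {:buffer      buf
                 :description {:shape [6] :strides [1]}})

;; Reinterpret the same 6 elements as a row-major 2x3 matrix by swapping in a
;; new description.  The buffer is untouched and no backend code runs at all.
(def mat-tensor (assoc vec-tensor :description {:shape [2 3] :strides [3 1]}))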
There are a number of typical operations that can be implemented purely in the dimension.clj namespace; the transpose shown above is one example.
The second major direct benefit of the separation of data from description, combined with the buffer offsetting mechanism described in the compute document, is that it enables techniques such as:

- Allocate the base buffer for all parameter and gradient buffers at once, then use offsetting and descriptions to assign sub-regions and specific shapes to each parameter and gradient buffer. Your optimization pass then needs to optimize exactly one buffer, since optimization currently updates the gradients, the parameters, and the optimizer parameters element-by-element, independently of the tensor shapes.
- Allocate one buffer and create multiple tensors that all exist at the same base address of the buffer (see the sketch below).
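A minimal sketch of the first technique, assuming hypothetical helpers `allocate-buffer` and `sub-buffer` as stand-ins for the compute layer's allocation and offsetting functions (the real functions return device buffers and may differ in name and signature):

```clojure
;; Stand-ins for the compute layer's allocation/offsetting functions; the real
;; functions return device buffers rather than Clojure vectors.
(defn allocate-buffer [n] (vec (repeat n 0.0)))
(defn sub-buffer [buffer offset length] (subvec buffer offset (+ offset length)))

(let [base    (allocate-buffer 12)                          ;; 12 elements total
      weights {:buffer      (sub-buffer base 0 8)           ;; first 8 elements
               :description {:shape [2 4] :strides [4 1]}}
      bias    {:buffer      (sub-buffer base 8 4)           ;; last 4 elements
               :description {:shape [4] :strides [1]}}]
  ;; Both tensors describe sub-regions of `base`, so an optimizer that walks
  ;; the base buffer element-by-element updates every parameter in one pass.
  [weights bias])
```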
Unary, binary, and ternary operations obey the principle that they place their result into a destination buffer, and the destination buffer may itself be involved in the operation. When programmed on a GPU this requires either a reduction operation or the use of the CAS (compare-and-swap) primitive. The operations are defined for all datatypes, but none of them allow marshalling (datatype conversion), as this would explode the space of function signatures needed across all types.
```
y = op(a * x)
y = op(a * x, b * z)
y = op(a * x, b * w, c * z)
```
All of these operations allow any or all of the operands to be scalars. They also all allow the generalized form of broadcasting described in the broadcasting section below for any operand, including the destination. The only restriction is that if the destination is smaller than the operation (meaning the destination is being broadcast), then the operation is only defined for the datatypes for which CAS is defined; those are 4- and 8-byte operands only.
These are accessible through the tensor unary-op!, binary-op!, and ternary-op! functions, respectively.
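As an illustration of the semantics only, here is a plain-Clojure sketch of the binary form `y = op(a * x, b * z)` on ordinary vectors; this is not the device implementation and not the actual binary-op! signature.

```clojure
;; Elementwise semantics of the binary form: each output element is
;; op(a * x_i, b * z_i).  The real operation writes into a destination tensor.
(defn binary-op [op a x b z]
  (mapv (fn [xi zi] (op (* a xi) (* b zi))) x z))

(binary-op max 2.0 [1.0 2.0] 1.0 [3.0 1.0])
;; => [3.0 4.0]
```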
To add a new operation, there are basically five steps:
Your new operation is then set up and will work across all supported broadcast patterns and potentially all datatypes.
Broadcasting is a way of indexing through multiple-operand functions that allows operations such as adding a bias vector to every row of a matrix.
Here is some good documentation
The rules for broadcasting in the tensor system are: