Macros for fast typed access and unsigned pathways.
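A minimal sketch of what an unsigned pathway might mean (the macro name below is illustrative, not the library's actual API): Java bytes are signed, so an unsigned read reinterprets the same bits as a non-negative value in a wider type.

```clojure
(defmacro byte->unsigned
  "Reinterpret a (signed) byte as an unsigned value, returned as a long.
  Illustrative only; the library's real macros may differ."
  [b]
  `(bit-and 0xFF (long ~b)))

(comment
  (byte->unsigned (byte -1)) ;; => 255
  (byte->unsigned (byte 5))  ;; => 5
  )
```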
Base set of protocols required to move information from the host to the device, as well as to enable some form of computation on a given device. A CPU implementation is provided for reference.
Base datatypes are defined:
* Driver: enables enumeration of devices and creation of host buffers.
* Device: creates streams and device buffers.
* Stream: a stream of execution occurring on the device.
* Event: a synchronization primitive emitted in a stream to notify other streams that might be blocking.
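A sketch of how these four protocols might be laid out (protocol and method names below are illustrative assumptions, not the library's actual signatures):

```clojure
(defprotocol PDriver
  "Enumerates devices and creates host buffers."
  (get-devices [driver])
  (allocate-host-buffer [driver elem-count datatype]))

(defprotocol PDevice
  "Creates streams and device buffers."
  (create-stream [device])
  (allocate-device-buffer [device elem-count datatype]))

(defprotocol PStream
  "A stream of execution occurring on the device."
  (copy-host->device [stream host-buffer device-buffer])
  (sync-with-host [stream]))

(defprotocol PEvent
  "A synchronization primitive emitted in a stream."
  (wait-for-event [event stream]))
```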
Place to store global information about the drivers available to the compute subsystem.
Tensor library used to implement the basic math abstraction in a way that is fairly easily implementable across a wide range of compute devices. This abstraction is meant to provide a language in which to implement some amount of functionality, especially useful in quickly testing out algorithmic updates or moving data to/from external libraries. As such, it has extensive support for reshape/select/transpose type operations, but only nominal base math facilities are provided by default.
There is an implicit assumption throughout this file that implementations will loop through smaller entities instead of throwing an exception if sizes don't match. This is referred to as broadcasting in numpy (https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html).
It does mean, however, that certain conditions that would actually be error cases are harder to detect because one has to check for remainders being zero (which potentially could cause a divide by zero error) instead of just checking for equality.
For binary operations there are four forms:
y = a*x op b*y
result = a*x op b*y
y[idx] = a*x[idx] op b*y[idx]
result[idx] = a*x[idx] op b*y[idx]
Op may be: [:+ :* :/].
In the non-indexed cases the element counts of y and x may differ, but they need to be commensurate, meaning that the smaller evenly divides the larger. When writing to result it is important that result is as large as the larger of the two. This is a relaxation of the numpy broadcasting rules to allow more forms of broadcasting; the check is that the remainder is zero, not that the smaller dimension is 1.
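For example, the commensurate check described above might look like this (a sketch; the library's actual validation code may differ):

```clojure
(defn commensurate?
  "True when the smaller element count evenly divides the larger.
  Checks the zero case explicitly to avoid divide-by-zero."
  [x-ecount y-ecount]
  (let [smaller (min x-ecount y-ecount)
        larger  (max x-ecount y-ecount)]
    (and (pos? smaller)
         (zero? (rem larger smaller)))))

(comment
  (commensurate? 6 3) ;; => true  (3 divides 6 evenly)
  (commensurate? 6 4) ;; => false (remainder 2)
  ;; numpy would additionally require the smaller dimension to be 1.
  )
```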
In general we want as much error checking and analysis as possible done in this file, as opposed to at the implementation (compute stream) level, so that different implementations duplicate the fewest possible operations and their edge cases agree to the extent possible.
Details of tensor operation implementation
A compute tensor's dimensions control the shape and stride of the tensor along with the offset into the actual data buffer. This allows multiple backends to share a single implementation of a system that supports transpose, reshape, etc., assuming the backend correctly interprets the shape and stride of the dimension objects.
Shape vectors may contain an index buffer at a specific dimension instead of a number. This means that dimension should be indexed indirectly. If a shape has any index buffers then it is considered an indirect shape.
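As a sketch of the shape/stride idea (the representation below is assumed for illustration, not necessarily the library's actual one), a tensor and its transpose can share one buffer, differing only in stride order:

```clojure
(defn global-offset
  "Map an n-dimensional index to a linear buffer offset using
  per-dimension strides plus a base offset."
  [{:keys [strides offset]} idx-vec]
  (reduce + offset (map * strides idx-vec)))

(comment
  ;; A 2x3 row-major tensor over a 6-element buffer.
  (def dims   {:shape [2 3] :strides [3 1] :offset 0})
  ;; Its transpose: same buffer, shape and strides swapped, no copy.
  (def dims-t {:shape [3 2] :strides [1 3] :offset 0})

  (global-offset dims   [1 2]) ;; => 5
  (global-offset dims-t [2 1]) ;; => 5, the same element
  )
```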
Selecting subsets from a larger set of dimensions leads to its own algebra.
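One small instance of that algebra (a hypothetical sketch): selecting from a previous selection composes into a single equivalent selection by lookup.

```clojure
(defn compose-select
  "Selecting out of an earlier selection composes by lookup,
  yielding a single equivalent selection."
  [prev-selection new-indexes]
  (mapv prev-selection new-indexes))

(comment
  ;; From a dimension of size 5, select indexes [4 3 2];
  ;; then select positions [0 2] of that result:
  (compose-select [4 3 2] [0 2]) ;; => [4 2]
  )
```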
A shape vector entry can be one of several things. We want to be precise in handling each of them and to abstract that handling so that new entry types have a clear path.
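One way to get that clear path (a sketch with assumed entry kinds; the library's actual taxonomy may be richer) is to dispatch on the entry's type, so a new entry kind only requires a new method:

```clojure
(defmulti shape-entry-ecount
  "Number of elements a shape-vector entry addresses.
  Entries may be plain numbers or index buffers."
  (fn [entry]
    (cond (number? entry)     :number
          (sequential? entry) :index-buffer
          :else               :unknown)))

(defmethod shape-entry-ecount :number
  [entry]
  (long entry))

(defmethod shape-entry-ecount :index-buffer
  [entry]
  (count entry))

(comment
  (shape-entry-ecount 4)       ;; => 4
  (shape-entry-ecount [0 2 1]) ;; => 3, an indirectly indexed dimension
  )
```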
Protocol to abstract implementations from the tensor library. Tensors do not appear at this level; at this level we have buffers, streams, and index systems. This is intended to allow operations that fall well outside of the tensor definition to happen with clever use of the buffer and index strategy mechanisms. In essence, the point is to make the kernels as flexible as possible so as to allow extremely unexpected operations to happen without requiring new kernel creation. In addition, the tensor API should be able to stand on some subset of the possible combinations of operations available.
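As a sketch of the shape such a protocol might take (names below are assumptions, not the library's actual API): because kernels receive raw buffers plus index systems rather than tensors, a single assignment kernel can serve transpose, broadcast, and gather-style operations depending on the indexing passed in.

```clojure
(defprotocol PBufferMath
  "Backend entry point operating on buffers and index systems, not tensors."
  (assign! [stream dst dst-index-system src src-index-system n-elems]
    "Copy n-elems from src to dst, each side addressed through its own
    index system (e.g. strided, broadcast, or indirect indexing)."))
```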
Tensor operations with syntactic sugar.
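As an illustration of what such sugar might look like (hypothetical names; binary-op! below stands in for the underlying kernel entry point), an operator-named wrapper over the y = a*x op b*y form:

```clojure
(defn- binary-op!
  "Stand-in for the underlying element-wise kernel:
  result = a*x op b*y."
  [result a x b y op]
  ;; the real implementation dispatches to the compute backend
  result)

(defn add!
  "Sugared form of result = 1*x + 1*y."
  [result x y]
  (binary-op! result 1.0 x 1.0 y :+))
```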
General protocols specific to tensors. Necessary to break the dependency chain.