All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog
and this project adheres to Semantic Versioning.
`(layers/convolutional 3 0 1 64 :parents [:split-1])` will work now.
- Dropout is implemented in tensors (fewer CUDA kernels!!)
- Pooling layer is implemented in tensors.
- Non-overlapping NMS algorithm for YOLO, for cases where you know objects cannot overlap.
- Metrics for rating object detection systems.
- Special NMS algorithm used for yolo is now in and unit tested.
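
For readers unfamiliar with the term, the following is a minimal sketch of generic greedy non-maximum suppression (NMS). It is not cortex's implementation, and it is not the special YOLO variant mentioned above. The box representation (a map with `:x1 :y1 :x2 :y2 :score`) and the function names `area`, `iou`, and `nms` are assumptions made for illustration only.

```clojure
;; Illustrative sketch only; names and data layout are hypothetical,
;; not part of the cortex API.

(defn area
  "Area of an axis-aligned box; zero when the box is empty or inverted."
  [{:keys [x1 y1 x2 y2]}]
  (* (max 0.0 (- x2 x1)) (max 0.0 (- y2 y1))))

(defn iou
  "Intersection over union of two boxes."
  [a b]
  (let [inter (area {:x1 (max (:x1 a) (:x1 b))
                     :y1 (max (:y1 a) (:y1 b))
                     :x2 (min (:x2 a) (:x2 b))
                     :y2 (min (:y2 a) (:y2 b))})
        union (- (+ (area a) (area b)) inter)]
    (if (pos? union) (/ inter union) 0.0)))

(defn nms
  "Greedy NMS: keep the highest-scoring box, drop every remaining box whose
  IoU with it exceeds `threshold`, then repeat on what is left."
  [boxes threshold]
  (loop [candidates (vec (sort-by :score > boxes))
         kept []]
    (if-let [best (first candidates)]
      (recur (vec (remove #(> (iou best %) threshold) (rest candidates)))
             (conj kept best))
      kept)))

;; Usage: the second box overlaps the first heavily and is suppressed;
;; the third box is disjoint from the first and is kept.
(nms [{:x1 0 :y1 0 :x2 10 :y2 10 :score 0.9}
      {:x1 1 :y1 1 :x2 11 :y2 11 :score 0.8}
      {:x1 20 :y1 20 :x2 30 :y2 30 :score 0.7}]
     0.5)
```

With a threshold of `0.0`, this sketch discards any box that overlaps an already kept box at all. That is one possible reading of a non-overlapping variant, offered here as an assumption rather than a description of the algorithm shipped in this release.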
## [0.9.14]
- Yolo-style loss implemented with the tensor framework.
- Many optimizations and bugfixes around the tensor system.
- Many fast paths in the tensor system are now mapped to cuDNN functions.
- Resnet50 optimizations - Memory significantly decreased (batch size of 32 possible in well under 1G video RAM).
- Resnet50 optimizations - Elide split when doing inference; simply reuse buffer without any copy operations.
- Resnet50 optimizations - GPU now pegged at 100% while training; batch upload happening during compute 100% of the time.
- Memory leak when calling CUDA kernels (!!)
- Small fix to ensure compilation in clojure-1.9 works properly
- Batch normalization could produce NaN in some cases.
- "Censor" loss to prevent propagating gradients when labels are unknown
- model-upgrader project to upgrade models from older versions of cortex
- orthogonal weight initialization #178
- tensorboard view #172
- Loss functions are moved to their individual files to be consistent with optimizer layout
- Only save base Java types in file. This avoids incompatibility issues over time and across upgrades #163
- Dependencies reduced, and most libraries updated to the latest possible version.
- Thread colorspace into experiment so the MNIST framework can be used for color images #162.
- inferring and training were subtly broken.
- Bugfixes in the classification example.
- CPU-only support. Cortex can now run on the CPU without CUDA drivers being installed.
- docker-example -- A simple example of how to run a cortex project in a docker container.
- multi-thread -- The execution context now supports specifying the device, allowing for more advanced asynchronous computations like pipeline parallelism and using multiple devices.