
org.apache.clojure-mxnet.optimizer


ada-delta

(ada-delta
  {:keys [rho rescale-gradient epsilon wd clip-gradient]
   :as opts
   :or {rho 0.05 rescale-gradient 1.0 epsilon 1.0E-8 wd 0.0 clip-gradient 0}})

AdaDelta optimizer as described in Matthew D. Zeiler, 2012. http://arxiv.org/abs/1212.5701

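A minimal construction sketch (option keys and defaults taken from the signature above); the require below is shared by the sketches in the rest of this page:

(require '[org.apache.clojure-mxnet.optimizer :as optimizer])

;; AdaDelta with a larger decay rate rho and explicit weight decay
(def ada-delta-opt
  (optimizer/ada-delta {:rho 0.9 :epsilon 1e-6 :wd 1e-4}))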

ada-grad

(ada-grad)
(ada-grad {:keys [learning-rate rescale-gradient epsilon wd]
           :or {learning-rate 0.05 rescale-gradient 1.0 epsilon 1.0E-7 wd 0.0}})

AdaGrad optimizer as described in Matthew D. Zeiler, 2012. http://arxiv.org/pdf/1212.5701v1.pdf

  • learning-rate Step size.
  • epsilon A small number to make the update process stable. Default value is 1e-7.
  • rescale-gradient Rescaling factor of gradient.
  • wd L2 regularization coefficient added to all the weights
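
A construction sketch using only the arities shown above:

;; AdaGrad with all defaults, and with a custom step size and weight decay
(def ada-grad-default (optimizer/ada-grad))
(def ada-grad-opt (optimizer/ada-grad {:learning-rate 0.1 :wd 1e-4}))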

adam

(adam)
(adam {:keys [learning-rate beta1 beta2 epsilon decay-factor wd clip-gradient
              lr-scheduler]
       :or {learning-rate 0.002
            beta1 0.9
            beta2 0.999
            epsilon 1.0E-8
            decay-factor (- 1 1.0E-8)
            wd 0
            clip-gradient 0}})

Adam optimizer as described in [King2014]

[King2014] Diederik Kingma, Jimmy Ba, Adam: A Method for Stochastic Optimization, http://arxiv.org/abs/1412.6980

  • learning-rate Step size.
  • beta1 Exponential decay rate for the first moment estimates.
  • beta2 Exponential decay rate for the second moment estimates.
  • epsilon
  • decay-factor
  • wd L2 regularization coefficient added to all the weights
  • clip-gradient Clip gradient to the range [-clip-gradient, clip-gradient]
  • lr-scheduler The learning rate scheduler
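
A hedged construction sketch using only option keys from the signature above:

;; Adam with a smaller step size and gradient clipping
(def adam-opt
  (optimizer/adam {:learning-rate 0.001 :beta1 0.9 :beta2 0.999 :clip-gradient 5}))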

create-state

(create-state optimizer index weight)

Create additional optimizer state such as momentum.

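A sketch of creating the auxiliary state for a single parameter; it assumes org.apache.clojure-mxnet.ndarray for the weight array, and the shape and index are arbitrary examples:

(require '[org.apache.clojure-mxnet.ndarray :as ndarray])

;; momentum state for the parameter at index 0 of a 2x2 weight
(def mom-sgd (optimizer/sgd {:momentum 0.9}))
(def weight  (ndarray/ones [2 2]))
(def state   (optimizer/create-state mom-sgd 0 weight))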

dcasgd

(dcasgd)
(dcasgd
  {:keys [learning-rate momentum lambda wd clip-gradient lr-scheduler]
   :as opts
   :or {learning-rate 0.01 momentum 0.0 lambda 0.04 wd 0.0 clip-gradient 0}})

DCASGD optimizer with momentum and weight regularization. Implementation of the paper 'Asynchronous Stochastic Gradient Descent with Delay Compensation for Distributed Deep Learning'.

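A hedged construction sketch using the option keys from the signature above:

;; DCASGD with momentum and a larger delay-compensation coefficient lambda
(def dcasgd-opt
  (optimizer/dcasgd {:learning-rate 0.01 :momentum 0.9 :lambda 0.1}))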

nag

(nag)
(nag {:keys [learning-rate momentum wd clip-gradient lr-scheduler]
      :as opts
      :or {learning-rate 0.01 momentum 0.0 wd 1.0E-4 clip-gradient 0}})

SGD with Nesterov momentum. It is implemented according to https://github.com/torch/optim/blob/master/sgd.lua

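A hedged construction sketch:

;; SGD with Nesterov momentum
(def nag-opt (optimizer/nag {:learning-rate 0.01 :momentum 0.9}))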

rms-prop

(rms-prop)
(rms-prop {:keys [learning-rate rescale-gradient gamma1 gamma2 wd lr-scheduler
                  clip-gradient]
           :or {learning-rate 0.002
                rescale-gradient 1.0
                gamma1 0.95
                gamma2 0.9
                wd 0.0
                clip-gradient 0}})

RMSProp optimizer as described in Tieleman & Hinton, 2012; see also Eq. (38)-(45) in Alex Graves, 2013, http://arxiv.org/pdf/1308.0850v5.pdf.

  • learning-rate Step size.
  • gamma1 Decay factor of the moving averages for the gradient and gradient^2.
  • gamma2 Momentum factor of the moving average for the gradient.
  • rescale-gradient Rescaling factor of gradient.
  • wd L2 regularization coefficient added to all the weights
  • clip-gradient Clip gradient to the range [-clip-gradient, clip-gradient]
  • lr-scheduler The learning rate scheduler
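
A hedged construction sketch using only option keys from the signature above:

;; RMSProp with a custom step size and squared-gradient decay factor
(def rms-prop-opt
  (optimizer/rms-prop {:learning-rate 0.001 :gamma1 0.9}))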

sgd

(sgd)
(sgd {:keys [learning-rate momentum wd clip-gradient lr-scheduler]
      :as opts
      :or {learning-rate 0.01 momentum 0.0 wd 1.0E-4 clip-gradient 0}})

A very simple SGD optimizer with momentum and weight regularization.

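A construction sketch using the option keys from the signature above:

;; plain SGD with momentum and weight decay
(def sgd-opt
  (optimizer/sgd {:learning-rate 0.1 :momentum 0.9 :wd 1e-4}))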

sgld

(sgld)
(sgld {:keys [learning-rate rescale-gradient wd clip-gradient lr-scheduler]
       :or {learning-rate 0.01 rescale-gradient 1 wd 1.0E-4 clip-gradient 0}})

Stochastic Langevin Dynamics Updater to sample from a distribution.

  • learning-rate Step size.
  • rescale-gradient Rescaling factor of gradient.
  • wd L2 regularization coefficient added to all the weights
  • clip-gradient Float; clip gradient to the range [-clip-gradient, clip-gradient]
  • lr-scheduler The learning rate scheduler
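
A hedged construction sketch:

;; SGLD with a custom step size and weight decay
(def sgld-opt (optimizer/sgld {:learning-rate 0.01 :wd 1e-4}))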

update

(update optimizer index weight grad state)

Update the parameters.

  • optimizer The optimizer.
  • index A unique integer key used to index the parameters.
  • weight The weight NDArray.
  • grad The gradient NDArray.
  • state NDArray or other object returned by create-state; the auxiliary state used in optimization.
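
Putting create-state and update together, a hedged sketch of a single manual optimization step; it assumes org.apache.clojure-mxnet.ndarray for the weight and gradient arrays, and the shapes, values, and index are arbitrary examples:

(require '[org.apache.clojure-mxnet.ndarray :as ndarray])

(def opt (optimizer/sgd {:learning-rate 0.1 :momentum 0.9}))
(def w   (ndarray/ones [2 2]))                 ;; parameter weights
(def g   (ndarray/ones [2 2]))                 ;; their gradients
(def s   (optimizer/create-state opt 0 w))     ;; auxiliary (momentum) state

;; apply one update to the parameter at index 0 using its gradient and state
(optimizer/update opt 0 w g s)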
