(ada-delta
{:keys [rho rescale-gradient epsilon wd clip-gradient]
:as opts
:or {rho 0.05 rescale-gradient 1.0 epsilon 1.0E-8 wd 0.0 clip-gradient 0}})
AdaDelta optimizer as described in Matthew D. Zeiler, 2012. http://arxiv.org/abs/1212.5701
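A minimal construction sketch, assuming the conventional require alias for this namespace:

(require '[org.apache.clojure-mxnet.optimizer :as optimizer])

;; defaults from the signature above (rho 0.05, epsilon 1e-8, wd 0.0)
(def ada (optimizer/ada-delta {}))

;; override selected keys; the remaining keys keep their :or defaults
(def ada-tuned (optimizer/ada-delta {:rho 0.9 :wd 1.0e-5}))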
(ada-grad)
(ada-grad {:keys [learning-rate rescale-gradient epsilon wd]
:or {learning-rate 0.05 rescale-gradient 1.0 epsilon 1.0E-7 wd 0.0}})
AdaGrad optimizer as described in Matthew D. Zeiler, 2012. http://arxiv.org/pdf/1212.5701v1.pdf
- learning-rate Step size.
- epsilon A small number that keeps the update numerically stable. Default value is 1e-7.
- rescale-gradient Rescaling factor applied to the gradient.
- wd L2 regularization coefficient added to all the weights.
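Constructed the same way; a sketch using the alias from above:

;; the zero-arity form takes the defaults (learning-rate 0.05, epsilon 1e-7)
(def adagrad (optimizer/ada-grad))

;; larger step size plus a little weight decay
(def adagrad-tuned (optimizer/ada-grad {:learning-rate 0.1 :wd 1.0e-4}))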
(adam)
(adam {:keys [learning-rate beta1 beta2 epsilon decay-factor wd clip-gradient
lr-scheduler]
:or {learning-rate 0.002
beta1 0.9
beta2 0.999
epsilon 1.0E-8
decay-factor (- 1 1.0E-8)
wd 0
clip-gradient 0}})
Adam optimizer as described in [King2014].
[King2014] Diederik Kingma, Jimmy Ba, Adam: A Method for Stochastic Optimization, http://arxiv.org/abs/1412.6980
- learning-rate Step size.
- beta1 Exponential decay rate for the first moment estimates.
- beta2 Exponential decay rate for the second moment estimates.
- epsilon A small constant for numerical stability.
- decay-factor
- wd L2 regularization coefficient added to all the weights.
- clip-gradient Clip the gradient to the range [-clip-gradient, clip-gradient].
- lr-scheduler The learning rate scheduler.
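The returned optimizer is normally handed to a training module rather than driven by hand. The commented lines below are a sketch only, assuming module/fit-params accepts an :optimizer key as in the library's examples; module, train-data, and num-epoch are placeholder names:

(require '[org.apache.clojure-mxnet.module :as m])

(def adam-opt (optimizer/adam {:learning-rate 0.001 :wd 1.0e-5}))

;; (m/fit module {:train-data train-data
;;                :num-epoch num-epoch
;;                :fit-params (m/fit-params {:optimizer adam-opt})})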
(create-state optimizer index weight)
Create additional optimizer state such as momentum.
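For optimizers that keep per-parameter state (e.g. the momentum buffer of sgd), create-state derives that state from the parameter itself. A sketch, assuming the usual ndarray alias:

(require '[org.apache.clojure-mxnet.ndarray :as ndarray])

(def sgd-opt (optimizer/sgd {:learning-rate 0.01 :momentum 0.9}))
(def weight  (ndarray/zeros [2 3]))

;; state for the parameter with index 0; for sgd with momentum this holds the momentum buffer
(def state (optimizer/create-state sgd-opt 0 weight))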
(dcasgd)
(dcasgd
{:keys [learning-rate momentum lambda wd clip-gradient lr-scheduler]
:as opts
:or {learning-rate 0.01 momentum 0.0 lambda 0.04 wd 0.0 clip-gradient 0}})
DCASGD optimizer with momentum and weight regularization. An implementation of the paper 'Asynchronous Stochastic Gradient Descent with Delay Compensation for Distributed Deep Learning'.
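A construction sketch; lambda weights the delay-compensation term described in the paper:

(def dcasgd-opt (optimizer/dcasgd {:learning-rate 0.01 :lambda 0.04}))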
(nag)
(nag {:keys [learning-rate momentum wd clip-gradient lr-scheduler]
:as opts
:or {learning-rate 0.01 momentum 0.0 wd 1.0E-4 clip-gradient 0}})
SGD with Nesterov momentum. Implemented according to https://github.com/torch/optim/blob/master/sgd.lua
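A construction sketch; apart from the Nesterov lookahead it takes the same options as sgd:

;; Nesterov momentum of 0.9 with the default weight decay (1e-4)
(def nag-opt (optimizer/nag {:learning-rate 0.1 :momentum 0.9}))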
(rms-prop)
(rms-prop {:keys [learning-rate rescale-gradient gamma1 gamma2 wd lr-scheduler
clip-gradient]
:or {learning-rate 0.002
rescale-gradient 1.0
gamma1 0.95
gamma2 0.9
wd 0.0
clip-gradient 0}})
RMSProp optimizer as described in Tieleman & Hinton, 2012, following Eq. (38)-(45) of Alex Graves, 2013. http://arxiv.org/pdf/1308.0850v5.pdf
- learning-rate Step size.
- gamma1 Decay factor of the moving averages of the gradient and the squared gradient.
- gamma2 Momentum factor of the moving average of the gradient.
- rescale-gradient Rescaling factor applied to the gradient.
- wd L2 regularization coefficient added to all the weights.
- clip-gradient Clip the gradient to the range [-clip-gradient, clip-gradient].
- lr-scheduler The learning rate scheduler.
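A construction sketch using the gammas described above:

;; defaults: learning-rate 0.002, gamma1 0.95, gamma2 0.9
(def rmsprop-opt (optimizer/rms-prop))

;; slower decay of the running averages and no momentum term
(def rmsprop-tuned (optimizer/rms-prop {:gamma1 0.99 :gamma2 0.0}))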
(sgd)
(sgd {:keys [learning-rate momentum wd clip-gradient lr-scheduler]
:as opts
:or {learning-rate 0.01 momentum 0.0 wd 1.0E-4 clip-gradient 0}})
A very simple SGD optimizer with momentum and weight regularization.
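A sketch showing the common knobs (momentum, L2 weight decay, gradient clipping):

(def sgd-opt (optimizer/sgd {:learning-rate 0.01
                             :momentum 0.9
                             :wd 5.0e-4
                             :clip-gradient 10}))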
(sgld)
(sgld {:keys [learning-rate rescale-gradient wd clip-gradient lr-scheduler]
:or {learning-rate 0.01 rescale-gradient 1 wd 1.0E-4 clip-gradient 0}})
Stochastic Gradient Langevin Dynamics (SGLD) updater for sampling from a distribution.
- learning-rate Step size.
- rescale-gradient Rescaling factor applied to the gradient.
- wd L2 regularization coefficient added to all the weights.
- clip-gradient Clip the gradient to the range [-clip-gradient, clip-gradient].
- lr-scheduler The learning rate scheduler.
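A construction sketch; because SGLD injects noise into every step, repeated updates yield samples rather than a single point estimate:

(def sgld-opt (optimizer/sgld {:learning-rate 0.001 :wd 0.0}))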
(update optimizer index weight grad state)
Update the parameters.
- optimizer The optimizer.
- index A unique integer key used to index the parameters.
- weight The weight NDArray.
- grad The gradient NDArray.
- state The auxiliary optimization state (an NDArray or other object) returned by create-state.
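Putting the pieces together, one manual optimization step looks like the sketch below; weight and grad here are toy values, and in normal training the module or executor drives these calls for you:

(require '[org.apache.clojure-mxnet.optimizer :as optimizer]
         '[org.apache.clojure-mxnet.ndarray :as ndarray])

(def opt    (optimizer/sgd {:learning-rate 0.1 :momentum 0.9}))
(def weight (ndarray/ones [2 2]))   ;; parameter to be updated
(def grad   (ndarray/ones [2 2]))   ;; its gradient for this step
(def state  (optimizer/create-state opt 0 weight))

;; perform one optimization step on weight using grad and the optimizer's state
(optimizer/update opt 0 weight grad state)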