This document summarizes the APIs used to initialize and update the model weights during training
.. autosummary::
    :nosignatures:

    mxnet.initializer
    mxnet.optimizer
    mxnet.lr_scheduler
and how to develop a new optimization algorithm in MXNet.
Assume there is a pre-defined Symbol and a Module is created for it:
>>> data = mx.symbol.Variable('data')
>>> label = mx.symbol.Variable('softmax_label')
>>> fc = mx.symbol.FullyConnected(data, name='fc', num_hidden=10)
>>> loss = mx.symbol.SoftmaxOutput(fc, label, name='softmax')
>>> mod = mx.mod.Module(loss)
>>> mod.bind(data_shapes=[('data', (128,20))], label_shapes=[('softmax_label', (128,))])
Next we can initialize the weights with values sampled uniformly from [-1, 1]:
>>> mod.init_params(mx.initializer.Uniform(scale=1.0))
Then we will train the model with standard SGD, decreasing the learning rate by a factor of 0.9 every 100 batches:
>>> lr_sch = mx.lr_scheduler.FactorScheduler(step=100, factor=0.9)
>>> mod.init_optimizer(
... optimizer='sgd', optimizer_params=(('learning_rate', 0.1), ('lr_scheduler', lr_sch)))
Finally, run mod.fit(...) to start training.
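For example, a minimal sketch might look as follows; train_iter and its dummy data are assumptions used only for illustration, not part of the model defined above.
>>> train_iter = mx.io.NDArrayIter(mx.nd.ones((1000, 20)), mx.nd.zeros((1000,)),
...                                batch_size=128)
>>> mod.fit(train_iter, num_epoch=10)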
The mxnet.initializer package
------------------------------

.. currentmodule:: mxnet.initializer
The base class Initializer defines the default behaviors used to initialize various parameters, such as setting the bias to 1, except for the weight. The other classes then define how to initialize the weight; a short usage sketch follows the list below.
.. autosummary::
    :nosignatures:

    Initializer
    Uniform
    Normal
    Load
    Mixed
    Zero
    One
    Constant
    Orthogonal
    Xavier
    MSRAPrelu
    Bilinear
    FusedRNN
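As a usage sketch, several of these classes can be combined with Mixed; the name patterns and the Xavier magnitude below are illustrative assumptions, not defaults.
>>> init = mx.initializer.Mixed(
...     patterns=['.*bias', '.*'],
...     initializers=[mx.initializer.Zero(), mx.initializer.Xavier(magnitude=2.0)])
>>> mod.init_params(init, force_init=True)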
The mxnet.optimizer package
----------------------------

.. currentmodule:: mxnet.optimizer
The base class Optimizer accepts commonly shared arguments, such as learning_rate, and defines the interface. Each of the other classes in this package implements one weight-updating function; a short usage sketch follows the list below.
.. autosummary::
    :nosignatures:

    Optimizer
    SGD
    NAG
    RMSProp
    Adam
    AdaGrad
    AdaDelta
    Adamax
    Nadam
    DCASGD
    SGLD
    Signum
    FTML
    LBSGD
    Ftrl
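As a sketch, an optimizer can be constructed directly or looked up by its registered name; the hyperparameter values below are illustrative assumptions.
>>> opt = mx.optimizer.SGD(learning_rate=0.1, momentum=0.9, wd=1e-4)
>>> opt = mx.optimizer.create('sgd', learning_rate=0.1, momentum=0.9, wd=1e-4)  # equivalent lookup by name
>>> mod.init_optimizer(optimizer=opt, force_init=True)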
The mxnet.lr_scheduler package
-------------------------------

.. currentmodule:: mxnet.lr_scheduler
The base class LRScheduler defines the interface, while the other classes implement various schemes to change the learning rate during training; a short usage sketch follows the list below.
.. autosummary::
    :nosignatures:

    LRScheduler
    FactorScheduler
    MultiFactorScheduler
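For instance, here is a sketch that halves the learning rate after given numbers of batches; the step points and factor below are illustrative assumptions.
>>> sched = mx.lr_scheduler.MultiFactorScheduler(step=[300, 600, 900], factor=0.5)
>>> mod.init_optimizer(optimizer='sgd', force_init=True,
...     optimizer_params=(('learning_rate', 0.1), ('lr_scheduler', sched)))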
Most classes listed in this document are implemented in Python using NDArray, so implementing a new weight-updating or initialization function is straightforward.
For an initializer, create a subclass of Initializer and define the _init_weight method. We can also change the default behavior for other parameters by overriding methods such as _init_bias. See initializer.py for examples.
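A minimal sketch, assuming a hypothetical MyConstant class that fills every weight with a fixed value:
>>> class MyConstant(mx.initializer.Initializer):
...     def __init__(self, value=0.5):
...         super(MyConstant, self).__init__(value=value)
...         self.value = value
...     def _init_weight(self, name, arr):
...         arr[:] = self.value  # fill the weight array in place
>>> mod.init_params(MyConstant(value=0.5), force_init=True)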
For an optimizer, create a subclass of Optimizer and implement the two methods create_state and update. Also add the @mx.optimizer.Optimizer.register decorator before the class. See optimizer.py for examples.
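A minimal sketch of a plain SGD-style update; the class name MySGD is a hypothetical example, and gradient rescaling and clipping are omitted for brevity.
>>> @mx.optimizer.Optimizer.register
... class MySGD(mx.optimizer.Optimizer):
...     def create_state(self, index, weight):
...         return None  # this simple update keeps no extra state
...     def update(self, index, weight, grad, state):
...         lr = self._get_lr(index)   # respects lr_scheduler and per-parameter lr_mult
...         wd = self._get_wd(index)
...         self._update_count(index)
...         weight[:] = weight - lr * (grad + wd * weight)
>>> mod.init_optimizer(optimizer='mysgd', force_init=True)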
For an lr_scheduler, create a subclass of LRScheduler and implement the __call__ method. See lr_scheduler.py for examples.
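A minimal sketch of a linear warm-up schedule; the class name LinearWarmUp and its default values are hypothetical.
>>> class LinearWarmUp(mx.lr_scheduler.LRScheduler):
...     def __init__(self, base_lr=0.1, warmup_steps=500):
...         super(LinearWarmUp, self).__init__(base_lr)
...         self.warmup_steps = warmup_steps
...     def __call__(self, num_update):
...         # ramp up linearly, then hold the rate at base_lr
...         if num_update < self.warmup_steps:
...             return self.base_lr * float(num_update) / self.warmup_steps
...         return self.base_lr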
.. automodule:: mxnet.optimizer
    :members:

.. automodule:: mxnet.lr_scheduler
    :members:

.. automodule:: mxnet.initializer
    :members: