Key topics covered include the following:

- Converting Caffe trained models to MXNet
- Calling Caffe operators in MXNet

## Converting Caffe trained models to MXNet
The conversion tool is available at `tools/caffe_converter`. In the remainder of this section, we assume we are in the `tools/caffe_converter` directory.
If Caffe's Python package is installed, i.e. we can run `import caffe` in Python, then we are ready to go. For example, we can use the AWS Deep Learning AMI, which comes with both Caffe and MXNet installed.

Otherwise, we can install the Google protobuf compiler and its Python binding. This route is easier to set up, but may run more slowly:
1. Install the protobuf compiler:
   - Linux: `sudo apt-get install protobuf-compiler` on Ubuntu, or `sudo yum install protobuf-compiler` on Redhat/Fedora.
   - Windows: download a pre-built protobuf compiler and add its location to your `PATH`.
   - Mac OS X: `brew install protobuf`
2. Install the Python binding with either `conda install -c conda-forge protobuf` or `pip install protobuf`.
3. Compile the Caffe proto definition: run `make` on Linux or Mac OS X, or `make_win32.bat` on Windows.
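As a quick sanity check, you can verify that the Python binding and the compiled proto definition import cleanly. This is only a sketch, not part of the converter itself; `caffe_pb2` is the module name `protoc` conventionally generates from `caffe.proto` and is assumed here:

```python
# Sanity check for the protobuf toolchain.
import google.protobuf
print(google.protobuf.__version__)

# `caffe_pb2` is assumed to be the module produced by `make` in
# tools/caffe_converter; the name may differ in your setup.
import caffe_pb2
```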
There are three tools:

- `convert_symbol.py`: converts a Caffe model definition in protobuf into an MXNet Symbol in JSON format.
- `convert_model.py`: converts Caffe model parameters into MXNet's NDArray format.
- `convert_mean.py`: converts a Caffe input mean file into MXNet's NDArray format.

In addition, there are two helper tools:

- `convert_caffe_modelzoo.py`: downloads and converts models from the Caffe model zoo.
- `test_converter.py`: tests the converted models by checking their prediction accuracy.
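Once converted, a model can be loaded like any other MXNet checkpoint. A minimal sketch, where the `vgg16` prefix, epoch number, and input shape are purely illustrative:

```python
import mxnet as mx

# Load the converted symbol and parameters (prefix and epoch are illustrative).
sym, arg_params, aux_params = mx.model.load_checkpoint('vgg16', 0)

# Bind for inference; the input shape depends on the converted network.
mod = mx.mod.Module(symbol=sym, label_names=None)
mod.bind(for_training=False, data_shapes=[('data', (1, 3, 224, 224))])
mod.set_params(arg_params, aux_params, allow_missing=True)
```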
## Calling Caffe operators in MXNet

Besides converting Caffe models, MXNet supports calling most Caffe operators directly, including network layers, data layers, and loss functions. This is particularly useful when customized operators are implemented in Caffe: we then do not need to re-implement them in MXNet.

This feature requires Caffe. In particular, we need to re-compile Caffe until PR #4527 is merged into Caffe. Here are the steps to rebuild Caffe:

1. Download Caffe: `git clone https://github.com/BVLC/caffe`
2. Download and apply the patch: `cd caffe && wget https://github.com/BVLC/caffe/pull/4527.patch && git apply 4527.patch`
3. Build and install Caffe as usual.
Next, we need to compile MXNet with Caffe support:

1. Copy `make/config.mk` (for Linux) or `make/osx.mk` (for Mac) into the MXNet root folder as `config.mk`, if you have not done so yet.
2. Open the copied `config.mk` and uncomment these two lines:

   ```make
   CAFFE_PATH = $(HOME)/caffe
   MXNET_PLUGINS += plugin/caffe/caffe.mk
   ```

   Modify `CAFFE_PATH` to point to your Caffe installation, if necessary.
3. Rebuild with `make clean && make -j8`.
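After rebuilding, one informal way to confirm the plugin was compiled in is to check that the new operators are exposed:

```python
import mxnet as mx

# Both attributes exist only when MXNet was built with the Caffe plugin.
print(hasattr(mx.sym, 'CaffeOp'), hasattr(mx.io, 'CaffeDataIter'))
```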
This Caffe plugin adds three components to MXNet:

- `sym.CaffeOp`: Caffe neural network layers
- `sym.CaffeLoss`: Caffe loss functions
- `io.CaffeDataIter`: Caffe data layers

### Use sym.CaffeOp
The following example defines a 10-class multi-layer perceptron:
```python
data = mx.sym.Variable('data')
fc1 = mx.sym.CaffeOp(data_0=data, num_weight=2, name='fc1', prototxt="layer{type:\"InnerProduct\" inner_product_param{num_output: 128} }")
act1 = mx.sym.CaffeOp(data_0=fc1, prototxt="layer{type:\"TanH\"}")
fc2 = mx.sym.CaffeOp(data_0=act1, num_weight=2, name='fc2', prototxt="layer{type:\"InnerProduct\" inner_product_param{num_output: 64} }")
act2 = mx.sym.CaffeOp(data_0=fc2, prototxt="layer{type:\"TanH\"}")
fc3 = mx.sym.CaffeOp(data_0=act2, num_weight=2, name='fc3', prototxt="layer{type:\"InnerProduct\" inner_product_param{num_output: 10}}")
mlp = mx.sym.SoftmaxOutput(data=fc3, name='softmax')
```
Let's break it down. First, `data = mx.sym.Variable('data')` defines a variable as a placeholder for input. Then, it's fed through Caffe operators with `fc1 = mx.sym.CaffeOp(...)`. `CaffeOp` accepts several arguments:

- `data_i` for `i = 0, ..., num_data-1` is the input data. `num_data` is the number of inputs; it defaults to 1 and is therefore omitted in the above example.
- `num_out` is the number of outputs; it also defaults to 1 and is omitted.
- `num_weight` is the number of weights (`blobs_`). It defaults to 0, so we need to specify it explicitly whenever it is non-zero.
- `prototxt` is the protobuf configuration string.
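The resulting `mlp` symbol trains like any other MXNet symbol. Below is a sketch using the Module API with random stand-in data; it requires an MXNet build with the Caffe plugin, and the data here is purely illustrative:

```python
import numpy as np
import mxnet as mx

# Random stand-in data: 1000 samples of 784 features, 10 classes.
X = np.random.uniform(size=(1000, 784)).astype('float32')
y = np.random.randint(0, 10, size=(1000,))
train_iter = mx.io.NDArrayIter(data=X, label=y, batch_size=100,
                               label_name='softmax_label')

# Bind and fit the symbol defined above.
mod = mx.mod.Module(symbol=mlp, data_names=['data'],
                    label_names=['softmax_label'])
mod.fit(train_iter, num_epoch=1, optimizer='sgd',
        optimizer_params={'learning_rate': 0.1})
```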
### Use sym.CaffeLoss

Using a Caffe loss is similar: we can replace an MXNet loss with a Caffe loss. Replacing the last line of the above example with the following two lines calls the Caffe loss instead of the MXNet one:
```python
label = mx.sym.Variable('softmax_label')
mlp = mx.sym.CaffeLoss(data=fc3, label=label, grad_scale=1, name='softmax', prototxt="layer{type:\"SoftmaxWithLoss\"}")
```
Similar to `CaffeOp`, `CaffeLoss` accepts the arguments `num_data` (2 by default) and `num_out` (1 by default). But there are two differences:

- Its inputs are `data` and `label`, and we need to explicitly create a variable placeholder for the label, which is done implicitly by the MXNet loss.
- `grad_scale` is the weight of this loss.

### Use io.CaffeDataIter
We can also wrap a Caffe data layer into an MXNet data iterator. Below is an example that creates a data iterator for MNIST:
```python
train = mx.io.CaffeDataIter(
    prototxt =
    'layer { \
        name: "mnist" \
        type: "Data" \
        top: "data" \
        top: "label" \
        include { \
            phase: TEST \
        } \
        transform_param { \
            scale: 0.00390625 \
        } \
        data_param { \
            source: "caffe/examples/mnist/mnist_test_lmdb" \
            batch_size: 100 \
            backend: LMDB \
        } \
    }',
    flat = True,  # flatten each image into a 1-D vector
    num_examples = 60000,
)
```
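The iterator can then be passed to MXNet's training APIs like any other data iterator. A sketch, assuming the `mlp` symbol from above and a local Caffe checkout providing the LMDB path; the label name exposed by the iterator may need adjusting to match the symbol's label:

```python
# Feed the Caffe-backed iterator to the Module API like any other iterator.
mod = mx.mod.Module(symbol=mlp, data_names=['data'],
                    label_names=['softmax_label'])
mod.fit(train, num_epoch=1, optimizer='sgd')
```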
The complete example is available at `example/caffe`.