Key topics covered include the following:

- Converting Caffe trained models to MXNet
- Calling Caffe operators in MXNet
The conversion tool is available at tools/caffe_converter. For the remainder of this section, we assume we are in the tools/caffe_converter directory.
If Caffe's Python package is installed, i.e. we can run import caffe in Python, then we are ready to go. For example, we can use the AWS Deep Learning AMI, which has both Caffe and MXNet installed.
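A quick readiness check (a minimal sketch) that follows the two routes described here:

try:
    import caffe  # the converter uses pycaffe when available
    print('Caffe Python package found; ready to go')
except ImportError:
    print('Caffe Python package not found; use the protobuf route described below')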
Otherwise, we can install the Google protobuf compiler and its Python binding. This route is easier to set up, but may be slower at runtime.

1. Install the compiler:
- Linux: install protobuf-compiler, e.g. sudo apt-get install protobuf-compiler for Ubuntu and sudo yum install protobuf-compiler for Redhat/Fedora.
- Windows: download a pre-built protoc binary and add its location to your PATH.
- Mac OS X: brew install protobuf.
2. Install the Python binding with either conda install -c conda-forge protobuf or pip install protobuf.
3. Compile the Caffe proto definition: run make on Linux or Mac OS X, or make_win32.bat on Windows.
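Either way, a quick check (a minimal sketch) confirms that the protobuf Python binding is importable:

import google.protobuf
print(google.protobuf.__version__)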
There are three tools:

- convert_symbol.py : convert a Caffe model definition in protobuf into MXNet's Symbol in JSON format.
- convert_model.py : convert Caffe model parameters into MXNet's NDArray format.
- convert_mean.py : convert a Caffe input mean file into MXNet's NDArray format.

In addition, there are two more tools:

- convert_caffe_modelzoo.py : download and convert models from the Caffe model zoo.
- test_converter.py : test the converted models by checking the prediction accuracy.
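Once a model has been converted, it can be loaded like any other MXNet checkpoint. A minimal sketch, assuming the converter wrote files with the hypothetical prefix resnet-50:

import mxnet as mx
# Loads resnet-50-symbol.json and resnet-50-0000.params
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-50', 0)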
Besides converting Caffe models, MXNet supports calling most Caffe operators, including network layers, data layers, and loss functions, directly. This is particularly useful if there are customized operators implemented in Caffe: we then do not need to re-implement them in MXNet.
This feature requires Caffe. In particular, we need to re-compile Caffe before PR #4527 is merged into Caffe. Here are the steps to rebuild Caffe:

git clone https://github.com/BVLC/caffe
cd caffe && wget https://github.com/BVLC/caffe/pull/4527.patch && git apply 4527.patch

Then build and install Caffe as usual.
Next, we need to compile MXNet with Caffe support:

1. Copy make/config.mk (for Linux) or make/osx.mk (for Mac) into the MXNet root folder as config.mk, if you have not done so yet.
2. Open config.mk and uncomment these two lines:

CAFFE_PATH = $(HOME)/caffe
MXNET_PLUGINS += plugin/caffe/caffe.mk

Modify CAFFE_PATH to point to your Caffe installation, if necessary.
3. Rebuild MXNet with make clean && make -j8.
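Before moving on, a quick check (a minimal sketch) that the plugin was compiled in:

import mxnet as mx
# The Caffe plugin registers CaffeOp, CaffeLoss, and CaffeDataIter at load time
assert hasattr(mx.symbol, 'CaffeOp'), 'MXNet was not built with the Caffe plugin'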
This Caffe plugin adds three components into MXNet:

- sym.CaffeOp : Caffe neural network layers
- sym.CaffeLoss : Caffe loss functions
- io.CaffeDataIter : Caffe data layers

sym.CaffeOp

The following example shows the definition of a 10-class multilayer perceptron:
import mxnet as mx

data = mx.sym.Variable('data')
fc1 = mx.sym.CaffeOp(data_0=data, num_weight=2, name='fc1', prototxt="layer{type:\"InnerProduct\" inner_product_param{num_output: 128} }")
act1 = mx.sym.CaffeOp(data_0=fc1, prototxt="layer{type:\"TanH\"}")
fc2 = mx.sym.CaffeOp(data_0=act1, num_weight=2, name='fc2', prototxt="layer{type:\"InnerProduct\" inner_product_param{num_output: 64} }")
act2 = mx.sym.CaffeOp(data_0=fc2, prototxt="layer{type:\"TanH\"}")
fc3 = mx.sym.CaffeOp(data_0=act2, num_weight=2, name='fc3', prototxt="layer{type:\"InnerProduct\" inner_product_param{num_output: 10}}")
mlp = mx.sym.SoftmaxOutput(data=fc3, name='softmax')
Let's break it down. First, data = mx.sym.Variable('data') defines a variable as a placeholder for the input. Then, it is fed through Caffe operators with fc1 = mx.sym.CaffeOp(...). CaffeOp accepts several arguments:

- data_i for i = 0, ..., num_data-1 : the inputs.
- num_data : the number of inputs. By default it is 1, and is therefore skipped in the above example.
- num_out : the number of outputs. By default it is 1 and is also skipped.
- num_weight : the number of weights (blobs_). Its default value is 0. We need to specify it explicitly for a non-zero value.
- prototxt : the protobuf configuration string.

sym.CaffeLoss

Using a Caffe loss is similar: we can replace the MXNet loss with a Caffe loss. Replacing the last line of the above example with the following two lines calls the Caffe loss instead of the MXNet loss:
label = mx.sym.Variable('softmax_label')
mlp = mx.sym.CaffeLoss(data=fc3, label=label, grad_scale=1, name='softmax', prototxt="layer{type:\"SoftmaxWithLoss\"}")
Similar to CaffeOp, CaffeLoss has the arguments num_data (2 by default) and num_out (1 by default). But there are two differences:

- The inputs are data and label, and we need to explicitly create a variable placeholder for label, which is done implicitly in an MXNet loss.
- grad_scale is the weight of this loss.

io.CaffeDataIter

We can also wrap a Caffe data layer into MXNet's data iterator. Below is an example of creating a data iterator for MNIST:
train = mx.io.CaffeDataIter(
    prototxt =
    'layer { \
        name: "mnist" \
        type: "Data" \
        top: "data" \
        top: "label" \
        include { \
            phase: TEST \
        } \
        transform_param { \
            scale: 0.00390625 \
        } \
        data_param { \
            source: "caffe/examples/mnist/mnist_test_lmdb" \
            batch_size: 100 \
            backend: LMDB \
        } \
    }',
    flat = False,  # set True to flatten each image into a 1-D array
    num_examples = 60000,
)
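The iterator can then be used like any other MXNet data iterator, e.g. with the Module API. A minimal sketch, assuming the mlp symbol defined earlier and an arbitrary number of epochs:

mod = mx.mod.Module(symbol=mlp)  # expects inputs named 'data' and 'softmax_label'
mod.fit(train, num_epoch=10)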
The complete example is available at example/caffe.