In this tutorial, we'll walk through how to implement linear regression using MXNet APIs.
The function we are trying to learn is: y = x1 + 2x2, where (x1,x2) are input features and y is the corresponding label.
To complete this tutorial, we need:
MXNet. See the instructions for your operating system in Setup and Installation.
Jupyter Notebook, which can be installed with:
$ pip install jupyter
To begin, the following code imports the packages we'll need for this exercise.
import mxnet as mx
import numpy as np
# Fix the random seed
mx.random.seed(42)
import logging
logging.getLogger().setLevel(logging.DEBUG)
In MXNet, data is input via Data Iterators. Here we will illustrate how to encode a dataset into an iterator that MXNet can use. The data used in this example consists of 2-D data points with corresponding real-valued labels.
# Training data
train_data = np.random.uniform(0, 1, [100, 2])
train_label = np.array([train_data[i][0] + 2 * train_data[i][1] for i in range(100)])
batch_size = 1

# Evaluation data
eval_data = np.array([[7, 2], [6, 10], [12, 2]])
eval_label = np.array([11, 26, 16])
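The list comprehension above computes y = x1 + 2*x2 for each row one example at a time. As a sanity check, the same labels can be produced with a single vectorized dot product — a small numpy-only sketch (the seed here is just for reproducibility, not part of the tutorial):

```python
import numpy as np

np.random.seed(42)  # seed chosen only for reproducibility of this sketch
train_data = np.random.uniform(0, 1, [100, 2])

# Per-example computation, as in the tutorial
train_label = np.array([train_data[i][0] + 2 * train_data[i][1] for i in range(100)])

# Vectorized equivalent: y = X @ [1, 2]
vectorized = train_data @ np.array([1.0, 2.0])
print(np.allclose(train_label, vectorized))  # True
```
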
Once we have the data ready, we need to put it into an iterator and specify parameters such as batch_size and shuffle. batch_size specifies the number of examples shown to the model each time its parameters are updated, and shuffle tells the iterator to randomize the order in which examples are shown to the model.
train_iter = mx.io.NDArrayIter(train_data, train_label, batch_size, shuffle=True, label_name='lin_reg_label')
eval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size, shuffle=False, label_name='lin_reg_label')
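Conceptually, a batch iterator just slices the arrays into batch_size-sized chunks, optionally shuffling the order first. A rough numpy-only sketch of that behavior (an illustration of the idea, not MXNet's actual NDArrayIter implementation):

```python
import numpy as np

def iterate_batches(data, labels, batch_size, shuffle=True, seed=0):
    """Yield (data_batch, label_batch) pairs, roughly what a batch iterator does."""
    idx = np.arange(len(data))
    if shuffle:
        np.random.RandomState(seed).shuffle(idx)  # randomize example order
    for start in range(0, len(data), batch_size):
        sel = idx[start:start + batch_size]
        yield data[sel], labels[sel]

data = np.random.uniform(0, 1, [100, 2])
labels = data @ np.array([1.0, 2.0])
batches = list(iterate_batches(data, labels, batch_size=1))
print(len(batches))  # 100 batches of size 1
```
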
In the above example, we have made use of NDArrayIter, which is useful for iterating over both numpy ndarrays and MXNet NDArrays. In general, MXNet provides different types of iterators, and you can choose one based on the type of data you are processing. Documentation for iterators can be found here.
This tutorial uses three core MXNet classes:
IO: The IO class, as we already saw, works on the data and carries out operations such as feeding data in batches and shuffling.
Symbol: The actual MXNet neural network is composed using symbols. MXNet has different types of symbols, including variable placeholders for input data, neural network layers, and operators that manipulate NDArrays.
Module: The module class in MXNet is used to define the overall computation. It is initialized with the model we want to train, the training inputs (data and labels), and some additional parameters such as the learning rate and the optimization algorithm to use.
MXNet uses Symbols for defining a model. Symbols are the building blocks and make up various components of the model. Symbols are used to define:
Variables: placeholders for input data and labels.
Neural network layers: for example, the FullyConnected symbol, which specifies a fully connected layer of a neural network.
Output symbols: MXNet's way of defining a loss (e.g. the SoftmaxOutput layer). You can also create your own loss function. Some examples of existing losses are: LinearRegressionOutput, which computes the l2-loss between its input symbol and the labels provided to it; and SoftmaxOutput, which computes the categorical cross-entropy.
The symbols described above, and others, are chained together with the output of one symbol serving as input to the next to build the network topology. More information about the different types of symbols can be found here.
X = mx.sym.Variable('data')
Y = mx.symbol.Variable('lin_reg_label')
fully_connected_layer = mx.sym.FullyConnected(data=X, name='fc1', num_hidden = 1)
lro = mx.sym.LinearRegressionOutput(data=fully_connected_layer, label=Y, name="lro")
The above network uses the following layers:
FullyConnected: The fully connected symbol represents a fully connected layer of a neural network (without any activation being applied), which in essence is just a linear regression on the input attributes. It takes the following parameters:
data: Input to the layer (specifies the symbol whose output should be fed here)
num_hidden: Number of hidden neurons in the layer, which is the same as the dimensionality of the layer's output
LinearRegressionOutput: Output layers in MXNet compute training loss, which is the measure of inaccuracy in the model's predictions. The goal of training is to minimize the training loss. In our example, the LinearRegressionOutput layer computes the l2 loss against its input and the labels provided to it. The parameters to this layer are:
data: Input to this layer (specifies the symbol whose output should be fed here)
label: The training labels against which we will compare the input to the layer for calculation of the l2 loss
Note on naming convention: the label variable's name should be the same as the label_name parameter passed to your training data iterator. The default value is softmax_label, but we have updated it to lin_reg_label in this tutorial, as you can see in Y = mx.symbol.Variable('lin_reg_label') and train_iter = mx.io.NDArrayIter(..., label_name='lin_reg_label').
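Putting the two layers together, this network computes nothing more than a linear function of the inputs plus a squared-error loss. A numpy-only sketch of the forward computation (the 1/2 scaling on the loss is illustrative; the weights here are hand-picked, not trained parameters):

```python
import numpy as np

def forward_l2(X, y, w, b=0.0):
    """FullyConnected with num_hidden=1: pred = X @ w + b; then an l2 loss vs. labels."""
    pred = X @ w + b                      # linear layer, no activation
    loss = np.mean((pred - y) ** 2) / 2   # squared-error (l2) loss
    return pred, loss

X = np.array([[7.0, 2.0], [6.0, 10.0], [12.0, 2.0]])
y = np.array([11.0, 26.0, 16.0])
pred, loss = forward_l2(X, y, w=np.array([1.0, 2.0]))
print(pred, loss)  # the exact weights [1, 2] give zero loss
```
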
Finally, the network is input to a Module, where we specify the symbol whose output needs to be minimized (in our case, lro, the linear regression output) and the names of the data and label inputs; the learning rate and the number of epochs to train for are supplied later, when we call fit().
model = mx.mod.Module(
    symbol=lro,                    # network structure
    data_names=['data'],
    label_names=['lin_reg_label']
)
We can visualize the network we created by plotting it:
mx.viz.plot_network(symbol=lro, node_attrs={"shape":"oval","fixedsize":"false"})
Once we have defined the model structure, the next step is to train the
parameters of the model to fit the training data. This is accomplished using the
fit()
function of the Module
class.
model.fit(train_iter, eval_iter,
          optimizer_params={'learning_rate': 0.01, 'momentum': 0.9},
          num_epoch=20,
          eval_metric='mse',
          batch_end_callback=mx.callback.Speedometer(batch_size, 2))
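Under the hood, fit() repeatedly runs forward and backward passes and lets the optimizer update the weights. A numpy-only sketch of that training loop for this linear model (plain SGD without momentum, with a learning rate chosen only for illustration) recovers the true coefficients:

```python
import numpy as np

np.random.seed(0)
X = np.random.uniform(0, 1, [100, 2])
y = X @ np.array([1.0, 2.0])   # true model: y = x1 + 2*x2

w = np.zeros(2)                # weights to learn
lr = 0.1                       # illustrative learning rate, not the tutorial's 0.01
for epoch in range(200):
    for i in range(len(X)):    # batch_size = 1, as in the tutorial
        pred = X[i] @ w
        grad = (pred - y[i]) * X[i]   # gradient of the squared error
        w -= lr * grad

print(w)  # close to [1, 2]
```
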
Once we have a trained model, we can do a couple of things with it: we can use it for inference, or we can evaluate it on test data. Prediction on the evaluation data is shown below:
model.predict(eval_iter).asnumpy()
We can also evaluate our model according to some metric. In this example, we are evaluating our model's mean squared error (MSE) on the evaluation data.
metric = mx.metric.MSE()
mse = model.score(eval_iter, metric)
print("Achieved {0:.6f} validation MSE".format(mse[0][1]))
assert mse[0][1] < 0.01001, "Achieved MSE (%f) is larger than expected (0.01001)" % mse[0][1]
Let us add some noise to the evaluation labels and see how the MSE changes:
eval_data = np.array([[7,2],[6,10],[12,2]])
eval_label = np.array([11.1,26.1,16.1]) #Adding 0.1 to each of the values
eval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size, shuffle=False, label_name='lin_reg_label')
model.score(eval_iter, metric)
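We can predict the result: a model that had learned the mapping exactly would still output the original labels, so a constant offset of 0.1 on every label raises the MSE to 0.1 squared = 0.01. A quick numpy check of that arithmetic:

```python
import numpy as np

exact_pred = np.array([11.0, 26.0, 16.0])   # what a perfect model would predict
noisy_label = np.array([11.1, 26.1, 16.1])  # labels with 0.1 added
mse = np.mean((exact_pred - noisy_label) ** 2)
print(mse)  # ~0.01
```
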
We can also create a custom metric and use it to evaluate a model. More information on metrics can be found in the API documentation.
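A custom metric is, at its core, just a function of (label, pred). The sketch below defines and checks a mean-absolute-error metric in plain numpy; the commented-out wrapping via mx.metric.CustomMetric is an assumption based on the MXNet 1.x API, so verify it against your MXNet version:

```python
import numpy as np

def mean_abs_error(label, pred):
    """Custom metric: mean absolute error between labels and predictions."""
    return np.abs(label - pred).mean()

# numpy-only check of the metric function itself
label = np.array([11.0, 26.0, 16.0])
pred = np.array([11.1, 26.1, 16.1])
print(mean_abs_error(label, pred))  # ~0.1

# In MXNet 1.x this function could then be wrapped and passed to score()
# (API assumed, not shown in this tutorial):
# metric = mx.metric.CustomMetric(feval=mean_abs_error)
# model.score(eval_iter, metric)
```
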