MobileNet

MobileNet improves the state-of-the-art performance of mobile models on multiple tasks and benchmarks, as well as across a spectrum of model sizes. It is based on an inverted residual structure in which the shortcut connections are between the thin bottleneck layers. The intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity. Additionally, non-linearities are removed in the narrow layers in order to maintain representational power. The models perform image classification: they take images as input and classify the major object in each image into a set of pre-defined classes. They are trained on the ImageNet dataset, which contains images from 1000 classes. MobileNet models are also very efficient in terms of speed and size, and hence are well suited to embedded and mobile applications.
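To make the block structure above concrete, here is a minimal sketch that counts multiply-adds for one inverted residual block. It assumes stride 1, unchanged spatial size, and no bias terms; the function names and the example sizes are illustrative and are not taken from this repository.

```python
def inverted_residual_madds(h, w, d_in, d_out, t=6, k=3):
    """Approximate multiply-adds for one inverted residual block.

    Assumes stride 1 and 'same' padding, so every stage sees an h x w map.
    t is the expansion factor, k the depthwise kernel size.
    """
    d_mid = t * d_in                      # channels after the 1x1 expansion
    expand = h * w * d_in * d_mid         # 1x1 pointwise expansion
    depthwise = h * w * d_mid * k * k     # k x k depthwise filter, one per channel
    project = h * w * d_mid * d_out       # 1x1 linear projection back to the bottleneck
    return expand + depthwise + project


def standard_conv_madds(h, w, d_in, d_out, k=3):
    """Multiply-adds for a plain k x k convolution over the same feature map."""
    return h * w * d_in * d_out * k * k


if __name__ == "__main__":
    # Hypothetical sizes, chosen only for illustration.
    block = inverted_residual_madds(56, 56, 24, 24)
    wide = standard_conv_madds(56, 56, 144, 144)
    print("inverted residual block:", block)
    print("3x3 convolution at the expanded width:", wide)
```

The three stages sum to h * w * t * d_in * (d_in + k^2 + d_out). In this example the depthwise stage is the cheapest of the three, since k^2 = 9 is smaller than either bottleneck width.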

Model

MobileNet reduces the dimensionality of a layer, thus reducing the dimensionality of the operating space. The trade-off between computation and accuracy is exploited in MobileNet via a width multiplier parameter, which allows one to reduce the dimensionality of the activation space until the manifold of interest spans this entire space. The model below uses a multiplier value of 1.0.

  • Version 2:
Model            | ONNX Model | Model archives | Top-1 accuracy (%) | Top-5 accuracy (%)
MobileNet v2-1.0 | 13.6 MB    | 13.7 MB        | 70.94              | 89.99
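The effect of the width multiplier can be sketched as a simple scaling of per-stage channel counts. The base widths below are the ones listed in the MobileNetV2 paper for multiplier 1.0; truncating to an integer is an assumption of this sketch, and the helper name `scale_channels` is hypothetical.

```python
# Per-stage output channels at multiplier 1.0, as listed in the MobileNetV2 paper.
BASE_CHANNELS = [32, 16, 24, 32, 64, 96, 160, 320]


def scale_channels(base_channels, multiplier):
    """Scale every layer width by `multiplier`, truncating to an integer.

    Plain truncation is an assumption of this sketch; some implementations
    round to a multiple of 8 instead.
    """
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    # Keep at least one channel so a very small multiplier cannot empty a layer.
    return [max(1, int(c * multiplier)) for c in base_channels]


if __name__ == "__main__":
    for m in (1.0, 0.75, 0.5):
        print(m, scale_channels(BASE_CHANNELS, m))
```

Because the cost of a 1x1 convolution scales with the product of its input and output widths, halving the multiplier cuts the multiply-adds of those layers by roughly a factor of four.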

Inference

We used MXNet with the Gluon API to perform inference. View the notebook imagenet_inference to understand how to use the above models for inference. Make sure to specify the appropriate model name in the notebook.

Input

All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (N x 3 x H x W), where N is the batch size, and H and W are expected to be at least 224. Inference was run on a JPEG image.

Preprocessing

The images have to be loaded into a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225]. The transformation should preferably happen at preprocessing.

import mxnet
from mxnet.gluon.data.vision import transforms

def preprocess(img):
    '''
    Preprocessing required on the images for inference with mxnet gluon
    The function takes an image as an NDArray in H x W x C layout
    (for example, as loaded by mxnet.image.imread) and returns the processed tensor
    '''
    transform_fn = transforms.Compose([
        transforms.Resize(224),
        transforms.CenterCrop(224),
        transforms.ToTensor(),  # H x W x C uint8 in [0, 255] -> C x H x W float32 in [0, 1]
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])  # per-channel mean, std
    ])
    img = transform_fn(img)
    img = img.expand_dims(axis=0)  # batchify: (3, H, W) -> (1, 3, H, W)

    return img
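For readers who want to check the arithmetic without MXNet installed, the following is a framework-free sketch of the same scaling and normalization on nested Python lists. It is not the code used in the notebook, the helper names are hypothetical, and it skips the resize and crop steps.

```python
MEAN = (0.485, 0.456, 0.406)  # per-channel mean, RGB order
STD = (0.229, 0.224, 0.225)   # per-channel standard deviation, RGB order


def normalize_pixel(rgb):
    """Scale one 8-bit RGB pixel to [0, 1], then standardize each channel."""
    if len(rgb) != 3:
        raise ValueError("expected an (R, G, B) triple")
    return tuple((v / 255.0 - m) / s for v, m, s in zip(rgb, MEAN, STD))


def to_nchw(image_hwc):
    """Turn an H x W x 3 nested list into a 1 x 3 x H x W nested list."""
    normalized = [[normalize_pixel(px) for px in row] for row in image_hwc]
    # Reorder from height/width/channel to channel/height/width.
    chw = [[[px[c] for px in row] for row in normalized] for c in range(3)]
    return [chw]  # leading batch dimension of size 1


if __name__ == "__main__":
    tiny = [[(0, 0, 0), (255, 255, 255)]]  # a 1 x 2 test image
    print(to_nchw(tiny))
```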
    

Output

The model outputs image scores for each of the 1000 classes of ImageNet.

Postprocessing

Post-processing involves calculating the softmax probability scores for each class and sorting them to report the most probable classes.

import mxnet as mx
import numpy as np

def postprocess(scores):
    '''
    Postprocessing with mxnet gluon
    The function takes scores generated by network and returns the class IDs in decreasing order
    of probability
    '''
    prob = mx.ndarray.softmax(scores).asnumpy()  # raw scores -> probabilities
    prob = np.squeeze(prob)                      # drop the batch dimension
    a = np.argsort(prob)[::-1]                   # class IDs, most probable first
    return a
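The same softmax-and-sort step can be sketched in plain Python, which may help when checking results outside MXNet. The `top_k` helper and its `k` parameter are hypothetical and not part of the notebook. Subtracting the maximum score before exponentiating is a common way to avoid overflow.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(scores, k=5):
    """Return (class_id, probability) pairs for the k most probable classes."""
    probs = softmax(scores)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in order[:k]]


if __name__ == "__main__":
    for class_id, prob in top_k([1.0, 3.0, 2.0], k=2):
        print(class_id, round(prob, 4))
```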
    

Inference with Model Server

To learn how to use model archives with Model Server, try out the Model Server QuickStart to get Model Server installed and tested. If you have already installed the server, you can use the commands below to start serving this model.

  • Start Server:
mxnet-model-server --models mobilenetv2_1_0=https://s3.amazonaws.com/onnx-model-zoo/mobilenet/mobilenetv2-1.0/mobilenetv2-1.0.model
  • Run Prediction:
curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
curl -X POST http://127.0.0.1:8080/mobilenetv2_1_0/predict -F "data=@kitten.jpg"

For inference requests with this model, Model Server expects the image to be passed in the data variable, which is the input layer's name in the model. In the previous example this was data=@kitten.jpg.

Dataset

Dataset used for training and validation: ImageNet (ILSVRC2012). Check imagenet_prep for guidelines on preparing the dataset.

Validation accuracy

The accuracy obtained by the model on the validation set is listed in the table above. It was calculated on center-cropped images and is within 1-2% of the accuracy reported in the paper.
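As a reminder of what the Top-1 and Top-5 figures measure, here is a minimal sketch of top-k accuracy over a batch of score vectors. It is an illustration only, not the validation notebook's code, and the function name is hypothetical.

```python
def top_k_accuracy(batch_scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores."""
    if len(batch_scores) != len(labels):
        raise ValueError("need one label per score vector")
    if not labels:
        raise ValueError("empty batch")
    hits = 0
    for scores, label in zip(batch_scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


if __name__ == "__main__":
    scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]
    labels = [1, 1]
    print(top_k_accuracy(scores, labels, k=1))
    print(top_k_accuracy(scores, labels, k=2))
```

Top-5 accuracy is always at least as high as Top-1 on the same predictions, since any Top-1 hit is also a Top-5 hit.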

Training

We used MXNet with the Gluon API to perform training. View the training notebook to understand the parameters and network details for each of the above variants of MobileNet.

Validation

We used MXNet with the Gluon API to perform validation. Use the notebook imagenet_validation to verify the accuracy of the model on the validation set. Make sure to specify the appropriate model name in the notebook.

References

MobileNet-v2 Model from the paper MobileNetV2: Inverted Residuals and Linear Bottlenecks

Contributors

Acknowledgments

MXNet, Gluon model zoo, GluonCV, MMS

Keyword

CNN, MobileNet, ONNX, ImageNet, Computer Vision