onnxTester

Introduction

The onnxTester project analyses the performance of the Open Neural Network Exchange (ONNX) format, which data scientists and machine learning engineers use to decouple neural network models from frameworks such as TensorFlow and PyTorch.

ONNX and ONNXruntime

ONNX is an open intermediate format supported by many prevalent Deep Learning (DL) frameworks, for instance TensorFlow, PyTorch and MXNet. It enables interoperability between different frameworks, and by streamlining the path from research to production it helps increase the speed of innovation in the AI community.

ONNXruntime is a high-performance, cross-platform accelerator for machine-learning inference and training.

Scenario

Let's say you are a TensorFlow expert and one day receive a well-trained PyTorch model from a colleague who is a die-hard PyTorch fan. To work directly on that model, you can ask the colleague to export the neural network as a .onnx file, which can later be loaded and run in TensorFlow.

However, you wonder whether the model will still perform as well as it did in PyTorch...
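
The hand-over described above could look roughly like the following sketch. It assumes that PyTorch, onnx and onnx-tf are installed; the function names `export_to_onnx` and `load_in_tensorflow`, the default file name and the tensor names `input`/`output` are illustrative and are not taken from onnxTester itself.

```python
def export_to_onnx(model, sample_input, onnx_path="model.onnx"):
    """PyTorch side: trace `model` with `sample_input` and write an .onnx file."""
    import torch  # local import: only needed on the exporting machine

    model.eval()  # inference mode: no dropout, fixed batch-norm statistics
    torch.onnx.export(
        model,
        sample_input,
        onnx_path,
        input_names=["input"],    # hypothetical tensor names
        output_names=["output"],
    )
    return onnx_path


def load_in_tensorflow(onnx_path="model.onnx"):
    """TensorFlow side: validate the file and wrap it with the onnx-tf backend."""
    import onnx
    from onnx_tf.backend import prepare

    onnx_model = onnx.load(onnx_path)
    onnx.checker.check_model(onnx_model)  # raises if the graph is malformed
    return prepare(onnx_model)            # TensorFlow representation; call .run(inputs)
```

The imports are kept inside the functions so that each side only needs its own framework installed.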

Evaluation

Target platforms

The deep learning frameworks to test in the project include:

TensorFlow
PyTorch
MATLAB deep learning toolbox

Inference engines/backends to be benchmarked include:

TensorFlow Backend for ONNX: onnx-tf
Official model serving system by TensorFlow: TensorFlow Serving
onnxruntime

Tasks

onnxTester evaluates the interoperability of ONNX (and ONNXruntime) by conducting the following tasks:

inter-converting a trained model between different frameworks
comparing the latency and accuracy of model inference across different runtimes, measured by:

top-5 accuracy consistency
average prediction time
90-percentile prediction latency
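
The two timing metrics in the list above could be computed as in the minimal sketch below. It assumes that "prediction time" means the wall-clock time of a single call and that the percentile uses the nearest-rank method; `predict` stands for any callable that runs one inference, and all names here are hypothetical rather than onnxTester's own.

```python
import math
import time


def measure_latencies(predict, inputs, warmup=1):
    """Return per-call wall-clock times (seconds) of predict(x) for x in inputs."""
    for x in inputs[:warmup]:
        predict(x)  # warm-up calls are not timed
    latencies = []
    for x in inputs:
        start = time.perf_counter()
        predict(x)
        latencies.append(time.perf_counter() - start)
    return latencies


def mean(values):
    """Arithmetic mean, used here for the average prediction time."""
    return sum(values) / len(values)


def percentile(values, q):
    """Nearest-rank percentile: smallest value with at least q% of samples <= it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]
```

With `latencies = measure_latencies(predict, images)`, the reported figures would then be `mean(latencies)` and `percentile(latencies, 90)`.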

Installation

Prerequisites

Software

MATLAB 2020b

docker

Python and dependencies

python == 3.7

onnx

pip install numpy protobuf==3.16.0
pip install onnx

onnxruntime

pip install onnxruntime

tensorflow == 2.6.0

pip install tensorflow

onnx-tf

pip install onnx-tf

tf2onnx

pip install -U tf2onnx

PyTorch and Torchvision

pip install torch
pip install torchvision

h5py

pip install h5py

gin-config

pip install gin-config

pillow

pip install Pillow

scikit-image

pip install scikit-image

docker python SDK

pip install docker

MATLAB Engine API for Python

# On macOS or Linux
cd "matlabroot/extern/engines/python"
python setup.py install

# On Windows
cd "matlabroot\extern\engines\python"
python setup.py install

For details, please refer to the official MathWorks tutorial on installing the MATLAB Engine API for Python.
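
Once everything is installed, a quick check like the sketch below can confirm that the Python dependencies listed above are importable. The helper `missing_modules` is hypothetical and not part of onnxTester; the list uses import names, which differ from the pip names for some packages (for example `gin` for gin-config, `PIL` for Pillow, `skimage` for scikit-image and `onnx_tf` for onnx-tf).

```python
from importlib.util import find_spec

# Import names of the packages from the prerequisites section.
REQUIRED_MODULES = [
    "numpy", "onnx", "onnxruntime", "tensorflow", "onnx_tf", "tf2onnx",
    "torch", "torchvision", "h5py", "gin", "PIL", "skimage", "docker", "matlab",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```

The check only looks the modules up without importing them, so it stays fast even with TensorFlow and PyTorch in the list.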

Install onnxTester from source

git clone https://github.com/Vincenwwx/onnxTester.git
cd onnxTester

Usage

To run the software, use the following commands:

# Add execution permission to main.py
chmod +x main.py
# Run the software
./main.py (-p MODEL_PATH | -n {inceptionv3, resnet50, vgg16}) {origin_framework} {test_type}
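
For readers who want to see how this command line is structured, here is a sketch of an equivalent `argparse` definition. It is not the project's actual parser: the destination names are hypothetical, and the allowed values for `origin_framework` and `test_type` are assumptions inferred from the frameworks and test names used in this README.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="main.py")
    # Exactly one of -p / -n must be given, matching "(-p MODEL_PATH | -n {...})".
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("-p", dest="model_path", metavar="MODEL_PATH")
    source.add_argument("-n", dest="model_name",
                        choices=["inceptionv3", "resnet50", "vgg16"])
    # Assumed value sets; the real tool may accept other spellings.
    parser.add_argument("origin_framework",
                        choices=["tensorflow", "pytorch", "matlab"])
    parser.add_argument("test_type", choices=["conversion", "inference"])
    return parser
```

Calling `build_parser().parse_args(["-n", "vgg16", "pytorch", "inference"])` yields a namespace with `model_name`, `origin_framework` and `test_type` set, and `model_path` left as `None`.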

Examples

Execute model conversion test

Suppose we want to automatically generate an Inception-V3 model in TensorFlow and run the conversion test with it:

./main.py -n inceptionv3 tensorflow conversion

Execute model inference test

Suppose we want to automatically generate a ResNet-50 model in MATLAB and run the inference test with it:

./main.py -n resnet50 matlab inference

Test results

Model conversion test

Top-5 Accuracy

Top-5 Accuracy of VGG-16 model conversion test

VGG-16	PyTorch	TensorFlow	MATLAB
PyTorch	-	100%	61.52%
TensorFlow	-	-	87.38%
MATLAB	-	99.98%	-

Top-5 Accuracy of Inception-V3 model conversion test

Inception-V3	PyTorch	TensorFlow	MATLAB
PyTorch	-	100%	92.72%
TensorFlow	-	-	97.20%
MATLAB	-	100%	-

Top-5 Accuracy of ResNet-50 model conversion test

ResNet-50	PyTorch	TensorFlow	MATLAB
PyTorch	-	100%	74.10%
TensorFlow	-	-	94.50%
MATLAB	-	99.52%	-

The tables show an obvious prediction inconsistency when a CNN model built in PyTorch or TensorFlow is converted: the MATLAB column has the lowest values in every table.
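
This README does not define exactly how the top-5 figures are calculated. One plausible reading, sketched below, is the fraction of test inputs for which the original and the converted model produce the same set of top-5 classes. That definition and the function names are assumptions, not the project's confirmed method.

```python
def top_k(scores, k=5):
    """Indices of the k highest scores, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def top_k_consistency(reference_outputs, converted_outputs, k=5):
    """Fraction of samples whose top-k class sets agree between two models."""
    pairs = list(zip(reference_outputs, converted_outputs))
    agree = sum(set(top_k(a, k)) == set(top_k(b, k)) for a, b in pairs)
    return agree / len(pairs)
```

Here each element of `reference_outputs` and `converted_outputs` is the score vector one model produced for one input, in the same input order for both models.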

Average prediction time

Figure 1. Average prediction time for models originating from PyTorch

Using PyTorch models in TensorFlow generally yields some performance improvement.

Figure 2. Average prediction time for models originating from TensorFlow

Figure 3. Average prediction time for models originating from MATLAB

MATLAB takes much longer to run inference on models originating from TensorFlow.

Model inference test

Top-5 Accuracy

Top-5 Accuracy of VGG-16 model inference test

VGG-16	onnx-tf	TensorFlow Serving	onnxruntime
PyTorch	99.98%	99.98%	100%
TensorFlow	100%	100%	100%
MATLAB	99.98%	99.98%	99.98%

Top-5 Accuracy of Inception-V3 model inference test

Inception-V3	onnx-tf	TensorFlow Serving	onnxruntime
PyTorch	100%	100%	99.98%
TensorFlow	99.98%	99.98%	100%
MATLAB	100%	100%	100%

Top-5 Accuracy of ResNet-50 model inference test

ResNet-50	onnx-tf	TensorFlow Serving	onnxruntime
PyTorch	99.50%	99.50%	99.50%
TensorFlow	100%	100%	100%
MATLAB	99.50%	99.50%	99.50%

90-Percentile Latency

Figure 4. Latency of VGG-16 model inference test

Figure 5. Latency of ResNet-50 model inference test

Figure 6. Latency of Inception-V3 model inference test
