These notebooks show tips on how to optimize inference in OpenVINO™.
There are different performance targets for various use case scenarios. For example, video conferencing usually requires the latency to be as short as possible, while for high-resolution video/image analysis, high throughput is typically the performance target. As a result, different optimization tricks should be applied to achieve different performance targets. In these notebooks, we'll show sets of performance tricks for optimizing inference latency and throughput.
This notebook demonstrates how to optimize the inference latency in OpenVINO™. A set of optimization tricks is introduced, including model conversion with different data precisions, the “AUTO” device with latency mode, shared memory, inference with additional runtime configuration, inference on GPU, etc.
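As a minimal sketch of the latency-oriented setup, the snippet below builds the compile-time configuration that selects the LATENCY performance hint on the “AUTO” device. The actual `compile_model` call is left as a comment because it requires an installed OpenVINO runtime and a model file; `model.xml` is a placeholder path, not a file from these notebooks.

```python
# Compile-time configuration for latency-oriented inference.
# "PERFORMANCE_HINT": "LATENCY" tells the runtime to tune for the
# shortest per-request latency rather than maximum throughput.
latency_config = {"PERFORMANCE_HINT": "LATENCY"}

# With an OpenVINO runtime installed, the model would be compiled on
# the "AUTO" device like this ("model.xml" is a placeholder path):
#
# from openvino.runtime import Core
# core = Core()
# compiled_model = core.compile_model("model.xml", "AUTO", latency_config)

print(latency_config["PERFORMANCE_HINT"])
```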
This notebook demonstrates how to optimize the inference throughput in OpenVINO™. A set of optimization tricks is introduced, including a larger batch size, the “AUTO” device with throughput and cumulative throughput modes, asynchronous inference mode, etc.
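The core idea behind asynchronous inference is to keep several requests in flight instead of waiting for each one to finish. The sketch below illustrates that overlap with a thread pool and a stand-in `fake_infer` function; it is not the OpenVINO API itself (there you would use `AsyncInferQueue` on a compiled model together with `{"PERFORMANCE_HINT": "THROUGHPUT"}`), just the pattern under the assumption that no model or runtime is available here.

```python
import concurrent.futures

def fake_infer(x):
    """Stand-in for one inference request; a real pipeline would call
    an OpenVINO infer request on a compiled model instead."""
    return x * 2

inputs = [1, 2, 3, 4]

# Submit all requests at once so they overlap, rather than running
# them one after another; this is what raises throughput.
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_infer, inputs))

print(results)
```

With the real API, `AsyncInferQueue(compiled_model, num_jobs)` plays the role of the thread pool: you call `start_async` per input and collect results in a completion callback.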
If you have not installed all required dependencies, follow the Installation Guide.