DRP-AI Pre-processing Runtime

This page describes the DRP-AI Pre-processing Runtime, including its supported operations and APIs.

1. Overview

1.1 Function

DRP-AI Pre-processing Runtime enables high-performance AI pre-processing using the hardware accelerator, DRP-AI.
It is provided as one of the features of DRP-AI TVM[^1].
To use the DRP-AI Pre-processing Runtime, compile the pre-processing with the "DRP-AI Pre-processing Compile module" (Python APIs) and run it on the board by calling the "DRP-AI Pre-processing Run Module" (C++ APIs).

1.2 Terminology

| Terms | Explanation |
|---|---|
| DRP-AI | Name of the Renesas architecture that accelerates the inference of neural networks. |
| AIMAC | Components of the DRP-AI. Mainly used for the operation of the convolutional layer. |
| DRP | Components of the DRP-AI. Mainly used for the operation of layers other than the above and pre and postprocessing. |
| DRP-AI Pre-processing Runtime Object files | Files required to run DRP-AI generated by DRP-AI Pre-processing Runtime Compile module.<br>- AIMAC descriptor (aimac_desc.bin)<br>- DRP descriptor (drp_desc.bin)<br>- DRP parameter (drp_param.bin)<br>- DRP parameter Information (drp_param_info.txt)<br>- Weight data (pp_weight.dat)<br>- DRP configuration data (pp_drpcfg.mem) |
| DRP-AI memory area | Dedicated area on Memory Map of RZ/V series (RZ/V2L, RZ/V2M, RZ/V2MA) Linux Package used by DRP-AI. |

1.3 Requirement

To use DRP-AI Pre-processing Runtime, please prepare the environment explained in Installation.

1.4 About Memory Management

DRP-AI processes data that is mapped in a physically contiguous memory area.
In the RZ/V Linux Package, the "DRP-AI memory area" is reserved for DRP-AI.
DRP-AI Pre-processing Runtime Object files must be deployed to the DRP-AI memory area before running DRP-AI.

Since DRP-AI TVM[^1] also uses the DRP-AI memory area, DRP-AI TVM[^1] and DRP-AI Pre-processing Runtime must use mutually exclusive areas.
If the memory areas used by DRP-AI TVM[^1] and DRP-AI Pre-processing Runtime overlap, a DRP-AI error will occur.

In DRP-AI TVM[^1], the memory address must be defined when compiling the model.
In the Compile Tutorial, addr_map_start is the memory address at which the DRP-AI TVM[^1] Model Object is deployed.
For example, the following setting deploys the DRP-AI TVM Model Object to the memory area starting at address 0x438E0000.

drp_config_runtime = {
    "interpreter": False,
    "addr_map_start": 0x438E0000,
    "toolchain_dir": <TRANSLATOR PATH>,
    "sdk_root": <SDK PATH>
}
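
The mutual-exclusion requirement can be checked before deployment with a simple interval test. The following is a minimal sketch: the helper name, the pre-processing start address, and both region sizes are illustrative assumptions; only the value 0x438E0000 comes from the example above. Real sizes should be taken from the address map of each component.

```python
def regions_overlap(start_a, size_a, start_b, size_b):
    """Return True if [start_a, start_a + size_a) and [start_b, start_b + size_b) intersect."""
    return start_a < start_b + size_b and start_b < start_a + size_a


# Hypothetical layout: these numbers are placeholders, not real address map values.
PRE_START, PRE_SIZE = 0x40000000, 0x00100000  # assumed area for pre-processing Object files
TVM_START, TVM_SIZE = 0x438E0000, 0x02000000  # addr_map_start from the example, assumed size

no_conflict = not regions_overlap(PRE_START, PRE_SIZE, TVM_START, TVM_SIZE)
print("areas are mutually exclusive:", no_conflict)
```

Half-open intervals are used so that two areas that merely touch at a boundary are not reported as overlapping.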

In DRP-AI Pre-processing Runtime, the start address of the memory area to which the Object files are deployed must be defined when running the pre-processing on the target board.
To see how to define the start address, please refer to Run module API Load().
The memory area that DRP-AI uses consists of the 9 spaces listed in the following table.
Their addresses and required sizes are included in the address map text file (pp_addrmap_intm.txt) in the DRP-AI Pre-processing Runtime Object files.

| Name | Details |
|---|---|
| data_in | Memory area to place the input data to DRP-AI. |
| data | Work area for AIMAC |
| data_out | Memory area to place the final inference results.<br>DRP-AI Pre-processing Runtime reads data from this area to obtain the DRP-AI output. |
| work | Work area for DRP |
| weight | Memory area to place the weight data of the neural network.<br>DRP-AI Pre-processing Runtime writes the weight data to this area. |
| drp_config | Memory area to place DRP configuration data.<br>DRP-AI Pre-processing Runtime writes the DRP configuration data to this area. |
| drp_param | Memory area to place DRP parameter.<br>DRP-AI Pre-processing Runtime writes the DRP parameter to this area. |
| desc_aimac | Memory area to place AIMAC descriptor.<br>DRP-AI Pre-processing Runtime writes the AIMAC descriptor to this area. |
| desc_drp | Memory area to place DRP descriptor.<br>DRP-AI Pre-processing Runtime writes the DRP descriptor to this area. |

2. Compile Module

2.1 Overview

The DRP-AI Pre-processing Runtime Compile module is a function of DRP-AI TVM[^1].
It compiles the pre-processing into a format that is executable on DRP-AI (DRP-AI Pre-processing Runtime Object files).

2.2 File Configuration

The following files are required to run the DRP-AI Pre-processing Runtime Compile module.

File
drpai_preprocess/__init__.py
drpai_preprocess/drpai_param.py
drpai_preprocess/op.py
drpai_preprocess/preruntime.py

2.3 Supported Operations

2.3.1 Input/Output

| Parameters | Input | Output |
|---|---|---|
| Format | YUV422<br>YUV420<br>RGB<br>BGR<br>Grayscale | RGB<br>BGR<br>Grayscale |
| Order | HWC | HWC<br>CHW |
| Type | uint8<br>fp16<br>fp32 | uint8<br>fp16<br>fp32 |

Notes

  1. See Format for YUV details.
  2. The following cases are not supported.
    • Input Format = YUV420 & Output Format = Grayscale
    • Input Format = Grayscale & Output Format = Other than grayscale
    • Input Type = Other than uint8 & Output Type = uint8
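
The three unsupported combinations can be expressed as a small pre-check. This is a sketch only: the function name and the plain-string labels are assumptions made for illustration, and it tests nothing beyond the value sets and the three exclusions listed in this section.

```python
# Labels follow the Input/Output table above; this helper is hypothetical.
INPUT_FORMATS = {"YUV422", "YUV420", "RGB", "BGR", "Grayscale"}
OUTPUT_FORMATS = {"RGB", "BGR", "Grayscale"}
TYPES = {"uint8", "fp16", "fp32"}


def is_supported(format_in, type_in, format_out, type_out):
    """Return False for values outside the table or for a listed unsupported case."""
    if format_in not in INPUT_FORMATS or format_out not in OUTPUT_FORMATS:
        return False
    if type_in not in TYPES or type_out not in TYPES:
        return False
    if format_in == "YUV420" and format_out == "Grayscale":
        return False
    if format_in == "Grayscale" and format_out != "Grayscale":
        return False
    if type_in != "uint8" and type_out == "uint8":
        return False
    return True


print(is_supported("YUV422", "uint8", "RGB", "fp32"))
print(is_supported("YUV420", "uint8", "Grayscale", "uint8"))
```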

2.3.2 Pre-processing Operators

Supported pre-processing operators are as follows.

Operators Explanation
Crop Crop the image data into specified size.
Resize Resize the image data into specified size.
Normalize Normalize pixel values with specified coefficients.

Note:

  1. Color conversion, transpose and cast are performed automatically according to the format/order/type that the user specified.
  2. Pre-processing will be run in the order shown in the table above.
  3. ON/OFF of each process can be specified.

2.4 API

API list is as follows.

Module Class Category Description
preruntime PreRuntime() Runtime Class Class to run DRP-AI Pre-processing Runtime.
preruntime Config() Runtime Class Class to define the pre-processing details.
op Crop() Operator Class Class for Crop operation.
op Resize() Operator Class Class for Resize operation.
op Normalize() Operator Class Class for Normalize operation.
drpai_param FORMAT Constant Variable Constant variables for in/out format.
drpai_param ORDER Constant Variable Constant variables for in/out order.
drpai_param TYPE Constant Variable Constant variables for in/out type.

2.4.1 Runtime Class

Runtime class is used to control the DRP-AI Pre-processing Runtime.

2.4.1.1 PreRuntime()
preruntime.PreRuntime(config, out_dir, product)
  • Explanation
    • Class to run DRP-AI Pre-processing Runtime.
    • Instantiating this class runs the DRP-AI Pre-processing Runtime Compile module.
  • Arguments
    • config (preruntime.Config) : Pre-processing configuration. See Config for more details.
    • out_dir (string) : Name of the output directory to be generated.
    • product (string) : RZ/V Product name. Default is "V2L".
2.4.1.2 Config()
preruntime.Config()
  • Explanation

    • Class to define the pre-processing details, such as input size, which operator to be used, etc.
    • After instantiating this class, define the pre-processing details by setting the class variables shown below.
  • Arguments

    • None
  • Class variables
    Following variables must be defined before running the DRP-AI Pre-processing Runtime Compile module.

| Variables | Type | Details |
|---|---|---|
| shape_in | [int, int, int, int] | Input image data shape.<br>List must have size of 4 as NHWC. |
| format_in | drpai_param.FORMAT | Input format.<br>See Format for drpai_param.FORMAT. |
| order_in | drpai_param.ORDER | Input order.<br>See Order for drpai_param.ORDER. |
| type_in | drpai_param.TYPE | Input Type.<br>See Type for drpai_param.TYPE. |
| shape_out | [int, int, int, int] | Output data shape.<br>List must have size of 4 as NHWC/NCHW. |
| format_out | drpai_param.FORMAT | Output format.<br>See Format for drpai_param.FORMAT. |
| order_out | drpai_param.ORDER | Output order.<br>See Order for drpai_param.ORDER. |
| type_out | drpai_param.TYPE | Output Type.<br>See Type for drpai_param.TYPE. |
| ops | list of Operator Class | Operators to be run as pre-processing.<br>If no operator specified, only format/order/type conversion will be done as pre-processing. |

2.4.2 Operator Class

The following are the operator classes for the operators explained in Pre-processing Operators.

2.4.2.1 Crop()
op.Crop(crop_tl_x, crop_tl_y, width, height)
  • Arguments

    • crop_tl_x (int) : X coordinates of top left corner of the crop box.
    • crop_tl_y (int) : Y coordinates of top left corner of the crop box.
    • width (int) : Width of the crop box.
    • height (int) : Height of the crop box.
  • Restrictions

    • crop_tl_x <= (width of input shape - 1)
    • crop_tl_y <= (height of input shape - 1)
  • Supported In/Out parameters

| Features | Input | Output |
|---|---|---|
| Format | drpai_param.FORMAT.RGB<br>drpai_param.FORMAT.BGR<br>drpai_param.FORMAT.GRAY | Same as Input. |
| Order | drpai_param.ORDER.HWC | Same as Input. |
| Type | drpai_param.TYPE.UINT8<br>drpai_param.TYPE.FP16 | Same as Input. |

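
The Crop restrictions can be verified before compiling. The helper below is a hypothetical sketch that assumes shape_in is given as NHWC (as required by Config); it checks only the two restrictions listed for Crop and does not test whether the whole crop box fits inside the image.

```python
def crop_origin_is_valid(shape_in, crop_tl_x, crop_tl_y):
    """Check the two documented Crop restrictions against an NHWC input shape."""
    _, in_height, in_width, _ = shape_in
    return crop_tl_x <= in_width - 1 and crop_tl_y <= in_height - 1


print(crop_origin_is_valid([1, 480, 640, 2], 0, 0))    # origin inside the image
print(crop_origin_is_valid([1, 480, 640, 2], 640, 0))  # x origin past the last column
```
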
2.4.2.2 Resize()
op.Resize(width, height, algorithm)
  • Arguments

    • width (int) : Output width of the resized image.
    • height (int) : Output height of the resized image.
    • algorithm (int) : Resize algorithm to be used.
      • op.Resize.NEAREST (Default) : Nearest neighbor algorithm.
      • op.Resize.BILINEAR : Bilinear interpolation.
  • Restrictions

    • 2 < (width of input shape) <= 4096
    • 2 < (height of input shape) <= 4096
    • (channel of input shape) <= 512
    • 2 < width <= 4096
    • 2 < height <= 4096
  • Supported In/Out parameters

| Features | Input | Output |
|---|---|---|
| Format | drpai_param.FORMAT.RGB<br>drpai_param.FORMAT.BGR<br>drpai_param.FORMAT.GRAY | Same as Input. |
| Order | drpai_param.ORDER.HWC | Same as Input. |
| Type | drpai_param.TYPE.UINT8<br>drpai_param.TYPE.FP16 | Same as Input. |

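
The Resize restrictions translate directly into range checks. The following sketch uses a hypothetical helper name and assumes an NHWC input shape; the limits are the ones listed under Restrictions for Resize.

```python
def resize_is_valid(shape_in, width, height):
    """Check the documented Resize limits for an NHWC input shape and an output size."""
    _, in_height, in_width, in_channel = shape_in
    return (
        2 < in_width <= 4096
        and 2 < in_height <= 4096
        and in_channel <= 512
        and 2 < width <= 4096
        and 2 < height <= 4096
    )


print(resize_is_valid([1, 480, 480, 3], 224, 224))
print(resize_is_valid([1, 480, 480, 3], 2, 224))  # output width must be greater than 2
```
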
2.4.2.3 Normalize()
op.Normalize(cof_add, cof_mul)
  • Arguments

    • cof_add (list of float) : Addition coefficient in normalization.
    • cof_mul (list of float) : Multiplication coefficient in normalization.
    • Note: Above coefficients will be used in the following equation.
      Dout = (Din + cof_add) * cof_mul  
      
    • Example
      Following normalization may be used for the neural network input data.
      range = 255
      img = img / range
      img = (img - mean) / stdev
      
      In the above case, find cof_add and cof_mul from mean and stdev using following conversion formulae.
      cof_add = - (mean * range)
      cof_mul = 1/(stdev * range)
      
  • Restrictions

    • width of input shape * channel of input shape < 0xFFFFFFFF
  • Supported In/Out parameters

Features Input Output
Format drpai_param.FORMAT.RGB
drpai_param.FORMAT.BGR
drpai_param.FORMAT.GRAY
Same as Input.
Order drpai_param.ORDER.HWC Same as Input.
Type drpai_param.TYPE.UINT8
drpai_param.TYPE.FP16
drpai_param.TYPE.FP32
drpai_param.TYPE.FP16
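
The conversion from mean and stdev to cof_add and cof_mul can be scripted. The sketch below applies the conversion formulae given in the Normalize arguments; the function name is hypothetical and the mean/stdev values are illustrative. It also checks, for one pixel value, that Dout = (Din + cof_add) * cof_mul matches the two-step normalization.

```python
def to_coefficients(mean, stdev, value_range=255.0):
    """Convert per-channel mean/stdev into (cof_add, cof_mul) for op.Normalize."""
    cof_add = [-(m * value_range) for m in mean]
    cof_mul = [1.0 / (s * value_range) for s in stdev]
    return cof_add, cof_mul


mean = [0.485, 0.456, 0.406]   # illustrative values
stdev = [0.229, 0.224, 0.225]  # illustrative values
cof_add, cof_mul = to_coefficients(mean, stdev)

# Compare both forms of the normalization for a single pixel value on channel 0.
din = 128.0
two_step = (din / 255.0 - mean[0]) / stdev[0]
one_step = (din + cof_add[0]) * cof_mul[0]
print(cof_add, cof_mul)
print(abs(two_step - one_step) < 1e-9)
```
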

2.4.3 Constant Variables

Constant variables are defined as drpai_param module.

2.4.3.1 Format
Class Variable Value Details
drpai_param.FORMAT YUYV_422 0x0000 Classified as YUV422.
drpai_param.FORMAT YVYU_422 0x0001 Classified as YUV422.
drpai_param.FORMAT UYUV_422 0x0002 Classified as YUV422.
drpai_param.FORMAT VUYY_422 0x0003 Classified as YUV422.
drpai_param.FORMAT YUYV_420 0x1000 Classified as YUV420.
drpai_param.FORMAT UYVY_420 0x1001 Classified as YUV420.
drpai_param.FORMAT YV12_420 0x1002 Classified as YUV420.
drpai_param.FORMAT IYUV_420 0x1003 Classified as YUV420.
drpai_param.FORMAT NV12_420 0x1004 Classified as YUV420.
drpai_param.FORMAT NV21_420 0x1005 Classified as YUV420.
drpai_param.FORMAT IMC1_420 0x1006 Classified as YUV420.
drpai_param.FORMAT IMC2_420 0x1007 Classified as YUV420.
drpai_param.FORMAT IMC3_420 0x1008 Classified as YUV420.
drpai_param.FORMAT IMC4_420 0x1009 Classified as YUV420.
drpai_param.FORMAT RGB 0x2000 -
drpai_param.FORMAT BGR 0x2001 -
drpai_param.FORMAT GRAY 0x2010 -
drpai_param.FORMAT UNKNOWN 0xFFFF -
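
The "Classified as" column follows from the value ranges in the table. As a sketch, a hypothetical helper (not part of drpai_param) can map a raw Compile module format value to its class; the ranges are taken from the table above.

```python
def format_family(value):
    """Classify a drpai_param.FORMAT value using the ranges in the Format table."""
    if 0x0000 <= value <= 0x0003:
        return "YUV422"
    if 0x1000 <= value <= 0x1009:
        return "YUV420"
    if value in (0x2000, 0x2001):
        return "RGB/BGR"
    if value == 0x2010:
        return "GRAY"
    return "UNKNOWN"


print(format_family(0x0000))  # value listed for YUYV_422
print(format_family(0x1004))  # value listed for NV12_420
```
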
2.4.3.2 Order
Class Variable Value
drpai_param.ORDER HWC 0x0000
drpai_param.ORDER CHW 0x0001
drpai_param.ORDER UNKNOWN 0xFFFF
2.4.3.3 Type
Class Variable Value
drpai_param.TYPE UINT8 0x0000
drpai_param.TYPE FP16 0x0001
drpai_param.TYPE FP32 0x0002
drpai_param.TYPE UNKNOWN 0xFFFF

2.5 Integration

To use the DRP-AI Pre-processing Runtime Compile module, place the Python script in the same directory as the drpai_preprocess directory and import the module at the top of the script as shown below.

from drpai_preprocess import * 
# Import other modules after the DRP-AI Preprocessing Runtime Compile Module
# e.g.,
import os

Note: Importing the DRP-AI Pre-processing Runtime Compile module after other imports may cause a segmentation fault.

2.6 Sample Code

The following sample code generates DRP-AI Pre-processing Runtime Object files for V2MA in the preprocess directory.

from drpai_preprocess import * 

config = preruntime.Config()
config.shape_in     = [1, 480, 640, 2] # NHWC
config.format_in    = drpai_param.FORMAT.YUYV_422
config.order_in     = drpai_param.ORDER.HWC
config.type_in      = drpai_param.TYPE.UINT8
config.shape_out    = [1, 3, 224, 224] # NCHW
config.format_out   = drpai_param.FORMAT.RGB
config.order_out    = drpai_param.ORDER.CHW
config.type_out     = drpai_param.TYPE.FP32
config.ops = [
    op.Crop(0, 0, 480, 480), # crop top left 480x480 area.
    op.Resize(224, 224, op.Resize.BILINEAR),
    op.Normalize([-123.675, -116.28, -103.53], [0.01712475, 0.017507, 0.01742919])
    # Normalize with mean = [0.485, 0.456, 0.406 ] and stdev = [ 0.229, 0.224, 0.225 ]
]
preruntime.PreRuntime(config, "preprocess", "V2MA")
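
For buffer planning, the data sizes implied by the shapes and types in this sample can be computed as follows. This is a sketch: the helper and the bytes-per-element table are assumptions (1, 2 and 4 bytes for UINT8, FP16 and FP32), and whether the Run module reports sizes in the same unit is not confirmed here.

```python
from math import prod

# Assumed element sizes in bytes for each type name.
BYTES_PER_ELEMENT = {"UINT8": 1, "FP16": 2, "FP32": 4}


def buffer_bytes(shape, type_name):
    """Return the number of bytes needed for a tensor of the given shape and type."""
    return prod(shape) * BYTES_PER_ELEMENT[type_name]


in_bytes = buffer_bytes([1, 480, 640, 2], "UINT8")   # shape_in / type_in of the sample
out_bytes = buffer_bytes([1, 3, 224, 224], "FP32")   # shape_out / type_out of the sample
print(in_bytes, out_bytes)  # prints "614400 602112"
```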

3. Run Module

3.1 Overview

DRP-AI Pre-processing Runtime Object files generated by the Compile module can be run on an RZ/V Evaluation Board with the Run module.
The Run module allocates the Object files in DDR memory and assigns them to DRP-AI.
Users can also change some of the parameters that were specified when the pre-processing was compiled.
The Run module is used by calling the C++ APIs explained in this section.

3.2 File configuration

3.2.1 Source Code

File Details
PreRuntime.cpp Pre-processing Runtime source code
PreRuntime.h Pre-processing Runtime header file.

3.3 API

Function Description
Load() Loads Pre-processing Runtime Object files.
Pre() Runs pre-processing on DRP-AI.

3.3.1 Load

| Items | Details |
|---|---|
| Overview | Loads Pre-processing Runtime Object files. |
| Definition | uint8_t PreRuntime::Load(const std::string pre_dir, uint32_t start_addr) |
| Description | Initialize Pre-processing Runtime.<br>Load Pre-processing Runtime Object files stored in pre_dir and deploy them to DRP-AI memory area. |
| Arguments | 1. const std::string pre_dir = Directory name that contains Pre-processing Runtime Object files.<br>2. uint32_t start_addr = Start address where Object files are dynamically allocated.<br>If not specified, start address of DRP-AI memory area will be used instead. |
| Return | 0 = if function succeeded<br>>0 = otherwise |

3.3.2 Pre

| Items | Details |
|---|---|
| Overview | Run pre-processing on DRP-AI with specified parameters. |
| Definition | uint8_t PreRuntime::Pre(s_preproc_param_t* param, void** out_ptr, uint32_t* out_size) |
| Description | Run pre-processing on DRP-AI. Users can specify the pre-processing parameters in param. The DRP-AI output will be stored in out_ptr and its size will be stored in out_size. |
| Arguments | 1. s_preproc_param_t* param = Pre-processing parameters to be changed.<br>Refer to s_preproc_param_t for more details.<br>2. void** out_ptr = Array pointer for DRP-AI output data. Buffer will last until next call of this function.<br>3. uint32_t* out_size = DRP-AI output buffer size. |
| Return | 0 = if function succeeded<br>>0 = otherwise |

3.4 Structure

3.4.1 s_preproc_param_t

DRP-AI pre-processing parameter container.
These parameters are already defined when the pre-processing is compiled with the Compile module.
Members are shown below.

| Type | Member variable | Details | Range | Invalid value<br>See Notes.1 |
|---|---|---|---|---|
| uint16_t | pre_in_shape_w | Input image width of pre-processing. | 0~ | 0xFFFF |
| uint16_t | pre_in_shape_h | Input image height of pre-processing. | 0~ | 0xFFFF |
| uint32_t | pre_in_addr | Start address of continuous memory area which stores the input image data.<br>See Notes.2 | 0~0xFFFFFFFE | 0xFFFFFFFF |
| uint16_t | pre_in_format | Input image format. | See Format and Format parameter change restriction | 0xFFFF |
| uint16_t | pre_out_format | Output image format. | FORMAT_GRAY, FORMAT_RGB, FORMAT_BGR<br>See Format parameter change restriction. | 0xFFFF |
| uint8_t | resize_alg | Resize algorithm. | See Resize Algorithm | 0xFF |
| uint16_t | resize_w | Output image width of resize operator. | 3~4096<br>See Notes.3 | 0xFFFF |
| uint16_t | resize_h | Output image height of resize operator. | 3~4096<br>See Notes.3 | 0xFFFF |
| float[3] | cof_add | Addition coefficients for normalize operator. | [-(FLT_MAX-1)~FLT_MAX] | [-FLT_MAX] |
| float[3] | cof_mul | Multiplication coefficients for normalize operator. | [-(FLT_MAX-1)~FLT_MAX] | [-FLT_MAX] |
| uint16_t | crop_tl_x | X coordinates of top left corner of the crop box. | 0~ | 0xFFFF |
| uint16_t | crop_tl_y | Y coordinates of top left corner of the crop box. | 0~ | 0xFFFF |
| uint16_t | crop_w | Width of the crop box. | 0~<br>See Notes.3 | 0xFFFF |
| uint16_t | crop_h | Height of the crop box. | 0~<br>See Notes.3 | 0xFFFF |

Notes:
  1. If the "Invalid value" is specified for a member, that parameter keeps its current value.
  2. pre_in_addr must be specified when s_preproc_param_t is defined. For other members, define the value if changes are necessary.
  3. For resize_w, resize_h, crop_w, crop_h, the new value must be smaller than or equal to the value specified when compiling the pre-processing.
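
The behaviour described in Notes 1 and 3 can be modelled in a few lines. The following is an illustrative Python model only, not the Run module's implementation: the function, the dictionaries, and the choice to raise an exception on a Note 3 violation are assumptions; the invalid values are those listed in the table above.

```python
# Invalid values for a subset of members, copied from the s_preproc_param_t table.
INVALID = {"resize_alg": 0xFF, "resize_w": 0xFFFF, "resize_h": 0xFFFF,
           "crop_w": 0xFFFF, "crop_h": 0xFFFF}
SIZE_MEMBERS = ("resize_w", "resize_h", "crop_w", "crop_h")


def resolve(compiled, requested):
    """Merge requested changes into the compiled parameters."""
    resolved = dict(compiled)
    for name, value in requested.items():
        if value == INVALID[name]:
            continue  # Note 1: the invalid value keeps the current setting
        if name in SIZE_MEMBERS and value > compiled[name]:
            # Note 3: sizes may not exceed the compiled value (error handling is assumed)
            raise ValueError(f"{name}={value} exceeds compiled value {compiled[name]}")
        resolved[name] = value
    return resolved


compiled = {"resize_alg": 1, "resize_w": 224, "resize_h": 224, "crop_w": 480, "crop_h": 480}
requested = {"resize_alg": 0xFF, "resize_w": 0xFFFF, "resize_h": 112,
             "crop_w": 0xFFFF, "crop_h": 0xFFFF}
print(resolve(compiled, requested))  # only resize_h changes
```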

3.5 Macro

The major macro definitions are as follows.

3.5.1 Format

Macros to define the format.

Macro Value
FORMAT_YUYV_422 0x0000
FORMAT_YVYU_422 0x0001
FORMAT_UYUV_422 0x0002
FORMAT_VUYY_422 0x0003
FORMAT_YUYV_420 0x1000
FORMAT_UYVY_420 0x1001
FORMAT_YV12_420 0x1002
FORMAT_IYUV_420 0x1003
FORMAT_NV12_420 0x1004
FORMAT_NV21_420 0x1005
FORMAT_IMC1_420 0x1006
FORMAT_IMC2_420 0x1007
FORMAT_IMC3_420 0x1008
FORMAT_IMC4_420 0x1009
FORMAT_GRAY 0xFFFC
FORMAT_BGR 0xFFFD
FORMAT_RGB 0xFFFE
FORMAT_UNKNOWN 0xFFFF

3.5.2 Resize Algorithm

Macros to define the algorithm for resize operation.

Macro Value
ALG_NEAREST 0
ALG_BILINEAR 1

3.6 Format Parameter Change Restriction

When the input/output format of pre-processing is changed using the Run module, only the following cases are allowed.

| Parameter | Original Value | New Value | Conditions |
|---|---|---|---|
| pre_in_format | `FORMAT_*_420` | `FORMAT_*_420`, `FORMAT_*_422` | - |
| pre_in_format | `FORMAT_*_422` | `FORMAT_*_420`, `FORMAT_*_422` | If output format specified in Compile module is RGB or BGR. |
| pre_in_format | `FORMAT_*_422` | `FORMAT_*_422` | If output format specified in Compile module is GRAY. |
| pre_in_format | FORMAT_RGB<br>FORMAT_BGR | FORMAT_RGB<br>FORMAT_BGR | - |
| pre_out_format | FORMAT_RGB<br>FORMAT_BGR | FORMAT_RGB<br>FORMAT_BGR | - |
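
The pre_in_format rules can be encoded as a lookup for use in an offline check. This is a sketch under stated assumptions: formats are represented by their macro names as strings, and the helper names are hypothetical. Any combination not listed in the table is treated as not allowed.

```python
def family(fmt):
    """Group a format macro name into the families used by the restriction table."""
    if fmt.endswith("_420"):
        return "420"
    if fmt.endswith("_422"):
        return "422"
    if fmt in ("FORMAT_RGB", "FORMAT_BGR"):
        return "RGB/BGR"
    return "other"


def in_format_change_allowed(original, new, compiled_out_format):
    """Return True if changing pre_in_format from original to new is listed as allowed."""
    old_family, new_family = family(original), family(new)
    if old_family == "420":
        return new_family in ("420", "422")
    if old_family == "422":
        if compiled_out_format in ("FORMAT_RGB", "FORMAT_BGR"):
            return new_family in ("420", "422")
        if compiled_out_format == "FORMAT_GRAY":
            return new_family == "422"
        return False
    if old_family == "RGB/BGR":
        return new_family == "RGB/BGR"
    return False


print(in_format_change_allowed("FORMAT_YUYV_422", "FORMAT_NV12_420", "FORMAT_RGB"))
print(in_format_change_allowed("FORMAT_YUYV_422", "FORMAT_NV12_420", "FORMAT_GRAY"))
```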

3.7 Integration

  1. Place the files listed in File Configuration in the project directory.
  2. Include the PreRuntime.h file in the application source code.
#include "PreRuntime.h"
  3. Include the Pre-processing source code in the build command.
    For example, in cmake for Application Example, modify the CMakeLists.txt as follows.
  • Before
set(SRC tutorial_app.cpp MeraDrpRuntimeWrapper.cpp)
set(EXE_NAME tutorial_app)

add_executable(${EXE_NAME} ${SRC})
target_link_libraries(${EXE_NAME} ${TVM_RUNTIME_LIB})
  • After
set(SRC tutorial_app.cpp MeraDrpRuntimeWrapper.cpp PreRuntime.cpp)
set(EXE_NAME tutorial_app)

add_executable(${EXE_NAME} ${SRC})
target_link_libraries(${EXE_NAME} ${TVM_RUNTIME_LIB})

3.8 Sample Code

Please refer to Application Example.

[^1]: DRP-AI TVM is powered by EdgeCortix MERA™ Compiler Framework.