Image Calibration Data Preparation Issues and Processing

1 Problem introduction

When using the PTQ model conversion scheme, it is not obvious how to prepare the calibration data, and a common first attempt is to supply a JPG picture directly. Conversion then fails with an error such as "cannot reshape array of size xxx into shape (x, xxx, xxx)". How can this be solved?
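The error message itself is a NumPy reshape failure: the number of values read from the file does not match the number of elements in the model input. The sketch below reproduces it under stated assumptions. `load_calibration_sample` is a hypothetical stand-in for a loader that reads raw float32 values (it is not the toolchain's actual code), the input shape (3, 224, 224) is the ResNet18 example used in this article, and `fake_jpg` is just a zero-filled buffer standing in for a compressed JPG file.

```python
import numpy as np

def load_calibration_sample(raw_bytes, shape=(3, 224, 224), dtype=np.float32):
    """Hypothetical loader: interpret raw bytes as a tensor of the given shape."""
    data = np.frombuffer(raw_bytes, dtype=dtype)
    return data.reshape(shape)

# A correctly prepared sample holds exactly 3 * 224 * 224 float32 values.
good = np.zeros((3, 224, 224), dtype=np.float32).tobytes()
print(load_calibration_sample(good).shape)

# A compressed JPG has an unrelated byte count, so the reshape fails.
fake_jpg = bytes(40000)  # stand-in for the bytes of a JPG file
try:
    load_calibration_sample(fake_jpg)
except ValueError as err:
    print("reshape failed:", err)
```

In other words, the calibration files must contain decoded, resized pixel data in the layout the model expects, not the compressed image file.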

2 Solution

Taking ResNet18's preprocessing on the ImageNet dataset as an example, this section walks through the basic calibration data preparation process.

2.1 PyTorch data preprocessing

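import torchvision.transforms as transforms
from PIL import Image

def Data(image_path):
    # Load the image and force three RGB channels
    image = Image.open(image_path).convert('RGB')
    data_transform = transforms.Compose(
        # Resize(224) scales the shorter side to 224; Resize((224, 224)) gives a fixed 224x224 output
        [transforms.Resize(224),
         transforms.ToTensor(),      # HWC uint8 [0, 255] -> CHW float32 [0, 1]
         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
    return data_transform(image)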

2.2 Prepare Files

[root@7660ab0db525 resnet18]# tree -L 1
.
├── 03_build.sh              
├── origin_image             
├── preprocess.py                  
├── resnet18_baseline_config.yaml  
└── resnet18_baseline.onnx

2.3 Image Calibration Data

preprocess.py

import cv2
import os
import numpy as np

src_dir = './origin_image'
dst_dir = './image_converted_rgb_f32'   # set the cal_data_dir parameter in the yaml file to this path
pic_ext = '.rgb'

os.makedirs(dst_dir, exist_ok=True)     # create the output directory if it does not exist
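for src_name in sorted(os.listdir(src_dir)):
    src_file = os.path.join(src_dir, src_name)
    img = cv2.imread(src_file)                  # decoded as HWC, BGR, uint8
    if img is None:                             # skip files OpenCV cannot decode
        continue
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # BGR -> RGB
    img = cv2.resize(img, (224, 224), interpolation=cv2.INTER_CUBIC)
    img = img.transpose(2, 0, 1)                # HWC -> CHW
    img = img.astype(np.float32)                # values stay in [0, 255]; no normalization here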
    filename = os.path.basename(src_file)
    short_name, ext = os.path.splitext(filename)
    pic_name = os.path.join(dst_dir, short_name + pic_ext)
    img.tofile(pic_name)
    print("write:%s" % pic_name)

**Note**: Apart from three operations, namely normalization (scaling to [0, 1]), mean subtraction, and division by the standard deviation, calibration data processing should be identical to the data preprocessing used during training.

Normalization, mean subtraction, and division by the standard deviation can be accelerated by inserting a preprocessing node into the model. Section 2.4 Model transformation describes how to configure this.

Run the preprocess.py script:

[root@7660ab0db525 resnet18]# python3 preprocess.py 
write:./image_converted_rgb_f32/ILSVRC2012_val_00000001.rgb
write:./image_converted_rgb_f32/ILSVRC2012_val_00000002.rgb
write:./image_converted_rgb_f32/ILSVRC2012_val_00000003.rgb
write:./image_converted_rgb_f32/ILSVRC2012_val_00000004.rgb
write:./image_converted_rgb_f32/ILSVRC2012_val_00000005.rgb
write:./image_converted_rgb_f32/ILSVRC2012_val_00000006.rgb
write:./image_converted_rgb_f32/ILSVRC2012_val_00000007.rgb
write:./image_converted_rgb_f32/ILSVRC2012_val_00000008.rgb
write:./image_converted_rgb_f32/ILSVRC2012_val_00000009.rgb
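Before running the conversion, it can be worth reading one generated file back to confirm its layout. The following is a minimal sketch, assuming the files are raw float32 values in CHW order with shape (3, 224, 224), which is what preprocess.py writes. `read_calibration_file` is a hypothetical helper name, and the demo writes a random tensor to a temporary file instead of using a real image.

```python
import os
import tempfile
import numpy as np

def read_calibration_file(path, shape=(3, 224, 224)):
    """Read a raw float32 file and check that it holds exactly one tensor of `shape`."""
    data = np.fromfile(path, dtype=np.float32)
    expected = int(np.prod(shape))
    if data.size != expected:
        raise ValueError(
            "%s holds %d values, expected %d" % (path, data.size, expected))
    return data.reshape(shape)

# Self-contained demo: write a random CHW tensor the same way preprocess.py does.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.rgb")
    demo = np.random.randint(0, 256, size=(3, 224, 224)).astype(np.float32)
    demo.tofile(demo_path)
    restored = read_calibration_file(demo_path)

print(restored.shape, restored.dtype, restored.min(), restored.max())
```

For a real file, pass its path (for example one of the .rgb files listed above) to `read_calibration_file`. Pixel values should still lie in [0, 255], because normalization is left to the model's preprocessing node.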

2.4 Model transformation

Normalization, mean subtraction, and division by the standard deviation are folded into the yaml configuration through the norm_type, mean_value, and scale_value parameters. The example below shows how these three parameters are set.

The training-time mean and std are converted to mean_value and scale_value as follows:

$$\text{data} = \left(\frac{\text{input}}{255} - \text{mean}\right) \times \frac{1}{\text{std}} = \left(\text{input} - 255 \cdot \text{mean}\right) \times \frac{1}{255 \cdot \text{std}}$$

$$\text{data} = \left(\text{input} - \text{mean\_value}\right) \times \text{scale\_value}$$

norm_type: 'data_mean_and_scale'
mean_value: 123.68 116.28 103.53
scale_value: 0.017 0.018 0.017
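As a sanity check on these numbers, the sketch below derives mean_value and scale_value from the ImageNet mean and std used in Section 2.1, then confirms that the two forms of the formula give the same result on random pixel data. The variable names `training_style` and `node_style` are illustrative only, and the random array stands in for a real calibration image.

```python
import numpy as np

# Per-channel mean and std used by transforms.Normalize in Section 2.1
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

# Fold the division by 255 into the yaml parameters
mean_value = 255.0 * mean
scale_value = 1.0 / (255.0 * std)
print("mean_value:", mean_value)
print("scale_value:", scale_value)

# Random CHW pixel data in [0, 255], standing in for a calibration image
pixels = np.random.randint(0, 256, size=(3, 224, 224)).astype(np.float64)

# Training-time form: scale to [0, 1], subtract mean, divide by std
training_style = (pixels / 255.0 - mean[:, None, None]) / std[:, None, None]

# Preprocessing-node form: subtract mean_value, multiply by scale_value
node_style = (pixels - mean_value[:, None, None]) * scale_value[:, None, None]

print("forms agree:", np.allclose(training_style, node_style))
```

The yaml values above are these results rounded, so the output of the inserted node can differ slightly from training-time preprocessing. Keeping more decimal places in scale_value reduces that difference.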

Using the calibration data path generated in Section 2.3, set cal_data_dir: './image_converted_rgb_f32' in the yaml file, then run the 03_build.sh script to complete the model conversion.

The 03_build.sh script contains:

cd $(dirname $0) || exit

config_file="./resnet18_baseline_config.yaml"
model_type="onnx"
# build model
hb_mapper makertbin --config ${config_file}  \
                    --model-type  ${model_type}

Run it:

[root@7660ab0db525 resnet18]# sh 03_build.sh
2022-11-16 10:30:05,057 INFO Start hb_mapper....
...
2022-11-16 10:30:38,935 INFO Convert to runtime bin file sucessfully!
2022-11-16 10:30:38,935 INFO End Model Convert