【Sunrise X3】Multi-task Learning with YOLOP

1. Preface

A multi-task network completes multiple tasks with a single network. YOLOP performs three tasks simultaneously in one network: object detection, drivable area segmentation, and lane line detection. This article gives a brief explanation of the network structure and deploys YOLOP on the X3 Pi via Horizon's AI toolchain.

YOLOP project: https://github.com/hustvl/YOLOP

The test code for this article: https://github.com/Rex-LK/ai_arm_learning

2. Network structure

The network structure of YOLOP is very clear and largely consistent with the YOLO series, mainly following the model structure of YOLOv4. YOLOP's network is divided into the following parts:

  • Backbone
  • Neck
  • Detection head
  • Drivable area segmentation head
  • Lane line segmentation head
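To make the shared-backbone idea concrete, here is a minimal, hypothetical NumPy sketch (not YOLOP's actual code): one shared feature extractor feeds a detection head and two segmentation heads, producing the three outputs in the shapes that the deployment code later consumes. All function names here are illustrative stand-ins.

```python
import numpy as np

def backbone(x):
    # Stand-in for the shared feature extractor:
    # downsample the input 8x and produce a 64-channel feature map.
    n, c, h, w = x.shape
    return np.random.rand(n, 64, h // 8, w // 8).astype(np.float32)

def detect_head(feat):
    # Stand-in detection head: one row per candidate box,
    # [x1, y1, x2, y2, conf, cls] as in the post-processing code.
    n = feat.shape[0]
    return np.random.rand(n, 100, 6).astype(np.float32)

def seg_head(feat, out_h, out_w):
    # Stand-in segmentation head: 2-channel (background/foreground) logits
    # upsampled back to the input resolution.
    n = feat.shape[0]
    return np.random.rand(n, 2, out_h, out_w).astype(np.float32)

def yolop_like_forward(x):
    # One backbone pass, three task-specific heads.
    feat = backbone(x)
    h, w = x.shape[2], x.shape[3]
    det = detect_head(feat)       # object detection
    da = seg_head(feat, h, w)     # drivable area segmentation
    ll = seg_head(feat, h, w)     # lane line segmentation
    return det, da, ll

det, da, ll = yolop_like_forward(np.zeros((1, 3, 640, 640), dtype=np.float32))
print(det.shape, da.shape, ll.shape)  # (1, 100, 6) (1, 2, 640, 640) (1, 2, 640, 640)
```

The key design point is that the backbone is run once per frame; only the lightweight heads are task-specific, which is what makes the multi-task setup cheaper than three separate networks.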

[Figure: YOLOP network structure (yolop.png)]

3. Model quantization

The official ONNX model is exported with opset_version 12, so you need to change the opset_version value in export_onnx.py to 11 and re-export the ONNX model. The 640*640 model is used by default. Once the export is complete, quantization can be performed. The following is the YOLOP quantization configuration file.

model_parameters:
  onnx_model: 'yolop-640-640.onnx'
  output_model_file_prefix: 'yolop-640-640'
  march: 'bernoulli2'
input_parameters:
  input_type_train: 'rgb'
  input_layout_train: 'NCHW'
  input_type_rt: 'nv12'
  norm_type: 'data_mean_and_scale'
  mean_value: '123.675 116.28 103.53'
  scale_value: '0.0171 0.0175 0.0174'
  input_layout_rt: 'NCHW'
calibration_parameters:
  cal_data_dir: './calibration_data_rgb_f32'
  calibration_type: 'max'
  max_percentile: 0.9999
compiler_parameters:
  compile_mode: 'latency'  
  optimize_level: 'O3'
  debug: False
  core_num: 2
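The mean_value and scale_value entries in the config are not arbitrary: they are the standard ImageNet normalization statistics (mean/std given in the 0-1 range) rescaled to 0-255 pixel values, assuming the usual (x - mean) * scale convention for `data_mean_and_scale`, so scale is 1/(std * 255). A quick check:

```python
# Standard ImageNet normalization statistics (0-1 range).
imagenet_mean = [0.485, 0.456, 0.406]
imagenet_std = [0.229, 0.224, 0.225]

# Rescale to 0-255 pixels: y = (x - mean*255) * (1 / (std*255))
mean_value = [round(m * 255, 3) for m in imagenet_mean]
scale_value = [round(1.0 / (s * 255), 4) for s in imagenet_std]

print(mean_value)   # [123.675, 116.28, 103.53]
print(scale_value)  # [0.0171, 0.0175, 0.0174]
```

These match the `mean_value: '123.675 116.28 103.53'` and `scale_value: '0.0171 0.0175 0.0174'` lines above, so the quantized model expects the same normalization the float model was trained with.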

4. Board test

The following is part of the test code (the complete version can be found in the GitHub repository linked above):

    # requires: cv2, numpy as np, torch, and pyeasy_dnn (Horizon's on-board DNN API)
    model_path = weight
    model = pyeasy_dnn.load(model_path)
    print(f"Load {model_path} done!")

    save_det_path = "./pictures/detect.jpg"
    save_da_path = "./pictures/da.jpg"
    save_ll_path = "./pictures/ll.jpg"
    save_merge_path = "./pictures/output.jpg"

    img_bgr = cv2.imread(img_path)
    height, width, _ = img_bgr.shape

    img0 = img_bgr.copy().astype(np.uint8)
    img_rgb = img_bgr[:, :, ::-1].copy()
    
    h, w = get_hw(model[0].inputs[0].properties)

    canvas, r, dw, dh, new_unpad_w, new_unpad_h = resize_unscale(img0, (h, w))
    img_input = bgr2nv12_opencv(canvas)

    preds = model[0].forward(img_input)

    det_out = preds[0].buffer[...,0]
    da_seg_out = preds[1].buffer
    ll_seg_out = preds[2].buffer

    det_out = torch.from_numpy(det_out).float()
    boxes = non_max_suppression(det_out)[0]  # [n,6] [x1,y1,x2,y2,conf,cls]
    boxes = boxes.cpu().numpy().astype(np.float32)

    if boxes.shape[0] == 0:
        print("no bounding boxes detected.")
        return

    # scale coords to original size.
    boxes[:, 0] -= dw
    boxes[:, 1] -= dh
    boxes[:, 2] -= dw
    boxes[:, 3] -= dh
    boxes[:, :4] /= r

    print(f"detect {boxes.shape[0]} bounding boxes.")

    img_det = img_rgb[:, :, ::-1].copy()
    for i in range(boxes.shape[0]):
        x1, y1, x2, y2, conf, label = boxes[i]
        x1, y1, x2, y2, label = int(x1), int(y1), int(x2), int(y2), int(label)
        img_det = cv2.rectangle(img_det, (x1, y1), (x2, y2), (0, 255, 0), 2, 2)

    cv2.imwrite(save_det_path, img_det)

    # select da & ll segment area.
    da_seg_out = da_seg_out[:, :, dh:dh + new_unpad_h, dw:dw + new_unpad_w]
    ll_seg_out = ll_seg_out[:, :, dh:dh + new_unpad_h, dw:dw + new_unpad_w]

    da_seg_mask = np.argmax(da_seg_out, axis=1)[0]  # (?,?) (0|1)
    ll_seg_mask = np.argmax(ll_seg_out, axis=1)[0]  # (?,?) (0|1)
    print(da_seg_mask.shape)
    print(ll_seg_mask.shape)

    color_area = np.zeros((new_unpad_h, new_unpad_w, 3), dtype=np.uint8)
    color_area[da_seg_mask == 1] = [0, 255, 0]
    color_area[ll_seg_mask == 1] = [255, 0, 0]
    color_seg = color_area

    # convert to BGR
    color_seg = color_seg[..., ::-1]
    color_mask = np.mean(color_seg, 2)
    img_merge = canvas[dh:dh + new_unpad_h, dw:dw + new_unpad_w, :]
    img_merge = img_merge[:, :, ::-1]

    # merge: resize to original size
    img_merge[color_mask != 0] = \
        img_merge[color_mask != 0] * 0.5 + color_seg[color_mask != 0] * 0.5
    img_merge = img_merge.astype(np.uint8)
    img_merge = cv2.resize(img_merge, (width, height),
                           interpolation=cv2.INTER_LINEAR)
    for i in range(boxes.shape[0]):
        x1, y1, x2, y2, conf, label = boxes[i]
        x1, y1, x2, y2, label = int(x1), int(y1), int(x2), int(y2), int(label)
        img_merge = cv2.rectangle(img_merge, (x1, y1), (x2, y2), (0, 255, 0), 2, 2)

    # da: resize to original size
    da_seg_mask = da_seg_mask * 255
    da_seg_mask = da_seg_mask.astype(np.uint8)
    da_seg_mask = cv2.resize(da_seg_mask, (width, height),
                             interpolation=cv2.INTER_LINEAR)

    # ll: resize to original size
    ll_seg_mask = ll_seg_mask * 255
    ll_seg_mask = ll_seg_mask.astype(np.uint8)
    ll_seg_mask = cv2.resize(ll_seg_mask, (width, height),
                             interpolation=cv2.INTER_LINEAR)

    cv2.imwrite(save_merge_path, img_merge)
    cv2.imwrite(save_da_path, da_seg_mask)
    cv2.imwrite(save_ll_path, ll_seg_mask)

    print("detect done.")
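The coordinate restore in the code above (subtracting dw/dh, then dividing by r) inverts the letterbox resize performed by resize_unscale. A minimal, self-contained sketch of just that geometry, under the assumption that resize_unscale follows the usual letterbox pattern (`letterbox_params` and `restore_box` are hypothetical helper names, not from the repository):

```python
def letterbox_params(src_h, src_w, dst_h, dst_w):
    """Compute the scale and padding that fit a src image into a dst
    canvas while preserving the aspect ratio (letterboxing)."""
    r = min(dst_h / src_h, dst_w / src_w)          # uniform scale factor
    new_unpad_h = int(round(src_h * r))            # resized height
    new_unpad_w = int(round(src_w * r))            # resized width
    dw = (dst_w - new_unpad_w) // 2                # left padding
    dh = (dst_h - new_unpad_h) // 2                # top padding
    return r, dw, dh, new_unpad_w, new_unpad_h

def restore_box(x1, y1, x2, y2, r, dw, dh):
    """Invert the letterbox transform, as done for `boxes` above:
    remove the padding offset, then divide by the scale."""
    return (x1 - dw) / r, (y1 - dh) / r, (x2 - dw) / r, (y2 - dh) / r

# Example: a 1280x2560 frame letterboxed into a 640x640 model input.
r, dw, dh, nw, nh = letterbox_params(1280, 2560, 640, 640)
print(r, dw, dh, nw, nh)  # 0.25 0 160 640 320

# The full resized region maps back to the full original frame.
orig = restore_box(dw, dh, dw + nw, dh + nh, r, dw, dh)
print(orig)  # (0.0, 0.0, 2560.0, 1280.0)
```

The same dw/dh/new_unpad values are what the code uses to crop the segmentation outputs (`da_seg_out[:, :, dh:dh + new_unpad_h, dw:dw + new_unpad_w]`) before resizing them back to the original resolution.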

[Figure: combined output (output.jpg)]

As the figure above shows, all three visual tasks produce good results, especially at night; YOLOP is a very capable model.

5. Summary

This article reviewed the classic multi-task learning network YOLOP, showed how one model can output multiple results at the same time, and completed the deployment on the X3. Interested readers can go on to try other multi-task networks.