1. Preface
A multi-task network completes several tasks with a single network: YOLOP simultaneously performs object detection, drivable-area segmentation, and lane-line detection in one model. This article briefly explains the network structure and then deploys YOLOP on the X3 Pi using Horizon's AI toolchain.
YOLOP project: https://github.com/hustvl/YOLOP
Test code for this article: https://github.com/Rex-LK/ai_arm_learning
2. Network structure
The network structure of YOLOP is very clear and largely consistent with the YOLO series, mainly referencing the YOLOv4 design. YOLOP's network is divided into the following parts:
- Backbone
- Neck
- Detection head
- Drivable-area segmentation head
- Lane-line segmentation head
(Figure: YOLOP network structure, yolop.png)
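The shared-backbone, multi-head layout described above can be sketched in a few lines of PyTorch. This is a toy illustration only: the module names, layer choices, and channel sizes are placeholders, not YOLOP's actual definitions.

```python
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    """Toy sketch of YOLOP's layout: one shared backbone/neck, three task heads.
    All layers and channel counts here are illustrative placeholders."""

    def __init__(self):
        super().__init__()
        # Shared feature extractor (stands in for the real backbone).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Stands in for the SPP + FPN neck.
        self.neck = nn.Conv2d(32, 32, 3, padding=1)
        # One head per task, all consuming the same shared features.
        self.detect_head = nn.Conv2d(32, 6, 1)   # box coords + conf + cls
        self.da_seg_head = nn.Conv2d(32, 2, 1)   # drivable area, 2 classes
        self.ll_seg_head = nn.Conv2d(32, 2, 1)   # lane line, 2 classes

    def forward(self, x):
        feats = self.neck(self.backbone(x))
        # A single forward pass yields all three task outputs.
        return self.detect_head(feats), self.da_seg_head(feats), self.ll_seg_head(feats)

net = MultiTaskNet().eval()
with torch.no_grad():
    det, da, ll = net(torch.zeros(1, 3, 64, 64))
print(det.shape, da.shape, ll.shape)
```

The key point is that the backbone and neck are computed once and shared, so the marginal cost of each additional head is small compared to running three separate networks.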
3. Model quantization
The official ONNX model is exported with opset_version 12, so you need to change opset_version to 11 in export_onnx.py and re-export the ONNX model. The 640×640 model is used by default. Once the export is complete, quantization can be performed. The following is the YOLOP quantization configuration file:
model_parameters:
  onnx_model: 'yolop-640-640.onnx'
  output_model_file_prefix: 'yolop-640-640'
  march: 'bernoulli2'
input_parameters:
  input_type_train: 'rgb'
  input_layout_train: 'NCHW'
  input_type_rt: 'nv12'
  norm_type: 'data_mean_and_scale'
  mean_value: '123.675 116.28 103.53'
  scale_value: '0.0171 0.0175 0.0174'
  input_layout_rt: 'NCHW'
calibration_parameters:
  cal_data_dir: './calibration_data_rgb_f32'
  calibration_type: 'max'
  max_percentile: 0.9999
compiler_parameters:
  compile_mode: 'latency'
  optimize_level: 'O3'
  debug: False
  core_num: 2
4. Board test
The following is part of the test code (there are still a few rough edges; the full version can be viewed on GitHub):
import cv2
import numpy as np
import torch
from hobot_dnn import pyeasy_dnn  # X3 on-board inference API

model_path = weight
model = pyeasy_dnn.load(model_path)
print(f"Load {model_path} done!")

save_det_path = "./pictures/detect.jpg"
save_da_path = "./pictures/da.jpg"
save_ll_path = "./pictures/ll.jpg"
save_merge_path = "./pictures/output.jpg"

img_bgr = cv2.imread(img_path)
height, width, _ = img_bgr.shape
img0 = img_bgr.copy().astype(np.uint8)
img_rgb = img_bgr[:, :, ::-1].copy()

# letterbox-resize to the model input size, then convert to NV12
h, w = get_hw(model[0].inputs[0].properties)
canvas, r, dw, dh, new_unpad_w, new_unpad_h = resize_unscale(img0, (h, w))
img_input = bgr2nv12_opencv(canvas)
preds = model[0].forward(img_input)

det_out = preds[0].buffer[..., 0]
da_seg_out = preds[1].buffer
ll_seg_out = preds[2].buffer

det_out = torch.from_numpy(det_out).float()
boxes = non_max_suppression(det_out)[0]  # [n, 6]: [x1, y1, x2, y2, conf, cls]
boxes = boxes.cpu().numpy().astype(np.float32)
if boxes.shape[0] == 0:
    print("no bounding boxes detected.")
    return

# scale coords back to the original size: undo the padding, then the scaling.
boxes[:, 0] -= dw
boxes[:, 1] -= dh
boxes[:, 2] -= dw
boxes[:, 3] -= dh
boxes[:, :4] /= r
print(f"detect {boxes.shape[0]} bounding boxes.")

img_det = img_rgb[:, :, ::-1].copy()
for i in range(boxes.shape[0]):
    x1, y1, x2, y2, conf, label = boxes[i]
    x1, y1, x2, y2, label = int(x1), int(y1), int(x2), int(y2), int(label)
    img_det = cv2.rectangle(img_det, (x1, y1), (x2, y2), (0, 255, 0), 2, 2)
cv2.imwrite(save_det_path, img_det)

# select the unpadded da & ll segment area.
da_seg_out = da_seg_out[:, :, dh:dh + new_unpad_h, dw:dw + new_unpad_w]
ll_seg_out = ll_seg_out[:, :, dh:dh + new_unpad_h, dw:dw + new_unpad_w]
da_seg_mask = np.argmax(da_seg_out, axis=1)[0]  # (H, W), values 0 or 1
ll_seg_mask = np.argmax(ll_seg_out, axis=1)[0]  # (H, W), values 0 or 1
print(da_seg_mask.shape)
print(ll_seg_mask.shape)

color_area = np.zeros((new_unpad_h, new_unpad_w, 3), dtype=np.uint8)
color_area[da_seg_mask == 1] = [0, 255, 0]  # drivable area: green
color_area[ll_seg_mask == 1] = [255, 0, 0]  # lane lines: red
color_seg = color_area

# convert to BGR
color_seg = color_seg[..., ::-1]
color_mask = np.mean(color_seg, 2)
img_merge = canvas[dh:dh + new_unpad_h, dw:dw + new_unpad_w, :]
img_merge = img_merge[:, :, ::-1]

# merge: blend the masks in, then resize to the original size
img_merge[color_mask != 0] = \
    img_merge[color_mask != 0] * 0.5 + color_seg[color_mask != 0] * 0.5
img_merge = img_merge.astype(np.uint8)
img_merge = cv2.resize(img_merge, (width, height),
                       interpolation=cv2.INTER_LINEAR)
for i in range(boxes.shape[0]):
    x1, y1, x2, y2, conf, label = boxes[i]
    x1, y1, x2, y2, label = int(x1), int(y1), int(x2), int(y2), int(label)
    img_merge = cv2.rectangle(img_merge, (x1, y1), (x2, y2), (0, 255, 0), 2, 2)

# da: resize to original size
da_seg_mask = (da_seg_mask * 255).astype(np.uint8)
da_seg_mask = cv2.resize(da_seg_mask, (width, height),
                         interpolation=cv2.INTER_LINEAR)
# ll: resize to original size
ll_seg_mask = (ll_seg_mask * 255).astype(np.uint8)
ll_seg_mask = cv2.resize(ll_seg_mask, (width, height),
                         interpolation=cv2.INTER_LINEAR)

cv2.imwrite(save_merge_path, img_merge)
cv2.imwrite(save_da_path, da_seg_mask)
cv2.imwrite(save_ll_path, ll_seg_mask)
print("detect done.")
(Figure: merged result, output.jpg)
As the figure above shows, all three visual tasks produce good results, even at night; YOLOP is a very capable model.
5. Summary
This article reviewed the classic multi-task network YOLOP, showed how a single model can output multiple results at the same time, and completed its deployment on the X3. Readers who are interested can go on to try other multi-task networks.