Analysis of the Differences Between YUV_I420 and NV12

D-Robotics · September 22, 2023, 6:42am

While deploying to the board recently, I carefully read the sample code in horizon_runtime_sample from the OE package. The samples take local pictures as input, which are bgr/rgb, whereas in actual use a camera delivers nv12 data. The samples therefore include a color space conversion step: the bgr image is converted to nv12, and the result is copied into the input_tensor.

Take 00_quick_start as an example. The color space conversion step of the read_image_2_tensor_as_nv12 function uses this line: cv::cvtColor(mat, yuv_mat, cv::COLOR_BGR2YUV_I420); I had always assumed that YUV_I420 here meant NV12, but something looked a little strange, so I studied it more carefully.

Horizon's technical article Common Image Formats (horizon.cc) helped me a lot. You can read it first to get a basic understanding of these data types.

First, note that both nv12 and nv12_separate inputs must be laid out as yyyyyyyyuvuv before they are fed to model inference. This is the YUV420SP format described in the Horizon technical article. The difference is that nv12 uses one block of memory, while nv12_separate uses two separate memory blocks.
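To make the one-block versus two-block distinction concrete, here is a minimal sketch that splits a contiguous nv12 buffer into two buffers. Nv12Planes and split_nv12 are hypothetical names, not part of the Horizon API, and the sketch assumes that the two blocks hold the y plane and the interleaved uv plane respectively, with even width and height and no row padding.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the two-block form: one buffer for y, one for uv.
struct Nv12Planes {
  std::vector<uint8_t> y;   // width * height bytes
  std::vector<uint8_t> uv;  // width * height / 2 bytes, stored as uvuv...
};

// Split a contiguous nv12 buffer (yyyy...uvuv...) into two separate buffers.
// Returns empty planes if the dimensions or the buffer size are invalid.
Nv12Planes split_nv12(const std::vector<uint8_t> &nv12, int width, int height) {
  Nv12Planes out;
  if (width <= 0 || height <= 0 || width % 2 != 0 || height % 2 != 0) {
    return out;
  }
  const std::size_t y_size = static_cast<std::size_t>(width) * height;
  const std::size_t uv_size = y_size / 2;
  if (nv12.size() != y_size + uv_size) {
    return out;
  }
  const auto split = nv12.begin() + static_cast<std::ptrdiff_t>(y_size);
  out.y.assign(nv12.begin(), split);
  out.uv.assign(split, nv12.end());
  return out;
}
```

The byte order inside each buffer is unchanged; only the memory that holds the bytes differs.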

Second, the sample code uses the cv::cvtColor interface to convert BGR to YUV_I420. This is not nv12/YUV420SP but YUV420P, whose layout is yyyyyyyyuuvv: u and v are not interleaved; all the u values are stored first, followed by all the v values.
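The two layouts can be compared by computing where the i-th u and v samples sit in each buffer. The following is only a sketch: Yuv420Layout and the offset helpers are hypothetical names, and it assumes even width and height with tightly packed rows (no stride padding).

```cpp
#include <cstddef>

// Hypothetical helper describing the plane sizes of a 4:2:0 image.
struct Yuv420Layout {
  std::size_t y_size;       // bytes in the y plane
  std::size_t chroma_size;  // bytes in one chroma plane (u or v)
  std::size_t total_size;   // bytes in the whole buffer
};

// Assumes even width and height and no row padding.
inline Yuv420Layout yuv420_layout(std::size_t width, std::size_t height) {
  Yuv420Layout l;
  l.y_size = width * height;
  l.chroma_size = (width / 2) * (height / 2);
  l.total_size = l.y_size + 2 * l.chroma_size;
  return l;
}

// YUV_I420 (YUV420P): the u plane follows y, the v plane follows u.
inline std::size_t i420_u_offset(const Yuv420Layout &l, std::size_t i) {
  return l.y_size + i;
}
inline std::size_t i420_v_offset(const Yuv420Layout &l, std::size_t i) {
  return l.y_size + l.chroma_size + i;
}

// nv12 (YUV420SP): u and v are interleaved after y as uvuv...
inline std::size_t nv12_u_offset(const Yuv420Layout &l, std::size_t i) {
  return l.y_size + 2 * i;
}
inline std::size_t nv12_v_offset(const Yuv420Layout &l, std::size_t i) {
  return l.y_size + 2 * i + 1;
}
```

For a 4x2 image this gives 8 y bytes and 2 bytes each of u and v, which matches the 12-character patterns yyyyyyyyuuvv and yyyyyyyyuvuv used in this post.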

So where is YUV420P turned into YUV420SP? The answer is in the second half of the read_image_2_tensor_as_nv12 function:

// copy y data
auto data = input->sysMem[0].virAddr;
int32_t y_size = input_h * input_w;
memcpy(reinterpret_cast<uint8_t *>(data), nv12_data, y_size);

// copy uv data
int32_t uv_height = input_h / 2;
int32_t uv_width = input_w / 2;
// destination: input tensor memory right after the y plane
uint8_t *nv12 = reinterpret_cast<uint8_t *>(data) + y_size;
// source: u plane, then v plane, of the YUV_I420 buffer
uint8_t *u_data = nv12_data + y_size;
uint8_t *v_data = u_data + uv_height * uv_width;

// interleave: write one u byte, then one v byte
for (int32_t i = 0; i < uv_width * uv_height; i++) {
  if (u_data && v_data) {
    *nv12++ = *u_data++;
    *nv12++ = *v_data++;
  }
}

In this code, the pointer nv12 points into the input_tensor memory just after the y data, and u_data and v_data point to the u and v components of the input data after it has been converted to YUV_I420. Copying u and v alternately into nv12/input_tensor is what converts YUV_I420 to nv12.
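The same interleaving step can be written as a standalone function for experimenting outside the sample. This is a minimal sketch, not the sample's implementation: i420_to_nv12 is a hypothetical name, it works on std::vector instead of the tensor memory, and it assumes even width and height with no row padding.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Convert a tightly packed YUV_I420 buffer (yyyy...uu...vv...) into nv12
// (yyyy...uvuv...). Returns an empty vector if the input is invalid.
std::vector<uint8_t> i420_to_nv12(const std::vector<uint8_t> &i420,
                                  int width, int height) {
  if (width <= 0 || height <= 0 || width % 2 != 0 || height % 2 != 0) {
    return {};
  }
  const std::size_t y_size = static_cast<std::size_t>(width) * height;
  const std::size_t chroma_size =
      static_cast<std::size_t>(width / 2) * (height / 2);
  if (i420.size() != y_size + 2 * chroma_size) {
    return {};
  }

  std::vector<uint8_t> nv12(i420.size());

  // The y plane is identical in both formats, so copy it as one block.
  std::memcpy(nv12.data(), i420.data(), y_size);

  // Interleave the planar u and v samples into a single uv plane.
  const uint8_t *u_data = i420.data() + y_size;
  const uint8_t *v_data = u_data + chroma_size;
  uint8_t *uv = nv12.data() + y_size;
  for (std::size_t i = 0; i < chroma_size; ++i) {
    uv[2 * i] = u_data[i];
    uv[2 * i + 1] = v_data[i];
  }
  return nv12;
}
```

Feeding it a 4x2 buffer whose u plane is {100, 101} and v plane is {200, 201} should produce a uv plane of {100, 200, 101, 201}, which is a quick way to check the ordering by hand.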

In summary, the u and v components of YUV_I420 are not interleaved; once they are rewritten in interleaved form, the result is nv12 in the usual sense.