Vector based implementation of "rs2_deproject_pixel_to_point" function
I have code that gets the real-world coordinates of a pixel by looping through every pixel and calculating each coordinate in turn. I want a vectorized implementation of the same function so that it can be GPU-optimised. Here is what I am doing currently:
def filter_data(depth_image,color_image):
    #This Function is used to filter out the pixels which are in the vertical range of 1m of the camera
    h,w = np.shape(depth_image)
    depth_scale = depth_sensor.get_depth_scale()
    #Getting the intrinsic parameters of the camera for the calculation of the xyz values
    depth_intr = depth_frame_1.profile.as_video_stream_profile().get_intrinsics()
    for y in range(h):
        for x in range(w):
            # The Function rs2_deproject_pixel_to_point() returns the realtime xyz values corresponding to the pixel
            coordinate = rs.rs2_deproject_pixel_to_point(depth_intr,[y,x],depth_image[y,x]*depth_scale)
            if abs(coordinate[0]) > 0.5 :
                depth_image[y,x] = 65000.0
                color_image[y,x] = 0
There is a lag of almost 2 seconds when I run this on my PC. I want to run this code on the i.MX 8M Mini processor on my drone, where I expect the latency to be even higher, so I want to use the GPU for this function.
This is what I tried:
@jit(target_backend='cuda')
def filter_data(depth_image):
    #This Function is used to filter out the pixels which are in the vertical range of 1m of the camera
    h, w = depth_image.shape
    # Create arrays of pixel coordinates
    y_coords, x_coords = np.meshgrid(np.arange(h), np.arange(w), indexing='ij')
    depth_scale = depth_sensor.get_depth_scale()
    #Getting the intrinsic parameters of the camera for the calculation of the xyz values
    depth_intr = depth_frame_1.profile.as_video_stream_profile().get_intrinsics()
    # Calculate realworld coordinates for all pixels in one go
    pixel_coords = np.stack((y_coords, x_coords), axis=1) # we are getting coordinates in (y, x, z) because we have given the input image_pixel_coordinates in the form [y, x]
    depth_values = depth_image * depth_scale
    real_coords = rs.rs2_deproject_pixel_to_point(depth_intr, pixel_coords, depth_values)
    condition = np.any(np.abs(real_coords) > 0.5, axis=1)
    depth_image[condition] = 65000.0
But I got an error:
TypeError: rs2_deproject_pixel_to_point(): incompatible function arguments. The following argument types are supported: 1. (intrin: pyrealsense2.pyrealsense2.intrinsics, pixel: List[float[2]], depth: float) -> List[float[3]]
which means that the function expects the pixel argument to be a single list of two floats and the depth to be a single float, but I am trying to pass NumPy arrays.
How can I optimise this code for the GPU, or make it run faster on the CPU?
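For reference, here is a minimal NumPy sketch of what a vectorized version could look like. It assumes the depth stream has no lens distortion, so that deprojection reduces to the pinhole formula x = (u - ppx) / fx * z and y = (v - ppy) / fy * z. The function names and the choice to pass fx, fy, ppx and ppy as plain numbers are illustrative and not part of pyrealsense2; the values themselves would be read from the intrinsics object. It also assumes pixel x is the column index and that "vertical" means the camera Y axis, which differs from the [y, x] ordering in the loop version.

```python
import numpy as np

def deproject_depth_image(depth_image, depth_scale, fx, fy, ppx, ppy):
    """Deproject every pixel at once (assumption: pinhole model, no distortion)."""
    h, w = depth_image.shape
    u = np.arange(w, dtype=np.float32)  # column index = pixel x
    v = np.arange(h, dtype=np.float32)  # row index = pixel y
    z = depth_image.astype(np.float32) * depth_scale  # depth in metres
    # Broadcasting expands the 1-D coordinate vectors across the whole image.
    x = (u[np.newaxis, :] - ppx) / fx * z
    y = (v[:, np.newaxis] - ppy) / fy * z
    return x, y, z

def filter_data_vectorized(depth_image, color_image, depth_scale,
                           fx, fy, ppx, ppy, limit=0.5):
    """Mask pixels whose point lies more than `limit` metres above or below the camera.

    Assumption: the vertical offset is the camera Y coordinate.
    Both images are modified in place, as in the loop version.
    """
    _, y, _ = deproject_depth_image(depth_image, depth_scale, fx, fy, ppx, ppy)
    mask = np.abs(y) > limit
    depth_image[mask] = 65000
    color_image[mask] = 0
    return mask
```

This replaces the per-pixel Python calls with a handful of whole-array operations, which is usually the main cost on the CPU. If the stream's distortion model is not "none" with zero coefficients, the result will differ from rs2_deproject_pixel_to_point.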

Hi Dhruvdarda2001, instead of looping through each pixel to get an XYZ coordinate, it will use less processing power if you target one specific pixel with the rs2_project_color_pixel_to_depth_pixel instruction, which converts a 2D color pixel to a 3D depth pixel. Using this method, it is not necessary to process the entire image. The link below demonstrates use of this instruction in Python.
https://github.com/IntelRealSense/librealsense/issues/5603#issuecomment-574019008
Official documentation about this instruction can be found here:

Thanks MartyG, that is insightful!
So, is there a function that does exactly the opposite of "rs2_project_color_pixel_to_depth_pixel"? That is, is there any pre-built function that can give me the (pixel_x, pixel_y) from an approximate (real_world_x, real_world_y) that I can get from other sources?
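To make the question concrete, the mapping I am after can be sketched as follows, assuming an undistorted pinhole model. The function name and the plain-number parameters are hypothetical, not a librealsense API. Note that the projection needs a full 3D point: without a depth z, a real-world (x, y) alone does not determine a pixel.

```python
import numpy as np

def project_points_to_pixels(points, fx, fy, ppx, ppy):
    """Project 3D camera-space points to pixel coordinates.

    Assumptions: pinhole model without distortion, and z > 0 for every point.
    `points` is a single (x, y, z) triple or an (N, 3) array, in metres.
    Returns an (N, 2) array of (pixel_x, pixel_y).
    """
    points = np.asarray(points, dtype=np.float64).reshape(-1, 3)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    u = x / z * fx + ppx  # pixel x (column)
    v = y / z * fy + ppy  # pixel y (row)
    return np.stack((u, v), axis=1)
```

For a single point, librealsense also exposes rs2_project_point_to_pixel, which takes the intrinsics and a 3D point and additionally handles the stream's distortion model.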

Thanks MartyG! This is helpful.
