View my account

finding 3d coordinates of a given point which is specified in the 2d pixel coordinates

Comments

14 comments

  • MartyG

    Hi Zahid Iqbal  When aligning depth to color with the align_to instruction, the center-line of the color sensor becomes the 0, 0, 0 origin of depth.  When depth and color are not aligned, the 0,0,0 origin of depth is the center-line of the left infrared sensor.

     

    The -0.000172043 value for the Z-depth coordinate of the script does not sound correct.  I would recommend printing the value of the dist variable (which should be the 3D real-world distance from the camera of that coordinate in meters) to see what it says.  I would expect it to be more like the 2.383 value that you got from the Viewer's pointcloud.

     

    I would recommend not comparing the outputs of the Method 1 script and the Viewer directly like-for-like though as they use different alignment methods to combine depth and color.  The Method 1 script uses the align_to instruction, whilst the Viewer's pointcloud maps depth and color together with the map_to instruction (demonstrated in the Method 3 script).

     

    I would suggest trying the Method 3 script to see how its coordinate output compares to the Viewer.

    0
    Comment actions Permalink
  • Zahid Iqbal

    Following up the previous post, I have tried to use the second approach

    Method 2 – do not align, use the original depth and color images, rs2_project_color_pixel_to_depth_pixel, then rs2_deproject_pixel_to_point.

    Here is the code that I have used,

    void realSenseWrapper :: getRealSense3DPoint(float result3DPoint[3], int c, int r)
    {


    //const rs2::depth_frame& depth, rs2_intrinsics* depth_intrin, rs2_intrinsics* color_intrin, rs2_extrinsics* color_extrin_to_depth, rs2_extrinsics* depth_extrin_to_color)


    rs2::config cfg;
    cfg.enable_stream(RS2_STREAM_DEPTH, 1280, 720, RS2_FORMAT_Z16, 15);
    cfg.enable_stream(RS2_STREAM_COLOR, 1280, 720, RS2_FORMAT_RGBA8, 15);


    selection = pipe.start(cfg);
    const rs2::stream_profile color_profile = selection.get_stream(RS2_STREAM_COLOR);
    const rs2::stream_profile depth_profile = selection.get_stream(RS2_STREAM_DEPTH);

     


    auto depth_intrin = (depth_profile.as<rs2::video_stream_profile>().get_intrinsics());
    auto color_intrin = (color_profile.as<rs2::video_stream_profile>().get_intrinsics());
    auto depth_extrin_to_color = (depth_profile.as<rs2::video_stream_profile>().get_extrinsics_to(color_profile));
    auto color_extrin_to_depth = (color_profile.as<rs2::video_stream_profile>().get_extrinsics_to(depth_profile));


    rs2::frameset frames = pipe.wait_for_frames();
    rs2::depth_frame depth = frames.get_depth_frame();

     


    float rgb_pixel[2], depth_pixel[2];
    int int_depth_pixel[2];
    rgb_pixel[0] = float(c); // col in RGB frame
    rgb_pixel[1] = float(r); // row in RGB frame


    float depth_scale = depth.get_units();


    rs2_project_color_pixel_to_depth_pixel(depth_pixel,
    reinterpret_cast<const uint16_t*>(depth.get_data()),
    depth_scale,
    0.20, 2.0, // From 0 to 2 meters Min to Max depth
    &depth_intrin,
    &color_intrin,
    &color_extrin_to_depth,
    &depth_extrin_to_color, rgb_pixel);


    int_depth_pixel[0] = (int)round(depth_pixel[0]);
    int_depth_pixel[1] = (int)round(depth_pixel[1]);
    depth_pixel[0] = int_depth_pixel[0];
    depth_pixel[1] = int_depth_pixel[1];


    auto dist = depth.get_distance(static_cast<int>(depth_pixel[0]), static_cast<int>(depth_pixel[1]));


    rs2_deproject_pixel_to_point(result3DPoint, &depth_intrin, depth_pixel, dist);


    pipe.stop();
    }

    However, my output is the following for the 

    getRealSense3DPoint(result3DPoint, 640, 360);

     3D coordinates: (0.0987119,-0.0107843,2.464)

    Which does not seem to agree with the reading from the realsense viewer.

    Please, any comments for this.

     

    thanks,

     

    0
    Comment actions Permalink
  • Zahid Iqbal

    Hi MartyG,

    about your comment, I have the following outputs

    pixel[0]: 640, pixel[1]: 360
    dist: 0
     center of the image in color coordinates: (-0,-0,0)
     center of the image in depth coordinates: (0.0590469,-0.000145956,-0.000172043)

    The image is (1280 by 720), so that is the center pixel of the image.

    Here is the code:

    void realSenseWrapper :: important_untested_function(float point_in_color_coordinates[3], float point_in_depth_coordinates[3])
    {
    rs2::align align_to(RS2_STREAM_COLOR);
    rs2::config cfg;
    cfg.enable_stream(RS2_STREAM_DEPTH, 1280, 720, RS2_FORMAT_Z16, 15);
    cfg.enable_stream(RS2_STREAM_COLOR, 1280, 720, RS2_FORMAT_RGBA8, 15);

    selection = pipe.start(cfg);

    while (true)
    {
    // Wait for the next set of frames from the camera
    auto frames = pipe.wait_for_frames();

    auto depth = frames.get_depth_frame();
    auto color = frames.get_color_frame();

    rs2_extrinsics color_extrin_to_depth = color.get_profile().as<rs2::video_stream_profile>().get_extrinsics_to(depth.get_profile().as<rs2::video_stream_profile>());

    frames = frames.apply_filter(align_to);
    auto color_aligned = frames.get_color_frame();
    auto depth_aligned = frames.get_depth_frame();

    rs2_intrinsics intr_depth = depth_aligned.get_profile().as<rs2::video_stream_profile>().get_intrinsics();

    // pixel in color coordinate system
    float u = 0.5;
    float v = 0.5;

    int w = color_aligned.get_width();
    int h = color_aligned.get_height();

    int c = w * u;
    int r = h * v;

    // assume depth and color same size after alignment, find depth pixel in color coordinate system
    float pixel[2];
    pixel[0] = c;
    pixel[1] = r;

    cout << "pixel[0]: " << c <<", pixel[1]: " << r << endl;

    //float point_in_color_coordinates[3]; // 3D point in color sensor coordinates
    //float point_in_depth_coordinates[3]; // 3D point in depth sensor coordinates

    auto dist = depth.get_distance(static_cast<int>(pixel[0]), static_cast<int>(pixel[1]));

    cout << "dist: " << dist << endl;

    rs2_deproject_pixel_to_point(point_in_color_coordinates, &intr_depth, pixel, dist);
    rs2_transform_point_to_point(point_in_depth_coordinates, &color_extrin_to_depth, point_in_color_coordinates);

    break;
    }
    }

     

     

    0
    Comment actions Permalink
  • MartyG

    If you received a z-depth 0f 2.464 from Method 2 (the same as the Viewer) and Method 1 is reporting 0 for Z-depth then it seems that it would be best to drop Method 1 and use Method 2 instead.

     

    Using the align_to instruction (as Method 1 does) introduces the possibility of inaccuracy caused by the alignment process.  So Method 2, which does not use alignment and instead makes use of intrinsics, may provide more reliable results.

     

    Another factor to bear in mind is that higher resolutions provide greater accuracy, whilst lower resolutions such as 640x480 will result in lower accuracy.

    0
    Comment actions Permalink
  • Zahid Iqbal

    MartyG, for the third approach with point cloud,

    float u = 0.609375; // 780
    float v = 0.488889; // 352

    shouldn't these be

    float u = 0.5;
    float v = 0.5;

    instead? Because, we need the center pixel of the image. I received the exactly same results as with the first approach.

     3D coordinates: (0.0590469,-0.000145956,-0.000172043)
     center of the image in color coordinates: (-0,-0,0)

     
    0
    Comment actions Permalink
  • MartyG

    The coordinates of the center pixel of an image at a particular resolution will be exactly half of that resolution.  So if 1280x720 resolution was being used then the center XY pixel coordinate would be 640, 360, whilst the center of 640x480 would be 320, 240.

     

    In regard to the values of floats u and v, I do not have information about why those particular values were chosen by the Intel RealSense team member who provided the script.  However, a RealSense user at the link below who also read #11031 does use 0.5 for u and v in their script.

    https://github.com/IntelRealSense/librealsense/issues/12067#issuecomment-1676895454

    0
    Comment actions Permalink
  • Zahid Iqbal

    MartyG

    Following some threads, I have the following piece of code, which seems to give me better results, but still with some inaccuracy in one of the dimensions. First I will write the code, and then I will report the results that I received.

    rs2::align align_to(RS2_STREAM_COLOR);
    rs2::config cfg;

    cfg.enable_stream(RS2_STREAM_DEPTH, 1280, 720, RS2_FORMAT_Z16, 30);    
    cfg.enable_stream(RS2_STREAM_COLOR, 1280, 720, RS2_FORMAT_BGRA8, 30);

    selection = pipe.start(cfg);

    while(true)
    {
     auto frames = pipe.wait_for_frames();        
    auto aligned_frameset = align_to.process(frames);

     rs2::video_frame color_stream = aligned_frameset.get_color_frame();        
    rs2::depth_frame aligned_depth_stream = aligned_frameset.get_depth_frame();

     rs2::video_stream_profile depth_stream_profile = aligned_depth_stream.get_profile().as<rs2::video_stream_profile>();        
    const auto depth_intrinsics = depth_stream_profile.get_intrinsics();

     const int w = color_stream.as<rs2::video_frame>().get_width();          
    const int h = color_stream.as<rs2::video_frame>().get_height();

     rs2_extrinsics color_extrin_to_depth = color_stream.get_profile().as<rs2::video_stream_profile>().get_extrinsics_to(aligned_depth_stream.get_profile().as<rs2::video_stream_profile>());
     
    // pixel in color coordinate system        
    float u = 0.5;        
    float v = 0.5;

     // assume depth and color same size after alignment, find depth pixel in color coordinate system
     float pixel[2];        
    pixel[0] = c;        
    pixel[1] = r;

     float dist = aligned_depth_stream.get_distance(static_cast<int>(pixel[0]), static_cast<int>(pixel[1]));
     rs2_deproject_pixel_to_point(point_in_color_coordinates, &depth_intrinsics, pixel, dist);        
    rs2_transform_point_to_point(point_in_depth_coordinates, &color_extrin_to_depth, point_in_color_coordinates);


    break;
    }



    This is the result that I get:

    dist: 2.494
     center of the image in color coordinates: (-0.0488563,-0.00127967,2.494)
     center of the image in depth coordinates: (-0.0488563,-0.00127967,2.494)

    And, from the realsense viewer, I get the following: (0.003, -0.004, 2.490)

    error in the x-coord: 45.8563 mm,

    error in the y-coord: 2.72033 mm,

    error in the z-coord: 4 mm

    The error in the x direction is still quite big about 4.5 cm. Please, can you comment on that.

    Thanks,

     

     

     

     

     

     

    0
    Comment actions Permalink
  • Zahid Iqbal

    MartyG,

    Are there any further settings that could help improve the accuracy. Also, how to run/activate the emitter. I read at another place, this helps the depth measurements, and probably will lead to better estimates. 

    0
    Comment actions Permalink
  • MartyG

    If you are using C++ then you can enable the emitter with the instruction RS2_OPTION_EMITTER_ENABLED.  A C++ script for doing so can be found at the link below under the librealsense2 heading.

    https://dev.intelrealsense.com/docs/api-how-to#controlling-the-laser

     

    Setting a Visual Preset camera configuration called Medium Density provides a good balance between accuracy and a high amount of detail on the depth image.  The link below has C++ code for setting a preset.

    https://github.com/IntelRealSense/librealsense/issues/9071#issuecomment-878552273

     

    In the code, change RS2_RS400_VISUAL_PRESET_HIGH_ACCURACY to RS2_RS400_VISUAL_PRESET_MEDIUM_DENSITY  to set the Medium Density preset.  If High Accuracy is used then it tends to greatly reduce the amount of detail on the depth image.

    0
    Comment actions Permalink
  • Zahid Iqbal

    MartyG thanks,

    I will check these details. Something very strange just happened. I do not seem to change anything, but the same code is returning (0,0,0). I wonder, did I reset something.

     

     

    0
    Comment actions Permalink
  • Wmaass88

    Marty,

    Regarding the visual preset, in c++ if you do not set/specify this is there a default preset that is used?

    0
    Comment actions Permalink
  • MartyG

    If a preset is not specified then the default values for camera settings should be automatically applied.

    0
    Comment actions Permalink
  • Zahid Iqbal

    MartyG

    I have run my program for a few times. The error I receive between what my function returns and from the image center in realsense-viewer is usually high along the x-coordinate, ranging anywhere between 30~60 millimeters. On the y-coordinate is minimal, but on the z-coordinate is anywhere between 4  ~ 29 mm.

     

     

     

    0
    Comment actions Permalink
  • MartyG

    In programs that use the instruction get_distance() like yours does, the X coordinate tends to be accurate at the center of the image but become increasingly inaccurate as you move away from the image center towards its outer edges. This can be caused if the incorrect intrinsics are used. 

     

    For example, if aligning depth to color then the color intrinsics should be used instead of the depth intrinsics.  This is because when depth is aligned to color, the origin of depth becomes the center-line of the RGB sensor.

    0
    Comment actions Permalink

Please sign in to leave a comment.