I copied the MATLAB code from: http://www.mathworks.fr/fr/help/vision/ug/stereo-image-rectification.html
I can compute the 3D coordinates, but I am not sure whether they are correct.
Starting from the disparity map and calculating the 3D coordinates, how do we take into account the warping transforms tform1 and tform2?
The problem here is that you are using uncalibrated cameras. In this case you can get up-to-scale reconstruction, but if you want the 3D points in world units, you would need to know actual distances to some points in the world.
I think you would be better off calibrating your stereo system. Please see this example.
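For example, if you do know the real distance between two points that appear in your reconstruction, you can rescale the whole point cloud. A minimal sketch; points3D, idx1, idx2, and knownDistanceMM are placeholder names for your own data, not part of the linked example:

% Rescale an up-to-scale reconstruction using one known real-world distance.
reconDistance = norm(points3D(idx1,:) - points3D(idx2,:));  % distance in arbitrary reconstruction units
scaleFactor = knownDistanceMM / reconDistance;              % millimeters per reconstruction unit
points3D_mm = points3D * scaleFactor;                       % whole cloud is now in millimeters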
First I will try to explain what I have to do, and then I will ask my question about the problem.
My task is to detect small balls (2mm) in gelatine using two webcams.
The steps for detection are these:
Image capture using two webcams (positioned at 90 degrees to each other)
Stereo calibration of each pair of images
Masking of the areas in the images which are not necessary to analyse
Rectification of each pair of images
Circle detection, resulting in a structure with the positions (x, y) of the center of each circle (in reality, of each ball)
Association of the resulting positions to get 3D coordinates, so the positions of the balls are known (this is my problem)
Now the problem (step 6.):
What options are there to compute the 3D coordinates of each ball center from the 2D coordinates of the two images?
I'm searching here for ideas:
http://de.mathworks.com/help/vision/stereo-vision.html
but I hope you know of an easier way.
I cannot upload any images (because I'm new on Stack Overflow).
Take a look at this example: Depth Estimation from Stereo Video. The example takes a pair of images with a calibrated stereo camera, rectifies the images, detects a person, and gets the 3D coordinates of the centroid of the person. You can do the same thing to find the balls.
Calibrate the cameras using the Stereo Camera Calibrator app.
Take two images
Rectify the images using the rectifyStereoImages function
Compute stereo disparity using the disparity function
Get the 3D coordinates for every pixel using the reconstructScene function
Detect the balls in image 1
Look up the 3D coordinates of their centroids
Once you have the disparity map and the dense reconstruction from reconstructScene, there is no need to find correspondences between the images; disparity already did that for you.
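Here is a rough sketch of that pipeline in MATLAB; I1, I2, stereoParams, and the radius range passed to imfindcircles are placeholders you would replace with your own data:

% Rectify the calibrated pair, compute disparity, and reconstruct dense 3D points.
[J1, J2] = rectifyStereoImages(I1, I2, stereoParams);
disparityMap = disparity(rgb2gray(J1), rgb2gray(J2), 'DisparityRange', [0 64]);
points3D = reconstructScene(disparityMap, stereoParams);   % M-by-N-by-3, in calibration units

% Detect the balls in rectified image 1 and look up their 3D coordinates.
[centers, radii] = imfindcircles(J1, [3 10]);
centers = round(centers);
for k = 1:size(centers, 1)
    xyz = squeeze(points3D(centers(k,2), centers(k,1), :));  % row index = y, column index = x
    fprintf('Ball %d: X=%.1f Y=%.1f Z=%.1f\n', k, xyz(1), xyz(2), xyz(3));
end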
I have a 512x512x313 volume of DICOM images and a point represented in world coordinates, say (57.7475, 63.4184, 83.1515). How can I get the corresponding pixel coordinate of that world coordinate in MATLAB?
I hate to burst your bubble, but what you are asking for is impossible. The only way I can think of to get a correspondence between real-world coordinates and pixel coordinates is to calibrate the camera that was used to capture the images. Once you know the intrinsic and extrinsic parameters, you have a transformation matrix that can map real-world coordinates to pixel coordinates.
I'm assuming you don't have calibration information for your camera, so an alternative approach is to find which pixels in your image map to which real-world coordinates. You would need to know point correspondences between the real-world points and your image. Once you know these, you can compute the camera transformation matrix through least-squares and then use this matrix to determine which points map from the real world to your image.
Unless you have pixel correspondences for each of your real-world coordinates, it is impossible to do what you're asking.
FWIW, if you want to see the procedure for obtaining the transformation matrix, check out these notes: http://www.peterhillman.org.uk/downloads/whitepapers/calibration.pdf. This was a great starting point for me when I started learning about camera calibration. Take a look at Section #5 (Page #8), as this is what I believe you are looking for... but you will need correspondences between the real-world coordinates and your image.
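If it helps, here is a rough sketch of that least-squares estimation (the direct linear transform). worldPts (N-by-3) and pixelPts (N-by-2), with N >= 6, are hypothetical correspondences you would have to supply yourself:

% Build the DLT system: two equations per world<->pixel correspondence.
N = size(worldPts, 1);
A = zeros(2*N, 12);
for i = 1:N
    X = worldPts(i,1); Y = worldPts(i,2); Z = worldPts(i,3);
    u = pixelPts(i,1); v = pixelPts(i,2);
    A(2*i-1,:) = [X Y Z 1  0 0 0 0  -u*X -u*Y -u*Z -u];
    A(2*i,  :) = [0 0 0 0  X Y Z 1  -v*X -v*Y -v*Z -v];
end
[~, ~, V] = svd(A);
P = reshape(V(:,end), 4, 3)';   % 3-by-4 camera projection matrix, defined up to scale

% Map a world point to pixel coordinates with the estimated matrix.
p = P * [worldPts(1,:) 1]';
uv = p(1:2) / p(3);             % divide by the homogeneous coordinate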
Good luck!
I have problems understanding the depth (Z) value in the 3D point cloud resulting from sparse 3D reconstruction, as in this example in MATLAB: http://www.mathworks.com/help/vision/ug/sparse-3-d-reconstruction-from-multiple-views.html
I have attached a picture showing the reconstructed 3D point cloud in the above example. I have put some datatips on the figure so we know the (x, y, z) coordinates of the points. Here are my questions:
1. What does the Z value in the point cloud represent? Is it the distance in millimeters from the camera? If so, it does not make sense based on the picture I attached, since I am sure the distance of the sphere and the checkerboard from the camera must be greater than 200 mm.
Or maybe it is measured from some reference point in space? Then what is this reference point, and how can I make a 3D point cloud in which the Z values indicate the distance from the camera?
2. Why are there negative values for Z? What does that mean in terms of distance to the camera?
I would appreciate it if someone could explain.
In this example the world coordinates are defined by the checkerboard. The checkerboard defines the X-Y plane, and the Z-axis points into the checkerboard, as explained in the documentation.
Since your 3D points are above the checkerboard, they have negative Z-coordinates.
Your (x, y, z) coordinates are in world units, which are disconnected from metric values unless you establish a scale between world units and metric units (there are various methods to do this). So the z value tells you the depth of each point in world coordinates.
If you have the pose of each camera, you can transform each point by that camera's extrinsic matrix [R|t] to get the (x', y', z') coordinates in the camera frame. At that point, if z' is negative, the point is behind the camera.
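A minimal sketch of that check, assuming you have the pose of a camera as a rotation matrix R and translation vector t that map world coordinates into the camera frame (MATLAB's extrinsics convention, with points as row vectors):

camPoints = worldPoints * R + t;   % worldPoints is N-by-3; camPoints is in camera coordinates
behindCamera = camPoints(:,3) < 0; % negative z' means the point lies behind the camera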
Using stereo vision, and based on the Multiple View Geometry book (http://www.robots.ox.ac.uk/~vgg/hzbook/), I have created a 3D point cloud in MATLAB. To do that, I first calibrated the cameras and rectified the stereo images. Then I performed feature extraction and matching, and eliminated the noisy matches based on the camera locations. Finally, I created the 3D point cloud using triangulation.
Now my question is: how do I convert this 3D point cloud from the pixel domain to an actual millimeter/centimeter domain, knowing my focal length and camera calibration matrices?
The goal is to find DEPTH IN MILLIMETERS.
I know how to do it in the disparity/depth-map case, using the formula Z = (t*f)/d.
But here in the sparse case, can I do something like this? http://matlab.wikia.com/wiki/FAQ#How_do_I_measure_a_distance_or_area_in_real_world_units_instead_of_in_pixels.3F
Or is there a more sophisticated method with a more in-depth explanation?
Thanks.
The formula you wrote is valid only in the special case when the image planes of the two cameras are on the same geometrical plane, and the motion from one to the other is a translation parallel to one of the image axes.
In the general case you'll need to triangulate actual rays in 3D space, using one of the techniques described in that book (it has a whole chapter on reconstruction). The reconstruction will be metric if your calibration is; in particular, if the coordinate transform between the cameras has a translation vector whose units are meters (or millimeters, or inches, ...).
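With MATLAB's Computer Vision Toolbox, a sketch of metric triangulation could look like the following. It assumes you calibrated with the checkerboard square size given in millimeters, so stereoParams, and therefore the output, is in millimeters; matchedPoints1 and matchedPoints2 are your matched pixel coordinates (N-by-2):

% Triangulate matched points; the output is relative to the optical center of camera 1.
worldPoints = triangulate(matchedPoints1, matchedPoints2, stereoParams);  % N-by-3, in mm
depthsMM = worldPoints(:,3);   % depth of each point along camera 1's optical axis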
I am using the Camera Calibration Toolbox for MATLAB. After calibration I have the intrinsic and extrinsic parameters of the stereo camera system. Next, I would like to determine the distance between the camera system and an object. To get this information, I used the stereo_triangulation function included in the Toolbox. Its inputs are two matrices containing the pixel coordinates of the correspondences in the left and right images.
I tried to get the coordinates of the correspondences using the Basic Block Matching method described in MATLAB's help for Stereo Vision.
The resolution of my pictures is 1280x960 pixels. I know that the largest disparity is around 520 pixels, so I set the maximum of the disparity range to 520. But then determining the coordinates takes ages and is not usable in practice. Calculating the disparity map is much faster using MATLAB's disparity() function, but I want the step before that - the coordinates of the correspondences.
Can you please suggest how I can efficiently get the coordinates with MATLAB?
Disparity and 3D coordinates are related by simple formulas (see below), so the time for calculating the 3D data and the disparity map should be about the same. The notation is:
f - focal length in pixels,
B - separation between cameras,
u, v - column and row coordinates in a system centered on the middle of the image,
d - disparity,
x, y, z - 3D coordinates.
z=f*B/d;
x=z*u/f;
y=z*v/f;
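Applied to a whole disparity map, a vectorized version of these formulas might look like this (f, B, and disparityMap are assumed to be given; pixels with zero or invalid disparity should be masked out first):

[rows, cols] = size(disparityMap);
[c, r] = meshgrid(1:cols, 1:rows);
u = c - cols/2;                 % horizontal coordinate from the image center
v = r - rows/2;                 % vertical coordinate from the image center
z = f * B ./ disparityMap;      % depth, in the same units as B
x = z .* u / f;
y = z .* v / f;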
1280x960 is too large a resolution for any correlation-based stereo to work in real time. Think about it: you have to loop over a 2D image, over a 2D correlation window, and over the range of disparities. That means 5 nested loops! I don't work with MATLAB anymore, but I know that it is quite slow.