Measuring objects in a photo taken by calibrated cameras, knowing the size of a reference object in the photo - matlab

I am writing a program that captures real time images from a scene by two calibrated cameras (so the internal parameters of the cameras are known to us). Using two view geometry, I can find the essential matrix and use OpenCV or MATLAB to find the relative position and orientation of one camera with respect to another. Having the essential matrix, it is shown in Hartley and Zisserman's Multiple View Geometry that one can reconstruct the scene using triangulation up to scale. Now I want to use a reference length to determine the scale of reconstruction and resolve ambiguity.
I know the height of the front wall and I want to use it for determining the scale of reconstruction to measure other objects and their dimensions or their distance from the center of my first camera. How can it be done in practice?
Thanks in advance.
Edit: To add more information, I have already done linear trianglation (minimizing the algebraic error) but I am not sure if it is any useful because there is still a scale ambiguity that I don't know how to get rid of it. My ultimate goal is to recognize an object (like a Pepsi can) and separate it in a rectangular area (which is going to be written as a separate module by someone else) and then find the distance of each pixel in this rectangular area, i.e. the region of interest, to the camera. Then the distance from the camera to the object will be the minimum of the distances from the camera to the 3D coordinates of the pixels in the region of interest.

Might be a bit late, but at least for someone struggling with the same staff.
As far as I remember it is actually linear problem. You got essential matrix, which gives you rotation matrix and normalized translation vector specifying relative position of cameras. If you followed Hartley and Zissermanm you probably chose one of the cameras as origin of world coordinate system. Meaning all your triangulated points are in normalized distance from this origin. What is important is, that the direction of every triangulated point is correct.
If you have some reference in the scene (lets say height of the wall), then you just have to find this reference (2 points are enough - so opposite ends of the wall) and calculate "normalization coefficient" (sorry for terminology) as
coeff = realWorldDistanceOf2Points / distanceOfTriangulatedPoints
Once you have this coeff, just mulptiply all your triangulated points with it and you got real world points.
Example:
you know that opposite corners of the wall are 5m from each other. you find these corners in both images, triangulate them (lets call triangulated points c1 and c2), calculate their distance in the "normalized" world as ||c1 - c2|| and get the
coeff = 5 / ||c1 - c2||
and you get real 3d world points as triangulatedPoint*coeff.
Maybe easier option is to have both cameras in fixed relative position and calibrate them together by stereoCalibrate openCV/Matlab function (there is actually pretty nice GUI in Matlab for that) - it returns not just intrinsic params, but also extrinsic. But I don't know if this is your case.

Related

Is it possible to find the depth of an internal point of an object using stereo images (or any other method)?

I have image of robot with yellow markers as shown
The yellow points shown are the markers. There are two cameras used to view placed at an offset of 90 degrees. The robot bends in between the cameras. The crude schematic of the setup can be referred.
https://i.stack.imgur.com/aVyDq.png
Using the two cameras I am able to get its 3d co-ordinates of the yellow markers. But, I need to find the 3d-co-oridnates of the central point of the robot as shown.
I need to find the 3d position of the red marker points which is inside the cylindrical robot. Firstly, is it even feasible? If yes, what is the method I can use to achieve this?
As a bonus, is there any literature where they find the 3d location of such internal points which I can refer to (I searched, but could not find anything similar to my ask).
I am welcome to a theoretical solution as well(as long as it assures to find the central point within a reasonable error), which I can later translate to code.
If you know the actual dimensions, or at least, shape (e.g. perfect circle) of the white bands, then yes, it is feasible and possible.
You need to do the following steps, which are quite non trivial to do, and I won't do them here:
Optional but extremely suggested: calibrate your camera, and
undistort it.
find the equation of the projection of a 3D circle into a 2D camera, for any given rotation. You can simplify this by assuming the white line will be completely horizontal. You want some function that takes the parameters that make a circle and a rotation.
Find all white bands in the image, segment them, and make them horizontal (rotate them)
Fit points in the corrected white circle to the equation in (1). That should give you the parameters of the circle in 3d (radious, angle), if you wrote the equation right.
Now that you have an analytic equation of the actual circle (equation from 1 with parameters from 3), you can map any point from this circle (e.g. its center) to the image location. Remember to uncorrect for the rotations in step 2.
This requires understanding of curve fitting, some geometric analytical maths, and decent code skills. Not trivial, but this will provide a solution that is highly accurate.
For an inaccurate solution:
Find end points of white circles
Make line connecting endpoints
Chose center as mid point of this line.
This will be inaccurate because: choosing end points will have more error than fitting an equation with all points, ignores cone shape of view of the camera, ignores geometry.
But it may be good enough for what you want.
I have been able to extract the midpoint by fitting an ellipse to the arc visible to the camera. The centroid of the ellipse is the required midpoint.
There will be wrong ellipses as well, which can be ignored. The steps to extract the ellipse were:
Extract the markers
Binarise and skeletonise
Fit ellipse to the arc (found a matlab function for this)
Get the centroid of the ellipse
hsv_img=rgb2hsv(im);
bin=new_hsv_img(:,:,3)>marker_th; %was chosen 0.35
%skeletonise
skel=bwskel(bin);
%use regionprops to get the pixelID list
stats=regionprops(skel,'all');
for i=1:numel(stats)
el = fit_ellipse(stats(i).PixelList(:,1),stats(i).PixelList(:,2));
ellipse_draw(el.a, el.b, -el.phi, el.X0_in, el.Y0_in, 'g');
The link for fit_ellipse function
Link for ellipse_draw function

Verify that camera calibration is still valid

How do you determine that the intrinsic and extrinsic parameters you have calculated for a camera at time X are still valid at time Y?
My idea would be
to use a known calibration object (a chessboard) and place it in the camera's field of view at time Y.
Calculate the chessboard corner points in the camera's image (at time Y).
Define one of the chessboard corner points as world origin and calculate the world coordinates of all remaining chessboard corners based on that origin.
Relate the coordinates of 3. with the camera coordinate system.
Use the parameters calculated at time X to calculate the image points of the points from 4.
Calculate distances between points from 2. with points from 5.
Is that a clever way to go about it? I'd eventually like to implement it in MATLAB and later possibly openCV. I think I'd know how to do steps 1)-2) and step 6). Maybe someone can give a rough implementation for steps 2)-5). Especially I'd be unsure how to relate the "chessboard-world-coordinate-system" with the "camera-world-coordinate-system", which I believe I would have to do.
Thanks!
If you have a single camera you can easily follow the steps from this article:
Evaluating the Accuracy of Single Camera Calibration
For achieving step 2, you can easily use detectCheckerboardPoints function from MATLAB.
[imagePoints, boardSize, imagesUsed] = detectCheckerboardPoints(imageFileNames);
Assuming that you are talking about stereo-cameras, for stereo pairs, imagePoints(:,:,:,1) are the points from the first set of images, and imagePoints(:,:,:,2) are the points from the second set of images. The output contains M number of [x y] coordinates. Each coordinate represents a point where square corners are detected on the checkerboard. The number of points the function returns depends on the value of boardSize, which indicates the number of squares detected. The function detects the points with sub-pixel accuracy.
As you can see in the following image the points are estimated relative to the first point that covers your third step.
[The image is from this page at MATHWORKS.]
You can consider point 1 as the origin of your coordinate system (0,0). The directions of the axes are shown on the image and you know the distance between each point (in the world coordinate), so it is just the matter of depth estimation.
To find a transformation matrix between the points in the world CS and the points in the camera CS, you should collect a set of points and perform an SVD to estimate the transformation matrix.
But,
I would estimate the parameters of the camera and compare them with the initial parameters at time X. This is easier, if you have saved the images that were used when calibrating the camera at time X. By repeating the calibrating process using those images you should get very similar results, if the camera calibration is still valid.
Edit: Why you need the set of images used in the calibration process at time X?
You have a set of images to do the calibrations for the first time, right? To recalibrate the camera you need to use a new set of images. But for checking the previous calibration, you can use the previous images. If the parameters of the camera are changes, there would be an error between the re-estimation and the first estimation. This can be used for evaluating the validity of the calibration not for recalibrating the camera.

Azimuth and Elevation from one moving object to another, relative to first object's orientation

I have two moving objects with position and orientation data (Euler Angles, Quaternions) relative to ECI coordinate frame. I would like to calculate AZ/EL from what I'm guessing is the "body frame" of the first object. I have attempted to convert both objects into the body frame through rotation matrices (X-Y-Z and Z-Y-X rotation sequence) and calculate a target vector AZ/EL this way but have not had success. I've also tried to get body frame positions and calculate the body axis/angles and convert back to Eulers (relative to body frame). I'm never sure how the coordinate system axes I'm supposedly creating are aligned along my object.
I have found a couple other questions like this answered with Quaternion solutions so that may be the best path to take, however my Quaternion background is weak at best and I fail to see how the computations given result in a target vector.
Any advice would be much appreciated and I'm happy to share my progress/experiences going forward.
get the current NEH transform matrix for the moving object
you must know position and at least two directions from North,East,Height(Up or Altitude) of the moving object otherwise is your problem unsolvable no matter what. This matrix/frame is called NEH (X=North,Y=East,Z=Height) or sometimes also ENU (X=East,Y=North,Z=Up). Look here transform matrix anatomy and here Earth's NEH construction and change the position and radius to match your moving object.
convert point P0 from GCS (global coordinate system) to NEH
simply: P1=Inverse(NEH)*P0 where P1 is now in NEH LCS (Local coordinate system). Both P0,P1 are in homogenous coordinates { x,y,z,w=1 } to allow multiplications with 4x4 matrix so you can compute azimut and elevation directly from it:
Azimut=atanxy(P1.x,P1.y);
Elevation=atan(P1.z/sqrt((P1.x*P1.x)+(P1.y*P1.y)));
where atanxy is mine atan2 (4 quadrant atan) first is dx then dy. I think atan2 in matlab has it in reverse.
[Notes]
Always visually check all frames (especially NEH). Just draw the 3 axises as lines of some length to validate if the result is correct. It should look like on image, just different color for each axis. You can move to next point only if NEH is OK !!!
Check atan2/atanxy operands order and also check goniometric functions units (rad,deg) to avoid confusions.

Use calibrated camera get matched points for 3D reconstruction

I am trying to compute the 3D coordinates from several pair of two view points.
First, I used the matlab function estimateFundamentalMatrix() to get the F of the matched points (Number > 8) which is:
F1 =[-0.000000221102386 0.000000127212463 -0.003908602702784
-0.000000703461004 -0.000000008125894 -0.010618266198273
0.003811584026121 0.012887141181108 0.999845683961494]
And my camera - taken these two pictures - was pre-calibrated with the intrinsic matrix:
K = [12636.6659110566, 0, 2541.60550098958
0, 12643.3249022486, 1952.06628069233
0, 0, 1]
From this information I then computed the essential matrix using:
E = K'*F*K
With the method of SVD, I finally got the projective transformation matrices:
P1 = K*[ I | 0 ]
and
P2 = K*[ R | t ]
Where R and t are:
R = [ 0.657061402787646 -0.419110137500056 -0.626591577992727
-0.352566614260743 -0.905543541110692 0.235982367268031
-0.666308558758964 0.0658603659069099 -0.742761951588233]
t = [-0.940150699101422
0.320030970080146
0.117033504470591]
I know there should be 4 possible solutions, however, my computed 3D coordinates seemed to be not correct.
I used the camera to take pictures of a FLAT object with marked points. I matched the points by hand (which means there should not be obvious mistake exists about the raw material). But the result turned out to be a surface with a little bit banding.
I guess this might be due to the reason pictures did not processed with distortions (but actually I remember I did).
I just want to know whether this method to solve the 3D reconstruction issue right? Especially when we already know the camera intrinsic matrix.
Edit by JCraft at Aug.4: I have redone the process and got some pictures showing the problem, I will write another question with detail then post the link.
Edit by JCraft at Aug.4: I have posted a new question: Calibrated camera get matched points for 3D reconstruction, ideal test failed. And #Schorsch really appreciate your help formatting my question. I will try to learn how to do inputs in SO and also try to improve my gramma. Thanks!
If you only have the fundamental matrix and the intrinsics, you can only get a reconstruction up to scale. That is your translation vector t is in some unknown units. You can get the 3D points in real units in several ways:
You need to have some reference points in the world with known distances between them. This way you can compute their coordinates in your unknown units and calculate the scale factor to convert your unknown units into real units.
You need to know the extrinsics of each camera relative to a common coordinate system. For example, you can have a checkerboard calibration pattern somewhere in your scene that you can detect and compute extrinsics from. See this example. By the way, if you know the extrinsics, you can compute the Fundamental matrix and the camera projection matrices directly, without having to match points.
You can do stereo calibration to estimate the R and the t between the cameras, which would also give you the Fundamental and the Essential matrices. See this example.
Flat objects are critical surfaces, not possible to achive your goal from them. try adding two (or more) points off the plane (see Hartley and Zisserman or other text on the matter if still interested)

MATLAB - What are the units of Matlab Camera Calibration Toolbox

When showing the extrinsic parameters of calibration (the 3D model including the camera position and the position of the calibration checkerboards), the toolbox does not include units for the axes. It seemed logical to assume that they are in mm, but the z values displayed can not possibly be correct if they are indeed in mm. I'm assuming that there is some transformation going on, perhaps having to do with optical coordinates and units, but I can't figure it out from the documentation. Has anyone solved this problem?
If you marked the side length of your squares in mm, then the z-distance shown would be in mm.
I know next to nothing about matlabs (not entirely true but i avoid matlab wherever I can, and that would be almost always possible) tracking utilities but here's some general info.
Pixel dimension on the sensor has nothing to do with the size of the pixel on screen, or in model space. For all purposes a camera produces a picture that has no meaningful units. A tracking process is unaware of the scale of the scene. (the perspective projection takes care of that). You can re insert a scale by taking 2 tracked points and measuring the distance between those points. This is the solver spaces distance is pretty much arbitrary. Now if you know the real distance between these points you can get a conversion factor. By doing:
real distance / solver space distance.
There's really now way to knowing this distance form the cameras settings as the camera is unable to differentiate between different scales of scenes. So a perfect 1:100 replica is no different for the solver than the real deal. So you must allays relate to something you can measure separately for each measuring session. The camera always produces something that's relative in nature.