Creating stereoParameters class in MATLAB: what coordinate system should be used for the relative camera rotation parameter?

stereoParameters takes two extrinsic parameters: RotationOfCamera2 and TranslationOfCamera2.
The problem is that the documentation is not very detailed about what RotationOfCamera2 really means; it only says: Rotation of camera 2 relative to camera 1, specified as a 3-by-3 matrix.
What is the coordinate system in this case?
A rotation matrix can be specified in any coordinate system.
What exactly does "the coordinate system of Camera 1" mean? What are its x, y, z axes?
In other words, if I calculate the essential matrix, how can I get the corresponding RotationOfCamera2 and TranslationOfCamera2 from it?

RotationOfCamera2 and TranslationOfCamera2 describe the transformation from camera1's coordinates into camera2's coordinates. A camera's coordinate system has its origin at the camera's optical center. Its X and Y-axes are in the image plane, and its Z-axis points out along the optical axis.
Equivalently, the extrinsics of camera 1 are identity rotation and zero translation, while the extrinsics of camera 2 are RotationOfCamera2 and TranslationOfCamera2.
If you have the essential matrix, you can decompose it into a rotation and a translation. Two things to keep in mind: first, the translation is only known up to scale, so t will be a unit vector. Second, the rotation matrix will be the transpose of what you get from estimateCameraParameters, because of the difference in vector-matrix multiplication conventions.
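For illustration, here is a minimal MATLAB sketch of that decomposition, assuming you already have an essential matrix E; it recovers only one of the four possible (R, t) combinations, and the final transpose reflects the convention difference mentioned above:
[U, ~, V] = svd(E);
W = [0 -1 0; 1 0 0; 0 0 1];            % standard auxiliary matrix
R = U*W*V';                            % one of the two rotation candidates (the other uses W')
if det(R) < 0, R = -R; end             % guard against an improper rotation
t = U(:,3);                            % translation direction, unit norm (sign is ambiguous)
% The correct (R, t) pair among the four combinations is the one that places
% triangulated points in front of both cameras.
RotationOfCamera2_candidate = R';      % transposed to match MATLAB's row-vector convention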
Out of curiosity, what is it that you are trying to accomplish? Are you working with a single moving camera? Otherwise, why not use the Stereo Camera Calibrator app to calibrate your cameras, and get rotation and translation for free?

Suppose the left camera's extrinsics with respect to its first checkerboard view (or any other world reference) are rotation R1 and translation T1, and the right camera's extrinsics with respect to the same reference are rotation R2 and translation T2. Then you can calculate the relative pose as follows:
RotationOfCamera2 = R2*R1';
TranslationOfCamera2 = T2 - RotationOfCamera2*T1;
But please note that this calculation uses just one shared checkerboard reference. Internally, MATLAB computes these two parameters from every given pair of checkerboard images and takes the median values as an initial guess; these parameters are later refined by nonlinear optimization, so the final values may differ slightly from the median. But if you have only one reference transformation for both cameras, you should use the formula above. Note that, as Dima said, MATLAB's rotation matrix is the transpose of the usual convention, so I wrote the formula the way the literature does, not in MATLAB's style.
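As a hedged MATLAB sketch of the formula above (R1, T1, R2, T2 are assumed to be column-vector-convention extrinsics, as in the literature; the final transposes convert the result to MATLAB's row-vector convention):
R_rel = R2*R1';                        % rotation of camera 2 relative to camera 1
t_rel = T2 - R_rel*T1;                 % translation of camera 2 relative to camera 1 (3-by-1)
% Converted to the convention stereoParameters expects:
RotationOfCamera2    = R_rel';
TranslationOfCamera2 = t_rel';         % stored as a 1-by-3 row vector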

Related

Find 3D coordinate with respect to the camera using 2D image coordinates

I need to calculate the X,Y coordinates in the world with respect to the camera using u,v coordinates in the 2D image. I am using an S7 edge camera to send a 720x480 video feed to MATLAB.
What I know: Z, i.e. the depth of the object from the camera; the size of the camera pixels (1.4 µm); the focal length (4.2 mm)
Let's say the image point is at (u,v) = (400,400).
My approach is as follows:
Subtract the pixel value of center point (240,360) from the u,v pixel coordinates of the point in the image. This should give us the pixel coordinates with respect to the camera's optical axis (z axis). The origin is now at the center of the image. So new coordinates are: (160, -40)
Multiply the new u,v pixel values with pixel size to obtain the distance of the point from the origin in physical units. Let's call it (x,y). We get (x,y) = (0.224,-0.056) in mm units.
Use the formula X = xZ/f & Y = yZ/f to calculate X,Y coordinates in the real world with respect to the camera's optical axis.
Is my approach correct?
Your approach is going in the right direction, but it would be easier if you used a more standardized approach. What we usually do is use the pinhole camera model, which gives you a transformation between world coordinates [X, Y, Z] and pixel coordinates [x, y]. Take a look at this guide, which describes the process of building that transformation step by step.
Basically, you have to define your internal camera matrix to do the transformation:
K = [fx 0 u0; 0 fy v0; 0 0 1]
fx and fy are your focal lengths expressed in pixel units. You can calculate them from your FOV and the total number of pixels in each direction. Take a look here and here for more info.
u0 and v0 are the principal point: the intersection of the optical axis with the image plane, given in pixel coordinates. Since your pixels are not centered at [0, 0], these parameters represent a translation to the centre of the image.
If you need to, you can also add a skew factor a, which you can use to correct shear effects of your camera. Then the internal camera matrix becomes:
K = [fx a u0; 0 fy v0; 0 0 1]
Since your depth is fixed, just fix your Z and carry out the transformation without a problem.
Remember: if you want the inverse transformation (camera to world), just invert your camera matrix and be happy!
MATLAB also has a very good guide for this transformation. Take a look.
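To make this concrete, here is a hedged MATLAB sketch of the pinhole model above using the numbers from the question (720x480 image, 1.4 µm pixels, 4.2 mm focal length); the principal point at the image centre and the depth value are illustrative assumptions, not calibrated values:
pixelSize = 1.4e-3;                    % pixel pitch in mm (1.4 µm)
f  = 4.2;                              % focal length in mm
fx = f/pixelSize;  fy = fx;            % focal length in pixel units (square pixels assumed)
u0 = 720/2;  v0 = 480/2;               % assume the principal point is the image centre
K  = [fx 0 u0; 0 fy v0; 0 0 1];        % internal camera matrix (no skew)
% Back-project a pixel (u, v) to camera coordinates at a known depth Z
uv  = [400; 400];                      % example pixel from the question
Z   = 500;                             % assumed depth in mm
ray = K \ [uv; 1];                     % normalized ray: [x/Z; y/Z; 1]
XYZ = Z*ray;                           % [X; Y; Z] in mm, equivalent to X = xZ/f, Y = yZ/f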

How to convert camera pose (Translation matrix) obtained from the essential matrix to world coordinate system

I have extracted Rotation and Translation matrices from the essential matrix. The translation vector has a scale ambiguity. Therefore, I couldn't define its "true" values.
My steps were as follows:
% Fundamental matrix from the point matches (RANSAC)
F = estimateFundamentalMatrix(matches1, matches2, 'Method', 'RANSAC');
% Essential matrix from the intrinsic matrices K1, K2
E = K2'*F*K1;
% Enforce the constraint that E has two equal singular values and one zero
[U, S, V] = svd(E);
s = (S(1,1) + S(2,2))/2;
S = diag([s s 0]);
E_new = U*S*V';
% Decompose the corrected essential matrix into the candidate rotations and translations
[U, S, V] = svd(E_new);
W = [0 -1 0; 1 0 0; 0 0 1];   % standard auxiliary matrix
R1 = U*W*V';
R2 = U*W'*V';
t1 = U(:,3);
t2 = -t1;
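As a side note and hedged sketch: if you keep the intrinsics as cameraParameters objects and the inlier matches used for F, MATLAB's relativeCameraPose can resolve the four-fold (R, t) ambiguity for you by checking which combination puts the triangulated points in front of both cameras (the variable names below are placeholders):
[relOrientation, relLocation] = relativeCameraPose(F, camParams1, camParams2, ...
    inlier1, inlier2);
% relLocation is a unit vector: the direction of camera 2 from camera 1,
% still without absolute (metric) scale.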
My problem is how to express the translation of the second camera relative to the first one in mm.
Unless you have some additional information that ties your points to the real world, it is not possible to recover the absolute scale.
For example, if the matches were corners of the squares of a calibration chessboard whose size in mm you know, then you would be able to tell how far apart the cameras are in mm.
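As a hedged sketch of that idea: if two reconstructed 3D points correspond to chessboard corners whose true separation is known, the ratio of known to reconstructed distance gives the missing scale (ptA, ptB and squareSizeMM below are assumed inputs):
% ptA, ptB: triangulated positions (arbitrary reconstruction units) of two
% adjacent chessboard corners; squareSizeMM: their true separation in mm
scale = squareSizeMM / norm(ptA - ptB);   % mm per reconstruction unit
t_mm  = scale * t1;                       % camera-2 translation expressed in mm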

Verify that camera calibration is still valid

How do you determine that the intrinsic and extrinsic parameters you have calculated for a camera at time X are still valid at time Y?
My idea would be:
1. Use a known calibration object (a chessboard) and place it in the camera's field of view at time Y.
2. Calculate the chessboard corner points in the camera's image (at time Y).
3. Define one of the chessboard corner points as the world origin and calculate the world coordinates of all remaining chessboard corners based on that origin.
4. Relate the coordinates from 3. to the camera coordinate system.
5. Use the parameters calculated at time X to calculate the image points of the points from 4.
6. Calculate the distances between the points from 2. and the points from 5.
Is that a clever way to go about it? I'd eventually like to implement it in MATLAB and later possibly OpenCV. I think I know how to do steps 1-2 and step 6. Maybe someone can give a rough implementation of steps 2-5. In particular, I'm unsure how to relate the "chessboard world coordinate system" with the "camera world coordinate system", which I believe I would have to do.
Thanks!
If you have a single camera you can easily follow the steps from this article:
Evaluating the Accuracy of Single Camera Calibration
For achieving step 2, you can easily use detectCheckerboardPoints function from MATLAB.
[imagePoints, boardSize, imagesUsed] = detectCheckerboardPoints(imageFileNames);
Assuming that you are talking about stereo cameras, for stereo pairs imagePoints(:,:,:,1) are the points from the first set of images, and imagePoints(:,:,:,2) are the points from the second set of images. The output contains M [x y] coordinates. Each coordinate represents a point where square corners are detected on the checkerboard. The number of points the function returns depends on the value of boardSize, which indicates the number of squares detected. The function detects the points with sub-pixel accuracy.
As you can see in the following image, the points are estimated relative to the first point, which covers your third step.
[The image is from this page at MathWorks.]
You can consider point 1 as the origin of your coordinate system (0,0). The directions of the axes are shown in the image, and you know the distance between each point (in world coordinates), so it is just a matter of depth estimation.
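In MATLAB you can obtain exactly those board-relative world coordinates with generateCheckerboardPoints (squareSize below is whatever your printed square measures, e.g. in mm):
worldPoints = generateCheckerboardPoints(boardSize, squareSize);   % point 1 is at (0,0)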
To find a transformation matrix between the points in the world CS and the points in the camera CS, you should collect a set of points and perform an SVD to estimate the transformation matrix.
But,
I would re-estimate the parameters of the camera and compare them with the initial parameters from time X. This is easier if you have saved the images that were used when calibrating the camera at time X. By repeating the calibration process using those images, you should get very similar results if the camera calibration is still valid.
Edit: Why do you need the set of images used in the calibration process at time X?
You had a set of images to do the calibration the first time, right? To recalibrate the camera you would need to use a new set of images, but for checking the previous calibration you can reuse the previous images. If the parameters of the camera have changed, there will be an error between the re-estimation and the first estimation. This can be used for evaluating the validity of the calibration, not for recalibrating the camera.
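For completeness, here is a hedged MATLAB sketch of the reprojection check outlined in steps 1-6, assuming cameraParamsX is the saved cameraParameters object from time X, newImage is a checkerboard image taken at time Y, and squareSize is the printed square size (lens distortion is ignored for simplicity; undistort the points first if it is significant):
[imagePoints, boardSize] = detectCheckerboardPoints(newImage);        % step 2
worldPoints = generateCheckerboardPoints(boardSize, squareSize);      % step 3: corner 1 is the origin
[R, t] = extrinsics(imagePoints, worldPoints, cameraParamsX);         % step 4: board pose w.r.t. the camera
projected = worldToImage(cameraParamsX, R, t, ...                     % step 5: reproject with the OLD parameters
    [worldPoints, zeros(size(worldPoints, 1), 1)]);
reprojError = mean(vecnorm(projected - imagePoints, 2, 2));           % step 6: mean distance in pixels
If this mean error is much larger than the reprojection error reported when the camera was calibrated at time X, the old calibration is probably no longer valid.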

Understanding depth values in 3D point cloud

I have problems understanding the depth (Z) value in a 3D point cloud resulting from 3D sparse reconstruction, as in this example in MATLAB: http://www.mathworks.com/help/vision/ug/sparse-3-d-reconstruction-from-multiple-views.html
I have attached a picture showing the reconstructed 3D point cloud in the above example. I have put some datatips on the figure so we know the (x,y,z) coordinates of the points. Here are my questions:
1- What does the Z value in the point cloud represent? Is it the distance in millimeters from the camera? If that's the case then it does not make sense based on the picture I attached, since I am sure the distance of the sphere and checkerboard from the camera must be greater than 200 mm.
Or maybe it is measured from some reference point in space? Then what is this reference point? And how can I make a 3D point cloud whose Z values indicate the distance from the camera?
2- Why are there negative values for Z? What does that mean in terms of distance to the camera?
I'd appreciate it if someone could explain.
In this example the world coordinates are defined by the checkerboard. The checkerboard defines the X-Y plane, and the Z-axis points into the checkerboard, as explained in the documentation.
Since your 3D points are above the checkerboard, they have negative Z-coordinates.
Your (x,y,z) coordinates are in world units, which are not tied to metric values (unless you establish a scale between world units and metric units; there are various methods to do this). So the z value tells you the depth of each point in world coordinates.
If you have the pose of each camera, and you apply that camera's rotation and translation to each point, you get the (x',y',z') coordinates of the points in that camera's coordinate system. At that point, if z' is negative, the point is behind the camera.
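As a hedged sketch of that check in MATLAB's row-vector convention (R and t are assumed to be the extrinsics of one of the cameras, and xyzPoints the N-by-3 point cloud):
camPoints = xyzPoints*R + t;           % world points expressed in that camera's coordinates
behind    = camPoints(:,3) < 0;        % points with negative z' lie behind the camera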

pca in matlab - 2D curve stretching

I have N 3D observations taken from an optical motion capture system in XYZ form.
The motion that was captured was just a simple circular arc, produced by a rigid body with a fixed axis of rotation.
I used the princomp function in MATLAB to get all marker points onto the same plane, i.e. the plane in which the motion took place.
(See a pic representing the 3D data on the plane that was found, below)
What I want to do after the previous step is to look at the fitted data on the plane that was found and get the curve of the captured motion in 2D.
In the princomp how-to, it is said that
The first two coordinates of the principal component scores give the
projection of each point onto the plane, in the coordinate system of
the plane.
(from "Fitting an Orthogonal Regression Using Principal Components Analysis" article on mathworks help site)
So I thought that if I just plot those PC scores - plot(score(:,1),score(:,2)) - I'll get the motion curve. Instead, what I got is this.
(See a pic representing the curve data in 2D derived from the PC scores, below)
The 2D curve seems stretched and nonlinear (different y values for the same x values) when it shouldn't be. The curve I am looking for should be fittable with just a simple polynomial (polyfit) or a circle fit in MATLAB.
Is this happening because the plane that was found looks like a rhombus relative to the original coordinate system, and the PC axes are rotated with respect to the basis of the plane in such a way that produces this stretch?
Then I thought that this is happening because of the different coordinate systems of the optical system and MATLAB. The optical system's (i.e. the cameras') coordinate system is XZY oriented and MATLAB's default (I think) coordinate system is XYZ oriented. I transformed my data to match MATLAB's coordinate system through a rotation matrix and ran princomp again, but I got the same stretch in the 2D curve (the new curve just had a different orientation).
Somewhere else I read that
Principal Components Analysis chooses the first PCA axis as that line
that goes through the centroid, but also minimizes the square of the
distance of each point to that line. Thus, in some sense, the line is
as close to all of the data as possible. Equivalently, the line goes
through the maximum variation in the data. The second PCA axis also
must go through the centroid, and also goes through the maximum
variation in the data, but with a certain constraint: It must be
completely uncorrelated (i.e. at right angles, or "orthogonal") to PCA
axis 1.
I know that I am missing something, but I have a problem understanding why I get a stretched curve. What do I have to do to get the curve right?
Thanks in advance.
EDIT: Here is a sample data file (3 columns of XYZ coords for 2 markers)
www.sendspace.com/file/2hiezc
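For reference, a minimal sketch of the projection step described in the question, using the newer pca function (princomp is deprecated); data is assumed to be an N-by-3 matrix of XYZ marker positions, and axis equal keeps both in-plane axes at the same scale when plotting:
[coeff, score] = pca(data);            % coeff: principal axes, score: coordinates in that basis
plot(score(:,1), score(:,2), '.');     % the captured motion in the plane's own coordinates
axis equal;                            % draw both score axes at the same scale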