MATLAB - What are the units of Matlab Camera Calibration Toolbox - matlab

When showing the extrinsic parameters of calibration (the 3D model including the camera position and the position of the calibration checkerboards), the toolbox does not include units for the axes. It seemed logical to assume that they are in mm, but the z values displayed can not possibly be correct if they are indeed in mm. I'm assuming that there is some transformation going on, perhaps having to do with optical coordinates and units, but I can't figure it out from the documentation. Has anyone solved this problem?

If you marked the side length of your squares in mm, then the z-distance shown would be in mm.

I know next to nothing about matlabs (not entirely true but i avoid matlab wherever I can, and that would be almost always possible) tracking utilities but here's some general info.
Pixel dimension on the sensor has nothing to do with the size of the pixel on screen, or in model space. For all purposes a camera produces a picture that has no meaningful units. A tracking process is unaware of the scale of the scene. (the perspective projection takes care of that). You can re insert a scale by taking 2 tracked points and measuring the distance between those points. This is the solver spaces distance is pretty much arbitrary. Now if you know the real distance between these points you can get a conversion factor. By doing:
real distance / solver space distance.
There's really now way to knowing this distance form the cameras settings as the camera is unable to differentiate between different scales of scenes. So a perfect 1:100 replica is no different for the solver than the real deal. So you must allays relate to something you can measure separately for each measuring session. The camera always produces something that's relative in nature.

Related

Measuring objects in a photo taken by calibrated cameras, knowing the size of a reference object in the photo

I am writing a program that captures real time images from a scene by two calibrated cameras (so the internal parameters of the cameras are known to us). Using two view geometry, I can find the essential matrix and use OpenCV or MATLAB to find the relative position and orientation of one camera with respect to another. Having the essential matrix, it is shown in Hartley and Zisserman's Multiple View Geometry that one can reconstruct the scene using triangulation up to scale. Now I want to use a reference length to determine the scale of reconstruction and resolve ambiguity.
I know the height of the front wall and I want to use it for determining the scale of reconstruction to measure other objects and their dimensions or their distance from the center of my first camera. How can it be done in practice?
Thanks in advance.
Edit: To add more information, I have already done linear trianglation (minimizing the algebraic error) but I am not sure if it is any useful because there is still a scale ambiguity that I don't know how to get rid of it. My ultimate goal is to recognize an object (like a Pepsi can) and separate it in a rectangular area (which is going to be written as a separate module by someone else) and then find the distance of each pixel in this rectangular area, i.e. the region of interest, to the camera. Then the distance from the camera to the object will be the minimum of the distances from the camera to the 3D coordinates of the pixels in the region of interest.
Might be a bit late, but at least for someone struggling with the same staff.
As far as I remember it is actually linear problem. You got essential matrix, which gives you rotation matrix and normalized translation vector specifying relative position of cameras. If you followed Hartley and Zissermanm you probably chose one of the cameras as origin of world coordinate system. Meaning all your triangulated points are in normalized distance from this origin. What is important is, that the direction of every triangulated point is correct.
If you have some reference in the scene (lets say height of the wall), then you just have to find this reference (2 points are enough - so opposite ends of the wall) and calculate "normalization coefficient" (sorry for terminology) as
coeff = realWorldDistanceOf2Points / distanceOfTriangulatedPoints
Once you have this coeff, just mulptiply all your triangulated points with it and you got real world points.
Example:
you know that opposite corners of the wall are 5m from each other. you find these corners in both images, triangulate them (lets call triangulated points c1 and c2), calculate their distance in the "normalized" world as ||c1 - c2|| and get the
coeff = 5 / ||c1 - c2||
and you get real 3d world points as triangulatedPoint*coeff.
Maybe easier option is to have both cameras in fixed relative position and calibrate them together by stereoCalibrate openCV/Matlab function (there is actually pretty nice GUI in Matlab for that) - it returns not just intrinsic params, but also extrinsic. But I don't know if this is your case.

Verify that camera calibration is still valid

How do you determine that the intrinsic and extrinsic parameters you have calculated for a camera at time X are still valid at time Y?
My idea would be
to use a known calibration object (a chessboard) and place it in the camera's field of view at time Y.
Calculate the chessboard corner points in the camera's image (at time Y).
Define one of the chessboard corner points as world origin and calculate the world coordinates of all remaining chessboard corners based on that origin.
Relate the coordinates of 3. with the camera coordinate system.
Use the parameters calculated at time X to calculate the image points of the points from 4.
Calculate distances between points from 2. with points from 5.
Is that a clever way to go about it? I'd eventually like to implement it in MATLAB and later possibly openCV. I think I'd know how to do steps 1)-2) and step 6). Maybe someone can give a rough implementation for steps 2)-5). Especially I'd be unsure how to relate the "chessboard-world-coordinate-system" with the "camera-world-coordinate-system", which I believe I would have to do.
Thanks!
If you have a single camera you can easily follow the steps from this article:
Evaluating the Accuracy of Single Camera Calibration
For achieving step 2, you can easily use detectCheckerboardPoints function from MATLAB.
[imagePoints, boardSize, imagesUsed] = detectCheckerboardPoints(imageFileNames);
Assuming that you are talking about stereo-cameras, for stereo pairs, imagePoints(:,:,:,1) are the points from the first set of images, and imagePoints(:,:,:,2) are the points from the second set of images. The output contains M number of [x y] coordinates. Each coordinate represents a point where square corners are detected on the checkerboard. The number of points the function returns depends on the value of boardSize, which indicates the number of squares detected. The function detects the points with sub-pixel accuracy.
As you can see in the following image the points are estimated relative to the first point that covers your third step.
[The image is from this page at MATHWORKS.]
You can consider point 1 as the origin of your coordinate system (0,0). The directions of the axes are shown on the image and you know the distance between each point (in the world coordinate), so it is just the matter of depth estimation.
To find a transformation matrix between the points in the world CS and the points in the camera CS, you should collect a set of points and perform an SVD to estimate the transformation matrix.
But,
I would estimate the parameters of the camera and compare them with the initial parameters at time X. This is easier, if you have saved the images that were used when calibrating the camera at time X. By repeating the calibrating process using those images you should get very similar results, if the camera calibration is still valid.
Edit: Why you need the set of images used in the calibration process at time X?
You have a set of images to do the calibrations for the first time, right? To recalibrate the camera you need to use a new set of images. But for checking the previous calibration, you can use the previous images. If the parameters of the camera are changes, there would be an error between the re-estimation and the first estimation. This can be used for evaluating the validity of the calibration not for recalibrating the camera.

Specifications of Checkerboard (Calibration) for obtaining maximum accuracy in stereo reconstruction

I have to reconstruct an object which will be placed around 1 meter to 1.5 meters away from the baseline of my stereo setup. The image captured by both cameras have high resolution (10 MP)
The accuracy with which I have to detect it's position is +/- 0.5mm, in all the three co-ordinate axes. (If you require more details, please let me know)
For these, what should the optimal specifications of my checkerboard (for calibration) be?
I only know that it should be an asymmetric board. It should be placed in the same distance range as the range where object is expected to be placed. Also, it should be oriented in all possible angles (making sure all corners are seen by both cameras)
What about:
Number of squares horizontally and vertically? (also, on which side should the squares be more / even?)
Dimension of each square on checkerboard?
What effect does the baseline distance have on this?
Do these parameters of the checkerboard affect my accuracy in anyway? Are there any other parameters I need to consider for calibration?
I am using the MATLAB Stereo Calibrator App.
I will try to answer as good as I can:
Numbers of squares. Well, as you can guess, the more squares (actually corners between squares are used!) the better the result will be, as you have a more overdetermined system of equations to solve. Additionally, it doesnt matter the size of the chequerboard, only the odd/even pattern matters.
Dimensions of squares. the size does not matter very much in "mathematical" reresentation, but it matters practically. If your squares are very small, probably your printer wont draw a that good corner of the square and that will make your data "noisier". In the past, for really small calibration system I needed to go to an specialised printing shop so they could print it with the maximum quality possible. Of course if you make them very big you wont be able to fit lost of them in the iage which is not good.
The baseline distance has effect only in how properly can you see the corners between squares. The more accurate (in mm!, real distance!) you are detecting this corners the better. Obviously if you make small squares and put them very far, well, you wont see very much. This fits with the 1,2 question. Additionally, another problem you may have is focal length. In a application I worked on, some really small and close things wanted to be imaged. That was a problem while calibrating, as the amount if z distance I could see without blur was around 2mm. This really crippled my ability to calibrate properly because I could big angles in Z direction without getting blurred corners.
TL;DR: You want to have lots of corners between squares of the chequerboard but you want to see them as precisely as possible.

Stereo Camera Geometry

This paper describes nicely the geometry of a stereo image system. I am trying to figure out, if the cameras tilted towards each other with a certain angle, how the calculation would change? I looked around but couldn't find any reference to tilted camera systems.
Unfortunately, the calculation changes significantly. The rectified case (where both cameras are well-aligned to each other) has the advantage that you can calculate the disparity and the depth is proportional to the disparity. This is not the case in the general case.
When you introduce tilts, you end up with something called epipolar geometry. Here is a paper about this I just googled. In order to calculate the depth from a pixel-pair you need the fundamental matrix or the essential matrix. Both are not easy to obtain from the image pair. If, however, you have the geometric relation of both cameras (translation and rotation), calculating these matrices is a lot easier.
There are several ways to calculate the depth of a pixel-pair. One way is to use the fundamental matrix to rectify both images (although rectifying is not easy either, or even unique) and run a simple disparity check.

Calculating magnetic heading using raw accelerometer and magnetometer data

I have an accelerometer and magnetometer each producing raw X, Y and Z readouts. From this I need to determine the magnetic heading of an object.
I'm not that great at trig, but I've put together a formula that does respond pretty well to the rotation of the device, but also responds to movement that one would not think is relevant, such as angling the device in such a way that has no impact on the direction it is pointed. Such as laying it flat and "rolling" the device.
I think the formula I have for calculating the magnetic heading is fine, but I think my pitch and roll radians for input are wrong.
So I guess the core of my question (unless someone actually has a formula around that does this), is how do you calculate angles, in radians, using an accelerometer for pitch and roll.
Then secondly, any info on the heading formula itself would be great.
Depending on the accuracy your application requires, you may need to solve several problems:
Are the accelerometer axes calibrated? I've seen MEMs accelerometers that had axes that were not mutually perpendicular, and had significantly different response characteristics for each axis (typically X and Y would match, and Z would differ). You will need to synthesize ideal XYZ axes from whatever physical reading your device provides. (Google 'accelerometer calibration'.)
Are the magnetometer axes calibrated? Similar problem as above, except much harder to check: It is very difficult to generate uniform calibrated magnetic fields. If you use the ambient geomagnetic field, you will need to carefully control the local magnetism of your work environment and your tools. (Google 'magnetometer calibration'.)
After the accelerometer and magnetometer have been individually calibrated, their axes need to be calibrated relative to each other. Since both of these devices are typically soldered to a PCB, there is almost guaranteed to be significant misalignment. In many cases, the board layout and device parameters may not even permit the XYZ axes to correspond with each other! This may be the hardest part to do from a lab perspective, so I'd recommend you do a direct comparison using other hardware that has both kinds of sensors and is already calibrated (such as an iPhone or Android phone - but verify the device before use). Normally, this is accomplished by adjusting the prior two calibration matrices until they generate vectors that are correctly aligned relative to each other.
Only after you are generating mutually calibrated magnetic and accelerometer vectors can you apply the solutions suggested by the other respondents.
I've only described the static solution, where both the magnetometer and accelerometer are motionless relative to the local gravitational and magnetic fields. If you need to generate responses in real-time while the system is rapidly moving, you will need to account for the time behavior of each sensor. There are two basic ways to do this: 1) Apply time-domain filters to each sensor so that their outputs share a common time domain (generally adding some delay). 2) Use predictive modeling to modify the sensor outputs in real-time (less delay, but more noise).
I've seen Kalman filters used for such applications, but applying them in a vector domain may require using quaternions instead of Euler matrices. Quaternions are easier to use computationally (fewer operations needed compared to matrices), but I found them to be much more difficult to comprehend and get right.
Or, you may choose a completely different path, and use statistics and data fitting to do all the above work in one giant step. Consider the problem as follows: Given 6 input values (XYZ each from uncalibrated magnetometer and accelerometer) and a reference to the device (assuming it is hand-held, and there is an arrow painted on the case), output a single angle representing the magnetic bearing toward which the arrow on the case is pointing, and the elevation of the arrow relative to the gravity vector (tilt of the case).
Using a calibrated reference device (as mentioned above), pair it with the device to be calibrated, and take a several hundred data points, with the device at different orientations. Then use a powerful math package such a Matlab, MathCAD, R or SciPy to setup and solve the nonlinear equations to create the transformation matrices.
I would point to Euler Angles and Roll Pich Yaw.
You're not thinking in enough dimensions. This would be the answer in only 2 dimensions, and it works great if you can find a way to ensure "Z" always aligns with gravity.
int heading=180-atan2(mag_datX, mag_datY)/0.0174532925; // 0/359=N, 90=E, 180=S, 270=W
(if you're reading directly form the device - beware that it probably returns X, Z, Y - not X, Y, Z !)
However - this is not a 2D compass problem - imagine you take the needle out of the compass, balance it so that gravity plays no part in keeping it "level", and you'll find that "north" will point a bit up or down - depending where on earth you are (or, if at the poles, directly up or down!).
So you need to try and compute the THREE DIMENSIONAL vector from all 3 values - which is a matrix operation.