I have to perform re-projection of my 3D points (I already have data from Bundler).
I am using the Camera Calibration Toolbox in MATLAB to get the intrinsic camera parameters. I got the following output from 27 images (a chessboard, with images taken from different angles).
Calibration results after optimization (with uncertainties):
Focal Length: fc = [ 2104.11696 2101.75357 ] ± [ 23.13283 22.92478 ]
Principal point: cc = [ 969.15779 771.30555 ] ± [ 21.98972 15.25166 ]
Skew: alpha_c = [ 0.00000 ] ± [ 0.00000 ]
Distortion: kc = [ 0.11555 -0.55754 -0.00100 -0.00275 0.00000 ] ±
[ 0.05036 0.59076 0.00307 0.00440 0.00000 ]
Pixel error: err = [ 0.71656 0.63306 ]
Note: The numerical errors are approximately three times the standard deviations (for reference).
I am wondering about the numerical errors, i.e. the focal length error ± [23.13283 22.92478], the principal point error, etc. What do these error numbers actually represent, and what is their impact?
The pixel error is quite small.
So far I use the following matrix from above data for my re-projection:
K = [2104.11696 0 969.15779; 0 2101.75357 771.30555; 0 0 1]
The above matrix "K" seems right to me. Correct me if I am doing something wrong...
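For reference, this is roughly how I apply K (a minimal sketch; R and t below are placeholders for the extrinsics from Bundler, and lens distortion is ignored):
K = [2104.11696 0 969.15779; 0 2101.75357 771.30555; 0 0 1];  % intrinsics from the toolbox
R = eye(3); t = zeros(3,1);   % placeholder extrinsics; use the ones from Bundler
X = [0.1; 0.2; 2.0];          % hypothetical 3D point in world coordinates
x = K * (R * X + t);          % project into the image (homogeneous coordinates)
x = x(1:2) / x(3)             % pixel coordinates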
Will be waiting for your replies.
There are two kinds of errors here.
One is the reprojection error. Once you calibrate a camera, you use the resulting camera parameters to project the checkerboard points from world coordinates into the image. The reprojection errors are then the distances between those projected points and the detected checkerboard points. The acceptable value depends on your application, but a good rule of thumb is that the mean reprojection error should be less than 0.5 pixels.
The other kind of error is the +/- interval you get for each estimated parameter. These are based on the standard errors resulting from the optimization algorithm. The values that Bouguet's Camera Calibration Toolbox gives you are actually 3 times the standard error, which corresponds to a 99.73% confidence interval. In other words, if the toolbox reports the focal length error as ± [23.13283 22.92478], then the actual focal length is within that interval of your estimate with a probability of 99.73%.
The reprojection errors give you a quick measure of the accuracy of your calibration. The standard errors - let's call them estimation errors - are useful for a more careful analysis of your results. For example, you should try excluding calibration images that have high mean reprojection error. On the other hand, if your estimation errors are high, you can try adding more calibration images.
By the way, the Computer Vision System Toolbox now includes a GUI Camera Calibrator app that makes camera calibration much easier. There is also a good explanation of the reprojection errors in the documentation.
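As a hedged sketch of that workflow (not your exact data; the file names and square size below are placeholders), showReprojectionErrors gives you the per-image errors mentioned above:
% Sketch of the Computer Vision System Toolbox calibration workflow.
imageFileNames = {'calib01.jpg', 'calib02.jpg', 'calib03.jpg'};   % placeholder file names
[imagePoints, boardSize] = detectCheckerboardPoints(imageFileNames);
squareSize = 25;   % checkerboard square size in millimeters (assumption)
worldPoints = generateCheckerboardPoints(boardSize, squareSize);
cameraParams = estimateCameraParameters(imagePoints, worldPoints);
showReprojectionErrors(cameraParams);   % bar plot of per-image mean reprojection error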
The Camera Calibration Toolbox extracts grid points from the checkerboard images and uses them to find the calibration parameters.
The pixel errors are the mean re-projection error of the extracted grid points, i.e. the distance between the actual pixel location and the one obtained with the calculated K matrix. These numbers are usually within 1 pixel, and your numbers are within that range. The error in the focal length is the variance of the calculated focal length.
You need only 3 or 4 images to calibrate a camera (I forget the actual number). If you provide more images, it computes K for all combinations of 3-4 images and then a combined K. The errors are the variance over all of these computed K matrices.
Your numbers are quite high (they should be within 3-4 pixels, compared to your 22-23 pixels). The reason is usually bad calibration images or a wrong initial estimate of the grid points (which you do manually by selecting 4 corners in the image). Also, f_x and f_y are usually the same in modern cameras, and you should take the mean of the two, (f_x + f_y)/2.
Regarding your principal point, it seems your camera resolution is 1920 x 1600, so you should use the image center [960 800] instead of the value given by the toolbox. Nowadays the CCD is usually placed carefully, and the principal point is almost exactly at the center.
I am using a chessboard to estimate the translation vector between it and the camera. First, the intrinsic camera parameters are calculated; then the translation vector is estimated using n points detected on the chessboard.
I found a very strange phenomenon: the translation vector is accurate and stable when using more points on the chessboard, and this effect is more obvious when the distance is larger. For instance, with 1 cm x 1 cm chessboard squares, at a distance of 3 m the translation vector is accurately estimated when using 25 points, while it is inaccurate and unstable using the minimal 4 points. However, at a distance of 0.6 m, the estimates using 4 points and 25 points are similar, and both are accurate.
How can this phenomenon be explained in theory? What is the relationship between the stability of the estimate, the distance, and the number of points?
Thanks.
When you are using a smaller number of points, the calculation of the translation vector is more sensitive to the noise in the coordinates of those points. Point coordinates are noisy due to the finite resolution of the camera (among other things), and that noise only increases with distance. So using a larger number of points should give a better estimate.
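For illustration, here is a hedged sketch of the experiment using the Computer Vision Toolbox extrinsics function. It assumes cameraParams, imagePoints and worldPoints come from your checkerboard detection and calibration; the 4-point indices are arbitrary:
idx4 = [1 5 21 25];   % a minimal 4-point subset (hypothetical indices)
[R4,  t4 ] = extrinsics(imagePoints(idx4, :), worldPoints(idx4, :), cameraParams);
[R25, t25] = extrinsics(imagePoints, worldPoints, cameraParams);
disp(t4);    % translation from 4 points: noisier, especially at larger distances
disp(t25);   % translation from all points: more stable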
I want to evaluate the grid quality where all coordinates differ in the real case.
The signal is an ECG signal, where the average lifetime is 75 years.
My task is to evaluate its age at the moment of measurement, which is an inverse problem.
I think a 2D approximation of the 3D case is hard (done here by Abo-Zahhad) with 3 leads (2 on the chest and one at the left leg - MIT-BIH arrhythmia database):
where f is a piecewise continuous function in R^2, \epsilon is the error matrix, and A is a 2D matrix.
Now, I evaluate the average grid distance along the x-axis (time) and along the y-axis (energy).
I think this can be done by Matlab's Image Analysis toolbox.
However, I am not sure how complete the toolbox's approaches are.
I think a transform approach must be used in the setting of uneven and non-continuous grids. One approach is Exact linear time Euclidean distance transforms of grid line sampled shapes by Joakim Lindblad et al.
The method presents a distance transform (DT) which assigns to each image point its smallest distance to a selected subset of image points.
This kind of approach is often a basis of algorithms for many methods in image analysis.
I tested bwdist (distance transform of a binary image) unsuccessfully: the chessboard method returns an empty square matrix, while the cityblock, euclidean and quasi-euclidean methods return a full matrix.
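For reference, a minimal bwdist sketch with a synthetic binary grid mask (the real mask would come from the ECG image):
BW = false(100, 100);
BW(10:20:end, :) = true;            % horizontal grid lines
BW(:, 10:20:end) = true;            % vertical grid lines
D = bwdist(BW, 'euclidean');        % distance of each pixel to the nearest grid pixel
imagesc(D); axis image; colorbar;   % visualize the distance map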
Another approach, as pseudocode:
% https://stackoverflow.com/a/29956008/54964
%// retrieve picture
imgRGB = imread('dummy.png');
%// detect lines
imgHSV = rgb2hsv(imgRGB);
BW = (imgHSV(:,:,3) < 1);
BW = imclose(imclose(BW, strel('line',40,0)), strel('line',10,90));
%// clear those masked pixels by setting them to background white color
imgRGB2 = imgRGB;
imgRGB2(repmat(BW,[1 1 3])) = 255;
%// show extracted signal
imshow(imgRGB2)
I think this approach will not work here, because the grids are not necessarily continuous and not necessarily ideal.
pdist, based on Lumbreras' answer
In the real examples, all coordinates differ, so the pdist hamming and jaccard distances are always 1 with real data.
The options euclidean, cityblock, minkowski, chebychev, mahalanobis, cosine, correlation, and spearman offer some descriptions of the data.
However, these options make little sense to me for such full matrices.
I want to estimate how long the signal can live.
Sources
J. Müller, and S. Siltanen. Linear and nonlinear inverse problems with practical applications.
EIT with the D-bar method: discontinuous heart-and-lungs phantom. http://wiki.helsinki.fi/display/mathstatHenkilokunta/EIT+with+the+D-bar+method%3A+discontinuous+heart-and-lungs+phantom Visited 29-Feb 2016.
There is a function in Matlab called pdist which computes the pairwise distance between all row elements of a matrix and lets you choose the type of distance you want to use (Euclidean, cityblock, correlation, ...). Are you after something like this? I am not sure I understood your question!
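For example, a minimal sketch with made-up grid coordinates (each row is a point):
gridPoints = [0 0; 1 0; 0 1; 2 2];    % hypothetical (x, y) grid coordinates
D = pdist(gridPoints, 'euclidean');   % condensed vector of pairwise distances
Dsq = squareform(D);                  % full symmetric distance matrix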
cheers!
Simply, do not do it in post-processing. Those artifacts can be about the raster images, about the viewer, and/or ... Do quality assurance in the signal generation/processing step.
It is much easier to evaluate the original signal than its views.
I'm trying to port some Matlab code to C++.
I've come across this line:
edges = edge(gray,'canny',0.1);
The output for the sample image is a completely black image. I want to reproduce the same behaviour using cv::Canny. What values should I use for low threshold and high threshold?
Sample:
Output:
In the line above you have not defined a threshold; it probably defaults to zero then, delivering a black picture. Also, you use a sigma of 0.1, which means virtually no Gaussian blur in the first Canny step. Within Matlab you can get an optimized threshold with:
[~, th] = edge(gray,'canny');
and then apply the optimized threshold th multiplied by some factor f (from my experience f should be between 1 and 3; you have to try it out):
edges = edge(gray,'canny',f*th,sigma);
sigma is sqrt(2) by default (you used 0.1 above). A few remarks:
Matlab calculates the optimized threshold as a percentile of the distribution of intensity gradients (you can see how edge() is constructed if you enter "edit edge", if I remember correctly).
The above parameter th is a vector consisting of the low and high thresholds. Matlab always uses low_threshold = 0.4 * high_threshold.
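Putting it together as a runnable sketch ('sample.png' is a placeholder file name; skip rgb2gray if your image is already grayscale):
gray = rgb2gray(imread('sample.png'));   % placeholder image, assumed RGB
[~, th] = edge(gray, 'canny');           % automatically chosen [low high] thresholds
f = 2;                                   % scale factor to try, roughly between 1 and 3
sigma = sqrt(2);                         % default Gaussian smoothing
edges = edge(gray, 'canny', f*th, sigma);
imshow(edges);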
How is the reprojection error calculated in Matlab's triangulate function?
Sadly, the documentation gives no mathematical formula.
It only says: The vector contains the average reprojection error for each M world point.
What is the procedure Matlab uses when calculating this error?
I searched SOF but found nothing on this IMHO important question.
UPDATE:
How can they use this error to filter out bad matches here : http://se.mathworks.com/help/vision/examples/sparse-3-d-reconstruction-from-two-views.html
AFAIK, the reprojection error is always calculated in the same way (in the field of computer vision in general).
The reprojection error is (as the name says) the error between the reprojected point in the camera and the original point.
So from 2 (or more) points in the cameras you triangulate and get 3D points in the world system. Due to errors in the calibration of the cameras, the point will not be 100% accurate. What you do is take the resulting 3D point (P), project it back into the cameras using the camera calibration parameters, and obtain new points (\hat{p}) near the original ones (p).
Then you calculate the Euclidean distance between the original point and the "reprojected" one.
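A hedged sketch in MATLAB's row-vector convention ([X Y Z 1] * camMatrix = [x y w]); worldPoints (M-by-3), imagePoints (M-by-2) and camMatrix (4-by-3, e.g. from cameraMatrix) are assumed to be given:
proj = [worldPoints, ones(size(worldPoints, 1), 1)] * camMatrix;   % M-by-3 homogeneous image points
proj = proj(:, 1:2) ./ proj(:, [3 3]);                             % divide by w to get pixel coordinates
reprojErrors = sqrt(sum((proj - imagePoints).^2, 2));              % per-point Euclidean distance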
In case you want to know a bit more about the method used by Matlab, here is the reference they use, with the page number: Multiple View Geometry in Computer Vision by Richard Hartley and Andrew Zisserman, Cambridge University Press, 2003 (p. 312).
But basically it is a least-squares minimization, which has no geometrical interpretation.
You can find an explanation of reprojection errors in the context of camera calibration in the Camera Calibrator tutorial.
The reprojection errors returned by the triangulate function are essentially the same concept.
The way to use the reprojection errors to discard bad matches is shown in this example:
[points3D, reprojErrors] = triangulate(matchedPoints1, matchedPoints2, ...
cameraMatrix1, cameraMatrix2);
% Eliminate noisy points
validIdx = reprojErrors < 1;
points3D = points3D(validIdx, :);
This code excludes all 3D points for which the reprojection error was more than a pixel. You can also use validIdx to eliminate the corresponding 2D matches.
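For example (a sketch assuming matchedPoints1 and matchedPoints2 are M-by-2 arrays):
matchedPoints1 = matchedPoints1(validIdx, :);
matchedPoints2 = matchedPoints2(validIdx, :);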
The above answers interpret the re-projection error in a simplistic way, as an actual reprojection into the camera. In a more general sense, this error reflects the distance between a noisy image point and the point estimated from the model. One can imagine a tangential plane to some surface (the model) in n-dimensional space onto which the noisy point is projected (hence it lands on the plane, not on the model!). n is not necessarily 2, since the notion of a "point" can be generalized to, for example, the concatenation of the coordinates of two corresponding points for a homography.
It is important to understand that the reprojection error is not a final answer: overall_error^2 = reprojection_error^2 + estimation_error^2, where the latter is the distance between the reprojected estimate and the true point on the model. More on this can be found in chapter 5 of the Hartley and Zisserman book Multiple View Geometry. They show that the reprojection error has a theoretical limit of 0.6*sigma (for homography estimation), where sigma is the noise standard deviation.
They filter out bad matches by removing the indices that have large re-projection errors.
In other words, points with large re-projection errors are considered outliers.
I have two vectors of spatial data (each about 2000 elements in length). One is a convolved version of the other. I am trying to determine the kernel that would produce such a convolution. I know that I can do this by finding the inverse Fourier transform of the ratio of the Fourier transforms of the output and input vectors. Indeed, when I do this I get more or less the shape I was expecting. However, my kernel vector has the same dimensionality as the two input vectors, when in reality the convolution was only using about one fifth (~300-400) of the points. The fact that I am getting the right shape but the wrong number of points makes me think that I am not using the ifft and fft functions quite correctly. It seems like if I were really doing the right thing this should happen naturally. At the moment I am simply doing:
FTInput = fft(in);
FtOutput = fft(out);
kernel = ifft(FtOutput./FTInput);
Is this correct and it's up to me to interpret the output vector correctly or have I oversimplified the task? I'm sure it's the latter, I'm just not sure where.
Thanks
You are doing things correctly, this is not a bug.
The problem of estimating a convolution filter given clean and convolved data is VERY HARD. Given "nice" data, you may get the right shape but retrieving the true support of the convolution filter (i.e. getting zeroes where they should be) is NOT going to happen naturally.
I think your "problem" comes from the inherent padding necessary for discrete convolution, which you are neglecting.
By dividing in the Fourier domain, you assume your convolution was done with cyclic padding in the spatial domain (or, equivalently, that the convolution was done by multiplication in the Fourier domain). But if your convolution was computed in the spatial domain, zero padding was most likely used.
s  = [1 2 3 4 5]                   // signal
f  = [0 1 2 1 0]                   // filter
s0 = s *conv0* f = [4 8 12 16 14]  // convolution with zero padding in the spatial domain, truncated to the signal length
sc = s *convc* f = [9 8 12 16 15]  // convolution with cyclic padding in the spatial domain, truncated to the signal length
// S, S0, Sc are the FFTs of s, s0, sc
approx0 = ifft(S0./S) = [-0.08 1.12 2.72 -0.08 -0.08]
approxc = ifft(Sc./S) = [0 1 2 1 0]
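A runnable MATLAB sketch of the same effect (cconv is in the Signal Processing Toolbox; the circular result may appear circularly shifted relative to the numbers above, depending on where you place the filter origin):
s = [1 2 3 4 5];                    % signal
f = [0 1 2 1 0];                    % filter
s0 = conv(s, f, 'same');            % zero-padded convolution, truncated to length(s)
sc = cconv(s, f, length(s));        % cyclic (circular) convolution of length(s)
approx0 = ifft(fft(s0) ./ fft(s))   % does NOT recover f: the padding assumptions differ
approxc = ifft(fft(sc) ./ fft(s))   % recovers f up to numerical error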