I am trying to use the Matlab program provided by Andrea Fusiello, Emanuele Trucco, and Alessandro Verri in "A compact algorithm for rectification of stereo pairs" to rectify the pictures taken by the two cameras in my research project on stereo calibration.
Though the Matlab code is not complex, I am still confused about how to get the projection matrices of the two cameras.
I used the following Matlab code to get the intrinsic matrix and the R and T of each camera. I think I can get the projection matrix using the formula P = A1*[R|T]. However, as you can see in the picture, the result is strange.
So I think there is something wrong with the projection matrices I got. Could anyone tell me how to get the projection matrices correctly?
Matlab code:
numImages = 9;
files = cell(1, numImages);
for i = 1:numImages
files{i} = fullfile(matlabroot, 'toolbox', 'vision', 'visiondata', ...
'calibration', 'left', sprintf('left%d.bmp', i));
end
[imagePoints, boardSize] = detectCheckerboardPoints(files);
squareSize = 120;
worldPoints = generateCheckerboardPoints(boardSize, squareSize);
cameraParams = estimateCameraParameters(imagePoints, worldPoints);
imOrig = imread(fullfile(matlabroot, 'toolbox', 'vision', 'visiondata', ...
'calibration', 'left', 'left9.bmp'));
[imagePoints, boardSize] = detectCheckerboardPoints(imOrig);
[R, t] = extrinsics(imagePoints, worldPoints, cameraParams);
The result:
There is a built-in function, cameraMatrix, in the Computer Vision System Toolbox to compute the camera projection matrix.
However, if you are trying to do stereo rectification, you should calibrate a stereo pair of cameras using the Stereo Camera Calibrator app, and then use the rectifyStereoImages function. See this example.
The thing to keep in mind is that the functions in the Computer Vision System Toolbox use the post-multiply convention, i.e. row vector times the matrix. Because of this, the rotation matrices and the camera projection matrix are transposes of their counterparts in Trucco and Verri, and in other textbooks. So the formula used by cameraMatrix is
P = [R;t] * K
So P ends up being 4-by-3, and not 3-by-4. This may explain why you are getting weird results.
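To make the transpose relationship concrete, here is a minimal sketch in base MATLAB. The numeric values of K, R, and t are made up for illustration, and the variable names P_textbook and P_rowvec are hypothetical; only the algebra is the point.

```matlab
% Textbook (column-vector) quantities; all numbers are invented for illustration.
K = [800 0 320; 0 800 240; 0 0 1];     % intrinsic matrix, textbook form
theta = 10 * pi / 180;
R = [cos(theta) 0 sin(theta); 0 1 0; -sin(theta) 0 cos(theta)];  % rotation about y
t = [0.1; -0.2; 2.0];                  % translation as a column vector

% Textbook projection matrix (3-by-4), used as x = P * X.
P_textbook = K * [R, t];

% Row-vector form (4-by-3), used as x' = X' * P.
% Every factor is the transpose of its textbook counterpart.
P_rowvec = [R'; t'] * K';

% Project the same homogeneous world point both ways.
X = [0.5; 0.3; 4.0; 1];
x_col = P_textbook * X;                % 3-by-1 result
x_row = X' * P_rowvec;                 % 1-by-3 result, equal to x_col'

disp(size(P_rowvec));
disp(norm(P_rowvec' - P_textbook));
```

If the rectification code expects the 3-by-4 textbook form, transposing the 4-by-3 matrix is enough, provided R, t, and K were themselves in the row-vector convention when it was built.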
I have got the following in Matlab (solution as in the example in
h = ezplot3('cos(t)', 'sin(t)', 'sin(5*t)', [-pi pi]);
data = get(h,{'XData','YData','Zdata'});
data = [cat(1,data{:})', ones(numel(data{1}),1)];
% Projection matrix on screen
[az,el] = view(); A = viewmtx(az,el);
data_transformed = A*data';
plot(data_transformed(1,:), data_transformed(2,:))
That transformation does not work with:
h = ezplot3('t', 'sin(t)', '20*cos(t)', [0 10*pi]);
How to get the screen projection of the 3rd plot?
Also, any links to the math behind the projection, with examples would be nice too :)
The projection depends on the view. If you try various view values, the projection in 2D will produce different results.
For example, with [az,el]=view(60,30); you will have this projection.
With [az,el]=view(30,15); you will have this image.
It turns out you need to normalize by the DataAspectRatio, so the viewTransform matrix becomes:
[az, el] = view(gca);
A = viewmtx(az,el) * makehgtform('scale',1./get(gca,'DataAspectRatio'));
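A minimal sketch of the normalized projection applied to the second curve, written so that it does not need an open figure. The DataAspectRatio value dar below is an assumption chosen for illustration; with a real plot it should be read from the axes with get(gca,'DataAspectRatio').

```matlab
% Sample the curve x = t, y = sin(t), z = 20*cos(t) as homogeneous points.
t = linspace(0, 10*pi, 500);
data = [t; sin(t); 20*cos(t); ones(size(t))];    % 4-by-N

% View angles: MATLAB's default 3-D view.
az = -37.5;
el = 30;

% Assumed data aspect ratio (hypothetical values, not read from a figure).
dar = [10 1 20];

% Scale the data by 1./dar first, then apply the orthographic view matrix.
A = viewmtx(az, el) * makehgtform('scale', 1 ./ dar);
projected = A * data;

% Rows 1 and 2 hold the 2-D screen coordinates.
screenX = projected(1, :);
screenY = projected(2, :);
```

Calling plot(screenX, screenY) then draws the flattened curve. Without the scale factor, the large z range would dominate the projected shape.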
The full answer can be seen on
I'm doing a piece of Matlab code for reconstruction of radial 3D NMR data. However, since the radon functions built into Matlab are only 2D, I will have to apply them twice. After the first iradon transform in my code I get out projections of the imaged object as they are supposed to look (from different angles). But after the second iradon transform I do not see a correct reconstruction of the object (just a lot of noise and some blurry stuff where the object should be).
My attempt at a solution is shown below.
The input data is a free induction decay or fid: fid(NP,nv,nv2) where NP is the number of projections, nv is the number of theta angle increments and nv2 is the number of phi angle increments.
Doing the ifft on the FID will give a sinogram of dimensions proj(NP,phi) for each theta angle.
Doing the first iradon gives filtered and unfiltered backprojections with dimensions I(r,x) for each theta angle. (so that I3 and I4 have dimensions (r,z,theta) )
Doing the last iradon transform should then render the reconstructed 3D image with dimensions I(x,y,z)
for k=1:1:nv2
    FID = squeeze(fid(:,:,k));
    % proj is the sinogram obtained from the ifft of FID, as described above
    I1 = imrotate(iradon(proj,theta,'v5cubic','none',1,2*NP),-90);
    I2 = imrotate(iradon(proj,theta,'v5cubic','Ram-Lak',1,2*NP),-90);
    I3(:,:,k) = I1;
    I4(:,:,k) = I2;
end
for k=1:size(I3,2)
    I5(:,:,k) = iradon(squeeze(I3(:,k,:)),phi,'v5cubic','none',1,2*NP);
    I6(:,:,k) = iradon(squeeze(I4(:,k,:)),phi,'v5cubic','Ram-Lak',1,2*NP);
end
This is my first time doing image processing, so I have a lot of questions.
I have two pictures taken from different positions, one from the left and the other from the right, like the picture below.
Step 1: Read images by using imread function
I1 = imread('DSC01063.jpg');
I2 = imread('DSC01064.jpg');
Step 2: Using camera calibrator app in matlab to get the cameraParameters
load cameraParams.mat
Step 3: Remove Lens Distortion by using undistortImage function
[I1, newOrigin1] = undistortImage(I1, cameraParams, 'OutputView', 'same');
[I2, newOrigin2] = undistortImage(I2, cameraParams, 'OutputView', 'same');
Step 4: Detect feature points by using detectSURFFeatures function
imagePoints1 = detectSURFFeatures(rgb2gray(I1), 'MetricThreshold', 600);
imagePoints2 = detectSURFFeatures(rgb2gray(I2), 'MetricThreshold', 600);
Step 5: Extract feature descriptors by using extractFeatures function
features1 = extractFeatures(rgb2gray(I1), imagePoints1);
features2 = extractFeatures(rgb2gray(I2), imagePoints2);
Step 6: Match Features by using matchFeatures function
indexPairs = matchFeatures(features1, features2, 'MaxRatio', 1);
matchedPoints1 = imagePoints1(indexPairs(:, 1));
matchedPoints2 = imagePoints2(indexPairs(:, 2));
From there, how can I construct the 3D point cloud? In step 2, I used the checkerboard shown in the attached picture to calibrate the camera.
The square size is 23 mm, and from cameraParams.mat I know the intrinsic matrix (or camera calibration matrix K), which has the form K=[alphax 0 x0; 0 alphay y0; 0 0 1].
I need to compute the fundamental matrix F and the essential matrix E in order to calculate the camera matrices P1 and P2, right?
After that, once I have the camera matrices P1 and P2, I use the linear triangulation method to estimate the 3D point cloud. Is this the correct way?
I would appreciate any suggestions.
To triangulate the points you need the so-called "camera matrices" and the 2D points in each of the images (which you already have).
In Matlab you have the function triangulate, which does the job for you.
If you have calibrated the cameras, you should have this information already. In any case, here is an example of how to create the "stereoParams" object needed for the triangulation.
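For intuition about what a triangulation routine does, here is a minimal sketch of linear (DLT) triangulation in base MATLAB on one synthetic point. It uses textbook 3-by-4 camera matrices, and K, R, t, and the test point are made-up values. It is an illustration of the technique, not the toolbox's implementation.

```matlab
% Made-up calibration: identical intrinsics, second camera shifted along x.
K = [1000 0 640; 0 1000 480; 0 0 1];
R = eye(3);
t = [-100; 0; 0];

% Textbook 3-by-4 camera matrices (x = P * X).
P1 = K * [eye(3), zeros(3, 1)];
P2 = K * [R, t];

% Project a known 3-D point into both images to get a perfect match.
X_true = [50; -20; 1500; 1];
x1 = P1 * X_true;  x1 = x1(1:2) / x1(3);
x2 = P2 * X_true;  x2 = x2(1:2) / x2(3);

% Each view contributes two linear equations in the unknown point X.
A = [x1(1) * P1(3, :) - P1(1, :);
     x1(2) * P1(3, :) - P1(2, :);
     x2(1) * P2(3, :) - P2(1, :);
     x2(2) * P2(3, :) - P2(2, :)];

% The solution is the right singular vector of the smallest singular value.
[~, ~, V] = svd(A);
X_est = V(:, end) / V(end, end);   % normalize so the last coordinate is 1

disp(norm(X_est - X_true));
```

With real matches the points are noisy, so the recovered X only approximately satisfies the four equations.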
Yes, that is the correct way. Now that you have matched points, you can use estimateFundamentalMatrix to compute the fundamental matrix F. Then you get the essential matrix E by multiplying F by the intrinsic matrix on both sides (E = K'*F*K in the textbook convention). Be careful about the order of multiplication, because the intrinsic matrix in cameraParameters is transposed relative to what you see in most textbooks.
Next, you have to decompose E into a rotation and a translation, from which you can construct the camera matrix for the second camera using cameraMatrix. You also need the camera matrix for the first camera, for which the rotation is a 3x3 identity matrix and the translation is a 3-element zero vector.
Edit: there is now a cameraPose function in MATLAB, which computes an up-to-scale relative pose ('R' and 't') given the Fundamental matrix and the camera parameters.
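The step from F to E can be sketched in base MATLAB with synthetic data. All numbers are made up, and K here is the textbook (upper-triangular) intrinsic matrix. Assuming cameraParams.IntrinsicMatrix stores the transpose of K, the same product would be written IntrinsicMatrix * F * IntrinsicMatrix'.

```matlab
% Made-up textbook intrinsics and relative pose.
K = [1000 0 640; 0 1000 480; 0 0 1];
theta = 5 * pi / 180;
R = [cos(theta) 0 sin(theta); 0 1 0; -sin(theta) 0 cos(theta)];
t = [1; 0.1; 0.05];

% Ground-truth essential matrix E = [t]x * R.
tx = [0 -t(3) t(2); t(3) 0 -t(1); -t(2) t(1) 0];
E_true = tx * R;

% Fundamental matrix implied by E and K: F = inv(K') * E * inv(K).
F = (K' \ E_true) / K;

% Recover E from F with the same camera used for both images.
E = K' * F * K;

% A valid essential matrix has two equal singular values and one zero.
s = svd(E);

% Camera matrix of the first camera in the row-vector (4-by-3) form:
% identity rotation and zero translation.
camMatrix1 = [eye(3); zeros(1, 3)] * K';

disp(norm(E - E_true));
```

An F estimated from noisy matches gives an E whose singular values only approximately follow this pattern, and the translation is recovered only up to scale.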
I have a stereo camera system and I am trying this MATLAB Computer Vision toolbox example with my own images and camera calibration files. I used Caltech's camera calibration toolbox.
First I tried each camera separately based on the first example, found the intrinsic camera calibration matrices for each camera, and saved them. I also undistorted the left and right images using the Caltech toolbox. Therefore I commented out the code for that from the MATLAB example.
Here are the intrinsic camera matrices:
K1=[1050 0 630;0 1048 460;0 0 1];
K2=[1048 0 662;0 1047 468;0 0 1];
BTW, these are the right and center lenses from bumblebee XB3 cameras.
Question: aren't they supposed to be the same?
Then I did stereo calibration based on fifth example. I saved the rotation matrix (R) and translation matrix (T) from that. Therefore I commented out the code for that from MATLAB example.
Here are the rotation and translation matrices:
R=[0.9999 -0.0080 -0.0086;0.0080 1 0.0048;0.0086 -0.0049 1];
T=[120.14 0.55 1.04];
Then I fed all these images, calibration files, and camera matrices to the MATLAB example and tried to find the 3-D point cloud, but the results are not promising. I am attaching the code here. I think there are two problems:
1- My epipolar constraint value is too large (to the power of 16)!
2- I am not sure about the camera matrices and how I calculated them from the R and T given by the Caltech toolbox.
P.S. The feature extraction itself works fine.
It would be great if someone could help.
close all
files = {'Left1.tif';'Right1.tif'};
for i = 1:numel(files)
files{i}=fullfile('...\sparse_matlab', files{i});
images(i).image = imread(files{i});
end
montage(files); title('Pair of Original Images')
% Intrinsic camera parameters
K1 = KK;
K2 = KK;
%Extrinsics using stereo calibration
images(1).CameraMatrix=[Rotation; Translation] * K1;
images(2).CameraMatrix=[Rotation; Translation] * K2;
% Detect feature points and extract SURF descriptors in images
for i = 1:numel(images)
%detect SURF feature points
images(i).points = detectSURFFeatures(rgb2gray(images(i).image),...
%extract SURF descriptors
[images(i).featureVectors,images(i).points] = ...
% Visualize several extracted SURF features from the Left image
figure; imshow(images(1).image);
title('1500 Strongest Feature Points from Globe01');
hold on;
indexPairs = ...
matchFeatures(images(1).featureVectors, images(2).featureVectors,...
'Prenormalized', true,'MaxRatio',0.4) ;
matchedPoints1 = images(1).points(indexPairs(:, 1));
matchedPoints2 = images(2).points(indexPairs(:, 2));
% Visualize correspondences
showMatchedFeatures(images(1).image,images(2).image,matchedPoints1,matchedPoints2,'montage' );
title('Original Matched Features from Globe01 and Globe02');
% Set a value near zero, It will be used to eliminate matches that
% correspond to points that do not lie on an epipolar line.
epipolarThreshold = .05;
for k = 1:length(matchedPoints1)
% Compute the fundamental matrix using the example helper function
% Evaluate the epipolar constraint
epipolarConstraint =[matchedPoints1.Location(k,:),1]...
%%%% here my epipolarConstraint results are bad %%%%%%%%%%%%%
% Only consider feature matches where the absolute value of the
% constraint expression is less than the threshold.
valid(k) = abs(epipolarConstraint) < epipolarThreshold;
end
validpts1 = images(1).points(indexPairs(valid, 1));
validpts2 = images(2).points(indexPairs(valid, 2));
title('Matched Features After Applying Epipolar Constraint');
% convert image to double format for plotting
doubleimage = im2double(images(1).image);
points3D = ones(length(validpts1),4); % store homogeneous world coordinates
color = ones(length(validpts1),3); % store color information
% For all point correspondences
for i = 1:length(validpts1)
% For all image locations from a list of correspondences build an A
pointInImage1 = validpts1(i).Location;
pointInImage2 = validpts2(i).Location;
P1 = images(1).CameraMatrix'; % Transpose to match the convention
P2 = images(2).CameraMatrix'; % in [1]
A = [
pointInImage1(1)*P1(3,:) - P1(1,:);...
pointInImage1(2)*P1(3,:) - P1(2,:);...
pointInImage2(1)*P2(3,:) - P2(1,:);...
pointInImage2(2)*P2(3,:) - P2(2,:)];
% Compute the 3-D location using the smallest singular value from the
% singular value decomposition of the matrix A
[~,~,V] = svd(A);
X = V(:,end);
X = X/X(end);
% Store location
points3D(i,:) = X';
% Store pixel color for visualization
y = round(pointInImage1(1));
x = round(pointInImage1(2));
color(i,:) = squeeze(doubleimage(x,y,:))';
end
% add green point representing the origin
points3D(end+1,:) = [0,0,0,1];
color(end+1,:) = [0,1,0];
% show images
figure('units','normalized','outerposition',[0 0 .5 .5])
subplot(1,2,1); montage(files,'Size',[1,2]); title('Original Images')
% plot point-cloud
hAxes = subplot(1,2,2); hold on; grid on;
xlabel('x-axis (mm)');ylabel('y-axis (mm)');zlabel('z-axis (mm)')
view(20,24);axis equal;axis vis3d
grid on
title('Reconstructed Point Cloud');
First of all, the Computer Vision System Toolbox now includes a Camera Calibrator App for calibrating a single camera, and also support for programmatic stereo camera calibration. It would be easier for you to use those tools, because the example you are using and the Caltech Calibration Toolbox use somewhat different conventions.
The example multiplies a row vector by a matrix (row vector * matrix), while the Caltech toolbox multiplies a matrix by a column vector (matrix * column vector). That means that if you do use the camera parameters from Caltech, you would have to transpose the intrinsic matrix and the rotation matrices. That could be the main cause of your problems.
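As a rough sketch of that conversion, the following base MATLAB snippet builds row-vector camera matrices from the K1, K2, R, and T values quoted in the question and checks the epipolar constraint on one synthetic point. It assumes the Caltech convention X2 = R*X1 + T with camera 1 at the origin, and that K1 belongs to camera 1. Both are assumptions about the asker's setup, and the test point is made up.

```matlab
% Values quoted in the question (textbook / column-vector form).
K1 = [1050 0 630; 0 1048 460; 0 0 1];
K2 = [1048 0 662; 0 1047 468; 0 0 1];
R  = [0.9999 -0.0080 -0.0086; 0.0080 1 0.0048; 0.0086 -0.0049 1];
T  = [120.14 0.55 1.04];

% Row-vector camera matrices (4-by-3): transpose K and R, stack T as a row.
camMatrix1 = [eye(3); zeros(1, 3)] * K1';   % reference camera: identity pose
camMatrix2 = [R'; T] * K2';

% Project a made-up world point (row vector, homogeneous) into both images.
X  = [200 -100 3000 1];
x1 = X * camMatrix1;  x1 = x1 / x1(3);
x2 = X * camMatrix2;  x2 = x2 / x2(3);

% Fundamental matrix implied by the calibration: F = inv(K2') * [T]x * R * inv(K1).
Tx = [0 -T(3) T(2); T(3) 0 -T(1); -T(2) T(1) 0];
F  = (K2' \ (Tx * R)) / K1;

% For a correct correspondence the epipolar constraint is close to zero.
residual = x2 * F * x1';
disp(residual);
```

A residual many orders of magnitude away from zero on real matches usually points to a convention mismatch like this one rather than to bad features.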
As far as the intrinsics being different between your two cameras, that is perfectly normal. All cameras are slightly different.
It would also help to see the matched features that you've used for triangulation. Given that you are reconstructing an elongated object, it doesn't seem too surprising to see the reconstructed points form a line in 3D...
You could also try rectifying the images and doing a dense reconstruction, as in the example I've linked to above.
I am doing a project to detect people in a crowd using HOG-LBP, and I want it to work in real time. I have read in some references that an integral image/histogram can speed up sliding window detection. How do I implement an integral image in my sliding window detection?
Here is the code for the integral image in MATLAB:
A = (cumsum(cumsum(double(img)),2));
and here is my sliding window detection code:
im = strcat ('C:\Documents\Crowd_PETS09\S1\L1\Time_13-57\View_001\frame_0150.jpg');
im = imread (im);
figure (1), imshow(im);
win_size= [32, 32];
X = win_size(1); Y = win_size(2); % window width and height, taken from win_size
[lastRightCol lastRightRow d] = size(im);
counter = 1;
%% Scan the window by using sliding window object detection
% this for loop scan the entire image and extract features for each sliding window
% Loop on scales (based on size of the window)
for s=1
disp(strcat('s is',num2str(s)));
for y = 1:X/4:lastRightCol-Y
for x = 1:Y/4:lastRightRow-X
%get four points for boxes
p1 = [x,y];
p2 = [x+(X-1), y+(Y-1)];
po = [p1; p2] ;
% cropped image based on the four points
crop_px = [po(1,1) po(2,1)];
crop_py = [po(1,2) po(2,2)];
topLeftRow = ceil(min(crop_px));
topLeftCol = ceil(min(crop_py));
bottomRightRow = ceil(max(crop_px));
bottomRightCol = ceil(max(crop_py));
cropedImage = im(topLeftCol:bottomRightCol,topLeftRow:bottomRightRow,:);
%Get the feature vector from croped image
HOGfeatureVector{counter}= getHOG(double(cropedImage));
LBPfeatureVector{counter}= getLBP(cropedImage);
LBPfeatureVector{counter}= LBPfeatureVector{counter}';
boxPoint{counter} = [x,y,X,Y];
counter = counter+1;
x = x+2;
end
end
end
Where should I put the integral image code?
I would really appreciate it if someone could help me figure it out.
The integral image is most suited for the Haar-like features. Using it for HOG or LBP would be tricky. I would suggest to first get your algorithm working, and then think about optimizing it.
By the way, the Computer Vision System Toolbox includes the extractHOGFeatures function, which would be helpful. Here's an example of training a HOG-SVM classifier to recognize hand-written digits. Also there is a vision.PeopleDetector object, which uses a HOG-SVM classifier to detect people. You could either use it directly for your project, or use it to evaluate performance of your own algorithm.
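To show why the integral image suits box-sum features such as Haar-like ones, here is a minimal base MATLAB sketch. The test image (magic(6)) and the helper name boxSum are made up for illustration. Once the table is built, the sum over any rectangle costs four lookups regardless of window size.

```matlab
% Small made-up "image".
img = magic(6);

% Integral image padded with a leading row and column of zeros, so that
% ii(r+1, c+1) equals the sum of img(1:r, 1:c).
ii = zeros(size(img) + 1);
ii(2:end, 2:end) = cumsum(cumsum(double(img), 1), 2);

% Sum of img(r1:r2, c1:c2) from four table lookups (hypothetical helper).
boxSum = @(r1, c1, r2, c2) ii(r2 + 1, c2 + 1) - ii(r1, c2 + 1) ...
                         - ii(r2 + 1, c1) + ii(r1, c1);

% Compare against the direct sum for one window.
s_fast   = boxSum(2, 3, 4, 5);
s_direct = sum(sum(img(2:4, 3:5)));
disp([s_fast, s_direct]);
```

HOG and LBP need per-bin histograms rather than plain pixel sums, so using this trick for them would require one integral image per histogram bin, which is part of why it is harder to apply there.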