I have obtained the fundamental matrix between two cameras. I also have their internal parameters as 3 x 3 matrices, which I obtained earlier through chessboard calibration. Using the fundamental matrix, I obtained P1 and P2 as
P1 = [I | 0] and P2 = [ [e']x * F | e']
These projection matrices are not really useful for getting the exact 3D location.
Since I have the internal parameters K1 and K2, I changed P1 and P2 to
P1 = K1 * [I | 0] and P2 = K2 * [ [e']x * F | e']
Is this the right way to get the real projection matrices, i.e. the ones that give the actual relation between the 3D world and the image?
If not, please help me understand the right way and where I have gone wrong.
If this is the right approach, how do I verify these matrices?
A good reference book is "Multiple View Geometry in Computer Vision" by Hartley and Zisserman.
First, your formula for P is wrong. If you want the formula with K inside, it is rather
P = K * [R | t]
or
P = [ [e']x * F | e']
but not a mix of both.
If you computed F with the 8-point algorithm, then you can only recover the projective geometry, i.e. the scene up to a 3D homography (a 4x4 transformation).
To upgrade to Euclidean space, there are 2 possibilities, both of which start by computing the essential matrix.
First possibility is to compute the essential matrix from F: E = transpose(K2)*F*K1.
The second possibility is to estimate the essential matrix directly for these 2 views:
Normalize your 2D points by pre-multiplying them with the inverse of K for each image ("normalized image coordinates").
Apply the 8-point algorithm (the same one as for F) to these normalized points.
Enforce the constraint that the essential matrix has its first 2 singular values equal to 1 and the last equal to 0, by computing its SVD and overwriting the diagonal values.
Once you have the essential matrix, we can compute the projection matrix in the form
P = K * [R | t]
R and t can be found from the elements of the SVD of E (cf. the previously mentioned book).
However, you will get 4 possible solutions. Only one of them places the reconstructed points in front of both cameras, so triangulate a point (one whose match you are sure of) and test it to remove the ambiguity among the 4.
In this case you will be able to place the camera and its orientation (with R and t of the projection) in your 3D scene, up to the overall scale of t.
Not so obvious, indeed...
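As a rough way to check the relation E = transpose(K2)*F*K1 and to verify a candidate pair P1 = K1 * [I | 0], P2 = K2 * [R | t], here is a small Python sketch that builds a synthetic two-camera setup and tests the epipolar constraint x2' * F * x1 = 0. All numbers (K1, K2, R, t, the 3D point X) and the helper names are made up for illustration, and zero skew is assumed. It is only a sanity check, not the SVD-based decomposition of E.

```python
import math

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def inv_intrinsics(K):
    """Closed-form inverse of K = [fx 0 cx; 0 fy cy; 0 0 1] (zero skew assumed)."""
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]
    return [[1 / fx, 0.0, -cx / fx], [0.0, 1 / fy, -cy / fy], [0.0, 0.0, 1.0]]

def skew(t):
    """Cross-product matrix [t]x."""
    return [[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]]

# Hypothetical calibration matrices and relative pose (illustration only).
K1 = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
K2 = [[750.0, 0.0, 300.0], [0.0, 760.0, 250.0], [0.0, 0.0, 1.0]]
ang = math.radians(10.0)
R = [[math.cos(ang), 0.0, math.sin(ang)],
     [0.0, 1.0, 0.0],
     [-math.sin(ang), 0.0, math.cos(ang)]]
t = [1.0, 0.0, 0.2]

# Ground truth: E = [t]x R and F = K2^-T E K1^-1.
E_true = matmul(skew(t), R)
F = matmul(matmul(transpose(inv_intrinsics(K2)), E_true), inv_intrinsics(K1))

# First possibility from the answer: recover E from F and the intrinsics.
E = matmul(matmul(transpose(K2), F), K1)

# Project one 3D point with P1 = K1 [I | 0] and P2 = K2 [R | t].
X = [0.5, -0.2, 4.0]
x1 = matvec(K1, X)
x2 = matvec(K2, [r + ti for r, ti in zip(matvec(R, X), t)])

# Epipolar constraint: should vanish up to rounding error.
residual = sum(a * b for a, b in zip(x2, matvec(F, x1)))
print("x2' * F * x1 =", residual)
```

With real data the residual will not be exactly zero, so it is more meaningful to look at it over many correspondences than at a single point.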
Just came across this question and want to give a more direct answer to the question.
When P1 = [I, 0] is your first projection matrix but it should be P1 = K1 * [I, 0], then your "world" is distorted by the 4x4 matrix M = [K1, 0; 0, 1]. Any point X in the world projects to x1 = P1 * X = (P1 * M) * (M^-1 * X) = P1' * X' where X' is now the point in the "undistorted world" (note that X = M * X' is again the point in the "distorted world") and P1' = P1 * M = [I, 0] * [K1, 0; 0, 1] = K1 * [I, 0] is the projection matrix in the undistorted world.
Analogously, P2' = P2 * M is the projection matrix in the undistorted world and has the form P2' = [ [e']x * F | e'] * [K1, 0; 0, 1] = [ [e']x * F * K1 | e'].
Note that P2 = [ [e']x * F | e'] is just one possible projection matrix; in general it has the form P2 = [ [e']x * F + e' * v^T | s * e'] for some nonzero scalar s and a 3-vector v. Note further that if you want to find a projection matrix of the form P2' ~ K2 * [R, t] for some rotation matrix R, you might be better off using the algorithm based on the essential matrix outlined by Damien and described in Hartley & Zisserman (2nd ed.), Sec. 9.6.2.
Both OpenCV and MATLAB have the function decomposeHomographyMatrix. This requires the homography matrix, H, and the camera intrinsics, K. I don't understand why it needs K.
Each of these function implementations reference "Malis" (https://hal.archives-ouvertes.fr/inria-00174036/document). On page 7 it talks about the intrinsic matrix K but then never uses it. It doesn't seem like it is necessary?
Here is how I understand the document you linked ("Deeper understanding of the homography decomposition for vision-based control"). If I am wrong, please correct me.
decomposeHomographyMat needs K to compute the Euclidean homography H from the projective homography G:
H = K_inv * G * K (Formula 2)
R, t and n are then computed from the Euclidean homography matrix (Formula 3).
The OpenCV function decomposeHomographyMat expects a projective homography matrix as input (not a Euclidean one)!
Notice that in the document you linked, the projective homography matrix is called G (not H!), while the Euclidean homography matrix is called H.
The projective homography matrix is computed from the vectors p, the image coordinates of the points (in pixels):
alfa_p * p_c = G * p_star
The Euclidean homography matrix is computed from the vectors m, the "normalized projective coordinates of the points viewed from the camera pose" (e.g. in meters):
alfa_m * m_c = H * m_star
where _c denotes the current frame, _star the desired (reference) frame, and alfa_p, alfa_m are just scale factors.
The relationship between image coordinates and normalized projective coordinates is:
m = K_inv * p
where K is the matrix of camera intrinsic parameters and K_inv is its inverse.
To be even more clear:
p_c = G * p_star
-> K * K_inv * p_c = G * K * K_inv * p_star
-> K * m_c = G * K * m_star
-> m_c = K_inv * G * K * m_star
-> m_c = H * m_star
Remember that:
K_inv * K = I - identity matrix
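The derivation above can be checked numerically. The Python sketch below uses a made-up intrinsic matrix K (zero skew assumed), a made-up projective homography G and a made-up pixel p_star; none of these values come from OpenCV or from the paper. It forms H = K_inv * G * K and confirms that H maps m_star to m_c.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

# Hypothetical intrinsics (zero skew) and their closed-form inverse.
fx, fy, cx, cy = 800.0, 800.0, 320.0, 240.0
K = [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]
K_inv = [[1 / fx, 0.0, -cx / fx], [0.0, 1 / fy, -cy / fy], [0.0, 0.0, 1.0]]

# Hypothetical projective homography between pixel coordinates.
G = [[1.1, 0.02, 15.0], [-0.01, 0.95, -8.0], [1e-5, 2e-5, 1.0]]

# Euclidean homography, as in Formula 2: H = K_inv * G * K.
H = matmul(matmul(K_inv, G), K)

# A pixel in the desired frame and its image in the current frame.
p_star = [400.0, 300.0, 1.0]
p_c = matvec(G, p_star)

# Convert both to normalized coordinates: m = K_inv * p.
m_star = matvec(K_inv, p_star)
m_c = matvec(K_inv, p_c)

# H applied to m_star should reproduce m_c.
m_c_from_H = matvec(H, m_star)
print(m_c, m_c_from_H)
```

The scale factors alfa_p and alfa_m are left at 1 here, which is why the two vectors agree exactly rather than only up to scale.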
Well, you must know K in order to find R,T. Formula (1) in your reference shows this clearly (the homography G is given in your case and you are looking for the camera pose R,T). There is no way to solve for R,T without knowing K.
Whether K is used later depends on what you are going to do with the pose. If you are going to project 3D landmarks to images, you must know K for that too.
Not sure if I am telling you anything new...
I have some MATLAB code from my class in which the professor assigns each data point to the nearest cluster using the code below, where c is the centroids matrix and x is the data matrix.
% squared norm of each centroid
c2 = sum(c.^2, 1);
% For each data point x, compute min_j -2 * x' * c_j + c_j^2;
% Note that this is implemented as a max, so the expression is negated.
tmpdiff = bsxfun(@minus, 2*x'*c, c2);
[val, labels] = max(tmpdiff, [], 2);
I am not sure how this is equivalent to the algorithm definition of this step in which the cluster assignment is done through
% For every centroid j and for every data point x_i
labels(i) = argmin_j ||x_i - c_j||^2
Can anyone please explain to me how this works, essentially how computing
min_j -2 * x' * c_j + c_j^2
is equivalent to
argmin_j ||x_i - c_j||^2
If we have a triangle such that the length of its sides is a, b, c, then
we know that (from the law of cosines)
a^2=c^2+b^2-2bc*cos(alpha)
where alpha is the angle between the side of length b and the side of length c.
Now, consider the triangle made of the three vertices x, c_j and O (the origin of R^n). Writing theta for the angle between x and c_j, we have
argmin_j||x-c_j||^2
=argmin_j (||x||^2+||c_j||^2 - 2*||x||* ||c_j|| * cos(theta) )
which, since x^t c_j = ||x|| * ||c_j|| * cos(theta), is equal to
argmin_j(||x||^2 + ||c_j||^2 - 2 x^t c_j)
Now, remember that x is constant in this minimization, so the last equation is just equal to
argmin_j(||c_j||^2 - 2 x^t c_j)
which is the expression you minimize in your code (the code maximizes its negation, 2 x^t c_j - ||c_j||^2, which gives the same labels).
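To see the equivalence on actual numbers, here is a small Python sketch that assigns labels both ways. The points and centroids are made up, the function names are hypothetical, and each centroid is stored as a row here (the MATLAB code stores them as columns of c).

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def assign_by_distance(x, centroids):
    """Textbook rule: index j minimizing ||x - c_j||^2."""
    return min(range(len(centroids)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centroids[j])))

def assign_by_score(x, centroids):
    """Rule used in the code: index j maximizing 2 * x' * c_j - ||c_j||^2."""
    return max(range(len(centroids)),
               key=lambda j: 2 * dot(x, centroids[j]) - dot(centroids[j], centroids[j]))

# Made-up 2D data and centroids, chosen so that no point is equidistant
# from two centroids.
points = [(0.0, 0.0), (1.0, 0.5), (4.0, 4.0), (5.0, 3.5), (-2.0, 3.0), (2.4, 2.1)]
centroids = [(0.5, 0.2), (4.5, 4.0), (-1.5, 2.5)]

labels_dist = [assign_by_distance(x, centroids) for x in points]
labels_score = [assign_by_score(x, centroids) for x in points]
print(labels_dist, labels_score)
```

The two rules differ only by the term ||x||^2, which is the same for every centroid, so it cannot change which j wins.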
I would like to compute the intersection point in R^3 of a surface with a line given by
p + alpha * n, where p is a spatial vector, n is another vector and alpha is a scalar to be determined.
The surface is given in analytical form by
f(x,y) = [x, y, z(x,y)], where z(x,y) can be an arbitrary nonlinear surface description.
I set up a linearization:
[n1 n2 n3 ] (d_alpha)= [p1 + alpha*n1 - x]
[-1 0 -dz(x,y)/dx] (d_x) = [p2 + alpha*n2 - y]
[ 0 -1 -dz(x,y)/dx] (d_y) = [p3 + alpha*n3 - z(x,y)]
and try to iterate from starting values for alpha, x and y.
However, I can't seem to get it to converge. Any idea where my mistake is?
Thanks in advance
You can write your equations as
x_line(a) = p1 + a * n1
y_line(a) = p2 + a * n2
z_line(a) = p3 + a * n3
z_surface(x, y) = fun(x, y)
Assuming that your problem has a unique solution, the height dz of the line above the surface along the z-direction, as a function of a, is then
dz(a) = z_line(a) - fun(x_line(a), y_line(a))
= p3 + a * n3 - fun(p1 + a * n1, p2 + a * n2)
To find the intersection of the line with the surface, you simply have to find the value of a for which dz is zero. This can be done in MATLAB using an anonymous function and fzero like so:
dz = @(a) p3 + a * n3 - fun(p1 + a * n1, p2 + a * n2);
a_intersect = fzero(dz, a0);
where a0 is some (arbitrary) starting guess for a.
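The same idea can be sketched in Python. Note the assumptions: a plain bisection stands in for fzero (it needs a bracketing interval instead of a single starting guess), and the surface z = x^2 + y^2 as well as p and n are made-up examples.

```python
import math

def surface(x, y):
    """Hypothetical surface height z(x, y): a paraboloid, for illustration."""
    return x * x + y * y

# Made-up line: point p and direction n.
p = (0.0, 0.0, 5.0)
n = (1.0, 0.0, -1.0)

def dz(a):
    """Height of the line above the surface at parameter a."""
    return p[2] + a * n[2] - surface(p[0] + a * n[0], p[1] + a * n[1])

def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Find a root of f in [lo, hi]; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0.0 or hi - lo < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid   # root lies in the upper half
    return 0.5 * (lo + hi)

# dz(0) = 5 > 0 and dz(5) = -25 < 0, so [0, 5] brackets the intersection.
a_intersect = bisect(dz, 0.0, 5.0)
point = tuple(p[i] + a_intersect * n[i] for i in range(3))
print(a_intersect, point)
```

For this example dz(a) = 5 - a - a^2, so the positive root can be checked by hand: a = (sqrt(21) - 1) / 2.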
You might want to read a bit about optical ray tracing; you can probably find some introductory university notes online. This is a pretty standard problem, e.g. for finding the intersection of an optical ray with a curved lens or a parabolic mirror.
The following diagram is a schematic of a lake, and the equation illustrates how to calculate the effective heat flux of a lake.
where S is a vector of surface fluxes, q is short wave radiation, h is depth of the mixed layer, and z is the depth of the lake. For example:
q0 = 400+(1-400).*rand(100,1); % This is the short wave radiation
kd = 0.8; % extinction coefficient
h = 10; % depth of the surface mixed layer
for i = 1:length(q0); % loop for calculating short wave radiation at depth h
qh(i) = q0(i).*exp(-kd*h); % here, qh is calculated according to the Lambert Beer law
end
given
dz = 0.5
and z varies from 0 (surface) to depth h in increments of dz i.e.
z = 0:dz:h
How would I calculate the last portion of this equation in MATLAB, i.e. how do I calculate q at each depth z between the surface and h, which is expressed here as an integral?
Apologies if this should be on another one of the stack overflow forums but it seems more related to programming than pure physics or maths questions.
To integrate this correctly, you will need to compute all values of q(z) in the range [0, h]. If q0 and qh are N-by-1 column vectors, this means that q should be an N-by-M matrix, where M is the number of sample points in the range [0, h].
First, let's define z properly:
z = linspace(0, h, 200); %// M=200, but it's an arbitrary number to your choosing
The computation of q can be reduced to:
q = q0 * exp(-kd * z);
and qh is then simply the final column of q, i.e. q(:, end).
The integral itself can be approximated by a sum and computed using sum:
dz = z(2) - z(1);
I = sum(q, 2) * dz;
P.S.
Since q(z) = q0 * exp(-kd * z), it's simple enough to compute the integral analytically:
I = q0 * (1 - exp(-kd * h)) / kd;
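As a quick cross-check of the two approaches, the Python sketch below compares a numerical integral against the closed form. It uses the trapezoidal rule rather than the plain sum shown above, and the three q0 values and the number of sample points M are arbitrary choices for illustration.

```python
import math

kd = 0.8    # extinction coefficient
h = 10.0    # depth of the surface mixed layer
q0 = [400.0, 250.0, 1.0]   # a few made-up surface radiation values

# Sample the depth range [0, h] with M evenly spaced points.
M = 2001
dz = h / (M - 1)
z = [k * dz for k in range(M)]

def trapz(values, step):
    """Trapezoidal rule on evenly spaced samples."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))

# Numerical integral of q(z) = q0 * exp(-kd * z) over [0, h], per q0 value.
numeric = [trapz([q * math.exp(-kd * zk) for zk in z], dz) for q in q0]

# Closed form: q0 * (1 - exp(-kd * h)) / kd.
analytic = [q * (1.0 - math.exp(-kd * h)) / kd for q in q0]

for num, ana in zip(numeric, analytic):
    print(num, ana)
```

The two columns should agree to several significant digits; the gap shrinks as M grows.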
I have an equation of the type c = Ax + By where c, x and y are vectors of dimensions say 50,000 X 1, and A and B are matrices with dimensions 50,000 X 50,000.
Is there any way in Matlab to find matrices A and B when c, x and y are known?
I have about 100,000 samples of c, x, and y. A and B remain the same for all.
Let X be the collection of all 100,000 xs you have (such that the i-th column of X equals the vector x_i).
In the same manner we can define Y and C as 2D collections of ys and cs respectively.
What you wish to solve is for A and B such that
C = AX + BY
You have 2 * 50,000^2 unknowns (all entries of A and B) and numel(C) equations.
So, if the number of data vectors you have is 100,000, you have a single solution (provided the samples are linearly independent). If you have more than 100,000 samples, you may seek a least-squares solution.
Re-writing:
C = [A B] * [X ; Y] ==> [X' Y'] * [A';B'] = C'
So, I suppose
[A' ; B'] = pinv( [X' Y'] ) * C'
In matlab:
ABt = pinv( [X' Y'] ) * C';
A = ABt(1:50000,:)';
B = ABt(50001:end,:)';
Correct me if I'm wrong...
EDIT:
It seems like there is quite a fuss around dimensionality here. So, I'll try and make it as clear as possible.
Model: There are two (unknown) matrices A and B, each of size 50,000x50,000 (total 5e9 unknowns).
An observation is a triplet of vectors: (x,y,c) each such vector has 50,000 elements (total of 150,000 observed points at each sample). The underlying model assumption is that an observation is generated by c = Ax + By in this model.
The task: given n observations (that is n triplets of vectors { (x_i, y_i, c_i) }_i=1..n) the task is to uncover A and B.
Now, each sample (x_i,y_i,c_i) induces 50,000 equations of the form c_i = Ax_i + By_i in the unknowns A and B. If the number of samples n is greater than 100,000, then there are more than 50,000 * 100,000 ( > 5e9 ) equations and the system is over-constrained.
To write the system in a matrix form I proposed to stack all observations into matrices:
A matrix X of size 50,000 x n with its i-th column equal to the observed x_i
A matrix Y of size 50,000 x n with its i-th column equal to the observed y_i
A matrix C of size 50,000 x n with its i-th column equal to the observed c_i
With these matrices we can write the model as:
C = A*X + B*Y
I hope this clears things up a bit.
Thank you @Dan and @woodchips for your interest and enlightening comments.
EDIT (2):
Running the following code in Octave. In this example, instead of dimension 50,000 I work with only 2, and instead of n=100,000 observations I settle for n=100:
n = 100;
A = rand(2,2);
B = rand(2,2);
X = rand(2,n);
Y = rand(2,n);
C = A*X + B*Y + .001*randn(size(X)); % adding noise to observations
ABt = pinv( [ X' Y'] ) * C';
Checking the difference between ground truth model (A and B) and recovered ABt:
ABt - [A' ; B']
Yields
ans =
5.8457e-05 3.0483e-04
1.1023e-04 6.1842e-05
-1.2277e-04 -3.2866e-04
-3.1930e-05 -5.2149e-05
Which is close enough to zero. (remember, the observations were noisy and solution is a least-square one).