I am trying to understand the normalization and "unnormalization" steps in the direct least squares ellipse fitting algorithm developed by Fitzgibbon, Pilu and Fisher (improved by Halir and Flusser).
EDITED: Added more details about the theory. Is the eigenvalue problem where the confusion stems from?
Short theory:
The ellipse is represented by an implicit second order polynomial (general conic equation):
F(x,y) = a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0
where:
a = [a b c d e f]' is the coefficient vector and x = [x^2 x*y y^2 x y 1]' is a data point, so that F(x) = x'*a.
To constrain this general conic to an ellipse, the coefficients must satisfy the quadratic constraint:
4*a*c - b^2 = 1
which is equivalent to:
a'*C*a = 1
where C is a 6x6 matrix of zeros except C(1,3) = C(3,1) = 2 and C(2,2) = -1.
The design matrix D is composed of all data points x_i, one row [x_i^2 x_i*y_i y_i^2 x_i y_i 1] per point.
The minimization of the algebraic distance between a conic and the data points can be expressed by a generalized eigenvalue problem (some theory has been omitted):
minimize ||D*a||^2 subject to a'*C*a = 1
Denoting the scatter matrix:
S = D'*D
We now have the system:
S*a = lambda*C*a, with a'*C*a = 1
If we solve this system, the eigenvector corresponding to the single positive eigenvalue is the correct answer. (Note that the authors' code below builds C with the opposite sign, which is why it searches for a negative eigenvalue instead.)
The code:
The code snippets here are directly from the MATLAB code provided by the authors:
http://research.microsoft.com/en-us/um/people/awf/ellipse/fitellipse.html
The data input is a series of (x,y) points. The points are normalized by subtracting the mean and dividing by the standard deviation (in this case, computed as half the range). I assume this normalization improves the numerical conditioning of the fit.
% normalize data
% X and Y are the vectors of data points, not normalized
mx = mean(X);
my = mean(Y);
sx = (max(X)-min(X))/2;
sy = (max(Y)-min(Y))/2;
x = (X-mx)/sx;
y = (Y-my)/sy;
% Build design matrix
D = [ x.*x x.*y y.*y x y ones(size(x)) ];
% Build scatter matrix
S = D'*D;
% Build 6x6 constraint matrix
C(6,6) = 0; C(1,3) = -2; C(2,2) = 1; C(3,1) = -2; % C(6,6)=0 allocates the 6x6; note this C is the negative of the paper's constraint matrix
[gevec, geval] = eig(S,C);
% Find the negative eigenvalue
I = find(real(diag(geval)) < 1e-8 & ~isinf(diag(geval)));
% Extract eigenvector corresponding to negative eigenvalue
A = real(gevec(:,I));
After this, the normalization is reversed on the coefficients:
par = [
A(1)*sy*sy, ...
A(2)*sx*sy, ...
A(3)*sx*sx, ...
-2*A(1)*sy*sy*mx - A(2)*sx*sy*my + A(4)*sx*sy*sy, ...
-A(2)*sx*sy*mx - 2*A(3)*sx*sx*my + A(5)*sx*sx*sy, ...
A(1)*sy*sy*mx*mx + A(2)*sx*sy*mx*my + A(3)*sx*sx*my*my ...
- A(4)*sx*sy*sy*mx - A(5)*sx*sx*sy*my ...
+ A(6)*sx*sx*sy*sy ...
]';
At this point, I'm not sure what happened. Why is the unnormalization of the last three coefficients of A (d, e, f) dependent on the first three coefficients? How do you mathematically show where these unnormalization equations come from?
The 2 and 1 coefficients in the unnormalization lead me to believe the constraint matrix must be involved somehow.
Please let me know if more detail is needed on the method... it seems I'm missing how the normalization has propagated through the matrices and the eigenvalue problem.
Any help is appreciated. Thanks!
First, let me formalize the problem in a homogeneous space (as used in Richard Hartley and Andrew Zisserman's book Multiple View Geometry):
Assume that,
P=[X,Y,1]'
is our point in the unnormalized space, and
p=lambda*[x,y,1]'
is our point in the normalized space, where lambda is an unimportant free scale (in homogeneous space, [x,y,1] is equivalent to [10*x,10*y,10], and so on).
Now it is clear that we can write
x = (X-mx)/sx;
y = (Y-my)/sy;
as a simple matrix equation like:
p=H*P; %(equation (1))
where
H=[1/sx, 0, -mx/sx;
0, 1/sy, -my/sy;
0, 0, 1];
Also we know that an ellipse with the equation
A(1)*x^2 + A(2)*xy + A(3)*y^2 + A(4)*x + A(5)*y + A(6) = 0 %(first representation)
can be written in matrix form as:
p'*C*p=0 %you can easily verify this by matrix multiplication
where
C=[A(1), A(2)/2, A(4)/2;
A(2)/2, A(3), A(5)/2;
A(4)/2, A(5)/2, A(6)]; %second representation
and
p=[x,y,1]
and it is clear that these two representations of an ellipse are exactly the same and equivalent.
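If you'd rather not verify the multiplication by hand, here is a small symbolic sketch (my own addition, requires the Symbolic Math Toolbox) confirming the two representations agree:
syms x y a b c d e f
p = [x; y; 1];
C = [a, b/2, d/2; b/2, c, e/2; d/2, e/2, f];
% simplifies to 0, confirming p'*C*p reproduces the polynomial form
simplify(expand(p.'*C*p) - (a*x^2 + b*x*y + c*y^2 + d*x + e*y + f))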
Also we know that the vector A=[A(1),A(2),A(3),A(4),A(5),A(6)] is a type-1 representation of the ellipse in the normalized space.
So we can write:
p'*C*p=0
where p is the normalized point and C is as defined previously.
Now we can use the "equation (1): p=HP" to derive some good result:
(H*P)'*C*(H*P)=0
=====>
P'*H'*C*H*P=0
=====>
P'*(H'*C*H)*P=0
=====>
P'*(C1)*P=0 %(equation (2))
We see that equation (2) is an equation of an ellipse in the unnormalized space, where C1 is the type-2 representation of the ellipse, and we know that:
C1=H'*C*H
And also, because equation (2) is a zero equation, we can multiply it by any non-zero number. So we multiply it by sx^2*sy^2 and can write:
C1=sx^2*sy^2*H'*C*H
And finally we get the result
C1=[ A(1)*sy^2, (A(2)*sx*sy)/2, (A(4)*sx*sy^2)/2 - A(1)*mx*sy^2 - (A(2)*my*sx*sy)/2;
(A(2)*sx*sy)/2, A(3)*sx^2, (A(5)*sx^2*sy)/2 - A(3)*my*sx^2 - (A(2)*mx*sx*sy)/2;
-(- (A(4)*sx^2*sy^2)/2 + (A(2)*my*sx^2*sy)/2 + A(1)*mx*sx*sy^2)/sx, -(- (A(5)*sx^2*sy^2)/2 + A(3)*my*sx^2*sy + (A(2)*mx*sx*sy^2)/2)/sy, (mx*(- (A(4)*sx^2*sy^2)/2 + (A(2)*my*sx^2*sy)/2 + A(1)*mx*sx*sy^2))/sx + (my*(- (A(5)*sx^2*sy^2)/2 + A(3)*my*sx^2*sy + (A(2)*mx*sx*sy^2)/2))/sy + A(6)*sx^2*sy^2 - (A(4)*mx*sx*sy^2)/2 - (A(5)*my*sx^2*sy)/2]
which can be converted back to the type-1 (coefficient vector) representation, giving exactly the result we were looking for:
[ A(1)*sy^2, A(2)*sx*sy, A(3)*sx^2, A(4)*sx*sy^2 - 2*A(1)*mx*sy^2 - A(2)*my*sx*sy, A(5)*sx^2*sy - 2*A(3)*my*sx^2 - A(2)*mx*sx*sy, A(2)*mx*my*sx*sy + A(1)*mx^2*sy^2 + A(3)*my^2*sx^2 + A(6)*sx^2*sy^2 - A(4)*mx*sx*sy^2 - A(5)*my*sx^2*sy]
If you are curious how I managed to calculate these time-consuming equations, here is the MATLAB code that does it for you:
syms sx sy mx my
syms a b c d e f
C=[a, b/2, d/2;
b/2, c, e/2;
d/2, e/2, f];
H=[1/sx, 0, -mx/sx;
0, 1/sy, -my/sy;
0, 0, 1];
C1=sx^2*sy^2*H.'*C*H
% read off the type-1 (coefficient vector) representation from C1
par=[C1(1,1), 2*C1(1,2), C1(2,2), 2*C1(1,3), 2*C1(2,3), C1(3,3)]
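As a quick numeric sanity check (my own addition), you can take an arbitrary normalized-space conic A and normalization parameters, un-normalize with the formula above, and verify that the two conics agree on a test point up to the sx^2*sy^2 scale:
A  = [1; 0.2; 1.5; 0.3; -0.4; -1];   % arbitrary normalized-space conic
mx = 2; my = -1; sx = 3; sy = 0.5;   % arbitrary normalization parameters
par = [A(1)*sy*sy, A(2)*sx*sy, A(3)*sx*sx, ...
       -2*A(1)*sy*sy*mx - A(2)*sx*sy*my + A(4)*sx*sy*sy, ...
       -A(2)*sx*sy*mx - 2*A(3)*sx*sx*my + A(5)*sx*sx*sy, ...
       A(1)*sy*sy*mx*mx + A(2)*sx*sy*mx*my + A(3)*sx*sx*my*my ...
       - A(4)*sx*sy*sy*mx - A(5)*sx*sx*sy*my + A(6)*sx*sx*sy*sy]';
X = 1.7; Y = 0.3;                    % arbitrary unnormalized test point
x = (X - mx)/sx; y = (Y - my)/sy;    % its normalized image
v1 = A'   * [x*x; x*y; y*y; x; y; 1];   % conic value, normalized space
v2 = par' * [X*X; X*Y; Y*Y; X; Y; 1];   % conic value, unnormalized space
disp(v2 - sx^2*sy^2*v1)              % ~0: the two conics match up to scale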
I am pretty new to MATLAB computation and would appreciate any help on this. Firstly, I am trying to integrate a cosine function using the Maclaurin expansion [cos(z) = 1 - z^2/2! + z^4/4! - ...], let's say plotted over one cycle from 0 to 2π. This first integral (as seen in Figure 1) would serve as a "reference" or "contour" for what I want to do next.
Figure 1 - "The first integral representation"
Now, my next problem comes from writing cos(z) in MATLAB in terms of an integral in the complex plane.
Figure 2 - "showing cos(z) as an integral in the complex plane"
where I could choose an equi-sampled set of points s_n along the contour, from s_L to s_U, separated by ∆s. This is Euler's method.
I am trying to write code to numerically approximate the integral along a suitable contour and plot the approximation against the exact value of cos(z). I would like to do this twice - once for z ∈ [0, 6π] and once for complex valued z in the range z ∈ [0 + i, 6π + i]. Then I want to plot both the real and imaginary parts of the computed cos(z).
I am aware of the steps I am looking to implement which I'll bullet-point here.
Choose γ, s_L, s_U, N.
Step through z from z_lower to z_upper (use a different number of steps, other than N, for this).
For each value of z, compute cos(z) by stepping along the contour in N discrete steps from s_L to s_U.
For each value of s_n along the contour, compute the integrand e^(s_n - z^2/(4*s_n))/sqrt(s_n) and add it to the rolling sum. [I have attached Figure 3 showing the formula for the integrand if it's not clear!] Figure 3 - "The exponential integrand I am looking to compute"
Now I will show what I have attempted in MATLAB!
% "Contour Integral Representation of Cosine"
% making 'z' a possible number such as pi
N = 10000000; % example number - meaning sample of steps
z_lower = 0;
z_upper = 6*pi;
%==========================%
z1 = linspace(z_lower,z_upper,N);
y = 1;
Sl = y - 10*1i;
sum = 0.0;
%==========================%
for z = linspace(z_lower,z_upper,N)
for Sn = linspace(Sl,Su,N)
sum = sum + ((exp(Sn) - (z.^2/4*Sn))/sqrt(Sn))*ds;
end
end
sum = sum*(sqrt(pi)/2*pi*1i);
plot(Sn,sum)
Edit 1: This figure shows what I am expecting - the numerical method will not be exactly the same as the "symbolic" integration, let's say. In Figure 4, the black cosine wave is the same as in Figure 1 and the blue is the numerical integration method. Figure 4 - "End result of what I expect to plot"
I have two images, one of which is the original image and the second is the transformed image.
I have to find out by how many degrees the transformed image was rotated, using a 3x3 transformation matrix. I also need to find how far it was translated from the origin.
Both images are grayscale and held in matrix variables. Their sizes are the same, [350 500].
I have found a few lecture notes like this.
Lecture notes say that I should use the following matrix formula for rotation:
[x']   [cos(theta) -sin(theta) 0]   [x]
[y'] = [sin(theta)  cos(theta) 0] * [y]
[1 ]   [    0           0      1]   [1]
For the translation matrix the formula is given:
[x']   [1 0 tx]   [x]
[y'] = [0 1 ty] * [y]
[1 ]   [0 0 1 ]   [1]
Everything is good. But there are two problems:
I could not figure out how to implement the formulas using MATLAB.
The formulas are set up to find the x', y' values, but I already have the x, x', y, y' values. I need to find the rotation angle (theta) and tx and ty.
I want to know the equivalence of x, x', y, y' in the matrix.
I have got the following code:
rotationMatrix = [ cos(theta) sin(theta) 0 ; ...
-sin(theta) cos(theta) 0 ; ...
0 0 1];
translationMatrix = [ 1 0 tx; ...
0 1 ty; ...
0 0 1];
But as you can see, tx, ty, theta variables are not defined before used. How can I calculate theta, tx and ty?
PS: It is forbidden to use Image Processing Toolbox functions.
This is essentially a homography recovery problem. What you are doing is given co-ordinates in one image and the corresponding co-ordinates in the other image, you are trying to recover the combined translation and rotation matrix that was used to warp the points from the one image to the other.
You can essentially combine the rotation and translation into a single matrix by multiplying the two matrices together. Multiplying simply composites the two operations. You would thus get:
H = [cos(theta) -sin(theta) tx]
[sin(theta) cos(theta) ty]
[ 0 0 1]
The idea behind this is to find the parameters by minimizing the error through least squares between each pair of points.
Basically, what you want to find is the following relationship:
xi_after = H*xi_before
H is the combined rotation and translation matrix required to map the co-ordinates from the one image to the other. H is also a 3 x 3 matrix, and knowing that the lower right entry (row 3, column 3) is 1, it makes things easier. Also, assuming that your points are in the augmented co-ordinate system, we essentially want to find this relationship for each pair of co-ordinates from the first image (x_i, y_i) to the other (x_i', y_i'):
[p_i*x_i'] [h11 h12 h13] [x_i]
[p_i*y_i'] = [h21 h22 h23] * [y_i]
[ p_i ] [h31 h32 1 ] [ 1 ]
The scale of p_i is to account for homography scaling and vanishing points. Let's perform a matrix-vector multiplication of this equation. We can ignore the 3rd element as it isn't useful to us (for now):
p_i*x_i' = h11*x_i + h12*y_i + h13
p_i*y_i' = h21*x_i + h22*y_i + h23
Now let's take a look at the 3rd element. We know that p_i = h31*x_i + h32*y_i + 1. As such, substituting p_i into each of the equations, and rearranging to solve for x_i' and y_i', we thus get:
x_i' = h11*x_i + h12*y_i + h13 - h31*x_i*x_i' - h32*y_i*x_i'
y_i' = h21*x_i + h22*y_i + h23 - h31*x_i*y_i' - h32*y_i*y_i'
What you have here now are two equations for each unique pair of points. What we can do now is build an over-determined system of equations. Take each pair and build two equations out of them. You will then put it into matrix form, i.e.:
Ah = b
A would be a matrix of coefficients that were built from each set of equations using the co-ordinates from the first image, b would be each pair of points for the second image and h would be the parameters you are solving for. Ultimately, you are solving this linear system of equations reformulated in matrix form, with two rows per point pair:
[x_i y_i 1  0   0  0 -x_i*x_i' -y_i*x_i'] * [h11 h12 h13 h21 h22 h23 h31 h32]' = x_i'
[ 0   0  0 x_i y_i 1 -x_i*y_i' -y_i*y_i'] * [h11 h12 h13 h21 h22 h23 h31 h32]' = y_i'
You would solve for the vector h, which can be done through least squares. In MATLAB, you can do this via:
h = A \ b;
A sidenote for you: If the movement between images is truly just a rotation and translation, then h31 and h32 will both be zero after we solve for the parameters. However, I always like to be thorough and so I will solve for h31 and h32 anyway.
NB: This method will only work if you have at least 4 unique pairs of points. Because there are 8 parameters to solve for and each point contributes 2 equations, A must have a rank of 8 for the system to be solvable (to throw some linear algebra terminology into the mix). You will not be able to solve this problem if you have fewer than 4 points.
If you want some MATLAB code, let's assume that your points are stored in sourcePoints and targetPoints. sourcePoints are from the first image and targetPoints are for the second image. Obviously, there should be the same number of points between both images. It is assumed that both sourcePoints and targetPoints are stored as M x 2 matrices. The first columns contain your x co-ordinates while the second columns contain your y co-ordinates.
numPoints = size(sourcePoints, 1);
%// Cast data to double to be sure
sourcePoints = double(sourcePoints);
targetPoints = double(targetPoints);
%//Extract relevant data
xSource = sourcePoints(:,1);
ySource = sourcePoints(:,2);
xTarget = targetPoints(:,1);
yTarget = targetPoints(:,2);
%//Create helper vectors
vec0 = zeros(numPoints, 1);
vec1 = ones(numPoints, 1);
xSourcexTarget = -xSource.*xTarget;
ySourcexTarget = -ySource.*xTarget;
xSourceyTarget = -xSource.*yTarget;
ySourceyTarget = -ySource.*yTarget;
%//Build matrix
A = [xSource ySource vec1 vec0 vec0 vec0 xSourcexTarget ySourcexTarget; ...
vec0 vec0 vec0 xSource ySource vec1 xSourceyTarget ySourceyTarget];
%//Build RHS vector
b = [xTarget; yTarget];
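%// (my addition) optional sanity check: A must have full column rank (8),
%// which requires at least 4 unique point pairs in general position
assert(rank(A) == 8, 'Need at least 4 unique point pairs');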
%//Solve homography by least squares
h = A \ b;
%// Reshape to a 3 x 3 matrix (optional)
%// Must transpose as reshape is performed
%// in column major format
h(9) = 1; %// Add in that h33 is 1 before we reshape
hmatrix = reshape(h, 3, 3)';
Once you are finished, you have a combined rotation and translation matrix. If you want the x and y translations, simply pick off column 3, rows 1 and 2 in hmatrix. However, we can also work with the vector h itself: h13 is element 3 and h23 is element 6. If you want the angle of rotation, simply apply the appropriate inverse trigonometric function to rows 1, 2 and columns 1, 2. For the h vector, these are elements 1, 2, 4 and 5. There will be a bit of inconsistency depending on which elements you choose, as this was solved by least squares. One way to get a good overall angle is to find the angles from all 4 elements and then average them. Either way, this is a good starting point.
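For instance, here is a minimal sketch (my own addition, not from the course slides) of pulling out the translation and one averaged rotation angle, assuming the warp really was rotation plus translation so that h31 and h32 come out near zero:
tx = h(3); %// h13
ty = h(6); %// h23
%// h(1) = h(5) = cos(theta), h(4) = sin(theta), h(2) = -sin(theta), so
%// average the sine and cosine estimates before taking the inverse tangent
theta = atan2(h(4) - h(2), h(1) + h(5));
thetaInDegrees = theta * 180/pi;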
References
I learned about homography a while ago through Leow Wee Kheng's Computer Vision course. What I have told you is based on his slides: http://www.comp.nus.edu.sg/~cs4243/lecture/camera.pdf. Take a look at slides 30-32 if you want to know where I pulled this material from. However, the MATLAB code I wrote myself :)
My approach
fun = @(y) (1/sqrt(pi))*exp(-(y-1).^2).*log(1 + exp(-4*y))
integral(fun,-Inf,Inf)
This gives NaN.
So I tried plotting it.
y= -10:0.1:10;
plot(y,exp(-(y-1).^2).*log(1 + exp(-4*y)))
Then I understood that the significant part of the domain is from -4 to +4.
So I changed the limits to
integral(fun,-10,10)
However, I do not want to always have to plot the graph to find its limits. Is there any way to compute the integral directly from -Inf to Inf?
Discussion
If your integrals are always of the form
∫_{-∞}^{∞} e^(-x^2) f(x) dx
I would use a high-order Gauss–Hermite quadrature rule.
It's similar to the Gauss–Kronrod rule that forms the basis for quadgk, but is specifically tailored for integrals over the real line with a standard Gaussian multiplier.
Rewriting your equation with the substitution x = y-1, we get
(1/√π) ∫_{-∞}^{∞} e^(-x^2) log(1 + e^(-4(x+1))) dx.
The integral can then be computed using the Gauss-Hermite rule of arbitrary order (within reason):
>> order = 10;
>> [nodes,weights] = GaussHermiteRule(order);
>> f = @(x) log(1 + exp(-4*(x+1)))/sqrt(pi);
>> sum(f(nodes).*weights)
ans =
0.1933
I'd note that the function below builds a full order x order matrix to compute nodes, so it shouldn't be made too large.
There is a way to avoid this by explicitly computing the weights, but I decided to be lazy.
Besides, even at order 100, the Gaussian multiplier is about 2E-98, so the integrand's contribution there is extremely minimal.
And while this isn't inherently adaptive, a high-order rule should be sufficient in most cases ... I hope.
Code
function [nodes,weights] = GaussHermiteRule(n)
% ------------------------------------------------------------------------------
% Find the nodes and weights for a Gauss-Hermite Quadrature integration.
%
if (n < 1) || (abs(n - round(n)) > eps())
error('Given order ''n'' must be a strictly positive integer.');
else
n = round(n);
end
% Get the nodes and weights from the Golub-Welsch function
n = (0:n)' ;
b = n*0 ;
a = b + 0.5 ;
c = n ;
[nodes,weights] = GolubWelsch(a,b,c,sqrt(pi));
end
function [xk,wk] = GolubWelsch(ak,bk,ck,mu0)
%GolubWelsch
% Calculate the approximate* nodes and weights (normalized to 1) of an orthogonal
% polynomial family defined by a three-term recurrence relation of the form
%    x p_k(x) = a_k p_{k+1}(x) + b_k p_k(x) + c_k p_{k-1}(x)
%
% The weight scale factor mu0 is the integral of the weight function over the
% orthogonal domain.
%
% Calculate the terms for the orthonormal version of the polynomials
alpha = sqrt(ak(1:end-1) .* ck(2:end));
% Build the symmetric tridiagonal matrix
T = full(spdiags([[alpha;0],bk,[0;alpha]],[-1,0,+1],length(alpha),length(alpha)));
% Calculate the eigenvectors and values of the matrix
[V,xk] = eig(T,'vector');
% Calculate the weights from the eigenvectors - technically, Golub-Welsch requires
% a normalization, but since MATLAB returns unit eigenvectors, it is omitted.
wk = mu0*(V(1,:).^2)';
end
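As a quick cross-check of the result above (my own addition), MATLAB's adaptive quadrature on a truncated domain should agree with the Gauss-Hermite value:
fun = @(y) (1/sqrt(pi))*exp(-(y-1).^2).*log(1 + exp(-4*y));
integral(fun, -10, 10)   % compare with the 0.1933 obtained above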
I've had success transforming such infinite-domain integrals using a numerical variable transformation, as explained in Numerical Recipes 3e, section 4.5.3. Basically, you substitute in y = c*tan(t) + b and then numerically integrate over t in (-pi/2, pi/2), which sweeps y from -infinity to infinity. You can tune the values of c and b to optimize the process. This approach largely dodges the question of trying to determine cutoffs in the domain, but for this to work reliably using quadrature you have to know that the integrand does not have features far from y = b.
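For illustration, here is a minimal sketch of that substitution applied to the integrand from the question (my own code, not from Numerical Recipes); the log term is rewritten with log1p so it cannot overflow for large negative y:
% numerically stable form of (1/sqrt(pi))*exp(-(y-1).^2).*log(1 + exp(-4*y))
fun = @(y) (1/sqrt(pi))*exp(-(y-1).^2) ...
           .*(max(-4*y, 0) + log1p(exp(-abs(4*y))));
b = 1;  % roughly centers the transform on the integrand's peak
c = 1;  % width scale; tune b and c to the integrand
g = @(t) fun(c*tan(t) + b).*c.*sec(t).^2;  % y = c*tan(t) + b, dy = c*sec(t)^2 dt
integral(g, -pi/2, pi/2)  % should match integral(fun, -Inf, Inf)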
A quick and dirty solution would be to look for a position where your function is sufficiently small and then take it as the limit. This assumes that for x>0 the function fun decreases monotonically, and that fun(x) is roughly the same size as fun(-x) for all x.
%// A small number
epsilon = eps;
%// Stepsize for searching bound
stepTest = 1;
%// Starting position for searching bound
position = 0;
%// Not yet small enough
smallEnough = false;
%// Search bound
while ~smallEnough
smallEnough = (fun(position) < epsilon);
position = position + stepTest;
end
%// Calculate integral
integral(fun, -position, position)
If you were happy with plotting the function and deciding by eye where to cut, then this code will suffice, I guess.
I am using svmlib to classify linearly two dimensional non-separable data. I am able to train the svm and obtain w and b using svmlib. Using this information I can plot the decision boundary, along with the support vectors, but I am not sure about how to plot the margins, using the information that svmlib gives me.
Below is my code:
model = svmtrain(Y,X, '-s 0 -t 0 -c 100');
w = model.SVs' * model.sv_coef;
b = -model.rho;
if (model.Label(1) == -1)
w = -w; b = -b;
end
y_hat = sign(w'*X' + b);
sv = full(model.SVs);
% plot support vectors
plot(sv(:,1),sv(:,2),'ko', 'MarkerSize', 10);
% plot decision boundary
plot_x = linspace(min(X(:,1)), max(X(:,1)), 30);
plot_y = (-1/w(2))*(w(1)*plot_x + b);
plot(plot_x, plot_y, 'k-', 'LineWidth', 1)
It depends on what you mean by "the margins". It also depends on what SVM version you are talking about (separable or non-separable), but since you mentioned libsvm I'll assume you mean the more general, non-separable version.
The term "margin" can refer to the Euclidean distance from the separating hyperplane to the hyperplane defined by wx+b=1 (or wx+b=-1). This distance is given by 1/norm(w).
"Margin" can also refer to the margin of a specific sample x, which is the Euclidean distance of x from the separating hyperplane. It is given by
(wx+b)/norm(w)
Note that this is a signed distance; it is negative or positive depending on which side of the hyperplane the point x resides on. You can draw it as a line from the point, perpendicular to the hyperplane.
Another interesting value is the slack variable xi, which is the "algebraic" distance (not Euclidean) of a support vector from the "hard" margin defined by wx+b=+1 (or -1). It is positive only for support vectors, and if a point is not a support vector, its xi equals 0. More compactly:
xi = max(0, 1 - y*(w'*x+b))
where y is the label.
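With the variables from your code, this can be computed for all training points at once (my own sketch; X is the n-by-2 data matrix and Y the n-by-1 label vector):
xi = max(0, 1 - Y .* (X*w + b));   % n-by-1 slacks, nonzero only for support vectors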
I am trying to write an algorithm which determines $\mu$, $\sigma$, $\pi$ for each class of a mixture of multivariate normal distributions.
I have partially finished the algorithm; it works when I set the initial guess values ($\mu$, $\sigma$, $\pi$) near the real values. But when I set the values far from the real ones, the algorithm does not converge: the sigma goes to 0 $(2.30760684053766e-24, 2.30760684053766e-24)$.
I think the problem is my covariance calculation; I am not sure that this is the right way. I found it on Wikipedia.
I would be grateful if you could check my algorithm, especially the covariance part.
Have a nice day,
Thanks,
Two-component Gaussian mixture
size(x) = [400, 2] (400 points, 2-dimensional Gaussians)
mu is 2 x 2 (row 1 = first Gaussian's mu, row 2 = second Gaussian's mu)
for i = 1 : k
gaussEvaluation(i,:) = pInit(i) * mvnpdf(x,muInit(i,:), sigmaInit(i, :) * eye(d));
gaussEvaluationSum = sum(gaussEvaluation(i, :));
%mu calculation
for j = 1 : d
mu(i, j) = sum(gaussEvaluation(i, :) * x(:, j)) / gaussEvaluationSum;
end
%sigma calculation methode 1
%for j = 1 : n
% v = (x(j, :) - muNew(i, :));
% sigmaNew(i) = sigmaNew(i) + gaussEvaluation(i,j) * (v * v');
%end
%sigmaNew(i) = sigmaNew(i) / gaussEvaluationSum;
%sigma calculation methode 2
sub = bsxfun(@minus, x, mu(i,:));
sigma(i,:) = sum(gaussEvaluation(i,:) * (sub .* sub)) / gaussEvaluationSum;
%p calculation
p(i) = gaussEvaluationSum / n;
Two points: you can observe this even when you implement Gaussian mixture EM correctly, but in your case the code does also seem to be incorrect.
First, this is just a problem that you have to deal with when fitting mixtures of gaussians. Sometimes one component of the mixture can collapse on to a single point, resulting in the mean of the component becoming that point and the variance becoming 0; this is known as a 'singularity'. Hence, the likelihood also goes to infinity.
Check out slide 42 of this deck: http://www.cs.ubbcluj.ro/~csatol/gep_tan/Bishop-CUED-2006.pdf
The likelihood function that you are evaluating is not log-concave, so the EM algorithm will not converge to the same parameters with different initial values. The link I gave above also gives some solutions to avoid this over-fitting problem, such as putting a prior or regularization term on your parameters. You can also consider running multiple times with different starting parameters and discarding any results with variance 0 components as having over-fitted, or just reduce the number of components you are using.
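For example, a rough sketch of the restart idea (my own addition; fitGMM is a hypothetical placeholder standing in for one full run of your EM loop):
bestLL = -Inf;
for trial = 1:10
    % fitGMM is a placeholder for one full EM run from random starting values
    [muT, sigmaT, pT, ll] = fitGMM(x, k);
    if ll > bestLL && all(sigmaT(:) > 1e-6)  % reject collapsed components
        bestLL = ll; mu = muT; sigma = sigmaT; p = pT;
    end
end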
In your case, your equation is right; the covariance update calculation on Wikipedia is the same as the one on slide 45 of the above link. However, if you are in a 2d space, for each component the mean should be a length 2 vector and the covariance should be a 2x2 matrix. Hence your code (for two components) is wrong: you have a 2x2 matrix to store the means, but also only a 2x2 matrix to store the covariances; the covariance store should be a 2x2x2 array (one 2x2 covariance matrix per component).
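A minimal sketch of that full-covariance update for component i (my own addition), storing one 2x2 matrix per component in a 2x2x2 array:
r = gaussEvaluation(i,:)';          % n-by-1 responsibilities
sub = bsxfun(@minus, x, mu(i,:));   % n-by-d centered data
% weighted outer-product sum, normalized by the total responsibility
sigma(:,:,i) = (sub' * bsxfun(@times, r, sub)) / gaussEvaluationSum;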