How to implement a soft-margin SVM model using Matlab's quadprog? - matlab

Suppose we are given a training dataset {yᵢ, xᵢ}, for i = 1, ..., n, where yᵢ can either be -1 or 1 and xᵢ can be e.g. a 2D or 3D point.
In general, when the input points are linearly separable, the SVM model can be defined as follows
min 1/2*||w||²
w,b
subject to the constraints (for i = 1, ..., n)
yᵢ*(w*xᵢ - b) >= 1
This is often called the hard-margin SVM model, which is thus a constrained minimization problem, where the unknowns are w and b. We can also omit 1/2 in the function to be minimized, given it's just a constant.
Now, the documentation about Matlab's quadprog states
x = quadprog(H, f, A, b) minimizes 1/2*x'*H*x + f'*x subject to the restrictions A*x ≤ b. A is a matrix of doubles, and b is a vector of doubles.
We can implement the hard-margin SVM model using quadprog function, to get the weight vector w, as follows
H becomes an identity matrix.
f' becomes a zeros matrix.
A is the left-hand side of the constraints
b is equal to -1 because the original constraint had >= 1, it becomes <= -1 when we multiply with -1 on both sides.
Now, I am trying to implement a soft-margin SVM model. The minimization equation here is
min (1/2)*||w||² + C*(∑ ζᵢ)
w,b
subject to the constraints (for i = 1, ..., n)
yᵢ*(w*xᵢ - b) >= 1 - ζᵢ
such that ζᵢ >= 0, where ∑ is the summation symbol, ζᵢ = max(0, 1 - yᵢ*(w*xᵢ - b)) and C is a hyper-parameter.
How can this optimization problem be solved using the Matlab's quadprog function? It's not clear to me how the equation should be mapped to the parameters of the quadprog function.
The "primal" form of the soft-margin SVM model (i.e. the definition above) can be converted to a "dual" form. I did that, and I am able to get the Lagrange variable values (in the dual form). However, I would like to know if I can use quadprog to solve directly the primal form without needing to convert it to the dual form.

I don't see how it can be a problem. Let z be our vector of (2n + 1) variables:
z = (w, eps, b)
Then, H becomes diagonal matrix with first n values on the diagonal equal to 1 and the last n + 1 set to zero:
H = diag([ones(1, n), zeros(1, n + 1)])
Vector f can be expressed as:
f = [zeros(1, n), C * ones(1, n), 0]'
First set of constrains becomes:
Aineq = [A1, eye(n), zeros(n, 1)]
bineq = ones(n, 1)
where A1 is a the same matrix as in primal form.
Second set of constraints becomes lower bounds:
lb = (inf(n, 1), zeros(n, 1), inf(n, 1))
Then you can call MATLAB:
z = quadprog(H, f, Aineq, bineq, [], [], lb);
P.S. I can be mistaken in some small details, but the general idea is right.

I wanted to clarify #vharavy answer because you could get lost while trying to deduce what 'n' means in his code. Here is my version according to his answer and SVM wikipedia article. I assume we have a file named "test.dat" which holds coordinates of test points and their class membership in the last column.
Example content of "test.dat" with 3D points:
-3,-3,-2,-1
-1,3,2,1
5,4,1,1
1,1,1,1
-2,5,4,1
6,0,1,1
-5,-5,-3,-1
0,-6,1,-1
-7,-2,-2,-1
Here is the code:
data = readtable("test.dat");
tableSize = size(data);
numOfPoints = tableSize(1);
dimension = tableSize(2) - 1;
PointsCoords = data(:, 1:dimension);
PointsSide = data.(dimension+1);
C = 0.5; %can be changed
n = dimension;
m = numOfPoints; %can be also interpretet as number of constraints
%z = [w, eps, b]; number of variables in 'z' is equal to n + m + 1
H = diag([ones(1, n), zeros(1, m + 1)]);
f = [zeros(1, n), C * ones(1, m), 0];
Aineq = [-diag(PointsSide)*table2array(PointsCoords), -eye(m), PointsSide];
bineq = -ones(m, 1);
lb = [-inf(1, n), zeros(1, m), -inf];
z = quadprog(H, f, Aineq, bineq, [], [], lb);

If let z = (w; w0; eps)T be a the long vector with n+1+m elements.(m the number of points)
Then,
H= diag([ones(1,n),zeros(1,m+1)]).
f = [zeros(1; n + 1); ones(1;m)]
The inequality constraints can be specified as :
A = -diag(y)[X; ones(m; 1); zeroes(m;m)] -[zeros(m,n+1),eye(m)],
where X is the n x m input matrix in the primal form.Out of the 2 parts for A, the first part is for w0 and the second part is for eps.
b = ones(m,1)
The equality constraints :
Aeq = zeros(1,n+1 +m)
beq = 0
Bounds:
lb = [-inf*ones(n+1,1); zeros(m,1)]
ub = [inf*ones(n+1+m,1)]
Now, z=quadprog(H,f,A,b,Aeq,beq,lb,ub)

Complete code. The idea is the same as above.
n = size(X,1);
m = size(X,2);
H = diag([ones(1, m), zeros(1, n + 1)]);
f = [zeros(1,m+1) c*ones(1,n)]';
p = diag(Y) * X;
A = -[p Y eye(n)];
B = -ones(n,1);
lb = [-inf * ones(m+1,1) ;zeros(n,1)];
z = quadprog(H,f,A,B,[],[],lb);
w = z(1:m,:);
b = z(m+1:m+1,:);
eps = z(m+2:m+n+1,:);

Related

adaptive elliptical structuring element in MATLAB

I'm trying to create an adaptive elliptical structuring element for an image to dilate or erode it. I write this code but unfortunately all of the structuring elements are ones(2*M+1).
I = input('Enter the input image: ');
M = input('Enter the maximum allowed semi-major axes length: ');
% determining ellipse parameteres from eigen value decomposition of LST
row = size(I,1);
col = size(I,2);
SE = cell(row,col);
padI = padarray(I,[M M],'replicate','both');
padrow = size(padI,1);
padcol = size(padI,2);
for m = M+1:padrow-M
for n = M+1:padcol-M
a = (l2(m-M,n-M)+eps/l1(m-M,n-M)+l2(m-M,n-M)+2*eps)*M;
b = (l1(m-M,n-M)+eps/l1(m-M,n-M)+l2(m-M,n-M)+2*eps)*M;
if e1(m-M,n-M,1)==0
phi = pi/2;
else
phi = atan(e1(m-M,n-M,2)/e1(m-M,n-M,1));
end
% defining structuring element for each pixel of image
x0 = m;
y0 = n;
se = zeros(2*M+1);
row_se = 0;
for i = x0-M:x0+M
row_se = row_se+1;
col_se = 0;
for j = y0-M:y0+M
col_se = col_se+1;
x = j-y0;
y = x0-i;
if ((x*cos(phi)+y*sin(phi))^2)/a^2+((x*sin(phi)-y*cos(phi))^2)/b^2 <= 1
se(row_se,col_se) = 1;
end
end
end
SE{m-M,n-M} = se;
end
end
a, b and phi are semi-major and semi-minor axes length and phi is angle between a and x axis.
I used 2 MATLAB functions to compute the Local Structure Tensor of the image, and then its eigenvalues and eigenvectors for each pixel. These are the matrices l1, l2, e1 and e2.
This is the bit of your code I didn't understand:
a = (l2(m-M,n-M)+eps/l1(m-M,n-M)+l2(m-M,n-M)+2*eps)*M;
b = (l1(m-M,n-M)+eps/l1(m-M,n-M)+l2(m-M,n-M)+2*eps)*M;
I simplified the expression for b to (just removing the indexing):
b = (l1+eps/l1+l2+2*eps)*M;
For l1 and l2 in the normal range we get:
b =(approx)= (l1+0/l1+l2+2*0)*M = (l1+l2)*M;
Thus, b can easily be larger than M, which I don't think is your intention. The eps in this case also doesn't protect against division by zero, which is typically the purpose of adding eps: if l1 is zero, eps/l1 is Inf.
Looking at this expression, it seems to me that you intended this instead:
b = (l1+eps)/(l1+l2+2*eps)*M;
Here, you're adding eps to each of the eigenvalues, making them guaranteed non-zero (the structure tensor is symmetric, positive semi-definite). Then you're dividing l1 by the sum of eigenvalues, and multiplying by M, which leads to a value between 0 and M for each of the axes.
So, this seems to be a case of misplaced parenthesis.
Just for the record, this is what you need in your code:
a = (l2(m-M,n-M)+eps ) / ( l1(m-M,n-M)+l2(m-M,n-M)+2*eps)*M;
b = (l1(m-M,n-M)+eps ) / ( l1(m-M,n-M)+l2(m-M,n-M)+2*eps)*M;
^ ^
added parentheses
Note that you can simplify your code by defining, outside of the loops:
[se_x,se_y] = meshgrid(-M:M,-M:M);
The inner two loops, over i and j, to construct se can then be written simply as:
se = ((se_x.*cos(phi)+se_y.*sin(phi)).^2)./a.^2 + ...
((se_x.*sin(phi)-se_y.*cos(phi)).^2)./b.^2 <= 1;
(Note the .* and .^ operators, these do element-wise multiplication and power.)
A further slight improvement comes from realizing that phi is first computed from e1(m,n,1) and e1(m,n,2), and then used in calls to cos and sin. If we assume that the eigenvector is properly normalized, then
cos(phi) == e1(m,n,1)
sin(phi) == e1(m,n,2)
But you can always make sure they are normalized:
cos_phi = e1(m-M,n-M,1);
sin_phi = e1(m-M,n-M,2);
len = hypot(cos_phi,sin_phi);
cos_phi = cos_phi / len;
sin_phi = sin_phi / len;
se = ((se_x.*cos_phi+se_y.*sin_phi).^2)./a.^2 + ...
((se_x.*sin_phi-se_y.*cos_phi).^2)./b.^2 <= 1;
Considering trigonometric operations are fairly expensive, this should speed up your code a bit.

Matlab equivalent code of the following equation

I have an equation for which I am trying to write a code in Matlab, but I am not sure if my code is right. The equation is as follows:
Where I think the iteration is over the superscript i.e. k, k+1 etc and the dimensions are marked by subscripts m, n, n'. The notations are not well defined in the literature so I think this is how it should be.
My code segment for this equation is as follows:
c_n = [1,2,3,4]'; % c^(0)_n (nx1) vector
K = 50;
d = [0.5,0.9]';
for k = 1:1:K
c_n = c_n.*((sum(A_mn'*d/(sum(A_mn*c_n,2)),2))./sum(A_mn',2)) ;
end
Is this code correct for the above equation?. The summations in the equation are confusing me.
If A is a matrix with m rows and n columns, then the sum is just the sum of the nth row in AT. This is the same as the corresponding sum of the nth column in A: . The matrix multiplication it represents works out nicer with the transpose because matrix multiplications are just sums of weighted rows.
Similarly, is the mth row of A, weighted element-wise by by c.
We can assume that c and d are column vectors of size n and m, respectively. (d' will be represented by just plain d in the code). In that case, most of the operations can be reduced to matrix operations:
is just the matrix product A * c, which yields a column vector of size m.
then becomes the element-wise ratio d ./ (A * c), also of size m.
The ratio is used to scale the elements of the sum of AT in the numerator, making it the matrix product A.' * (d ./ (A * c)) of size n.
Each element of that is scaled by , which can be represented by either A.' * ones(m, 1) or sum(A, 1).', so the final matrix product is just c .* (A.' * (d ./ (A * c)) ./ sum(A, 1).').
You can pre-calculate sum(A, 1).', call it e to get the following:
c = [1; 2; 3; 4];
d = [0.5; 0.9];
A = ... some 2x4 matrix;
e = sum(A, 1).';
k = 50;
for i = 1 : k
c = c .* (A.' * (d ./ (A * c)) ./ e);
end
If you want to retain the intermediate values of c for each k, you can allocate a matrix of size n, k + 1 and fill that in with each column representing a new iteration of c:
c = zeros(4, 51);
c(:, 1) = [1; 2; 3; 4];
for i = 1 : k
c(:, k + 1) = c(:, k) .* (A.' * (d ./ (A * c(:, k))) ./ e);
end

MATLAB - vectorize iteration over two matrices used in function

I have two matrices X and Y, both of order mxn. I want to create a new matrix Z of order mx1 such that each i th entry in this new matrix is computed by applying a function to ith and ith row of X and Y respectively. In my case m = 100000 and n = 2. I tried using a loop but it takes forever.
for i = 1:m
Z = function(X(1,:),Y(1,:), constant_parameters)
end
Is there an efficient way to vectorize it?
EDIT 1
This is the function
function [peso] = fxPesoTexturaCN(a,b, img, r, L)
ac = num2cell(a);
bc = num2cell(b);
imgint1 = img(sub2ind(size(img),ac{:}));
imgint2 = img(sub2ind(size(img),bc{:}));
peso = (sum((a - b) .^ 2) + (r/L) * (imgint2 - imgint1)) / (2*r^2);
Where img, r, L are constats. a is X(1,:) and b is Y(1,:)
And the call of this function is
peso = bsxfun(#(a,b) fxPesoTexturaCN(a,b,img,r,L), a, b);

Matlab calculate the product of an expression

I'm basicaly trying to find the product of an expression that goes like this:
(x-(N-1)/2).....(x+(N-1)/2) for even value of N
x is a value that I will set at the beginning that changes too but that is a different problem...
let's say for the sake of argument that for now x is a constant (ex x=1)
example for N=6
(x-5/2)(x-3/2)(x-1/2)(x+1/2)(x+3/2)*(x+5/2)
the idea was to create a row vector every element of which is each individual term (P(1)=x-5/2) (P(2)=x-3/2)...etc and then calculate its product
N=6;
x=1;
P=ones(1,N);
for k=(-N-1)/2:(N-1)/2
for n=1:N
P(n)=(x-k);
end
end
y=prod(P);
instead this creates a vector that takes only the first value of the epxression and then
repeats the same value at each cell.
there is obviously a fundamental problem with my loop but I just can't see it.
So if anyone can help with that OR suggest a better way to calculate the product I would be grateful.
Use vectorized commands
Why use a loop when you can use vectorized commands like prod?
y = prod(2 * x + [-N + 1 : 2 : N - 1]) / 2;
For convenience, you may want to define an anonymous function for it:
f = #(N,x) reshape(prod(bsxfun(#plus, 2 * x(:), -N + 1 : 2 : N - 1) / 2, 2), size(x));
Note that the function is compatible with a (row or column) vector input x.
Tests in MATLAB's Command Window
>> f(6, [2,2]')
ans =
-14.7656
4.9219
-3.5156
4.9219
-14.7656
>> f(6, [2,2])
ans =
-14.7656 4.9219 -3.5156 4.9219 -14.7656
Benchmark
Here is a comparison of rayreng's approach versus mine. The former emerges as the clear winner... :'( ...at least as N increases.
Varying N, fixed x
Fixed N (= 10), vector x of varying length
Fixed N (= 100), vector x of varying length
Benchmark code
function benchmark
% varying N, fixed x
clear all
n = logspace(2,4,20)';
x = rand(1000,1);
tr = zeros(size(n));
tj = tr;
for k = 1 : numel(n)
% rayreng's approach (poly/polyval)
fr = #() rayreng(n(k), x);
tr(k) = timeit(fr);
% Jubobs's approach (prod/reshape/bsxfun)
fj = #() jubobs(n(k), x);
tj(k) = timeit(fj);
end
figure
hold on
plot(n, tr, 'bo')
plot(n, tj, 'ro')
hold off
xlabel('N')
ylabel('time (s)')
legend('rayreng', 'jubobs')
end
function y = jubobs(N,x)
y = reshape(prod(bsxfun(#plus,...
2 * x(:),...
-N + 1 : 2 : N - 1) / 2,...
2),...
size(x));
end
function y = rayreng(N, x)
p = poly(linspace(-(N-1)/2, (N-1)/2, N));
y = polyval(p, x);
end
function benchmark2
% fixed N, varying x
clear all
n = 100;
nx = round(logspace(2,4,20));
tr = zeros(size(n));
tj = tr;
for k = 1 : numel(nx)
disp(k)
x = rand(nx(k), 1);
% rayreng's approach (poly/polyval)
fr = #() rayreng(n, x);
tr(k) = timeit(fr);
% Jubobs's approach (prod/reshape/bsxfun)
fj = #() jubobs(n, x);
tj(k) = timeit(fj);
end
figure
hold on
plot(nx, tr, 'bo')
plot(nx, tj, 'ro')
hold off
xlabel('number of elements in vector x')
ylabel('time (s)')
legend('rayreng', 'jubobs')
title(['n = ' num2str(n)])
end
function y = jubobs(N,x)
y = reshape(prod(bsxfun(#plus,...
2 * x(:),...
-N + 1 : 2 : N - 1) / 2,...
2),...
size(x));
end
function y = rayreng(N, x)
p = poly(linspace(-(N-1)/2, (N-1)/2, N));
y = polyval(p, x);
end
An alternative
Alternatively, because the terms in your product form an arithmetic progression (each term is greater than the previous one by 1/2), you can use the formula for the product of an arithmetic progression.
I agree with #Jubobs in that you should avoid using for loops for this kind of computation. There are cases where for loops perform fast, but for something as simple as this, avoid using loops if possible.
An alternative approach to what Jubobs has suggested is that you can consider that polynomial equation to be in factored form where each factor denotes a root located at that particular location. You can use poly to convert these factors into a polynomial equation, then use polyval to evaluate the expression at the point you want. First, generate your roots by linspace where the points vary from -(N-1)/2 to (N-1)/2 and there are N of them, then plug this into poly. Finally, for any values of x, put this into polyval with the output of poly. The advantage of this approach is that you can evaluate multiple points of x in a single sweep.
Going with what you have, you would simply do this:
p = poly(linspace(-(N-1)/2, (N-1)/2, N));
out = polyval(p, x);
With your example, supposing that N = 6, this would be the output of the first line:
p =
1.0000 0 -8.7500 0 16.1875 0 -3.5156
As such, this is saying that when we expand out (x-5/2)(x-3/2)(x-1/2)(x+1/2)(x+3/2)(x+5/2), we get:
x^6 - 8.75x^4 + 16.1875x^2 - 3.5156
If we take a look at the roots of this equation, this is what we get:
r = roots(p)
r =
-2.5000
2.5000
-1.5000
1.5000
-0.5000
0.5000
As you can see, each term corresponds to one factor in your polynomial equation, so we do have the right mindset here. Now, all you have to do is use p with your values of x into polyval to obtain your results. For example, if I wanted to evaluate that polynomial from -2 <= x <= 2 where x is an integer, this is the result I get:
polyval(p, -2:2)
ans =
-14.7656 4.9219 -3.5156 4.9219 -14.7656
Therefore, when x = -2, the result is -14.7656 and so on.
Though I would recommend the solution by #Jubobs, it is also good to check what the issue is with your loop.
The first indication that something is wrong, is that you have a nested loop over 2 variables, and only index with one of them to store the result. Probably you just need a single loop.
Here is a loop that you may be interested in that should do roughly what you need:
N=6;
x=1;
k=(-N-1)/2:(N-1)/2
P = ones(size(k));
for n=1:numel(k)
P(n)=(x-k(n));
end
y=prod(P);
I tried to keep the code close to the original, so hopefully it is easy to understand.

Lagrange interpolation method

I use convolution and for loops (too much for loops) for calculating the interpolation using
Lagrange's method , here's the main code :
function[p] = lagrange_interpolation(X,Y)
L = zeros(n);
p = zeros(1,n);
% computing L matrice, so that each row i holds the polynom L_i
% Now we compute li(x) for i=0....n ,and we build the polynomial
for k=1:n
multiplier = 1;
outputConv = ones(1,1);
for index = 1:n
if(index ~= k && X(index) ~= X(k))
outputConv = conv(outputConv,[1,-X(index)]);
multiplier = multiplier * ((X(k) - X(index))^-1);
end
end
polynimialSize = length(outputConv);
for index = 1:polynimialSize
L(k,n - index + 1) = outputConv(polynimialSize - index + 1);
end
L(k,:) = multiplier .* L(k,:);
end
% continues
end
Those are too much for loops for computing the l_i(x) (this is done before the last calculation of P_n(x) = Sigma of y_i * l_i(x)) .
Any suggestions into making it more matlab formal ?
Thanks
Yeah, several suggestions (implemented in version 1 below): if loop can be combined with for above it (just make index skip k via something like jr(jr~=j) below); polynomialSize is always equal length(outputConv) which is always equal n (because you have n datapoints, (n-1)th polynomial with n coefficients), so the last for loop and next line can be also replaced with simple L(k,:) = multiplier * outputConv;
So I replicated the example on http://en.wikipedia.org/wiki/Lagrange_polynomial (and adopted their j-m notation, but for me j goes 1:n and m is 1:n and m~=j), hence my initialization looks like
clear; clc;
X=[-9 -4 -1 7]; %example taken from http://en.wikipedia.org/wiki/Lagrange_polynomial
Y=[ 5 2 -2 9];
n=length(X); %Lagrange basis polinomials are (n-1)th order, have n coefficients
lj = zeros(1,n); %storage for numerator of Lagrange basis polyns - each w/ n coeff
Lj = zeros(n); %matrix of Lagrange basis polyns coeffs (lj(x))
L = zeros(1,n); %the Lagrange polynomial coefficients (L(x))
then v 1.0 looks like
jr=1:n; %j-range: 1<=j<=n
for j=jr %my j is your k
multiplier = 1;
outputConv = 1; %numerator of lj(x)
mr=jr(jr~=j); %m-range: 1<=m<=n, m~=j
for m = mr %my m is your index
outputConv = conv(outputConv,[1 -X(m)]);
multiplier = multiplier * ((X(j) - X(m))^-1);
end
Lj(j,:) = multiplier * outputConv; %jth Lagrange basis polinomial lj(x)
end
L = Y*Lj; %coefficients of Lagrange polinomial L(x)
which can be further simplified if you realize that numerator of l_j(x) is just a polynomial with specific roots - for that there is a nice command in matlab - poly. Similarly the denominator is just that polyn evaluated at X(j) - for that there is polyval. Hence, v 1.9:
jr=1:n; %j-range: 1<=j<=n
for j=jr
mr=jr(jr~=j); %m-range: 1<=m<=n, m~=j
lj=poly(X(mr)); %numerator of lj(x)
mult=1/polyval(lj,X(j)); %denominator of lj(x)
Lj(j,:) = mult * lj; %jth Lagrange basis polinomial lj(x)
end
L = Y*Lj; %coefficients of Lagrange polinomial L(x)
Why version 1.9 and not 2.0? well, there is probably a way to get rid of this last for loop, and write it all in 1 line, but I can't think of it right now - it's a todo for v 2.0 :)
And, for dessert, if you want to get the same picture as wikipedia:
figure(1);clf
x=-10:.1:10;
hold on
plot(x,polyval(Y(1)*Lj(1,:),x),'r','linewidth',2)
plot(x,polyval(Y(2)*Lj(2,:),x),'b','linewidth',2)
plot(x,polyval(Y(3)*Lj(3,:),x),'g','linewidth',2)
plot(x,polyval(Y(4)*Lj(4,:),x),'y','linewidth',2)
plot(x,polyval(L,x),'k','linewidth',2)
plot(X,Y,'ro','linewidth',2,'markersize',10)
hold off
xlim([-10 10])
ylim([-10 10])
set(gca,'XTick',-10:10)
set(gca,'YTick',-10:10)
grid on
produces
enjoy and feel free to reuse/improve
Try:
X=0:1/20:1; Y=cos(X) and create L and apply polyval(L,1).
polyval(L,1)=0.917483227909543
cos(1)=0.540302305868140
Why there is huge difference?