MATLAB fitcsvm svm boundary equation does not separate data - matlab

I am trying to classify binary data using fitcsvm, but when I plot the boundary equation, it does not sit close to the data.
Here is the code that I used to generate the model
%creating inputs for the model
xTable = [responseData_Intensity.Intensity responseData_Intensity.ActiveForce_kg_];
y = responseData_Intensity.FeltSVM;
%-------------------------------------------------------SVM MODEL
SVMModel = fitcsvm(xTable,y);
%------------------------------------------PLOTTING THE MODEL WITH DATA
figureSVM = figure;
hold on
figTitle = strcat(participantList(participantNumber),'-',parameter,'-Maximal Margin Line');
title(figTitle);
in = responseData_Intensity.Intensity; fr = responseData_Intensity.ActiveForce_Kg_;
gscatter(in,fr,responseData_Intensity.FeltSVM,'rb');
syms x
eqn = slope*x+yIntercept == 0;
xIntercept = double(solve(eqn)); % X values where y=0
xlabel('Intensity Tested');
ylabel('Force (kg)');
plot(in(SVMModel.IsSupportVector), fr(SVMModel.IsSupportVector), 'ko', 'MarkerSize',10);
plot(in, -SVMModel.Beta(1)/SVMModel.Beta(2)*in - (SVMModel.Bias)/SVMModel.Beta(2))
legend('Not Felt','Felt','Support Vector','Classifier');
These are the values for xTable and y:
xTable =
0.5000 0.5500
0.4000 0.6167
0.3000 0.4000
0.2000 0.3500
0.1000 0.6833
0.2000 0.6333
0.1000 0.4833
0 0.6500
0.5000 0.6167
0.4000 0.5333
0.3000 0.7333
0.2000 0.7000
0.1000 0.7000
0.2000 0.6833
0.1000 0.7833
0.1000 0.6500
0.2000 0.6333
0.1000 0.8167
0 1.1333
0 0.8500
y =
1
1
1
1
-1
1
-1
-1
1
1
1
1
-1
1
1
-1
1
-1
1
1
and the resulting plot
which seems off because the line is so far removed from the data and the support vectors. The zoomed-in data is here:
From all the other examples I've seen, the line should divide the data between the two classes. I may be getting some things mixed up, so any help would be very much appreciated!

Figured out the answer: I needed to normalize the data. Once that is done, the boundary equation separates the data nicely!
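For anyone landing here later, this is a minimal sketch of what "normalize the data" could look like, assuming a plain z-score per column and the default linear kernel. The names X, Xn, mu, sigma and mdl are illustrative and not from the original post. The boundary is drawn in the normalized coordinates, because Beta and Bias refer to whatever predictors the model was trained on. fitcsvm also accepts 'Standardize',true, in which case Beta likewise refers to the standardized predictors.

```matlab
% Data from the question: column 1 = intensity, column 2 = active force (kg)
X = [0.5 0.55; 0.4 0.6167; 0.3 0.4; 0.2 0.35; 0.1 0.6833; ...
     0.2 0.6333; 0.1 0.4833; 0 0.65; 0.5 0.6167; 0.4 0.5333; ...
     0.3 0.7333; 0.2 0.7; 0.1 0.7; 0.2 0.6833; 0.1 0.7833; ...
     0.1 0.65; 0.2 0.6333; 0.1 0.8167; 0 1.1333; 0 0.85];
y = [1 1 1 1 -1 1 -1 -1 1 1 1 1 -1 1 1 -1 1 -1 1 1]';

% z-score each column (bsxfun keeps this working on releases before R2016b)
mu = mean(X, 1);
sigma = std(X, 0, 1);
Xn = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);

% Train on the normalized predictors (needs Statistics and Machine Learning Toolbox)
mdl = fitcsvm(Xn, y);

% Boundary: Beta(1)*x1 + Beta(2)*x2 + Bias = 0, solved for x2
x1 = linspace(min(Xn(:,1)), max(Xn(:,1)), 100);
x2 = -(mdl.Beta(1)*x1 + mdl.Bias) / mdl.Beta(2);

figure; hold on
gscatter(Xn(:,1), Xn(:,2), y, 'rb');
plot(x1, x2, 'k-');
xlabel('Intensity (z-score)'); ylabel('Force (z-score)');
```

To draw the line over the raw data instead, each normalized coordinate would have to be mapped back with x = xn*sigma + mu.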


Solve System of Linear Equations in MatLab with Matrix of Arbitrary Size for Finite Difference Calculation

I am trying to write a script in MatLab R2016a that can solve a system of linear equations that can have different sizes depending on the values of p and Q.
I have the following equations that I am trying to solve, where h=[-p:1:p]*dx. Obviously, there is some index m where h=0, but that shouldn't be a problem.
I'm trying to write a function where I can input p and Q and build the matrix and then just solve it to get the coefficients. Is there a way to build a matrix using the variables p, Q, and h instead of using different integer values for each individual case?
I would use bsxfun (from MATLAB R2016b onward, implicit expansion lets the plain element-wise operators do the same thing, but R2016a still needs bsxfun):
p = 4;
Q = 8;
dx = 1;
h = (-p:p)*dx
Qvector = [Q,1:Q-1]'
Matrix = bsxfun(@(Qvector, h)h.^(Qvector)./factorial(Qvector), Qvector, h)
Output:
h =
-4 -3 -2 -1 0 1 2 3 4
Qvector =
8
1
2
3
4
5
6
7
Matrix =
1.6254 0.1627 0.0063 0.0000 0 0.0000 0.0063 0.1627 1.6254
-4.0000 -3.0000 -2.0000 -1.0000 0 1.0000 2.0000 3.0000 4.0000
8.0000 4.5000 2.0000 0.5000 0 0.5000 2.0000 4.5000 8.0000
-10.6667 -4.5000 -1.3333 -0.1667 0 0.1667 1.3333 4.5000 10.6667
10.6667 3.3750 0.6667 0.0417 0 0.0417 0.6667 3.3750 10.6667
-8.5333 -2.0250 -0.2667 -0.0083 0 0.0083 0.2667 2.0250 8.5333
5.6889 1.0125 0.0889 0.0014 0 0.0014 0.0889 1.0125 5.6889
-3.2508 -0.4339 -0.0254 -0.0002 0 0.0002 0.0254 0.4339 3.2508
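As a side note, here is a short sketch of the same matrix built without bsxfun, assuming MATLAB R2016b or later (so it does not apply to R2016a). The variable names follow the answer above. The only new name is MatrixBsx, introduced here for the comparison.

```matlab
p = 4;
Q = 8;
dx = 1;
h = (-p:p)*dx;            % 1-by-(2p+1) row of offsets
Qvector = [Q, 1:Q-1]';    % Q-by-1 column of exponents, same ordering as above

% Implicit expansion: column .^ row expands to a Q-by-(2p+1) matrix
Matrix = h.^Qvector ./ factorial(Qvector);

% Same result via bsxfun, for comparison
MatrixBsx = bsxfun(@(q, hh) hh.^q ./ factorial(q), Qvector, h);
disp(max(abs(Matrix(:) - MatrixBsx(:))))
```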

matlab: apply an operand on an array by a condition

I have an array like this:
>> a = [2,34,5,6,7,0,1,10]
Now I want to invert each element of this array, i.e. take its reciprocal.
By using 1 ./ a the result is:
ans =
0.5000 0.0294 0.2000 0.1667 0.1429 Inf 1.0000 0.1000
The Inf is not good for me, the answer should be
ans =
0.5000 0.0294 0.2000 0.1667 0.1429 0 1.0000 0.1000
I want to apply this on elements that are not zero!
How can I do that?
You could also reset the Inf value to zero afterwards:
>> b=1./a
b =
0.5000 0.0294 0.2000 0.1667 0.1429 Inf 1.0000 0.1000
>> b(isinf(b)) = 0
b =
0.5000 0.0294 0.2000 0.1667 0.1429 0 1.0000 0.1000
You can do it conditionally:
nz = a ~= 0; %// select using logical indexing
a(nz) = 1./a(nz);
A slightly more general approach than m.s.'s is to check for finite elements in the output using isfinite:
b = 1./a;
b( ~isfinite(b) ) = 0;
isfinite covers both inf values as well as NaN values, so if the element-wise function you are applying might generate both types of non-numeric values, isfinite handles them simultaneously for you.
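A small sketch of the difference, using made-up vectors num and den (not from the question) so that the division produces both a NaN (0/0) and an Inf (3/0):

```matlab
num = [1 0 3];
den = [2 0 0];
q = num ./ den;              % 0/0 gives NaN, 3/0 gives Inf

onlyInf = isinf(q);          % flags the Inf but misses the NaN
notFinite = ~isfinite(q);    % flags both

q(notFinite) = 0;            % reset every non-finite entry
disp(q)
```

With isinf alone, the NaN in the second position would have survived the cleanup.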

Summing cumulative area under curves of overlapping triangles

I have two matrices for several triangles:
x =
2.0000 5.0000 10.0000
8.0000 10.0000 12.0000
12.0000 24.0000 26.0000
22.0000 25.0000 28.0000
23.0000 26.0000 25.0000
23.5000 27.0000 27.5000
20.0000 23.0000 27.0000
21.0000 24.0000 27.0000
24.0000 25.0000 27.0000
24.0000 26.0000 27.0000
24.0000 28.0000 29.0000
19.0000 22.0000 25.0000
18.0000 21.0000 23.0000
y =
0 1.0000 0
0 0.8000 0
0 0.6000 0
0 0.8000 0
0 0.8000 0
0 0.8000 0
0 1.0000 0
0 1.0000 0
0 1.0000 0
0 1.0000 0
0 1.0000 0
0 1.0000 0
0 1.0000 0
Each row is one triangle. The columns are the x and y positions of each triangle's three points.
I plot all these triangles, and I need to sum the cumulative area under the curve of the triangles.
I tried to use the area function, but I couldn't find how to sum their areas.
EDIT: I need to plot the sum of the areas as a red line on the same graph. So I don't want a number like 20 cm²... I would like something like this:
I suggest that you interpolate to create all your individual triangles and then add the results. First you will need to augment your x and y matrices with the beginning (the origin) and end points like so:
m = 30; %// This is your max point, maybe set it using max(x(:))?
X = [zeros(size(x,1),1), x, ones(size(x,1),1)*m];
Y = [zeros(size(y,1),1), y, zeros(size(y,1),1)];
then perform all the interpolations (I'll sum as I go):
xi = 0:0.1:m;
A = zeros(1,size(xi,2)); %// initialization
for row = 1:size(x,1)
A = A + interp1(X(row,:), Y(row,:), xi);
end
and finally plot:
plot(x.',y.','k') %// transpose so that each row (one triangle) becomes one line
hold on
plot(xi,A,'r','linewidth',2)
using your example data this gives:
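A variation on the same idea, as a sketch only: interp1 accepts an extrapolation value as its fifth argument, so each triangle can be evaluated as 0 outside its base without padding the matrices first. This swaps the padding step for that option and is not the code that produced the plot above. The example uses just the first two triangles from the question and a coarser grid.

```matlab
% First two triangles from the question, one per row
x = [2 5 10; 8 10 12];
y = [0 1 0; 0 0.8 0];

xi = 0:0.5:14;               % common evaluation grid
A = zeros(size(xi));         % running sum of all triangles
for row = 1:size(x,1)
    % 'linear' with extrapolation value 0: no contribution outside the base
    A = A + interp1(x(row,:), y(row,:), xi, 'linear', 0);
end

plot(x.', y.', 'k')          % transpose so that each row is drawn as one triangle
hold on
plot(xi, A, 'r', 'LineWidth', 2)
```

Where the two triangles overlap (between x = 8 and x = 10) the red line lies above both of them, which is the summed curve the question asks for.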

MatLab: Create 3D Histogram from sampled data

I have sampled data in the interval [0,1] in an Array transitions=zeros(101,101) which I want to plot as a 3D-histogram. transitions is filled with data similar to the example data provided at the end of this thread.
The first column refers to the first observed variable X, the second column to the second variable Y, and the third column is the normalized frequency. I.e. for the first row: the observed normalized frequency of the variable pair (0,0) is 0.9459. The sum of the normalized frequencies for (0,Y) thus is 1.
I tried to make (sort of) a 3D histogram with the following code:
x_c = (transitions(:,1) * 100)+1;
y = (transitions(:,2) * 100)+1;
z = transitions(:,3); % normalized frequency is the third column
%A = zeros(10,10);
A = zeros(max(x_c),max(y));
for i = 1:length(x_c)
try
if(z(i)>0)
A(int32(x_c(i)), int32(y(i))) = abs(log(z(i)));
else
% deal with exceptions regarding log(0)
A(int32(x_c(i)), int32(y(i))) = 0;
end
catch
disp('');
end
end
bar3(A);
However, since it is sampled data in a discrete space A the output looks like the plot below. This is somehow misleading as there are 'gaps' in the plot (z-value = 0 for coordinates where I have no sampled data). I rather would like to have the sampled data being assigned to their corresponding plots, thus resulting in a 'real' 3d histogram.
By the way, as a result of my 'hack' of creating A also the x-,y- and z-scale is not correct. The 3D histogram's axes (all three) should be in the interval of [0,1].
ans =
0 0 0.9459
0 0.0500 0.0256
0 0.1000 0.0098
0 0.1100 0.0004
0 0.1500 0.0055
0 0.1600 0.0002
0 0.2000 0.0034
0 0.2100 0.0001
0 0.2500 0.0024
0 0.2600 0.0001
0 0.3000 0.0018
0 0.3200 0.0000
0 0.3700 0.0000
0 0.4000 0.0010
0 0.4200 0.0000
0 0.4500 0.0007
0 0.5000 0.0007
0 0.5300 0.0000
0 0.5500 0.0005
0 0.6000 0.0005
0 0.6300 0.0000
0 0.7000 0.0002
0 0.7400 0
0 0.7500 0.0003
0 0.7900 0.0000
0 0.8000 0.0002
0 0.8400 0.0000
0 0.8500 0.0002
0 0.8900 0.0000
0 0.9000 0.0002
0 0.9500 0.0001
0 1.0000 0.0001
0.0500 0 0.0235
0.0500 0.0500 0.0086
0.0500 0.1000 0.0045
. . .
. . .
. . .
. . .
. . .
0.9500 0.9000 0.0035
0.9500 0.9500 0.0066
0.9500 1.0000 0.0180
1.0000 0 0.0001
1.0000 0.0500 0.0001
1.0000 0.1000 0.0001
1.0000 0.1100 0.0000
1.0000 0.1500 0.0001
1.0000 0.1600 0.0000
1.0000 0.2000 0.0001
1.0000 0.2100 0.0000
1.0000 0.2500 0.0001
1.0000 0.2600 0.0000
1.0000 0.3000 0.0001
1.0000 0.3200 0.0000
1.0000 0.3700 0.0000
1.0000 0.4000 0.0002
1.0000 0.4200 0
1.0000 0.4500 0.0002
1.0000 0.5000 0.0003
1.0000 0.5300 0.0000
1.0000 0.5500 0.0004
1.0000 0.6000 0.0004
1.0000 0.6300 0.0000
1.0000 0.7000 0.0007
1.0000 0.7400 0.0000
1.0000 0.7500 0.0010
1.0000 0.7900 0.0000
1.0000 0.8000 0.0015
1.0000 0.8400 0.0001
1.0000 0.8500 0.0024
1.0000 0.8900 0.0002
1.0000 0.9000 0.0042
1.0000 0.9500 0.0111
1.0000 1.0000 0.3998
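As a hedged sketch of the fill step only: the loop with try/catch can be replaced by accumarray, and the tick labels can be remapped so the axes read in [0,1]. The table T and the spacing step below are hypothetical stand-ins for transitions, and the sketch stores plain frequencies rather than abs(log(z)). It does not remove the gaps between sampled values.

```matlab
% Hypothetical miniature of the table: [X, Y, normalized frequency]
T = [0   0   0.9;
     0   0.5 0.1;
     0.5 0   0.3;
     0.5 1   0.7;
     1   1   1.0];
step = 0.5;                          % assumed spacing of the sampled values

ix = round(T(:,1)/step) + 1;         % row index from X
iy = round(T(:,2)/step) + 1;         % column index from Y
n = round(1/step) + 1;               % grid points per axis

A = accumarray([ix iy], T(:,3), [n n]);   % unsampled cells stay 0

bar3(A);
% bar3 puts the rows of A along the y-axis and the columns along the x-axis
set(gca, 'XTick', 1:n, 'XTickLabel', 0:step:1, ...
         'YTick', 1:n, 'YTickLabel', 0:step:1);
xlabel('Y'); ylabel('X'); zlabel('Normalized frequency');
```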
I found a solution by working on the non-aggregated data. In particular, each row of the data set transitions contains one observation of X and Y. I used the code below to produce a normalized 3D histogram (and a 2D map) as follows:
function createHistogram(transitions)
uniqueValues = unique(transitions(:,1));
biases = cell(numel(uniqueValues),1);
for i = 1:numel(uniqueValues)
start = min(find(transitions(:,1) == uniqueValues(i)));
stop = max(find(transitions(:,1) == uniqueValues(i)));
biases{i} = transitions(start:stop,2); % store the Y observations for this X value
end
combinedBiases = padcat(biases{1},biases{2},biases{3},biases{4},...
biases{5},biases{6},biases{7},biases{8},biases{9},biases{10},...
biases{11},biases{12},biases{13},biases{14},biases{15},biases{16},...
biases{17},biases{18},biases{19});
bins = 0:0.1:1;
[f, x] = hist(combinedBiases, bins);
%
% normalize
%
for i = 1:numel(f(1,:))
for j = 1:numel(f(:,i))
f(j,i) = f(j,i)/numel(biases{i});
end
end
bHandle = bar3(x, f);
ylim([-0.04,1.04])
for k = 1:length(bHandle)
zdata = get(bHandle(k),'ZData');
set(bHandle(k),'CData',zdata, 'FaceColor','interp');
end
colormap('autumn');
hcol = colorbar();
axis('square');
cpos=get(hcol,'Position');
cpos(4)=cpos(4)/3; % Shrink the colorbar height to one third
cpos(2)=0.4; % Move it down outside the plot
cpos(1)=0.82;
set(hcol, 'Position',cpos);
xlabel('Enrollment biases');
ylabel('Aging biases');
zlabel('Bias transition probability');
title(strcat('Probability mass function of bias transitions (', device,')'));
set(gca,'XTick',0:2:20);
set(gca,'XTickLabel',0:0.1:1);
print('-dpng','-r600',strcat('tau_PMF3D_enrollment-ageing-', device));
view(2);
cpos(1)=0.84;
set(hcol, 'Position',cpos);
print('-dpng','-r600',strcat('tau_PMF2D_enrollment-ageing-', device));
end
From the comment on the question it appears you have the values you want to represent each bin count. If so an alternative solution is to plot using hist3 with "junk" data using correct x and y scales and then update the zdata of the surface object created with your bin data (modified to be in the correct format).
This modification to the bin data is fairly simple and consists of reshaping into a matrix then replicating and padding all the elements, the method is included in the code below.
Based on the ans variable at the end of the question, assuming
ans(:,1) gives x values
ans(:,2) gives y values
ans(:,3) gives the normalised bin counts
code
%// Inputs
zdata=ans(:,3); %// zdata=rand(21*21,1); % for testing
xvalues = 0:0.05:1;
yvalues = 0:0.05:1;
%// plot with junk data, [0,0] in this case
nx = numel(xvalues); ny = numel(yvalues);
bincenters = { xvalues , yvalues };
hist3([0,0],bincenters);
Hsurface = get(gca,'children');
%// apply bin count format
pad = [0 0 0 0 0;0 1 1 0 0;0 1 1 0 0;0 0 0 0 0;0 0 0 0 0]; %// padding for each point
ztrans=kron(reshape(zdata,[nx,ny]),pad); %// apply padding to each point
%// update plot
set(Hsurface,'ZData',ztrans)
%// to set colour based on bar height
colormap('autumn');
set(Hsurface,'CData',ztrans,'FaceColor','interp')
output
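To see what the kron padding step does on its own, here is a tiny sketch with a hypothetical 2-by-2 grid of bin heights Z (not the question's data). The pad matrix is the one from the code above.

```matlab
Z = [1 2; 3 4];     % hypothetical 2-by-2 grid of bin heights
pad = [0 0 0 0 0;0 1 1 0 0;0 1 1 0 0;0 0 0 0 0;0 0 0 0 0];

% Each element of Z is replaced by a 5-by-5 block: Z(i,j)*pad
ztrans = kron(Z, pad);

disp(size(ztrans))        % one 5-by-5 block per bin
disp(ztrans(2:3, 2:3))    % non-zero core of the block for Z(1,1)
disp(ztrans(7:8, 2:3))    % non-zero core of the block for Z(2,1)
```

Each bin height ends up as a 2-by-2 plateau surrounded by zeros, which is the layout the code above writes into the ZData of the surface.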

Blockdiagonal variation grid

I have the feeling I am missing something intuitive in my solution for generating a partially varied block-diagonal grid. In any case, I would like to get rid of the loop in my function (for the sake of challenge...)
Given tuples of parameters, number of intervals and percentage variation:
params = [100 0.5 1
24 1 0.9];
nint = 1;
perc = 0.1;
The desired output should be:
pspacegrid(params,perc,nint)
ans =
90.0000 0.5000 1.0000
100.0000 0.5000 1.0000
110.0000 0.5000 1.0000
100.0000 0.4500 1.0000
100.0000 0.5000 1.0000
100.0000 0.5500 1.0000
100.0000 0.5000 0.9000
100.0000 0.5000 1.0000
100.0000 0.5000 1.1000
21.6000 1.0000 0.9000
24.0000 1.0000 0.9000
26.4000 1.0000 0.9000
24.0000 0.9000 0.9000
24.0000 1.0000 0.9000
24.0000 1.1000 0.9000
24.0000 1.0000 0.8100
24.0000 1.0000 0.9000
24.0000 1.0000 0.9900
where you can see that the variation occurs at the values expressed by this mask:
mask =
1 0 0
1 0 0
1 0 0
0 1 0
0 1 0
0 1 0
0 0 1
0 0 1
0 0 1
1 0 0
1 0 0
1 0 0
0 1 0
0 1 0
0 1 0
0 0 1
0 0 1
0 0 1
The function pspacegrid() is:
function out = pspacegrid(params, perc, nint)
% PSPACEGRID Generates a parameter space grid for sensitivity analysis
% Size and number of variation steps
sz = size(params);
nsteps = nint*2+1;
% Preallocate output
out = reshape(permute(repmat(params,[1,1,nsteps*sz(2)]),[3,1,2]),[],sz(2));
% Mask to index positions where to place interpolated values
[tmp{1:sz(2)}] = deal(true(nsteps,1));
mask = repmat(logical(blkdiag(tmp{:})),sz(1),1);
zi = cell(sz(1),1);
% LOOP per each parameter tuple
for r = 1:sz(1)
% Columns, rows, rows to interpolate and lower/upper parameter values
x = 1:sz(2);
y = [1; nint*2+1];
yi = (1:nint*2+1)';
z = [params(r,:)*(1-perc); params(r,:)*(1+perc)];
% Interpolated parameters
zi{r} = interp2(x,y,z, x, yi);
end
out(mask) = cat(1,zi{:});
I think I got it, building off your pre-loop code:
params = [100 0.5 1
24 1 0.9];
nint = 1;
perc = 0.1;
sz = size(params);
nsteps = nint*2+1;
% Preallocate output
out = reshape(permute(repmat(params,[1,1,nsteps*sz(2)]),[3,1,2]),[],sz(2));
%Map of the percentage moves
[tmp{1:sz(2)}] = deal(linspace(-perc,perc,nint*2+1)');
mask = repmat(blkdiag(tmp{:}),sz(1),1) + 1; %Add one so we can just multiply at the end
mask.*out
So instead of making the mask replicate ones, I made it replicate the percentage move applied to each element, which is a repeating pattern. The basic element is built like this:
linspace(-perc,perc,nint*2+1)'
Then it's as simple as adding 1 to the whole thing and multiplying by your out matrix
I tested it as follows:
me = mask.*out;
you = pspacegrid(params, perc, nint);
check = abs(me - you) < 0.0001;
mean(check(:))
Seemed to work when I fiddled with the inputs. However I did get an error with your function, I had to change true(...) to ones(...). This might be because I'm running it online which probably uses Octave rather than Matlab.
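For reference, a compact loop-free sketch of the same idea, with kron standing in for both the repmat/permute preallocation and the blkdiag call. The names steps, factors and base are introduced here for illustration. It is checked only against the example in the question, not against every input.

```matlab
params = [100 0.5 1
          24  1   0.9];
nint = 1;
perc = 0.1;

[nrows, ncols] = size(params);
nsteps = 2*nint + 1;

steps = linspace(-perc, perc, nsteps)';       % relative move for each step
factors = kron(eye(ncols), steps) + 1;        % (1+step) on the block diagonal, 1 elsewhere
base = kron(params, ones(nsteps*ncols, 1));   % each tuple repeated nsteps*ncols times
out = base .* repmat(factors, nrows, 1);

disp(out)
```

kron(eye(ncols), steps) builds the same block-diagonal pattern as blkdiag(tmp{:}) above, so no cell array or deal is needed.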