I have the following code implemented in MATLAB. I want to train a perceptron using a batch algorithm to separate these linearly separable points. To do that I use the adapt() function, but it doesn't seem to work: the perceptron is not able to classify the points as it should, and it ends up with weights that are not useful in any way. On the other hand, when I use the train() function everything goes according to plan, and the perceptron classifies the points accurately. Can anyone explain to me what is wrong with my code? Thanks in advance!
function problema2_1()
    p = -1 + (1 + 1) .* rand(3,5);
    for i = 1 : length(p)
        if 2 * p(1,i) - p(2,i) + p(3,i) < 0
            t(i) = -1;
        else
            t(i) = 1;
        end
    end
    net = newp([-1 1; -1 1; -1 1],1,'hardlims');
    net.adaptParam.passes = 1000000;
    net = adapt(net,p,t);
    plotpv(p,hardlim(t));
    hold on
    plotpc(net.IW{1,1,1},net.b{1});
    t - sim(net,p)
end
adapt only makes a single pass through your training data and thus makes very small updates to the network weights. train, on the other hand, iterates over the training data several times until a stopping condition is met.
The examples in the Matlab documentation for adapt should provide some clarification. I suspect your line net.adaptParam.passes = 1000000 isn't doing what you think it's doing.
As an immediate fix, just try looping over your net = adapt(net,p,t) call several times to verify that the resulting network converges to the one obtained when using train().
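For example, a minimal sketch of that loop (the number of passes below is an arbitrary choice for illustration, not something from the original code):
% Repeatedly call adapt; each call makes one pass over the training data.
for pass = 1:1000
    net = adapt(net, p, t);
end
t - sim(net, p)    % should be all zeros once the points are classified correctly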
I have a system of 5 ODEs with nonlinear terms involved. I am trying to vary 3 parameters over some ranges to see which parameter values produce the behaviour I am looking for.
The issue is that I have written the code with 3 nested for loops, and it takes a very long time to produce the output.
I am also storing the parameter values within the loops whenever a parameter set satisfies an ODE event.
This is how I have implemented it in MATLAB.
function [m,cVal,x,y]=parameters()
    b=5000;
    q=0;
    r=10^4;
    s=0;
    n=10^-8;
    time=3000;
    m=[];
    cVal=[];
    x=[];
    y=[];
    val1=0.1:0.01:5;
    val2=0.1:0.2:8;
    val3=10^-13:10^-14:10^-11;
    for i=1:length(val1)
        for j=1:length(val2)
            for k=1:length(val3)
                options = odeset('AbsTol',1e-15,'RelTol',1e-13,'Events',@eventfunction);
                [t,y,te,ye]=ode45(@(t,y)systemFunc(t,y,[val1(i),val2(j),val3(k)]),0:time,[b,q,s,r,n],options);
                if length(te)==1
                    m=[m;val1(i)];
                    cVal=[cVal;val2(j)];
                    x=[x;val3(k)];
                    y=[y;ye(1)];
                end
            end
        end
    end
end
Is there anything else I can do to speed up this process?
Profile viewer results
I have written the system of ODEs in a format like
function s=systemFunc(t,y,p)
    s = zeros(2,1);
    s(1) = f*y(1)*(1-(y(1)/k)) - p(1)*y(2)*y(1)/(p(2)*y(2)+y(1));
    s(2) = p(3)*y(1) - d*y(2);
end
f, d, and k are constant parameters.
The actual equations are more complicated than what is shown here, as it is a system of 5 ODEs with many nonlinear terms interacting with each other.
Tommaso is right. Preallocating will save some time.
But I would guess that there is fundamentally not a lot you can do since you are running ode45 in a loop. ode45 itself may be the bottleneck.
I would suggest you profile your code to see where the bottleneck is:
profile on
parameters(... )
profile viewer
I would guess that ode45 is the problem. Probably you will find that you should actually focus your time on optimizing the systemFunc code for performance. But you won't know that until you run the profiler.
EDIT
Based on the profiler output and the additional code, I see a few things that will help.
It seems like packing your parameter values into a vector is hurting you. Instead of
@(t,y)systemFunc(t,y,[val1(i),val2(j),val3(k)])
try
@(t,y)systemFunc(t,y,val1(i),val2(j),val3(k))
where your system function is defined as
function s=systemFunc(t,y,p1,p2,p3)
    s = zeros(2,1);
    s(1) = f*y(1)*(1-(y(1)/k)) - p1*y(2)*y(1)/(p2*y(2)+y(1));
    s(2) = p3*y(1) - d*y(2);
end
Next, note that you don't have to preallocate space in systemFunc; just combine the two expressions directly in the output:
function s=systemFunc(t,y,p1,p2,p3)
    s = [ f*y(1)*(1-(y(1)/k)) - p1*y(2)*y(1)/(p2*y(2)+y(1)) ;
          p3*y(1) - d*y(2) ];
end
Finally, note that ode45 itself accounts for about 1/3 of your runtime, and there is not much you will be able to do about that. What I would suggest is relaxing your 'AbsTol' and 'RelTol' to more reasonable values. The values you are using are extremely small and make ode45 run for a very long time; if your application can tolerate it, try increasing them to something like 1e-6 or 1e-8 and see how much the performance improves. Alternatively, depending on how smooth your function is, you might do better with a different integrator (such as ode23), but your mileage will vary with how smooth your problem is.
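For example, a minimal sketch of the relaxed tolerances (the exact values are only a starting point to experiment with, not something from the original code):
% Looser tolerances; tighten them again only if the event detection becomes unreliable.
options = odeset('AbsTol',1e-8,'RelTol',1e-6,'Events',@eventfunction);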
I have two suggestions for you.
1. Preallocate the vectors in which you store your results, and use an increasing index to populate them at each iteration.
2. Since the options you use are always the same, instantiate them outside the loop, only once.
Final code:
function [m,cVal,x,y] = parameters()
    b = 5000;
    q = 0;
    r = 10^4;
    s = 0;
    n = 10^-8;
    time = 3000;
    options = odeset('AbsTol',1e-15,'RelTol',1e-13,'Events',@eventfunction);
    val1 = 0.1:0.01:5;
    val1_len = numel(val1);
    val2 = 0.1:0.2:8;
    val2_len = numel(val2);
    val3 = 10^-13:10^-14:10^-11;
    val3_len = numel(val3);
    total_len = val1_len * val2_len * val3_len;
    m = NaN(total_len,1);
    cVal = NaN(total_len,1);
    x = NaN(total_len,1);
    y = NaN(total_len,1);
    res_offset = 1;
    for i = 1:val1_len
        for j = 1:val2_len
            for k = 1:val3_len
                % store the ODE solution in tSol/ySol so the preallocated output y is not overwritten
                [tSol,ySol,te,ye] = ode45(@(t,y)systemFunc(t,y,[val1(i),val2(j),val3(k)]),0:time,[b,q,s,r,n],options);
                if (length(te) == 1)
                    m(res_offset) = val1(i);
                    cVal(res_offset) = val2(j);
                    x(res_offset) = val3(k);
                    y(res_offset) = ye(1);
                end
                res_offset = res_offset + 1;
            end
        end
    end
end
If you only want to preserve result values that have been correctly computed, you can remove the rows containing NaNs at the bottom of your function. Indexing on one of the vectors will be enough to clear everything:
rows_ok = ~isnan(y);
m = m(rows_ok);
cVal = cVal(rows_ok);
x = x(rows_ok);
y = y(rows_ok);
In addition to the other suggestions, I have two more for you:
You might want to try a different solver: ode45 is for non-stiff problems, but from the looks of it your problem could be stiff (the parameters differ by several orders of magnitude). Try, for instance, the ode23s method.
Secondly, without knowing which event you are looking for, it may be possible to use a logarithmic search rather than a linear one, e.g. the bisection method (see the sketch below). This will severely cut down on the number of times you have to solve the equation.
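As an illustration only, here is a rough bisection sketch over val1 with val2 and val3 held fixed (the fixed values are arbitrary examples, not from the question), assuming the event outcome flips exactly once as val1 increases over its range; if it does not, bisection is not applicable:
% Bisection over val1; systemFunc and eventfunction are the ones defined in the question.
val2_fixed = 4;  val3_fixed = 1e-12;      % arbitrary example values
b = 5000; q = 0; s = 0; r = 10^4; n = 10^-8; time = 3000;
options = odeset('AbsTol',1e-8,'RelTol',1e-6,'Events',@eventfunction);
lo = 0.1;  hi = 5;
for iter = 1:30                           % ~30 halvings narrow [lo,hi] to about 1e-8
    mid = (lo + hi)/2;
    [~,~,te,~] = ode45(@(t,y)systemFunc(t,y,[mid,val2_fixed,val3_fixed]), ...
                       0:time, [b,q,s,r,n], options);
    if numel(te) == 1
        hi = mid;                         % event fired: move the upper bound down
    else
        lo = mid;                         % event did not fire: move the lower bound up
    end
end
threshold_val1 = (lo + hi)/2;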
This is my first go at ML (and MATLAB), and I'm following "Learning From Data" by Yaser S. Abu-Mostafa.
I'm trying to implement the Perceptron algorithm. After working through the pseudocode and other people's solutions, I still can't fix my problem (I went through other threads too).
The algorithm separates the data fine; it works. However, I want to plot a single line, but it seems to separate the points in such a way that the '-1' cluster is divided into a second cluster or more.
This is the code:
iterations = 100;
dim = 3;
X1=[rand(1,dim);rand(1,dim);ones(1,dim)]; % class '+1'
X2=[rand(1,dim);1+rand(1,dim);ones(1,dim)]; % class '-1'
X=[X1,X2];
Y=[-ones(1,dim),ones(1,dim)];
w=[0,0,0]';
% call perceptron
wtag=weight(X,Y,w,iterations);
% predict
ytag=wtag'*X;
% plot prediction over original data
figure;hold on
plot(X1(1,:),X1(2,:),'b.')
plot(X2(1,:),X2(2,:),'r.')
plot(X(1,ytag<0),X(2,ytag<0),'bo')
plot(X(1,ytag>0),X(2,ytag>0),'ro')
legend('class -1','class +1','pred -1','pred +1')
%Why don't I get just one line?
plot(X,Y);
The weight function (Perceptron):
function [w] = weight(X,Y,w_init,iterations)
%WEIGHT Summary of this function goes here
%   Detailed explanation goes here
    w = w_init;
    for iteration = 1 : iterations %<- was 100!
        for ii = 1 : size(X,2)                % cycle through training set
            if sign(w'*X(:,ii)) ~= Y(ii)      % wrong decision?
                w = w + X(:,ii) * Y(ii);      % then add (or subtract) this point to w
            end
        end
        sum(sign(w'*X)~=Y)/size(X,2);         % show misclassification rate
    end
I don't think the problem is in the second function, but I added it regardless.
I'm pretty sure the algorithm separates the data into more than one cluster, but I can't tell why. Most of the learning I've done so far has been math and theory rather than actual coding, so I'm probably missing something obvious.
I am trying to figure out how to write a math-based app with MATLAB, although I cannot seem to get the Monte Carlo method of integration to work. I also feel that I do not have the algorithm thought out correctly. As of now, I have something like:
% For the function: integral of cos(x^3)*exp(x^(1/2))+x dx
% from x = 0 to x = 10
ans = 0;
for i = 1:100000000
    x = 10*rand;
    ans = ans + cos(x^3)*exp(x^(1/2))+x
end
I feel that this is completely wrong because my outputs are not even close to what is expected. How should I write this correctly? Or how should the algorithm for setting this up look?
Two issues:
1) If you look at what you're calculating, "ans" is going to grow as i increases. By putting a huge number of samples, you're just increasing your output value. How could you normalize this value so that it stays relatively the same, regardless of number of samples?
2) Think about what you're trying to calculate here. Your current "ans" is giving you the sum of 100000000 independent random measurements of the output to your function. What does this number represent if you divide by the number of samples you've taken? How could you combine that knowledge with the range of integration in order to get the expected area under the curve?
I managed to solve this with the formula I found here. I ended up using:
ans = 0;
n = 0;
for i = 1:100000000
    x = 10*rand;
    n = n + cos(x^3)*exp(x^(1/2))+x;
end
ans = ((10-0)/100000000)*n
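For reference, a vectorized sketch of the same estimator (interval length times the mean of the integrand), which avoids the explicit loop; the sample count N is an arbitrary choice:
% Monte Carlo estimate of the integral of cos(x^3)*exp(sqrt(x)) + x over [0, 10]
N = 1e6;                          % arbitrary number of samples
x = 10*rand(N,1);                 % uniform samples on [0, 10]
fx = cos(x.^3).*exp(sqrt(x)) + x; % integrand evaluated at every sample
estimate = (10 - 0) * mean(fx)    % interval length times the sample mean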
I am trying to use libsvm (with the MATLAB interface) to run a multi-class classification problem. Here is a toy problem using the IRIS data:
load fisheriris;
featuresTraining = [meas(1:30,:); meas(51:80,:); meas(101:130,:)];
featureSelectedTraining = featuresTraining(:,1:3);
groundTruthGroupTraining = [species(1:30,:); species(51:80,:); species(101:130,:)];
[~, ~, groundTruthGroupNumTraining] = unique(groundTruthGroupTraining);
featuresTesting = [meas(31:50,:); meas(81:100,:); meas(131:150,:)];
featureSelectedTesting = featuresTesting(:,1:3);
groundTruthGroupTesting = [species(31:50,:); species(81:100,:); species(131:150,:)];
[~, ~, groundTruthGroupNumTesting] = unique(groundTruthGroupTesting);
% Train the classifier
optsStruct = ['-c ', num2str(2), ' -g ', num2str(4), '-b ', 1];
SVMClassifierObject = svmtrain(groundTruthGroupNumTraining, featureSelectedTraining, optsStruct);
optsStruct = ['-b ', 1];
[predLabelTesting, predictAccuracyTesting, ...
predictScoresTesting] = svmpredict(groundTruthGroupNumTesting, featureSelectedTesting, SVMClassifierObject, optsStruct);
However, for the predicted probabilities I got (the first 12 rows of the results are shown here)
1.08812899093155 1.09025554950852 -0.0140009056912001
0.948911671379753 0.947899227815959 -0.0140009056926024
0.521486301840914 0.509673405799383 -0.0140009056926027
0.914684487894784 0.912534150299246 -0.0140009056926027
1.17426551505833 1.17855350325579 -0.0140009056925103
0.567801459258613 0.557077025701113 -0.0140009056926027
0.506405203427106 0.494342606399178 -0.0140009056926027
0.930191457490471 0.928343421250020 -0.0140009056926027
1.16990617214906 1.17412523596840 -0.0140009056926026
1.16558843984163 1.16986137054312 -0.0140009056926015
0.879648874624610 0.876614924593740 -0.0140009056926027
-0.151223818963057 -0.179682730685229 -0.0140009056925999
I am confused: how can some of the probabilities be larger than 1, and some of them negative?
However, the predicted labels seem quite accurate:
1
1
1
1
1
1
1
1
1
1
1
3
with a final output of
Accuracy = 93.3333% (56/60) (classification)
How should I interpret the predicted probabilities? Thanks a lot. A.
The outputs of an SVM are not probabilities!
The sign of the score indicates whether the sample belongs to class A or class B, and a score of 1 or -1 means it lies exactly on the margin, although that is not particularly useful to know.
If you really need probabilities, you can convert the scores using Platt scaling: you basically fit a sigmoid function to them, as in the sketch below.
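A rough sketch of the idea (this is not libsvm's internal implementation; A and B are placeholder parameters you would normally fit by maximum likelihood on held-out data):
% Platt scaling: map raw SVM decision values f to probability estimates via a sigmoid.
A = -2;  B = 0;                         % placeholder parameters, for illustration only
f = predictScoresTesting(:,1);          % raw decision values for one class pairing
p = 1 ./ (1 + exp(A*f + B));            % calibrated probability estimates in (0,1)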
I understand that this answer is probably too late, but it may benefit people encountering the same problem.
libsvm can in fact produce probability estimates; that is what the '-b' option is for.
I think the mistake you made is in the way you defined the optsStruct variable. It should be defined like this: ['-b ' num2str(1)] or simply '-b 1'.
The same applies to the options passed to svmtrain; see the sketch below.
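A sketch of how the corrected option strings might look, reusing the variable names and the -c/-g values from the question:
% Note the space before -b and that the whole option string is characters, not a numeric 1.
optsTrain = '-c 2 -g 4 -b 1';
SVMClassifierObject = svmtrain(groundTruthGroupNumTraining, featureSelectedTraining, optsTrain);
[predLabelTesting, predictAccuracyTesting, predictScoresTesting] = ...
    svmpredict(groundTruthGroupNumTesting, featureSelectedTesting, SVMClassifierObject, '-b 1');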
The code in question is here:
function k = whileloop(odefun,args)
...
while (sign(costheta) == originalsign)
y=y(:) + odefun(0,y(:),vars,param)*(dt); % Line 4
costheta = dot(y-normpt,normvec);
k = k + 1;
end
...
end
and to clarify, odefun is F1.m, an m-file of mine. I pass it into the function that contains this while loop, with something like whileloop(@F1,args). Line 4 in the code block above is the Euler method.
The reason I'm using a while loop is that I want to trigger upon the vector y crossing a plane defined by a point, normpt, and the vector normal to the plane, normvec.
Is there an easy change to this code that will speed it up dramatically? Or should I attempt to learn how to write MEX files instead (for a speed increase)?
Edit:
Here is a rushed attempt at an example you could test with. I have not debugged it; it is just to give you an idea:
%Save the following 3 lines in an m-file named "F1.m"
function ydot = F1(placeholder1,y,placeholder2,placeholder3)
    ydot = y/10;
end

%Run the following:
dt = 1.5e-12 %I do not know about this. You will have to experiment.
y0 = [.1;.1;.1];      % column vectors so the dot products below are well defined
normpt = [3;3;3];
normvec = [1;1;1];
originalsign = sign(dot(y0-normpt,normvec));
costheta = originalsign;
y = y0;
k = 0;
while (sign(costheta) == originalsign)
    y = y(:) + F1(0,y(:),0,0)*(dt); % Line 4
    costheta = dot(y-normpt,normvec);
    k = k + 1;
end
disp(k);
dt should be sufficiently small that it takes hundreds of thousands of iterations to trigger.
Assume I must use the Euler method. I have a stochastic differential equation with state-dependent noise, if you are curious as to why I ask you to make that assumption.
I would focus on your actual ODE integration. The fewer steps you have to take, the faster the loop will run. I would only worry about the speed of the sign check after you've optimized the actual integration method.
It looks like you're using the first-order explicit Euler method. Have you tried a higher-order integrator or an implicit method? Often you can increase the time step significantly.
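For instance, a sketch of an explicit second-order (midpoint) step dropped into the same loop; this only illustrates the deterministic part and ignores the stochastic term mentioned in the question:
% Explicit midpoint (RK2) step in place of the Euler update, reusing F1, dt, normpt, normvec from above.
while (sign(costheta) == originalsign)
    yhalf = y(:) + 0.5*dt*F1(0, y(:), 0, 0);   % half Euler step to the midpoint
    y     = y(:) + dt*F1(0, yhalf, 0, 0);      % full step using the midpoint slope
    costheta = dot(y - normpt, normvec);
    k = k + 1;
end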