Neural network activation function - MATLAB

This is a beginner-level question.
I have several training inputs in binary and for the neural network I am using a sigmoid thresholding function SigmoidFn(Input1*Weights) where
SigmoidFn(x) = 1./(1+exp(-1.*x));
The function above returns continuous real numbers, but I want binary output since the network is a Hopfield neural net (single layer, 5 input nodes and 5 output nodes). My problem is that I do not correctly understand the usage and implementation of the various thresholding functions. The weights given below are the true weights provided in the paper. I am using them to generate several training examples and output samples by keeping the weights fixed, that is, by just running the neural network several times.
Weights = [0.0 0.5 0.0 0.2 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 1.0 0.0 0.0 0.0
0.0 0.0 0.0 -0.6 0.0];
Input1 = [0,1,0,0,0]
x = Input1*Weights; % x = 0 0 1 0 0
As can be seen the result of the multiplication is the second row of the Weights. Is this a mere coincidence?
Next,
SigmoidFn = 1./(1+exp(-1.*x))
SigmoidFn =
0.5000 0.5000 0.7311 0.5000 0.5000
round(SigmoidFn)
ans =
1 1 1 1 1
Input2 = [1,0,0,0,0]
x = Input2*Weights
x = 0 0.5000 0 0.2000 0
SigmoidFn = 1./(1+exp(-1.*x))
SigmoidFn = 0.5000 0.6225 0.5000 0.5498 0.5000
>> round(SigmoidFn)
ans =
1 1 1 1 1
Is it good practice to use the round function, round(SigmoidFn(x))? The result obtained is not correct.
Or how should I obtain a binary result when I use any of these threshold functions:
(a) Hard limit
(b) Logistic sigmoid
(c) Tanh
Can somebody please show the proper code for thresholding and briefly explain when to use which activation function? I mean, there should be some logic behind it; otherwise why are there different kinds of functions?
EDIT : Implementation of Hopfield to recall the input pattern by successive iterations by keeping the weight fixed.
Training1 = [1,0,0,0,0];
offset = 0;
t = 1;
X(t,:) = Training1;
err = 1;
while(err~=0)
Out = X(t,:)*Weights > offset;
err = ((Out - temp)*(Out - temp).')/numel(temp);
t = t+1
X(t,:) = temp;
end

Hopfield networks do not use a sigmoid nonlinearity; the state of a node is simply updated to whether its weighted input is greater than or equal to its offset.
You want something like
output2 = Weights * Input1' >= offsets;
where offsets is the same size as Input1. I used Weights * Input1' instead of Input1 * Weights because most examples I have seen use left-multiplication for updating (that is, the rows of the weight matrix label the input nodes and the columns label the output nodes), but you will have to look at wherever you got your weight matrix to be sure.
You should be aware that you will have to perform this update operation many times before you converge to a fixed point which represents a stored pattern.
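As a rough illustration of why round(SigmoidFn(x)) returned all ones in the question: the sigmoid of 0 is exactly 0.5 and MATLAB's round rounds 0.5 up to 1, so rounding the sigmoid behaves like the hard limit x >= 0. The sketch below compares the thresholding variants on a made-up pre-activation vector (the values in x and the names hardLimit, viaSigmoid, viaTanh and viaRound are illustrative assumptions, not taken from the paper).

```matlab
% Illustrative pre-activation values (an assumption; includes 0 and a negative entry)
x = [0 0.5 0 -0.6 1];

sigmoidFn = @(v) 1 ./ (1 + exp(-v));

hardLimit  = x >= 0;                        % hard limit with zero offset
viaSigmoid = sigmoidFn(x) >= 0.5;           % logistic sigmoid crosses 0.5 at v = 0
viaTanh    = tanh(x) >= 0;                  % tanh crosses 0 at v = 0
viaRound   = logical(round(sigmoidFn(x)));  % round(0.5) gives 1 in MATLAB

% Each row is one variant; the rows can be compared column by column
disp([hardLimit; viaSigmoid; viaTanh; viaRound]);
```

All four rows agree here, which suggests that once the output is thresholded at the function's midpoint the choice of squashing function makes no difference; for a binary Hopfield update the plain comparison is enough.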
In response to your further questions, the weight matrix you have chosen does not store any memories that can be recalled with a Hopfield network. It contains a cycle 2 -> 3 -> 4 -> 2 ... that will not allow the network to converge.
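The cycle can be seen by iterating the update from the question's edit (with its strict > offset comparison and offset = 0), starting from a state in which only node 2 is active. This is only a small sketch; the names states and nSteps are introduced here for illustration.

```matlab
Weights = [0.0 0.5 0.0  0.2 0.0
           0.0 0.0 1.0  0.0 0.0
           0.0 0.0 0.0  1.0 0.0
           0.0 1.0 0.0  0.0 0.0
           0.0 0.0 0.0 -0.6 0.0];

offset = 0;
nSteps = 4;                       % enough steps to see the state repeat
X = [0 1 0 0 0];                  % only node 2 active
states = zeros(nSteps, numel(X));

for t = 1:nSteps
    states(t,:) = X;
    X = double(X * Weights > offset);   % synchronous update, strict threshold
end

% Rows show the activity moving 2 -> 3 -> 4 -> 2
disp(states);
```

The fourth row equals the first, so the state never settles on a fixed point under this update.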
In general you would recover a memory in a way similar to what you wrote in your edit:
X = [1,0,0,0,0];
offset = 0;
t = 1;
err = 1;
nIter = 100;
while err ~= 0 && t <= nIter
prev = X;
X = X * Weights >= offset;
err = ~isequal(X, prev);
t = t + 1;
end
if ~err
disp(X);
end
If you refer to the Wikipedia page, this is what is called the synchronous update method.

Related

Why does the rowsize of A matter in fmincon

I have MATLAB code that uses fmincon with some constraints. So that I can modify the code, I have been wondering whether the position of a row within the constraint matrix A makes a difference.
I set up a test file so I can change some variables. It turns out that the position of the constraint is irrelevant for the result, but the number of rows in A and b plays a role. I'm surprised by that, because I would expect a row with only zeros in A and b to simply cancel out.
fun = @(x)100*(x(2)-x(1)^2)^2 + (1-x(1))^2;
options1 = optimoptions('fmincon','Display','off');
A=zeros(2,2); %setup A
A(2,2)=1; %x2<0
b=[0 0]'; %setup b
x = fmincon(fun,[-1,2],A,b,[],[],[],[],[],options1);x
%change condition position inside A
A=zeros(2,2);
A(1,2)=1; %x2<0
b=[0 0]';
x = fmincon(fun,[-1,2],A,b,[],[],[],[],[],options1);x
% no change; the position doesn't influence fmincon
%change row size of A
A=zeros(1,2);
A(1,2)=1; %x2<0
b=[0]';
x = fmincon(fun,[-1,2],A,b,[],[],[],[],[],options1);x
%change in x2
%increase size of A
A=zeros(10,2);
A(1,2)=1; %x2<0
b=[0 0 0 0 0 0 0 0 0 0]';
x = fmincon(fun,[-1,2],A,b,[],[],[],[],[],options1);x
%change in x2
Can someone explain to me why fmincon is influenced by the number of rows? What is the "right" number of rows in A and b: the number of variables or the number of conditions?
EDIT
For reasons of completeness:
I agree that different values are possible because of the iteration process. Nevertheless I can find situations where the difference is bigger than the tolerance:
Added +log(x(3)) to the function:
fun = @(x)100*(x(2)-x(1)^2)^2 + (1-x(1))^2+log(x(3));
options1 = optimoptions('fmincon','Display','off');
options = optimoptions('fmincon')
A=zeros(2,3); %setup A
A(2,3)=1; %x3<0
b=[0 0]'; %setup b
x = fmincon(fun,[-1,2,1],A,b,[],[],[],[],[],options1);x
%change row size of A
A=zeros(1,3);
A(1,3)=1; %x3<0
b=[0]';
x = fmincon(fun,[-1,2,1],A,b,[],[],[],[],[],options1);x
%change in x2
%increase size of A
A=zeros(10,3);
A(1,3)=1; %x3<0
b=[0 0 0 0 0 0 0 0 0 0]';
x = fmincon(fun,[-1,2,1],A,b,[],[],[],[],[],options1);x
%change in x2
x =
-0.79876 **0.49156** 2.3103e-11
x =
-0.79921 0.49143 1.1341e-11
x =
-0.80253 **0.50099** 5.8733e-12
Matlab support told me that the A matrix should not have more rows than conditions. Each condition makes it more difficult for the algorithm.
Note that fmincon doesn't necessarily give the exact solution, but a good approximation of the solution according to certain criteria.
The differences in the results are plausible since fmincon is an iterative algorithm, and these matrix multiplications (even if the entries are mainly zeros) will eventually end with different results. MATLAB actually performs these matrix multiplications until it finds the best result. So these results are all correct in the sense that they are all close to the solution.
x =
0.161261791015350 -0.000000117317860
x =
0.161261791015350 -0.000000117317860
x =
0.161261838607809 -0.000000077614999
x =
0.161261877075196 -0.000000096088746
The difference in your results is around 1.0e-07, which is a decent result considering you don't specify stopping criteria. You can see the defaults with the command
options = optimoptions('fmincon')
My result is
Default properties:
Algorithm: 'interior-point'
CheckGradients: 0
ConstraintTolerance: 1.0000e-06
Display: 'final'
FiniteDifferenceStepSize: 'sqrt(eps)'
FiniteDifferenceType: 'forward'
HessianApproximation: 'bfgs'
HessianFcn: []
HessianMultiplyFcn: []
HonorBounds: 1
MaxFunctionEvaluations: 3000
MaxIterations: 1000
ObjectiveLimit: -1.0000e+20
OptimalityTolerance: 1.0000e-06
OutputFcn: []
PlotFcn: []
ScaleProblem: 0
SpecifyConstraintGradient: 0
SpecifyObjectiveGradient: 0
StepTolerance: 1.0000e-10
SubproblemAlgorithm: 'factorization'
TypicalX: 'ones(numberOfVariables,1)'
UseParallel: 0
For example, I can reach closer results with the option:
options1 = optimoptions('fmincon','Display','off', 'OptimalityTolerance', 1.0e-09);
Result is
x =
0.161262015455003 -0.000000000243997
x =
0.161262015455003 -0.000000000243997
x =
0.161262015706777 -0.000000000007691
x =
0.161262015313928 -0.000000000234186
You can also play with other criteria such as MaxFunctionEvaluations, MaxIterations, etc. to see if you can get even closer results.
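Following the advice quoted in the question that A should not have more rows than conditions, one option is to strip the all-zero padding rows before calling fmincon. The snippet below is a minimal sketch of that idea using the padded 10-row system from the question; the name isZeroRow is introduced here, and the fmincon call itself is left out.

```matlab
% Padded constraint system from the question: only the first row is a real condition
A = zeros(10,2);
A(1,2) = 1;          % x2 <= 0
b = zeros(10,1);

% A row of zeros reads 0 <= b(i), which always holds when b(i) >= 0
isZeroRow = ~any(A, 2);
if any(b(isZeroRow) < 0)
    error('A zero row with negative b can never be satisfied.');
end

% Keep only the rows that encode an actual condition
A = A(~isZeroRow, :);
b = b(~isZeroRow);
disp(size(A));
```

The reduced A and b describe the same feasible set, so they can be passed to fmincon in place of the padded versions.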

FMINCON to schedule appliance usage to minimize total cost

I would like to write code to find the minimum cost of running a dishwasher. This depends on the power required, the hourly tariff rate, and the time used. I am using fmincon for this; however, the code provided below produces the following error message:
User supplied objective function must return a scalar value
My objective function is to minimize (Total Cost * Time), subject to the following: the total cost equals the summation of (hourly power)*(hourly cost) from hour 1 to 24, the daily energy use equals 0.8 kWh, the total cost must be greater than Ca, and the total run time for the day is one hour.
% Array showing the hourly electricity rates (cents per kwh)
R=zeros(24,1);
R(1:7,1)=6;
R(20:24,1)=6;
R(8:11,1)=9;
R(18:19,1)=9;
R(12:17,1)=13;
p_7 = transpose([0.8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]); %This is the power pattern of appliance (operates at 0.8 kWh for 1 hour daily)
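P7 = zeros(24,24); % column k+1 holds the pattern shifted by k hours
P7(:,1) = p_7; % unshifted pattern (operation in hour 1)
for k=1:23
P7(:, k+1) = circshift(p_7,k); % This shows all the possible hours of operation
end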
Total = P7*R; % This is the total cost per hour at different hourly tariffs
fun = @(x)Total.*(x);
x0 = [1];
A = Total;
%Ca = 0.5;
Ca = ones(1,24);
b = Ca;
Aeq = Total;
Daily_tot_7 = 2*ones(1,24);
beq = Daily_tot_7;
ub = 24;
lb = 1;
x = fmincon(fun,x0,A,b,Aeq,beq,lb,ub)
I believe that my understanding on converting constraints to fmincon is not correct and that I may be missing vital constraints for this issue.
Your output is currently a vector of outputs. You stated that your cost function is the summation of hourly elements. Therefore, your function definition should be
fun = @(x)sum(Total.*(x));
However, if I'm reading into this right, you wish to solve for each hour individually. In that case, you need to set your x0 variable to be defined as a 24x1 input
x0 = ones(24,1);
If that is the case, you need to adjust your A, b, Aeq, and beq variables accordingly. However, if you do not actually need these constraints, you can leave them out by replacing them with [].
Finally, your p7 variable is likely better redefined as
p7 = R*.8;
My apologies if I misunderstood what you are trying to accomplish here.
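If the appliance really runs for exactly one hour at 0.8 kWh, as the comments in the question suggest, the problem reduces to picking the cheapest hour, which can be checked without fmincon. A minimal sketch under that assumption (energy, costPerStartHour, minCost and bestHour are names introduced here, not from the question):

```matlab
% Hourly electricity rates (cents per kWh), as in the question
R = zeros(24,1);
R(1:7)   = 6;
R(20:24) = 6;
R(8:11)  = 9;
R(18:19) = 9;
R(12:17) = 13;

energy = 0.8;                       % kWh used in the single hour of operation

% Cost of running the appliance in each possible hour
costPerStartHour = energy * R;

% min returns the first cheapest hour when several hours tie
[minCost, bestHour] = min(costPerStartHour);
fprintf('Cheapest hour: %d, cost: %.2f cents\n', bestHour, minCost);
```

An optimizer only becomes necessary once the schedule has more freedom than a single one-hour slot, for example several appliances with coupled constraints.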

MATLAB: efficient generation of a large integer matrix of multi-indices

Let d and p be two integers. I need to generate a large matrix A of integers, having d columns and N=nchoosek(d+p,p) rows. Note that nchoosek(d+p,p) increases quickly with d and p, so it's very important that I can generate A quickly. The rows of A are all the multi-indices with components from 0 to p, such that the sum of the components is less than or equal to p. This means that, if d=3 and p=3, then A is an [N=nchoosek(3+3,3)=20x3] matrix with the following structure:
A=[0 0 0;
1 0 0;
0 1 0;
0 0 1;
2 0 0;
1 1 0;
1 0 1;
0 2 0;
0 1 1;
0 0 2;
3 0 0;
2 1 0;
2 0 1;
1 2 0;
1 1 1;
1 0 2;
0 3 0;
0 2 1;
0 1 2;
0 0 3]
It is not indispensable to follow exactly the row ordering I used, although it would make my life easier (for those interested, it's called graded lexicographical ordering and it's described here:
http://en.wikipedia.org/wiki/Monomial_order).
In case you are curious about the origin of this weird matrix, let me know!
Solution using nchoosek and diff
The following solution is based on this clever answer by Mark Dickinson.
function degrees = monomialDegrees(numVars, maxDegree)
if numVars==1
degrees = (0:maxDegree).';
return;
end
degrees = cell(maxDegree+1,1);
k = numVars;
for n = 0:maxDegree
dividers = flipud(nchoosek(1:(n+k-1), k-1));
degrees{n+1} = [dividers(:,1), diff(dividers,1,2), (n+k)-dividers(:,end)]-1;
end
degrees = cell2mat(degrees);
You can get your matrix by calling monomialDegrees(d,p).
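To see what a single pass of the loop in the diff solution produces, here is the degree-2 block for three variables, computed with the same two lines as in the function (k, n and block are just local names for this sketch):

```matlab
k = 3;   % number of variables
n = 2;   % total degree of this block

% Each row of dividers places k-1 "bars" among n+k-1 slots (stars and bars)
dividers = flipud(nchoosek(1:(n+k-1), k-1));

% Gaps between consecutive bars, minus one, are the exponents
block = [dividers(:,1), diff(dividers,1,2), (n+k)-dividers(:,end)] - 1;
disp(block);
```

The six rows come out as 2 0 0, 1 1 0, 1 0 1, 0 2 0, 0 1 1, 0 0 2, which matches rows 5 to 10 of the matrix A in the question, so the graded lexicographic order is preserved.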
Solution using nchoosek and accumarray/histc
This approach is based on the following idea: There is a bijection between all k-multicombinations and the matrix we are looking for. The multicombinations give the positions, where the entries should be added. For example the multicombination [1,1,1,1,3] will be mapped to [4,0,1], as there are four 1s, and one 3. This can be either converted using accumarray or histc. Here is the accumarray-approach:
function degrees = monomialDegrees(numVars, maxDegree)
if numVars==1
degrees = (0:maxDegree).';
return;
end
degrees = cell(maxDegree+1,1);
degrees{1} = zeros(1,numVars);
for n = 1:maxDegree
pos = nmultichoosek(1:numVars, n);
degrees{n+1} = accumarray([reshape((1:size(pos,1)).'*ones(1,n),[],1),pos(:)],1);
end
degrees = cell2mat(degrees);
And here the alternative using histc:
function degrees = monomialDegrees(numVars, maxDegree)
if numVars==1
degrees = (0:maxDegree).';
return;
end
degrees = cell(maxDegree+1,1);
degrees(1:2) = {zeros(1,numVars); eye(numVars);};
for n = 2:maxDegree
pos = nmultichoosek(1:numVars, n);
degrees{n+1} = histc(pos.',1:numVars).';
end
degrees = cell2mat(degrees(1:maxDegree+1));
Both use the following function to generate multicombinations:
function combs = nmultichoosek(values, k)
if numel(values)==1
n = values;
combs = nchoosek(n+k-1,k);
else
n = numel(values);
combs = bsxfun(@minus, nchoosek(1:n+k-1,k), 0:k-1);
combs = reshape(values(combs),[],k);
end
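To make the bijection between multicombinations and degree vectors concrete, the single example from the text, [1,1,1,1,3] mapped to [4,0,1], can be reproduced with one accumarray call. This is a small sketch; comb and counts are names chosen here.

```matlab
numVars = 3;
comb = [1 1 1 1 3];   % a 5-multicombination of the variables 1..3

% Count how often each variable index occurs in the multicombination
counts = accumarray(comb(:), 1, [numVars 1]).';
disp(counts);
```

The result has four occurrences of variable 1, none of variable 2 and one of variable 3, that is, the multi-index [4 0 1].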
Benchmarking:
Benchmarking the code above shows that the diff solution is faster if numVars is low and maxDegree is high. If numVars is higher than maxDegree, the histc solution is faster.
Old approach:
This is an alternative to Dennis' approach of dec2base, which has a limit on the maximum base. It is still a lot slower than the above solutions.
function degrees = monomialDegrees(numVars, maxDegree)
Cs = cell(1,numVars);
[Cs{:}] = ndgrid(0:maxDegree);
degrees = reshape(cat(numVars+1, Cs{:}),(maxDegree+1)^numVars,[]); % concatenate along a new trailing dimension
degrees = degrees(sum(degrees,2)<=maxDegree,:);
I would solve it this way:
ncols=d;
colsum=p;
base=(0:colsum)';
v=@(dm)permute(base,[dm:-1:1]);
M=bsxfun(@plus,base,v(2));
for idx=3:ncols
M=bsxfun(@plus,M,v(idx));
end
L=M<=colsum;
A=cell(1,ncols);
[A{:}]=ind2sub(size(L),find(L));
a=cell2mat(A);
%subtract 1 because 1 based indexing but base starts at 0
a=a-1+min(base);
It builds up a d-dimensional array which contains the sums. The efficiency of this code depends on sum(L(:))/numel(L); this quotient tells you how much of the created array is actually used for solutions. If it gets low for your input, there probably exists a better solution.
Here is a very easy way to do it:
L = dec2base(0:4^3-1,4);
idx=sum(num2str(L)-'0',2)<=3;
L(idx,:)
I think the first line can be very time efficient for creating a list of candidates, but unfortunately I don't know how to reduce the list in an efficient way after that.
So the second line works, but could use improvement performance wise.

Bernoulli, geometric simulation in MATLAB

I am trying to run a simple Bernoulli simulation and a simple geometric simulation in MATLAB, and since I am new to MATLAB it seems a bit difficult.
I have been using this to understand it better: http://www.academia.edu/1722549/Useful_distributions_using_MATLAB
but I haven't been able to make a good simulation so far. Can someone help me or show me a good tutorial? Thank you.
NEW EDIT:
answer from here:
This is my own answer that I tried to come up with; is it correct?
If we want to simulate Bernoulli distribution in Matlab, we can simply use random number generator rand to simulate a Bernoulli experiment. In this case we try to simulate tossing a coin 4 times with p = 0.5:
>> p = 0.5;
>> rand(1,4) < p
ans =
1 1 1 0
The function rand returns values uniformly distributed between 0 and 1. With “<”, every value that is less than 0.5 counts as a success and is printed as 1; every value equal to or greater than 0.5 counts as a failure and is printed as 0.
Our ans is 1 1 1 0, which means that 3 times we had a value less than 0.5 and once we had a value greater than or equal to 0.5.
rand(1,n) < p gives the outcomes of n Bernoulli trials, with 1 marking a success (say, a head); summing the result gives the number of successes. Alternatively, you can use the binornd(n,p) function in MATLAB with n=1 to simulate a Bernoulli trial. One small caveat is that rand(1,n) < p is considerably faster than binornd(n,p).
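As a quick sanity check of the rand-based approach, the fraction of successes over many trials should be close to p. The following is a minimal sketch; the values of p and n and the names outcomes and numSuccesses are arbitrary choices for illustration.

```matlab
p = 0.3;        % success probability (arbitrary choice)
n = 100000;     % number of Bernoulli trials

outcomes = rand(1,n) < p;        % logical vector, 1 = success
numSuccesses = sum(outcomes);    % binomial(n,p) count of successes

fprintf('Empirical success rate: %.4f (p = %.2f)\n', numSuccesses/n, p);
```

The printed rate varies from run to run because no random seed is fixed.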
From Wikipedia and your link, you can answer the question on your own:
The binomial distribution is the discrete probability distribution of the number of successes (k) in a sequence of n independent yes/no experiments. The Bernoulli distribution is a special case of the binomial distribution where n=1.
function pdf = binopdf_(k,n,p)
m = 10000;
idx = 0;
for ii=1:m
idx = idx + double(nnz(rand(n,1) < p)==k);
end
pdf = idx/m;
end
For example, if I toss a fair coin (p=0.5) 20 times, how many tails will I get?
k = 0:20;
y_pdf = binopdf_(k,20,0.5);
y_cdf = cumsum(y_pdf);
figure;
subplot(1,2,1);
stem(k,y_pdf);
title('PDF');
subplot(1,2,2);
stairs(k,y_cdf);
axis([0 20 0 1]);
title('CDF');
Looking at the PDF, the mean number of tails is 10.
The geometric distribution is the probability distribution of the number X of Bernoulli trials needed to get one success.
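function pdf = geopdf_(k,p)
% Monte Carlo estimate of P(first success after exactly k failures), X = k+1 trials
m = 10000;
pdf = zeros(size(k)); % one entry per value of k, not a square matrix
for jj=1:numel(k)
idx = 0;
for ii=1:m
trials = rand(k(jj)+1,1) < p; % k(jj) failures must precede the success
idx = idx + double(trials(end) && ~any(trials(1:end-1)));
end
pdf(jj) = idx/m;
end
end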
For example, how many times do we have to toss a fair coin (p=0.5) to get one tail?
k = 0:20;
y_pdf = geopdf_(k,0.5);
y_cdf = cumsum(y_pdf);
figure;
subplot(1,2,1);
stem(k,y_pdf)
title('PDF');
subplot(1,2,2);
stairs(k,y_cdf);
axis([0 20 0 1]);
title('CDF');
Looking at the CDF, we have a probability of 0.5 of getting a tail in the first trial, 0.75 of getting a tail within the first two trials, etc.
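A more direct way to simulate the geometric distribution is to repeat Bernoulli trials until the first success and record how many trials that took. The sketch below does this with rand only and compares the sample mean with the theoretical mean 1/p; the names nTrials and m are introduced here for illustration.

```matlab
p = 0.5;        % success probability of each Bernoulli trial
m = 10000;      % number of simulated experiments

nTrials = zeros(m,1);
for ii = 1:m
    n = 1;                 % at least one trial is always needed
    while rand() >= p      % failure: try again
        n = n + 1;
    end
    nTrials(ii) = n;       % trials needed to reach the first success
end

fprintf('Sample mean: %.3f, theoretical mean 1/p: %.3f\n', mean(nTrials), 1/p);
```

A histogram of nTrials, normalized by m, can then be compared with the estimated PDF; the exact numbers change from run to run since no seed is set.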

Find index of min element in matlab

Here I have two matrices, one indicating cost and the other determining which elements to take into the comparison.
cost = [0.2 0.0 0.3; 0.4 0 0; 0.5 0 0];
available = [1 1 0 ; 1 0 0; 0 0 0];
available = logical(available);
I want to get the index of the min available element in the cost matrix, which in this case would compare 0.2, 0.0 and 0.4 and return the index of 0.0, which is (1, 2) or 4 in the cost matrix.
I tried
mul = cost .* available; % Zero if not available, but I can't know if it is zero because cost is zero
mul(~mul) = nan; % Set zero to be NaN
[minVal, minId] = min(mul)
This helps to get the minimum non-zero cost, but if there are zero elements which are available, the result will be wrong.
So is there a better way to do so?
Here are two possible solutions. Both essentially involve converting all non-available costs to Inf.
%#Set up an example
Cost = [0.2 0 0.3; 0.4 0 0; 0.5 0 0];
Available = [1 1 0; 1 0 0; 0 0 0];
%#Transform non-available costs to Inf
Cost(Available == 0) = Inf;
%#Obtain indices using find
[r, c] = find(Cost == min(min(Cost)))
%#Obtain linear indices and convert using ind2sub
[~, I1] = min(Cost(:));
[r2, c2] = ind2sub(size(Cost), I1);
The second solution returns only the first minimum when the minimum is not unique, whereas the find-based one returns the positions of all tied minima. Also, both methods will fail in the perverse case that all the available costs are Inf (but I guess you've got bigger problems if all your costs are infinite...).
I've done a few speed tests, and the second method is definitely faster, no matter what the dimensions of Cost, so should be strictly preferred. Also, if you only want linear indices and not subscript indices then you can of course drop the call to ind2sub. However, this doesn't give you huge savings in efficiency, so if there is a preference for subscript indices then you should use them.
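If you would rather not overwrite entries of the cost matrix, a variant of the same idea is to take the minimum over the available entries only and map the position back to the full matrix. This is a minimal sketch using the data from the question; idx, k and minId are names chosen here.

```matlab
cost = [0.2 0.0 0.3; 0.4 0 0; 0.5 0 0];
available = logical([1 1 0; 1 0 0; 0 0 0]);

idx = find(available);            % linear indices of the available entries
[minVal, k] = min(cost(idx));     % minimum among available costs only
minId = idx(k);                   % linear index into the full cost matrix

[r, c] = ind2sub(size(cost), minId);
fprintf('min = %g at linear index %d, subscript (%d, %d)\n', minVal, minId, r, c);
```

For the example data this picks the available zero at (1, 2), linear index 4, and the cost matrix itself stays untouched. If no element is available, idx is empty and minVal comes back empty, so that case needs its own check.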