How to restrict values of chosen weights in a MATLAB neural network?

Hello everyone!
I know about regularization, but I want to restrict only chosen weights. For example, I have this code:
H = rand(10, 100);
F = rand(1, 100);
net = newff(H, F, [10, 5], {'tansig' 'tansig'}, 'traingdx', 'learngdm', 'mse');
net = train(net, H, F);
and during training I want to enforce
net.IW{1}(i, i) = 0
or even
a <= net.IW{1}(i - 1 : i + 1, i) <= b
How can I achieve this result?
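As far as I know, train does not expose per-weight constraints, so one workaround (a sketch of my own, not a documented feature) is to train in short increments and force the chosen entries of net.IW{1} back into range between calls, assuming i, a and b are already defined:
% Workaround sketch: one epoch per call to train, then clamp the chosen weights.
net.trainParam.epochs = 1;
for k = 1:300                                   % 300 short training passes
    net = train(net, H, F);
    net.IW{1}(i, i) = 0;                        % pin one weight to zero
    w = net.IW{1}(i-1:i+1, i);
    net.IW{1}(i-1:i+1, i) = min(max(w, a), b);  % clip a block into [a, b]
end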


Triangular factorization with pivoting Matlab

I'm new to Octave and MATLAB and I have a problem. I need to write a program that solves a triangular system of linear equations by performing a triangular factorization with pivoting.
For example, I need to do the following exercise.
This is my lufact function:
function X = lufact (A, B)
  [N, N] = size(A);
  X = zeros(N, 1);
  Y = zeros(N, 1);
  C = zeros(1, N);
  R = 1:N;
  for p = 1:N-1
    [max1, j] = max(abs(A(p:N, p)));
    C = A(p, :);
    A(p, :) = A(j + p - 1, :);
    A(j + p - 1, :) = C;
    d = R(p);
    R(p) = R(j + p - 1);
    R(j + p - 1) = d;
    if A(p,p) == 0
      'A is singular. No unique solution'
      break
    endif
  endfor
  for k = p + 1: N
    mult = A(k,p)/A(p,p);
    A(k,p) = mult;
    A(k, p + 1: N) = A(k, p + 1: N) - mult*A(p, p+1: N);
  endfor
endfunction
Y(1) = B(R(1));
for k = 2: N
  Y(k) = B(R(k)) - A(k, 1:k - 1)*Y(1:k - 1);
endfor
X(N) = Y(N)/A(N,N);
for k = N-1: -1: 1
  X(k) = (Y(k) - A(k, k + 1: N)*X(k + 1: N))/A(k, k);
endfor
And this is my main script:
A = [2 4 -6; 1 5 3; 1 3 2];
BA = [-4; 10; 5]
BB = [20; 49; 32]
XA = lufact(A, BA);
XB = lufact(A, BB);
disp(XA);
disp(XB);
The output of my program is not what I expect.
What am I doing wrong and what should I do to fix that?
You don't seem to be attempting to answer the question you're asking, so your question doesn't make much sense...
But in any case: the exercise asks you to show that, given the following matrices / vectors:
A = [ 2, 4, -6; 1, 5, 3; 1, 3, 2 ];
L = [ 1, 0, 0; 1/2, 1, 0; 1/2, 1/3, 1 ];
U = [ 2, 4, -6; 0, 3, 6; 0, 0, 3 ];
we have LY = B, UX = Y, and AX = B for a given B.
OK. In Octave, the best way to solve a matrix equation of the form aX = b with respect to X, for a given a and b, is to use the "matrix left-division" operator, \, i.e. x = a\b. Type help mldivide in your Octave console for details.
So,
Y = L\B;
X = U\Y;
A*X % the result should be the same as B. QED.
Then there is the separate question of "how can I perform LU decomposition in Octave?" Well, Octave provides a function for this: lu.
[L, U] = lu(A)
% L =
%    1.0000        0        0
%    0.5000   1.0000        0
%    0.5000   0.3333   1.0000
%
% U =
%    2   4  -6
%    0   3   6
%    0   0   3
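Since A = L*U here, the factorization can be used directly for the two triangular solves (a small usage sketch of my own):
Y = L\B;    % forward substitution
X = U\Y;    % back substitution; same result as A\B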
Then there's the implied further question of "I would like to perform LU decomposition by hand."
Great. Good luck. Your code here is a bit messy: no comments, no self-explanatory variable names... I won't attempt to debug it here in detail. One thing to note, though: LU decomposition only takes a matrix as input, so I'm not sure why you are trying to pass the B vector to it as well. Plus, it doesn't seem like you're using it anywhere in the function, or creating L or U matrices for that matter. Also, if the whole top part is inside an lufact.m file, then you should know that your function terminates before the final for loops; these get ignored completely. What were you trying to do exactly?
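If the goal really is a by-hand implementation, here is a minimal sketch of LU factorization with partial pivoting (my own illustration, saved as lufact_sketch.m; it is not a fix of the code above):
function [L, U, R] = lufact_sketch(A)
    % Returns L, U and a permutation vector R such that A(R,:) = L*U.
    N = size(A, 1);
    U = A;
    L = eye(N);
    R = 1:N;
    for p = 1:N-1
        [~, j] = max(abs(U(p:N, p)));        % pivot: largest entry in column p
        j = j + p - 1;
        U([p j], :) = U([j p], :);           % swap rows of U
        L([p j], 1:p-1) = L([j p], 1:p-1);   % swap the already-stored multipliers
        R([p j]) = R([j p]);
        if U(p, p) == 0
            error('A is singular. No unique solution');
        end
        for k = p+1:N
            L(k, p) = U(k, p) / U(p, p);     % store the multiplier
            U(k, :) = U(k, :) - L(k, p) * U(p, :);
        end
    end
end
With that, X = U \ (L \ B(R)) gives the same result as A\B.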

polyfit and polyval functions [duplicate]

This question already has an answer here: MATLAB curve fitting - least squares method - wrong "fit" using high degrees.
Hello, my question is about curve fitting of P3(x). I wrote my code using the polyfit and polyval functions, but I need to do it without using these functions.
This is the code that I wrote:
n = input('Enter a value: ');
if (n < 3)
    fprintf('Enter a different value larger than 3')
end
if (n >= 3)
    x = 1:n;
    y = -0.3*x + 2*randn(1,n);
    [p, S] = polyfit(x, y, 3);
    [y_fit, delta] = polyval(p, x, S);
    plot(x, y, 'bo')
    hold on
    plot(x, y_fit, 'r-')
    title('Linear-Fit Output')
    legend('Data', 'Linear Fit')
end
This code works, but I am supposed to write it without using the polyfit and polyval functions.
Without Using syms
y = a0*x^0 + a1*x^1 + a2*x^2 + a3*x^3
For n data points --> y = X*a
where
X = [x1^0, x1^1, x1^2, x1^3; x2^0, x2^1, x2^2, x2^3; ... ; xn^0, xn^1, xn^2, xn^3]
and a = [a0; a1; a2; a3]; y = [y1; y2; ...; yn] (both column vectors)
a is computed as follows:
y = X*a ---> a = X\y
The code is as follows, assuming n is given:
x = 1:n;
y = -0.3*x + 2*randn(1,n);
x0 = ones(n, 1);
x1 = x';
x2 = (x.^2)';
x3 = (x.^3)';
X = [x0 x1 x2 x3];
a = X\(y');
f = @(t) a(1) + a(2).*t + a(3).*(t.^2) + a(4).*(t.^3);
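A short usage sketch (my addition, mirroring the original plot) to check the fit visually:
plot(x, y, 'bo', x, f(x), 'r-')   % data vs. fitted cubic
title('Cubic Fit Output')
legend('Data', 'Cubic Fit')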
With syms: use the least squares method to find the best-fit cubic polynomial.
n = input('Enter a value: ');
if (n < 3)
    fprintf('Enter a different value larger than 3')
else
    x = 1:n;
    y = -0.3*x + 2*randn(1,n);
    % Cubic regression
    syms a0 a1 a2 a3
    yq = a0 + a1.*x + a2.*(x.^2) + a3.*(x.^3);
    rq = yq - y;
    f = sum(rq.^2);
    fa0 = diff(f, a0);
    fa1 = diff(f, a1);
    fa2 = diff(f, a2);
    fa3 = diff(f, a3);
    sol = solve(fa0 == 0, fa1 == 0, fa2 == 0, fa3 == 0, a0, a1, a2, a3);
    a0 = sol.a0;
    a1 = sol.a1;
    a2 = sol.a2;
    a3 = sol.a3;
    % Cubic Regression Curve Function
    f = @(t) a0 + a1.*t + a2.*(t.^2) + a3.*(t.^3);
    % Plot Data and Cubic Regression Curve
    h = figure(1);
    % Data
    plot3 = scatter(x, y, 100, '+', 'MarkerEdgeColor', 'red', 'linewidth', 5);
    hold on
    % Cubic Regression Curve
    xx = linspace(0, n, 100);
    plot4 = plot(xx, f(xx), 'linewidth', 5);
    [~, b] = legend([plot3 plot4], {'Real Data','Cubic Regression'}, 'FontSize', 30);
    set(findobj(b, '-property', 'MarkerSize'), 'MarkerSize', 30);
    xlabel('x-axis', 'color', 'k', 'fontSize', 25)
    ylabel('y-axis', 'color', 'k', 'fontSize', 25)
    hYLabel = get(gca, 'YLabel');
    set(hYLabel, 'rotation', 0, 'VerticalAlignment', 'middle', 'HorizontalAlignment', 'right')
    grid on
    grid minor
    set(gca, 'FontSize', 20)
    set(get(h, 'CurrentAxes'), 'GridAlpha', 0.8, 'MinorGridAlpha', 0.5);
    xticks(x);
    title('Cubic Regression', 'color', 'r');
    whitebg('w');
end
Example results are shown for n = 5 and n = 20.

Why is the accuracy of (multi-class) logistic regression so low?

I am trying to solve a problem with 3 features and 6 classes (labels). The training dataset is 700 rows * 3 columns. The feature values are continuous, from 0 to 100. I use the one-vs-all method, but I do not know why the prediction accuracy is so low, just 24%. Could anyone tell me, please? Thank you!
This is how I do the prediction:
function p = predictOneVsAll(all_theta, X)
    m = size(X, 1);
    num_labels = size(all_theta, 1);
    % You need to return the following variables correctly
    p = zeros(size(X, 1), 1);
    % Add ones to the X data matrix
    X = [ones(m, 1) X];
    [m, p] = max(sigmoid(X * all_theta'), [], 2);
end
And the one-vs-all training:
% You need to return the following variables correctly
all_theta = zeros(num_labels, n + 1);
% Add ones to the X data matrix
X = [ones(m, 1) X];
initial_theta = zeros(n+1, 1);
options = optimset('GradObj', 'on', 'MaxIter', 20);
for c = 1:num_labels,
    [theta] = ...
        fmincg(@(t)(lrCostFunction(t, X, (y == c), lambda)), ...
               initial_theta, options);
    all_theta(c,:) = theta';
end
In predictOneVsAll, you don't need to use the sigmoid function; you only need it when calculating the cost. Since the sigmoid is monotonically increasing, taking the max of the raw scores X * all_theta' picks the same class. So the correct code is:
[m, p] = max((X * all_theta'), [], 2);
In oneVsAll, the loop should look like this:
for c = 1:num_labels
    all_theta(c,:) = fmincg(@(t)(lrCostFunction(t, X, (y == c), lambda)), initial_theta, options);
endfor
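As a quick sanity check (my addition, not part of the original answer, and assuming X and y here are the original, un-augmented training data), you can measure the training-set accuracy after retraining:
pred = predictOneVsAll(all_theta, X);                                   % predicted labels
fprintf('Training accuracy: %.2f%%\n', mean(double(pred == y)) * 100);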
It is better if you ask these questions in the discussion forum of Andrew Ng's ML course; people there would be more familiar with the code and the problem.

Finding the equation of a linear classifier for two separable sets of points using perceptron learning

I would like to write a MATLAB function that finds the equation of a linear classifier for 2 separable sets of points using a single-layer perceptron. I have got 2 files:
script file - run.m:
x_1 = [3, 3, 2, 4, 5];
y_1 = [3, 4, 5, 2, 2];
x_2 = [6, 7, 5, 9, 8];
y_2 = [3, 3, 4, 2, 5];
target_array = [0 0 0 0 0 1 1 1 1 1];
[ func ] = classify_perceptron([x_1 x_2; y_1 y_2], target_array);
x = -2:10;
y = arrayfun(func, x);
plot(x_1, y_1, 'o', x_2, y_2, 'X', x, y);
axis([-2, 10, -2, 10]);
classify_perceptron.m
function [ func ] = classify_perceptron( points, target )
    % points - matrix of x,y coordinates
    % target - array of expected results
    % func - function handle which appropriately classifies a point
    %        given by the x, y arguments supplied to it
    target_arr = target;
    weights = rand(1, 2);
    translation = rand();
    for i = 1:size(points, 2)
        flag = true;
        while flag
            result = weights * points(:, i) + translation;
            y = result > 0;
            e = target_arr(1, i) - y;
            if e ~= 0
                weights = weights + (e * points(:, i))';
                translation = translation + e;
            else
                flag = false;
            end
        end
    end
    func = @(x)(-(translation + (weights(1, 1) * x)) / weights(1, 2));
    return
end
The problem is that I don't know where I am making the mistake that leads to the incorrect result. It looks like the slope of the line is right, but the translation should be a bit bigger. I would be really thankful for pointing me in the right direction. The result I get is presented in the picture below:
OK, so I have made significant progress. In case someone runs into the same problem, I present the solution. The problem has been solved by adding a variable learning_rate = 0.1 and wrapping the loop that iterates over the points in another loop that iterates as many times as specified in the variable epochs (e.g. 300).
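For reference, here is one reading of that fix applied to the training part of classify_perceptron.m (a sketch; the exact placement of learning_rate in the update is my interpretation):
learning_rate = 0.1;
epochs = 300;
for epoch = 1:epochs
    for i = 1:size(points, 2)
        result = weights * points(:, i) + translation;
        y = result > 0;
        e = target_arr(1, i) - y;
        % scale each correction by the learning rate
        weights = weights + learning_rate * (e * points(:, i))';
        translation = translation + learning_rate * e;
    end
end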

heterogeneous class recognition with ANN / MLP

I have put together a classifying 3-layer artificial neural network that appears to work on other datasets. Playing around with some artificial datasets that I made, I was unable to correctly predict between two classes when one class was positive in one feature or another.
Clearly, class 1 can be identified by asking whether either feature 1 or feature 2 is equal to 1, but I can't get the algorithm to predict the dataset correctly (there are 20 examples following this pattern in the dataset).
Can ANNs/MLPs recognize this type of pattern? If so, what am I missing? If not, are there other methods that can predict this type of pattern (maybe SVM)?
I used Octave, as that was what was used in the online course offered on Coursera. I have listed most of the code here, although it is structured slightly differently when I run it. As you can see, I use bias units on the first and second layers, and I have also varied the number of hidden units in the second layer from 1 to 5 with no improvement over random guessing.
% Load dataset
y = [1; 1; 2; 2]
X = [1, 0; 0, 1; 0, 0; 0, 0]
m = size(X, 1);
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), num_labels, (hidden_layer_size + 1));
% Randomly initialize weight parameters
initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size);
initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels);
initial_nn_params = [initial_Theta1(:) ; initial_Theta2(:)];
% Add bias units to layers and feedforward
Xbias = [ones(m,1), X];
L2bias = [ones(m,1), sigmoid(Xbias*Theta1')];
L3 = sigmoid(L2bias * Theta2');
% Create class matrix Y
Y = zeros(m, num_labels);
for r = 1:m;
    Y(r, y(r)) = 1;
end
% Set cost function
J = (sum(sum(Y.*log(L3) + (1-Y).*log(1-L3))))/-m + lambda*(sum(sum((Theta1(:,2:columns(Theta1))).^2)) + sum(sum((Theta2(:,2:columns(Theta2))).^2)))/2/m;
% Initialize weight gradient matrices
D2 = zeros(rows(Theta2),columns(Theta2));
D1 = zeros(rows(Theta1),columns(Theta1));
% Calculate gradient with backpropagation
for t = 1:m;
    a1 = [1 X(t,:)]';
    z2 = Theta1*a1;
    a2 = [1; sigmoid(z2)];
    z3 = Theta2*a2;
    a3 = sigmoid(z3);
    d3 = a3 - Y(t,:)';
    d2 = (Theta2'*d3)(2:end).*sigmoidGradient(z2);
    D2 = D2 + d3*a2';
    D1 = D1 + d2*a1';
end
Theta2_grad = D2/m;
Theta1_grad = D1/m;
Theta2_grad(:,2:end) = Theta2_grad(:,2:end) + lambda*Theta2(:,2:end)/m;
Theta1_grad(:,2:end) = Theta1_grad(:,2:end) + lambda*Theta1(:,2:end)/m;
% Unroll gradients
grad = [Theta1_grad(:) ; Theta2_grad(:)];
% Compute cost (Feed forward)
[J,grad] = nnCostFunction(initial_nn_params, input_layer_size, hidden_layer_size, num_labels, X, y, lambda);
% Create "short hand" for the cost function to be minimized using fmincg
costFunction = @(p) nnCostFunction(p, input_layer_size, hidden_layer_size, num_labels, X, y, lambda);
% Train the neural network using fmincg
options = optimset('MaxIter', 1000);
[nn_params, cost] = fmincg(costFunction, initial_nn_params, options);
% Obtain Theta1 and Theta2 back from nn_params
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), num_labels, (hidden_layer_size + 1));
An NN can recognize any pattern; the Universal Approximation Theorem proves that (as do many other results).
The most obvious reason I can think of is the lack of a bias neuron. Although, for more valuable answers, you would have to include your code.
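To back that first claim up with a concrete example (my own illustration, not from the answer): a single hidden unit with a bias is already enough to represent the "feature 1 OR feature 2" pattern, using hand-picked weights.
% Illustrative hand-picked weights: the hidden unit fires when either input is 1.
X = [1 0; 0 1; 0 0; 0 0];        % the four examples from the question
Theta1 = [-5 10 10];             % [bias w1 w2] for the single hidden unit
Theta2 = [-5 10];                % [bias w] for the output unit
h = 1 ./ (1 + exp(-[ones(4,1) X] * Theta1'));   % hidden activations
p = 1 ./ (1 + exp(-[ones(4,1) h] * Theta2'));   % output activations
% p is close to 1 for the first two rows (class 1) and close to 0 for the rest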