My code
b(abs(b(1:3:length(b))) > 0.75) = 0.75
What it's supposed to do:
b1 = b(1:3:end);
b1(abs(b1)>0.75) = 0.75;
b(1:3:end) = b1;
How are these two not the same?
The expression abs(b(1:3:length(b))) > 0.75 returns a short logical vector (zeros and ones) with one element per entry of b(1:3:end). Using it to index b directly therefore changes the i-th entry of b (for i in roughly the first third of b) to 0.75 whenever the absolute value of entry 3*(i-1)+1 is greater than 0.75.
For example:
b = [-0.684; 0.941; 0.914; -0.029; 0.6; -0.716; -0.156; 0.831; 0.584; 0.919];
b_index = abs(b(1:3:length(b)))>0.75
would return
b_index =
0
0
0
1
and b(b_index) = 0.75 would change the 4th entry of b to 0.75.
One way of doing this as a one-liner is
b(1:3:end) = b(1:3:end).*(abs(b(1:3:end))<=0.75) + 0.75*(abs(b(1:3:end))>0.75);
but I think the three-liner is a bit clearer.
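Another way to see the difference is to carry an explicit index vector, so that the comparison result is mapped back to positions in b before assigning. This is only a sketch; idx is a name introduced here for illustration.

```matlab
b = [-0.684; 0.941; 0.914; -0.029; 0.6; -0.716; -0.156; 0.831; 0.584; 0.919];

% Positions of every third element (idx is an illustrative name, not from the question)
idx = 1:3:numel(b);

% Keep only those positions whose magnitude exceeds the limit
idx = idx(abs(b(idx)) > 0.75);

% Assign through the surviving positions, so the intended entries of b change
b(idx) = 0.75;
```

With the example vector, only b(10) changes, whereas the original one-liner changed b(4).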
Related
I've been trying to finish Andrew Ng's Machine Learning course, and I am now at the part about logistic regression. I am trying to find the parameters and calculate the cost without using the MATLAB function fminunc. However, my results do not converge to the ones posted by other students who finished the assignment using fminunc. Specifically, my problems are:
the parameters theta are incorrect
my cost seems to be blowing up
I get many NaNs in my cost vector (I just create a vector of the costs to keep track)
I attempted to find the parameters via gradient descent as I understood it from the course. However, my implementation still seems to give me incorrect results.
dataset = load('dataStuds.txt');
x = dataset(:,1:end-1);
y = dataset(:,end);
m = length(x);
% Padding with the 1's (the intercept term, as they call it?)
x = [ones(length(x),1), x];
thetas = zeros(size(x,2),1);
% Setting the learning rate to 0.1
alpha = 0.1;
for i = 1:100000
% theta transpose x (tho why in MATLAB it needs to be done the other way
% round? :)
ttrx = x * thetas;
% the hypothesis function h_x = g(z) = sigmoid(z), with z = x * thetas
h_x = 1 ./ (1 + exp(-ttrx));
error = h_x - y;
% the gradient (aka the derivative of J(\theta) aka the derivative
% term)
for j = 1:length(thetas)
gradient = 1/m * (h_x - y)' * x(:,j);
% Updating the parameters theta
thetas(j) = thetas(j) - alpha * gradient;
end
% Calculating the cost, just to keep track...
cost(i) = 1/m * ( -y' * log(h_x) - (1-y)' * log(1-h_x) );
end
% Displaying the final theta's that I obtained
thetas
The parameters theta that I get are:
thetas =
-482.8509
3.7457
2.6976
The results below are from one example that I downloaded, but the author used fminunc for this one.
Cost at theta found by fminunc: 0.203506
theta:
-24.932760
0.204406
0.199616
The data:
34.6236596245170 78.0246928153624 0
30.2867107682261 43.8949975240010 0
35.8474087699387 72.9021980270836 0
60.1825993862098 86.3085520954683 1
79.0327360507101 75.3443764369103 1
45.0832774766834 56.3163717815305 0
61.1066645368477 96.5114258848962 1
75.0247455673889 46.5540135411654 1
76.0987867022626 87.4205697192680 1
84.4328199612004 43.5333933107211 1
95.8615550709357 38.2252780579509 0
75.0136583895825 30.6032632342801 0
82.3070533739948 76.4819633023560 1
69.3645887597094 97.7186919618861 1
39.5383391436722 76.0368108511588 0
53.9710521485623 89.2073501375021 1
69.0701440628303 52.7404697301677 1
67.9468554771162 46.6785741067313 0
70.6615095549944 92.9271378936483 1
76.9787837274750 47.5759636497553 1
67.3720275457088 42.8384383202918 0
89.6767757507208 65.7993659274524 1
50.5347882898830 48.8558115276421 0
34.2120609778679 44.2095285986629 0
77.9240914545704 68.9723599933059 1
62.2710136700463 69.9544579544759 1
80.1901807509566 44.8216289321835 1
93.1143887974420 38.8006703371321 0
61.8302060231260 50.2561078924462 0
38.7858037967942 64.9956809553958 0
61.3792894474250 72.8078873131710 1
85.4045193941165 57.0519839762712 1
52.1079797319398 63.1276237688172 0
52.0454047683183 69.4328601204522 1
40.2368937354511 71.1677480218488 0
54.6351055542482 52.2138858806112 0
33.9155001090689 98.8694357422061 0
64.1769888749449 80.9080605867082 1
74.7892529594154 41.5734152282443 0
34.1836400264419 75.2377203360134 0
83.9023936624916 56.3080462160533 1
51.5477202690618 46.8562902634998 0
94.4433677691785 65.5689216055905 1
82.3687537571392 40.6182551597062 0
51.0477517712887 45.8227014577600 0
62.2226757612019 52.0609919483668 0
77.1930349260136 70.4582000018096 1
97.7715992800023 86.7278223300282 1
62.0730637966765 96.7688241241398 1
91.5649744980744 88.6962925454660 1
79.9448179406693 74.1631193504376 1
99.2725269292572 60.9990309984499 1
90.5467141139985 43.3906018065003 1
34.5245138532001 60.3963424583717 0
50.2864961189907 49.8045388132306 0
49.5866772163203 59.8089509945327 0
97.6456339600777 68.8615727242060 1
32.5772001680931 95.5985476138788 0
74.2486913672160 69.8245712265719 1
71.7964620586338 78.4535622451505 1
75.3956114656803 85.7599366733162 1
35.2861128152619 47.0205139472342 0
56.2538174971162 39.2614725105802 0
30.0588224466980 49.5929738672369 0
44.6682617248089 66.4500861455891 0
66.5608944724295 41.0920980793697 0
40.4575509837516 97.5351854890994 1
49.0725632190884 51.8832118207397 0
80.2795740146700 92.1160608134408 1
66.7467185694404 60.9913940274099 1
32.7228330406032 43.3071730643006 0
64.0393204150601 78.0316880201823 1
72.3464942257992 96.2275929676140 1
60.4578857391896 73.0949980975804 1
58.8409562172680 75.8584483127904 1
99.8278577969213 72.3692519338389 1
47.2642691084817 88.4758649955978 1
50.4581598028599 75.8098595298246 1
60.4555562927153 42.5084094357222 0
82.2266615778557 42.7198785371646 0
88.9138964166533 69.8037888983547 1
94.8345067243020 45.6943068025075 1
67.3192574691753 66.5893531774792 1
57.2387063156986 59.5142819801296 1
80.3667560017127 90.9601478974695 1
68.4685217859111 85.5943071045201 1
42.0754545384731 78.8447860014804 0
75.4777020053391 90.4245389975396 1
78.6354243489802 96.6474271688564 1
52.3480039879411 60.7695052560259 0
94.0943311251679 77.1591050907389 1
90.4485509709636 87.5087917648470 1
55.4821611406959 35.5707034722887 0
74.4926924184304 84.8451368493014 1
89.8458067072098 45.3582836109166 1
83.4891627449824 48.3802857972818 1
42.2617008099817 87.1038509402546 1
99.3150088051039 68.7754094720662 1
55.3400175600370 64.9319380069486 1
74.7758930009277 89.5298128951328 1
I ran your code and it does work fine. However, the tricky thing about gradient descent is ensuring that your costs don't diverge to infinity. If you look at your costs array, you will see that the costs definitely diverge and this is why you are not getting the correct results.
The best way to eliminate this in your case is to reduce the learning rate. Through experimentation, I have found that a learning rate of alpha = 0.003 is the best for your problem. I've also increased the number of iterations to 200000. Changing these two things gives me the following parameters and associated cost:
>> format long g;
>> thetas
thetas =
-17.6287417780435
0.146062780453677
0.140513170941357
>> cost(end)
ans =
0.214821863463963
This is more or less in line with the magnitudes of the parameters you see when you are using fminunc. However, fminunc gets slightly different parameters and a slightly different cost because of the minimization method itself. fminunc uses a more sophisticated method (quasi-Newton BFGS or a trust-region method, depending on its options), which finds the solution much faster.
What is most important is the actual accuracy itself. Remember that to classify whether an example belongs to label 0 or 1, you take the weighted sum of the parameters and the example, run it through the sigmoid function and threshold at 0.5. The accuracy is then the fraction of examples whose predicted label matches the expected label.
Using the parameters we found with gradient descent gives us the following accuracy:
>> ttrx = x * thetas;
>> h_x = 1 ./ (1 + exp(-ttrx)) >= 0.5;
>> mean(h_x == y)
ans =
0.89
This means that we've achieved an 89% classification accuracy. Using the parameters provided by fminunc also gives:
>> thetas2 = [-24.932760; 0.204406; 0.199616];
>> ttrx = x * thetas2;
>> h_x = 1 ./ (1 + exp(-ttrx)) >= 0.5;
>> mean(h_x == y)
ans =
0.89
So the accuracy is the same, and I wouldn't worry too much about the magnitude of the parameters. What matters more is that the costs of the two implementations are close.
As a final note, I would suggest looking at this post of mine for some tips on how to make logistic regression work in the long term. I would definitely recommend normalizing your features prior to finding the parameters to make the algorithm run faster. The post also addresses why you were finding the wrong parameters (namely, the cost blowing up): Cost function in logistic regression gives NaN as a result.
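The NaN entries in the cost vector come from evaluating log(0) once the sigmoid saturates to exactly 0 or 1, because 0 * log(0) is NaN. One common workaround (an assumption here, not necessarily what the linked post recommends) is to clamp the hypothesis away from 0 and 1 before taking logarithms. The values of z and y below are made up for illustration.

```matlab
z = [-800; -5; 0; 5; 800];   % made-up linear scores x * thetas, two of them extreme
y = [0; 0; 1; 1; 1];         % made-up labels

h = 1 ./ (1 + exp(-z));      % saturates to exactly 0 and 1 at the extremes

% Naive per-example cost: 0 * log(0) evaluates to NaN
naive = -y .* log(h) - (1 - y) .* log(1 - h);

% Clamp h into [eps, 1 - eps] so both logarithms stay finite
h_safe = min(max(h, eps), 1 - eps);
safe = -y .* log(h_safe) - (1 - y) .* log(1 - h_safe);

hasNaN = any(isnan(naive));       % true for this data
allFinite = all(isfinite(safe));  % true for this data
```

Clamping only hides the symptom; a smaller learning rate or normalized features address the saturation itself.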
Normalizing the data using the mean and standard deviation, as follows, enables you to use a large learning rate and get a similar answer:
clear; clc
data = load('ex2data1.txt');
m = length(data);
alpha = 0.1;
theta = [0; 0; 0];
y = data(:,3);
% Normalizing the data
xm1 = mean(data(:,1)); xm2 = mean(data(:,2));
xs1 = std(data(:,1)); xs2 = std(data(:,2));
x1 = (data(:,1)-xm1)./xs1; x2 = (data(:,2)-xm2)./xs2;
X = [ones(m, 1) x1 x2];
for i=1:10000
h = 1./(1+exp(-(X*theta)));
theta = theta - (alpha/m)* (X'*(h-y));
J(i) = (1/m)*(-y'*log(h)-(1-y)'*log(1-h));
end
theta
J(end)
figure
plot(J)
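Parameters fitted on normalized features are not directly comparable with the fminunc values, which refer to the raw features. A sketch of the conversion back to the original scale follows; it substitutes x_norm = (x - mean) / std into the linear score. All numbers below are made up and only stand in for the fitted theta, the means and the standard deviations.

```matlab
% Hypothetical values standing in for the quantities computed above
theta = [1.7; 4.0; 3.7];   % parameters learned on normalized features (made up)
xm = [65.6 66.2];          % feature means (made up)
xs = [19.5 18.6];          % feature standard deviations (made up)

% Slopes are divided by the standard deviations
theta_orig = zeros(3, 1);
theta_orig(2:3) = theta(2:3) ./ xs(:);

% The intercept absorbs the shift by the means
theta_orig(1) = theta(1) - sum(theta(2:3) .* xm(:) ./ xs(:));

% Check on one raw example: both forms give the same linear score
x_raw = [45 85];
z_norm = [1, (x_raw - xm) ./ xs] * theta;
z_raw = [1, x_raw] * theta_orig;
```

Since the two scores agree up to rounding, the predicted labels are identical in both parameterizations.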
output is a 3d matrix with size(output) == [height width N] and input is a 2d matrix with size(input) == [height width] . I need to implement the following code in one line.
for k = 1:size(output,3)
f = output(:,:,k);
is_zero = (f==0);
f(is_zero) = input(is_zero);
output(:,:,k) = f;
end
bsxfun approach -
output = bsxfun(@times,output==0,input) + output
Alternative approach -
output = (output==0).*input(:,:,ones(1,N))+ output
I hope the "I need to implement" is not a homework.
Here goes a solution that should solve your problem although not in one line.
new_input=repmat(input,1,1,size(output,3));
output(output==0)=new_input(output==0);
All the answers solve the problem when there is an exact comparison to 0 (as the OP required). For the sake of generalization, however, be aware that not all methods behave the same way if you change to another comparison.
Example below:
CODE:
%Simulation
output=rand(10,10,3);
input=rand(10,10);
% output=randi(9,10,10,3);
% input=randi(9,10,10);
%OP code
output2=[]
for k = 1:size(output,3)
f = output(:,:,k);
i_zero = (f<0.5);
f(i_zero) = input(i_zero);
output2(:,:,k) = f;
end
%repmat code
output3=output;
new_input=repmat(input,1,1,size(output,3));
output3(output<0.5)=new_input(output<0.5);
any(output2(:)-output3(:))
%bsxfun code
output4 = bsxfun(@times,output<0.5,input) + output;
any(output2(:)-output4(:))
%other variation code
output5 = (output<0.5).*input(:,:,ones(1,size(output,3)))+ output;
any(output2(:)-output5(:))
% builtin code
output6=output;
output6(output<0.5)=builtin('_paren',repmat(input,[1,1,size(output,3)]),output<0.5);
any(output2(:)-output6(:))
'-----'
any(abs(output2(:)-output3(:))>eps)
any(abs(output2(:)-output4(:))>eps)
any(abs(output2(:)-output5(:))>eps)
any(abs(output2(:)-output6(:))>eps)
'-----'
sum(abs(output2(:)-output3(:)))
sum(abs(output2(:)-output4(:)))
sum(abs(output2(:)-output5(:)))
sum(abs(output2(:)-output6(:)))
OUTPUT:
ans =
0
ans =
1
ans =
1
ans =
0
-----
ans =
0
ans =
1
ans =
1
ans =
0
-----
ans =
0
ans =
150.5088
ans =
150.5088
ans =
0
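The nonzero differences for the bsxfun and .* variants arise because those expressions add input to the selected entries instead of replacing them, which only coincides with replacement when the selected entries are exactly 0. A possible generalisation, sketched below, zeroes the selected entries before adding. The names mask, result and expected are illustrative.

```matlab
output = rand(4, 4, 3);
input = rand(4, 4);

mask = output < 0.5;   % any comparison, not only == 0

% Keep unselected entries, then add input only where the mask is set
result = output .* ~mask + bsxfun(@times, mask, input);

% Reference: the loop from the question, with the same comparison
expected = output;
for k = 1:size(output, 3)
    f = output(:, :, k);
    sel = f < 0.5;
    f(sel) = input(sel);
    expected(:, :, k) = f;
end

same = isequal(result, expected);
```

This relies on finite data: multiplying NaN or Inf entries by a zero mask would not yield zero.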
If you insist on a single line solution, you can use the (:) operator along with the mod command:
output(output(:)==0) = input(mod(find(output(:)==0)-1,height*width)+1)
where the -1 and +1 keep the wrapped index in the range 1 to height*width, avoiding index 0.
Here it is as a one-liner, but it uses the undocumented builtin('_paren',... to subscript the output of a function:
output(output==0)=builtin('_paren',repmat(input,[1,1,N]),output==0)
Without the undocumented builtin, this method gets messy if you want it in one line:
output=subsasgn(output,struct('type','()','subs',{{output==0}}),...
subsref(repmat(input,[1,1,N]),struct('type','()','subs',{{output==0}})))
...sadly I forgot using masks and adding two matrices together was an option...
You should have the 'cat' function:
http://www.mathworks.nl/help/matlab/ref/cat.html
cat(3,matrix1,matrix2,...) to concatenate along the 3rd dimension.
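A minimal illustration of cat along the third dimension, using made-up matrices A and B of equal size:

```matlab
A = ones(2, 3);
B = zeros(2, 3);

% Stack A and B as pages along the 3rd dimension
C = cat(3, A, B);

sz = size(C);   % [2 3 2]
```

Each input becomes one page, so C(:,:,1) is A and C(:,:,2) is B.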
This question is about numerical computation problems in MATLAB; I would like to learn how to avoid such problems/errors in the future.
For example, consider the following simple code:
t = 0.4 + 0.1 - 0.5
t =
0
It works fine, but
u = 0.4 - 0.5 + 0.1
u =
2.7756e-17
Of course, mathematically this is also 0, so why does the first calculation not give the same result? What is the difference? Also, please look at:
v = (sin(2*pi)==sin(4*pi))
v =
0
This suggests that the sine function is not periodic. What is the general advice in this case? Should I introduce some epsilon, like this?
V=((sin(2*pi)-sin(4*pi))<eps)
V =
0
or
EPS=0.000000000000001
EPS =
1.0000e-15
>> V=((sin(2*pi)-sin(4*pi))<EPS)
V =
1
please help me
These results are normal, because the floating-point relative accuracy in MATLAB is
eps('double')
ans =
2.2204e-16
For V=((sin(2*pi)-sin(4*pi))<eps), the difference is
sin(2*pi)-sin(4*pi)
ans =
2.4493e-16
which is larger than eps('double'), so the result is V=0.
For V=((sin(2*pi)-sin(4*pi))<EPS), EPS=1e-15 is larger than 2.4493e-16, so the result is V=1.
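Rounding happens after every operation, which is why 0.4 + 0.1 - 0.5 and 0.4 - 0.5 + 0.1 can differ. A common pattern is to compare with a tolerance on the absolute value of the difference (the comparisons in the question omit abs, so any negative difference would pass). The sketch below assumes a tolerance of 1e-12, which is an arbitrary choice and should be adapted to the scale of the data.

```matlab
a = sin(2*pi);
b = sin(4*pi);
tol = 1e-12;                  % assumed tolerance, not a universal constant

% Absolute comparison, suitable for values near zero
sameAbs = abs(a - b) < tol;

% Relative comparison for values away from zero (x and y are made-up examples)
x = 0.1 + 0.2;
y = 0.3;
sameRel = abs(x - y) <= tol * max(abs(x), abs(y));
```

Here x == y is false even though both are "0.3" on paper, while sameRel is true.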
I am trying to rewrite part of a vector given that:
t = -10:.1:10
x = exp(-3.*t);
The length of x will be 201, and I want to rewrite the first 100 values.
The only way I've gotten to work is by doing this:
EDIT Fixed typo.
t = 0:.1:10;
x = exp(-3.*t); % EDIT: THERE WAS A TYPO HERE
z = zeros(1,100);
for k = 1 : 100
x(k) = z(k);
end
There are three questions. First: What is a faster and more efficient way of doing this? Second: What do I do if I don't want to overwrite the first part of the code but rather the middle or the second part? Third: Is there a way of utilizing the full range of t where t = -10:.1:10 and just ignoring the first half instead of writing a whole new variable for it?
First: Nothing else I've tried has been successful.
Second: The only way I can think to do that is to append the two vectors together, but then it doesn't overwrite the data, so that is a no go.
Third: I have tried an if statement and that didn't work at all.
Your code appears to assign something to y, then changes the value of x. I assume that is a typo - and not the problem you actually want to fix.
In general, if you have
t = -10:0.1:10; % my preference: t = linspace(-10,10,201);
and
y = exp(-3 * t );
but you want to set the first 100 elements of y to zero, you can then do
y(1:100) = 0;
If you wanted never to compute y(1:100) in the first place you might do
y = zeros(size(t));
y(101:end) = exp(-3 * t(101:end));
There are many variations on this. I think the above code samples address all three of your questions.
change your
for k = 1 : 100
x(k) = z(k); % i think it should be y(k) though
end
to
x(1:100) = 0;
You could use logical indexing; that is, you can use a logical statement to select elements of a vector/matrix:
t = -10:0.1:10;
x = exp(-3.*t);
x(t < 0) = 0;
This works for the middle of a matrix too:
x(t > -5 & t < 5) = whatever;
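A runnable version of the middle-section case might look like the following, where 0 stands in for the placeholder value and inMiddle is an illustrative name:

```matlab
t = -10:0.1:10;
x = exp(-3 .* t);

% Build the mask once so it can be inspected or reused
inMiddle = t > -5 & t < 5;

% Overwrite only the selected samples; 0 stands in for "whatever"
x(inMiddle) = 0;

% Number of samples that were overwritten
nChanged = nnz(inMiddle);
```

Because the mask is defined in terms of t rather than element positions, it keeps working if the step size of t changes.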
I know there has got to be a cleaner, more elegant way to do this. I have an array of numbers in the range [0,1] and want to check which ones are greater than a threshold. I remember there being some syntax to do this nicely. In Python I would use something like a lambda function.
p = sigmoid(dot(theta,X));
for i =1:size(p)
if(p(i)>=0.5)
p(i)=1
else
p(i)=0
end
end
mtrw is on the right track, but it gets even shorter:
p = (p >= 0.5);
You can simply say p = (p>=0.5). Boolean operators work on arrays, and return logical arrays (which consist of boolean values).
You can operate on the whole array at once:
p(p >= 0.5) = 1;
p(p < 0.5) = 0;
For what it's worth, you can do the same thing in Numpy if p is a Numpy array:
>>> p[p >= 0.5] = 1
>>> p[p < 0.5] = 0
Just for variety. You can also do:
p = floor(p + 0.5);
which also generalises to other thresholds in the range (0,1] by using floor(p + (1 - threshold)), as long as p stays within [0,1].
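The two styles can be compared side by side. The sketch below uses made-up probabilities and an assumed threshold th; note that the comparison returns a logical array, so double(...) is used where numeric 0/1 values are needed.

```matlab
p = [0.1 0.49 0.5 0.72 0.99];   % made-up probabilities
th = 0.7;                       % assumed threshold for illustration

% Comparison form: logical result, converted to numeric 0/1
byCompare = double(p >= th);

% Arithmetic form: shift so that values at or above th reach 1, then floor
byFloor = floor(p + (1 - th));

% The default threshold of 0.5 from the answers above
byHalf = double(p >= 0.5);
```

The comparison form is usually preferable, since the arithmetic form depends on p being in [0,1] and can be affected by rounding for values very close to the threshold.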