fminsearch (matlab) taking too much time - matlab
function A = obj(b, Y, YL, tempcyc, cyc, Yzero, age, agesq, educ, ageav, agesqav, educav, qv, ...
    year1, year2, year3, year4, year5, year6, year7, ...
    workregion1, workregion2, workregion3, workregion4, workregion5, workregion6, ...
    workregion7, workregion8, workregion9, workregion10, workregion11, workregion12, ...
    workregion13, workregion14, workregion15, workregion16, qw)
A = 0;
for i = 1:715
    S = 0;
    for m = 1:12
        P = 1;
        for t = 1:8
            P = P*normcdf((2*Y(i,t)-1)*( ...
                b(1)*YL(i,t) + b(2)*tempcyc(i,t) + b(3)*cyc(i,t) + b(4)*Yzero(i,t) + ...
                b(5)*age(i,t) + b(6)*agesq(i,t) + b(7)*educ(i,t) + b(8)*ageav(i,t) + ...
                b(9)*agesqav(i,t) + b(10)*educav(i,t) + b(11)*1 + b(12)*sqrt(2)*qv(m,1) + ...
                b(13)*year1(i,t) + b(14)*year2(i,t) + b(15)*year3(i,t) + b(16)*year4(i,t) + ...
                b(17)*year5(i,t) + b(18)*year6(i,t) + b(19)*year7(i,t) + ...
                b(20)*workregion1(i,t) + b(21)*workregion2(i,t) + b(22)*workregion3(i,t) + ...
                b(23)*workregion4(i,t) + b(24)*workregion5(i,t) + b(25)*workregion6(i,t) + ...
                b(26)*workregion7(i,t) + b(27)*workregion8(i,t) + b(28)*workregion9(i,t) + ...
                b(29)*workregion10(i,t) + b(30)*workregion11(i,t) + b(31)*workregion12(i,t) + ...
                b(32)*workregion13(i,t) + b(33)*workregion14(i,t) + b(34)*workregion15(i,t) + ...
                b(35)*workregion16(i,t)));
        end
        S = S + qw(m,1)*P;
    end
    A = A + log(S/sqrt(pi));
    A = -A;
end
This is the objective function I am minimizing (actually maximizing; I am minimizing -A), and the parameters I am estimating are b = [b(1) ... b(35)]. Y, YL, tempcyc, ..., qw are matrices that I import as data. The objective function consists of a product (across t = 1:8) nested within a sum (across m = 1:12), which is in turn nested within a sum (across i = 1:715). Below is my code using fminsearch for the unconstrained minimization.
%% Minimization
start=[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0];
iter=50000;
options=optimset('Display','iter','MaxIter',iter,'MaxFunEvals',100000);
[b, fval, exitflag, output]=fminsearch(@(b)obj(b,Y, YL, tempcyc, cyc, Yzero, age, agesq, educ, ageav, agesqav, educav, qv, year1, year2, year3, year4, year5, year6, year7, workregion1, workregion2, workregion3, workregion4, workregion5, workregion6, workregion7, workregion8, workregion9, workregion10, workregion11, workregion12, workregion13, workregion14, workregion15, workregion16, qw), start, options);
%% Results
fprintf('[b] : % 1.4e % 1.4e % 1.4e % 1.4e % 1.4e % 1.4e \n', ...
    b(1),b(2),b(3),b(4),b(5),b(6),b(7),b(8),b(9),b(10),b(11),b(12), ...
    b(13),b(14),b(15),b(16),b(17),b(18),b(19),b(20),b(21),b(22),b(23),b(24), ...
    b(25),b(26),b(27),b(28),b(29),b(30),b(31),b(32),b(33),b(34),b(35));
The problem is that the optimization results do not show up even after 12+ hours (the iteration display is still printing reflect, expand, contract inside, etc.). Could someone give me an idea of how to make the process faster?
Thank you.
This is not a complete answer to your question; this is a suggestion on how to better write your code, which is probably half the answer.
I would write your calling script something like this:
%% Minimization
b0 = zeros(1,35);
iter = 5e4;
options=optimset(...
'Display' , 'iter',...
'MaxIter' , iter,...
'MaxFunEvals', 1e5);
data = {...
YL, tempcyc, cyc, Yzero, age, agesq, educ, ageav, agesqav, educav, qv, ...
year1, year2, year3, year4, year5, year6, year7, ...
workregion1, workregion2, workregion3, workregion4, workregion5, ...
workregion6, workregion7, workregion8, workregion9, workregion10, ...
workregion11, workregion12, workregion13, workregion14, workregion15, ...
workregion16
};
[b, fval, exitflag, output] = fminsearch(@(b)obj(b,Y,data,qw), b0, options);
%% Print results
fprintf('[b]:');
fprintf(' %1.4e', b);
fprintf('\n');
(Actually, I would also wrap the data in cells instead of having 16 workregions, 7 years, etc. But you may not be the creator of the data, so this may not always be possible.)
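For instance, a hypothetical illustration (assuming all the dummies share the same 715-by-8 shape) of collapsing the 7 year and 16 workregion dummies into two 3-D arrays, so that loops or matrix algebra can replace the 23 individual variables:

% Stack the dummies along the third dimension
years       = cat(3, year1, year2, year3, year4, year5, year6, year7);
workregions = cat(3, workregion1,  workregion2,  workregion3,  workregion4, ...
                     workregion5,  workregion6,  workregion7,  workregion8, ...
                     workregion9,  workregion10, workregion11, workregion12, ...
                     workregion13, workregion14, workregion15, workregion16);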
You could re-write your objective function something like this:
function A = obj(b, Y, others, qw)
    A   = 0;
    sPi = -0.5*log(pi);
    bB  = num2cell(b);
    for ii = 1:715
        P = prod( normcdf((2*Y(ii,1:8)-1) .* cellfun(@(x,y) x(ii,1:8).*y, others, bB)) );
        S = sum( P*qw(1:12,1) );
        A = A + log(S) + sPi;
    end
    A = -A;
end
Probably also the outer loop can be simplified into vectorized form. But I cannot test the correctness of all this (or whether it runs at all), because I don't have access to your data. But I trust you get the idea -- you can now at least read your code, and probably spot the error yourself.
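To give a flavour of that, here is one possible fully vectorized sketch (untested; it assumes you first stack all 35 regressors into a single 715-by-8-by-35 array X, with slice 11 set to ones(715,8) for the constant and slice 12 left as zeros(715,8), because the quadrature term does not vary over i and t and is added separately inside the loop):

% Hypothetical construction of X, matching the b indices in the question:
% X = cat(3, YL, tempcyc, cyc, Yzero, age, agesq, educ, ageav, agesqav, educav, ...
%            ones(715,8), zeros(715,8), year1, year2, year3, year4, year5, year6, ...
%            year7, workregion1, workregion2, ..., workregion16);
function A = obj(b, Y, X, qv, qw)
sgn  = 2*Y - 1;                                     % map {0,1} to {-1,+1}
base = reshape(reshape(X, [], 35) * b(:), 715, 8);  % x_it'*b for all i,t at once
S    = zeros(715, 1);
for m = 1:12
    idx = base + b(12)*sqrt(2)*qv(m,1);             % shift the index by the m-th node
    P   = prod(normcdf(sgn .* idx), 2);             % product over t = 1..8
    S   = S + qw(m,1)*P;                            % quadrature sum over m
end
A = -sum(log(S/sqrt(pi)));                          % negate once, at the very end
end

Note the single negation after the sum over i: the original code negates A inside the i-loop, which flips the sign of everything accumulated so far on every pass -- quite possibly the error alluded to above, and one reason fminsearch never settles.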
Related
Doing Andrew Ng's Logistic Regression exercise without fminunc
I've been trying to finish Andrew Ng's Machine Learning course, and I am at the part about logistic regression now. I am trying to discover the parameters and also calculate the cost without using the MATLAB function fminunc. However, I am not converging to the correct results as posted by other students who have finished the assignment using fminunc. Specifically, my problems are:

- the parameters theta are incorrect
- my cost seems to be blowing up
- I get many NaNs in my cost vector (I just create a vector of the costs to keep track)

I attempted to discover the parameters via gradient descent as I understood the content. However, my implementation still seems to be giving me incorrect results.

dataset = load('dataStuds.txt');
x = dataset(:,1:end-1);
y = dataset(:,end);
m = length(x);
% Padding with the 1's (intercept term, as they call it?)
x = [ones(length(x),1), x];
thetas = zeros(size(x,2),1);
% Setting the learning rate to 0.1
alpha = 0.1;
for i = 1:100000
    % theta transpose x (tho why in MATLAB it needs to be done the other way
    % round? :)
    ttrx = x * thetas;
    % the hypothesis function h_x = g(z) = sigmoid(-z)
    h_x = 1 ./ (1 + exp(-ttrx));
    error = h_x - y;
    % the gradient (aka the derivative of J(\theta) aka the derivative term)
    for j = 1:length(thetas)
        gradient = 1/m * (h_x - y)' * x(:,j);
        % Updating the parameters theta
        thetas(j) = thetas(j) - alpha * gradient;
    end
    % Calculating the cost, just to keep track...
    cost(i) = 1/m * ( -y' * log(h_x) - (1-y)' * log(1-h_x) );
end
% Displaying the final theta's that I obtained
thetas

The parameters theta that I get are:

thetas =
 -482.8509
    3.7457
    2.6976

The results below are from one example that I downloaded, but the author used fminunc for this one:

Cost at theta found by fminunc: 0.203506
theta:
 -24.932760
   0.204406
   0.199616

The data:

34.6236596245170 78.0246928153624 0
30.2867107682261 43.8949975240010 0
35.8474087699387 72.9021980270836 0
60.1825993862098 86.3085520954683 1
79.0327360507101 75.3443764369103 1
45.0832774766834 56.3163717815305 0
61.1066645368477 96.5114258848962 1
75.0247455673889 46.5540135411654 1
76.0987867022626 87.4205697192680 1
84.4328199612004 43.5333933107211 1
95.8615550709357 38.2252780579509 0
75.0136583895825 30.6032632342801 0
82.3070533739948 76.4819633023560 1
69.3645887597094 97.7186919618861 1
39.5383391436722 76.0368108511588 0
53.9710521485623 89.2073501375021 1
69.0701440628303 52.7404697301677 1
67.9468554771162 46.6785741067313 0
70.6615095549944 92.9271378936483 1
76.9787837274750 47.5759636497553 1
67.3720275457088 42.8384383202918 0
89.6767757507208 65.7993659274524 1
50.5347882898830 48.8558115276421 0
34.2120609778679 44.2095285986629 0
77.9240914545704 68.9723599933059 1
62.2710136700463 69.9544579544759 1
80.1901807509566 44.8216289321835 1
93.1143887974420 38.8006703371321 0
61.8302060231260 50.2561078924462 0
38.7858037967942 64.9956809553958 0
61.3792894474250 72.8078873131710 1
85.4045193941165 57.0519839762712 1
52.1079797319398 63.1276237688172 0
52.0454047683183 69.4328601204522 1
40.2368937354511 71.1677480218488 0
54.6351055542482 52.2138858806112 0
33.9155001090689 98.8694357422061 0
64.1769888749449 80.9080605867082 1
74.7892529594154 41.5734152282443 0
34.1836400264419 75.2377203360134 0
83.9023936624916 56.3080462160533 1
51.5477202690618 46.8562902634998 0
94.4433677691785 65.5689216055905 1
82.3687537571392 40.6182551597062 0
51.0477517712887 45.8227014577600 0
62.2226757612019 52.0609919483668 0
77.1930349260136 70.4582000018096 1
97.7715992800023 86.7278223300282 1
62.0730637966765 96.7688241241398 1
91.5649744980744 88.6962925454660 1
79.9448179406693 74.1631193504376 1
99.2725269292572 60.9990309984499 1
90.5467141139985 43.3906018065003 1
34.5245138532001 60.3963424583717 0
50.2864961189907 49.8045388132306 0
49.5866772163203 59.8089509945327 0
97.6456339600777 68.8615727242060 1
32.5772001680931 95.5985476138788 0
74.2486913672160 69.8245712265719 1
71.7964620586338 78.4535622451505 1
75.3956114656803 85.7599366733162 1
35.2861128152619 47.0205139472342 0
56.2538174971162 39.2614725105802 0
30.0588224466980 49.5929738672369 0
44.6682617248089 66.4500861455891 0
66.5608944724295 41.0920980793697 0
40.4575509837516 97.5351854890994 1
49.0725632190884 51.8832118207397 0
80.2795740146700 92.1160608134408 1
66.7467185694404 60.9913940274099 1
32.7228330406032 43.3071730643006 0
64.0393204150601 78.0316880201823 1
72.3464942257992 96.2275929676140 1
60.4578857391896 73.0949980975804 1
58.8409562172680 75.8584483127904 1
99.8278577969213 72.3692519338389 1
47.2642691084817 88.4758649955978 1
50.4581598028599 75.8098595298246 1
60.4555562927153 42.5084094357222 0
82.2266615778557 42.7198785371646 0
88.9138964166533 69.8037888983547 1
94.8345067243020 45.6943068025075 1
67.3192574691753 66.5893531774792 1
57.2387063156986 59.5142819801296 1
80.3667560017127 90.9601478974695 1
68.4685217859111 85.5943071045201 1
42.0754545384731 78.8447860014804 0
75.4777020053391 90.4245389975396 1
78.6354243489802 96.6474271688564 1
52.3480039879411 60.7695052560259 0
94.0943311251679 77.1591050907389 1
90.4485509709636 87.5087917648470 1
55.4821611406959 35.5707034722887 0
74.4926924184304 84.8451368493014 1
89.8458067072098 45.3582836109166 1
83.4891627449824 48.3802857972818 1
42.2617008099817 87.1038509402546 1
99.3150088051039 68.7754094720662 1
55.3400175600370 64.9319380069486 1
74.7758930009277 89.5298128951328 1
I ran your code and it does work fine. However, the tricky thing about gradient descent is ensuring that your costs don't diverge to infinity. If you look at your cost array, you will see that the costs definitely diverge, and this is why you are not getting the correct results. The best way to eliminate this in your case is to reduce the learning rate. Through experimentation, I have found that a learning rate of alpha = 0.003 works best for your problem. I've also increased the number of iterations to 200000. Changing these two things gives me the following parameters and associated cost:

>> format long g;
>> thetas

thetas =
 -17.6287417780435
   0.146062780453677
   0.140513170941357

>> cost(end)

ans =
   0.214821863463963

This is more or less in line with the magnitudes of the parameters you see when you are using fminunc. However, you get slightly different parameters, as well as different costs, because of the actual minimization method itself. fminunc uses a variant of L-BFGS, which finds the solution in a much faster way. What is most important is the actual accuracy. Remember that to classify whether an example belongs to label 0 or 1, you take the weighted sum of the parameters and examples, run it through the sigmoid function, and threshold at 0.5. We then find how often, on average, the expected and predicted labels match. Using the parameters we found with gradient descent gives us the following accuracy:

>> ttrx = x * thetas;
>> h_x = 1 ./ (1 + exp(-ttrx)) >= 0.5;
>> mean(h_x == y)

ans =
    0.89

This means that we've achieved 89% classification accuracy. Using the parameters provided by fminunc also gives:

>> thetas2 = [-24.932760; 0.204406; 0.199616];
>> ttrx = x * thetas2;
>> h_x = 1 ./ (1 + exp(-ttrx)) >= 0.5;
>> mean(h_x == y)

ans =
    0.89

The accuracy is the same, so I wouldn't worry too much about the magnitude of the parameters; it's more informative to compare the costs between the two implementations. As a final note, I would suggest looking at this post of mine for some tips on how to make logistic regression work over the long term. I would definitely recommend normalizing your features prior to finding the parameters to make the algorithm run faster. It also addresses why you were finding the wrong parameters (namely, the cost blowing up): Cost function in logistic regression gives NaN as a result.
Normalizing the data using the mean and standard deviation, as follows, enables you to use a large learning rate and get a similar answer:

clear; clc

data = load('ex2data1.txt');
m = length(data);
alpha = 0.1;
theta = [0; 0; 0];
y = data(:,3);

% Normalizing the data
xm1 = mean(data(:,1));
xm2 = mean(data(:,2));
xs1 = std(data(:,1));
xs2 = std(data(:,2));
x1 = (data(:,1)-xm1)./xs1;
x2 = (data(:,2)-xm2)./xs2;
X = [ones(m, 1) x1 x2];

for i = 1:10000
    h = 1./(1+exp(-(X*theta)));
    theta = theta - (alpha/m)*(X'*(h-y));
    J(i) = (1/m)*(-y'*log(h)-(1-y)'*log(1-h));
end

theta
J(end)
figure
plot(J)
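One caveat worth adding (an addition, not part of the original answer): the theta learned on normalized features lives on a different scale, so to compare it with the fminunc result you have to map it back to the original units, for example:

% Map the normalized-scale theta back to the original feature scale
% (xm1, xm2, xs1, xs2 are the means and standard deviations computed above)
theta_orig    = zeros(3,1);
theta_orig(2) = theta(2)/xs1;
theta_orig(3) = theta(3)/xs2;
theta_orig(1) = theta(1) - theta(2)*xm1/xs1 - theta(3)*xm2/xs2;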
interp2 with for loop
% I_Phase
[0 51.1111 102.2222 153.3333 204.4444 255.5556 306.6667 357.7778 408.8889 460]

% Theta
[-45 -30 -15 0 15 30 45 60 75 90]

% Torque_Matrix
[       0        0        0        0        0        0        0        0        0        0
  28.6989  35.9452  41.1581  43.8173  43.5092  40.0174  33.4011  24.0388  12.6221   0.0940
  52.6956  67.7241  79.8022  87.4465  89.2066  84.0131  71.5044  52.2439  27.7582   0.3762
  71.9900  95.3367 115.9323 130.8876 137.0923 131.9869 114.3100  84.6153  45.4083   0.8464
  86.5822 118.7830 149.5483 174.1406 187.1661 183.9389 161.8178 121.1530  65.5724   1.5047
  96.4722 138.0630 180.6504 217.2055 239.4282 239.8691 214.0278 161.8569  88.2505   2.3511
 101.6600 153.1768 209.2385 260.0824 293.8785 299.7776 270.9400 206.7272 113.4425   3.3856
 102.1456 164.1242 235.3126 302.7711 350.5170 363.6642 332.5544 255.7637 141.1486   4.6082
  97.9289 170.9054 258.8727 345.2718 409.3438 431.5290 398.8711 308.9666 171.3687   6.0188
  89.0100 173.5203 279.9188 387.5844 470.3588 503.3720 469.8900 366.3357 204.1028   7.6176]

I have two vectors (I_Phase1, Theta1) and one matrix (Torque_Matrix), and I need to interpolate with interp2. But in my function below, by the time the loop has finished, Theta_Int and I_Phase_Int hold only their final values (90 and 460 respectively), so I only have a single value in Torque_Value. I need values over -45:90 and 0:460, and I need to store them in a new array of Torque_Value values. I need help, please.

surf(Theta1,I_Phase1,Torque_Matrix);
hold on
Matrix_int = zeros(100);
index_Theta = 0;
index_I = 0;
for Theta_Int = -45:90
    for I_Phase_Int = 0:460
        Torque_Value = interp2(Theta1,I_Phase1,Torque_Matrix,Theta_Int,I_Phase_Int);
    end
end
surf(Theta_Int,I_Phase_Int,Torque_Value,'.r');
You can use the interp2 function with vectors, but if it helps a beginner like yourself to see what is going on in each step, for loops will still work. You just need to create vectors for Theta_Int and I_Phase_Int. Then, when you call interp2, you index Theta_Int(j) and I_Phase_Int(k), and store the values in Torque_Value(k,j). I swapped the for loops around so that when you do the surf, it plots along the same axes as above.

Theta_Int = -45:90;
I_Phase_Int = 0:460;
Torque_Value = zeros(length(I_Phase_Int), length(Theta_Int));  % pre-allocate
for k = 1:length(I_Phase_Int)
    for j = 1:length(Theta_Int)
        Torque_Value(k,j) = interp2(Theta1, I_Phase1, Torque_Matrix, Theta_Int(j), I_Phase_Int(k));
    end
end
figure();
surf(Theta_Int, I_Phase_Int, Torque_Value);

I also created a new figure so both of your plots are shown. So just replace everything from your for loop onwards with this code and it should work.
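For completeness, the loop-free call mentioned in the first sentence would look something like this (a sketch; it assumes Theta1 varies along the columns of Torque_Matrix and I_Phase1 along its rows, which is what the question's surf call implies):

[ThetaGrid, IGrid] = meshgrid(Theta_Int, I_Phase_Int);   % full query grid
Torque_Value = interp2(Theta1, I_Phase1, Torque_Matrix, ThetaGrid, IGrid);
figure();
surf(Theta_Int, I_Phase_Int, Torque_Value);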
CRC 16 in matlab
I actually want to generate a CRC in MATLAB for the Modbus protocol, and I have used the following code. I have given the message array as message = uint16([hex2dec('01') hex2dec('02') hex2dec('00') hex2dec('C4') hex2dec('00') hex2dec('16')]); and done bitand with 0xffff at the end, but it is unable to give the correct CRC. My code is below; the expected CRC is B839 according to the Modbus CRC calculator, but it gives B8DD (47325 decimal). Please help me if there is anything to change in the code. Thank you.

function crc_val = crc3 (~)
crc = uint16(hex2dec('1D0F'));  % Non-augmented initial value equivalent to augmented initial value 0xFFFF
polynomial = hex2dec('1021');   % Polynomial
message = uint16([hex2dec('01') hex2dec('02') hex2dec('00') hex2dec('C4') hex2dec('00') hex2dec('16') hex2dec('00') hex2dec('00')]);
for i = 1:(length(message)-2)   % Not taking the last 2 bytes because they are the CRC.
    crc = bitxor(crc, bitsll(message(i), 8));
    for j = 1:8
        if (bitand(crc, hex2dec('8000')) > 0)
            crc = bitxor(bitsll(crc, 1), polynomial);
        else
            crc = bitsll(crc, 1);
        end
    end
end
crc_val = bitand(crc, hex2dec('ffff'));
end
Did you try this? It is available under a BSD license, so you would not face any licensing issues. The following explains how CRC actually works and helps in understanding the concept.

% usage: crc16(input vector)
function [resto] = crc16(h)
% g(X) = X^16+X^15+X^2+1
gx = [1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1];
% P(X) is the given input vector
px = h;
% Calculate P(x)*x^r
pxr = [px zeros(1,length(gx)-1)];
% Deconvolve (divide) pxr by gx
[c r] = deconv(pxr,gx);
r = mod(abs(r),2);
% Return the CRC-16 (the remainder)
resto = r(length(px)+1:end);
end
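Note that both versions above compute non-reflected CRCs, whereas Modbus uses the reflected form of the 0x8005 polynomial (0xA001) with initial value 0xFFFF -- which is why a CCITT-style computation like the question's cannot reproduce B839. A minimal sketch of the Modbus variant (my own addition, not from the original answer; Modbus transmits the low CRC byte first, so check both byte orders against your calculator):

function crc = crc16_modbus(msg)
% msg is a vector of byte values (0..255), e.g. [1 2 0 196 0 22]
crc  = uint16(hex2dec('FFFF'));   % Modbus initial value
poly = uint16(hex2dec('A001'));   % 0x8005 bit-reversed
for i = 1:length(msg)
    crc = bitxor(crc, uint16(msg(i)));   % XOR the next byte into the low bits
    for j = 1:8
        if bitand(crc, 1)
            crc = bitxor(bitsrl(crc, 1), poly);  % shift right, then XOR polynomial
        else
            crc = bitsrl(crc, 1);
        end
    end
end
end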
Plotting time graph in MATLAB
I have a function of this form in MATLAB:

C = S*e^(L*t)*inv(S)*C_0

where

S = [-2 -3; 3 -2]
L = [0.5 0; 0 1.5]
C_0 = [1; 1]

I need to plot this function with respect to time. My output C is a 2-by-1 matrix. What I have done is compute e^L separately using b = expm(L), and then insert mpower(b,t) into the function. So the resulting function in the script looks like:

b = expm(L);
C = S*mpower(b,t)*inv(S)*C_0;

Now, how should I go about plotting this with respect to time? I tried defining a time vector and then using it, but quite obviously I get the error message that says matrix dimensions do not agree. Can someone give me a suggestion?
You can probably do this in a vectorised manner, but if you're not worried about speed or succinct code, why not just write a for loop?

ts = 1:100;
Cs = zeros(2, length(ts));
S = [-2 -3; 3 -2];
L = [0.5 0; 0 1.5];
C_0 = [1; 1];
b = expm(L);   % constant, so compute it once outside the loop
for ii = 1:length(ts)
    Cs(:,ii) = S*mpower(b,ts(ii))*inv(S)*C_0;
end

ts contains the time values; Cs contains the values of C at each time.
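To actually draw the graph (a small addition to the answer; note that expm(L*ts(ii)) would also work in place of mpower(b,ts(ii)) and, unlike mpower, handles non-integer times):

plot(ts, Cs(1,:), ts, Cs(2,:));
xlabel('t');
ylabel('C(t)');
legend('C_1', 'C_2');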
Saving derivative values in ode45 in Matlab
I'm simulating equations of motion for a (somewhat odd) system with mass-springs and a double pendulum, for which I have a mass matrix and a function f(x), and I call ode45 to solve M*x' = f(x,t). I have 5 state variables, q = [QDot, phi, phiDot, r, rDot]'. (I removed Q because nothing depends on it; QDot is current.)

Now, to calculate some forces, I would also like to save the values of rDotDot, which ode45 calculates at each integration step; however, ode45 doesn't give these back. I've searched around a bit, but the only two solutions I've found are:

a) turn this into a 3rd-order problem and add phiDotDot and rDotDot to the state vector. I would like to avoid this as much as possible, as the system is already non-linear and this really makes matters a lot worse and blows up computation time.

b) augment the state to directly calculate the function, as described here. However, in that example it says to add a row of zeros to the mass matrix. This makes sense, since otherwise it would integrate the derivative rather than just evaluate it at the one point, but on the other hand it makes the mass matrix singular. It doesn't seem to work for me...

This seems like such a basic thing (wanting the derivative values of the state vector) -- is there something quite obvious that I'm not thinking of? (Or something not so obvious would be OK too...)

Oh, and global variables are not so great because ode45 calls the f() function several times while refining its step, so the sizes of the global variable and the returned state vector q don't match at all.

In case someone needs it, here's the code:

% (Initialization of parameters is above this line)
options = odeset('Mass',@massMatrix);
[T,q] = ode45(@f, tspan, q0, options);

function dqdt = f(t,q,p)
    % q = [qDot phi phiDot r rDot]';
    dqdt = zeros(size(q));
    dqdt(1) = -R/L*q(1) - kb/L*q(3) + vs/L;
    dqdt(2) = q(3);
    dqdt(3) = kt*q(1) + mp*sin(q(2))*lp*g;
    dqdt(4) = q(5);
    dqdt(5) = mp*lp*cos(q(2))*q(3)^2 - ks*q(4) - (mb+mp)*g;
end

function M = massMatrix(~,q)
    M = [ 1 0 0 0 0;
          0 1 0 0 0;
          0 0 mp*lp^2 0 -mp*lp*sin(q(2));
          0 0 0 1 0;
          0 0 mp*lp*sin(q(2)) 0 (mb+mp) ];
end
The easiest solution is to just re-run your function on each of the values returned by ode45. The hard solution is to try to log your DotDots to some other place (a pre-allocated matrix or even an external file). The problem is that you might end up with unwanted data points if ode45 secretly does evaluations in weird places.
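A sketch of that easy route (my addition; it assumes the f and massMatrix functions from the question are in scope, and it accounts for the mass matrix, because with odeset('Mass',...) the function f returns M(q)*dqdt rather than dqdt itself):

% Recover the state derivatives at the output points of ode45 by
% re-evaluating f and solving M(q)*dqdt = f(t,q) at each output row.
dq = zeros(size(q));
for k = 1:length(T)
    dq(k,:) = ( massMatrix(T(k), q(k,:).') \ f(T(k), q(k,:).') ).';
end
rDotDot = dq(:,5);   % the 5th state is rDot, so its derivative is rDotDot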
Since you are using nested functions, you can use their scoping rules to mimic the behavior of global variables. It's easiest to (ab)use an output function for this purpose:

function solveODE

    % ....
    % (Initialization of parameters is above this line)

    % initialize "global" variable
    rDotDot = [];

    % Specify output function
    options = odeset(...
        'Mass', @massMatrix,...
        'OutputFcn', @outputFcn);

    % solve ODE
    [T,q] = ode45(@f, tspan, q0, options);

    % show the rDotDots
    rDotDot

    % derivative
    function dqdt = f(~,q)
        % q = [qDot phi phiDot r rDot]';
        dqdt = [...
            -R/L*q(1) - kb/L*q(3) + vs/L
            q(3)
            kt*q(1) + mp*sin(q(2))*lp*g
            q(5)
            mp*lp*cos(q(2))*q(3)^2 - ks*q(4) - (mb+mp)*g
        ];
    end % q-dot function

    % mass matrix
    function M = massMatrix(~,q)
        M = [ 1 0 0 0 0;
              0 1 0 0 0;
              0 0 mp*lp^2 0 -mp*lp*sin(q(2));
              0 0 0 1 0;
              0 0 mp*lp*sin(q(2)) 0 (mb+mp) ];
    end % mass matrix function

    % the output function collects values for rDotDot at the initial step
    % and each successful step
    function status = outputFcn(t,q,flag)
        status = 0;
        % at initialization, and after each successful step
        if isempty(flag) || strcmp(flag, 'init')
            deriv = f(t,q);
            rDotDot(end+1) = deriv(end);
        end
    end % output function

end

The output function only computes the derivatives at the initial and all successful steps, so it's basically doing the same as what Adrian Ratnapala suggested: re-run the derivative at each of the outputs of ode45. I think that would even be more elegant (+1 for Adrian). The output-function approach has the advantage that you can plot the rDotDot values while the integration is being run (don't forget a drawnow!), which can be very useful for long-running integrations.
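The same caveat as in the sketch above applies here (again my addition, not part of the original answer): with a mass matrix, f returns M(q)*dqdt rather than dqdt itself, so the output function should solve the mass-matrix system before storing the value; ode45 may also hand the output function several columns of q at once when output refinement is on:

% inside outputFcn, instead of deriv = f(t,q):
for c = 1:size(q,2)                                    % q may hold several steps
    deriv = massMatrix(t(c), q(:,c)) \ f(t(c), q(:,c));
    rDotDot(end+1) = deriv(end);
end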