Background: I am working on a problem similar to the nonlinear logistic regression described in the link [1] (my problem is more complicated, but link [1] is enough for the next sections of this post). Comparing my results with those obtained in parallel with a R package, I got similar results for the coefficients, but (very approximately) an opposite logLikelihood.
Hypothesis: The logLikelihood given by fitnlm in Matlab is in fact the negative LogLikelihood. (Note that this impairs consequently the BIC and AIC computation by Matlab)
Reasonning: in [1], the same problem is solved through two different approaches. ML-approach/ By defining the negative LogLikelihood and making an optimization with fminsearch. GLS-approach/ By using fitnlm.
The negative LogLikelihood after the ML-approach is:380
The negative LogLikelihood after the GLS-approach is:-406
I imagine the second one should be at least multiplied by (-1)?
Questions: Did I miss something? Is the (-1) coefficient enough, or would this simple correction not be enough?
Self-contained code:
%copy-pasting code from [1]
myf = #(beta,x) beta(1)*x./(beta(2) + x);
mymodelfun = #(beta,x) 1./(1 + exp(-myf(beta,x)));
rng(300,'twister');
x = linspace(-1,1,200)';
beta = [10;2];
beta0=[3;3];
mu = mymodelfun(beta,x);
n = 50;
z = binornd(n,mu);
y = z./n;
%ML Approach
mynegloglik = #(beta) -sum(log(binopdf(z,n,mymodelfun(beta,x))));
opts = optimset('fminsearch');
opts.MaxFunEvals = Inf;
opts.MaxIter = 10000;
betaHatML = fminsearch(mynegloglik,beta0,opts)
neglogLH_MLApproach = mynegloglik(betaHatML);
%GLS Approach
wfun = #(xx) n./(xx.*(1-xx));
nlm = fitnlm(x,y,mymodelfun,beta0,'Weights',wfun)
neglogLH_GLSApproach = - nlm.LogLikelihood;
Source:
[1] https://uk.mathworks.com/help/stats/examples/nonlinear-logistic-regression.html
This answer (now) only details which code is used. Please see Tom Lane's answer below for a substantive answer.
Basically, fitnlm.m is a call to NonLinearModel.fit.
When opening NonLinearModel.m, one gets in line 1209:
model.LogLikelihood = getlogLikelihood(model);
getlogLikelihood is itself described between lines 1234-1251.
For instance:
function L = getlogLikelihood(model)
(...)
L = -(model.DFE + model.NumObservations*log(2*pi) + (...) )/2;
(...)
Please also not that this notably impacts ModelCriterion.AIC and ModelCriterion.BIC, as they are computed using model.LogLikelihood ("thinking" it is the logLikelihood).
To get the corresponding formula for BIC/AIC/..., type:
edit classreg.regr.modelutils.modelcriterion
this is Tom from MathWorks. Take another look at the formula quoted:
L = -(model.DFE + model.NumObservations*log(2*pi) + (...) )/2;
Remember the normal distribution has a factor (1/sqrt(2*pi)), so taking logs of that gives us -log(2*pi)/2. So the minus sign comes from that and it is part of the log likelihood. The property value is not the negative log likelihood.
One reason for the difference in the two log likelihood values is that the "ML approach" value is computing something based on the discrete probabilities from the binomial distribution. Those are all between 0 and 1, and they add up to 1. The "GLS approach" is computing something based on the probability density of the continuous normal distribution. In this example, the standard deviation of the residuals is about 0.0462. That leads to density values that are much higher than 1 at the peak. So the two things are not really comparable. You would need to convert the normal values to probabilities on the same discrete intervals that correspond to individual outcomes from the binomial distribution.
Related
Summary: This question deals with the improvement of an algorithm for the computation of linear regression.
I have a 3D (dlMAT) array representing monochrome photographs of the same scene taken at different exposure times (the vector IT) . Mathematically, every vector along the 3rd dimension of dlMAT represents a separate linear regression problem that needs to be solved. The equation whose coefficients need to be estimated is of the form:
DL = R*IT^P, where DL and IT are obtained experimentally and R and P must be estimated.
The above equation can be transformed into a simple linear model after applying a logarithm:
log(DL) = log(R) + P*log(IT) => y = a + b*x
Presented below is the most "naive" way to solve this system of equations, which essentially involves iterating over all "3rd dimension vectors" and fitting a polynomial of order 1 to (IT,DL(ind1,ind2,:):
%// Define some nominal values:
R = 0.3;
IT = 600:600:3000;
P = 0.97;
%// Impose some believable spatial variations:
pMAT = 0.01*randn(3)+P;
rMAT = 0.1*randn(3)+R;
%// Generate "fake" observation data:
dlMAT = bsxfun(#times,rMAT,bsxfun(#power,permute(IT,[3,1,2]),pMAT));
%// Regression:
sol = cell(size(rMAT)); %// preallocation
for ind1 = 1:size(dlMAT,1)
for ind2 = 1:size(dlMAT,2)
sol{ind1,ind2} = polyfit(log(IT(:)),log(squeeze(dlMAT(ind1,ind2,:))),1);
end
end
fittedP = cellfun(#(x)x(1),sol); %// Estimate of pMAT
fittedR = cellfun(#(x)exp(x(2)),sol); %// Estimate of rMAT
The above approach seems like a good candidate for vectorization, since it does not utilize MATLAB's main strength that is MATrix operations. For this reason, it does not scale very well and takes much longer to execute than I think it should.
There exist alternative ways to perform this computation based on matrix division, as demonstrated here and here, which involve something like this:
sol = [ones(size(x)),log(x)]\log(y);
That is, appending a vector of 1s to the observations, followed by mldivide to solve the equation system.
The main challenge I'm facing is how to adapt my data to the algorithm (or vice versa).
Question #1: How can the matrix-division-based solution be extended to solve the problem presented above (and potentially replace the loops I am using)?
Question #2 (bonus): What is the principle behind this matrix-division-based solution?
The secret ingredient behind the solution that includes matrix division is the Vandermonde matrix. The question discusses a linear problem (linear regression), and those can always be formulated as a matrix problem, which \ (mldivide) can solve in a mean-square error sense‡. Such an algorithm, solving a similar problem, is demonstrated and explained in this answer.
Below is benchmarking code that compares the original solution with two alternatives suggested in chat1, 2 :
function regressionBenchmark(numEl)
clc
if nargin<1, numEl=10; end
%// Define some nominal values:
R = 5;
IT = 600:600:3000;
P = 0.97;
%// Impose some believable spatial variations:
pMAT = 0.01*randn(numEl)+P;
rMAT = 0.1*randn(numEl)+R;
%// Generate "fake" measurement data using the relation "DL = R*IT.^P"
dlMAT = bsxfun(#times,rMAT,bsxfun(#power,permute(IT,[3,1,2]),pMAT));
%% // Method1: loops + polyval
disp('-------------------------------Method 1: loops + polyval')
tic; [fR,fP] = method1(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
%% // Method2: loops + Vandermonde
disp('-------------------------------Method 2: loops + Vandermonde')
tic; [fR,fP] = method2(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
%% // Method3: vectorized Vandermonde
disp('-------------------------------Method 3: vectorized Vandermonde')
tic; [fR,fP] = method3(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
function [fittedR,fittedP] = method1(IT,dlMAT)
sol = cell(size(dlMAT,1),size(dlMAT,2));
for ind1 = 1:size(dlMAT,1)
for ind2 = 1:size(dlMAT,2)
sol{ind1,ind2} = polyfit(log(IT(:)),log(squeeze(dlMAT(ind1,ind2,:))),1);
end
end
fittedR = cellfun(#(x)exp(x(2)),sol);
fittedP = cellfun(#(x)x(1),sol);
function [fittedR,fittedP] = method2(IT,dlMAT)
sol = cell(size(dlMAT,1),size(dlMAT,2));
for ind1 = 1:size(dlMAT,1)
for ind2 = 1:size(dlMAT,2)
sol{ind1,ind2} = flipud([ones(numel(IT),1) log(IT(:))]\log(squeeze(dlMAT(ind1,ind2,:)))).'; %'
end
end
fittedR = cellfun(#(x)exp(x(2)),sol);
fittedP = cellfun(#(x)x(1),sol);
function [fittedR,fittedP] = method3(IT,dlMAT)
N = 1; %// Degree of polynomial
VM = bsxfun(#power, log(IT(:)), 0:N); %// Vandermonde matrix
result = fliplr((VM\log(reshape(dlMAT,[],size(dlMAT,3)).')).');
%// Compressed version:
%// result = fliplr(([ones(numel(IT),1) log(IT(:))]\log(reshape(dlMAT,[],size(dlMAT,3)).')).');
fittedR = exp(real(reshape(result(:,2),size(dlMAT,1),size(dlMAT,2))));
fittedP = real(reshape(result(:,1),size(dlMAT,1),size(dlMAT,2)));
The reason why method 2 can be vectorized into method 3 is essentially that matrix multiplication can be separated by the columns of the second matrix. If A*B produces matrix X, then by definition A*B(:,n) gives X(:,n) for any n. Moving A to the right-hand side with mldivide, this means that the divisions A\X(:,n) can be done in one go for all n with A\X. The same holds for an overdetermined system (linear regression problem), in which there is no exact solution in general, and mldivide finds the matrix that minimizes the mean-square error. In this case too, the operations A\X(:,n) (method 2) can be done in one go for all n with A\X (method 3).
The implications of improving the algorithm when increasing the size of dlMAT can be seen below:
For the case of 500*500 (or 2.5E5) elements, the speedup from Method 1 to Method 3 is about x3500!
It is also interesting to observe the output of profile (here, for the case of 500*500):
Method 1
Method 2
Method 3
From the above it is seen that rearranging the elements via squeeze and flipud takes up about half (!) of the runtime of Method 2. It is also seen that some time is lost on the conversion of the solution from cells to matrices.
Since the 3rd solution avoids all of these pitfalls, as well as the loops altogether (which mostly means re-evaluation of the script on every iteration) - it unsurprisingly results in a considerable speedup.
Notes:
There was very little difference between the "compressed" and the "explicit" versions of Method 3 in favor of the "explicit" version. For this reason it was not included in the comparison.
A solution was attempted where the inputs to Method 3 were gpuArray-ed. This did not provide improved performance (and even somewhat degradaed them), possibly due to wrong implementation, or the overhead associated with copying matrices back and forth between RAM and VRAM.
I have a dataset comprising of 30 independent variables and I tried performing linear regression in MATLAB R2010b using the regress function.
I get a warning stating that my matrix X is rank deficient to within machine precision.
Now, the coefficients I get after executing this function don't match with the experimental one.
Can anyone please suggest me how to perform the regression analysis for this equation which is comprising of 30 variables?
Going with our discussion, the reason why you are getting that warning is because you have what is known as an underdetermined system. Basically, you have a set of constraints where you have more variables that you want to solve for than the data that is available. One example of an underdetermined system is something like:
x + y + z = 1
x + y + 2z = 3
There are an infinite number of combinations of (x,y,z) that can solve the above system. For example, (x, y, z) = (1, −2, 2), (2, −3, 2), and (3, −4, 2). What rank deficient means in your case is that there is more than one set of regression coefficients that would satisfy the governing equation that would describe the relationship between your input variables and output observations. This is probably why the output of regress isn't matching up with your ground truth regression coefficients. Though it isn't the same answer, do know that the output is one possible answer. By running through regress with your data, this is what I get if I define your observation matrix to be X and your output vector to be Y:
>> format long g;
>> B = regress(Y, X);
>> B
B =
0
0
28321.7264417536
0
35241.9719076362
899.386999172398
-95491.6154990829
-2879.96318251964
-31375.7038251919
5993.52959752106
0
18312.6649115112
0
0
8031.4391233753
27923.2569044728
7716.51932560781
-13621.1638587172
36721.8387047613
80622.0849069525
-114048.707780113
-70838.6034825939
-22843.7931997405
5345.06937207617
0
106542.307940305
-14178.0346010715
-20506.8096166108
-2498.51437396558
6783.3107243113
You can see that there are seven regression coefficients that are equal to 0, which corresponds to 30 - 23 = 7. We have 30 variables and 23 constraints to work with. Be advised that this is not the only possible solution. regress essentially computes the least squared error solution such that sum of residuals of Y - X*B has the least amount of error. This essentially simplifies to:
B = X^(*)*Y
X^(*) is what is known as the pseudo-inverse of the matrix. MATLAB has this available, and it is called pinv. Therefore, if we did:
B = pinv(X)*Y
We get:
B =
44741.6923363563
32972.479220139
-31055.2846404536
-22897.9685877566
28888.7558524005
1146.70695371731
-4002.86163441217
9161.6908044046
-22704.9986509788
5526.10730457192
9161.69080479427
2607.08283489226
2591.21062004404
-31631.9969765197
-5357.85253691504
6025.47661106009
5519.89341411127
-7356.00479046122
-15411.5144034056
49827.6984426955
-26352.0537850382
-11144.2988973666
-14835.9087945295
-121.889618144655
-32355.2405829636
53712.1245333841
-1941.40823106236
-10929.3953469692
-3817.40117809984
2732.64066796307
You see that there are no zero coefficients because pinv finds the solution using the L2-norm, which promotes the "spreading" out of the errors (for a lack of a better term). You can verify that these are correct regression coefficients by doing:
>> Y2 = X*B
Y2 =
16.1491563400241
16.1264219600856
16.525331600049
17.3170318001845
16.7481541301999
17.3266932502295
16.5465094100486
16.5184456100487
16.8428701100165
17.0749421099829
16.7393450000517
17.2993993099419
17.3925811702017
17.3347117202356
17.3362798302375
17.3184486799219
17.1169638102517
17.2813552099096
16.8792925100727
17.2557945601102
17.501873690151
17.6490477001922
17.7733493802508
Similarly, if we used the regression coefficients from regress, so B = regress(Y,X); then doing Y2 = X*B, we get:
Y2 =
16.1491563399927
16.1264219599996
16.5253315999987
17.3170317999969
16.7481541299967
17.3266932499992
16.5465094099978
16.5184456099983
16.8428701099975
17.0749421099985
16.7393449999981
17.2993993099983
17.3925811699993
17.3347117199991
17.3362798299967
17.3184486799987
17.1169638100025
17.281355209999
16.8792925099983
17.2557945599979
17.5018736899983
17.6490476999977
17.7733493799981
There are some slight computational differences, which is to be expected. Similarly, we can also find the answer by using mldivide:
B = X \ Y
B =
0
0
28321.726441712
0
35241.9719075889
899.386999170666
-95491.6154989513
-2879.96318251572
-31375.7038251485
5993.52959751295
0
18312.6649114859
0
0
8031.43912336425
27923.2569044349
7716.51932559712
-13621.1638586983
36721.8387047123
80622.0849068411
-114048.707779954
-70838.6034824987
-22843.7931997086
5345.06937206919
0
106542.307940158
-14178.0346010521
-20506.8096165825
-2498.51437396236
6783.31072430201
You can see that this curiously matches up with what regress gives you. That's because \ is a more smarter operator. Depending on how your matrix is structured, it finds the solution to the system by a different method. I'd like to defer you to the post by Amro that talks about what algorithms mldivide uses when examining the properties of the input matrix being operated on:
How to implement Matlab's mldivide (a.k.a. the backslash operator "\")
What you should take away from this answer is that you can certainly go ahead and use those regression coefficients and they will more or less give you the expected output for each value of Y with each set of inputs for X. However, be warned that those coefficients are not unique. This is apparent as you said that you have ground truth coefficients that don't match up with the output of regress. It isn't matching up because it generated another answer that satisfies the constraints you have provided.
There is more than one answer that can describe that relationship if you have an underdetermined system, as you have seen by my experiments shown above.
I am having a hard time understanding what should be a simple concept. I have constructed a vector in MATLAB that is real and symmetric. When I take the FFT in MATLAB, the result has a significant imaginary component, even though the symmetry rules of the Fourier transform say that the FT of a real symmetric function should also be real and symmetric. My example code:
N = 1 + 2^8;
k = linspace(-1,1,N);
V = exp(-abs(k));
Vf1 = fft(fftshift(V));
Vf2 = fft(ifftshift(V));
Vf3 = ifft(fftshift(V));
Vf4 = ifft(ifftshift(V));
Vf5 = fft(V);
Vf6 = ifft(V);
disp([isreal(Vf1) isreal(Vf2) isreal(Vf3) isreal(Vf4) isreal(Vf5) isreal(Vf6)])
Result:
0 0 0 0 0 0
No combinations of (i)fft or (i)fftshift result in a real symmetric vector. I've tried with both even and odd N (N = 2^8 vs. N = 1+2^8).
I did try looking at k+flip(k) and there are some residuals on the order of eps(1), but the residuals are also symmetric and the imaginary part of the FFT is not coming out as fuzz on the order of eps(1), but rather with magnitude comparable to the real part.
What blindingly obvious thing am I missing?
Blindingly obvious thing I was missing:
The FFT is not an integral over all space, so it assumes a periodic signal. Above, I am duplicating the last point in the period when I choose an even N, and so there is no way to shift it around to put the zero frequency at the beginning without fractional indexing, which does not exist.
A word about my choice of k. It is not arbitrary. The actual problem I am trying to solve is to generate a model FTIR interferogram which I will then FFT to get a spectrum. k is the distance that the interferometer travels which gets transformed to frequency in wavenumbers. In the real problem there will be various scaling factors so that the generating function V will yield physically meaningful numbers.
It's
Vf = fftshift(fft(ifftshift(V)));
That is, you need ifftshift in time-domain so that samples are interpreted as those of a symmetric function, and then fftshift in frequency-domain to again make symmetry apparent.
This only works for N odd. For N even, the concept of a symmetric function does not make sense: there is no way to shift the signal so that it is symmetric with respect to the origin (the origin would need to be "between two samples", which is impossible).
For your example V, the above code gives Vf real and symmetric. The following figure has been generated with semilogy(Vf), so that small as well as large values can be seen. (Of course, you could modify the horizontal axis so that the graph is centered at 0 frequency as it should; but anyway the graph is seen to be symmetric.)
#Yvon is absolutely right with his comment about symmetry. Your input signal looks symmetrical, but it isn't because symmetry is related to origin 0.
Using linspace in Matlab for constructing signals is mostly a bad choice.
Trying to repair the results with fftshift is a bad idea too.
Use instead:
k = 2*(0:N-1)/N - 1;
and you will get the result you expect.
However the imaginary part of the transformed values will not be perfectly zero.
There is some numerical noise.
>> max(abs(imag(Vf5)))
ans =
2.5535e-15
Answer to Yvon's question:
Why? >> N = 1+2^4 N = 17 >> x=linspace(-1,1,N) x = -1.0000 -0.8750 -0.7500 -0.6250 -0.5000 -0.3750 -0.2500 -0.1250 0 0.1250 0.2500 0.3750 0.5000 0.6250 0.7500 0.8750 1.0000 >> y=2*(0:N-1)/N-1 y = -1.0000 -0.8824 -0.7647 -0.6471 -0.5294 -0.4118 -0.2941 -0.1765 -0.0588 0.0588 0.1765 0.2941 0.4118 0.5294 0.6471 0.7647 0.8824 – Yvon 1
Your example is not a symmetric (even) function, but an antisymmetric (odd) function. However, this makes no difference.
For a antisymmetric function of length N the following statement is true:
f[i] == -f[-i] == -f[N-i]
The index i runs from 0 to N-1.
Let us see was happens with i=2. Remember, count starts with 0 and ends with 16.
x[2] = -0.75
-x[N-2] == -x[17-2] == -x[15] = (-1) 0.875 = -0.875
x[2] != -x[N-2]
y[2] = -0.7647
-y[N-2] == -y[15] = (-1) 0.7647
y[2] == y[N-2]
The problem is, that the origin of Matlab vectors start at 1.
Modulo (periodic) vectors start with origin 0.
This difference leads to many misunderstandings.
Another way of explanation why linspace(-1,+1,N) is not correct:
Imagine you have a vector which holds a single period of a periodic function,
for instance a Cosinus function. This single period is one of a infinite number of periods.
The first value of your Cosinus vector must not be same as the last value of your vector.
However,that is exactly what linspace(-1,+1,N) does.
Doing so, results in a sequence where the last value of period 1 is the same value as the first sample of the following period 2. That is not what you want.
To avoid this mistake use t = 2*(0:N-1)/N - 1. The distance t[i+1]-t[i] is 2/N and the last value has to be t[N-1] = 1 - 2/N and not 1.
Answer to Yvon's second comment
Whatever you put in an input vector of a DFT/FFT, by theory it is interpreted as a periodic function.
But that is not the point.
DFT performs an integration.
fft(m) = Sum_(k=0)^(N-1) (x(k) exp(-i 2 pi m k/N )
The first value x(k=0) describes the amplitude of the first integration interval of length 1/N. The second value x(k=1) describes the amplitude of the second integration interval of length 1/N. And so on.
The very last integration interval of the symmetric function ends with same value as the first sample. This means, the starting point of the last integration interval is k=N-1 = 1-1/N. Your input vector holds the starting points of the integration intervals.
Therefore, the last point of symmetry k=N is a point of the function, but it is not a starting point of an integration interval and so it is not a member of the input vector.
You have a problem when implementing the concept "symmetry". A purely real, even (or "symmetric") function has a Fourier transform function that is also real and even. "Even" is the symmetry with respect to the y-axis, or the t=0 line.
When implementing a signal in Matlab, however, you always start from t=0. That is, there is no way to "define" the signal starting from before the origin of time.
Searching the Internet for a while lead me to this -
Correct use of fftshift and ifftshift at input to fft and ifft.
As Luis has pointed out, you need to perform ifftshift before feeding the signal into fft. The reason has never been documented in Matlab, but only in that thread. For historical reasons, outputs AND inputs of fft and ifft are swapped. That is, instead of ordered from -N/2 to N/2-1 (the natural order), the signal in time or frequency domain is ordered from 0 to N/2-1 and then -N/2 to -1. That means, the correct way to code is fft( ifftshift(V) ), but most people ignore this at most times. Why it's got silently ignored rather than raising huge problems is that most concerns have been put on the amplitude of signal, not phase. Since circular shifting does not affect amplitude spectrum, this is not a problem (even for the Matlab guys who have written the documentations).
To check the amplitude equality -
Vf2 = fft(ifftshift(V));
Vf5 = fft(V);
Va2 = abs(fftshift(Vf2));
Va5 = abs(fftshift(Vf5));
>> min(abs(Va2-Va5)<1e-10)
ans =
1
To see how badly wrong in phase -
Vp2 = angle(fftshift(Vf2));
Vp5 = angle(fftshift(Vf5));
Anyway, as I wrote in the comment, after copy&pasting your code into a fresh and clean Matlab, it gives 0 1 0 1 0 0.
To your question about N=even and N=odd, my opinion is when N=even, the signal is not symmetric, since there are unequal number of points on either side of the time origin.
Just add the following line after "k = linspace(-1,1,N);"
k(end)=[];
it will remove the last element of the array. This is defined to be symmetric array.
also consider that isreal(complex(1,0)) is false!!!
The isreal function just checks for the memory storage format. so 1+0i is not real in the above example.
You have define your function in order to check for real numbers (like this)
myisreal=#(x) all((abs(imag(x))<1e-6*abs(real(x)))|(abs(x)<1e-8));
Finally your source code should become something like this:
N = 1 + 2^8;
k = linspace(-1,1,N);
k(end)=[];
V = exp(-abs(k));
Vf1 = fft(fftshift(V));
Vf2 = fft(ifftshift(V));
Vf3 = ifft(fftshift(V));
Vf4 = ifft(ifftshift(V));
Vf5 = fft(V);
Vf6 = ifft(V);
myisreal=#(x) all((abs(imag(x))<1e-6*abs(real(x)))|(abs(x)<1e-8));
disp([myisreal(Vf1) myisreal(Vf2) myisreal(Vf3) myisreal(Vf4) myisreal(Vf5) myisreal(Vf6)]);
I would like to calibrate a interest rate tree using the optimization tool in matlab. Need some guidance on doing it.
The interest rate tree looks like this:
How it works:
3.73% = 2.5%*exp(2*0.2)
96.40453 = (0.5*100 + 0.5*100)/(1+3.73%)
94.15801 = (0.5*96.40453+ 0.5*97.56098)/(1+2.50%)
The value of 2.5% is arbitrary and the upper node is obtained by multiplying with an exponential of 2*volatility(here it is 20%).
I need to optimize the problem by varying different values for the lower node.
How do I do this optimization in Matlab?
What I have tried so far?
InterestTree{1}(1,1) = 0.03;
InterestTree{3-1}(1,3-1)= 2.5/100;
InterestTree{3}(2,:) = 100;
InterestTree{3-1}(1,3-2)= (2.5*exp(2*0.2))/100;
InterestTree{3-1}(2,3-1)=(0.5*InterestTree{3}(2,3)+0.5*InterestTree{3}(2,3-1))/(1+InterestTree{3-1}(1,3-1));
j = 3-2;
InterestTree{3-1}(2,3-2)=(0.5*InterestTree{3}(2,j+1)+0.5*InterestTree{3}(2,j))/(1+InterestTree{3-1}(1,j));
InterestTree{3-2}(2,3-2)=(0.5*InterestTree{3-1}(2,j+1)+0.5*InterestTree{3-1}(2,j))/(1+InterestTree{3-2}(1,j));
But I am not sure how to go about the optimization. Any suggestions to improve the code, do tell me..Need some guidance on this..
Are you expecting the tree to increase in size? Or are you just optimizing over the value of the "2.5%" parameter?
If it's the latter, there are two ways. The first is to model the tree using a closed form expression by replacing 2.5% with x, which is possible with the tree. There are nonlinear optimization toolboxes available in Matlab (e.g. more here), but it's been too long since I've done this to give you a more detailed answer.
The seconds is the approach I would immediately do. I'm interpreting the example you gave, so the equations I'm using may be incorrect - however, the principle of using the for loop is the same.
vol = 0.2;
maxival = 100;
val1 = zeros(1,maxival); %Preallocate
finalval = zeros(1,maxival);
for ival=1:maxival
val1(ival) = i/1000; %Use any scaling you want. This will go from 0.1% to 10%
val2=val1(ival)*exp(2*vol);
x1 = (0.5*100+0.5*100)/(1+val2); %Based on the equation you gave
x2 = (0.5*100+0.5*100)/(1+val1(ival)); %I'm assuming this is how you calculate the bottom node
finalval(ival) = x1*0.5+x2*0.5/(1+...); %The example you gave isn't clear, so replace this with whatever it should be
end
[maxval, indmaxval] = max(finalval);
The maximum value is in maxval, and the interest that maximized this is in val1(indmaxval).
I have the following code which is used to deconvolve a signal. It works very well, within my error limit...as long as I divide my final result by a very large factor (11000).
width = 83.66;
x = linspace(-400,400,1000);
a2 = 1.205e+004 ;
al = 1.778e+005 ;
b1 = 94.88 ;
c1 = 224.3 ;
d = 4.077 ;
measured = al*exp(-((abs((x-b1)./c1).^d)))+a2;
rect = #(x) 0.5*(sign(x+0.5) - sign(x-0.5));
rt = rect(x/83.66);
signal = conv(rt,measured,'same');
check = (1/11000)*conv(signal,rt,'same');
Here is what I have. measured represents the signal I was given. Signal is what I am trying to find. And check is to verify that if I convolve my slit with the signal I found, I get the same result. If you use what I have exactly, you will see that the check and measured are off by that factor of 11000~ish that I threw up there.
Does anyone have any suggestions. My thoughts are that the slit height is not exactly 1 or that convolve will not actually effectively deconvolve, as I request it to. (The use of deconv only gives me 1 point, so I used convolve instead).
I think you misunderstand what conv (and probably also therefore deconv) is doing.
A discrete convolution is simply a sum. In fact, you can expand it as a sum, using a couple of explicit loops, sums of products of the measured and rt vectors.
Note that sum(rt) is not 1. Were rt scaled to sum to 1, then conv would preserve the scaling of your original vector. So, note how the scalings pass through here.
sum(rt)
ans =
104
sum(measured)
ans =
1.0231e+08
signal = conv(rt,measured);
sum(signal)
ans =
1.0640e+10
sum(signal)/sum(rt)
ans =
1.0231e+08
See that this next version does preserve the scaling of your vector:
signal = conv(rt/sum(rt),measured);
sum(signal)
ans =
1.0231e+08
Now, as it turns out, you are using the same option for conv. This introduces an edge effect, since it truncates some of the signal so it ends up losing just a bit.
signal = conv(rt/sum(rt),measured,'same');
sum(signal)
ans =
1.0187e+08
The idea is that conv will preserve the scaling of your signal as long as the kernel is scaled to sum to 1, AND there are no losses due to truncation of the edges. Of course convolution as an integral also has a similar property.
By the way, where did that quoted factor of roughly 11000 come from?
sum(rt)^2
ans =
10816
Might be coincidence. Or not. Think about it.