Matlab. Poisson fit. Factorial - matlab

I have a histogram that seems to follow a Poisson distribution.
In order to fit it, I declare the function myself as follows:
xdata; ydata; % Arrays in which I have stored the data.
%Ydata tell us how many times the xdata is repeated in the set.
fun= @(x,xdata) (exp(-x(1))*(x(1).^(xdata)) )/(factorial(xdata)) %Function I
% want to use in the fit. It is a poisson distribution.
x0=[1]; %Approximated value of the parameter lambda to help the fit
p=lsqcurvefit(fun,x0,xdata,ydata); % Fit in the least square sense
I find an error. It probably has to do with the "factorial". Any ideas?

Factorial outputs a vector from vector xdata. Why are you using .xdata in factorial?
For example:
data = [1 2 3];
factorial(data) is then [1! 2! 3!].
Try ./factorial(xdata). The dot is necessary in this case: with two vectors, / performs matrix right division instead of element-wise division.
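To see the difference between the two operators, here is a small sketch with made-up numbers (the variable names den, a and b are only illustrative). With two row vectors of equal length, / solves a least-squares system and returns a single scalar, while ./ divides element by element:

```matlab
data = [1 2 3];
den  = factorial(data);   % [1 2 6]

a = data / den;           % matrix right division: one scalar (least-squares solution)
b = data ./ den;          % element-wise division: one result per element

disp(size(a));            % a is 1-by-1
disp(b);                  % b has the same size as data
```

A scalar returned where a vector is expected is what typically makes a fitting routine complain about mismatched sizes.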

Use gamma(xdata+1) instead of factorial(xdata), and divide element-wise with ./ rather than /. The gamma function generalizes the factorial to real and complex arguments, with gamma(n+1) = n! for non-negative integers n. Thus, your code would be:
fun = @(x,xdata) exp(-x(1))*x(1).^xdata./gamma(xdata+1);
x = lsqcurvefit(fun,1,xdata,ydata);
Alternatively, you can use MATLAB's fitdist function, which is already optimized and might give you better results:
pd = fitdist(xdata,'Poisson','Frequency',ydata);
pd.lambda
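If neither toolbox is at hand, the same idea can be sketched with base functions only. The counts below are invented for illustration, and the sketch assumes that the counts should first be normalized to relative frequencies, since the Poisson probabilities sum to 1 while raw counts do not. It uses fminsearch in place of lsqcurvefit (a substitution, not the method from the answer above) and compares the result with the count-weighted mean, which is the maximum-likelihood estimate of lambda:

```matlab
xdata = 0:6;                       % hypothetical histogram bins
ydata = [14 27 27 18 9 4 1];       % hypothetical counts per bin

% Poisson probability mass function, written with gamma(k+1) = k!
pmf = @(lambda, k) exp(-lambda) .* lambda.^k ./ gamma(k + 1);

% Maximum-likelihood estimate: the count-weighted mean of the data
lambda_mle = sum(xdata .* ydata) / sum(ydata);

% Least-squares estimate against relative frequencies
yprob     = ydata / sum(ydata);
sse       = @(lambda) sum((pmf(lambda, xdata) - yprob).^2);
lambda_ls = fminsearch(sse, 1);

fprintf('lambda (MLE) = %.4f, lambda (least squares) = %.4f\n', ...
        lambda_mle, lambda_ls);
```

The two estimates answer slightly different questions, so they will generally be close but not identical.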

Related

Minimizing Function with vector valued input in MATLAB

I want to minimize a function like below:
$$f(x) = \sum_{i=1}^{n-1} \left[ 100\,(x_{i+1} - x_i^2)^2 + (1 - x_i)^2 \right]$$
Here, n can be 5,10,50 etc. I want to use Matlab and want to use Gradient Descent and Quasi-Newton Method with BFGS update to solve this problem along with backtracking line search. I am a novice in Matlab. Can anyone help, please? I can find a solution for a similar problem in that link: https://www.mathworks.com/help/optim/ug/unconstrained-nonlinear-optimization-algorithms.html .
But, I really don't know how to create a vector-valued function in Matlab (in my case input x can be an n-dimensional vector).
You will have to make quite a leap to get where you want to be -- may I suggest to go through some basic tutorial first in order to digest basic MATLAB syntax and concepts? Another useful read is the very basic example to unconstrained optimization in the documentation. However, the answer to your question touches only basic syntax, so we can go through it quickly nevertheless.
The absolute minimum needed to invoke the unconstrained nonlinear optimization algorithms of the Optimization Toolbox is an objective function. That function is supposed to return the function value f of your function at any given point x, and in your case it reads
function f = objfun(x)
f = sum(100 * (x(2:end) - x(1:end-1).^2).^2 + (1 - x(1:end-1)).^2);
end
Notice that
we select the individual components of the x vector by matrix indexing, and that
the .^ notation squares its operand elementwise.
For simplicity, save this function to a file objfun.m in your current working directory, so that you have it available from the command window.
Now all you have to do is to call the appropriate optimization algorithm, say, the quasi Newton method, from the command window:
n = 10; % Use n variables
options = optimoptions(@fminunc,'Algorithm','quasi-newton'); % Use the quasi-Newton method
x0 = rand(n,1); % Random starting guess
[x,fval,exitflag] = fminunc(@objfun, x0, options); % Solve!
fprintf('Final objval=%.2e, exitflag=%d\n', fval, exitflag);
On my machine I see that the algorithm converges:
Local minimum found.
Optimization completed because the size of the gradient is less than
the default value of the optimality tolerance.
Final objval=5.57e-11, exitflag=1
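The question also asks about gradient descent with a backtracking line search. A minimal sketch for the same objective is given below, assuming the gradient is coded by hand and x is a column vector. The starting point, the Armijo constant c, the shrink factor rho and the iteration cap are arbitrary choices made for illustration. Plain gradient descent is known to be slow on this kind of function, so the loop may well stop at the iteration cap rather than at the gradient tolerance:

```matlab
% Objective and its hand-coded gradient (x must be a column vector)
f    = @(x) sum(100*(x(2:end) - x(1:end-1).^2).^2 + (1 - x(1:end-1)).^2);
grad = @(x) [-400*x(1:end-1).*(x(2:end) - x(1:end-1).^2) - 2*(1 - x(1:end-1)); 0] ...
          + [0; 200*(x(2:end) - x(1:end-1).^2)];

n   = 5;
x   = zeros(n, 1);        % starting point (arbitrary choice)
rho = 0.5;                % step shrink factor (assumed value)
c   = 1e-4;               % Armijo sufficient-decrease constant (assumed value)

for iter = 1:5000
    g = grad(x);
    if norm(g) < 1e-6, break; end
    t  = 1;               % always try a full step first
    fx = f(x);
    % Backtracking: shrink t until the Armijo condition holds
    while f(x - t*g) > fx - c*t*(g'*g) && t > 1e-16
        t = rho*t;
    end
    x = x - t*g;          % steepest-descent update
end
fprintf('iterations=%d, f(x)=%.3e\n', iter, f(x));
```

The same backtracking loop can be reused for a quasi-Newton method by replacing the direction -g with -H*g, where H is a BFGS approximation of the inverse Hessian.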

apply function to each column of a matrix (Vectorizing)

What is fastest way of applying function on each column of a matrix without looping through it?
The function I am using is pwelch but the concept should be the same for any function.
Currently I am looping through my matrix as such.
X = ones(5);
for i = 1:5 % length of the number of columns
result = somefunction(X(:,i))
end
Is there a way to vectorize this code?
You say
the concept should be the same for any function
Actually that's not the case. Depending on the function, the code that calls it can be made vectorized or not. It depends on how the function is written internally. From outside the function there's nothing you can do to make it vectorized. Vectorization is done within the function, not from the outside.
If the function is vectorized, you simply call it with a matrix, and the function works on each column. For example, that's what sum does.
In the case of pwelch, you are lucky: according to the documentation (emphasis added),
Pxx = pwelch(X) returns the Power Spectral Density (PSD) estimate, Pxx, ...
When X is a matrix, the PSD is
computed independently for each column and stored in the corresponding
column of Pxx.
So pwelch is a vectorized function.
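As a small illustration of the same point with sum, which needs no toolbox, the sketch below compares a single call on the whole matrix with an explicit loop over the columns. The test matrix and the variable names are arbitrary:

```matlab
X = magic(4);                       % any matrix will do

% Vectorized: sum operates on each column in one call
colSums = sum(X);

% Equivalent explicit loop over the columns
loopSums = zeros(1, size(X, 2));
for k = 1:size(X, 2)
    loopSums(k) = sum(X(:, k));
end

disp(isequal(colSums, loopSums));   % both approaches give the same row vector
```

For a function that is not written to accept a matrix, the loop (or a wrapper such as arrayfun, which only hides the loop) remains necessary.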

parameter optimization of black-box function in MATLAB

I need an elegant, simple system to find out what is the highest value returned from a deterministic function given one, or more, parameters.
I know that there is a nice implementation of genetic algorithms in MATLAB, but actually, in my case this is an overkill. I need something simpler.
Any idea?
You cannot find a maximum with MATLAB directly, but you can minimize something. Multiplying your function by -1 transforms your "find the maximum" problem into a "find the minimum" problem, which can be solved with fminsearch:
f = @(x) 2*x - 3*x.^2; % a simple function to find the maximum from
minusf = @(x) -1*f(x); % minus f, find minimum from this function
x = linspace(-2,2,100);
plot(x, f(x));
xmax = fminsearch(minusf, -1);
hold on
plot(xmax,f(xmax),'ro') % plot the minimum of minusf (maximum of f)
The result looks like this:
A really simple idea is to use a grid search, possibly with mesh refinement. A better idea would be to use a more advanced derivative-free optimizer, such as the Nelder-Mead algorithm. This is available in fminsearch.
You could also try algorithms from the global optimization toolbox: for example patternsearch or the infamous simulannealbnd.
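The grid search with refinement mentioned above could look roughly like the following sketch for a single parameter. The test function is borrowed from the fminsearch example in the other answer, and the search interval, grid size and number of refinement levels are arbitrary assumptions. Note that refining around the best grid point is only reliable if the function has a single peak inside the interval:

```matlab
f  = @(x) 2*x - 3*x.^2;      % function to maximize (example only)
lo = -2;  hi = 2;            % initial search interval (assumed)

for level = 1:6
    xs = linspace(lo, hi, 21);        % coarse grid on the current interval
    [fbest, idx] = max(f(xs));        % best grid point so far
    xbest = xs(idx);
    step  = xs(2) - xs(1);
    lo = xbest - step;                % shrink the interval around the best point
    hi = xbest + step;
end
fprintf('xbest = %.6f, f(xbest) = %.6f\n', xbest, fbest);
```

For several parameters the same idea works with ndgrid, but the number of function evaluations grows quickly with the dimension.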

Matlab optimization: what types of objective functions are 'allowed' with fminsearch.m and Co.?

Examples for optimizations with functions like fmincon.m and fminsearchbnd.m usually minimize objective functions that are relatively simple. With simple I mean that the objective function only consists of some algebraic expression, e.g. the Rosenbrock formula.
In my problem, on the other hand, the objective function consists of several steps, including
computing an L2-norm misfit between an observed data point and a set of n training data points (n~5e4)
selecting those data points from the training data set that give the lowest misfit
then using the row indices of this selected subset to compute the final distance that I intend to minimize.
i.e. I perform operations that cannot be formulated as a single mathematical expression. Can I use such an objective function with tools like fminsearchbnd.m or fmincon.m at all? My results so far are not very promising...
There is an easy and obvious solution for that. You can use fminsearch() to find a minimum of a self-defined function. In my example, it is fitting a polynomial, which of course is easy, but the trick is that this could be anything. You can access the data if you make your objective function a nested function, so that it shares the variable scope of its parent function.
You can start from the following code and fill in everything you want to do part by part and maybe ask followup questions, if any come up.
function main
verbose = 1; % some output
% optimize something, maybe a distorted polynomial
x = sort(rand(20,1));
p_original = [1.5, 3, 2, 1];
y = polyval(p_original,x) + 0.5*(rand(size(x))-0.5);
% optimize polynomial of order order. This is an example of how to pass
% a parameter to the fit function.
order = 3;
% obvious solution is this, but we want to do something else
p_polyfit = polyfit(x,y,order)
% we want to do it a bit more complex
pfit = optimize_something(x, y, order, verbose)
% what is happening?
figure
plot(x,polyval(p_original,x),'k-')
hold on
plot(x,y,'ko')
plot(x,polyval(p_polyfit,x),'rs-')
plot(x,fit_function(x,pfit),'gx-')
legend('original','noisy','polyfit','optimization')
end
function pfit = optimize_something(x,y, order, verbose)
% for polynomial of order order we need order+1 coefficients
p0 = ones(1,order+1); % initial guess: all coefficients are 1
if verbose
fprintf('optimize_something calling fminsearch(@objFun)\n');
end
% hand over only p0 to our objective function
pfit = fminsearch(@objFun, p0);
% ------------------------- NESTED objFUN --------------------------------%
function e = objFun(p)
% This function accepts only p as parameter and returns a value e, which
% will be minimized by some metric (maybe least squares).
% Since this function is nested, it can use also the predefined variables x, y (and also p0 and verbose).
% The magic is, we calculate a value yfitted out of x and p by a
% fit_function. This function can really be anything!
yfitted = fit_function(x, p);
e = sum((yfitted-y).^2);
% e = sum(abs(yfitted-y)); % another possibility
end
% ------------------------- NESTED objFUN --------------------------------%
if verbose
disp('pfit found')
end
end
function yfitted = fit_function(x, p)
% In our example we want to fit a polynomial, so we do so. We evaluate the
% polynomial p at x.
yfitted = polyval(p,x);
% But it could be anything, really.. each value in p could be something
% else, maybe the sum of an exponential function and a straight line
% yfitted = p(1)*exp(p(2)*x) + p(3)*x + p(4);
end
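To illustrate that the objective really can contain procedural steps such as sorting and selecting a subset, here is a compact sketch. It is not the asker's actual problem: the data, the outlier, the helper firstk and the choice of k are all invented. The objective fits a line while keeping only the k smallest squared residuals, so it cannot be written as one smooth algebraic expression, yet fminsearch accepts it:

```matlab
x = (0:9)';
y = 2*x + 1;
y(4) = 50;                          % one gross outlier (made up)
k = 8;                              % number of smallest residuals to keep

firstk = @(v, m) v(1:m);            % helper: first m entries of a vector
% Multi-step objective: residuals -> sort -> select subset -> sum
obj = @(p) sum(firstk(sort((polyval(p, x) - y).^2), k));

p0 = [1 0];                         % initial guess for [slope intercept]
p  = fminsearch(obj, p0);
fprintf('objective: start %.3g, end %.3g\n', obj(p0), obj(p));
```

Objectives like this are non-smooth and can have several local minima, so the result may depend on the starting point. That can also explain disappointing results with gradient-based solvers such as fmincon.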
You can try to use CVX. It is an addon for Matlab that lets you describe your optimisation problem with normal Matlab code.
Alternatively, write down your objective function including any constraints. Your description is not clear to me, and it would help you too, if you would write this down in actual formulae.
I read your steps as this:
"Computing an L2-norm between an observed data point and a set of n training data points." It seems that there is a total of one (1) observed data points. Let's call the observed point x. Let's call the training data points y_i for i=1..n.
The L2-Norm is: |x-y_i|.
"Selecting those data points [multiple?] that give the lowest misfit". You haven't said how many data points you want, and how you'd combine multiple points to give a single L2-Norm. Let's assume you want exactly one such point (the closest to the observed data point x). Thus you get: argmin (over i) |x-y_i|. If you have multiple, you could greedily take the k closest points.
"Then using the row indices of this selected subset to compute the final distance that I intend to minimize." And what is the final distance that you intend to minimize?

Find approximation of sine using least squares

I am doing a project where I find an approximation of the sine function using the least squares method. I can use 12 values of my own choice. Since I couldn't figure out how to solve it, I thought of using the Taylor series for sine and then solving it as a polynomial of order 5. Here is my code:
%% Find the sine of the 12 known values
x=[0,pi/8,pi/4,7*pi/2,3*pi/4,pi,4*pi/11,3*pi/2,2*pi,5*pi/4,3*pi/8,12*pi/20];
y=zeros(12,1);
for i=1:12
y=sin(x);
end
n=12;
j=5;
%% Find the sums to populate the matrix A and matrix B
s1=sum(x);s2=sum(x.^2);
s3=sum(x.^3);s4=sum(x.^4);
s5=sum(x.^5);s6=sum(x.^6);
s7=sum(x.^7);s8=sum(x.^8);
s9=sum(x.^9);s10=sum(x.^10);
sy=sum(y);
sxy=sum(x.*y);
sxy2=sum( (x.^2).*y);
sxy3=sum( (x.^3).*y);
sxy4=sum( (x.^4).*y);
sxy5=sum( (x.^5).*y);
A=[n,s1,s2,s3,s4,s5;s1,s2,s3,s4,s5,s6;s2,s3,s4,s5,s6,s7;
s3,s4,s5,s6,s7,s8;s4,s5,s6,s7,s8,s9;s5,s6,s7,s8,s9,s10];
B=[sy;sxy;sxy2;sxy3;sxy4;sxy5];
Then in MATLAB I get this result:
>> a=A^-1*B
a =
-0.0248
1.2203
-0.2351
-0.1408
0.0364
-0.0021
However, when I try to substitute the values of a into the Taylor series and evaluate it at, e.g., t=pi/2, I get wrong results:
>> t=pi/2;
fun=t-t^3*a(4)+a(6)*t^5
fun =
2.0967
Am I doing something wrong when I substitute the values of a into the Taylor series, or is my initial idea flawed?
Note: I can't use any built-in functions.
If you need a least-squares approximation, simply decide on a fixed interval that you want to approximate on and generate some x abscissae on that interval (possibly equally spaced abscissae using linspace - or non-uniformly spaced as you have in your example). Then evaluate your sine function at each point such that you have
y = sin(x)
Then simply use the polyfit function to obtain the least squares parameters
b = polyfit(x,y,n)
where n is the degree of the polynomial you want to approximate. You can then use polyval to obtain the values of your approximation at other values of x.
EDIT: As you can't use polyfit you can generate the Vandermonde matrix for the least-squares approximation directly (the below assumes x is a row vector).
A = ones(length(x),1);
x = x';
for i=1:n
A = [A x.^i];
end
then simply obtain the least squares parameters using
b = A\y(:); % y(:) forces a column vector to match the rows of A
The Vandermonde generation loop above is clumsy and can clearly be optimised; I have written it this way only to illustrate the concept. For better numerical stability you would also do better to use an orthogonal polynomial system such as Chebyshev polynomials of the first kind. If you are not even allowed to use the matrix divide \ operator, then you will need to code up your own implementation of a QR factorisation and solve the system that way (or use some other numerically stable method).
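Putting the pieces together, a self-contained sketch might look like this. The interval [0, pi], the 12 equally spaced abscissae and the variable names t and approx are assumptions made for illustration. Note that b holds the coefficients in ascending order (constant term first), so the fitted polynomial must be evaluated with all of its coefficients in that order, not with a subset of them inserted into a Taylor formula:

```matlab
x = linspace(0, pi, 12)';     % 12 abscissae on an assumed interval [0, pi]
y = sin(x);
n = 5;                        % polynomial degree

% Vandermonde matrix with columns x.^0, x.^1, ..., x.^n
A = ones(numel(x), n + 1);
for k = 1:n
    A(:, k + 1) = x.^k;
end

b = A \ y;                    % least-squares coefficients, ascending powers

% Evaluate the fitted polynomial at a new point using ALL coefficients
t      = pi/2;
approx = (t.^(0:n)) * b;
fprintf('p(pi/2) = %.5f, sin(pi/2) = %.5f\n', approx, sin(t));
```

If polyval is allowed, the same value is obtained from polyval(flipud(b), t), since polyval expects the coefficients in descending order.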