Find approximation of sine using least squares - matlab

I am doing a project where i find an approximation of the Sine function, using the Least Squares method. Also i can use 12 values of my own choice.Since i couldn't figure out how to solve it i thought of using Taylor's series for Sine and then solving it as a polynomial of order 5. Here is my code :
%% Find the sine of the 12 known values
x=[0,pi/8,pi/4,7*pi/2,3*pi/4,pi,4*pi/11,3*pi/2,2*pi,5*pi/4,3*pi/8,12*pi/20];
y=zeros(12,1);
for i=1:12
y=sin(x);
end
n=12;
j=5;
%% Find the sums to populate the matrix A and matrix B
s1=sum(x);s2=sum(x.^2);
s3=sum(x.^3);s4=sum(x.^4);
s5=sum(x.^5);s6=sum(x.^6);
s7=sum(x.^7);s8=sum(x.^8);
s9=sum(x.^9);s10=sum(x.^10);
sy=sum(y);
sxy=sum(x.*y);
sxy2=sum( (x.^2).*y);
sxy3=sum( (x.^3).*y);
sxy4=sum( (x.^4).*y);
sxy5=sum( (x.^5).*y);
A=[n,s1,s2,s3,s4,s5;s1,s2,s3,s4,s5,s6;s2,s3,s4,s5,s6,s7;
s3,s4,s5,s6,s7,s8;s4,s5,s6,s7,s8,s9;s5,s6,s7,s8,s9,s10];
B=[sy;sxy;sxy2;sxy3;sxy4;sxy5];
Then at matlab i get this result
>> a=A^-1*B
a =
-0.0248
1.2203
-0.2351
-0.1408
0.0364
-0.0021
However when i try to replace the values of a in the taylor series and solve f.e t=pi/2 i get wrong results
>> t=pi/2;
fun=t-t^3*a(4)+a(6)*t^5
fun =
2.0967
I am doing something wrong when i replace the values of a matrix in the Taylor series or is my initial thought flawed ?
Note: i can't use any built-in function

If you need a least-squares approximation, simply decide on a fixed interval that you want to approximate on and generate some x abscissae on that interval (possibly equally spaced abscissae using linspace - or non-uniformly spaced as you have in your example). Then evaluate your sine function at each point such that you have
y = sin(x)
Then simply use the polyfit function (documented here) to obtain least squares parameters
b = polyfit(x,y,n)
where n is the degree of the polynomial you want to approximate. You can then use polyval (documented here) to obtain the values of your approximation at other values of x.
EDIT: As you can't use polyfit you can generate the Vandermonde matrix for the least-squares approximation directly (the below assumes x is a row vector).
A = ones(length(x),1);
x = x';
for i=1:n
A = [A x.^i];
end
then simply obtain the least squares parameters using
b = A\y;
You can clearly optimise the clumsy Vandermonde generation loop above I have just written to illustrate the concept. For better numerical stability you would also be better to use a nice orthogonal polynomial system like Chebyshev polynomials of the first kind. If you are not even allowed to use the matrix divide \ function then you will need to code up your own implementation of a QR factorisation and solve the system that way (or some other numerically stable method).

Related

MATLAB: polyval function for N greater than 1

I am trying trying to graph the polynomial fit of a 2D dataset in Matlab.
This is what I tried:
rawTable = readtable('Test_data.xlsx','Sheet','Sheet1');
x = rawTable.A;
y = rawTable.B;
figure(1)
scatter(x,y)
c = polyfit(x,y,2);
y_fitted = polyval(c,x);
hold on
plot(x,y_fitted,'r','LineWidth',2)
rawTable.A and rawTable.A are randomly generated numbers. (i.e. the x dataset cannot be represented in the following form : x=0:0.1:100)
The result:
second-order polynomial
But the result I expect looks like this (generated in Excel):
enter image description here
How can I graph the second-order polynomial fit in MATLAB?
I sense some confusion regarding what the output of each of those Matlab function mean. So I'll clarify. And I think we need some details as well. So expect some verbosity. A quick answer, however, is available at the end.
c = polyfit(x,y,2) gives the coefficient vectors of the polynomial fit. You can get the fit information such as error estimate following the documentation.
Name this polynomial as P. P in Matlab is actually the function P=#(x)c(1)*x.^2+c(2)*x+c(3).
Suppose you have a single point X, then polyval(c,X) outputs the value of P(X). And if x is a vector, polyval(c,x) is a vector corresponding to [P(x(1)), P(x(2)),...].
Now that does not represent what the fit is. Just as a quick hack to see something visually, you can try plot(sort(x),polyval(c,sort(x)),'r','LineWidth',2), ie. you can first sort your data and try plotting on those x-values.
However, it is only a hack because a) your data set may be so irregularly spaced that the spline doesn't represent function or b) evaluating on the whole of your data set is unnecessary and inefficient.
The robust and 'standard' way to plot a 2D function of known analytical form in Matlab is as follows:
Define some evenly-spaced x-values over the interval you want to plot the function. For example, x=1:0.1:10. For example, x=linspace(0,1,100).
Evaluate the function on these x-values
Put the above two components into plot(). plot() can either plot the function as sampled points, or connect the points with automatic spline, which is the default.
(For step 1, quadrature is ambiguous but specific enough of a term to describe this process if you wish to communicate with a single word.)
So, instead of using the x in your original data set, you should do something like:
t=linspace(min(x),max(x),100);
plot(t,polyval(c,t),'r','LineWidth',2)

Computing the DFT of an arbitrary signal

As part of a course in signal processing at university, we have been asked to write an algorithm in Matlab to calculate the single sided spectrum of our signal using DFT, without using the fft() function built in to matlab. this isn't an assessed part of the course, I'm just interested in getting this "right" for myself. I am currently using the 2018b version of Matlab, should anyone find this useful.
I have built a signal of a 1 KHz and 2KHz sinusoid, phase shifted by 135 degrees (2*pi/3 rad).
then using the equations in 9.1 of Discrete time signal processing (Allan V. Oppenheim) and Euler's formula to simplify the exponent, I produce this code:
%%DFT(currently buggy)
n=0;m=0;
for m=1:DFT_N-1 %DFT_Fmin;DFT_Fmax; %scrolls through DFT m values (K in text.)
for n=1:DFT_N-1;%;(DFT_N-1);%<<redundant code? from Oppenheim eqn. 9.1 % eulers identity, K=m and n=n
X(m)=x(n)*(cos((2*pi*n*m)/DFT_N)-j*sin((2*pi*n*m)/DFT_N));
n=n+1;
end
%m=m+1; %redundant code?
end
This takes x as the input, in this case the signal mentioned earlier, as well as the resolution of the transform, as represented by the DFT_N, which has been initialized to 100. The output of this function, X, should be something in the frequency domain, but plotting X yields a circular plot slightly larger than the unit circle, and with a gap on the left hand edge.
I am struggling to see how I am supposed to convert this to the stem() plots as given by the in-built DFT algorithm.
Many thanks, J.
This is your bug:
replace X(m)=x(n)*(cos.. with X(m)=X(m)+x(n)*(cos..
For a given m, it does not integrate over the variable n, but overwrites X(m) only the last calculation for n = DFT_N-1.
Notice that integrating over n=1:DFT_N-1 omits one harmonic, i.e., the first one, exp(-j*2*pi). Replace
n=1:DFT_N-1 with n=1:DFT_N to include that. I would also replace m=1:DFT_N-1 with m=1:DFT_N for plotting concerns.
Also replace any 2*pi*n*m with 2*pi*(n-1)*(m-1) to get the phase right, since the first entry of X should correspond to zero-frequency, yielding sum_n x(n) * (cos(0) + j sin(0)) = sum_n x(n). If your signal x is real-valued then the zero-frequency component X(1) should be real-valued, angle(X(1))=0.
Last remark, don't forget to shift zero-frequency component to the center of the spectrum for better visibility, X = circshift(X,floor(size(X)/2));
If you are interested in the single-sided spectrum only, than you can just calculate X(m) for m=1:DFT_N/2 since X it is conjugate symmetric around m=DFT_N/2, i.e., X(DFT_N/2+m) = X(DFT_N/2-m)', due to exp(-j*(pi*n+2*pi/DFT_N*m)) = exp(-j*(pi*n-2*pi/DFT_N*m))'.
As a side note, for a given m this program calculates an inner product between the array x and another array of complex exponentials, i.e., exp(-j*2*pi/DFT_N*m*n), for n = 0,1,...,N-1. MATLAB syntax is very convenient for such calculations, and you can avoid this inner loop by the following command
exp(-j*2*pi/DFT_N*m*(0:DFT_N-1)) * x
where x is a column vector. Similarly, you can avoid the first loop too by expanding your complex exponential vector row-wise for every m, i.e., build the matrix exp(-j*2*pi/DFT_N*(0:DFT_N-1)'*(0:DFT_N-1)). Then your DFT is simply
X = exp(-j*2*pi/DFT_N*(0:DFT_N-1)'*(0:DFT_N-1)) * x
For single-sided spectrum, instead use
X = exp(-j*2*pi/DFT_N*(0:floor((DFT_N-1)/2))'*(0:DFT_N-1)) * x

How to solve equations with complex coefficients using ode45 in MATLAB?

I am trying to solve two equations with complex coefficients using ode45.
But iam getting an error message as "Inputs must be floats, namely single or
double."
X = sym(['[',sprintf('X(%d) ',1:2),']']);
Eqns=[-(X(1)*23788605396486326904946699391889*1i)/38685626227668133590597632 + (X(2)*23788605396486326904946699391889*1i)/38685626227668133590597632; (X(2)*23788605396486326904946699391889*1i)/38685626227668133590597632 + X(1)*(- 2500000 + (5223289665997855453060886952725538686654593059791*1i)/324518553658426726783156020576256)] ;
f=#(t,X)[Eqns];
[t,Xabc]=ode45(f,[0 300*10^-6],[0 1])
How can i fix this ? Can somebody can help me ?
Per the MathWorks Support Team, the "ODE solvers in MATLAB 5 (R12) and later releases properly handle complex valued systems." So the complex numbers are the not the issue.
The error "Inputs must be floats, namely single or double." stems from your definition of f using Symbolic Variables that are, unlike complex numbers, not floats. The easiest way to get around this is to not use the Symbolic Toolbox at all; just makes Eqns an anonymous function:
Eqns= #(t,X) [-(X(1)*23788605396486326904946699391889*1i)/38685626227668133590597632 + (X(2)*23788605396486326904946699391889*1i)/38685626227668133590597632; (X(2)*23788605396486326904946699391889*1i)/38685626227668133590597632 + X(1)*(- 2500000 + (5223289665997855453060886952725538686654593059791*1i)/324518553658426726783156020576256)] ;
[t,Xabc]=ode45(Eqns,[0 300*10^-6],[0 1]);
That being said, I'd like to point out that numerically time integrating this system over 300 microseconds (I assume without units given) will take a long time since your coefficient matrix has imaginary eigenvalues on the order of 10E+10. The extremely short wavelength of those oscillations will more than likely be resolved by Matlab's adaptive methods, and that will take a while to solve for a time span just a few orders greater than the wavelength.
I'd, therefore, suggest an analytical approach to this problem; unless it is a stepping stone another problem that is non-analytically solvable.
Systems of ordinary differential equations of the form
,
which is a linear, homogenous system with a constant coefficient matrix, has the general solution
,
where the m-subscripted exponential function is the matrix exponential.
Therefore, the analytical solution to the system can be calculated exactly assuming the matrix exponential can be calculated.
In Matlab, the matrix exponential is calculate via the expm function.
The following code computes the analytical solution and compares it to the numerical one for a short time span:
% Set-up
A = [-23788605396486326904946699391889i/38685626227668133590597632,23788605396486326904946699391889i/38685626227668133590597632;...
-2500000+5223289665997855453060886952725538686654593059791i/324518553658426726783156020576256,23788605396486326904946699391889i/38685626227668133590597632];
Eqns = #(t,X) A*X;
X0 = [0;1];
% Numerical
options = odeset('RelTol',1E-8,'AbsTol',1E-8);
[t,Xabc]=ode45(Eqns,[0 1E-9],X0,options);
% Analytical
Xana = cell2mat(arrayfun(#(tk) expm(A*tk)*X0,t,'UniformOutput',false)')';
k = 1;
% Plots
figure(1);
subplot(3,1,1)
plot(t,abs(Xana(:,k)),t,abs(Xabc(:,k)),'--');
title('Magnitude');
subplot(3,1,2)
plot(t,real(Xana(:,k)),t,real(Xabc(:,k)),'--');
title('Real');
ylabel('Values');
subplot(3,1,3)
plot(t,imag(Xana(:,k)),t,imag(Xabc(:,k)),'--');
title('Imaginary');
xlabel('Time');
The comparison plot is:
The output of ode45 matches the magnitude and real parts of the solution very well, but the imaginary portion is out-of-phase by exactly π.
However, since ode45's error estimator only looks at norms, the phase difference is not noticed which may lead to problems depending on the application.
It will be noted that while the matrix exponential solution is far more costly than ode45 for the same number of time vector elements, the analytical solution will produce the exact solution for any time vector of any density given to it. So for long time solutions, the matrix exponential can be viewed as an improvement in some sense.

Generate random samples from arbitrary discrete probability density function in Matlab

I've got an arbitrary probability density function discretized as a matrix in Matlab, that means that for every pair x,y the probability is stored in the matrix:
A(x,y) = probability
This is a 100x100 matrix, and I would like to be able to generate random samples of two dimensions (x,y) out of this matrix and also, if possible, to be able to calculate the mean and other moments of the PDF. I want to do this because after resampling, I want to fit the samples to an approximated Gaussian Mixture Model.
I've been looking everywhere but I haven't found anything as specific as this. I hope you may be able to help me.
Thank you.
If you really have a discrete probably density function defined by A (as opposed to a continuous probability density function that is merely described by A), you can "cheat" by turning your 2D problem into a 1D problem.
%define the possible values for the (x,y) pair
row_vals = [1:size(A,1)]'*ones(1,size(A,2)); %all x values
col_vals = ones(size(A,1),1)*[1:size(A,2)]; %all y values
%convert your 2D problem into a 1D problem
A = A(:);
row_vals = row_vals(:);
col_vals = col_vals(:);
%calculate your fake 1D CDF, assumes sum(A(:))==1
CDF = cumsum(A); %remember, first term out of of cumsum is not zero
%because of the operation we're doing below (interp1 followed by ceil)
%we need the CDF to start at zero
CDF = [0; CDF(:)];
%generate random values
N_vals = 1000; %give me 1000 values
rand_vals = rand(N_vals,1); %spans zero to one
%look into CDF to see which index the rand val corresponds to
out_val = interp1(CDF,[0:1/(length(CDF)-1):1],rand_vals); %spans zero to one
ind = ceil(out_val*length(A));
%using the inds, you can lookup each pair of values
xy_values = [row_vals(ind) col_vals(ind)];
I hope that this helps!
Chip
I don't believe matlab has built-in functionality for generating multivariate random variables with arbitrary distribution. As a matter of fact, the same is true for univariate random numbers. But while the latter can be easily generated based on the cumulative distribution function, the CDF does not exist for multivariate distributions, so generating such numbers is much more messy (the main problem is the fact that 2 or more variables have correlation). So this part of your question is far beyond the scope of this site.
Since half an answer is better than no answer, here's how you can compute the mean and higher moments numerically using matlab:
%generate some dummy input
xv=linspace(-50,50,101);
yv=linspace(-30,30,100);
[x y]=meshgrid(xv,yv);
%define a discretized two-hump Gaussian distribution
A=floor(15*exp(-((x-10).^2+y.^2)/100)+15*exp(-((x+25).^2+y.^2)/100));
A=A/sum(A(:)); %normalized to sum to 1
%plot it if you like
%figure;
%surf(x,y,A)
%actual half-answer starts here
%get normalized pdf
weight=trapz(xv,trapz(yv,A));
A=A/weight; %A normalized to 1 according to trapz^2
%mean
mean_x=trapz(xv,trapz(yv,A.*x));
mean_y=trapz(xv,trapz(yv,A.*y));
So, the point is that you can perform a double integral on a rectangular mesh using two consecutive calls to trapz. This allows you to compute the integral of any quantity that has the same shape as your mesh, but a drawback is that vector components have to be computed independently. If you only wish to compute things which can be parametrized with x and y (which are naturally the same size as you mesh), then you can get along without having to do any additional thinking.
You could also define a function for the integration:
function res=trapz2(xv,yv,A,arg)
if ~isscalar(arg) && any(size(arg)~=size(A))
error('Size of A and var must be the same!')
end
res=trapz(xv,trapz(yv,A.*arg));
end
This way you can compute stuff like
weight=trapz2(xv,yv,A,1);
mean_x=trapz2(xv,yv,A,x);
NOTE: the reason I used a 101x100 mesh in the example is that the double call to trapz should be performed in the proper order. If you interchange xv and yv in the calls, you get the wrong answer due to inconsistency with the definition of A, but this will not be evident if A is square. I suggest avoiding symmetric quantities during the development stage.

Exponential curve fit matlab

I have the following equation:
I want to do a exponential curve fitting using MATLAB for the above equation, where y = f(u,a). y is my output while (u,a) are my inputs. I want to find the coefficients A,B for a set of provided data.
I know how to do this for simple polynomials by defining states. As an example, if states= (ones(size(u)), u u.^2), this will give me L+Mu+Nu^2, with L, M and N being regression coefficients.
However, this is not the case for the above equation. How could I do this in MATLAB?
Building on what #eigenchris said, simply take the natural logarithm (log in MATLAB) of both sides of the equation. If we do this, we would in fact be linearizing the equation in log space. In other words, given your original equation:
We get:
However, this isn't exactly polynomial regression. This is more of a least squares fitting of your points. Specifically, what you would do is given a set of y and set pair of (u,a) points, you would build a system of equations and solve for this system via least squares. In other words, given the set y = (y_0, y_1, y_2,...y_N), and (u,a) = ((u_0, a_0), (u_1, a_1), ..., (u_N, a_N)), where N is the number of points that you have, you would build your system of equations like so:
This can be written in matrix form:
To solve for A and B, you simply need to find the least-squares solution. You can see that it's in the form of:
Y = AX
To solve for X, we use what is called the pseudoinverse. As such:
X = A^{*} * Y
A^{*} is the pseudoinverse. This can eloquently be done in MATLAB using the \ or mldivide operator. All you have to do is build a vector of y values with the log taken, as well as building the matrix of u and a values. Therefore, if your points (u,a) are stored in U and A respectively, as well as the values of y stored in Y, you would simply do this:
x = [u.^2 a.^3] \ log(y);
x(1) will contain the coefficient for A, while x(2) will contain the coefficient for B. As A. Donda has noted in his answer (which I embarrassingly forgot about), the values of A and B are obtained assuming that the errors with respect to the exact curve you are trying to fit to are normally (Gaussian) distributed with a constant variance. The errors also need to be additive. If this is not the case, then your parameters achieved may not represent the best fit possible.
See this Wikipedia page for more details on what assumptions least-squares fitting takes:
http://en.wikipedia.org/wiki/Least_squares#Least_squares.2C_regression_analysis_and_statistics
One approach is to use a linear regression of log(y) with respect to u² and a³:
Assuming that u, a, and y are column vectors of the same length:
AB = [u .^ 2, a .^ 3] \ log(y)
After this, AB(1) is the fit value for A and AB(2) is the fit value for B. The computation uses Matlab's mldivide operator; an alternative would be to use the pseudo-inverse.
The fit values found this way are Maximum Likelihood estimates of the parameters under the assumption that deviations from the exact equation are constant-variance normally distributed errors additive to A u² + B a³. If the actual source of deviations differs from this, these estimates may not be optimal.