Why does treating the index as a continuous variable not work when performing an inverse discrete Fourier transform? - matlab

I have a set of points describing a closed curve in the complex plane, call it Z = [z_1, ..., z_N]. I'd like to interpolate this curve, and since it's periodic, trigonometric interpolation seemed a natural choice (especially because of its increased accuracy). By performing the FFT, we obtain the Fourier coefficients:
F = fft(Z);
At this point, we could get Z back by the formula (where 1i is the imaginary unit, and we use (k-1)*(n-1) because MATLAB indexing starts at 1)
N
Z(n) = (1/N) sum F(k)*exp( 1i*2*pi*(k-1)*(n-1)/N), 1 <= n <= N.
k=1
My question
Is there any reason why n must be an integer? Presumably, if we treat n as any real number between 1 and N, we will just get more points on the interpolated curve. Is this true? For example, if we wanted to double the number of points, could we not set
N
Z_new(n) = (1/N) sum F(k)*exp( 1i*2*pi*(k-1)*(n-1)/N), with n = 1, 1.5, 2, 2.5, ..., N-1, N-0.5, N
k=1
?
The new points are of course just subject to some interpolation error, but they'll be fairly accurate, right? The reason I'm asking this question is because this method is not working for me. When I try to do this, I get a garbled mess of points that makes no sense.
(By the way, I know that I could use the interpft() command, but I'd like to add points only in certain areas of the curve, for example between z_a and z_b)

The point is when n is integer, you have some primary functions which are orthogonal and can be as a basis for the space. When, n is not integer, The exponential functions in the formula, are not orthogonal. Hence, the expression of a function based on these non-orthogonal basis is not meaningful as much as you expected.
For orthogonality case you can see the following as an example (from here). As you can check, you can find two n_1 and n_2 which are not integer, the following integrals are not zero any more, and they are not orthogonal.

Related

Computing the DFT of an arbitrary signal

As part of a course in signal processing at university, we have been asked to write an algorithm in Matlab to calculate the single sided spectrum of our signal using DFT, without using the fft() function built in to matlab. this isn't an assessed part of the course, I'm just interested in getting this "right" for myself. I am currently using the 2018b version of Matlab, should anyone find this useful.
I have built a signal of a 1 KHz and 2KHz sinusoid, phase shifted by 135 degrees (2*pi/3 rad).
then using the equations in 9.1 of Discrete time signal processing (Allan V. Oppenheim) and Euler's formula to simplify the exponent, I produce this code:
%%DFT(currently buggy)
n=0;m=0;
for m=1:DFT_N-1 %DFT_Fmin;DFT_Fmax; %scrolls through DFT m values (K in text.)
for n=1:DFT_N-1;%;(DFT_N-1);%<<redundant code? from Oppenheim eqn. 9.1 % eulers identity, K=m and n=n
X(m)=x(n)*(cos((2*pi*n*m)/DFT_N)-j*sin((2*pi*n*m)/DFT_N));
n=n+1;
end
%m=m+1; %redundant code?
end
This takes x as the input, in this case the signal mentioned earlier, as well as the resolution of the transform, as represented by the DFT_N, which has been initialized to 100. The output of this function, X, should be something in the frequency domain, but plotting X yields a circular plot slightly larger than the unit circle, and with a gap on the left hand edge.
I am struggling to see how I am supposed to convert this to the stem() plots as given by the in-built DFT algorithm.
Many thanks, J.
This is your bug:
replace X(m)=x(n)*(cos.. with X(m)=X(m)+x(n)*(cos..
For a given m, it does not integrate over the variable n, but overwrites X(m) only the last calculation for n = DFT_N-1.
Notice that integrating over n=1:DFT_N-1 omits one harmonic, i.e., the first one, exp(-j*2*pi). Replace
n=1:DFT_N-1 with n=1:DFT_N to include that. I would also replace m=1:DFT_N-1 with m=1:DFT_N for plotting concerns.
Also replace any 2*pi*n*m with 2*pi*(n-1)*(m-1) to get the phase right, since the first entry of X should correspond to zero-frequency, yielding sum_n x(n) * (cos(0) + j sin(0)) = sum_n x(n). If your signal x is real-valued then the zero-frequency component X(1) should be real-valued, angle(X(1))=0.
Last remark, don't forget to shift zero-frequency component to the center of the spectrum for better visibility, X = circshift(X,floor(size(X)/2));
If you are interested in the single-sided spectrum only, than you can just calculate X(m) for m=1:DFT_N/2 since X it is conjugate symmetric around m=DFT_N/2, i.e., X(DFT_N/2+m) = X(DFT_N/2-m)', due to exp(-j*(pi*n+2*pi/DFT_N*m)) = exp(-j*(pi*n-2*pi/DFT_N*m))'.
As a side note, for a given m this program calculates an inner product between the array x and another array of complex exponentials, i.e., exp(-j*2*pi/DFT_N*m*n), for n = 0,1,...,N-1. MATLAB syntax is very convenient for such calculations, and you can avoid this inner loop by the following command
exp(-j*2*pi/DFT_N*m*(0:DFT_N-1)) * x
where x is a column vector. Similarly, you can avoid the first loop too by expanding your complex exponential vector row-wise for every m, i.e., build the matrix exp(-j*2*pi/DFT_N*(0:DFT_N-1)'*(0:DFT_N-1)). Then your DFT is simply
X = exp(-j*2*pi/DFT_N*(0:DFT_N-1)'*(0:DFT_N-1)) * x
For single-sided spectrum, instead use
X = exp(-j*2*pi/DFT_N*(0:floor((DFT_N-1)/2))'*(0:DFT_N-1)) * x

Matlab: Euclidean norm (or difference) between two vectors

I'd like to calculate the Euclidean distance between a vector G and each row of an array C, while dividing each row by a value in a vector GSD. What I've done seems very inefficient. What's my biggest overhead?
Could I speed it up?
m=1E7;
G=1E5*rand(1,8);
C=1E5*[zeros(m,1),rand(m,8)];
GSD=10*rand(1,8);
%I've taken the log10 of the values because G and C are very large in magnitude.
%Don't know if it's worth it.
for i=1:m
dG(i,1)=norm((log10(G)-log10(C(i,2:end)))/log10(GSD));
end
Using the examples from below, they don't all give the same answer. In fact none of them give the same answer (see following figure using:
dG = pdist2(log10(G),log10(C(:,2:end)),'mahalanobis',diag(log10(GSD))); %(1)
dG = sqrt(sum((log10(G)-log10(C(:,2:end))./log10(GSD)).^2,2));
tmp=bsxfun(#rdivide,bsxfun(#minus,log10(G),log10(C(:,2:end))),log10(GSD)); %(4)
dG = sqrt(sum(tmp.^2,2));
You can use pdist2(x,y) to calculate the pairwise distance between all elements in x and y, thus your example would be something like
dG = pdist2(log10(G),log10(C(:,2:end)),'mahalanobis',diag(log10(GSD)).^2);
where the name-pair 'mahalanobis',diag(log10(GSD)).^2 puts log10(GSD) as weights on the Eucledean, which is the known as the Mahalanobis distance.
Note that the Mahalanobis distance is originally intented for normalising data, thus it is the "covariance" which have to be put as the fourth input, which MATLAB then finds the Cholesky decomposition of (element-wise squareroot when diagonal, as here).
Implicit expansion
In newer MATLAB editions, one can also just just the implcit expansion as the first entry is only 1 vector.
dG = sqrt(sum(((log10(G)-log10(C(:,2:9)))./log10(GSD)).^2,2));
which is probably a tad faster, I do, however, prefer the pdist2 solution as I find it clearer.
The floating point should handle the large magnitude of the input data, up to a certain point with float data and with any reasonable value with double data
realmax('single')
ans =
3.4028e+38
realmax('double')
ans =
1.7977e+308
With 1e7 values in the +/- 1e5 range, you may expect the square of the Euclidian distance to be in the +/- 1e17 range (5+5+7), which both formats will handle with ease.
In any case, you should vectorize the code to remove the loop (which Matlab has a history of handling very inefficiently, especially in older versions)
With new versions (2016b and later), simply use:
tmp=(log10(G)-log10(C(:,2:end)))./log10(GSD);
dG = sqrt(sum(tmp.^2,2)); %row-by-row norm
Note that you have to use ./ which is a element-wise division, not / which is matrix right division.
The following code will work everywhere
tmp=bsxfun(#rdivide,bsxfun(#minus,log10(G),log10(C(:,2:end))),log10(GSD));
dG = sqrt(sum(tmp.^2,2)); %row-by-row norm
I however believe that the use of log10 is a mathematical error. The result dG will not be the Euclidian norm. You should stick with the root mean square of the weighted difference:
dG = sqrt(sum(bsxfun(#rdivide,bsxfun(#minus,G,C(:,2:end)),GSD).^2,2)); % all versions
dG = sqrt(sum((G-C(:,2:end)./GSD).^2,2)); %R2016b and later

How can I make all-in-one polynomial from multi-polynomial?

I'm not familiar with expert math. so I don't know where to start from.
I have get a some article like this. I am just following this article description. But this is not easy to me.
But I'm not sure how to make just one polynomial equation(or something like that) from above 4 polynomial equations. Is this can be possible way?
If yes, Would you please help me how to get a polynomial(or something like equation)? If not, would you let me know the reason of why?
UPDATE
I'd like to try as following
clear all ;
clc
ab = (H' * H)\H' * y;
y2 = H*ab;
Finally I can get some numbers like this.
So, is this meaning?
As you can see the red curve line, something wrong.
What did I miss anythings?
All the article says is "you can combine multiple data sets into one to get a single polynomial".
You can also go in the other direction: subdivide your data set into pieces and get as many separate ones as you wish. (This is called n-fold validation.)
You start with a collection of n points (x, y). (Keep it simple by having only one independent variable x and one dependent variable y.)
Your first step should be to plot the data, look at it, and think about what kind of relationship between the two would explain it well.
Your next step is to assume some form for the relationship between the two. People like polynomials because they're easy to understand and work with, but other, more complex relationships are possible.
One polynomial might be:
y = c0 + c1*x + c2*x^2 + c3*x^3
This is your general relationship between the dependent variable y and the independent variable x.
You have n points (x, y). Your function can't go through every point. In the example I gave there are only four coefficients. How do you calculate the coefficients for n >> 4?
That's where the matricies come in. You have n equations:
y(1) = c0 + c1*x(1) + c2*x(1)^2 + c3*x(1)^3
....
y(n) = c0 + c1*x(n) + c2*x(n)^2 + c3*x(n)^3
You can write these as a matrix:
y = H * c
where the prime denotes "transpose".
Premultiply both sides by transpose(X):
transpose(X)* y = transpose(H)* H * c
Do a standard matrix inversion or LU decomposition to solve for the unknown vector of coefficients c. These particular coefficients minimize the sum of squares of differences between the function evaluated at each point x and your actual value y.
Update:
I don't know where this fixation with those polynomials comes from.
Your y vector? Wrong. Your H matrix? Wrong again.
If you must insist on using those polynomials, here's what I'd recommend: You have a range of x values in your plot. Let's say you have 100 x values, equally spaced between 0 and your max value. Those are the values to plug into your H matrix.
Use the polynomials to synthesize sets of y values, one for each polynomial.
Combine all of them into a single large problem and solve for a new set of coefficients. If you want a 3rd order polynomial, you'll only have four coefficients and one equation. It'll represent the least squares best approximation of all the synthesized data you created with your four polynomials.

Exponential curve fit matlab

I have the following equation:
I want to do a exponential curve fitting using MATLAB for the above equation, where y = f(u,a). y is my output while (u,a) are my inputs. I want to find the coefficients A,B for a set of provided data.
I know how to do this for simple polynomials by defining states. As an example, if states= (ones(size(u)), u u.^2), this will give me L+Mu+Nu^2, with L, M and N being regression coefficients.
However, this is not the case for the above equation. How could I do this in MATLAB?
Building on what #eigenchris said, simply take the natural logarithm (log in MATLAB) of both sides of the equation. If we do this, we would in fact be linearizing the equation in log space. In other words, given your original equation:
We get:
However, this isn't exactly polynomial regression. This is more of a least squares fitting of your points. Specifically, what you would do is given a set of y and set pair of (u,a) points, you would build a system of equations and solve for this system via least squares. In other words, given the set y = (y_0, y_1, y_2,...y_N), and (u,a) = ((u_0, a_0), (u_1, a_1), ..., (u_N, a_N)), where N is the number of points that you have, you would build your system of equations like so:
This can be written in matrix form:
To solve for A and B, you simply need to find the least-squares solution. You can see that it's in the form of:
Y = AX
To solve for X, we use what is called the pseudoinverse. As such:
X = A^{*} * Y
A^{*} is the pseudoinverse. This can eloquently be done in MATLAB using the \ or mldivide operator. All you have to do is build a vector of y values with the log taken, as well as building the matrix of u and a values. Therefore, if your points (u,a) are stored in U and A respectively, as well as the values of y stored in Y, you would simply do this:
x = [u.^2 a.^3] \ log(y);
x(1) will contain the coefficient for A, while x(2) will contain the coefficient for B. As A. Donda has noted in his answer (which I embarrassingly forgot about), the values of A and B are obtained assuming that the errors with respect to the exact curve you are trying to fit to are normally (Gaussian) distributed with a constant variance. The errors also need to be additive. If this is not the case, then your parameters achieved may not represent the best fit possible.
See this Wikipedia page for more details on what assumptions least-squares fitting takes:
http://en.wikipedia.org/wiki/Least_squares#Least_squares.2C_regression_analysis_and_statistics
One approach is to use a linear regression of log(y) with respect to u² and a³:
Assuming that u, a, and y are column vectors of the same length:
AB = [u .^ 2, a .^ 3] \ log(y)
After this, AB(1) is the fit value for A and AB(2) is the fit value for B. The computation uses Matlab's mldivide operator; an alternative would be to use the pseudo-inverse.
The fit values found this way are Maximum Likelihood estimates of the parameters under the assumption that deviations from the exact equation are constant-variance normally distributed errors additive to A u² + B a³. If the actual source of deviations differs from this, these estimates may not be optimal.

how to raise the fourier transformed probability density function to a fractional power?

By hypothesis my measured probability density functions (PDF) result from n convolutions of an elementary distribution (E).
I have two distributions the first (F) of which is supposed to have undergone more convolutions (m_1) than the second (G) (m_2 convolutions).
In fourier space:
F' = E'^m_1
G' = E'^m_2
As the two PDFs are constituted from the same elementary distribution, I should be able to be able to calculate the PDF of G from F
G' = F'^{m_1/m_2}
Taking the IFFT i should have a distribution that overlaps well with G.
A naive way of doing this would to be simply to calculate the FT of F and raise it to the power 1/integer and testing a range of integers.
My question are there any tricks for raising the Fourier transformed PDF to a fractional power. I have done so but the IFFT gives a distribution far from that which is expected. And strange aliasing errors.
I've included a padded vector as one might do if they were to do a convolution of two PDFS.
My normalization is based on the fact that the k=0 [ProbF(1,1)] wave vector gives the integral of the PDF which should be equal to one.
Of course, the hypothesis could be wrong, but it has all the reasons in the world to be valid.
My code
Inc = INC1 ; % BINS
null = zeros(1,length(Inc)); % PADDED PROB
Inc = [ Inc.*-1 (Inc) ]; % PADDED INC VECTOR
Prob = [ null heightProb1 ] ; % PADDED PROB VECTOR
ProbF = (fft(Prob)) ;
ProbFnorm = ProbF./ProbF(1,1) ; % NORMALIZED BY K=0 COMPONENT (integral of PDF =1)
m=.79 % POWER TO RAISE
ProbFtrans = ((ProbFnorm).^(m)); % 'DECONVOLUTION' IN FOURIER SPACE
ProbIF = (ifft((ProbFtrans)).*(ProbF(1,1))); % RETURN TO PROBABILITY SPACE
figure(2);
plot(Inc,ProbIF,'rs')
Thank you in advance for your help
Fourier coefficients are typically complex numbers (unless your function is symmetric).
You should be very careful when you raise complex numbers to fractional powers.
For example, consider
z=1.2 + i*0.65;
then raise z to power 4
>> z^4
ans =
-1.398293750000001e+00 + 3.174599999999999e+00i
and to power 8.
>> z^8
ans =
-8.122859748710933e+00 - 8.878046677500002e+00i
Then try obtain z^4 as (z^8)^(1/2)
>> (z^8)^(1/2)
ans =
1.398293750000001e+00 - 3.174600000000000e+00i
Surprise! You don't get z^4! (wrong sign)
If you avoid taking the fractional power and "rewind" z^8 by diving back by z you get back z^4 correctly:
>> (z^8)/z/z/z/z
ans =
-1.398293750000000e+00 + 3.174599999999998e+00i
The reason is in the definition of fractional powers in the complex plane. Fractional powers are multi-valued functions, which are made single-valued by introducing a branch-cut in the complex plane. The nth-root z^(1/n) has n possible values, and
matlab singles out one of these by interpreting the complex function z^(1/n) as its so-called principal branch. The main implication is that in the world of complex numbers ^1/n does not always invert ^n.
If this doesn't make any sense to you, you should probably review some basic complex analysis, but the bottom line is that fractional powers of complex numbers are tricky animals. Wherever possible you should try to work around fractional powers by using division (as show above).
I am not sure this will fix all of your problems, but from your description it looks like this is certainly one problem you have.