Strange values for approximating coefficients in wavelet decompsition in Matlab - matlab

I'm trying to get wavelet decomposition of arcsin(x) using, say, Haar wavelets
When using both Matlab's dwt or wavedec functions, I get strange values for approximating coefficients. Since applying low-pass Haar wavelets's filter equals to performing half-sum and the maximum of arcsin is pi/2, I assume that approximating coefficients can't surpass pi/2, yet this code:
x = linspace(0,1,128);
y = asin(x);
[cA, cD] = dwt(y, 'haar'); %//cA for approximating coefficients
returns values more than pi/2 in cA. Why is that?

I believe what makes you confused here is thinking that Haar's filter just averages two adjacent numbers when computing 1-level approximation coefficients. Due to the energy preservation feature of the scaling function, each pair of numbers gets divided by sqrt(2) instead of 2. In fact, you could see what a particular wavelet filter does by typing in the following command (for the Haar filter in this case):
[F1,F2] = wfilters('haar','d')
F1 =
0.7071 0.7071
F2 =
-0.7071 0.7071
You can then check the validity of what you have gotten above by constructing a simple loop:
CA_compare = zeros(1,64);
for k = 1 : 64
CA_compare(k) = dot( y(2*k-1 : 2*k), F1 );
end
You will then see that "CA_compare" contains exactly the same values as your "cA" does.
Hope this helps.

Related

How to calculate cosine similarity between two frequency vectors in MATLAB?

I need to find the cosine similarity between two frequency vectors in MATLAB.
Example vectors:
a = [2,3,4,4,6,1]
b = [1,3,2,4,6,3]
How do I measure the cosine similarity between these vectors in MATLAB?
Take a quick look at the mathematical definition of Cosine similarity.
From the definition, you just need the dot product of the vectors divided by the product of the Euclidean norms of those vectors.
% MATLAB 2018b
a = [2,3,4,4,6,1];
b = [1,3,2,4,6,3];
cosSim = sum(a.*b)/sqrt(sum(a.^2)*sum(b.^2)); % 0.9436
Alternatively, you could use
cosSim = (a(:).'*b(:))/sqrt(sum(a.^2)*sum(b.^2)); % 0.9436
which gives the same result.
After reading this correct answer, to avoid sending you to another castle I've added another approach using MATLAB's built-in linear algebra functions, dot() and norm().
cosSim = dot(a,b)/(norm(a)*norm(b)); % 0.9436
See also the tag-wiki for cosine-similarity.
Performance by Approach:
sum(a.*b)/sqrt(sum(a.^2)*sum(b.^2))
(a(:).'*b(:))/sqrt(sum(a.^2)*sum(b.^2))
dot(a,b)/(norm(a)*norm(b))
Each point represents the geometric mean of the computation times for 10 randomly generated vectors.
If you have the Statistics toolbox, you can use the pdist2 function with the 'cosine' input flag, which gives 1 minus the cosine similarity:
a = [2,3,4,4,6,1];
b = [1,3,2,4,6,3];
result = 1-pdist2(a, b, 'cosine');

Downsampling in MATLAB - why scaling filter coefficients?

Trying to undertsand how MATLAB does resampling, looking at toolbox/signal/resample.m.
This is the part where low-pass FIR filter coefficients are calculated:
fc = 1/2/pqmax;
L = 2*N*pqmax + 1;
h = firls( L-1, [0 2*fc 2*fc 1], [1 1 0 0]).*kaiser(L,bta)' ;
h = p*h/sum(h);
(In this case pqmax and p both represent downsampling ratio.)
Not sure what is the purpose of last line where all filter coefficients are scaled by downsampling ratio over coeffs sum, p/sum(h)? Does anyone know the theory behind?
MATLAB firls
MATLAB kaiser
firls theory: Least Squared Error Design of FIR Filters
In most filters, you want the output magnitude to remain equal after filtering, not scaled up or down.
Mathematically this is generally properly done in the filter equation, but in discrete numbers this doesnt generally doesn't apply.
If you divide the filer by its sum, you make sure that for any input data p, the filter will always have a general scale factor of 1, as its normalized. For a p=ones(1:1000,1), if sum(h) is not 1, the result after filtering will be scaled, i.e. the values of p_filtered will not be 1.

Derivative of Gaussian filter in Matlab

Is there a derivative of Gaussian filter function in Matlab? Would it be proper to convolve the Gaussian filter with [1 0 -1] to obtain the result?
As far as I know there is no built in derivative of Gaussian filter. You can very easily create one for yourself as follow:
For 2D
G1=fspecial('gauss',[round(k*sigma), round(k*sigma)], sigma);
[Gx,Gy] = gradient(G1);
[Gxx,Gxy] = gradient(Gx);
[Gyx,Gyy] = gradient(Gy);
Where k determine the size of it (depends to which extent you want support).
For 1D is the same, but you don't have two gradient directions, just one. Also you would create the gaussian filter in another way and I assume you already have your preferred method.
Here I gave you up to second order, but you can see the pattern here to proceed to further orders.
The convolving filter you posted ( [1 0 -1] ) looks like finite difference. Although yours I believe is conceptually right, the most correct and common way to do it is with [1 -1] or [-1 1], with that 0 in the middle you skip the central sample in approximating the derivative. This may work as well (but bare in mind that is an approximation that for higher orders diverge from true result), but I usually prefer the method I posted above.
Note: If you are indeed interested in 2D filters, Derivative of Gaussian family has the steerability property, meaning that you can easily create a filter for a Derivative of Gaussian in any direction from the one I gave you up. Supposing the direction you want is defined as
cos(theta), sin(theta)
Then the derivative of Gaussian in that direction is
Gtheta = cos(theta)*Gx + sin(theta)*Gy
If you reapply this recursively you can go to any order you like.
Of just use the derivative of a gaussian which is not more difficult then computing the gaussian itself.
G(x,y) = exp(-(x^2+y^2)/(2*s^2))
d/dx G(x, y) = -x/s^2 G(x,y)
d/dy G(x, y) = -y/s^2 G(x,y)
function [ y ] = dgaus( x,n )
%DGAUS nth derivative of exp(-x.^2)
odd=0*x;
even=exp(-x.^2);
for order=0:(floor(n/2)-1)
odd=-4*order*odd-2*x.*even;
even=-(4*order+2)*even-2*x.*odd;
end
if mod(n,2)==0
y=even;
else
y=-2*(n-1)*odd-2*x.*even;
end
end

How does the sgolay function work in Matlab R2013a?

I have a question about the sgolay function in Matlab R2013a. My database has 165 spectra with 2884 variables and I would like to take the first and second derivatives of them. How might I define the inputs K and F to sgolay?
Below is an example:
sgolay is used to smooth a noisy sinusoid and compare the resulting first and second derivatives to the first and second derivatives computed using diff. Notice how using diff amplifies the noise and generates useless results.
K = 4; % Order of polynomial fit
F = 21; % Window length
[b,g] = sgolay(K,F); % Calculate S-G coefficients
dx = .2;
xLim = 200;
x = 0:dx:xLim-1;
y = 5*sin(0.4*pi*x)+randn(size(x)); % Sinusoid with noise
HalfWin = ((F+1)/2) -1;
for n = (F+1)/2:996-(F+1)/2,
% Zero-th derivative (smoothing only)
SG0(n) = dot(g(:,1), y(n - HalfWin: n + HalfWin));
% 1st differential
SG1(n) = dot(g(:,2), y(n - HalfWin: n + HalfWin));
% 2nd differential
SG2(n) = 2*dot(g(:,3)', y(n - HalfWin: n + HalfWin))';
end
SG1 = SG1/dx; % Turn differential into derivative
SG2 = SG2/(dx*dx); % and into 2nd derivative
% Scale the "diff" results
DiffD1 = (diff(y(1:length(SG0)+1)))/ dx;
DiffD2 = (diff(diff(y(1:length(SG0)+2)))) / (dx*dx);
subplot(3,1,1);
plot([y(1:length(SG0))', SG0'])
legend('Noisy Sinusoid','S-G Smoothed sinusoid')
subplot(3, 1, 2);
plot([DiffD1',SG1'])
legend('Diff-generated 1st-derivative', 'S-G Smoothed 1st-derivative')
subplot(3, 1, 3);
plot([DiffD2',SG2'])
legend('Diff-generated 2nd-derivative', 'S-G Smoothed 2nd-derivative')
Taking derivatives in an inherently noisy process. Thus, if you already have some noise in your data, indeed, it will be magnified as you take higher order derivatives. Savitzky-Golay is a very useful way of combining smoothing and differentiation into one operation. It's a general method and it computes derivatives to an arbitrary order. There are trade-offs, though. Other special methods exist for data with a certain structure.
In terms of your application, I don't have any concrete answers. Much depends on the nature of the data (sampling rate, noise ratio, etc.). If you use too much smoothing, you'll smear your data or produce aliasing. Same thing if you over-fit the data by using high order polynomial coefficients, K. In your demo code you should also plot the analytical derivatives of the sin function. Then play with different amounts of input noise and smoothing filters. Such a tool with known exact answers may be helpful if you can approximate aspects of your real data. In practice, I try to use as little smoothing as possible in order to produce derivatives that aren't too noisy. Often this means a third-order polynomial (K = 3) and a window size, F, as small as possible.
So yes, many suggest that you use your eyes to tune these parameters. However, there has also been some very recent research on choosing the coefficients automatically: On the Selection of Optimum Savitzky-Golay Filters (2013). There are also alternatives to Savitzky-Golay, e.g., this paper based on regularization, but you may need to implement them yourself in Matlab.
By the way, a while back I wrote a little replacement for sgolay. Like you, I only needed the second output, the differentiation filters, G, so that's all it calculates. This function is also faster (by about 2–4 times):
function G=sgolayfilt(k,f)
%SGOLAYFILT Savitzky-Golay differentiation filters
s = vander(0.5*(1-f):0.5*(f-1));
S = s(:,f:-1:f-k);
[~,R] = qr(S,0);
G = S/R/R';
A full version of this function with input validation is available on my GitHub.

Calculate autocorrelation using FFT in Matlab

I've read some explanations of how autocorrelation can be more efficiently calculated using the fft of a signal, multiplying the real part by the complex conjugate (Fourier domain), then using the inverse fft, but I'm having trouble realizing this in Matlab because at a detailed level.
Just like you stated, take the fft and multiply pointwise by its complex conjugate, then use the inverse fft (or in the case of cross-correlation of two signals: Corr(x,y) <=> FFT(x)FFT(y)*)
x = rand(100,1);
len = length(x);
%# autocorrelation
nfft = 2^nextpow2(2*len-1);
r = ifft( fft(x,nfft) .* conj(fft(x,nfft)) );
%# rearrange and keep values corresponding to lags: -(len-1):+(len-1)
r = [r(end-len+2:end) ; r(1:len)];
%# compare with MATLAB's XCORR output
all( (xcorr(x)-r) < 1e-10 )
In fact, if you look at the code of xcorr.m, that's exactly what it's doing (only it has to deal with all the cases of padding, normalizing, vector/matrix input, etc...)
By the Wiener–Khinchin theorem, the power-spectral density (PSD) of a function is the Fourier transform of the autocorrelation. For deterministic signals, the PSD is simply the magnitude-squared of the Fourier transform. See also the convolution theorem.
When it comes to discrete Fourier transforms (i.e. using FFTs), you actually get the cyclic autocorrelation. In order to get proper (linear) autocorrelation, you must zero-pad the original data to twice its original length before taking the Fourier transform. So something like:
x = [ ... ];
x_pad = [x zeros(size(x))];
X = fft(x_pad);
X_psd = abs(X).^2;
r_xx = ifft(X_psd);