If I perform a standard wavelet transform and then perform the inverse, I was expecting to get the original signal back:
% dummy series:
Fs = 1e3;
t = 0:1/Fs:1;
x = exp(cos(2*pi*32*t).*(t>=0.1 & t<0.3) + sin(2*pi*64*t).*(t>0.7));
% perform default transform and inverse
wt=cwt(x)
rx=icwt(wt)
% plot
plot(t,x,t,rx)
Apart from the offset, the flat period signals are distorted.
It seems to be possible to perform a transform/inverse and have something close to the identity function, as here Wavelet reconstruction of time series , but reading the tutorials/help for cwt I do not see how to achieve this.
The matlab documentation explains that the CWT is not the best choice for perfect reconstruction. However if you want to compare different bands as signals with the same size as the original, you can use the MODWT (or the shift-invariant DWT by cycle-spinning, sometimes calles à trous).
Related
I'm writing currently a program in Matlab which is related on image hashing. Loading the image and performing a simple down-sampling was not a problem.Here is my code
clear all;
close all;
clc;
%%load Image
I = im2single(imread('cameraman.tif'));
%%Perform filtering and downsampling
gaussPyramid = vision.Pyramid('PyramidLevel', 2);
J = step(gaussPyramid, I); %%Preprocessed Image
%%Get 2D Fourier Transform of Input Image
Y = fft2(I); %%fft of input image
The algorithm next assumes that 2D Fourier Transform ( Y in my case ) must be in the form Y(f_x,f_y) where f_x,f_y are the normalized spatial frequencies
in the range [0, 1].I'm not able to transform the output of fft2 function from Matlab as it is required by the algorithm.
I looked up the paper in question (Multimedia Forensic Analysis via Intrinsic and Extrinsic Fingerprints by Ashwin Swaminathan) and found
We take a Fourier transform on the preprocessed image
to obtain I(fx, fy). The Fourier transform output is converted into polar co-ordinates to arrive at I′(ρ, θ)
By this, they seem to mean that I(fx, fy) is the Fourier transformation of the image. But they're trying to use the Fourier-Mellin transformation, which is different from a simple fft2 on the image. Based on the information found in this set of slides,
If an input image is expressed in terms of coordinates that are natural logarithms of the original coordinates, then the magnitude of its FT is insensitive to any change in scale of the original image (since it is Mellin Transform in the original coordinates)
and in files on the MATLAB Central File exchange, you have to do additional work to get the Mellin-Fourier transform; particularly, taking the magnitude (which appears to be your missing step), converting the coordinates to log polar and taking a second fft2. It's unclear to me why the log of log polar coordinates was omitted from the paper in the steps required. See an implementation for image registration here for an example. Note that this code is old, and the transformImage method appears not to exist; it does the log polar transform.
I am trying to distinguish shape for images in Matlab by using the Fourier Descriptor. What I want to do is : 1. Generate the Fourier Descriptors for each image; 2. Calculate the Euclidian Distance between these Fourier Descriptors to compare the shapes.
My problem is I cannot make the result of calculating the Fourier Descriptor to be insensitive for the geometric transformation (e.g. Rotation & Scaling).
The code I use now is the "Gonzales matlab version", the one in this link. I have tried to normalize the result by doing this:
% Normalization
DC = f(1);
f = f(2:11); % getting the first 20 & deleting the dc component
f = abs(f) ; % use magnitudes to be invariant to translation & rotation
f = f/DC; % devide the fourier coeffients by the DC-coefficient to be invariant to scale
But I don't think it worked as I expected. The result is different if I change the direction or the scale of a same image.
I have been trapped by this question for a couple of days. I will appreciate any suggestion, thank you all in advance!
I recommend you to read
"Feauture Extraction and Image Processing for Computer Vision"
by Nixon and Aguado.
You will find there what you are looking for
It's my first time performing an FFT within MatLab by experimenting with some example code from the MathWorks website. I was wondering if it was possible to take the code I have and transform the x axis to a log-scale representation rather than linear. I understand most of the code, but it is the x axis line of code that I'm still not 100% sure exactly what it is doing apart from the +1 at the end of the line, which is that fact that MatLab's indexing structure doesn't start on 0.
My code so far is:
[y,fs] = wavread('Wav/800Hz_2sec.wav');
NFFT = 4096;
Y = fft(y,NFFT)/length(y);
f = fs/2*linspace(0,1,NFFT/2+1);
plot(f,2*abs(Y(1:NFFT/2+1))
frequency usually comes out in linear scale from Discrete Fourier Transform. if you want, you can make a new frequency vector in log scale and interpolate the results you already have
fnew=fs/2.*logspace(log10(fs/length(y)),0,npts);
Ynew= interp1(f,Y(1:NFFT/2+1),fnew);
where npts is the length of your new frequency vector. for just plotting
loglog(f,2*abs(Y(1:NFFT/2+1));
honestly IMO, the interpolation thing doesn't work very well because FFT of real signals produces strong peaks and troughs in spectra, so unless you smooth your spectrum first, the interpolated spectrum won't look as nice
I want to estimate the time of arrival of GPR echo signals using Music algorithm in matlab, I am using the duality property of Fourier transform.
I am first applying FFT on the obtained signal and then passing these as parameters to pmusic function, i am still getting the result in frequency domain.?
Short Answer: You're using the wrong function here.
As far as I can tell Matlab's pmusic function returns the pseudospectrum of an input signal.
If you click on the pseudospectrum link, you'll see that the pseudospectrum of a signal lives in the frequency domain. In particular, look at the plot:
(from Matlab's documentation: Plotting Pseudospectrum Data)
Notice that the result is in the frequency domain.
Assuming that by GPR you mean Ground Penetrating Radar, then try radar or sonar echo detection approach to estimate the two way transit time.
This can be done and the theory has been published in several papers. See, for example, here:
STAR Channel Estimation in DS-CDMA Systems
That paper describes spatiotemporal estimation (i.e. estimation of both time and direction of arrival), but you can ignore the spatial part and just do temporal estimation if you have a single-antenna receiver.
You probably won't want to use Matlab's pmusic function directly. It's always quicker and easier to write these sorts of functions for yourself, so you know what is actually going on. In the case of MUSIC:
% Get noise subspace (where M is number of signals)
[E, D] = eig(Rxx);
[lambda, idx] = sort(diag(D), 'descend');
E = E(:, idx);
En = E(:,M+1:end);
% [Construct matrix S, whose columns are the vectors to search]
% Calculate MUSIC null spectrum and convert to dB
Z = 10*log10(sum(abs(S'*En).^2, 2));
You can use the Phased array system toolbox of MATLAB if you want to estimate the DOA using different algorithms using a single command. Such as for Root MUSIC it is phased.RootMUSICEstimator phased.ESPRITEstimator.
However as Harry mentioned its easy to write your own function, once you define the signal subspace and receive vector, you can directly apply it in the MUSIC function to find its peaks.
This is another good reference.
http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=1143830
I am pretty new to Matlab and I am trying to write a simple frequency based speech detection algorithm. The end goal is to run the script on a wav file, and have it output start/end times for each speech segment. If use the code:
fr = 128;
[ audio, fs, nbits ] = wavread(audioPath);
spectrogram(audio,fr,120,fr,fs,'yaxis')
I get a useful frequency intensity vs. time graph like this:
By looking at it, it is very easy to see when speech occurs. I could write an algorithm to automate the detection process by looking at each x-axis frame, figuring out which frequencies are dominant (have the highest intensity), testing the dominant frequencies to see if enough of them are above a certain intensity threshold (the difference between yellow and red on the graph), and then labeling that frame as either speech or non-speech. Once the frames are labeled, it would be simple to get start/end times for each speech segment.
My problem is that I don't know how to access that data. I can use the code:
[S,F,T,P] = spectrogram(audio,fr,120,fr,fs);
to get all the features of the spectrogram, but the results of that code don't make any sense to me. The bounds of the S,F,T,P arrays and matrices don't correlate to anything I see on the graph. I've looked through the help files and the API, but I get confused when they start throwing around algorithm names and acronyms - my DSP background is pretty limited.
How could I get an array of the frequency intensity values for each frame of this spectrogram analysis? I can figure the rest out from there, I just need to know how to get the appropriate data.
What you are trying to do is called speech activity detection. There are many approaches to this, the simplest might be a simple band pass filter, that passes frequencies where speech is strongest, this is between 1kHz and 8kHz. You could then compare total signal energy with bandpass limited and if majority of energy is in the speech band, classify frame as speech. That's one option, but there are others too.
To get frequencies at peaks you could use FFT to get spectrum and then use peakdetect.m. But this is a very naïve approach, as you will get a lot of peaks, belonging to harmonic frequencies of a base sine.
Theoretically you should use some sort of cepstrum (also known as spectrum of spectrum), which reduces harmonics' periodicity in spectrum to base frequency and then use that with peakdetect. Or, you could use existing tools, that do that, such as praat.
Be aware, that speech analysis is usually done on a frames of around 30ms, stepping in 10ms. You could further filter out false detection by ensuring formant is detected in N sequential frames.
Why don't you use fft with `fftshift:
%% Time specifications:
Fs = 100; % samples per second
dt = 1/Fs; % seconds per sample
StopTime = 1; % seconds
t = (0:dt:StopTime-dt)';
N = size(t,1);
%% Sine wave:
Fc = 12; % hertz
x = cos(2*pi*Fc*t);
%% Fourier Transform:
X = fftshift(fft(x));
%% Frequency specifications:
dF = Fs/N; % hertz
f = -Fs/2:dF:Fs/2-dF; % hertz
%% Plot the spectrum:
figure;
plot(f,abs(X)/N);
xlabel('Frequency (in hertz)');
title('Magnitude Response');
Why do you want to use complex stuff?
a nice and full solution may found in https://dsp.stackexchange.com/questions/1522/simplest-way-of-detecting-where-audio-envelopes-start-and-stop
Have a look at the STFT (short-time fourier transform) or (even better) the DWT (discrete wavelet transform) which both will estimate the frequency content in blocks (windows) of data, which is what you need if you want to detect sudden changes in amplitude of certain ("speech") frequencies.
Don't use a FFT since it calculates the relative frequency content over the entire duration of the signal, making it impossible to determine when a certain frequency occured in the signal.
If you still use inbuilt STFT function, then to plot the maximum you can use following command
plot(T,(floor(abs(max(S,[],1)))))