Wrong Amplitude after FFT [Matlab] [duplicate]

I am trying to use the FFT to decode Morse code, but I'm finding that when I examine the frequency bin/bucket I'm interested in, its absolute value varies quite significantly even when a constant tone is present. This makes it impossible for me to use its rise and fall around a threshold, and therefore to decode audio Morse.
I've even tried the simple example that seems to be copied everywhere, but its output varies too...
I can't work out what I'm doing wrong, and my maths is not strong enough to understand all the formulas associated with the FFT.
I know it must be possible, but I can't find out how... can anyone help please?

Make sure you are using the magnitude of the FFT result, not just the real or imaginary component of a complex result.
In general, when a longer constant-amplitude sinusoid is fed to a series of shorter FFTs (a windowed STFT), the magnitude result will only be constant if the sinusoid is exactly integer-periodic in the FFT length, i.e.
mod(f_tone, f_sampling_rate / FFT_length) == 0
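To see the effect, here is a minimal sketch in base MATLAB (the sample rate and FFT length are my own example values): a bin-aligned tone gives an identical per-frame peak magnitude, while a tone between bins does not.
fs = 8000;                        % sample rate (example value)
N  = 256;                         % FFT length; bin width = fs/N = 31.25 Hz
t  = (0:10*N-1)/fs;               % enough samples for ten consecutive frames
for f = [10, 10.4] * (fs/N)       % on-bin tone vs. between-bins tone
    x = sin(2*pi*f*t);
    frames = reshape(x, N, []);   % one length-N frame per column
    mags = abs(fft(frames));      % magnitude, not just the real or imaginary part
    peak = max(mags(1:N/2, :));   % per-frame peak magnitude
    fprintf('f = %6.1f Hz: peak magnitude %.2f to %.2f\n', f, min(peak), max(peak))
end
The first tone prints the same value for every frame; the second varies from frame to frame even though its amplitude never changes.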
If you are only interested in the magnitude at one selected tone frequency, the Goertzel algorithm is a more efficient filter than a full FFT. And, depending on the setup and length restrictions required by your chosen FFT library, it may be easier to vary the length of a Goertzel filter to match the requirements of your target tone frequency, as well as the time/frequency resolution trade-off you need.
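As a rough sketch of that approach (the sidetone frequency, frame length, and threshold below are my own example choices; goertzel and buffer are in the Signal Processing Toolbox):
fs     = 8000;                       % sample rate (example value)
f_tone = 700;                        % assumed Morse sidetone frequency
t      = (0:2*fs-1)'/fs;
x      = sin(2*pi*f_tone*t) .* (mod(t, 0.5) < 0.25);  % test tone keyed on and off
N      = 800;                        % 100 ms frames; bin width fs/N = 10 Hz,
k      = f_tone/(fs/N) + 1;          % so 700 Hz sits exactly on (1-based) bin 71
frames = buffer(x, N);               % one length-N frame per column
level  = abs(goertzel(frames, k));   % per-frame magnitude at the tone bin only
keyed  = level > 0.5*max(level);     % simple empirical on/off threshold
Note that N was chosen so the tone lands exactly on a bin, which is the same alignment condition as above, just easier to satisfy when you are free to pick the frame length.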

Related

How to convert scales to frequencies in Wavelet Transform

I'm dealing with the CWT, and I have a big problem converting scales to frequencies. In the MATLAB Wavelet Tutorial they use this expression to convert scales to frequencies
But if I use the default function scal2frq I obtain a different result.
I don't understand the role of the Morlet Fourier factor.
Thanks in advance
It is a pretty complicated concept, which I only somewhat understand. I'll write some points here so that you might figure it out yourself more easily.
A simple fact is that:
Scale is inversely proportional to frequency.
For example, imagine we have a 1-100 Hz range of frequencies in some time series data, such as stock market or earthquake data. Scale is "supposed to be" the inverse of that. For instance, if scale ran over the range 1 to 100, we'd have:
Scale (1/Hz)    Frequency (Hz)
1               100
50              2
100             1
Therefore,
The frequency is not the real frequency of the time series data (e.g., stock market, earthquake) that we know of. They are only related, inversely.
So we can safely say that we are calculating some "pseudo-frequencies" here, which is what MATLAB does (by approximation). You can read about the approximation process in the documentation, in the section on pseudo-frequencies:
MATLAB calculates those pseudo-frequencies based on:
In wavelet analysis, the way to relate scales to frequencies is to determine the center frequency of the wavelet function:
and of course this differs when we change the wavelet function used in the calculation, so that center frequency changes every time in the approximation process.
That "MorletFourierFactor" is a constant chosen so that when you compute 1/scale, it closely approximates those "pseudo-frequencies".
I thought this image about shifting (time axis) and scaling (frequency axis) might be a little helpful to look into as well:
The bottom line is: don't worry about pseudo-frequencies; you probably won't need them. If you want a frequency spectrum, you can apply one of the frequency-domain methods (such as the Fast Fourier Transform) to whatever time series data you have.
If you really, really want that mapping, you can also try to design a method to approximate it yourself.
Source: Harvard Seismology

Modifying Sound Input to Determine Frequency

I'm working on a project and I've hit a snag that is past my understanding. My goal is to create an artificial neural network that is fed information from a sound file, runs it through the system, and labels the chord. I'm hoping this will help with music transcription -- not to do the transcription itself, but to help with the harmonization aspect. I digress.
I've read as much as I can about the Goertzel algorithm and the FFT, but I'm unsure whether these functions are what I'm looking for. I'm not looking for any particular frequency in the sound sample; rather, I'm hoping to find the high, middle, and low range frequencies of the sample.
I know the Goertzel algorithm returns a high number if a particular frequency is found, but it seems computationally wasteful to run the algorithm for every possible tone in a given sample. Any ideas on what to use?
Or, if this is impossible, I'd love to know that too before spending too much time on this one project.
Thank you for your time!
Probably better suited to DSP StackExchange.
Suppose you FFT a single 110 Hz tone to get a spectrogram; you'll see evenly spaced peaks at 110, 220, 330, etc. Hz -- the harmonics. 110 Hz is the fundamental.
Suppose you have 3 tones. Already it's going to look quite messy in the frequency domain. Especially if you have a chord containing e.g. A110 and A220.
On account of this, I think a neural network is a good approach.
Feed in FFT output.
It would be a good idea to use a neural network that accepts complex-valued inputs, as the FFT outputs a complex number for each frequency bin.
http://www.eagle.tamut.edu/faculty/igor/PRESENTATIONS/IJCNN-0813_Tutorial.pdf
It may seem computationally wasteful to extract so many frequencies with an FFT, but FFT algorithms are extremely efficient nowadays. You could use an FFT length of 2^10 = 1024 samples, giving 2^9 = 512 complex frequency bins.
The FFT is the right solution. Basically, when you have the FFT of an input signal that consists only of sine waves, you can determine the chord by mapping which frequencies are present to specific tones in whichever musical temperament you want to use, then looking up the chord specified by those tones. If you don't have sine waves as input, then using a neural network is a valid attempt at solving the problem, provided that you have enough samples to train it.
The FFT is the right way. Harmonics don't bother you: since they are integer multiples of the fundamental frequency, they're just higher 'octaves' of the same note. And to recognize a chord, transpositions of notes over whole octaves don't matter.
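As a sketch of the frequencies-to-tones mapping step in MATLAB (equal temperament with A4 = 440 Hz; the synthetic chord and the peak-picking settings are my own example choices, and findpeaks needs the Signal Processing Toolbox):
fs = 44100;
t  = (0:fs-1)/fs;
x  = sum(sin(2*pi*[220; 277.18; 329.63] * t));   % A major triad: A3, C#4, E4
nfft = 2^14;
X  = abs(fft(x, nfft));
f  = (0:nfft/2-1) * fs / nfft;                   % frequencies of the first half
[~, locs] = findpeaks(X(1:nfft/2), 'MinPeakHeight', max(X)/10);
midi  = round(69 + 12*log2(f(locs)/440));        % frequency -> MIDI note number
names = {'C','C#','D','D#','E','F','F#','G','G#','A','A#','B'};
unique(names(mod(midi, 12) + 1))                 % folds octaves: prints A, C#, E
The mod(midi, 12) step is exactly the octave-folding the answer describes: every octave transposition of a note lands on the same pitch class.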

MATLAB, averaging multiple FFTs, coherent integration

I have an audio recording.
I want to detect a sinusoidal pattern.
If I do a regular FFT, I get a result with a bad SNR.
For example, my signal contains 4 high-frequency components; the FFT result looks like this:
To reduce the noise I want to do coherent integration as described in this article: http://flylib.com/books/en/2.729.1.109/1/
but I can't find any MATLAB examples of how to do it. Sorry for my bad English. Please help :)
I look at spectra almost every day, but I had never heard of 'coherent integration' as a method to calculate one. As Jason also mentions, coherent integration only works when your signal has a fixed phase during every FFT you average over.
It is more likely that you want what the article calls 'incoherent integration'. This is more commonly known as calculating a periodogram (or Welch's method, a slightly better variant), in which you average the squared absolute values of the individual FFTs to obtain a power spectral density. To calculate a PSD correctly, you need to pay attention to some details, like applying a suitable window before each FFT, normalizing properly (so that the result is calibrated in, e.g., V^2/Hz), and using half-overlapping windows to make use of all your data. All of this is implemented in MATLAB's pwelch function, which is part of the Signal Processing Toolbox. See my answer to a similar question about how to use pwelch.
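For example, a minimal pwelch call could look like this (the window and overlap lengths are example choices):
fs = 8000;                                   % sample rate (example value)
t  = (0:10*fs-1)'/fs;
x  = sin(2*pi*1000*t) + randn(size(t));      % tone buried in noise
nwin = 1024;                                 % window length (example value)
[pxx, f] = pwelch(x, hann(nwin), nwin/2, nwin, fs);  % PSD, calibrated per Hz
plot(f, 10*log10(pxx)), xlabel('Frequency (Hz)'), ylabel('PSD (dB/Hz)')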
Integration or averaging of FFT frames just amounts to adding the frames up element-wise and dividing by the number of frames. Since MATLAB provides vector operations, you can just add the frames with the + operator.
coh_avg = (frame1 + frame2 + ...) / Nframes
Where frameX are the complex FFT output frames.
If you want to do non-coherent averaging, you just need to take the magnitude of the complex elements before adding the frames together.
noncoh_avg = (abs(frame1) + abs(frame2) + ...) / Nframes
Also note that for coherent averaging to work best, the starting phase of the signal of interest needs to be the same for each FFT frame. Otherwise, the FFT bin containing the signal may add in such a way that the amplitudes cancel out. This is usually a tough requirement to meet without some knowledge of the signal or some external triggering, so it is more common to use non-coherent averaging.
Non-coherent integration will not reduce the noise power, but it will increase signal to noise ratio (how the signal power compares to the noise power), which is probably what you really want anyway.
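Here is a sketch contrasting the two on a synthetic signal; the tone is deliberately bin-aligned so every frame starts at the same phase and coherent averaging works as intended:
fs = 8000; N = 1024; Nframes = 50;       % example values
f0 = 10 * fs/N;                          % bin-aligned: 10 whole cycles per frame,
t  = (0:N*Nframes-1)'/fs;                % so each frame starts at the same phase
x  = 0.1*sin(2*pi*f0*t) + randn(size(t));
frames = fft(reshape(x, N, []));         % one complex FFT per column
coh    = abs(mean(frames, 2));           % average complex spectra, then magnitude
noncoh = mean(abs(frames), 2);           % magnitude first, then average
% coh lowers the noise floor relative to the tone by roughly sqrt(Nframes);
% noncoh only smooths the floor, making the tone easier to see, not stronger.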
I think what you are looking for is the "spectrogram" function in MATLAB, which computes the short-time Fourier transform (STFT) of an input signal.
STFT
Spectrogram
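A minimal spectrogram call might look like this (all parameters are example values):
fs = 8000;
t  = (0:5*fs-1)'/fs;
x  = chirp(t, 100, 5, 3000) + 0.5*randn(size(t));     % swept tone in noise
[s, f, tt] = spectrogram(x, hann(1024), 512, 1024, fs);
imagesc(tt, f, 20*log10(abs(s))), axis xy, xlabel('Time (s)'), ylabel('Frequency (Hz)')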

Choosing Real vs Complex 2D FFTs using the Apple Accelerate framework

Can anyone advise on the correct FFT to use (real or complex)? I have looked here but still have questions.
I want to do image correlation to identify the location of a sub image within a master image. I understand the basics of FFTs and iFFTs.
The plan:
Perform an FFT on the master image (512x512).
Perform an FFT on the sub image (30x30, but padded with zeros to 512x512).
Take the complex conjugate of the sub image's FFT.
Complex multiply the two resulting matrices.
Perform an iFFT on the result.
Even though the result should be (mostly) real, take the magnitude of the resulting matrix.
Look for maximum value which should correspond to maximum correlation.
I am having trouble getting the results that I anticipate.
If I use the real 2D FFT (vDSP_fft2d_zrip), the result is in a packed format that makes it hard to use vDSP_zvmul to multiply the two result matrices.
If I use the complex FFT (vDSP_fft2d_zip), I fail to get any correlation at all.
The Apple examples, and most of the audio examples, don't do anything with the results of the forward FFT other than invert it.
Can anyone help me get started with image correlation? First question: can I use the complex FFT and avoid the packed format?
The only difference between a real and complex FFT is that the real FFT can be slightly more efficient by using a clever packing scheme that transforms a 2^n real FFT into a 2^(n-1) complex FFT. The results should be the same in both cases. So I would stick with the complex FFT for simplicity if I were you, at least until you have everything working.
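If it helps, here is the same correlation pipeline sketched in MATLAB, which is handy for validating the math before debugging the vDSP packing (the images are synthetic stand-ins):
master = randn(512);                  % stand-in for the 512x512 master image
sub    = master(101:130, 201:230);    % the 30x30 patch we want to locate
F1 = fft2(master);
F2 = fft2(sub, 512, 512);             % sub image zero-padded to 512x512
c  = ifft2(F1 .* conj(F2));           % cross-correlation surface
[~, idx] = max(abs(c(:)));            % peak = best alignment
[row, col] = ind2sub(size(c), idx)    % prints 101, 201: where the patch sits
For real images (rather than white noise), subtracting the mean from both images first makes the correlation peak much easier to pick out.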
Have you also taken a look at vImageConvolve_ARGB8888? It seems to do what you are trying to do, with a lot less effort :)

Finding Relevant Peaks in Messy FFTs

I have FFT outputs that look like this:
At 523 Hz is the maximum value. However, being a messy FFT, there are lots of little peaks right near the large peaks, and they're irrelevant, whereas the peaks shown aren't. Are there any algorithms I can use to extract the maxima of this FFT that matter, i.e., that aren't just random peaks cropping up near "real" peaks? Perhaps there is some sort of filter I can apply to this FFT output?
EDIT: The context is that I am trying to take one-hit sound samples (like someone pressing a key on a piano) and extract the loudest partials. In the image below, the peaks above 2000 Hz are important, because they are discrete partials of the given sound (which happens to be a sort of bell). However, the peaks scattered right around 523 Hz seem to be just artifacts, and I want to ignore them.
If the peak is broad, it could indicate that the peak frequency is modulated (AM, FM or both), or is actually a composite of several spectral peaks, themselves each potentially modulated.
For instance, a piano note may be the result of the hammer hitting up to 3 strings that are each tuned a tiny fraction differently, and they can all modulate as they exchange energy through the piano frame. Guitar strings can change frequency as the pluck-shape distortion smooths out and decays. Bells change shape after they are struck, which can modulate their spectrum. Etc.
If the sound itself is "messy" then you need a good definition of what you mean by the "real" peak, before applying any sort of smoothing or side-band rejection filter. e.g. All that "messiness" may be part of what makes a bell sound like a real bell instead of an electronic sinewave generator.
Try convolving your FFT magnitude (treating it as a signal) with a rectangular pulse (pulse = ones(1,20)/20;). This might get rid of some of them. Your maxima will be shifted by 10 frequency bins to the right; take that into account. You would basically be integrating your signal. Similar techniques are used in the Pan-Tompkins algorithm for heartbeat identification.
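A sketch of that suggestion, assuming x already holds your one-hit sample; using conv with the 'same' option centers the kernel, so the 10-bin shift disappears:
magX  = abs(fft(x));                   % x: the one-hit sample (assumed available)
pulse = ones(1, 20) / 20;              % 20-bin rectangular smoothing kernel
smoothed = conv(magX, pulse, 'same');  % 'same' keeps the length and the peak positions
[pks, locs] = findpeaks(smoothed, 'MinPeakProminence', max(smoothed)/20);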
I worked on a similar problem once, and chose to use Savitzky-Golay filters to smooth the spectrum data. I could extract some significant peaks, and it didn't mess too much with the overall spectrum.
But I had the problem hotpaw2 is alerting you to: I lost important characteristics along with the loss of "messiness", so I truly recommend you listen to him. But if you think you won't have a problem with that, I think Savitzky-Golay can help.
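For reference, a minimal Savitzky-Golay pass might look like this (sgolayfilt is in the Signal Processing Toolbox; the polynomial order and frame length are example values to tune, and x is assumed to hold the one-hit sample):
magX = abs(fft(x));                  % x: the one-hit sample (assumed available)
smoothed = sgolayfilt(magX, 3, 41);  % cubic fit over a sliding 41-bin frame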
There are non-FFT methods for creating a frequency-domain representation of time-domain data which are better for noisy data sets, like maximum-entropy (max-ent) reconstruction.
For noisy time-series data, a max-ent reconstruction will be capable of distinguishing true peaks from noise very effectively (without adding any artifacts or other modifications to suppress noise).
Max-ent works by "guessing" a frequency-domain spectrum for the time-domain data, doing an IFT, and comparing the result with the actual time-series data, iteratively. The final output of max-ent is a frequency-domain spectrum (like the one you show above).
There are implementations in Java, I believe, for 1-D spectra, but I have never used one.