Let's say I have a vector of length 1000. If I take the FFT of this data, MATLAB chooses the k values as 0:1:length(data)-1. How can I change this range to 0:1:length(data)*(an integer)-1 or any desired range?
See the documentation for fft. The second parameter sets the size of the transform:
x = randn(1,1000);
y = fft(x,512);
However, this is equivalent to
y = fft(x(1:512));
That is, the input data is cropped to the right length, rather than using all the input data and computing only part of the output values. (If the requested size is larger than the input, the input is zero-padded instead, so fft(x,4*length(x)) evaluates the transform for k = 0:1:4*length(x)-1.)
There is no way to have fft compute only part of the output values; the FFT algorithm is most efficient when computing the full transform.
Alternatives are to simply crop the output (is the computation taking too long?), or to compute the DFT sample by sample (this is efficient only for a few output samples; for anything more, the computation will take longer than the full FFT).
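To make the "sample by sample" alternative concrete, here is a minimal sketch in Python/NumPy (not MATLAB). The helper name dft_bins and its n argument are made up for illustration. It evaluates the DFT sum only at the requested k values, and passing a larger n evaluates the same sum on a finer k grid, which matches what a zero-padded FFT of size n returns.

```python
import numpy as np

def dft_bins(x, ks, n=None):
    """Evaluate the length-n DFT of x only at the bin indices in ks.

    n defaults to len(x); a larger n corresponds to zero-padding x to length n.
    (Hypothetical helper, written for this example.)
    """
    x = np.asarray(x)
    n = len(x) if n is None else n
    t = np.arange(len(x))
    # One row of the DFT matrix per requested bin: exp(-2*pi*i*k*t/n)
    w = np.exp(-2j * np.pi * np.outer(ks, t) / n)
    return w @ x

x = np.random.default_rng(0).standard_normal(1000)

# A few bins of the ordinary 1000-point transform
few = dft_bins(x, [0, 3, 17])
full = np.fft.fft(x)

# A few bins of the 4000-point (zero-padded) transform
fine = dft_bins(x, [5, 2001], n=4000)
padded = np.fft.fft(x, 4000)
```

The cost is proportional to len(ks) * len(x), so this only pays off when very few bins are needed.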
This question relates to SciPy's Short-time Fourier Transform function for signal processing.
For some reason I don't understand, the size of the output 'array of sample frequencies' is exactly equal to the hop size. From the documentation:
nperseg : int, optional
Length of each segment. Defaults to 256.
noverlap : int, optional
Number of points to overlap between segments. If None, noverlap = nperseg // 2. Defaults to None. When specified, the COLA constraint must be met (see Notes below).
f : ndarray
Array of sample frequencies.
hop size H = nperseg - noverlap
I'm new to signal processing and Fourier transforms, but as far as I understand a STFT is just chopping an audio file into segments ('time frames') on which you perform a Fourier transform. So if I want to do a STFT on 100 time frames, I'd expect the output to be a matrix of size 100 x F, where F is an array of measured frequencies ('measured' probably isn't the right word here but you know what I mean).
This is kinda what SciPy's implementation does, but the size of f here is what bothers me. It's supposed to be an array describing the different frequencies, like [0Hz 500Hz 1000Hz], and it does, but for some reason its size is exactly the same as the hop size. If the hop size is 700, the number of measured frequencies is 700.
The hop size is the number of samples (i.e. time) between each time frame, and is correctly calculated as H = nperseg - noverlap, but what does this have to do with the frequency array?
Edit: Related to this question
An FFT is a square matrix transform from one orthogonal basis to another of the same dimension. This is because N is the exact number of orthogonal (i.e. not interfering with one another) complex sinusoids that fit in a time domain vector of length N.
A longer time vector can contain more frequency information (e.g. it's hard to tell 2 frequencies apart using just 3 sample points, but much easier with 3000 samples, etc.)
You can zero-pad your short time vector of length N to use a longer FFT, but that is identical to interpolating a nice curve between N frequency points, which makes all the FFT results interdependent.
For many purposes (visualization, etc.) an STFT is overlapped, where the adjacent segments share some overlapped data instead of just being end-to-end. This gives better time locality (e.g. the segments can be spaced closer but still be long enough so that each one can provide the frequency resolution required).
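A small NumPy sketch may help separate the two quantities. It is a simplified STFT written for illustration, not SciPy's implementation: there is no boundary padding, the Hann window is fixed, and the function name stft_frames is made up. The number of frequency bins depends only on the segment length (nperseg // 2 + 1 for a real input), while the hop size only changes the number of time frames. With the default noverlap = nperseg // 2 the hop is nperseg / 2, which is within one of the bin count, and that is probably the coincidence described in the question.

```python
import numpy as np

def stft_frames(x, nperseg=256, noverlap=None):
    """Simplified one-sided STFT: one row per time frame (illustrative only)."""
    noverlap = nperseg // 2 if noverlap is None else noverlap
    hop = nperseg - noverlap
    window = np.hanning(nperseg)
    # Start index of every full segment that fits in x
    starts = range(0, len(x) - nperseg + 1, hop)
    return np.array([np.fft.rfft(window * x[s:s + nperseg]) for s in starts])

x = np.random.default_rng(0).standard_normal(8000)

half_overlap = stft_frames(x, nperseg=1400, noverlap=700)    # hop = 700
more_overlap = stft_frames(x, nperseg=1400, noverlap=1050)   # hop = 350

print(half_overlap.shape)  # (10, 701)
print(more_overlap.shape)  # (19, 701)
```

Halving the hop roughly doubles the number of frames but leaves the 701 frequency bins unchanged.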
I want to compare 2 audio files (each audio file is someone speaking "ba a ta") with the existing MATLAB function for Dynamic Time Warping (DTW). Before doing the dynamic time warping, I get an array/vector from the Fast Fourier Transform (FFT) function available in MATLAB. My code so far (my MATLAB filename: test.m):
fftRecording1 = fft(audioread('C:\Users\handy\Documents\MATLAB\my_recording_1.wav'));
fftRecording2 = fft(audioread('C:\Users\handy\Documents\MATLAB\fajar.wav'));
dist = dtw(fftRecording1, fftRecording2);
When I try the DTW function there is an error because the arrays/vectors of the 2 files have different lengths (numbers of rows). Error message:
Error using dtw (line 82)
The number of rows between X and Y must be equal when X and Y are matrices
Error in test (line 3)
dist = dtw(fftRecording1, fftRecording2);
contents of the fftRecording1 and fftRecording2 variables
My question is: before doing the FFT and DTW, how do I normalize the 2 audio files, step by step, so that their lengths (numbers of rows) are equal? Or are there other ways to make the data lengths (rows) of the 2 audio files equal?
According to dtw's documentation:
To stretch the inputs, dtw repeats each element of x and y as many times as necessary. If x and y are matrices, then dist stretches them by repeating their columns. In that case, x and y must have the same number of rows.
In your case your columns represent the audio channels, with the rows representing the quantity to be aligned (i.e. the reverse of what dtw is expecting). To set up the inputs according to what dtw expects, simply transpose the inputs:
dist = dtw(transpose(fftRecording1), transpose(fftRecording2));
Dynamic Time Warping does not need the input sequences to be of the same length. DTW is actually used to measure the similarity between two sequences that may differ in timing or speed.
No, they don’t need to have the same length in a time-related sense. They need to have the same number of dimensions (2D Signal, 3D Signal,...) which is equivalent to their number of rows. The whole idea of DTW is to match similar contents which might be stretched to different lengths - so there would absolutely be no point in requiring the inputs to have the same length.
Related to your question: just call dtw with the transposes of your signals and you will get a proper result.
dtw(signal1', signal2');
You should apply DTW to the original signals rather than to their Fourier transforms. The FFT transfers the signal from the time domain to the frequency domain, so if you take the FFT before DTW you are warping along frequency instead of warping signal1 in time to match signal2. The amplitude of the Fourier transform also depends on the number of points in the FFT time window. From my point of view there is absolutely no point in applying DTW to a Fourier transform.
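To illustrate why equal lengths are not required, here is a minimal textbook DTW in plain Python. It is a sketch of the classic dynamic-programming recurrence with an absolute-difference cost, not MATLAB's dtw, and the function name dtw_distance and the toy sequences are made up for the example.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance for 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Either repeat an element of b, repeat an element of a, or advance both
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

slow = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]   # same shape as `fast`, played at half speed
fast = [0, 1, 2, 1, 0]
other = [2, 2, 0, 0, 2]                 # a different shape

same_shape = dtw_distance(slow, fast)
different_shape = dtw_distance(fast, other)
```

The 10-sample and 5-sample sequences align perfectly (distance 0) because DTW repeats elements of the shorter one as needed, while the differently shaped sequence gives a positive distance.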
I need MATLAB code for a perceptual hashing algorithm described here:
http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html
Basically I want this to remove details in an image and leave only the information about the major structural components.
To do so, I think I need the following steps:
1. Reduce the DCT. Suppose the DCT is 32x32; just keep the top-left 8x8. Those represent the lowest frequencies in the picture.
2. Compute the average value. Like the Average Hash, compute the mean DCT value (using only the 8x8 DCT low-frequency values and excluding the first term since the DC coefficient can be significantly different from the other values and will throw off the average).
3. Further reduce the DCT. Set the 64 hash bits to 0 or 1 depending on whether each of the 64 DCT values is above or below the average value. The result doesn't tell us the actual low frequencies; it just tells us the very-rough relative scale of the frequencies to the mean. The result will not vary as long as the overall structure of the image remains the same; this can survive gamma and color histogram adjustments without a problem.
4. Reconstruct the image after the processing.
Can anyone help with any one of the above steps?
I have tried some code that gives some results (in the below link), it is not yet perfect:
https://stackoverflow.com/questions/26748051/extract-low-frequency-from-dct-coeffecients-of-an-image-in-matlab
Try this:
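% read image and shrink it to 32x32, the size assumed in the question
I = imread('cameraman.tif');
I = imresize(I, [32 32]);
% cosine transform and reduction to the 8x8 low-frequency block
d = dct2(double(I));
d = d(1:8,1:8);
% compute average, excluding the DC term d(1,1) as the algorithm specifies
a = (sum(d(:)) - d(1,1)) / (numel(d) - 1);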
% set bits, here unclear whether > or >= shall be used
b = d > a;
% maybe convert to string:
string = num2str(b(:)');
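For comparison, the same steps can be sketched in Python/NumPy without any image toolbox. This is only an illustration under stated assumptions: the input is already a square 32x32 grayscale array (random data stands in for a real image), the DCT is built as an explicit orthonormal DCT-II matrix, and the names dct_matrix, phash_bits and hamming are made up here.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C @ X @ C.T is the 2-D DCT of X."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def phash_bits(img):
    """64 hash bits from the 8x8 low-frequency DCT block of a square image."""
    c = dct_matrix(img.shape[0])
    d = (c @ img @ c.T)[:8, :8]
    # Mean of the 63 non-DC coefficients (the DC term is excluded from the average)
    mean_no_dc = (d.sum() - d[0, 0]) / (d.size - 1)
    return (d > mean_no_dc).ravel()

def hamming(a, b):
    """Number of differing hash bits."""
    return int(np.count_nonzero(a != b))

img = np.random.default_rng(0).random((32, 32))   # stand-in for a resized image
bits = phash_bits(img)
darker_bits = phash_bits(0.5 * img)                # same image at half brightness
```

Scaling the brightness scales every DCT coefficient and the mean by the same factor, so the bits do not change, which is the kind of robustness the algorithm description claims.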
I'm trying to figure out how MATLAB does the short time Fourier transforms for its spectrogram function (and related functions like specgram, or stft in Octave). What is curious to me is that you can apparently specify the length of the window and the FFT length (number of output frequencies) independently, whereas I would have expected that these two should be equal (since the length of an FFT'd signal is the same as the length of the original signal). To illustrate what I mean, here is the function call:
[S,F,T]=spectrogram(signal,winSize,overlapSize,fftSize,rate);
winSize is the length of subintervals which are to be (individually) FFT'd, and fftSize is the number of frequency components given in the output. When these are not equal, does Matlab do interpolation to produce the required number of frequency bins?
Ultimately the reason I want to know is so that I can determine the proper units and scaling for the frequencies.
Cheers
A windowed segment of a signal can be zero-padded to a longer length vector to use a longer FFT. The frequency scaling will be determined by the length of the FFT (and the signal's sample rate). The window size and window formula will determine the effective resolution, in terms of peak separation ability.
Why do this? Some FFT sizes can be computed more efficiently than others (slightly or a lot, depending on the FFT library used). Also, a longer FFT will calculate more points or bins, thus producing a higher density of interpolated points in a potentially smoother spectrum result.
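A short NumPy sketch of both points, with made-up values for the sample rate, window size and FFT size: bin k of the padded FFT sits at k * fs / fft_size Hz, and the padded spectrum passes exactly through the points of the unpadded one, so the extra bins are interpolated rather than new information.

```python
import numpy as np

fs = 8000.0        # assumed sample rate in Hz
win_size = 256     # assumed window length in samples
fft_size = 1024    # assumed FFT length (window is zero-padded up to this)

# One windowed segment containing a 1000 Hz tone
t = np.arange(win_size) / fs
segment = np.hanning(win_size) * np.sin(2 * np.pi * 1000.0 * t)

short = np.fft.rfft(segment)              # win_size // 2 + 1 = 129 bins
padded = np.fft.rfft(segment, fft_size)   # fft_size // 2 + 1 = 513 bins

# Frequency of each padded bin: k * fs / fft_size
freqs = np.fft.rfftfreq(fft_size, d=1.0 / fs)
bin_spacing = freqs[1] - freqs[0]         # fs / fft_size = 7.8125 Hz
peak_hz = freqs[np.argmax(np.abs(padded))]

# Every 4th padded bin coincides with a bin of the unpadded FFT
every_fourth = padded[::fft_size // win_size]
```

The width of the peak around 1000 Hz is the same in both spectra because it is set by the 256-sample window; the longer FFT only samples that peak more densely.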
I'm trying to write a simple matlab code which enlarges an image using fft. I tried the known algorithm for image expansion, which computes the Fourier transform of the image, pads it with zeros and computes the inverse Fourier of the padded image.
However, the inverse Fourier transform returns an image which contains complex numbers.
Therefore, when I'm trying to show the result using imshow, I'm getting the following error:
Warning: Displaying real part of complex input.
Do you have any idea what I am doing wrong?
my code:
im = imread('fruit.jpg');
imFFT = fft2(im);
bigger = padarray(imFFT,[10,10]);
imEnlarged = ifft2(bigger);
Thanks!
That's because the FFT returns values corresponding to the discrete (spatial) frequencies from 0 through Fs, where Fs is the (spatial) sampling rate. You need to insert zeros at high frequencies, which are located at the center of the returned FFT, not at its end.
You can use fftshift to shift the high frequencies to the edges, pad with zeros, and then shift back with ifftshift (thanks to @Shai for the correction):
bigger = ifftshift(padarray(fftshift(imFFT),[10,10]));
Also, note that padding with zeros decreases the values in the enlarged image. You can correct that using a suitable amplification factor amp, which in this case would be equal to (1+2*10/length(im))^2:
bigger = ifftshift(padarray(fftshift(amp*imFFT),[10,10]));
You can pad at the higher frequencies directly (without the fftshift suggested by Luis Mendo)
>> BIG = padarray( amp*imFFT, [20 20], 0, 'post' );
>> big = ifft2( BIG );
If you want a strictly real result, then before you do the IFFT you need to make sure the zero-padded array is exactly conjugate symmetric. Adding the zeros off-center could prevent this required symmetry.
Due to finite numerical precision, you may still end up with a complex IFFT result, but the imaginary components will all be tiny values that are essentially equivalent to zero.
Your FFT library may contain a half-to-real (quarter-size input for 2D) version that enforces symmetry and throws away the almost-zero numerical noise for you.
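The symmetry argument can be checked with a NumPy sketch of the centered-padding approach. The function name fft_enlarge is made up, random data stands in for a grayscale image, and the sketch assumes odd image dimensions: with even dimensions the Nyquist row and column have no mirror partner after padding, so the result is only approximately real.

```python
import numpy as np

def fft_enlarge(im, pad):
    """Enlarge a 2-D array by zero-padding its centered spectrum by `pad` on every side."""
    h, w = im.shape
    spec = np.fft.fftshift(np.fft.fft2(im))    # move DC to the center
    big = np.fft.ifftshift(np.pad(spec, pad))  # zeros land at the high frequencies
    # ifft2 divides by the new number of pixels, so rescale to keep the brightness
    scale = ((h + 2 * pad) * (w + 2 * pad)) / (h * w)
    return np.fft.ifft2(big) * scale

im = np.random.default_rng(0).random((31, 31))   # stand-in for an odd-sized image
out = fft_enlarge(im, 10)

max_imag = np.abs(out.imag).max()   # only rounding noise remains
enlarged = out.real                 # safe to drop the imaginary part here
```

For a square image the scale factor equals the (1+2*10/length(im))^2 amplification mentioned above.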