Increase/decrease play speed of a WAV file in MATLAB - matlab

Hello I need help in MATLAB.
My wave file plays with this code:
x=wavread('D:\\Sample.wav');
Now I want to increase/decrease the play speed of a WAV file in MATLAB with reshape. For example, double the speed.
Let me to Explain it .
when We use this code:
x=wavread('D:\\\Sample.wav');
now x is a Matrix 92086 * 1
and now I want to set zero for Decussate of X Like this:
0
value1
0
value2
...
...
now how can i do it whit reshape?
After that, I need to merge two WAV files into one WAV file. For example I have two files:
x=wavread('D:\\Sample1.wav');
y=wavread('D:\\Sample2.wav');
and need to merge these and play it.

I assume you mean to use the resample and not the reshape function. reshape is used to (well..) reshape a matrix, i.e. change the number of rows and columns. The resample function can be used to change the sampling rate of a signal. You can use this to increase / decrease the play speed of your WAV file. The syntax of resample is:
y = resample(x,p,q);
where x is the input signal, p is the desired sampling rate and q is the current sampling rate. The output y is then the input x, resampled at p/q times the original rate.
Now how can we double the speed? - If we set p=2 and q=1 we get a resampled signal at double the sampling rate, i.e. we have twice as many samples. If you play the WAV with the same command, then the signal takes twice as long to play, so we divided the play speed by 2.
To double the play speed, we'll have to do the opposite and set p=1 and q=2:
x = wavread('D:\\Sample.wav');
y = resample(x,1,2);
--
As requested in an edit, it is of course possible to add zeros e.g. at every second position to change the sampling rate. Note that this creates high-frequency noise, which is usually removed by FIR-filtering. The procedure however is quite easy:
x = x(:).'; % Make x a row vector
y = [x; zeros(1,numel(x))]; % add one zero between elements
y = y(:);
The last row does the magic here: it takes the columns of y and stacks them above each other. As x was one row, and we added a row of zeros below that, the resulting y will be a row containing all elements of x with zeros between the values.
As you specifically wanted to use reshape, we can do the same using reshape:
x = x(:).'; % Make x a row vector
y = [x; zeros(1,numel(x))]; % add one zero between elements
y = reshape(y,[],1);
--
To merge two WAV files into one, we can simply concatenate the vectors using the [...] notation or the cat function.
x = wavread('D:\\Sample1.wav');
y = wavread('D:\\Sample2.wav');
z = [x,y];
z = cat(2,x,y);

Related

Change limits of a time signal at matlab

i am a beginner an have an easy question. I have a signal on y-axis and time signal on x-axis. I need to change boundaries of the time signal. It's between 0 and 18 seconds, but i want to change in between 5 and 10. I used already "xlim", it work for plot but actually i want to create a new time signal.
Any idea? Thank you!
Since you didn't post your code I'll need to make some assumptions. I'll assume you have your data stored in row vectors x and y and that x is uniform and monotonically increasing.
1. Construct a truncated signal using logical indexing.
index = x >= 5 & x <= 10;
x_new = x(index);
y_new = y(index);
plot(x_new, y_new);
The above only takes a subset of the data, if x doesn't contain 5 and 10 then the plot will be truncated. If you're dealing with time series data this is probably the most reasonable approach since it doesn't change the sampling rate.
2. Re-sampling the signal between 5 and 10 using interpolation.
num_samples = 100;
x_new = linspace(5, 10, num_samples);
y_new = interp1(x, y, x_new);
plot(x_new, y_new);
This may not exactly match the original plot since the original samples aren't guaranteed to be included. However it will exactly span the desired domain.
3. If you don't care that x is uniform but want to create a plot that exactly matches the original then you can append the bounds of x to the subset from method 1 and use interp1 to sample y.
x_min = 5; x_max = 10;
index = x > x_min & x < x_max;
x_new = [x_min, x(index), x_max];
y_new = interp1(x, y, x_new);
plot(x_new, y_new);
Example
Example demonstrating the differences between the different methods, plotted with additional offset and markings at samples for clarity.
If you want to delete the elements n from the back of a vector y and store the result in y_cut, you should be able to do that with:
y_cut = y(1:end-n);
It would be important to know in which form you stored the time signal.
If you have one value for each second the solution would be:
y_cut = y(5:10);
But I assume you're storing your y-values as samples with a given sample rate fs
One second would then be equal to fs (for example 44100 for a CD audio file, resulting in 44100 samples per second) and the solution would be:
y_cut = y(5*fs:10*fs);
I hope I could help.
Cheers,
Simon

What is the reverse process of the repmat or repemel command?

I have a matrix of 50-by-1 that is demodulated data. As this matrix has only one element in each row, I want to repeat this only element 16 times in each row so the matrix become 50 by 16. I did it using the repmat(A,16) command in Matlab. Now at receiving end noise is also added in matrix of 50 by 16. I want to get it back of 50 by 1 matrix. How can I do this?
I tried averaging of all rows but it is not a valid method. How can I know when an error is occurring in this process?
You are describing a problem of the form y = A * x + n, where y is the observed data, A is a known linear transform, and n is noise. The least squares estimate is the simplest estimate of the unknown vector x. The keys here are to express the repmat() function as a matrix and the observed data as a vector (i.e., a 50*16x1 vector rather than a 50x16 matrix).
x = 10 * rand(50,1); % 50x1 data vector;
A = repmat(eye(length(x)),[16,1]); % This stacks 16 replicas of x.
n = rand(50*16,1); % Noise
y = A * x + n; % Observed data
xhat = A \ y; % Least squares estimate of x.
As for what the inverse (what I assume you mean by 'reverse') of A is, it doesn't have one. If you look at its rank, you'll see it is only 50. The best you can do is to use its pseudoinverse, which is what the \ operator does.
I hope the above helps.

Convolution of multiple 1D signals in a 2D matrix with multiple 1D kernels in a 2D matrix

I have a randomly defined H matrix of size 600 x 10. Each element in this matrix H can be represented as H(k,t). I obtained a speech spectrogram S which is 600 x 597. I obtained it using Mel features, so it should be 40 x 611 but then I used a frame stacking concept in which I stacked 15 frames together. Therefore it gave me (40x15) x (611-15+1) which is 600 x 597.
Now I want to obtain an output matrix Y which is given by the equation based on convolution Y(k,t) = ∑ H(k,τ)S(k,t-τ). The sum goes from τ=0 to τ=Lh-1. Lh in this case would be 597.
I don't know how to obtain Y. Also, my doubt is the indexing into both H and S when computing the convolution. Specifically, for Y(1,1), we have:
Y(1,1) = H(1,0)S(1,1) + H(1,1)S(1,0) + H(1,2)S(1,-1) + H(1,3)S(1,-2) + ...
Now, there is no such thing as negative indices in MATLAB - for example, S(1,-1) S(1,-2) and so on. So, what type of convolution should I use to obtain Y? I tried using conv2 or fftfilt but I think that will not give me Y because Y must also be the size of S.
That's very easy. That's a convolution on a 2D signal only being applied to 1 dimension. If we assume that the variable k is used to access the rows and t is used to access the columns, you can consider each row of H and S as separate signals where each row of S is a 1D signal and each row of H is a convolution kernel.
There are two ways you can approach this problem.
Time domain
If you want to stick with time domain, the easiest thing would be to loop over each row of the output, find the convolution of each pair of rows of S and H and store the output in the corresponding output row. From what I can tell, there is no utility that can convolve in one dimension only given an N-D signal.... unless you go into frequency domain stuff, but let's leave that for later.
Something like:
Y = zeros(size(S));
for idx = 1 : size(Y,1)
Y(idx,:) = conv(S(idx,:), H(idx,:), 'same');
end
For each row of the output, we perform a row-wise convolution with a row of S and a row of H. I use the 'same' flag because the output should be the same size as a row of S... which is the bigger row.
Frequency domain
You can also perform the same computation in frequency domain. If you know anything about the properties of convolution and the Fourier Transform, you know that convolution in time domain is multiplication in the frequency domain. You take the Fourier Transform of both signals, multiply them element-wise, then take the Inverse Fourier Transform back.
However, you need to keep the following intricacies in mind:
Performing a full convolution means that the final length of the output signal is length(A)+length(B)-1, assuming A and B are 1D signals. Therefore, you need to make sure that both A and B are zero-padded so that they both match the same size. The reason why you make sure that the signals are the same size is to allow for the multiplication operation to work.
Once you multiply the signals in the frequency domain then take the inverse, you will see that each row of Y is the full length of the convolution. To ensure that you get an output that is the same size as the input, you need to trim off some points at the beginning and at the end. Specifically, since each kernel / column length of H is 10, you would have to remove the first 5 and last 5 points of each signal in the output to match what you get in the for loop code.
Usually after the inverse Fourier Transform, there are some residual complex coefficients due to the nature of the FFT algorithm. It's good practice to use real to remove the complex valued parts of the results.
Putting all of this theory together, this is what the code would look like:
%// Define zero-padded H and S matrices
%// Rows are the same, but columns must be padded to match point #1
H2 = zeros(size(H,1), size(H,2)+size(S,2)-1);
S2 = zeros(size(S,1), size(H,2)+size(S,2)-1);
%// Place H and S at the beginning and leave the rest of the columns zero
H2(:,1:size(H,2)) = H;
S2(:,1:size(S,2)) = S;
%// Perform Fourier Transform on each row separately of padded matrices
Hfft = fft(H2, [], 2);
Sfft = fft(S2, [], 2);
%// Perform convolution
Yfft = Hfft .* Sfft;
%// Take inverse Fourier Transform and convert to real
Y2 = real(ifft(Yfft, [], 2));
%// Trim off unnecessary values
Y2 = Y2(:,size(H,2)/2 + 1 : end - size(H,2)/2 + 1);
Y2 should be the convolved result and should match Y in the previous for loop code.
Comparison between them both
If you actually want to compare them, we can. What we'll need to do first is define H and S. To reconstruct what I did, I generated random values with a known seed:
rng(123);
H = rand(600,10);
S = rand(600,597);
Once we run the above code for both the time domain version and frequency domain version, let's see how they match up in the command prompt. Let's show the first 5 rows and 5 columns:
>> format long g;
>> Y(1:5,1:5)
ans =
1.63740867892464 1.94924208172753 2.38365646354643 2.05455605619097 2.21772526557861
2.04478411247085 2.15915645246324 2.13672842742653 2.07661341840867 2.61567534623066
0.987777477630861 1.3969752201781 2.46239452105228 3.07699790208937 3.04588738611503
1.36555260994797 1.48506871890027 1.69896157726456 1.82433906982894 1.62526864072424
1.52085236885395 2.53506897420001 2.36780282057747 2.22335617436888 3.04025523335182
>> Y2(1:5,1:5)
ans =
1.63740867892464 1.94924208172753 2.38365646354643 2.05455605619097 2.21772526557861
2.04478411247085 2.15915645246324 2.13672842742653 2.07661341840867 2.61567534623066
0.987777477630861 1.3969752201781 2.46239452105228 3.07699790208937 3.04588738611503
1.36555260994797 1.48506871890027 1.69896157726456 1.82433906982894 1.62526864072424
1.52085236885395 2.53506897420001 2.36780282057747 2.22335617436888 3.04025523335182
Looks good to me! As another measure, let's figure out what the largest difference is between one value in Y and a corresponding value in Y2:
>> max(abs(Y(:) - Y2(:)))
ans =
5.32907051820075e-15
That's saying that the max error seen between both outputs is in the order of 10-15. I'd say that's pretty good.

How to find the frequency response of the Rosenberg Glottal Model

Is there an easy way to calculate the frequency response of the following function?
I tried using heaviside function but with no luck.
Basically I want to write a function to return the frequency response based on input N1 and N2 and also the number of points (lets say x) between 0 and pi
The output would be a vector which returns x values for the frequency response for corresponding frequencies => 0:pi/x:pi
Assuming that N1 + N2 < num_points, where num_points is the length of the sequence, you can simply write the function like so:
function [gr] = rosenburg(N1, N2, num_points)
gr = zeros(num_points,1);
range1 = 0:N1;
range2 = N1+1:N1+N2;
gr(range1+1) = 0.5*(1 - cos(pi*range1/N1));
gr(range2+1) = cos(pi*(range2-N1) / (2*N2));
end
The function prototype, rosenburg takes in N1, N2 and the total number of points you want this function to take in, num_points. How this code works is that we first allocate an array that is all zeroes of size num_points. We then compute two linear ranges: One from 0 <= n <= N1 and the other from N1 < n <= N2. Note that the second range starts by offsetting N1 by 1 because we have already computed the value at n = N1. Once we compute these ranges, we simply apply the right relationship in the right ranges. Note that when I'm assigning the relationships to the correct intervals in the array, I need to offset by 1 because MATLAB begins indexing arrays at index 1. The rest of the values are zero due to the initialization at the beginning of the function.
Now, if you want to find the frequency response of this signal, just use fft which is the Fast Fourier Transform. It's the classic method to find the frequency domain version of a discrete input signal on a numerical basis. As such, once you create your signal using the rosenburg function, then throw this into the FFT function. How you call it is like so:
X = fft(gr);
This computes the N point FFT, where N is the length of the signal gr. Alternatively, you can provide the number of points you want to compute the FFT for. Specifically:
X = fft(gr, N);
Basically, the higher N is, the finer or granular the frequency components will be. Note that the frequency axis is normalized between 0 to 2*pi, and so the higher N is, the finer resolution you will have between neighbouring points on the axis. Specifically, each point on this axis has the following frequency:
w = i*(2*pi)/x;
i would be the index on the x-axis (0, 1, 2, ..., num_points-1) and x would be the total number of points for the FFT. Normally, people show the spectrum between -pi <= w <= pi, and so some people apply fftshift to shift the spectrum so that the DC component is located at the centre of the spectrum, which is how we naturally perceive the spectrum to be.
When you say "frequency response", I believe you are referring to the magnitude, and so use abs to calculate the complex magnitude of each value, as the fft is generally complex valued. Therefore, assuming that you wish to compute the FFT to be as many points as the length of your signal, and let's say we choose N1 = 4, N2 = 8 and we want 64 points, and we want to plot the spectrum. Simply do this:
gr = rosenburg(4, 8, 64);
X = fft(gr);
Xshift = fftshift(X);
plot(linspace(-pi,pi,64), abs(Xshift));
grid;
The above code will shift the spectrum, then plot its magnitude between -pi to pi. This is what I get:
As an illustration, this is what the spectrum looks like before we apply fftshift:
Here's the code to generate the above figure:
plot(linspace(0,2*pi,64), abs(X));
grid;
You can see that the spectra is symmetric. Right at the frequency pi, you can see that it is mirror reflected, which makes sense as the range from pi to 2*pi, precisely maps to -pi to 0. Because the signal is real, the spectrum is symmetric. In fact, we can call this signal Hermitian symmetric. Obviously, the frequency components are a bit sparsely spaced. It may be better to increase the total number of points to something like 256. This is what I get when I change the number of points to 256:
Pretty smooth! Now, if you want to extract the frequency components from 0 to pi, you need to extract half of the frequency decomposition that is stored in X. Therefore, you would simply do:
f = X(1:numel(X)/2);
numel determines how many elements are in an array or matrix. However, remember that each frequency point was defined as:
w = i*(2*pi)/x
You specifically want:
w = i*pi/x
As such, you'll need to compute the FFT at twice the size of your signal first, then extract half of the spectra in the same way. For example, for 64 points:
gr = rosenburg(4, 8, 64);
X = fft(gr, 128);
f = X(1:numel(X)/2);
This should hopefully get you started. Good luck!

Matlab - Signal Noise Removal

I have a vector of data, which contains integers in the range -20 20.
Bellow is a plot with the values:
This is a sample of 96 elements from the vector data. The majority of the elements are situated in the interval -2, 2, as can be seen from the above plot.
I want to eliminate the noise from the data. I want to eliminate the low amplitude peaks, and keep the high amplitude peak, namely, peaks like the one at index 74.
Basically, I just want to increase the contrast between the high amplitude peaks and low amplitude peaks, and if it would be possible to eliminate the low amplitude peaks.
Could you please suggest me a way of doing this?
I have tried mapstd function, but the problem is that it also normalizes that high amplitude peak.
I was thinking at using the wavelet transform toolbox, but I don't know exact how to reconstruct the data from the wavelet decomposition coefficients.
Can you recommend me a way of doing this?
One approach to detect outliers is to use the three standard deviation rule. An example:
%# some random data resembling yours
x = randn(100,1);
x(75) = -14;
subplot(211), plot(x)
%# tone down the noisy points
mu = mean(x); sd = std(x); Z = 3;
idx = ( abs(x-mu) > Z*sd ); %# outliers
x(idx) = Z*sd .* sign(x(idx)); %# cap values at 3*STD(X)
subplot(212), plot(x)
EDIT:
It seems I misunderstood the goal here. If you want to do the opposite, maybe something like this instead:
%# some random data resembling yours
x = randn(100,1);
x(75) = -14; x(25) = 20;
subplot(211), plot(x)
%# zero out everything but the high peaks
mu = mean(x); sd = std(x); Z = 3;
x( abs(x-mu) < Z*sd ) = 0;
subplot(212), plot(x)
If it's for demonstrative purposes only, and you're not actually going to be using these scaled values for anything, I sometimes like to increase contrast in the following way:
% your data is in variable 'a'
plot(a.*abs(a)/max(abs(a)))
edit: since we're posting images, here's mine (before/after):
You might try a split window filter. If x is your current sample, the filter would look something like:
k = [L L L L L L 0 0 0 x 0 0 0 R R R R R R]
For each sample x, you average a band of surrounding samples on the left (L) and a band of surrounding samples on the right. If your samples are positive and negative (as yours are) you should take the abs. value first. You then divide the sample x by the average value of these surrounding samples.
y[n] = x[n] / mean(abs(x([L R])))
Each time you do this the peaks are accentuated and the noise is flattened. You can do more than one pass to increase the effect. It is somewhat sensitive to the selection of the widths of these bands, but can work. For example:
Two passes:
What you actually need is some kind of compression to scale your data, that is: values between -2 and 2 are scale by a certain factor and everything else is scaled by another factor. A crude way to accomplish such a thing, is by putting all small values to zero, i.e.
x = randn(1,100)/2; x(50) = 20; x(25) = -15; % just generating some data
threshold = 2;
smallValues = (abs(x) <= threshold);
y = x;
y(smallValues) = 0;
figure;
plot(x,'DisplayName','x'); hold on;
plot(y,'r','DisplayName','y');
legend show;
Please do not that this is a very nonlinear operation (e.g. when you have wanted peaks valued at 2.1 and 1.9, they will produce very different behavior: one will be removed, the other will be kept). So for displaying, this might be all you need, for further processing it might depend on what you are trying to do.
To eliminate the low amplitude peaks, you're going to equate all the low amplitude signal to noise and ignore.
If you have any apriori knowledge, just use it.
if your signal is a, then
a(abs(a)<X) = 0
where X is the max expected size of your noise.
If you want to get fancy, and find this "on the fly" then, use kmeans of 3. It's in the statistics toolbox, here:
http://www.mathworks.com/help/toolbox/stats/kmeans.html
Alternatively, you can use Otsu's method on the absolute values of the data, and use the sign back.
Note, these and every other technique I've seen on this thread is assuming you are doing post processing. If you are doing this processing in real time, things will have to change.