Sound source localization with Matlab

Sound source localization with Matlab - matlab

I have three recordings of a signal taken with an array of three hydrophones (one sound source).
I would like to estimate the source localization using the time of arrival differences for the three recordings. In Matlab, I started the following, by estimating the time of arrival differences with the GCC-PHAT algorithm (Generalized cross-correlation):
[sig1, fs] = audioread('signal1.wav');
sig2 = audioread('signal2.wav');
sig3 = audioread('signal3.wav');
refsig = sig1;
[tau_est, R, lags] = gccphat([sig2,sig3],refsig, fs);
disp(tau_est * fs)
It gives me the time of arrival differences of signals 2 and 3 compared to signal 1 (tau).
Now I would like to get the direction of arrival estimates (DOAs) and proceed with triangulation to assess source position.
Any help will be greatly appreciated!

Some additional information is needed to convert the time-of-arrival-difference estimates to directions of arrival:
Spatial coordinates of each mic / hydrophone in the array.
Speed of sound. Speed of sound in seawater varies significantly with temperature, salinity, and depth; see this Wikipedia article for an empirical formula.
Once you know this, use equation (1) from
https://people.engr.ncsu.edu/kay/msf/sound.htm to estimate the angle of arrival θ for each mic pair:
θ = sin⁻¹(speed_of_sound * time_difference_of_arrival / distance_between_mics)
A limitation is that the angle of arrival estimated from a single mic pair has front-back ambiguity. For instance an angle at θ = 30 degrees would have the same time difference of arrival as θ = 150 degrees. You can think of the sin⁻¹ inverse function as multivalued to represent this ambiguity. Also, keep in mind that the formula derivation assumes plane wave propagation from a source at an infinite distance, and it ignores shadowing and diffraction effects around the mic, so obviously there is some inaccuracy. But it is simple and often works decently well.
From there, you can combine angle estimates from the three mic pairs to triangulate the source position.

Related

Using the output of an FFT on a wav file with a single note, find out the frequency of the note(to determine it) [duplicate]

Yesterday I finalised the code for detecting the audio energy of a track displayed over time, which I will eventually use as part of my audio thumbnailing project.
However I would also like a method that can detect the pitch of a track displayed over time, so I have 2 options from which to base my research upon.
[y, fs, nb] = wavread('Three.wav'); %# Load the signal into variable y
frameWidth = 441; %# 10 msec
numSamples = length(y); %# Number of samples in y
numFrames = floor(numSamples/frameWidth); %# Number of full frames in y
energy = zeros(1,numFrames); %# Initialize energy
for frame = 1:numFrames %# Loop over frames
startSample = (frame-1)*frameWidth+1; %# Starting index of frame
endSample = startSample+frameWidth-1; %# Ending index of frame
energy(frame) = sum(y(startSample:endSample).^2); %# Calculate frame energy
end
That is the correct code for the energy method, and after researching, I found that I would need to use a Discrete Time Fourier Transform to find the current pitch of each frame in the loop.
I thought that the process would be as easy as modifying the final lines of the code to include the "fft" MATLAB command for calculating Discrete Fourier Transforms but all I am getting back is errors about an unbalanced equation.
Help would be much appreciated, even if it's just a general pointer in the right direction. Thank you.

Determining pitch is a lot more complicated than just applying a DFT. It also depends on the nature of the source - different algorithms are appropriate for speech versus a musical instrument, for example. If this is a music track, as your question seems to imply, then you're probably out of luck as there is no obvious way of determining a single pitch value for multiple musical instruments playing together (how would you even define pitch in this context ?). Maybe you could be more specific about what you are trying to do - perhaps a power spectrum would be more useful than trying to determine an arbitrary pitch ?

fitness in inverted pendulum

What is the fitness function used to solve an inverted pendulum ?
I am evolving neural networks with genetic algorithm. And I don't know how to evaluate each individual.
I tried minimize the angle of pendulum and maximize distance traveled at the end of evaluation time (10 s), but this won't work.
inputs for neural network are: cart velocity, cart position, pendulum angular velocity and pendulum angle at time (t). The output is the force applied at time (t+1)
thanks in advance.

I found this paper which lists their objective function as being:
Defined as:
where "Xmax = 1.0, thetaMax = pi/6, _X'max = 1.0, theta'Max =
3.0, N is the number of iteration steps, T = 0.02 * TS and Wk are selected positive weights." (Using specific values for angles, velocities, and positions from the paper, however, you will want to use your own values depending on the boundary conditions of your pendulum).
The paper also states "The first and second terms determine the accumulated sum of
normalised absolute deviations of X1 and X3 from zero and the third term when minimised, maximises the survival time."
That should be more than enough to get started with, but i HIGHLY recommend you read the whole paper. Its a great read and i found it quite educational.
You can make your own fitness function, but i think the idea of using a position, velocity, angle, and the rate of change of the angle the pendulum is a good idea for the fitness function. You can, however, choose to use those variables in very different ways than the way the author of the paper chose to model their function.
It wouldn't hurt to read up on harmonic oscillators either. They take the general form:
mx" + Bx' -kx = Acos(w*t)
(where B, or A may be 0 depending on whether or not the oscillator is damped/undamped or driven/undriven respectively).

Time delay estimation using crosscorrelation

I have two sensors seperated by some distance which receive a signal from a source. The signal in its pure form is a sine wave at a frequency of 17kHz. I want to estimate the TDOA between the two sensors. I am using crosscorrelation and below is my code
x1; % signal as recieved by sensor1
x2; % signal as recieved by sensor2
len = length(x1);
nfft = 2^nextpow2(2*len-1);
X1 = fft(x1);
X2 = fft(x2);
X = X1.*conj(X2);
m = ifft(X);
r = [m(end-len+1) m(1:len)];
[a,i] = max(r);
td = i - length(r)/2;
I am filtering my signals x1 and x2 by removing all frequencies below 17kHz.
I am having two problems with the above code:
1. With the sensors and source at the same place, I am getting different values of 'td' at each time. I am not sure what is wrong. Is it because of the noise? If so can anyone please provide a solution? I have read many papers and went through other questions on stackoverflow so please answer with code along with theory instead of just stating the theory.
2. The value of 'td' is sometimes not matching with the delay as calculated using xcorr. What am i doing wrong? Below is my code for td using xcorr
[xc,lags] = xcorr(x1,x2);
[m,i] = max(xc);
td = lags(i);

One problem you might have is the fact that you only use a single frequency. At f = 17 kHz, and an estimated speed-of-sound v = 340 m/s (I assume you use ultra-sound), the wavelength is lambda = v / f = 2 cm. This means that your length measurement has an unambiguity range of 2 cm (sorry, cannot find a good link, google yourself). This means that you already need to know your distance to better than 2 cm, before you can use the result of your measurement to refine the distance.
Think of it in another way: when taking the cross-correlation between two perfect sines, the result should be a 'comb' of peaks with spacing equal to the wavelength. If they overlap perfectly, and you displace one signal by one wavelength, they still overlap perfectly. This means that you first have to know which of these peaks is the right one, otherwise a different peak can be the highest every time purely by random noise. Did you make a plot of the calculated cross-correlation before trying to blindly find the maximum?
This problem is the same as in interferometry, where it is easy to measure small distance variations with a resolution smaller than a wavelength by measuring phase differences, but you have no idea about the absolute distance, since you do not know the absolute phase.
The solution to this is actually easy: let your source generate more frequencies. Even using (band-limited) white-noise should work without problems when calculating cross-correlations, and it removes the ambiguity problem. You should see the white noise as a collection of sines. The cross-correlation of each of them will generate a comb, but with different spacing. When adding all those combs together, they will add up significantly only in a single point, at the delay you are looking for!

White Noise, Maximum Length Sequency or other non-periodic signals should be used as the test signal for time delay measurement using cross correleation. This is because non-periodic signals have only one cross correlation peak and there will be no ambiguity to determine the time delay. It is possible to use the burst type of periodic signals to do the job, but with degraded SNR. If you have to use a continuous periodic signal as the test signal, then you can only measure a time delay within one period of the periodic test signal. This should explain why, in your case, using lower frequency sine wave as the test signal works while using higher frequency sine wave does not. This is demonstrated in these videos: https://youtu.be/L6YJqhbsuFY, https://youtu.be/7u1nSD0RlwY .

Mean-Squared Displacement (MATLAB)

Please can you help me understand how to calculate the Mean-Squared Displacement for a single particle moving randomly within a given period of time. I have read a lot of articles on this (including Saxton,1991,Single-Particle Tracking: The Distribution of Diffusion Coefficients), but still confused (not getting the right answer).
Let me start by showing you how I do it and please correct me if I'm wrong:
The way I'm doing it is as follows:
1.Run the program from t=0 to t=100
2.Calculate the displacement, (s(t)-s(t+tau)), at each timestep (ie. at t=1,2,3,...100) and store it in a vector
3.Square the answer to number 2
4.find the mean to the answer of 3
In essence, this is what I'm doing in Matlab
%Initialise the lattice with a square consisting of 16 nonzero lattice sites then proceed %as follows to calculate the MSD:
for t=1:tend
% Allow the particle to move randomly in the lattice. Then do the following
[row,col]=find(lattice>0);
centroid=mean([row col]);
xvec=[xvec centroid(2)];
yvec=[yvec centroid(1)];
k=length(xvec)-1; % Time
dt=1;
diffx = xvec(1:k) - xvec((1+dt):(k+dt));
diffy = yvec(1:k) - yvec((1+dt):(k+dt));
xsquare = diffx.^2;
ysquare = diffy.^2;
MSD=mean(xsquare+ysquare);
end
I'm trying to find the MSD in order to compute the diffusion co-efficient. Note that I'm modelling a collection of lattice sites (16) to represent a single particle (more biologically realistic), instead of just one. I have been brief with the comment within the for loop as it is quite long, but I'm happy to send it to you.
So far, I'm getting very small MSD values (in the range of 0.001-1), whereas I'm supposed to get values in the range of (10-50). The particle moves very large distances so surely my range of 0.001-1 cannot be right!
This is an extract from the article which I'm trying to reproduce their figure:
" We began by running some simulations in 1D for a single
cell. We allowed the cell to move for a given number of
Monte Carlo time steps (MCS), worked out the mean square
distance traveled in that time, repeated this process 500
times, and evaluate the mean squared distance for this t.
We then repeated this process ten times to get the mean of
. The reason for this choice of repetitions was to
keep the time required to run the simulations within a reasonable
level yet ensuring that the standard deviation of the
mean was relatively small (<7%)".
You can access the article here "From discrete to a continuous model of biological cell movement, 2004, by Turner et al., Physical Review E".
Any hints are greatly appreciated.

How many dimensions does the particle move along ?
I don't have Matlab right now, but here is how I'd do that over one dimension :
% pos is the vector of positions
delta = pos(2:100) - pos(1:99);
meanSquared = mean(delta .* delta);

First of all, why have a particle cover multiple lattice sites? What counts for MSD, in the end, is the displacement of the centroid, which can be represented as a point. If your particle (or cell) is large, or only takes large steps, you can always just make a wider grid. Also, if you're trying to reproduce a figure from somewhere else, you should really use the same algorithm.
For your Monte Carlo simulation, what do you do? If all you really want is get a displacement, you can generate a bunch of random movement vectors in one go (using rand or randi), and use cumsum to calculate the positions. Also, have you plotted your random walks to make sure the data is sensible?
Then, your code looks a bit funny (see comments). Why don't you just use the code provided in this answer to calculate MSD from the positions?
for t=1:tend
% Allow the particle to move randomly in the lattice. Then do the following
[row,col]=find(lattice>0); %# what do you do this for?
centroid=mean([row col]);
xvec=[xvec centroid(2)];
yvec=[yvec centroid(1)]; %# till here, I have no idea what you want to do
k=length(xvec)-1; % Time %# you should subtract dt here
dt=1; %# dt should depend on t!
diffx = xvec(1:k) - xvec((1+dt):(k+dt));
diffy = yvec(1:k) - yvec((1+dt):(k+dt));
xsquare = diffx.^2;
ysquare = diffy.^2;
MSD=mean(xsquare+ysquare);
end

Calculating audio pitch in MATLAB?

Yesterday I finalised the code for detecting the audio energy of a track displayed over time, which I will eventually use as part of my audio thumbnailing project.
However I would also like a method that can detect the pitch of a track displayed over time, so I have 2 options from which to base my research upon.
[y, fs, nb] = wavread('Three.wav'); %# Load the signal into variable y
frameWidth = 441; %# 10 msec
numSamples = length(y); %# Number of samples in y
numFrames = floor(numSamples/frameWidth); %# Number of full frames in y
energy = zeros(1,numFrames); %# Initialize energy
for frame = 1:numFrames %# Loop over frames
startSample = (frame-1)*frameWidth+1; %# Starting index of frame
endSample = startSample+frameWidth-1; %# Ending index of frame
energy(frame) = sum(y(startSample:endSample).^2); %# Calculate frame energy
end
That is the correct code for the energy method, and after researching, I found that I would need to use a Discrete Time Fourier Transform to find the current pitch of each frame in the loop.
I thought that the process would be as easy as modifying the final lines of the code to include the "fft" MATLAB command for calculating Discrete Fourier Transforms but all I am getting back is errors about an unbalanced equation.
Help would be much appreciated, even if it's just a general pointer in the right direction. Thank you.

Determining pitch is a lot more complicated than just applying a DFT. It also depends on the nature of the source - different algorithms are appropriate for speech versus a musical instrument, for example. If this is a music track, as your question seems to imply, then you're probably out of luck as there is no obvious way of determining a single pitch value for multiple musical instruments playing together (how would you even define pitch in this context ?). Maybe you could be more specific about what you are trying to do - perhaps a power spectrum would be more useful than trying to determine an arbitrary pitch ?

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse