Acoustic Echo Cancellation algorithm in Matlab

My colleague and I are developing a sound and speech processing module on an Analog Devices DSP. Because of the proximity of our single microphone and speaker, we have been experiencing significant echo. We want to implement an NLMS-based algorithm to reduce this echo.
I first wanted to implement and test the algorithm in Matlab, but I am still having some issues. I think I might have a theoretical problem in my algorithm: I have a hard time understanding what the "desired signal" would be in the algorithm, since I don't have access to an uncorrupted signal.
Here is an overview of my naive way to implement this in Matlab.
Simulink diagram here
Link to Simulink code (.slx)
Right now the model can't compile because of an "algebraic loop error" in Simulink, but I have a feeling there is more to this problem.
Any help would be appreciated.

The model you have is not fully correct. For acoustic echo cancellation, the adaptive filter is used to model the room: you are identifying the room's characteristics with it. Once you have done this, you can use the adaptive filter to estimate the part of the far-end signal from the loudspeaker that leaks back into the microphone, and subtract that from the microphone signal to remove the echo.
For your adaptive filter, the input should be the far-end signal, i.e. the signal going to the loudspeaker in the room. Your desired signal is the signal coming out of the microphone in the room. The microphone signal contains the voices of the people in the room plus a portion of the sound from the loudspeaker, which is the echo.
Sound from far end ---------->| In                Out |----> (you can ignore this)
                              |   Adaptive Filter     |
Sound from local microphone ->| Desired         Error |----> echo-free signal
In this model, the Error output of the adaptive filter is your desired echo-free signal. This is because the error is computed by subtracting the adaptive filter's output from the desired signal, which effectively removes the echo.
To simulate this system in Simulink you need a filter to represent the room. You can use an ordinary FIR filter for this; you should be able to find room impulse responses online. These are usually long (~1000 taps), slowly decaying impulse responses. Your audio source can represent the signal from the loudspeaker. Feed the same audio signal into this room-response filter to obtain your desired signal; feeding both into the adaptive filter will make it adapt to the room-response filter.
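Since the structure above is language-agnostic, here is a minimal NumPy sketch of the same setup (the room response, lengths, and step size are made-up illustration values): the far-end signal drives both a simulated room (an FIR filter) and an NLMS filter, and the NLMS error converges to the echo-free residual.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up "room": a short, decaying FIR impulse response
taps = 64
room = rng.standard_normal(taps) * np.exp(-np.arange(taps) / 8.0)

n = 20000
far_end = rng.standard_normal(n)           # signal sent to the loudspeaker
mic = np.convolve(far_end, room)[:n]       # echo picked up by the microphone

# NLMS: input = far-end signal, desired = microphone signal
w = np.zeros(taps)
mu, eps = 0.5, 1e-6
buf = np.zeros(taps)                       # recent input samples, newest first
err = np.zeros(n)
for i in range(n):
    buf = np.roll(buf, 1)
    buf[0] = far_end[i]
    e = mic[i] - w @ buf                   # error = echo-free residual
    w += (mu / (eps + buf @ buf)) * e * buf
    err[i] = e

# After convergence the residual is tiny compared with the raw echo
print(np.mean(err[-2000:] ** 2) < 1e-3 * np.mean(mic[-2000:] ** 2))
```

With a near-end talker present, that talker's voice would simply remain in the error signal, which is exactly what you want.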

Related

Which input format is the best for sound recognition in recurrent neural networks?

I want to do sound or pitch recognition with a recurrent deep neural network, and I'm wondering which input will give the best results.
Should I feed the DNN with raw amplitudes or with the result of an FFT (Fast Fourier Transform)?
Is there any other format known to produce good results and fast learning?
While MFCCs have indeed been used in music information retrieval research (for genre classification etc.), in this case (pitch detection) you may want to use a semi-tone filterbank or constant-Q transform as a first information-reduction step, since these transformations match musical pitch better.
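As a rough illustration of the semi-tone filterbank idea (a NumPy sketch with made-up ranges, not a production front end): triangular filters centered on semitone frequencies reduce an FFT magnitude spectrum to one energy per candidate pitch.

```python
import numpy as np

sr = 16000
# Semitone center frequencies from MIDI note 45 (110 Hz) to 93 (1760 Hz)
midi = np.arange(45, 94)
centers = 440.0 * 2 ** ((midi - 69) / 12.0)

# One second of a pure A4 (440 Hz) test tone
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)

spec = np.abs(np.fft.rfft(tone))
freqs = np.fft.rfftfreq(sr, 1.0 / sr)

# One triangular filter per semitone, reaching to its neighbors
energies = np.zeros(len(centers))
for i, c in enumerate(centers):
    lo, hi = c * 2 ** (-1 / 12), c * 2 ** (1 / 12)
    w = np.clip(np.minimum((freqs - lo) / (c - lo),
                           (hi - freqs) / (hi - c)), 0, None)
    energies[i] = np.sum(w * spec)

detected = centers[np.argmax(energies)]
print(round(detected, 1))  # the filter with the most energy sits at 440.0 Hz
```

A vector of such semitone energies per frame is a far smaller RNN input than raw samples or a full FFT, while staying aligned with musical pitch.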
But I think it's also worth trying to use the audio samples directly with RNNs, in case you have a huge number of samples. In theory, the RNNs should be able to learn the wave patterns corresponding to particular pitches.
From your description, it's not entirely clear what type of "pitch recognition" you're aiming for: monophonic instruments (constant timbre, only one pitch sounding at a time)? Polyphonic (constant timbre, but multiple pitches may sound simultaneously)? Multiple instruments playing together (multiple timbres, multiple pitches)? Or even a full mix with both tonal and percussive sounds? The difficulty of these use cases roughly increases in the order listed, so you may want to start with monophonic pitch recognition first.
To obtain the necessary amount of training examples, you could use a physical model or a multi-sampled virtual instrument to generate the audio samples for particular pitches in a controlled way. This way, you can quickly create your training material instead of recording it and labeling it manually. But I would advise you to at least add some background noise (random noise, or very low-level sounds from different recordings) to the created audio samples, or your data may be too artificial and lead to a model that doesn't work well once you want to use it in practice.
Here is a paper that might give you some ideas on the subject:
An End-to-End Neural Network for Polyphonic Piano Music Transcription
(Siddharth Sigtia, Emmanouil Benetos, and Simon Dixon)
https://arxiv.org/pdf/1508.01774.pdf
The Mel-frequency cepstrum is generally used for speech recognition.
Mozilla DeepSpeech uses MFCCs as the input to its DNN.
For a Python implementation you can use the python-speech-features lib.

Manual pitch estimation of a speech signal

I am new to speech processing, so please forgive my ignorance. I was given a short speech signal (10 s) and asked to manually annotate its pitch using MATLAB or the Wavesurfer software. How do I find the pitch of a speech signal? Is there any theoretical resource that covers this problem? I tried to plot the pitch contour of the signal using Wavesurfer. Is that the right approach?
Edit 1: My task is to apply various pitch detection algorithms to our data and compare their accuracies, so the manually annotated pitch acts as the reference.
UPDATE 1: I obtained the GCIs (Glottal Closure Instants) by differentiating the EGG signal (dEGG); the peaks in dEGG are the GCIs. The time interval between two successive GCIs is the pitch period (in seconds), and the inverse of the pitch period is the pitch (in Hz).
UPDATE 2: SIGMA is a well-known algorithm for automatic GCI detection.
Thanks everyone.
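The GCI-to-pitch computation described in UPDATE 1 amounts to a couple of lines (the GCI times below are made-up illustration values):

```python
import numpy as np

# Hypothetical GCI times in seconds, e.g. peaks picked from a dEGG signal
gci = np.array([0.100, 0.105, 0.110, 0.1149, 0.1199])

periods = np.diff(gci)   # pitch periods in seconds between successive GCIs
f0 = 1.0 / periods       # instantaneous pitch in Hz (~200 Hz for 5 ms periods)
print(np.round(f0))
```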
Usually ground truth is obtained on a signal accompanied by an EGG recording. EGG stands for electroglottograph, a device that measures vocal-fold contact and thus records the true pitch.
Since I doubt you have access to such a device, I recommend using an existing database carefully prepared for evaluating pitch extraction. You can download it here. This data was collected at the University of Edinburgh by Paul Bagshaw.
I suggest you to read his thesis as well.
If you want to compare against a state-of-the-art algorithm for pitch extraction, check https://github.com/google/REAPER. Also note that the "true" pitch might not be the best feature for subsequent algorithms: sometimes you might extract pitch with mistakes and yet get better accuracy, for example in speech recognition. Check this publication for more information.
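For a quick sanity check of manual annotations, a bare-bones autocorrelation pitch estimator can be sketched as follows (NumPy, synthetic frame, illustrative only):

```python
import numpy as np

sr = 16000
f0 = 200.0                                  # synthetic "speech" pitch
t = np.arange(int(0.04 * sr)) / sr          # one 40 ms frame
frame = np.sin(2 * np.pi * f0 * t) + 0.5 * np.sin(2 * np.pi * 2 * f0 * t)

# Autocorrelation, searched over plausible speech pitch lags (50-400 Hz)
ac = np.correlate(frame, frame, mode='full')[len(frame) - 1:]
lo, hi = int(sr / 400), int(sr / 50)
lag = lo + np.argmax(ac[lo:hi])             # lag of the strongest peak
print(sr / lag)                             # estimated pitch in Hz, ~200
```

Real speech needs more care (voicing decisions, octave errors), which is exactly why dedicated tools like REAPER exist.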

Remove noise from magnetometer data

I want to use data from a magnetometer to gain information about the motion of a metal object near it. After recording the data, I need to remove noise from the data before using it. What is a good method to remove noise? I read about filters in Matlab here but cannot decide which one to use. How can I decide which filter to use?
Edit:
The metal object moves at a steady rate and I want to find out the angle of its motion. I am adding a graph from my sample data which I want to filter. Sample Magnetometer data
I guess you're able to record the noise. If you can, you can also use some adaptive filtering.
From MathWorks' Overview of Adaptive Filters and Applications:
Block Diagram That Defines the Inputs and Output of a Generic RLS Adaptive Filter
You can use the recorded noise as the desired signal; the error signal should then stay around 0 while nothing moves near the sensor, and take on some filtered value when motion appears.
You can find an example of adaptive filtering on the MathWorks website:
Consider a pilot in an airplane. When the pilot speaks into a microphone, the engine noise in the cockpit combines with the voice signal. This additional noise makes the resultant signal heard by passengers of low quality. The goal is to obtain a signal that contains the pilot's voice, but not the engine noise. You can cancel the noise with an adaptive filter if you obtain a sample of the engine noise and apply it as the input to the adaptive filter.
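That cockpit scenario is the classic adaptive-noise-cancellation configuration: the reference noise feeds the filter input and the noisy microphone is the desired signal. A minimal LMS sketch (NumPy; the noise path and parameters are made-up illustration values):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
t = np.arange(n) / 1000.0

signal = np.sin(2 * np.pi * 2.0 * t)             # the "voice" we want to keep
noise_ref = rng.standard_normal(n)               # reference noise pickup
# Noise reaches the primary mic through an unknown (made-up) 8-tap path
path = np.array([0.8, -0.3, 0.2, 0.1, -0.05, 0.04, 0.02, -0.01])
primary = signal + np.convolve(noise_ref, path)[:n]

# LMS canceller: input = reference noise, desired = primary microphone
taps, mu = 8, 0.01
w = np.zeros(taps)
buf = np.zeros(taps)
out = np.zeros(n)
for i in range(n):
    buf = np.roll(buf, 1)
    buf[0] = noise_ref[i]
    e = primary[i] - w @ buf        # error = cleaned signal estimate
    w += mu * e * buf
    out[i] = e

# The cleaned output is much closer to the signal than the noisy recording
print(np.mean((out[-5000:] - signal[-5000:]) ** 2)
      < 0.1 * np.mean((primary[-5000:] - signal[-5000:]) ** 2))
```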
Read more about adaptive filtering:
Overview: http://www.mathworks.com/help/dsp/ug/overview-of-adaptive-filters-and-applications.html
NN adaptive filters: http://www.mathworks.com/help/nnet/ug/adaptive-neural-network-filters.html

Simulink Desktop Real-Time and Least Mean Square adaptive algorithm

I need to implement an LMS-based adaptive audio-cancellation algorithm on the Simulink Desktop Real-Time toolbox.
The physical system is composed of a microphone recording a noise source and another microphone recording the residual noise after the control process (antinoise being injected by a speaker controlled by Simulink).
For the (adaptive) LMS algorithm to work properly I need to be able to work on a sample-by-sample basis, that is at each sampled time instant I need to update the adaptive filter using the synchronised current sample value of both microphones. I realise some delay is inevitable but I was wondering whether it's possible on Simulink Desktop Real-Time to reduce the buffer size of the inputs to one sample and thus work on a sample-by-sample basis.
Thanks for your help in advance.
You can always implement the filter on a sample-by-sample basis, but you still need a history of input values to perform the actual LMS calculation on; on a sample-by-sample basis this just means keeping a simple FIFO buffer of the most recent input samples.
If you have access to the DSP System Toolbox, there is already an LMS Filter block that will do this for you.
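The FIFO idea is straightforward to express in code. A language-agnostic sketch (Python here; the class name is hypothetical), doing one LMS update per incoming sample:

```python
from collections import deque
import numpy as np

class SampleLMS:
    """One LMS update per incoming sample, using a FIFO input history."""
    def __init__(self, taps, mu):
        self.w = np.zeros(taps)
        self.buf = deque([0.0] * taps, maxlen=taps)  # FIFO of past inputs
        self.mu = mu

    def step(self, x, d):
        self.buf.appendleft(x)              # newest sample in front
        u = np.array(self.buf)
        e = d - self.w @ u                  # error for this sample
        self.w += self.mu * e * u           # weight update
        return e

# Identify a made-up 2-tap system h = [0.5, -0.25] sample by sample
rng = np.random.default_rng(2)
f = SampleLMS(taps=2, mu=0.1)
x_prev = 0.0
for _ in range(5000):
    x = rng.standard_normal()
    d = 0.5 * x - 0.25 * x_prev
    f.step(x, d)
    x_prev = x
print(np.round(f.w, 2))                     # converges to ~[0.5, -0.25]
```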

Why isn't there a simple function to reduce background noise of an audio signal in Matlab?

Is this because it's a complex problem? I mean, is it too broad for a simple/generic solution to exist?
I ask because almost every piece of signal-processing software (Avisoft, GoldWave, Audacity…) has a function that reduces the background noise of a signal, usually FFT-based. But I can't find an already-implemented function in Matlab that does the same. Is the right way then to do it manually?
Thanks.
The common audio noise reduction approaches built-in to things like Audacity are based around spectral subtraction, which estimates the level of steady background noise in the Fourier transform magnitude domain, then removes that much energy from every frame, leaving energy only where the signal "pokes above" this noise floor.
You can find many implementations of spectral subtraction for Matlab; this one is highly rated on Matlab File Exchange:
http://www.mathworks.com/matlabcentral/fileexchange/7675-boll-spectral-subtraction
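For intuition, the core of spectral subtraction fits in a short NumPy sketch (simplified: one averaged noise-floor estimate, magnitude-domain subtraction, noisy phase kept; all parameters are illustration values):

```python
import numpy as np

rng = np.random.default_rng(3)
sr = 8000
n = 4 * sr
t = np.arange(n) / sr
clean = np.sin(2 * np.pi * 440 * t)              # "signal": a 440 Hz tone
noisy = clean + 0.3 * rng.standard_normal(n)     # plus steady background noise

frame, hop = 256, 64
win = np.hanning(frame)

# Noise-floor estimate: mean magnitude spectrum of noise-only frames
# (drawn here from the known noise process; in practice you would use
# a silent stretch of the recording)
noise_mag = np.mean(
    [np.abs(np.fft.rfft(win * 0.3 * rng.standard_normal(frame)))
     for _ in range(50)], axis=0)

out = np.zeros(n)
wsum = np.zeros(n)
for s in range(0, n - frame, hop):
    spec = np.fft.rfft(win * noisy[s:s + frame])
    mag = np.maximum(np.abs(spec) - noise_mag, 0.0)   # subtract the floor
    out[s:s + frame] += np.fft.irfft(mag * np.exp(1j * np.angle(spec))) * win
    wsum[s:s + frame] += win ** 2
out /= np.maximum(wsum, 1e-8)                     # overlap-add normalization

# The residual error vs. the clean tone drops noticeably
mid = slice(sr, 3 * sr)
print(np.mean((out[mid] - clean[mid]) ** 2)
      < 0.6 * np.mean((noisy[mid] - clean[mid]) ** 2))
```

The characteristic "musical noise" artifact of this method comes from the hard clipping at zero; real implementations like Boll's add smoothing and over-subtraction factors.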
The question is, what kind of noise reduction are you looking for? There is no one solution that fits all needs. Here are a few approaches:
Low-pass filtering the signal reduces noise but also removes the high-frequency components of the signal. For some applications this is perfectly acceptable. There are lots of low-pass filter functions and Matlab helps you apply plenty of them. Some knowledge of how digital filters work is required. I'm not going into it here; if you want more details consider asking a more focused question.
An approach suitable for many situations is using a noise gate: simply attenuate the signal whenever its RMS level goes below a certain threshold, for instance. In other words, this kills quiet parts of the audio dead. You'll retain the noise in the more active parts of the signal, though, and if you have a lot of dynamics in the actual signal you'll get rid of some signal, too. This tends to work well for, say, slightly noisy speech samples, but not so well for very noisy recordings of classical music. I don't know whether Matlab has a function for this.
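The noise-gate idea can be sketched directly (NumPy; the threshold and frame length are made-up illustration values):

```python
import numpy as np

def noise_gate(x, sr, thresh_db=-40.0, frame_ms=20):
    """Zero out frames whose RMS level falls below a threshold."""
    frame = int(sr * frame_ms / 1000)
    y = x.copy()
    for s in range(0, len(x) - frame + 1, frame):
        seg = x[s:s + frame]
        rms_db = 20 * np.log10(np.sqrt(np.mean(seg ** 2)) + 1e-12)
        if rms_db < thresh_db:
            y[s:s + frame] = 0.0            # kill this quiet frame
    return y

sr = 8000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440 * t)
x[:sr // 2] *= 0.001                        # quiet first half, loud second half
y = noise_gate(x, sr)
# Quiet half is silenced, loud half passes through untouched
print(np.allclose(y[:sr // 2], 0), np.allclose(y[sr // 2:], x[sr // 2:]))
```

A practical gate would also ramp the gain up and down to avoid clicks at frame boundaries.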
Some approaches involve making a "fingerprint" of the noise and then removing that throughout the signal. It tends to make the result sound strange, though, and in any case this is probably sufficiently complex and domain-specific that it belongs in an audio-specific tool and not in a rather general math/DSP system.
Reducing noise requires making some assumptions about the type of noise and the type of signal, and how they are different. Audio processors typically assume (correctly or incorrectly) something like that the audio is speech or music, and that the noise is typical recording session background hiss, A/C power hum, or vinyl record pops.
Matlab is for general use (microwave radio, data comm, subsonic earthquakes, heartbeats, etc.), and thus can make no such assumptions.
Matlab is not exactly an audio processor; you have to implement your own filter, and you will have to design it correctly according to what you want.