Noisy signal correlation

Noisy signal correlation - matlab

I have two (or more) time series that I would like to correlate with one another to look for common changes e.g. both rising or both falling etc.
The problem is that the time series are all fairly noisy with relatively high standard deviations meaning it is difficult to see common features. The signals are sampled at a fairly low frequency (one point every 30s) but cover reasonable time periods 2hours +. It is often the case that the two signs are not the same length, for example 1x1hour & 1x1.5 hours.
Can anyone suggest some good correlation techniques, ideally using built in or bespoke matlab routines? I've tried auto correlation just to compare lags within a single signal but all I got back is a triangular shape with the max at 0 lag (I assume this means there is no obvious correlation except with itself?) . Cross correlation isn't much better.
Any thoughts would be greatly appreciated.

Start with a cross-covariance (xcov) instead of the cross-correlation. xcov removes the DC component (subtracts off the mean) of each data set and then does the cross-correlation. When you cross-correlate two square waves, you get a triangle wave. If you have small signals riding on a large offset, you get a triangle wave with small variations in it.

If you think there is a delay between the two signals, then I would use xcorr to calculate the delay. Since xcorr is doing an FFT of the signal, you should remove the means before calling xcorr, you may also want to consider adding a window (e.g. hanning) to reduce leakage if the data is not self-windowing.
If there is no delay between the signals or you have found and removed the delay, you could just average the two (or more) signals. The random noise should tend to average to zero and the common features will approach the true value.

Related

How to remove periodicity in hourly wind speed data by using fourier transform in matlab

Review for removing periodicsI have a dataset that contains hourly wind speed data for 7 seven. I am trying to implement a forecasting model to the data and the review paper states that trimming of diurnal, weekly, monthly, and annual patterns in data significantly enhances estimation accuracy. They then follow along by using the fourier series to remove the periodic components as seen in the image. Any ideas on how i model this in matlab?

I am afraid this topic is not explained "urgently". What you need is a filter for the respective frequencies and a certain number of their harmonics. You can implement such a filter with an fft or directly with a IIR/FIR-formula.
FFT is faster than a IIR/FIR-implementation, but requires some care with respect to window function. Even if you do a "continuous" DFT, you will have a window function (like exponential or gaussian). The window function determines the bandwidth. The wider the window, the smaller the bandwidth. With an IIR/FIR-filter the bandwidth is encoded in the recursive parameters.
For suppressing single frequencies (like the 24hr weather signal) you need a notch-filter. This also requires you to specify a bandwidth, as you can see in the linked article. The smaller the bandwidth, the longer it will take (in time) until the filter has evolved to the frequency to suppress it. If you want the filter to recognize the amplitude of the 24hr-signal fast, then you need a wider bandwidth. But then however you are going to suppress also more frequencies slightly lower and slightly higher than 1/24hrs. It's a tradeoff.
If you also want to suppress several harmonics (like described in the paper) you have to combine several notch-filters in series. If you want to do it with FFT, you have to model the desired transfer function in the frequency space and since you can do this for all frequencies at once, so it's more efficient.
An easy but approximate way to get something similar to a notch-filter including all harmonics is with a Comb-filter. But it's an approximation, you have no control over the details of the transfer function. You could do that in Matlab by adding to the original a signal that is shifted by 12hrs. This is because a sinusoidal signal will cancel with one that is shifted by pi.
So you see, there's lots of possibilities for what you want.

Matlab: Comparing two signals with different time values and placed impulses

We are analysing some signals that contains an impuls in the form of a dip in the standard signal in matlab.
Signals
As you can see on the picture, we need to find the difference between the "Zlotty" and the "Krone". The two graphs besides each other, are the graphs that needs to be analyzed.
As you can see the time of the impulse is different in when it occures and in how long the impuls is. We can not use the Time as a value of measurements because that can vary randomly.
Each graph is made by vectors containing 2.5mio datapoints.
How would you use matlab to find a difference?

You could split the problem into two parts. Ensuring the same time scale for both signals and finding a possible time shift in the alignment of the resulting signals. The first part could be achieved by using the resample function of Matlab; and the second task by using cross-correlation. Using two nested for loops, you could perform a search for the "best" stretch factor and time shift that result in the maximum correlation coefficient.

Estimating Quasi-stationary part of a signal

I am trying to estimate the Estimating Quasi-stationary part of a signal in Matlab. It is a 1 second long sound signal that belongs to a bird.
I am using MFCC to extract features but would like to have a window size for MFCC that is guaranteed to operate on statistically quasi-stationary part.
My questions are:
Do you think it is a solid approach if I iterate by varying my window size from 1 second to a smaller interval by observing the change of second moment of features and making a decision where the second moment is not changing anymore?
If I use Shannon entropy method by again varying my MFCC window size, how the number of bits I got at the output of the entropy algorithm would help me to identify the Estimating Quasi-stationary part of the signal
Are there any other ideas?

Best way to extract neuronal spike times from a noisy signal / voltage meaurement

I'm a neuroscientist, and not a very good one. My colleague has kindly provided me with a noisy voltage measurements of the PY neuron of the Stomatogastric Ganglion of the lobster.
The activity of this neuron is characterised by a slow depolarised plateaux with fast spikes on top (a burst).
Both idealised and noisy versions are presented here for you to peruse at your leisure.
It's my job to extract the spike times from the noisy signal but this is so far beyond my experience level I have no idea where to begin. Fortunately, I am a total ninja at Matlab.
Could someone kindly provide me with the name of the procedure, filter or smoothing function which is best suited for this task. Or even the appropriate forum to ask such an asinine question.
Presumably, it needs to increase the signal to noise ratio? The problem here seems to be determining the difference between noise and a bona fide spike as the margin between the two is quite small.
UPDATE: 02/07/2013
I have tried the following filters in Matlab with mixed results. It's still very hard to say what is noise and what is a spike.
Lowpass Butterworth filter,
median filter,
gaussian,
moving weighted window,
moving average filter,
smooth,
sgolay filter.

This may not be an adequate response for stackoverflow - but one way of increasing a signal to noise ratio in your case is to average parts of the signal.
low pass your signal to remove noise (and spikes), and find the minima of the filtered signal (from your image, one minimum every 600 data points). Keep the indexes of each minimum,
on the noisy signal, for each minimum index, select the consecutive 700 data points. If you have 50 minima, you should have a 50 by 700 matrix,
average your matrix. You should have a 1 by 700 vector.
By averaging parts of the signal (minimum-locked potentials), you will take advantage of two properties: noise is zero-mean (well, it should be), and the signal of interest is repetitive. The first will therefore decrease as you pile up potentials, and the second will increase. With this process however, you will lose the spike times for each slow wave figure, but at least have them for blocks of 50 minima.
This technique is known in neuroscience as event-related potential (http://en.wikipedia.org/wiki/Event-related_potential). It may not fit perfectly your signal, or the result may not give nice spikes, but you may extract the spike times for some periods of interest (given the nature of your signal, I would say that you would need 5 or 10 potentials to see an emerging mean activity).
There are some toolboxes that do part of the job (but I would program it myself given the complexity of the task). These are eeglab or fieldtrip. They have a bunch of filter/decomposition options too, as well as some statistical features.

MATLAB 'spectrogram' params

I am a beginner in MATLAB and I should perform a spectral analysis of an EEG signal drawing the graphs of power spectral density and spectrogram. My signal is 10 seconds long and a sampling frequency of 160 Hz, a total of 1600 samples and have some questions on how to find the parameters of the functions in MATLAB, including:
pwelch (x, window, noverlap, nfft, fs);
spectrogram (x, window, noverlap, F, fs);
My question then is where to find values for the parameters window and noverlap I do not know what they are for.

To understand window functions & their use, let's first look at what happens when you take the DFT of finite length samples. Implicit in the definition of the discrete Fourier transform, is the assumption that the finite length of signal that you're considering, is periodic.
Consider a sine wave, sampled such that a full period is captured. When the signal is replicated, you can see that it continues periodically as an uninterrupted signal. The resulting DFT has only one non-zero component and that is at the frequency of the sinusoid.
Now consider a cosine wave with a different period, sampled such that only a partial period is captured. Now if you replicate the signal, you see discontinuities in the signal, marked in red. There is no longer a smooth transition and so you'll have leakage coming in at other frequencies, as seen below
This spectral leakage occurs through the side-lobes. To understand more about this, you should also read up on the sinc function and its Fourier transform, the rectangle function. The finite sampled sequence can be viewed as an infinite sequence multiplied by the rectangular function. The leakage that occurs is related to the side lobes of the sinc function (sinc & rectangular belong to self-dual space and are F.Ts of each other). This is explained in more detail in the spectral leakage article I linked to above.
Window functions
Window functions are used in signal processing to minimize the effect of spectral leakages. Basically, what a window function does is that it tapers the finite length sequence at the ends, so that when tiled, it has a periodic structure without discontinuities, and hence less spectral leakage.
Some of the common windows are Hanning, Hamming, Blackman, Blackman-Harris, Kaiser-Bessel, etc. You can read up more on them from the wiki link and the corresponding MATLAB commands are hann, hamming,blackman, blackmanharris and kaiser. Here's a small sample of the different windows:
You might wonder why there are so many different window functions. The reason is because each of these have very different spectral properties and have different main lobe widths and side lobe amplitudes. There is no such thing as a free lunch: if you want good frequency resolution (main lobe is thin), your sidelobes become larger and vice versa. You can't have both. Often, the choice of window function is dependent on the specific needs and always boils down to making a compromise. This is a very good article that talks about using window functions, and you should definitely read through it.
Now, when you use a window function, you have less information at the tapered ends. So, one way to fix that, is to use sliding windows with an overlap as shown below. The idea is that when put together, they approximate the original sequence as best as possible (i.e., the bottom row should be as close to a flat value of 1 as possible). Typical values vary between 33% to 50%, depending on the application.
Using MATLAB's spectrogram
The syntax is spectrogram(x,window,overlap,NFFT,fs)
where
x is your entire data vector
window is your window function. If you enter just a number, say W (must be integer), then MATLAB chops up your data into chunks of W samples each and forms the spectrogram from it. This is equivalent to using a rectangular window of length W samples. If you want to use a different window, provide hann(W) or whatever window you choose.
overlap is the number of samples that you need to overlap. So, if you need 50% overlap, this value should be W/2. Use floor(W/2) or ceil(W/2) if W can take odd values. This is just an integer.
NFFT is the FFT length
fs is the sampling frequency of your data vector. You can leave this empty, and MATLAB plots the figure in terms of normalized frequencies and the time axis as simply the data chunk index. If you enter it, MATLAB scales the axis accordingly.
You can also get optional outputs such as the time vector and frequency vector and the power spectrum computed, for use in other computations or if you need to style your plot differently. Refer to the documentation for more info.
Here's an example with 1 second of a linear chirp signal from 20 Hz to 400 Hz, sampled at 1000 Hz. Two window functions are used, Hanning and Blackman-Harris, with and without overlaps. The window lengths were 50 samples, and overlap of 50%, when used. The plots are scaled to the same 80dB range in each plot.
You can notice the difference in the figures (top-bottom) due to the overlap. You get a cleaner estimate if you use overlap. You can also observe the trade-off between main lobe width and side lobe amplitude that I mentioned earlier. Hanning has a thinner main lobe (prominent line along the skew diagonal), resulting in better frequency resolution, but has leaky sidelobes, seen by the bright colors outside. Blackwell-Harris, on the other hand, has a fatter main lobe (thicker diagonal line), but less spectral leakage, evidenced by the uniformly low (blue) outer region.

Both these methods above are short-time methods of operating on signals. The non-stationarity of the signal (where statistics are a function of time, Say mean, among other statistics, is a function of time) implies that you can only assume that the statistics of the signal are constant over short periods of time. There is no way of arriving at such a period of time (for which the statistics of the signal are constant) exactly and hence it is mostly guess work and fine-tuning.
Say that the signal you mentioned above is non-stationary (which EEG signals are). Also assume that it is stationary only for about 10ms or so. To reliably measure statistics like PSD or energy, you need to measure these statistics 10ms at a time. The window-ing function is what you multiply the signal with to isolate that 10ms of a signal, on which you will be computing PSD etc.. So now you need to traverse the length of the signal. You need a shifting window (to window the entire signal 10ms at a time). Overlapping the windows gives you a more reliable estimate of the statistics.
You can imagine it like this:
1. Take the first 10ms of the signal.
2. Window it with the windowing function.
3. Compute statistic only on this 10ms portion.
4. Move the window by 5ms (assume length of overlap).
5. Window the signal again.
6. Compute statistic again.
7. Move over entire length of signal.
There are many different types of window functions - Blackman, Hanning, Hamming, Rectangular. That and the length of the window and overlap really depend on the application that you have and the frequency characteristics of the signal itself.
As an example, in speech processing (where the signals are non-stationary and windowing gets used a lot), the most popular choices for windowing functions are Hamming/Hanning of length 10ms (320 samples at 16 kHz sampling) with an overlap of 80 samples (25% of window length). This works reasonably well. You can use this as a starting point for your application and then work on fine-tuning it a little more with different values.
You may also want to take a look at the following functions in MATLAB:
1. hamming
2. hanning
I hope you know that you can call up a ton of help in MATLAB using the help command on the command line. MATLAB is one of the best documented softwares out there. Using the help command for pwelch also pulls up definitions for window size and overlap. That should help you out too.
I don't know if all this info. helped you out or not, but looking at the question, I felt you might have needed a little help with understanding what windowing and overlapping was all about.
HTH,
Sriram.

For the last parameter fs, that is the frequency rate of the raw signal, in your case X, when you extract X from audio data using function
[X,fs]=audioread('song.mp3')
You may get fs from it.

Investigate how the following parameters change the performance of the Sinc function:
The Length of the coefficients
The Following window functions:
Blackman Harris
Hanning
Bartlett