Matlab : Plot of entropy vs digitized code length - matlab

[Tent Map][1] is a dynamical system that is discrete in time. Iterations of the map yields a time series.
The entropy of this system when discretized in 0/1 using a threshold = 0.5 is, H = log_2(2) = 0.69 approx. I want to obtain a graph with Y Axis as the entropy and X Axis as the Number of samples or the length of the time series. I have written a code for obatining the entropy by varying the length of the time series. The objective is to see at what length of the discretized time series, I get the entropy H. The code however, loops infinitely and never reaches the entropy value H. Can somebody please help in obtaining the graph? Thank you.
[1]: https://en.wikipedia.org/wiki/Tent_map

You are saying that lambda must be EXACTLY equal to log(2). When the entropy drops below log(2) it skips over this value (for instance instead of exactly log(2) maybe an iteration is 0.69309 which is less). Try replacing the line
while(lambda ~=H)
with
tol=0.01;
while(~(abs(lambda-H)<tol))
this will mean that it quits whenever lambda is close to H (with in the tolerance tol).
If your tol is too small (try 0.001) an iteration would jump over it again and you would be back to the problem you had before.

Related

Computing the DFT of an arbitrary signal

As part of a course in signal processing at university, we have been asked to write an algorithm in Matlab to calculate the single sided spectrum of our signal using DFT, without using the fft() function built in to matlab. this isn't an assessed part of the course, I'm just interested in getting this "right" for myself. I am currently using the 2018b version of Matlab, should anyone find this useful.
I have built a signal of a 1 KHz and 2KHz sinusoid, phase shifted by 135 degrees (2*pi/3 rad).
then using the equations in 9.1 of Discrete time signal processing (Allan V. Oppenheim) and Euler's formula to simplify the exponent, I produce this code:
%%DFT(currently buggy)
n=0;m=0;
for m=1:DFT_N-1 %DFT_Fmin;DFT_Fmax; %scrolls through DFT m values (K in text.)
for n=1:DFT_N-1;%;(DFT_N-1);%<<redundant code? from Oppenheim eqn. 9.1 % eulers identity, K=m and n=n
X(m)=x(n)*(cos((2*pi*n*m)/DFT_N)-j*sin((2*pi*n*m)/DFT_N));
n=n+1;
end
%m=m+1; %redundant code?
end
This takes x as the input, in this case the signal mentioned earlier, as well as the resolution of the transform, as represented by the DFT_N, which has been initialized to 100. The output of this function, X, should be something in the frequency domain, but plotting X yields a circular plot slightly larger than the unit circle, and with a gap on the left hand edge.
I am struggling to see how I am supposed to convert this to the stem() plots as given by the in-built DFT algorithm.
Many thanks, J.
This is your bug:
replace X(m)=x(n)*(cos.. with X(m)=X(m)+x(n)*(cos..
For a given m, it does not integrate over the variable n, but overwrites X(m) only the last calculation for n = DFT_N-1.
Notice that integrating over n=1:DFT_N-1 omits one harmonic, i.e., the first one, exp(-j*2*pi). Replace
n=1:DFT_N-1 with n=1:DFT_N to include that. I would also replace m=1:DFT_N-1 with m=1:DFT_N for plotting concerns.
Also replace any 2*pi*n*m with 2*pi*(n-1)*(m-1) to get the phase right, since the first entry of X should correspond to zero-frequency, yielding sum_n x(n) * (cos(0) + j sin(0)) = sum_n x(n). If your signal x is real-valued then the zero-frequency component X(1) should be real-valued, angle(X(1))=0.
Last remark, don't forget to shift zero-frequency component to the center of the spectrum for better visibility, X = circshift(X,floor(size(X)/2));
If you are interested in the single-sided spectrum only, than you can just calculate X(m) for m=1:DFT_N/2 since X it is conjugate symmetric around m=DFT_N/2, i.e., X(DFT_N/2+m) = X(DFT_N/2-m)', due to exp(-j*(pi*n+2*pi/DFT_N*m)) = exp(-j*(pi*n-2*pi/DFT_N*m))'.
As a side note, for a given m this program calculates an inner product between the array x and another array of complex exponentials, i.e., exp(-j*2*pi/DFT_N*m*n), for n = 0,1,...,N-1. MATLAB syntax is very convenient for such calculations, and you can avoid this inner loop by the following command
exp(-j*2*pi/DFT_N*m*(0:DFT_N-1)) * x
where x is a column vector. Similarly, you can avoid the first loop too by expanding your complex exponential vector row-wise for every m, i.e., build the matrix exp(-j*2*pi/DFT_N*(0:DFT_N-1)'*(0:DFT_N-1)). Then your DFT is simply
X = exp(-j*2*pi/DFT_N*(0:DFT_N-1)'*(0:DFT_N-1)) * x
For single-sided spectrum, instead use
X = exp(-j*2*pi/DFT_N*(0:floor((DFT_N-1)/2))'*(0:DFT_N-1)) * x

How to make sense (handle) when computes logarithm of zero in prior information

I am working in image classification. I am using an information that called prior probability (in Bayesian rule). It has range in [0,1]. And it requires computing in logarithm. However, as you know, logarithm of zero number is Inf.
For example, given an pixel x in image I (size 3 by 3) with an cost function such as
Cost(x)=30+log(prior(x))
where prior is an matrix 3 by 3
prior=[ 0 0 0.5;
1 1 0.2;
0.4 0 0]
I =[ 1 2 3;
4 5 6;
7 8 9]
I want to compute cost of x=1 then
cost(x=1)=30+log(0)
Now, log(0) is Inf. Then result cost(x=1) also Inf. Based on my assumption that prior=0 that mean the given pixel belongs to background, and prior=1 that mean the given pixel belongs to foreground.
My question is that how to compute log(prior) satisfy my assumption.
I am using Matlab to do it. I think that log(0) becomes very small negative value. And I just set it is -9 as my code
%% Handle with log(0)
prior(prior==0.0) = NaN;
%% Compute log
log_prior=log(prior);
%% Assume that e^-9 very near 0.
log_prior(isnan(log_prior)) = -9;
UPDATE: To make clearly what I am doing. Let see the Bayesian rule. My task is that how to assign an given pixel x belongs to Background (BG) or Foreground (FG). It will depends on the probability
P(x∈BG|x)=P(x|x∈BG)P(x∈BG)/P(x)
In which P(x|x∈BG) is likelihood function and assume that it is approximated by Gaussian distribution, P(x∈BG) is prior term and P(x) can be ignore due to it is const
Using Maximum-a-Posteriori (MAP) Estimation we can map the above equation in to log space (to resolve exponential in Gaussian function)
Cost(x)=log(P(x∈BG|x))=log(P(x|x∈BG))+log(P(x∈BG))
To make simple, let assume log(P(x|x∈BG))=30, log(P(x∈BG)) is log(prior) then my cost function can rewritten as
Cost(x)=30+log(prior(x))
Now problem is that prior is within [0,1] then it logarithm is -Inf. As the chepner said, we can add eps value as
log(prior+eps)
However, log(eps) is very a lager negative number. It will be affected my cost function (also becomes very large negative number). Then the first term in my cost function (30) becomes not necessary. Based on my assumption that log(x)=1 then the pixel x will be BG and prior(x)=1 will be FG. How to make handle with my log(prior) when I compute my cost function?
The correct thing to do, before fiddling with Matlab, is to try to understand your problem. Ask yourself "what does it mean for the prior probability to vanish?". The answer is given by Bayes theorem, one form of which is:
posterior = likelihood * prior / normalization
So places where the prior is nil are, by definition, places where you are certain that your events (the things whose probabilities you are computing) cannot happen, regardless of their apparent likelihood (i.e. "cost"). So they are not interesting for you. You just recognize that and skip them.

implementation of Lomb-Scargle periodogram

from matlab official site , Lomb-Scargle periodogram is defined as
http://www.mathworks.com/help/signal/ref/plomb.html#lomb
suppose we have some random signal let say
x=rand(1,1000);
average of this signal can be easily implemented as
average=mean(x);
variance can be implemented as
>> average_vector=repmat(average,1,1000);
>> diff=x-average_vector;
>> variance= sum(diff.*diff)/(length(x)-1);
how should i continue? i mean how to choose frequencies ? calculation of time offset is not problem,let us suppose that we have time vector
t=0:0.1:99.9;
so that total length of time vector be 1000, in generally for DFT, frequencies bins are represented as a multiplier of 2*pi/N, where N is length of signal, but what about this case? thanks in advance
As can be seen from the provided link to MATLAB documentation, the algorithm does not depend on a specific sampling times tk selection. Note that for equally spaced sampling times (as you have selected), the same link indicates:
The offset depends only on the measurement times and vanishes when the times are equally spaced.
So, as you put it "calculation of time offset is not a problem".
Similar to the DFT which can be obtained from the Discrete-Time Fourier Transform (DTFT) by selecting a discrete set of frequencies, we can also choose f[n] = n * sampling_rate/N (where sampling_rate = 10 for your selection of tk). If we disregard the value of PLS(f[n]) for n=mN where m is any integer (since it's value is ill-formed, at least in the formula posted in the link), then:
Thus for real-valued data samples:
where Y can be expressed in terms of the diff vector you defined as:
Y = fft(diff);
That said, as indicated on wikipedia, the Lomb–Scargle method is primarilly intended for use with unequally spaced data.

QRS detection(peaks) of a raw ecg signal in matlab

I want to find the peaks of the raw ecg signal so that I can calculate the beats per minute(bpm).
I Have written a code in matlab which I have attached below.In the code below I am unable to find threshold point correctly which will help me in finding the peaks and hence the bpm.
%input the signal into matlab
[x,fs]=wavread('heartbeat.wav');
subplot(2,1,1)
plot(x(1:10000),'r-')
grid on
%lowpass filter the input signal with cutoff at 100hz
h=fir1(30,0.3126); %normalized cutoff freq=0.3126
y=filter(h,1,x);
subplot(2,1,2)
plot(y(1:10000),'b-')
grid on
% peaks are seen as pulses(heart beats)
beat_count=0;
for p=2:length(y)-1
th(p)=abs(max(y(p)));
if(y(p) >y(p-1) && y(p) >y(p+1) && y(p)>th(p))
beat_count=beat_count+1;
end
end
N = length(y);
duration_seconds=N/fs;
duration_minutes=duration_seconds/60;
BPM=beat_count/duration_minutes;
bpm=ceil(BPM);
Please help me as I am new to matlab
I suggest changing this section of your code
beat_count=0;
for p=2:length(y)-1
th(p)=abs(max(y(p)));
if(y(p) >y(p-1) && y(p) >y(p+1) && y(p)>th(p))
beat_count=beat_count+1;
end
end
This is definitely flawed. I'm not sure of your logic here but what about this. We are looking for peaks, but only the high peaks, so first lets set a threshold value (you'll have to tweak this to a sensible number) and cull everything below that value to get rid of the smaller peaks:
th = max(y) * 0.9; %So here I'm considering anything less than 90% of the max as not a real peak... this bit really depends on your logic of finding peaks though which you haven't explained
Yth = zeros(length(y), 1);
Yth(y > th) = y(y > th);
OK so I suggest you now plot y and Yth to see what that code did. Now to find the peaks my logic is we are looking for local maxima i.e. points at which the first derivative of the function change from being positive to being negative. So I'm going to find a very simple numerical approximation to the first derivative by finding the difference between each consecutive point on the signal:
Ydiff = diff(Yth);
No I want to find where the signal goes from being positive to being negative. So I'm going to make all the positive values equal zero, and all the negative values equal one:
Ydiff_logical = Ydiff < 0;
finally I want to find where this signal changes from a zero to a one (but not the other way around)
Ypeaks = diff(Ydiff_logical) == 1;
Now count the peaks:
sum(Ypeaks)
note that for plotting purpouse because of the use of diff we should pad a false to either side of Ypeaks so
Ypeaks = [false; Ypeaks; false];
OK so there is quite a lot of matlab there, I suggest you run each line, one by one and inspect the variable by both plotting the result of each line and also by double clicking the variable in the matlab workspace to understand what is happening at each step.
Example: (signal PeakSig taken from http://www.mathworks.com/help/signal/ref/findpeaks.html) and plotting with:
plot(x(Ypeaks),PeakSig(Ypeaks),'k^','markerfacecolor',[1 0 0]);
What do you think about the built-in
findpeaks(data,'Name',value)
function? You can choose among different logics for peak detection:
'MINPEAKHEIGHT'
'MINPEAKDISTANCE'
'THRESHOLD'
'NPEAKS'
'SORTSTR'
I hope this helps.
You know, the QRS complex does not always have the maximum amplitude, for pathologic ECG it can be present as several minor oscillations instead of one high-amplitude peak.
Thus, you can try one good algothythm, tested by me: the detection criterion is assumed to be high absolute rate of change in the signal, averaged within the given interval.
Algorithm:
- 50/60 Hz filter (e.g. for 50 Hz sliding window of 20 msec will be fine)
- adaptive hipass filter (for baseline drift)
- find signal's first derivate x'
- fing squared derivate (x')^2
- apply sliding average window with the width of QRS complex - approx 100-150 msec (you will get some signal with 'rectangles', which have width of QRS)
- use simple threshold (e.g. 1/3 of maximum of the first 3 seconds) to determine approximate positions or R
- in the source ECG find local maximum within +-100 msec of that R position.
However, you still have to eliminate artifacts and outliers (e.g. surges, when the electrod connection fails).
Also, you can find a lot of helpful information from this book: "R.M. Rangayyan - Biomedical Signal Analysis"

The deconv() function in MATLAB does not invert the conv() function

I would like to convolve a time-series containing two spikes (call it Spike) with an exponential kernel (k) in MATLAB. Call the convolved response "calcium1". I would like to recover the original spike ("reconSpike") data using deconvolution with the kernel. I am using the following code.
k1=zeros(1,5000);
k1(1:1000)=(1.1.^((1:1000)/100)-(1.1^0.01))/((1.1^10)-1.1^0.01);
k1(1001:5000)=exp(-((1001:5000)-1001)/1000);
k1(1)=k1(2);
spike = zeros(100000,1);
spike(1000)=1;
spike(1100)=1;
calcium1=conv(k1, spike);
reconSpike1=deconv(calcium1, k1);
The problem is that at the end of reconSpike, I get a chunk of very large, high amplitude waves that was not in the original data. Anyone know why and how to fix it?
Thanks!
It works for me if you keep the spike vector the same length as the k1 vector. i.e.:
k1=zeros(1,5000);
k1(1:1000)=(1.1.^((1:1000)/100)-(1.1^0.01))/((1.1^10)-1.1^0.01);
k1(1001:5000)=exp(-((1001:5000)-1001)/1000);
k1(1)=k1(2);
spike = zeros(5000, 1);
spike(1000)=1;
spike(1100)=1;
calcium1=conv(k1, spike);
reconSpike1=deconv(calcium1, k1);
Any reason you made them different?
You are running into either a problem with MATLAB's deconvolution algorithm, or floating point precision problems (or maybe both). I suspect it's floating point precision due to all the divisions and subtractions that take place during the deconvolution, but it might be worth contacting MathWorks directly to ask what they think.
Per MATLAB documentation, if [q,r] = deconv(v,u), then v = conv(u,q)+r must also hold (i.e., the output of deconv should always satisfy this). In your case this is violently violated. Put the following at the end of your script:
[reconSpike1 rem]=deconv(calcium1, k1);
max(conv(k1, reconSpike1) + rem - calcium1)
I get 6.75e227, which is not zero ;-) Next try changing the length of spike to 6000; you will get a small number (~1e-15). Gradually increase the length of spike; the error will get larger and larger. Note that if you put only one non-zero element into your spike, this behavior doesn't happen: the error is always zero. It makes sense; all MATLAB needs to do is divide everything by the same number.
Here's a simple demonstration using random vectors:
v = random('uniform', 1,2,100,1);
u = random('uniform', 1,2,100,1);
[q r] = deconv(v,u);
fprintf('maximum error for length(v) = 100 is %f\n', max(conv(u, q) + r - v))
v = random('uniform', 1,2,1000,1);
[q r] = deconv(v,u);
fprintf('maximum error for length(v) = 1000 is %f\n', max(conv(u, q) + r - v))
The output is:
maximum error for length(v) = 100 is 0.000000
maximum error for length(v) = 1000 is 14.910770
I don't know what you are really trying to accomplish, so it's hard to give further advice. But I'll just point out that if you have a problem where pulses are piling up and you want to extract information about each pulse, this can be a tricky problem. I know some people who work on things like this, so if you want some references let me know and I will ask them.
You should never expect that a deconvolution can simply undo a convolution. This is because the deconvolution is an ill-posed problem.
The problem comes from the fact that the convolution is an integral operator (in the continuous case you write down an integral int f(x) g(x-t) dx or something similar). Now, the inverse of computing an integral (the de-convolution) is to apply a differentiation. Unfortunately, the differential amplifies noise in the input. Thus, if your integral only has slight errors on it (and floating-point inaccuarcies might already be enough), you end up with a total different outcome after differentiation.
There are some possibilities how this amplification can be mitigated but these have to be tried on a per-application basis.