Julia outperformed by MATLAB?

I have been trying to optimise some code that takes an input signal in the form of a 2D array and convolves it with a Gaussian of a certain FWHM. The size of this 2D signal array is (671, 2001), and I am convolving along the second axis (the time axis), where
time = 0:0.5:1000
nt = length(time) # nt = 2001
I am convolving with a Gaussian of FWHM 150, corresponding to a standard deviation of about 63.7. I then pad the initial signal by 3 times the FWHM, setting the new values before the original start index to 0 and the new values after the original end index to the signal at the last time point.
Hence, nt = 3801 after padding.
My MATLAB code calculates this convolution in 117.5043 seconds. This is done through a nested for loop over each of the axes - which should be slow in MATLAB. The main part of the MATLAB code is as follows,
for it = 1:nt
    % scan convolution mask across time
    kernal = preb*exp(-b*(tc-tc(it)).^2); % convolution mask - Gaussian
    N(it) = sum(kernal)*dt; % normalise
    kernal = kernal/N(it); % ensure convolution integrates to 1
    % calculate convolution for each row
    for iq = 1:Nq
        WWc(iq,it) = dt*sum((WWc0(iq,1:nt).').*kernal(1:nt));
    end
end
I have rewritten this in Julia, except using circulant matrices to construct the Gaussian kernel; this effectively removes the for loop over time, and the kernel to be applied over all time is now fully defined by one matrix. I then either loop over the rows and multiply this kernel by the signal of each row, or use the mapslices function. The main part of the Julia code is as follows,
K = Circulant(G.Fx) # construct circulant matrix of Gaussian function G.Fx
conv = zeros(nr, nt) # init convolved signal
# for i = 1:nr # if using a loop instead of mapslices
#     conv[i, :] = transpose(sC[i, :]) * K
# end
conv = mapslices(x -> K*x, sC; dims=2)
In Julia this convolution takes 368 seconds (almost 3 times as slow as MATLAB), despite using circulant matrices to skip a for loop and reducing the work to multiplying two arrays of size (1, 3801) and (3801, 3801) for each row.
Output of @time:
368.047741 seconds (19.37 G allocations: 288.664 GiB, 14.30% gc time, 0.03% compilation time)
Both are run on the same computer - should Julia not outperform MATLAB here?
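(For reference, the fix that eventually worked - see UPDATE 2 below - is to materialise the circulant into a dense Matrix once and let BLAS do a single matrix-matrix product, instead of one Circulant-times-vector product per row via mapslices. A minimal sketch, assuming sC is the padded (nr, ntc) signal and Kv the kernel vector:)
using LinearAlgebra, SpecialMatrices

function fast_conv(sC::Matrix{Float64}, Kv::Vector{Float64}, dt::Float64)
    K = Matrix(Circulant(Kv)) # dense materialisation: one allocation, cheap indexing
    return (sC * K) .* dt     # a single multithreaded BLAS gemm call
end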
UPDATE:
Playing around with the Julia code some more, I have rewritten the matrix multiplication in terms of loops, but it is still slow. The convolution is performed within a function, although it calls two external constructor types - Gaussian or Lorentzian, depending on an input flag. I also load SpecialMatrices.jl for the Circulant constructor. I am new to Julia and not sure whether that would make it slow. When using the @time macro on independent parts of the function, it is apparent that every part is really fast up until it hits the loop at the end that actually calculates the convolution. This loop is incredibly slow either way I do it, be that using matrix multiplication over the circulant kernel, a double nested loop in combination with the sum() function, or a triple nested loop that does the sum itself.
using MAT
using SpecialMatrices

wd = " "
inputfile = matopen(wd*"/Signal.mat")
signal = read(inputfile, "pdW")
close(inputfile)
inputfile = matopen(wd*"/Vectors.mat")
qAng = read(inputfile, "qAng")
tvec = read(inputfile, "tt")

struct Gaussian
    Fx::AbstractVector{Float64}
    sigma::Float64
    height::Float64
    fwhm::Float64
    function Gaussian(fwhm::Float64, x::StepRangeLen{Float64}, x0::Float64)
        sigma = fwhm/2sqrt(2log(2))
        height = 1/sigma*sqrt(2pi)
        Fx = height .* exp.((-1 .* (x .- x0).^2) ./ (2*sigma^2))
        new(Fx, sigma, height, fwhm)
    end
end

struct Lorentzian
    Fx::AbstractVector{Float64}
    gamma::Float64
    fwhm::Float64
    function Lorentzian(fwhm::Float64, x::StepRangeLen{Float64}, x0::Float64)
        gamma = fwhm/2
        Fx = 1 ./ ((pi * gamma) .* (1 .+ ((x .- x0) ./ gamma).^2))
        new(Fx, gamma, fwhm)
    end
end
function convolve(S::Array{Float64}, fwhm::Float64, time::Array{Float64}, type::Int)
    nr::Int64, nc::Int64 = size(S)
    nt::Int64 = length(time)
    dt::Float64 = sum(diff(time, dims=2))/(length(time)-1)
    duration::Float64 = fwhm*3 # must pad by three times FWHM to deal with edges
    if dt != diff(time, dims=2)[1]; error("Non-linear time range."); end
    if nr != nt && nc != nt; error("Inconsistent dimensions."); end
    if nc != nt; S = transpose(S); nr, nc = nc, nr; end # column always time axis
    tmin::Float64 = minimum(time); tmax::Float64 = maximum(time)
    tconv_min::Float64 = tmin - duration; tconv_max::Float64 = tmax + duration
    tconv = tconv_min:dt:tconv_max # extended convolution time vector
    ntc::Int64 = length(tconv)
    padend::Int64 = (duration/dt) # index at which 0 padding ends and signal starts
    padstart::Int64 = (padend + tmax / dt) # index where signal ends and padding starts
    type == 0 ? Kv = Gaussian(fwhm, tconv, tconv[1]) : Kv = Lorentzian(fwhm, tconv, tconv[1])
    println("Convolving signal with a $(typeof(Kv)) function with FWHM: $(Kv.fwhm) fs.")
    K::Array{Float64} = Circulant(Kv.Fx) # construct kernel from circulant matrix
    sC = zeros(Float64, nr, ntc)
    sC[1:nr, 1:padend] .= S[1:nr, 1] # extend and pad signal
    sC[1:nr, padend:padstart] .= S
    sC[1:nr, padstart:end] .= S[1:nr, end]
    println("""
        Signal padded by 3*FWHM ( $(duration) fs ) forwards and backwards.
        Original time length: $(nt), Extended time: $(ntc), Diff: $(ntc-nt) steps.
        Padded signal size: $(size(sC)).
        Kernel size: $(size(K[1, :])).
        """)
    conv = zeros(Float64, nr, ntc)
    for t in 1:ntc
        for q in 1:nr
            #conv[q, t] = sum(sC[q, 1:ntc] .* K[t, :])*dt
            conv[q, t] = 0.0
            for j in 1:ntc
                conv[q, t] += sC[q, j] * K[t, j] * dt
            end
        end
    end
    return conv, tconv
end
fwhm = 100.0
conv, tconv = convolve(signal, fwhm, tvec, 0)
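(For comparison, if one does want an explicit loop, the usual pattern is to make the innermost index walk down columns - Julia arrays are column-major - and hoist the dt multiply out of the accumulation. A sketch, assuming a dense ntc-by-ntc kernel K as constructed above:)
# Cache-friendlier triple loop: transpose both operands once so the inner
# index j runs down contiguous columns, and apply dt once per output element.
function loop_conv(sC::Matrix{Float64}, K::Matrix{Float64}, dt::Float64)
    nr, ntc = size(sC)
    conv = zeros(nr, ntc)
    sCt = permutedims(sC) # ntc x nr: time now runs down columns
    Kt  = permutedims(K)  # column t holds kernel row t
    @inbounds for t in 1:ntc, q in 1:nr
        acc = 0.0
        @simd for j in 1:ntc
            acc += sCt[j, q] * Kt[j, t] # == sC[q, j] * K[t, j], but contiguous
        end
        conv[q, t] = acc * dt
    end
    return conv
end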
UPDATE 2:
Working fast code, with a wrapper for convolving a random signal.
using LinearAlgebra
using SpecialMatrices

function wrapper()
    signal = rand(670, 2001)
    tvec = 0:0.5:1000
    fwhm = 150.0
    Flag = 0
    conv = convolve(signal, tvec, fwhm, Flag)
end
function Gaussian(x::StepRangeLen{Float64}, x0::Float64, fwhm::T) where {T<:Union{Float64, Int64}}
    sigma = fwhm/2sqrt(2log(2))
    height = 1/sigma*sqrt(2pi)
    Fx = height .* exp.((-1 .* (x .- x0).^2) ./ (2*sigma^2))
    Fx = Fx./sum(Fx)
end

function Lorentzian(x::StepRangeLen{Float64}, x0::Float64, fwhm::T) where {T<:Union{Float64, Int64}}
    gamma = fwhm/2
    Fx = 1 ./ ((pi * gamma) .* (1 .+ ((x .- x0) ./ gamma).^2))
    Fx = Fx./sum(Fx)
end

function generate_kernel(tconv::StepRangeLen{Float64}, fwhm::Float64, centre::Float64, ctype::Int64)
    if ctype == 0
        K = Gaussian(tconv, centre, fwhm)
    elseif ctype == 1
        K = Lorentzian(tconv, centre, fwhm)
    else
        error("Only Gaussian and Lorentzian functions currently available.")
    end
    K = Matrix(Circulant(K))
    return K
end
function convolve(S::Matrix{Float64}, time::StepRangeLen{Float64}, fwhm::Float64, ctype::Int64)
    nr::Int64, nc::Int64 = size(S)
    dt::Float64 = sum(diff(time, dims=1))/(length(time)-1)
    nt::Int64 = length(time)
    duration::Float64 = fwhm*3 # must pad by three times FWHM to deal with edges
    tmin::Float64 = minimum(time); tmax::Float64 = maximum(time)
    if dt != diff(time, dims=1)[1]; error("Non-linear time range."); end
    if nr != nt && nc != nt; error("Inconsistent dimensions."); end
    if nc != nt; S = copy(transpose(S)); nr, nc = nc, nr; end # column always time axis
    tconv_min::Float64 = tmin - duration; tconv_max::Float64 = tmax + duration
    tconv = tconv_min:dt:tconv_max # extended convolution time vector
    ntc = length(tconv)
    padend::Int64 = (duration/dt) # index at which 0 padding ends and signal starts
    padstart::Int64 = (padend + tmax / dt) # index where signal ends and padding starts
    K = generate_kernel(tconv, fwhm, tconv[padend], ctype)
    sC = zeros(Float64, nr, ntc)
    sC[1:nr, 1:padend] .= @view S[1:nr, 1] # extend and pad signal
    sC[1:nr, padend:padstart] .= S
    sC[1:nr, padstart:end] .= @view S[1:nr, end]
    conv = zeros(Float64, nr, ntc)
    conv = convolution_integral(sC, K, dt)
    S .= conv[:, 1:nt] # return convolved signal with same original size
    return S
end

function convolution_integral(signal::Matrix{Float64}, kernel::Matrix{Float64}, dt::Float64)
    LinearAlgebra.BLAS.set_num_threads(Sys.CPU_THREADS)
    conv = signal * kernel
    conv .*= dt
    return conv
end
Using BenchmarkTools:
julia> @btime wrapper();
128.438 ms (11125 allocations: 300.24 MiB)
There is some strange slicing of the padded signal, in the sense that the padding at early times flips to the end of the time extension after convolution. Something to do with the centering of the Gaussian kernel, I think. For now I just slice the array back into its original dimensions without the padding and get the result I expect.
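(A quick check suggests the flip is the circulant matrix making the convolution periodic - anything that runs off the front of the padded window reappears at the end. The kernel vector below is made up purely for illustration:)
using SpecialMatrices
K = Matrix(Circulant([0.5, 0.25, 0.25, 0.0, 0.0])) # kernel with weight "after" the centre
x = [1.0 0.0 0.0 0.0 0.0]                          # unit impulse at the first sample
x * K  # -> [0.5 0.0 0.0 0.25 0.25]: the trailing weight wraps to the *last* columns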

Related

adaptive elliptical structuring element in MATLAB

I'm trying to create an adaptive elliptical structuring element for an image, in order to dilate or erode it. I wrote this code, but unfortunately all of the structuring elements come out as ones(2*M+1).
I = input('Enter the input image: ');
M = input('Enter the maximum allowed semi-major axes length: ');
% determining ellipse parameters from eigenvalue decomposition of LST
row = size(I,1);
col = size(I,2);
SE = cell(row,col);
padI = padarray(I,[M M],'replicate','both');
padrow = size(padI,1);
padcol = size(padI,2);
for m = M+1:padrow-M
    for n = M+1:padcol-M
        a = (l2(m-M,n-M)+eps/l1(m-M,n-M)+l2(m-M,n-M)+2*eps)*M;
        b = (l1(m-M,n-M)+eps/l1(m-M,n-M)+l2(m-M,n-M)+2*eps)*M;
        if e1(m-M,n-M,1)==0
            phi = pi/2;
        else
            phi = atan(e1(m-M,n-M,2)/e1(m-M,n-M,1));
        end
        % defining structuring element for each pixel of image
        x0 = m;
        y0 = n;
        se = zeros(2*M+1);
        row_se = 0;
        for i = x0-M:x0+M
            row_se = row_se+1;
            col_se = 0;
            for j = y0-M:y0+M
                col_se = col_se+1;
                x = j-y0;
                y = x0-i;
                if ((x*cos(phi)+y*sin(phi))^2)/a^2+((x*sin(phi)-y*cos(phi))^2)/b^2 <= 1
                    se(row_se,col_se) = 1;
                end
            end
        end
        SE{m-M,n-M} = se;
    end
end
a and b are the semi-major and semi-minor axis lengths, and phi is the angle between a and the x axis.
I used 2 MATLAB functions to compute the Local Structure Tensor of the image, and then its eigenvalues and eigenvectors for each pixel. These are the matrices l1, l2, e1 and e2.
This is the bit of your code I didn't understand:
a = (l2(m-M,n-M)+eps/l1(m-M,n-M)+l2(m-M,n-M)+2*eps)*M;
b = (l1(m-M,n-M)+eps/l1(m-M,n-M)+l2(m-M,n-M)+2*eps)*M;
I simplified the expression for b to (just removing the indexing):
b = (l1+eps/l1+l2+2*eps)*M;
For l1 and l2 in the normal range we get:
b =(approx)= (l1+0/l1+l2+2*0)*M = (l1+l2)*M;
Thus, b can easily be larger than M, which I don't think is your intention. The eps in this case also doesn't protect against division by zero, which is typically the purpose of adding eps: if l1 is zero, eps/l1 is Inf.
Looking at this expression, it seems to me that you intended this instead:
b = (l1+eps)/(l1+l2+2*eps)*M;
Here, you're adding eps to each of the eigenvalues, making them guaranteed non-zero (the structure tensor is symmetric, positive semi-definite). Then you're dividing l1 by the sum of eigenvalues, and multiplying by M, which leads to a value between 0 and M for each of the axes.
So, this seems to be a case of misplaced parenthesis.
Just for the record, this is what you need in your code:
a = ( l2(m-M,n-M)+eps ) / ( l1(m-M,n-M)+l2(m-M,n-M)+2*eps )*M;
b = ( l1(m-M,n-M)+eps ) / ( l1(m-M,n-M)+l2(m-M,n-M)+2*eps )*M;
(note the parentheses added around the numerator and the denominator)
Note that you can simplify your code by defining, outside of the loops:
[se_x,se_y] = meshgrid(-M:M,-M:M);
The inner two loops, over i and j, to construct se can then be written simply as:
se = ((se_x.*cos(phi)+se_y.*sin(phi)).^2)./a.^2 + ...
((se_x.*sin(phi)-se_y.*cos(phi)).^2)./b.^2 <= 1;
(Note the .* and .^ operators; these do element-wise multiplication and power.)
A further slight improvement comes from realizing that phi is first computed from e1(m,n,1) and e1(m,n,2), and then used in calls to cos and sin. If we assume that the eigenvector is properly normalized, then
cos(phi) == e1(m,n,1)
sin(phi) == e1(m,n,2)
But you can always make sure they are normalized:
cos_phi = e1(m-M,n-M,1);
sin_phi = e1(m-M,n-M,2);
len = hypot(cos_phi,sin_phi);
cos_phi = cos_phi / len;
sin_phi = sin_phi / len;
se = ((se_x.*cos_phi+se_y.*sin_phi).^2)./a.^2 + ...
((se_x.*sin_phi-se_y.*cos_phi).^2)./b.^2 <= 1;
Considering trigonometric operations are fairly expensive, this should speed up your code a bit.
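Putting these pieces together, the body of the double loop over m and n might look like this (a sketch that reuses your l1, l2, e1, M, and the meshgrid above; the degenerate case of a zero eigenvector is left unhandled):
[se_x,se_y] = meshgrid(-M:M,-M:M); % once, outside the loops
for m = M+1:padrow-M
    for n = M+1:padcol-M
        l1v = l1(m-M,n-M); l2v = l2(m-M,n-M);
        a = (l2v+eps)/(l1v+l2v+2*eps)*M; % corrected axes, each between 0 and M
        b = (l1v+eps)/(l1v+l2v+2*eps)*M;
        cos_phi = e1(m-M,n-M,1); % eigenvector components instead of cos/sin calls
        sin_phi = e1(m-M,n-M,2);
        len = hypot(cos_phi,sin_phi);
        cos_phi = cos_phi/len; sin_phi = sin_phi/len;
        SE{m-M,n-M} = ((se_x.*cos_phi+se_y.*sin_phi).^2)./a.^2 + ...
                      ((se_x.*sin_phi-se_y.*cos_phi).^2)./b.^2 <= 1;
    end
end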

Speed up calculation of maximum of normxcorr2

I need to calculate the maximum of the normalized cross-correlation of millions of particles. The size of the two arguments of normxcorr2 is 56*56. I can't parallelize the calculations. Is there any suggestion to speed up the code, especially since I don't need all the results but only the maximum value of each cross-correlation (to know the displacement)?
Example of the algorithm
%The choice of 170 particles is because in each time
%the code detects 170 particles, so over 10000 images it's 1 700 000 particles
particle_1=rand(54,54,170);
particle_2=rand(56,56,170);
for i=1:170
C=normxcorr2(particle_1(:,:,i),particle_2(:,:,i));
L(i)=max(C(:));
end
I don't have MATLAB, so I ran the following code on this site: https://www.tutorialspoint.com/execute_matlab_online.php which actually runs Octave. So I implemented a "naive" normalized cross-correlation, and indeed for these small image sizes the naive version performs better:
Elapsed time is 2.62645 seconds - for normxcorr2
Elapsed time is 0.199034 seconds - for my naive_normxcorr2
The code is based on the article http://scribblethink.org/Work/nvisionInterface/nip.pdf, which describes how to calculate the standard deviation needed for the normalization in an efficient way using an integral image; this is the box_corr2 function.
Also, MATLAB's normxcorr2 returns a padded image, so I took the max on the unpadded part.
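The identity the integral images exploit is that a window's (population) variance is E[x^2] - E[x]^2, so the mean of the window and the mean of its square - both cheap box sums - are all you need. A tiny sanity check of the identity:
% Sanity check: population variance of a window equals E[x^2] - E[x]^2
img = rand(8,8);
w = img(2:4,3:5);                 % an arbitrary 3x3 window
m1 = mean(w(:));                  % E[x]   (box sum / window area)
m2 = mean(w(:).^2);               % E[x^2] (box sum over img.^2 / window area)
assert(abs((m2 - m1^2) - var(w(:),1)) < 1e-12) % var(...,1) = population variance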
pkg load image

function [N] = naive_corr(pat,img)
    [n,m] = size(img);
    [np,mp] = size(pat);
    N = zeros(n-np+1,m-mp+1);
    for i = 1:n-np+1
        for j = 1:m-mp+1
            N(i,j) = sum(dot(pat,img(i:i+np-1,j:j+mp-1)));
        end
    end
end

% w_arr: the array of coefficients for the boxes
% box_arr: of size [k,4] where k is the number of boxes, each box represented by
% 4 something ...
function [C] = box_corr2(img,box_arr,w_arr,n_p,m_p)
    % construct integral image + zero pad (for boundary problems)
    I = cumsum(cumsum(img,2),1);
    I = [zeros(1,size(I,2)+2); [zeros(size(I,1),1) I zeros(size(I,1),1)]; zeros(1,size(I,2)+2)];
    % initialize result matrix
    [n,m] = size(img);
    C = zeros(n-n_p+1,m-m_p+1);
    %C = zeros(n,m);
    jump_x = 1;
    jump_y = 1;
    x_start = ceil(n_p/2);
    x_end = n-x_start+mod(n_p,2);
    x_span = x_start:jump_x:x_end;
    y_start = ceil(m_p/2);
    y_end = m-y_start+mod(m_p,2);
    y_span = y_start:jump_y:y_end;
    arr_a = box_arr(:,1) - x_start;
    arr_b = box_arr(:,2) - x_start+1;
    arr_c = box_arr(:,3) - y_start;
    arr_d = box_arr(:,4) - y_start+1;
    % cumulate box responses
    k = size(box_arr,1); % == numel(w_arr)
    for i = 1:k
        a = arr_a(i);
        b = arr_b(i);
        c = arr_c(i);
        d = arr_d(i);
        C = C ...
            + w_arr(i) * ( I(x_span+b,y_span+d) ...
                         - I(x_span+b,y_span+c) ...
                         - I(x_span+a,y_span+d) ...
                         + I(x_span+a,y_span+c) );
    end
end

function [NCC] = naive_normxcorr2(temp,img)
    [n_p,m_p] = size(temp);
    M = n_p*m_p;
    % compute template mean & std
    temp_mean = mean(temp(:));
    temp = temp - temp_mean;
    temp_std = sqrt(sum(temp(:).^2)/M);
    % compute windows' mean & std
    wins_mean = box_corr2(img,[1,n_p,1,m_p],1/M,n_p,m_p);
    wins_mean2 = box_corr2(img.^2,[1,n_p,1,m_p],1/M,n_p,m_p);
    wins_std = real(sqrt(wins_mean2 - wins_mean.^2));
    NCC_naive = naive_corr(temp,img);
    NCC = NCC_naive ./ (M .* temp_std .* wins_std);
end
n = 170;
particle_1 = rand(54,54,n);
particle_2 = rand(56,56,n);
[n_p1,m_p1,c_p1] = size(particle_1);
[n_p2,m_p2,c_p2] = size(particle_2);
L1 = zeros(n,1);
L2 = zeros(n,1);
tic
for i = 1:n
    C1 = normxcorr2(particle_1(:,:,i),particle_2(:,:,i));
    C1_unpadded = C1(n_p1:n_p2, m_p1:m_p2);
    L1(i) = max(C1_unpadded(:));
end
toc
tic
for i = 1:n
    C2 = naive_normxcorr2(particle_1(:,:,i),particle_2(:,:,i));
    L2(i) = max(C2(:));
end
toc
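One more note, since the stated goal is the displacement: max(C(:)) alone discards the location of the peak. A sketch of recovering it as well (works for either implementation; shown here for the naive one inside the loop above):
% Recover the peak location as well as its value
[L2(i), idx] = max(C2(:));                     % peak value and linear index
[peak_row, peak_col] = ind2sub(size(C2), idx); % subscripts of the peak
% for the 'valid'-size naive output, the offset of the template inside the
% image is then (peak_row-1, peak_col-1)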

How integral image influence the result of local binary pattern or center symmetric local binary pattern

I know this looks somewhat unrelated to code errors and development, but
I want to know if someone can understand these two pieces of code, for the
integral image and the local binary pattern, and tell me how they affect the resulting histograms.
Before the use of the integral image, the output histogram is normal, but after applying the integral image method I found that most of the histogram changed to zeros. To clarify, the expected benefit of using an integral image is to speed up the LBP method. In fact, I haven't seen this before because I'm trying it for the first time. Could anybody who knows about this help me, please?
These are the codes of every method:
Integral image
function [outimg] = integral( image )
    [y,x] = size(image);
    outimg = zeros(y+1,x+1);
    disp(y);
    for a = 1:y+1
        for b = 1:x+1
            rx = b-1;
            ry = a-1;
            while ry >= 1
                while rx >= 1
                    outimg(a,b) = outimg(a,b)+image(ry,rx);
                    rx = rx-1;
                end
                rx = b-1;
                ry = ry-1;
            end
            % outimg(a,b) = outimg(a,b)-image(a,b);
        end
    end
    % outimg(1,1) = image(1,1);
    disp('end loop');
end
CS-LBP
function h = CSLBP(I)
    %% this function takes a patch or image as input and returns the histogram of
    %% the CSLBP operator.
    h = zeros(1,16);
    [y,x] = size(I);
    T = 0.1; % threshold given by the authors in their paper
    for i = 2:y-1
        for j = 2:x-1
            % keeping I(i,j) as center we compute CSLBP
            % N0 - N4
            a = ((I(i,j+1) - I(i, j-1) > T ) * 2^0 );
            b = ((I(i+1,j+1) - I(i-1, j-1) > T ) * 2^1 );
            c = ((I(i+1,j) - I(i-1, j) > T ) * 2^2 );
            d = ((I(i+1,j-1) - I(i-1, j+1) > T ) * 2^3 );
            e = a+b+c+d;
            h(e+1) = h(e+1) + 1;
        end
    end
end
MATLAB has an inbuilt function for creating integral images, integralImage(). If you don't want to use the Computer Vision System Toolbox you can achieve the same result by calling:
IntIm = cumsum(cumsum(double(I)),2);
possibly adding padding if needed. You should also check that the image is not saturated; that happens sometimes. The cumulative sum quickly reaches integers way above the range of uint8 and uint16 - I even had it happen with a double once!
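For instance, to reproduce the (y+1)-by-(x+1) zero-padded layout of the integral() function above in one vectorized shot (a sketch, assuming a grayscale image I):
% Vectorized replacement for the nested-while integral(): the first row and
% column stay zero, and entry (a,b) holds the sum of image(1:a-1,1:b-1).
[y,x] = size(I);
outimg = zeros(y+1,x+1);
outimg(2:end,2:end) = cumsum(cumsum(double(I),1),2);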

Matlab : Confusion regarding unit of entropy to use in an example

Figure 1. Hypothesis plot. y axis: Mean entropy. x axis: Bits.
This question is a continuation of a previous one: Matlab : Plot of entropy vs digitized code length.
I want to calculate the entropy of a random variable that is a discretized version (0/1) of a continuous random variable x. The random variable denotes the state of a nonlinear dynamical system called the Tent Map. Iterating the Tent Map yields a time series of length N.
The code should exit as soon as the entropy of the discretized time series becomes equal to the entropy of the dynamical system. It is known theoretically that the entropy of the system, H, is log_e(2), i.e. ln(2) ≈ 0.69. The objective of the code is to find the number of iterations, j, needed to produce the same entropy as the entropy of the system, H.
Problem 1: When I calculate the entropy of the binary time series, which is the information message, should I be doing it in the same base as H? Or should I convert the value of H to bits because the information message is in 0/1? Both give different results, i.e. different values of j.
Problem 2: It can happen that the probability of 0's or 1's becomes zero, so the corresponding entropy can become infinite. To prevent this, I thought of putting in a check using if-else. But the loop
if entropy(:,j)==NaN
    entropy(:,j)=0;
end
does not seem to be working. I shall be grateful for ideas and help to solve this problem. Thank you.
UPDATE: I implemented the suggestions and answers to correct the code. However, my logic of solving was not proper earlier. In the revised code, I want to calculate the entropy for time series of lengths 2, 8, 16, and 32 bits. For each code length, the entropy is calculated. The entropy calculation for each code length is repeated N times, starting each time from a different initial condition of the dynamical system. This approach is adopted to check at which code length the entropy becomes 1. The plot of entropy vs bits should increase from zero and gradually approach 1, after which it saturates - remaining constant for all the remaining bits. I am unable to get this curve (Figure 1). I shall appreciate help in correcting where I am going wrong.
clear all
H = 1 % in bits
Bits = [2,8,16,32,64];
threshold = 0.5;
N = 100; % number of runs of the experiment
for r = 1:length(Bits)
    t = Bits(r)
    for Runs = 1:N
        x(1) = rand;
        for j = 2:t
            % Iterating over the Tent Map
            if x(j - 1) < 0.5
                x(j) = 2 * x(j - 1);
            else
                x(j) = 2 * (1 - x(j - 1));
            end % if
        end
        % Binarizing the output of the Tent Map
        s = (x >= threshold);
        p1 = sum(s == 1) / length(s); % calculating probability of 1's
        p0 = 1 - p1; % calculating probability of 0's
        entropy(t) = -p1 * log2(p1) - (1 - p1) * log2(1 - p1); % entropy in bits
        if isnan(entropy(t))
            entropy(t) = 0;
        end
        %disp(abs(lambda-H))
    end
    Entropy_Run(Runs) = entropy(t)
end
Entropy_Bits(r) = mean(Entropy_Run)
plot(Bits,Entropy_Bits)
For problem 1, H and entropy can be in either nats or bits units, so long as they are both computed using the same units. In other words, you should use either log for both or log2 for both. With the code sample you provided, H and entropy are correctly calculated using consistent nats units. If you prefer to work in units of bits, the conversion of H should give you H = log(2)/log(2) = 1 (or using the conversion factor 1/log(2) ~ 1.443, H ~ 0.69 * 1.443 ~ 1).
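As a concrete check of that conversion:
H_nats = log(2)          % entropy of the Tent Map in nats, ~0.6931
H_bits = H_nats / log(2) % the same quantity in bits: exactly 1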
For problem 2, as @noumenal already pointed out, you can check for NaN using isnan. Alternatively you could check whether p1 is within (0,1) (excluding 0 and 1) with:
if (p1 > 0 && p1 < 1)
    entropy(:,j) = -p1 * log(p1) - (1 - p1) * log(1 - p1); % entropy in natural base e
else
    entropy(:,j) = 0;
end
First you just
function [mean_entropy, bits] = compute_entropy(bits, blocks, threshold, replicate)
    if replicate
        disp('Replication is ON');
    else
        disp('Replication is OFF');
    end
    %%
    % Populate random vector
    if replicate
        seed = 849;
        rng(seed);
    else
        rng('default');
    end
    rs = rand(blocks, 1); % one initial condition per block
    %%
    % Get random
    trial_entropy = zeros(length(bits), blocks);
    for r = 1:length(rs)
        bit_entropy = zeros(length(bits), 1); % H
        % Traverse bit trials
        for b = 1:(length(bits)) % N
            tent_map = zeros(b, 1); % preallocate for memory management
            % Initialize
            tent_map(1) = rs(r);
            for j = 2:b % j is the iterator, b is the current bit
                if tent_map(j - 1) < threshold
                    tent_map(j) = 2 * tent_map(j - 1);
                else
                    tent_map(j) = 2 * (1 - tent_map(j - 1));
                end % if
            end
            % Binarize the output of the Tent Map
            s = (tent_map >= threshold); % logical series (not find(), which returns indices)
            p1 = sum(s == 1) / length(s); % calculate probability of 1's
            %p0 = 1 - p1; % calculate probability of 0's
            bit_entropy(b) = -p1 * log2(p1) - (1 - p1) * log2(1 - p1); % entropy in bits
            if isnan(bit_entropy(b))
                bit_entropy(b) = 0;
            end
            %disp(abs(lambda-h))
        end
        trial_entropy(:, r) = bit_entropy;
        disp('Trial Statistics')
        data = get_summary(bit_entropy);
        disp('Mean')
        disp(data.mean);
        disp('SD')
        disp(data.sd);
    end
    % TO DO Compute the mean for each BIT index in trial_entropy
    mean_entropy = 0;
    disp('Overall Statistics')
    data = get_summary(trial_entropy);
    disp('Mean')
    disp(data.mean);
    disp('SD')
    disp(data.sd);
    % This is the wrong mean...
    mean_entropy = data.mean;

    function summary = get_summary(entropy)
        summary = struct('mean', mean(entropy), 'sd', std(entropy));
    end
end
and then you just have to
% Entropy Script
clear all
%% Settings
replicate = false; % = false % Use true for debugging only.
%H = 1; %in bits
Bits = 2.^(1:6);
Threshold = 0.5;
%Tolerance = 0.001;
Blocks = 100; %Number of runs of the experiment
%% Run
[mean_entropy, bits] = compute_entropy(Bits, Blocks, Threshold, replicate);
%What we want
%plot(bits, mean_entropy);
%What we have
plot(1:length(mean_entropy), mean_entropy);

plotting volume-time graph of .wav file

I'm trying to get a volume-time graph of a .wav file. First, I recorded sound (patient exhalations) via Android as a .wav file, but when I read this .wav file in MATLAB it has negative values. What is the meaning of the negative values? Second, MATLAB experts, could you please check whether the code below does the same as written in my comments? Another question: in
Y = fft(WindowArray);
p = abs(Y).^2;
I took the power of the values returned from fft - is that correct, and what is the goal of this step?
[data, fs] = wavread('newF2');
% read exhalation audio wav file (1 channel, mono)
% frequency is 44100 Hz
% windows of 0.1 s and overlap of 0.05 s
WINDOW_SIZE = fs*0.1; % 4410 = fs*0.1
array_size = length(data); % array size of data
numOfPeaks = (array_size/(WINDOW_SIZE/2)) - 1;
step = floor(WINDOW_SIZE/2); % step size used in loop
transformed = data;
start = 1;
k = 1;
t = 1;
g = 1;
o = 1;
% performing fft on each window and finding the peak of windows
while (((start+WINDOW_SIZE)-1) <= array_size)
    j = 1;
    i = start;
    while (j <= WINDOW_SIZE)
        WindowArray(j) = transformed(i);
        j = j+1;
        i = i+1;
    end
    Y = fft(WindowArray);
    p = abs(Y).^2; % power
    [a, b] = max(abs(Y)); % find max a and its index b
    [m, i] = max(p); % the maximum of the power m and its index i
    maximum(g) = m;
    index(t) = i;
    power(o) = a;
    indexP(g) = b;
    start = start + step;
    k = k+1;
    t = t+1;
    g = g+1;
    o = o+1;
end
% low pass filter
% filtering noise: ignore frequencies that are less than 5% of maximum frequency
for u = 1:length(maximum)
    M = max(maximum); % highest value in the array
    Accept = 0.05*M;
    if (maximum(u) > Accept)
        maximum = maximum(u:length(maximum));
        break;
    end
end
% preparing the time of the graph,
% location of the peak flow rates are estimated
TotalTime = (numOfPeaks * 0.1);
time1 = [0:0.1:TotalTime];
if (length(maximum) > ceil(numOfPeaks));
    maximum = maximum(1:ceil(numOfPeaks));
end
time = time1(1:length(maximum));
% plotting frequency-time graph
figure(1);
plot(time, maximum);
ylabel('Frequency');
xlabel('Time (in seconds)');
% plotting volume-time graph
figure(2);
plot(time, cumsum(maximum)); % integration over time to get volume
ylabel('Volume');
xlabel('Time (in seconds)');
(I only answer the part of the question which I understood.)
By default MATLAB normalizes the audio wave to the -1...1 range. Use the 'native' option if you want the integer data.
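For example (audioread replaces the deprecated wavread in recent MATLAB releases; the 'native' flag returns the file's integer samples unscaled):
[data, fs] = audioread('newF2.wav');           % doubles scaled to the -1..1 range
[raw,  fs] = audioread('newF2.wav', 'native'); % samples in the file's own integer type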
First, p = abs(Y).^2 is indeed the proper way to square the values returned from the FFT (the Python code below uses the equivalent abs(p)**2). The reason you take the absolute value of the FFT return values is that those numbers are complex, with a real and an imaginary part; the absolute value (or modulus) of a complex number is its magnitude. The goal of taking the power could be to eventually obtain an RMS value (root mean squared) of your overall amplitude, but you could also have something else in mind. When you say volume-time I assume you want decibels, so try something like this:
# Imports added so the snippet is self-contained (the original omitted them).
from numpy import arange, ceil, log10
from numpy.fft import fft
from scipy.io import wavfile
import matplotlib.pyplot as plt

def plot_signal(file_name):
    sampFreq, snd = wavfile.read(file_name)
    snd = snd / (2.**15)  # convert sound array to floating point values in -1..1
    s1 = snd[:, 0]  # left channel
    s2 = snd[:, 1]  # right channel
    n = len(s1)
    p = fft(s1)  # take the Fourier transform of the left channel
    m = len(s2)
    p2 = fft(s2)
    nUniquePts = int(ceil((n+1)/2.0))
    p = p[0:nUniquePts]
    p = abs(p)
    mUniquePts = int(ceil((m+1)/2.0))
    p2 = p2[0:mUniquePts]
    p2 = abs(p2)
    '''
    Left Channel
    '''
    p = p / float(n)  # scale by the number of points so that the magnitude does not
                      # depend on the length of the signal or its sampling frequency
    p = p**2          # square it to get the power
    # multiply by two (see technical document for details)
    # odd nfft excludes the Nyquist point
    if n % 2 > 0:  # we've got an odd number of points in the fft
        p[1:len(p)] = p[1:len(p)] * 2
    else:          # we've got an even number of points in the fft
        p[1:len(p) - 1] = p[1:len(p) - 1] * 2
    # frequency axis for the one-sided spectrum (the original plotted a time
    # axis here, whose length did not match p)
    freqArray = arange(0, nUniquePts, 1.0) * (sampFreq / n)
    plt.plot(freqArray / 1000, 10*log10(p), color='k')
    plt.xlabel('Frequency (kHz)')
    plt.ylabel('LeftChannel_Power (dB)')
    plt.show()
    '''
    Right Channel
    '''
    p2 = p2 / float(m)  # same scaling for the right channel
    p2 = p2**2          # square it to get the power
    if m % 2 > 0:  # we've got an odd number of points in the fft
        p2[1:len(p2)] = p2[1:len(p2)] * 2
    else:          # we've got an even number of points in the fft
        p2[1:len(p2) - 1] = p2[1:len(p2) - 1] * 2
    freqArray2 = arange(0, mUniquePts, 1.0) * (sampFreq / m)
    plt.plot(freqArray2 / 1000, 10*log10(p2), color='k')
    plt.xlabel('Frequency (kHz)')
    plt.ylabel('RightChannel_Power (dB)')
    plt.show()
I hope this helps.