PDF and CDF plot for central limit theorem using Matlab - matlab

I am struggling to plot the PDF and CDF graphs of where
Sn=X1+X2+X3+....+Xn
using central limit theorem where n = 1; 2; 3; 4; 5; 10; 20; 40
I am taking Xi to be a uniform continuous random variable for values between (0,3).
Here is what i have done so far -
close all
%different sizes of input X
%N=[1 5 10 50];
N = [1 2 3 4 5 10 20 40];
%interval (1,6) for random variables
a=0;
b=3;
%to store sum of differnet sizes of input
for i=1:length(N)
%generates uniform random numbers in the interval
X = a + (b-a).*rand(N(i),1);
S=zeros(1,length(X));
S=cumsum(X);
cd=cdf('Uniform',S,0,3);
plot(cd);
hold on;
end
legend('n=1','n=2','n=3','n=4','n=5','n=10','n=20','n=40');
title('CDF PLOT')
figure;
for i=1:length(N)
%generates uniform random numbers in the interval
X = a + (b-a).*rand(N(i),1);
S=zeros(1,length(X));
S=cumsum(X);
cd=pdf('Uniform',S,0,3);
plot(cd);
hold on;
end
legend('n=1','n=2','n=3','n=4','n=5','n=10','n=20','n=40');
title('PDF PLOT')
My output is nowhere near what I am expecting any help is much appreciated.

This can be done with vectorization using rand() and cumsum().
For example, the code below generates 40 replications of 10000 samples of a Uniform(0,3) distribution and stores in X. To meet the Central Limit Theorem (CLT) assumptions, they are independent and identically distributed (i.i.d.). Then cumsum() transforms this into 10000 copies of the Sn = X1 + X2 + ... where the first row is n = 10000copies of Sn = X1, the 5th row is n copies of S_5 = X1 + X2 + X3 + X4 + X5. The last row is n copies of S_40.
% MATLAB R2019a
% Setup
N = [1:5 10 20 40]; % values of n we are interested in
LB = 0; % lowerbound for X ~ Uniform(LB,UB)
UB = 3; % upperbound for X ~ Uniform(LB,UB)
n = 10000; % Number of copies (samples) for each random variable
% Generate random variates
X = LB + (UB - LB)*rand(max(N),n); % X ~ Uniform(LB,UB) (i.i.d.)
Sn = cumsum(X);
You can see from the image that the n = 2 case, the sum is indeed a Triangular(0,3,6) distribution. For the n = 40 case, the sum is approximately Normally distributed (Gaussian) with mean 60 (40*mean(X) = 40*1.5 = 60). This shows the convergence in distribution for both the probability density function (PDF) and the cumulative distribution function (CDF).
Note: The CLT is often stated with convergence in distribution to a Normal distribution with zero mean as it has been shifted. Shifting the results by subtracting mean(Sn) = n*mean(X) = n*0.5*(LB+UB) from Sn gets this done.
Code below isn't the gold standard but it produced the image.
figure
s(11) = subplot(6,2,1) % n = 1
histogram(Sn(1,:),'Normalization','pdf')
title(s(11),'n = 1')
s(12) = subplot(6,2,2)
cdfplot(Sn(1,:))
title(s(12),'n = 1')
s(21) = subplot(6,2,3) % n = 2
histogram(Sn(2,:),'Normalization','pdf')
title(s(21),'n = 2')
s(22) = subplot(6,2,4)
cdfplot(Sn(2,:))
title(s(22),'n = 2')
s(31) = subplot(6,2,5) % n = 5
histogram(Sn(5,:),'Normalization','pdf')
title(s(31),'n = 5')
s(32) = subplot(6,2,6)
cdfplot(Sn(5,:))
title(s(32),'n = 5')
s(41) = subplot(6,2,7) % n = 10
histogram(Sn(10,:),'Normalization','pdf')
title(s(41),'n = 10')
s(42) = subplot(6,2,8)
cdfplot(Sn(10,:))
title(s(42),'n = 10')
s(51) = subplot(6,2,9) % n = 20
histogram(Sn(20,:),'Normalization','pdf')
title(s(51),'n = 20')
s(52) = subplot(6,2,10)
cdfplot(Sn(20,:))
title(s(52),'n = 20')
s(61) = subplot(6,2,11) % n = 40
histogram(Sn(40,:),'Normalization','pdf')
title(s(61),'n = 40')
s(62) = subplot(6,2,12)
cdfplot(Sn(40,:))
title(s(62),'n = 40')
sgtitle({'PDF (left) and CDF (right) for Sn with n \in \{1, 2, 5, 10, 20, 40\}';'note different axis scales'})
for tgt = [11:10:61 12:10:62]
xlabel(s(tgt),'Sn')
if rem(tgt,2) == 1
ylabel(s(tgt),'pdf')
else % rem(tgt,2) == 0
ylabel(s(tgt),'cdf')
end
end
Key functions used for plot: histogram() from base MATLAB and cdfplot() from the Statistics toolbox. Note this could be done manually without requiring the Statistics toolbox with a few lines to obtain the cdf and then just calling plot().
There was some concern in comments over the variance of Sn.
Note the variance of Sn is given by (n/12)*(UB-LB)^2 (derivation below). Monte Carlo simulation shows our samples of Sn do have the correct variance; indeed, it converges to this as n gets larger. Simply call var(Sn(40,:)).
% with n = 10000
var(Sn(40,:)) % var(S_40) = 30 (will vary slightly depending on random seed)
(40/12)*((UB-LB)^2) % 29.9505
You can see the convergence is very good by S_40:
step = 0.01;
Domain = 40:step:80;
mu = 40*(LB+UB)/2;
sigma = sqrt((40/12)*((UB-LB)^2));
figure, hold on
histogram(Sn(40,:),'Normalization','pdf')
plot(Domain,normpdf(Domain,mu,sigma),'r-','LineWidth',1.4)
ylabel('pdf')
xlabel('S_n')
Derivation of mean and variance for Sn:
For the expectation (mean), the second equality holds by linearity of expectation. The third equality holds since X_i are identically distributed.
The discrete version of this is posted here.

Related

Code wont produce the value of a definite integral in MATLAB

I've had problems with my code as I've tried to make an integral compute, but it will not for the power, P2.
I've tried using anonymous function handles to use the integral() function on MATLAB as well as just using int(), but it will still not compute. Are the values too small for MATLAB to integrate or am I just missing something small?
Any help or advice would be appreciated to push me in the right direction. Thanks!
The problem in the code is in the bottom of the section labelled "Power Calculations". My integral also gets quite messy if that makes a difference.
%%%%%%%%%%% Parameters %%%%%%%%%%%%
n0 = 1; %air
n1 = 1.4; %layer 1
n2 = 2.62; %layer 2
n3 = 3.5; %silicon
L0 = 650*10^(-9); %centre wavelength
L1 = 200*10^(-9): 10*10^(-9): 2200*10^(-9); %lambda from 200nm to 2200nm
x = ((pi./2).*(L0./L1)); %layer phase thickness
r01 = ((n0 - n1)./(n0 + n1)); %reflection coefficient 01
r12 = ((n1 - n2)./(n1 + n2)); %reflection coefficient 12
r23 = ((n2 - n3)./(n2 + n3)); %reflection coefficient 23
t01 = ((2.*n0)./(n0 + n1)); %transmission coefficient 01
t12 = ((2.*n1)./(n1 + n2)); %transmission coefficient 12
t23 = ((2.*n2)./(n2 + n3)); %transmission coefficient 23
Q1 = [1 r01; r01 1]; %Matrix Q1
Q2 = [1 r12; r12 1]; %Matrix Q2
Q3 = [1 r23; r23 1]; %Matrix Q3
%%%%%%%%%%%% Graph of L vs R %%%%%%%%%%%
R = zeros(size(x));
for i = 1:length(x)
P = [exp(j.*x(i)) 0; 0 exp(-j.*x(i))]; %General Matrix P
T = ((1./(t01.*t12.*t23)).*(Q1*P*Q2*P*Q3)); %Transmission
T11 = T(1,1); %T11 value
T21 = T(2,1); %T21 value
R(i) = ((abs(T21./T11))^2).*100; %Percent reflectivity
end
plot(L1,R)
title('Percent Reflectance vs. wavelength for 2 Layers')
xlabel('Wavelength (m)')
ylabel('Reflectance (%)')
%%%%%%%%%%% Power Calculation %%%%%%%%%%
syms L; %General lamda
y = ((pi./2).*(L0./L)); %Layer phase thickness with variable Lamda
P1 = [exp(j.*y) 0; 0 exp(-j.*y)]; %Matrix P with variable Lambda
T1 = ((1./(t01.*t12.*t23)).*(Q1*P1*Q2*P1*Q3)); %Transmittivity matrix T1
I = ((6.16^(15))./((L.^(5)).*exp(2484./L) - 1)); %Blackbody Irradiance
Tf11 = T1(1,1); %New T11 section of matrix with variable Lambda
Tf2 = (((abs(1./Tf11))^2).*(n3./n0)); %final transmittivity
P1 = Tf2.*I; %Power before integration
L_initial = 200*10^(-9); %Initial wavelength
L_final = 2200*10^(-9); %Final wavelength
P2 = int(P1, L, L_initial, L_final) %Power production
I've refactored your code
to make it easier to read
to improve code reuse
to improve performance
to make it easier to understand
Why do you use so many unnecessary parentheses?!
Anyway, there's a few problems I saw in your code.
You used i as a loop variable, and j as the imaginary unit. It was OK for this one instance, but just barely so. In the future it's better to use 1i or 1j for the imaginary unit, and/or m or ii or something other than i or j as the loop index variable. You're helping yourself and your colleagues; it's just less confusing that way.
Towards the end, you used the variable name P1 twice in a row, and in two different ways. Although it works here, it's confusing! Took me a while to unravel why a matrix-producing function was producing scalars instead...
But by far the biggest problem in your code is the numerical problems with the blackbody irradiance computation. The term
L⁵ · exp(2484/L) - 1
for λ₀ = 200 · 10⁻⁹ m will require computing the quantity
exp(1.242 · 10¹⁰)
which, needless to say, is rather difficult for a computer :) Actually, the problem with your computation is two-fold. First, the exponentiation is definitely out of range of 64 bit IEEE-754 double precision, and will therefore result in ∞. Second, the parentheses are wrong; Planck's law should read
C/L⁵ · 1/(exp(D) - 1)
with C and D the constants (involving Planck's constant, speed of light, and Boltzmann constant), which you've presumably precomputed (I didn't check the values. I do know choice of units can mess these up, so better check).
So, aside from the silly parentheses error, I suspect the main problem is that you simply forgot to rescale λ to nm. Changing everything in the blackbody equation to nm and correcting those parentheses gives the code
I = 6.16^(15) / ( (L*1e+9)^5 * (exp(2484/(L*1e+9)) - 1) );
With this, I got a finite value for the integral of
P2 = 1.052916498836486e-010
But, again, you'd better double-check everything.
Note that I used quadgk(), because it's one of the better ones available on R2010a (which I'm stuck with), but you can just as easily replace this with integral() available on anything newer than R2012a.
Here's the code I ended up with:
function pwr = my_fcn()
% Parameters
n0 = 1; % air
n1 = 1.4; % layer 1
n2 = 2.62; % layer 2
n3 = 3.5; % silicon
L0 = 650e-9; % centre wavelength
% Reflection coefficients
r01 = (n0 - n1)/(n0 + n1);
r12 = (n1 - n2)/(n1 + n2);
r23 = (n2 - n3)/(n2 + n3);
% Transmission coefficients
t01 = (2*n0) / (n0 + n1);
t12 = (2*n1) / (n1 + n2);
t23 = (2*n2) / (n2 + n3);
% Quality factors
Q1 = [1 r01; r01 1];
Q2 = [1 r12; r12 1];
Q3 = [1 r23; r23 1];
% Initial & Final wavelengths
L_initial = 200e-9;
L_final = 2200e-9;
% plot reflectivity for selected lambda range
plot_reflectivity(L_initial, L_final, 1000);
% Compute power production
pwr = quadgk(#power_production, L_initial, L_final);
% Helper functions
% ========================================
% Graph of lambda vs reflectivity
function plot_reflectivity(L_initial, L_final, N)
L = linspace(L_initial, L_final, N);
R = zeros(size(L));
for ii = 1:numel(L)
% Transmission
T = transmittivity(L(ii));
% Percent reflectivity
R(ii) = 100 * abs(T(2,1)/T(1,1))^2 ;
end
plot(L, R)
title('Percent Reflectance vs. wavelength for 2 Layers')
xlabel('Wavelength (m)')
ylabel('Reflectance (%)')
end
% Compute transmittivity matrix for a single wavelength
function T = transmittivity(L)
% Layer phase thickness with variable Lamda
y = pi/2 * L0/L;
% Matrix P with variable Lambda
P1 = [exp(+1j*y) 0
0 exp(-1j*y)];
% Transmittivity matrix T1
T = 1/(t01*t12*t23) * Q1*P1*Q2*P1*Q3;
end
% Power for a specific wavelength. Note that this function
% accepts vector-valued wavelengths; needed for quadgk()
function pwr = power_production(L)
pwr = zeros(size(L));
for ii = 1:numel(L)
% Transmittivity matrix
T1 = transmittivity(L(ii));
% Blackbody Irradiance
I = 6.16^(15) / ( (L(ii)*1e+9)^5 * (exp(2484/(L(ii)*1e+9)) - 1) );
% final transmittivity
Tf2 = abs(1/T1(1))^2 * n3/n0;
% Power before integration
pwr(ii) = Tf2 * I;
end
end
end

Matlab : Confusion regarding unit of entropy to use in an example

Figure 1. Hypothesis plot. y axis: Mean entropy. x axis: Bits.
This Question is in continuation to a previous one asked Matlab : Plot of entropy vs digitized code length
I want to calculate the entropy of a random variable that is discretized version (0/1) of a continuous random variable x. The random variable denotes the state of a nonlinear dynamical system called as the Tent Map. Iterations of the Tent Map yields a time series of length N.
The code should exit as soon as the entropy of the discretized time series becomes equal to the entropy of the dynamical system. It is known theoretically that the entropy of the system, H is log_e(2) or ln(2) = 0.69 approx. The objective of the code is to find number of iterations, j needed to produce the same entropy as the entropy of the system, H.
Problem 1: My problem in when I calculate the entropy of the binary time series which is the information message, then should I be doing it in the same base as H? OR Should I convert the value of H to bits because the information message is in 0/1 ? Both give different results i.e., different values of j.
Problem 2: It can happen that the probality of 0's or 1's can become zero so entropy correspondng to it can become infinity. To prevent this, I thought of putting a check using if-else. But, the loop
if entropy(:,j)==NaN
entropy(:,j)=0;
end
does not seem to be working. Shall be greateful for ideas and help to solve this problem. Thank you
UPDATE : I implemented the suggestions and answers to correct the code. However, my logic of solving was not proper earlier. In the revised code, I want to calculate the entropy for length of time series having bits 2,8,16,32. For each code length, entropy is calculated. Entropy calculation for each code length is repeated N times starting for each different initial condition of the dynamical system. This appraoch is adopted to check at which code length the entropy becomes 1. The nature of the plot of entropy vs bits should be increasing from zero and gradually reaching close to 1 after which it saturates - remains constant for all the remaining bits. I am unable to get this curve (Figure 1). Shall appreciate help in correcting where I am going wrong.
clear all
H = 1 %in bits
Bits = [2,8,16,32,64];
threshold = 0.5;
N=100; %Number of runs of the experiment
for r = 1:length(Bits)
t = Bits(r)
for Runs = 1:N
x(1) = rand;
for j = 2:t
% Iterating over the Tent Map
if x(j - 1) < 0.5
x(j) = 2 * x(j - 1);
else
x(j) = 2 * (1 - x(j - 1));
end % if
end
%Binarizing the output of the Tent Map
s = (x >=threshold);
p1 = sum(s == 1 ) / length(s); %calculating probaility of number of 1's
p0 = 1 - p1; % calculating probability of number of 0'1
entropy(t) = -p1 * log2(p1) - (1 - p1) * log2(1 - p1); %calculating entropy in bits
if isnan(entropy(t))
entropy(t) = 0;
end
%disp(abs(lambda-H))
end
Entropy_Run(Runs) = entropy(t)
end
Entropy_Bits(r) = mean(Entropy_Run)
plot(Bits,Entropy_Bits)
For problem 1, H and entropy can be in either nats or bits units, so long as they are both computed using the same units. In other words, you should use either log for both or log2 for both. With the code sample you provided, H and entropy are correctly calculated using consistant nats units. If you prefer to work in units of bits, the conversion of H should give you H = log(2)/log(2) = 1 (or using the conversion factor 1/log(2) ~ 1.443, H ~ 0.69 * 1.443 ~ 1).
For problem 2, as #noumenal already pointed out you can check for NaN using isnan. Alternatively you could check if p1 is within (0,1) (excluding 0 and 1) with:
if (p1 > 0 && p1 < 1)
entropy(:,j) = -p1 * log(p1) - (1 - p1) * log(1 - p1); %calculating entropy in natural base e
else
entropy(:, j) = 0;
end
First you just
function [mean_entropy, bits] = compute_entropy(bits, blocks, threshold, replicate)
if replicate
disp('Replication is ON');
else
disp('Replication is OFF');
end
%%
% Populate random vector
if replicate
seed = 849;
rng(seed);
else
rng('default');
end
rs = rand(blocks);
%%
% Get random
trial_entropy = zeros(length(bits));
for r = 1:length(rs)
bit_entropy = zeros(length(bits), 1); % H
% Traverse bit trials
for b = 1:(length(bits)) % N
tent_map = zeros(b, 1); %Preallocate for memory management
%Initialize
tent_map(1) = rs(r);
for j = 2:b % j is the iterator, b is the current bit
if tent_map(j - 1) < threshold
tent_map(j) = 2 * tent_map(j - 1);
else
tent_map(j) = 2 * (1 - tent_map(j - 1));
end % if
end
%Binarize the output of the Tent Map
s = find(tent_map >= threshold);
p1 = sum(s == 1) / length(s); %calculate probaility of number of 1's
%p0 = 1 - p1; % calculate probability of number of 0'1
bit_entropy(b) = -p1 * log2(p1) - (1 - p1) * log2(1 - p1); %calculate entropy in bits
if isnan(bit_entropy(b))
bit_entropy(b) = 0;
end
%disp(abs(lambda-h))
end
trial_entropy(:, r) = bit_entropy;
disp('Trial Statistics')
data = get_summary(bit_entropy);
disp('Mean')
disp(data.mean);
disp('SD')
disp(data.sd);
end
% TO DO Compute the mean for each BIT index in trial_entropy
mean_entropy = 0;
disp('Overall Statistics')
data = get_summary(trial_entropy);
disp('Mean')
disp(data.mean);
disp('SD')
disp(data.sd);
%This is the wrong mean...
mean_entropy = data.mean;
function summary = get_summary(entropy)
summary = struct('mean', mean(entropy), 'sd', std(entropy));
end
end
and then you just have to
% Entropy Script
clear all
%% Settings
replicate = false; % = false % Use true for debugging only.
%H = 1; %in bits
Bits = 2.^(1:6);
Threshold = 0.5;
%Tolerance = 0.001;
Blocks = 100; %Number of runs of the experiment
%% Run
[mean_entropy, bits] = compute_entropy(Bits, Blocks, Threshold, replicate);
%What we want
%plot(bits, mean_entropy);
%What we have
plot(1:length(mean_entropy), mean_entropy);

plotting volume-time graph of .wav file

I'm trying to get volume-time graph of .wav file. First, I recorded sound (patient exhalations) via android as .wav file, but when I read this .wav file in MATLAB it has negative values. What is the meaning of negative values? Second, MATLAB experts could you please check if the code below does the same as written in my comments? Also another question. Y = fft(WindowArray);
p = abs(Y).^2;
I took the power of values returned from fft...is that correct and what is the goal of this step??
[data, fs] = wavread('newF2');
% read exhalation audio wav file (1 channel, mono)
% frequency is 44100 HZ
% windows of 0.1 s and overlap of 0.05 seconds
WINDOW_SIZE = fs*0.1; %4410 = fs*0.1
array_size = length(data); % array size of data
numOfPeaks = (array_size/(WINDOW_SIZE/2)) - 1;
step = floor(WINDOW_SIZE/2); %step size used in loop
transformed = data;
start =1;
k = 1;
t = 1;
g = 1;
o = 1;
% performing fft on each window and finding the peak of windows
while(((start+WINDOW_SIZE)-1)<=array_size)
j=1;
i =start;
while(j<=WINDOW_SIZE)
WindowArray(j) = transformed(i);
j = j+1;
i = i +1;
end
Y = fft(WindowArray);
p = abs(Y).^2; %power
[a, b] = max(abs(Y)); % find max a and its indices b
[m, i] = max(p); %the maximum of the power m and its indices i
maximum(g) = m;
index(t) = i;
power(o) = a;
indexP(g) = b;
start = start + step;
k = k+1;
t = t+1;
g = g+1;
o=o+1;
end
% low pass filter
% filtering noise: ignor frequencies that are less than 5% of maximum frequency
for u=1:length(maximum)
M = max(maximum); %highest value in the array
Accept = 0.05* M;
if(maximum(u) > Accept)
maximum = maximum(u:length(maximum));
break;
end
end
% preparing the time of the graph,
% Location of the Peak flow rates are estimated
TotalTime = (numOfPeaks * 0.1);
time1 = [0:0.1:TotalTime];
if(length(maximum) > ceil(numOfPeaks));
maximum = maximum(1:ceil(numOfPeaks));
end
time = time1(1:length(maximum));
% plotting frequency-time graph
figure(1);
plot(time, maximum);
ylabel('Frequency');
xlabel('Time (in seconds)');
% plotting volume-time graph
figure(2);
plot(time, cumsum(maximum)); % integration over time to get volume
ylabel('Volume');
xlabel('Time (in seconds)');
(I only answer the part of the question which I understood)
Per default Matlab normalizes the audio wave to - 1...1 range. Use the native option if you want the integer data.
First, in your code it should be p = abs(Y)**2, this is the proper way to square the values returned from the FFT. The reason why you take the absolute value of the FFT return values is because those number are complex numbers with a Real and Imaginary part, therefore the absolute value (or modulus) of an imaginary number is the magnitude of that number. The goal of taking the power could be for potentially obtaining an RMS value (root mean squared) of your overall amplitude values, but you could also have something else in mind. When you say volume-time I assume you want decibels, so try something like this:
def plot_signal(file_name):
sampFreq, snd = wavfile.read(file_name)
snd = snd / (2.**15) #Convert sound array to floating point values
#Floating point values range from -1 to 1
s1 = snd[:,0] #left channel
s2 = snd[:,1] #right channel
timeArray = arange(0, len(snd), 1)
timeArray = timeArray / sampFreq
timeArray = timeArray * 1000 #scale to milliseconds
timeArray2 = arange(0, len(snd), 1)
timeArray2 = timeArray2 / sampFreq
timeArray2 = timeArray2 * 1000 #scale to milliseconds
n = len(s1)
p = fft(s1) # take the fourier transform
m = len(s2)
p2 = fft(s2)
nUniquePts = ceil((n+1)/2.0)
p = p[0:nUniquePts]
p = abs(p)
mUniquePts = ceil((m+1)/2.0)
p2 = p2[0:mUniquePts]
p2 = abs(p2)
'''
Left Channel
'''
p = p / float(n) # scale by the number of points so that
# the magnitude does not depend on the length
# of the signal or on its sampling frequency
p = p**2 # square it to get the power
# multiply by two (see technical document for details)
# odd nfft excludes Nyquist point
if n % 2 > 0: # we've got odd number of points fft
p[1:len(p)] = p[1:len(p)] * 2
else:
p[1:len(p) -1] = p[1:len(p) - 1] * 2 # we've got even number of points fft
plt.plot(timeArray, 10*log10(p), color='k')
plt.xlabel('Time (ms)')
plt.ylabel('LeftChannel_Power (dB)')
plt.show()
'''
Right Channel
'''
p2 = p2 / float(m) # scale by the number of points so that
# the magnitude does not depend on the length
# of the signal or on its sampling frequency
p2 = p2**2 # square it to get the power
# multiply by two (see technical document for details)
# odd nfft excludes Nyquist point
if m % 2 > 0: # we've got odd number of points fft
p2[1:len(p2)] = p2[1:len(p2)] * 2
else:
p2[1:len(p2) -1] = p2[1:len(p2) - 1] * 2 # we've got even number of points fft
plt.plot(timeArray2, 10*log10(p2), color='k')
plt.xlabel('Time (ms)')
plt.ylabel('RightChannel_Power (dB)')
plt.show()
I hope this helps.

Summing scalars and vectors [MATLAB]

First off; I'm not very well taught in programming, but i tend to learn exactly what I need to learn in order to do what I want when programming, I have moderate experience with python, html/css C and matlab. I've now enrolled in a physics-simulation course where I use matlab to compute the trajectory of 500 particles under the influence of 5 force-fields of different magnitude.
So now to my thing; I need to write the following for all i=1...500 particles
f_i = m*g - sum{(f_k/r_k^2)*exp((||vec(x)_i - vec(p)_k||^2)/2*r_k^2)(vec(x)_i - vec(p)_k)}
I hope its not too cluttered
And here is my code so far;
clear all
close all
echo off
%Simulation Parameters-------------------------------------------
h = 0.01; %Time-step h (s)
t_0 = 0; %initial time (s)
t_f = 3; %final time (s)
m = 1; %Particle mass (kg)
L = 5; %Charateristic length (m)
NT = t_f/h; %Number of time steps
g = [0,-9.81];
f = [32 40 28 16 20]; %the force f_k (N)
r = [0.3*L 0.2*L 0.4*L 0.5*L 0.3*L]; %the radii r_k
p = [-0.2*L 0.8*L; -0.3*L -0.8*L; -0.6*L 0.1*L;
0.4*L 0.7*L; 0.8*L -0.3*L];
%Forcefield origin position
%stepper = 'forward_euler'; % use forward Euler time-integration
fprintf('Simulation Parameters set');
%initialization---------------------------------------------------
for i = 1:500 %Gives inital value to each of the 500 particles
particle{i,1}.x = [-L -L];
particle{i,1}.v = [5,10];
particle{i,1}.m = m;
for k = 1:5
C = particle{i}.x - p(k,:);
F = rdivide(f(1,k),r(1,k)^2).*C
%clear C; %Creates elements for array F
end
particle{i}.fi = m*g - sum(F); %Compute attractive force on particle
%clear F; %Clear F for next use
end
What this code seems to do is that it goes into the first loop with index i, then goes through the 'k'-loop and exits it with a value for F then uses that last value for F(k) to compute f_i.
What I want it to do is to put all the values of F(k) from 1-5 and put into a matrix which columns I can sum for f_i. I'd prefer to sum the columns as the first column should represent all F-components in the x-axis and the second column all F-components in the y-axis.
Note that the expression for F in the k-loop is not done.
I fixed it by defining the index for F(k,:)

Optimizing repetitive estimation (currently a loop) in MATLAB

I've found myself needing to do a least-squares (or similar matrix-based operation) for every pixel in an image. Every pixel has a set of numbers associated with it, and so it can be arranged as a 3D matrix.
(This next bit can be skipped)
Quick explanation of what I mean by least-squares estimation :
Let's say we have some quadratic system that is modeled by Y = Ax^2 + Bx + C and we're looking for those A,B,C coefficients. With a few samples (at least 3) of X and the corresponding Y, we can estimate them by:
Arrange the (lets say 10) X samples into a matrix like X = [x(:).^2 x(:) ones(10,1)];
Arrange the Y samples into a similar matrix: Y = y(:);
Estimate the coefficients A,B,C by solving: coeffs = (X'*X)^(-1)*X'*Y;
Try this on your own if you want:
A = 5; B = 2; C = 1;
x = 1:10;
y = A*x(:).^2 + B*x(:) + C + .25*randn(10,1); % added some noise here
X = [x(:).^2 x(:) ones(10,1)];
Y = y(:);
coeffs = (X'*X)^-1*X'*Y
coeffs =
5.0040
1.9818
0.9241
START PAYING ATTENTION AGAIN IF I LOST YOU THERE
*MAJOR REWRITE*I've modified to bring it as close to the real problem that I have and still make it a minimum working example.
Problem Setup
%// Setup
xdim = 500;
ydim = 500;
ncoils = 8;
nshots = 4;
%// matrix size for each pixel is ncoils x nshots (an overdetermined system)
%// each pixel has a matrix stored in the 3rd and 4rth dimensions
regressor = randn(xdim,ydim, ncoils,nshots);
regressand = randn(xdim, ydim,ncoils);
So my problem is that I have to do a (X'*X)^-1*X'*Y (least-squares or similar) operation for every pixel in an image. While that itself is vectorized/matrixized the only way that I have to do it for every pixel is in a for loop, like:
Original code style
%// Actual work
tic
estimate = zeros(xdim,ydim);
for col=1:size(regressor,2)
for row=1:size(regressor,1)
X = squeeze(regressor(row,col,:,:));
Y = squeeze(regressand(row,col,:));
B = X\Y;
% B = (X'*X)^(-1)*X'*Y; %// equivalently
estimate(row,col) = B(1);
end
end
toc
Elapsed time = 27.6 seconds
EDITS in reponse to comments and other ideas
I tried some things:
1. Reshaped into a long vector and removed the double for loop. This saved some time.
2. Removed the squeeze (and in-line transposing) by permute-ing the picture before hand: This save alot more time.
Current example:
%// Actual work
tic
estimate2 = zeros(xdim*ydim,1);
regressor_mod = permute(regressor,[3 4 1 2]);
regressor_mod = reshape(regressor_mod,[ncoils,nshots,xdim*ydim]);
regressand_mod = permute(regressand,[3 1 2]);
regressand_mod = reshape(regressand_mod,[ncoils,xdim*ydim]);
for ind=1:size(regressor_mod,3) % for every pixel
X = regressor_mod(:,:,ind);
Y = regressand_mod(:,ind);
B = X\Y;
estimate2(ind) = B(1);
end
estimate2 = reshape(estimate2,[xdim,ydim]);
toc
Elapsed time = 2.30 seconds (avg of 10)
isequal(estimate2,estimate) == 1;
Rody Oldenhuis's way
N = xdim*ydim*ncoils; %// number of columns
M = xdim*ydim*nshots; %// number of rows
ii = repmat(reshape(1:N,[ncoils,xdim*ydim]),[nshots 1]); %//column indicies
jj = repmat(1:M,[ncoils 1]); %//row indicies
X = sparse(ii(:),jj(:),regressor_mod(:));
Y = regressand_mod(:);
B = X\Y;
B = reshape(B(1:nshots:end),[xdim ydim]);
Elapsed time = 2.26 seconds (avg of 10)
or 2.18 seconds (if you don't include the definition of N,M,ii,jj)
SO THE QUESTION IS:
Is there an (even) faster way?
(I don't think so.)
You can achieve a ~factor of 2 speed up by precomputing the transposition of X. i.e.
for x=1:size(picture,2) % second dimension b/c already transposed
X = picture(:,x);
XX = X';
Y = randn(n_timepoints,1);
%B = (X'*X)^-1*X'*Y; ;
B = (XX*X)^-1*XX*Y;
est(x) = B(1);
end
Before: Elapsed time is 2.520944 seconds.
After: Elapsed time is 1.134081 seconds.
EDIT:
Your code, as it stands in your latest edit, can be replaced by the following
tic
xdim = 500;
ydim = 500;
n_timepoints = 10; % for example
% Actual work
picture = randn(xdim,ydim,n_timepoints);
picture = reshape(picture, [xdim*ydim,n_timepoints])'; % note transpose
YR = randn(n_timepoints,size(picture,2));
% (XX*X).^-1 = sum(picture.*picture).^-1;
% XX*Y = sum(picture.*YR);
est = sum(picture.*picture).^-1 .* sum(picture.*YR);
est = reshape(est,[xdim,ydim]);
toc
Elapsed time is 0.127014 seconds.
This is an order of magnitude speed up on the latest edit, and the results are all but identical to the previous method.
EDIT2:
Okay, so if X is a matrix, not a vector, things are a little more complicated. We basically want to precompute as much as possible outside of the for-loop to keep our costs down. We can also get a significant speed-up by computing XT*X manually - since the result will always be a symmetric matrix, we can cut a few corners to speed things up. First, the symmetric multiplication function:
function XTX = sym_mult(X) % X is a 3-d matrix
n = size(X,2);
XTX = zeros(n,n,size(X,3));
for i=1:n
for j=i:n
XTX(i,j,:) = sum(X(:,i,:).*X(:,j,:));
if i~=j
XTX(j,i,:) = XTX(i,j,:);
end
end
end
Now the actual computation script
xdim = 500;
ydim = 500;
n_timepoints = 10; % for example
Y = randn(10,xdim*ydim);
picture = randn(xdim,ydim,n_timepoints); % 500x500x10
% Actual work
tic % start timing
picture = reshape(picture, [xdim*ydim,n_timepoints])';
% Here we precompute the (XT*Y) calculation to speed things up later
picture_y = [sum(Y);sum(Y.*picture)];
% initialize
est = zeros(size(picture,2),1);
picture = permute(picture,[1,3,2]);
XTX = cat(2,ones(n_timepoints,1,size(picture,3)),picture);
XTX = sym_mult(XTX); % precompute (XT*X) for speed
X = zeros(2,2); % preallocate for speed
XY = zeros(2,1);
for x=1:size(picture,2) % second dimension b/c already transposed
%For some reason this is a lot faster than X = XTX(:,:,x);
X(1,1) = XTX(1,1,x);
X(2,1) = XTX(2,1,x);
X(1,2) = XTX(1,2,x);
X(2,2) = XTX(2,2,x);
XY(1) = picture_y(1,x);
XY(2) = picture_y(2,x);
% Here we utilise the fact that A\B is faster than inv(A)*B
% We also use the fact that (A*B)*C = A*(B*C) to speed things up
B = X\XY;
est(x) = B(1);
end
est = reshape(est,[xdim,ydim]);
toc % end timing
Before: Elapsed time is 4.56 seconds.
After: Elapsed time is 2.24 seconds.
This is a speed up of about a factor of 2. This code should be extensible to X being any dimensions you want. For instance, in the case where X = [1 x x^2], you would change picture_y to the following
picture_y = [sum(Y);sum(Y.*picture);sum(Y.*picture.^2)];
and change XTX to
XTX = cat(2,ones(n_timepoints,1,size(picture,3)),picture,picture.^2);
You would also change a lot of 2s to 3s in the code, and add XY(3) = picture_y(3,x) to the loop. It should be fairly straight-forward, I believe.
Results
I sped up your original version, since your edit 3 was actually not working (and also does something different).
So, on my PC:
Your (original) version: 8.428473 seconds.
My obfuscated one-liner given below: 0.964589 seconds.
First, for no other reason than to impress, I'll give it as I wrote it:
%%// Some example data
xdim = 500;
ydim = 500;
n_timepoints = 10; % for example
estimate = zeros(xdim,ydim); %// initialization with explicit size
picture = randn(xdim,ydim,n_timepoints);
%%// Your original solution
%// (slightly altered to make my version's results agree with yours)
tic
Y = randn(n_timepoints,xdim*ydim);
ii = 1;
for x = 1:xdim
for y = 1:ydim
X = squeeze(picture(x,y,:)); %// or similar creation of X matrix
B = (X'*X)^(-1)*X' * Y(:,ii);
ii = ii+1;
%// sometimes you keep everything and do
%// estimate(x,y,:) = B(:);
%// sometimes just the first element is important and you do
estimate(x,y) = B(1);
end
end
toc
%%// My version
tic
%// UNLEASH THE FURY!!
estimate2 = reshape(sparse(1:xdim*ydim*n_timepoints, ...
builtin('_paren', ones(n_timepoints,1)*(1:xdim*ydim),:), ...
builtin('_paren', permute(picture, [3 2 1]),:))\Y(:), ydim,xdim).'; %'
toc
%%// Check for equality
max(abs(estimate(:)-estimate2(:))) % (always less than ~1e-14)
Breakdown
First, here's the version that you should actually use:
%// Construct sparse block-diagonal matrix
%// (Type "help sparse" for more information)
N = xdim*ydim; %// number of columns
M = N*n_timepoints; %// number of rows
ii = 1:N;
jj = ones(n_timepoints,1)*(1:N);
s = permute(picture, [3 2 1]);
X = sparse(ii,jj(:), s(:));
%// Compute ALL the estimates at once
estimates = X\Y(:);
%// You loop through the *second* dimension first, so to make everything
%// agree, we have to extract elements in the "wrong" order, and transpose:
estimate2 = reshape(estimates, ydim,xdim).'; %'
Here's an example of what picture and the corresponding matrix X looks like for xdim = ydim = n_timepoints = 2:
>> clc, picture, full(X)
picture(:,:,1) =
-0.5643 -2.0504
-0.1656 0.4497
picture(:,:,2) =
0.6397 0.7782
0.5830 -0.3138
ans =
-0.5643 0 0 0
0.6397 0 0 0
0 -2.0504 0 0
0 0.7782 0 0
0 0 -0.1656 0
0 0 0.5830 0
0 0 0 0.4497
0 0 0 -0.3138
You can see why sparse is necessary -- it's mostly zeros, but will grow large quickly. The full matrix would quickly consume all your RAM, while the sparse one will not consume much more than the original picture matrix does.
With this matrix X, the new problem
X·b = Y
now contains all the problems
X1 · b1 = Y1
X2 · b2 = Y2
...
where
b = [b1; b2; b3; ...]
Y = [Y1; Y2; Y3; ...]
so, the single command
X\Y
will solve all your systems at once.
This offloads all the hard work to a set of highly specialized, compiled to machine-specific code, optimized-in-every-way algorithms, rather than the interpreted, generic, always-two-steps-away from the hardware loops in MATLAB.
It should be straightforward to convert this to a version where X is a matrix; you'll end up with something like what blkdiag does, which can also be used by mldivide in exactly the same way as above.
I had a wee play around with an idea, and I decided to stick it as a separate answer, as its a completely different approach to my other idea, and I don't actually condone what I'm about to do. I think this is the fastest approach so far:
Orignal (unoptimised): 13.507176 seconds.
Fast Cholesky-decomposition method: 0.424464 seconds
First, we've got a function to quickly do the X'*X multiplication. We can speed things up here because the result will always be symmetric.
function XX = sym_mult(X)
n = size(X,2);
XX = zeros(n,n,size(X,3));
for i=1:n
for j=i:n
XX(i,j,:) = sum(X(:,i,:).*X(:,j,:));
if i~=j
XX(j,i,:) = XX(i,j,:);
end
end
end
The we have a function to do LDL Cholesky decomposition of a 3D matrix (we can do this because the (X'*X) matrix will always be symmetric) and then do forward and backwards substitution to solve the LDL inversion equation
function Y = fast_chol(X,XY)
n=size(X,2);
L = zeros(n,n,size(X,3));
D = zeros(n,n,size(X,3));
B = zeros(n,1,size(X,3));
Y = zeros(n,1,size(X,3));
% These loops compute the LDL decomposition of the 3D matrix
for i=1:n
D(i,i,:) = X(i,i,:);
L(i,i,:) = 1;
for j=1:i-1
L(i,j,:) = X(i,j,:);
for k=1:(j-1)
L(i,j,:) = L(i,j,:) - L(i,k,:).*L(j,k,:).*D(k,k,:);
end
D(i,j,:) = L(i,j,:);
L(i,j,:) = L(i,j,:)./D(j,j,:);
if i~=j
D(i,i,:) = D(i,i,:) - L(i,j,:).^2.*D(j,j,:);
end
end
end
for i=1:n
B(i,1,:) = XY(i,:);
for j=1:(i-1)
B(i,1,:) = B(i,1,:)-D(i,j,:).*B(j,1,:);
end
B(i,1,:) = B(i,1,:)./D(i,i,:);
end
for i=n:-1:1
Y(i,1,:) = B(i,1,:);
for j=n:-1:(i+1)
Y(i,1,:) = Y(i,1,:)-L(j,i,:).*Y(j,1,:);
end
end
Finally, we have the main script which calls all of this
xdim = 500;
ydim = 500;
n_timepoints = 10; % for example
Y = randn(10,xdim*ydim);
picture = randn(xdim,ydim,n_timepoints); % 500x500x10
tic % start timing
picture = reshape(pr, [xdim*ydim,n_timepoints])';
% Here we precompute the (XT*Y) calculation
picture_y = [sum(Y);sum(Y.*picture)];
% initialize
est2 = zeros(size(picture,2),1);
picture = permute(picture,[1,3,2]);
% Now we calculate the X'*X matrix
XTX = cat(2,ones(n_timepoints,1,size(picture,3)),picture);
XTX = sym_mult(XTX);
% Call our fast Cholesky decomposition routine
B = fast_chol(XTX,picture_y);
est2 = B(1,:);
est2 = reshape(est2,[xdim,ydim]);
toc
Again, this should work equally well for a Nx3 X matrix, or however big you want.
I use octave, thus I can't say anything about the resulting performance in Matlab, but would expect this code to be slightly faster:
pictureT=picture'
est=arrayfun(#(x)( (pictureT(x,:)*picture(:,x))^-1*pictureT(x,:)*randn(n_ti
mepoints,1)),1:size(picture,2));