Performance analysis of series summation in Matlab - matlab

I'm writing a Matlab program to compute pi through the summation of series
A = Sum of a_i from i=1 to N
where
pi/4 = 1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11 + 1/13 ...
To compute pi through the series summation, the suggested approach is to set
a_i = (-1)^(i+1)/(2i-1)
In order to do this, I wrote the program below
n=100;
f=[];
for jj=1:n
ii=1:jj;
f=[f 4*sum( ((-1).^(ii+1))./(2.*ii-1) )];
end;
hold on
plot(f)
title('Computing of \pi using a finite sum')
xlabel('Number of summations')
ylabel('Estimated value of \pi')
plot(1:size(f,2),ones(size(f))*pi)
This program shows that the series approximation is somewhat accurate near N=80.
I am now attempting to adjust my program so that the y-axis displays total calculation time T_N and the x-axis displays N (the number of summations). The total calculation time T_N should increase as N increases. Ideally, I am looking to have the graph display something close to a linear relationship between T(N) and N
To do this, I have adjusted my original program as follows
n=100;
f=[];
tic
for jj=1:n
ii=1:jj;
f=[f 4*sum( ((-1).^(ii+1))./(2.*ii-1) )];
end;
hold on
plot(f)
title('Time it takes to sum \pi using N summations')
xlabel('Number of summations (N)')
ylabel('Total Time (T_N)')
plot(1:size(f,2),toc)
slope = polyfit(1:size(f,2),toc,1);
This looks wrong. I must have incorrectly applied the built-in timing functions in Matlab (tic and toc). So, I am going to analyze my code and ask two questions -
How could I adjust my code above so that the y-axis correctly displays the total calculation time per the summation N? It looks like I did something wrong in plot(1:size(f,2),toc).
After I get the y-axis to display the correct total calculation time (T_N), I should be able to use the polyfit command to find the slope of T(N)/N. This will give me a linear relationship between T(N) and N. I could then use the value of slope = polyfit(1:size(f,2),toc,1) to compute
t_N = a + b*N
where t_N is computed for every value of N and b is the slope calculated through the polyfit command.
I think that I should be able to find the values of a and b after I correctly display the y-axis and correctly reference the polyfit command.

There are several things that can be improved in your code:
f should be preallocated, so as not to waste time in repeatedly assigning memory.
tic should be called within the loop, in order to restart the stopwatch timer.
When you call toc you get the current time from the last tic. The time spent should be stored in a vector (also preallocated).
Since the computations you want to time are very fast, measuring the time they take is very unrealiable. The computations should be repeated many times, so the measured time is larger and you get better accuracy. Even better would be to use timeit (see below).
You cannot plot the time and the results in the same figure, because the scales are too different.
The code incorporating these changes is:
n = 100;
f = NaN(1,n); % preallocate
times = NaN(1,n); % preallocate
repeat_factor = 1e4; % repeat computations for better time accuracy
for jj=1:n
tic % initiallize time count
for repeat = 1:repeat_factor % repeat many times for better time accuracy
ii=1:jj;
f(jj) = 4*sum( ((-1).^(ii+1))./(2.*ii-1) ); % store value
end
times(jj) = toc; % store time
end
times = times / repeat_factor; % divide by repeat factor
plot(f)
title('Time it takes to sum \pi using N summations')
xlabel('Number of summations (N)')
ylabel('Total Time (T_N)')
figure % new figure for time
plot(1:size(f,2), times)
p = polyfit(1:size(f,2),times,1);
slope = p(1);
Using timeit for measuring the time will probably give improved accuracy (but not very good because, as mentioned above, the computations you want to time are very fast). To use timeit you need to define a function with the code to be timed. The simplest way is to use an anonymous function without input arguments. See code below.
n = 100;
f = NaN(1,n); % preallocate
times = NaN(1,n); % preallocate
for jj=1:n
ii=1:jj;
fun = #() 4*sum( ((-1).^(ii+1))./(2.*ii-1) );
f(jj) = fun(); % store value
times(jj) = timeit(fun); % measure and store time
end
plot(f)
title('Time it takes to sum \pi using N summations')
xlabel('Number of summations (N)')
ylabel('Total Time (T_N)')
figure % new figure for time
plot(1:size(f,2), times)
p = polyfit(1:size(f,2),times,1);
slope = p(1);

If I understand your problem correctly, I think there are two different issues here. First, you plot your result function then the elapsed time which is several orders of magnitude smaller than pi:
hold on
plot(f) % <---- Comment me out!
...stuff...
plot(1:size(f,2),toc)
Secondly, you need to store the execution time of each pass of the loop:
n=100;
f=[];
telapsed = zeros(1,n);
tic
for jj=1:n
ii=1:jj;
f=[f 4*sum( ((-1).^(ii+1))./(2.*ii-1) )];
telapsed(jj) = toc;
end
hold on
% plot(f)
title('Time it takes to sum \pi using N summations')
xlabel('Number of summations (N)')
ylabel('Total Time (T_N)')
plot(1:n,telapsed)
slope = polyfit(1:n,telapsed,1);
Note the new polyfit expression for slope of the execution time. Does that help?

Related

How to speed up the plotting of graphs using pause()?

I understand that when plotting an equation for x iterations that when you use a pause with a decimal number you can speed up the time it takes to go from one iteration to the next. My question is there a way to speed it up even more? Basically I am running a upwind 1D advection equation and my plot is doing pretty slow even when I put a pause of say 0.0001. Any advice on making the speed of this program increase for plotting or would I just need to let it run its course.
Here is the code I am using:
clear;
clc;
%Set initial values
xmin=0;
xmax=1;
N=101; %Amount of segments
dt= 0.0001; % Time step
t=0; % t initial
tmax=2; % Run this test until t=2
u=1; %Velocity
dx = (xmax - xmin)/100 %finding delta x from the given information
x =xmin-dx : dx : xmax+dx; %setting the x values that will be plugged in
h0= exp(-(x- 0.5).^2/0.01); %Initial gaussian profile for t=0
h = h0;
hp1=h0;
nsteps =tmax/dt; % total number of steps taken
for n=1 : nsteps
h(1)=h(end-2); %Periodic B.C
h(end)=h(2);
for i =2 : N+1
if u>0
hp1(i) = h(i) - u*dt/dx *( h(i)-h(i-1)); %Loop to solve the FOU
elseif u<0
hp1(i) = h(i) - u*dt/dx*(h(i+1)-h(i)); %downwind
end
end
t=t+dt; %Increase t after each iteration
h= hp1; %have the new hp1 equal to h for the next step
initial= exp(-(x- 0.5).^2/0.01); % The initial plot when t =0
%hold on
plot(x,initial,'*') %plot initial vs moving
plot(x,h,'o-')
pause(0.0001);
%hold off
figure(2)
plot(x,initial) %plot end value
end
Isn't this "speedup" due to pause() flushing the graphic event buffer like drawnow, but apparently doing it faster? So it is not the length of the pause doing any work (in fact, I don't think the resolution is in the millisecond range on many machines) but just the command itself.
The thing really slowing down your code is those for loops. You should try to change the code to calculate the segments in parallel instead.

Matlab: merging a new script with inbuilt code to compare the execution time

Please, I have this code for determinant that runs but does not display the output only the time it does display. Can gurus in the house help me out.
function determ=calll(A)
A=input('rand()');
[rows, columns] = size(A);
tic;
if rows==2:size(A);
for m11=A(2:end,2:end);
m1n=A(2:end,1:end-1);
mn1=A(1:end-1,2:end);
mnn=A(1:end-1,1:end-1);
m11nn=A(1:end-2,1:end-2);
deter=(m11)*(mnn)-((m1n)*(mn1));
determ=deter./deternew(m11nn);
end
end
toc
disp('determinant =')
The real issue is that
I want to incorporate random matrix in the code so that when I run it, a random matrix will be used only just for me to specify the order of the matrix because I can not input 1000 by 1000 matrix manually.
An inbuilt Matlab code for determinant should also be embedded in my script.
Time of execution should be included for my code and for the inbuilt Matlab code.
In all, when I run the program, it should ask for the order (since a random matrix must be used) and compute the determinant using this code and the inbuilt Matlab code concurrently. Note that the determinant of this code and Matlab will be the same but their time of execution will be different. So my output after execution should be in two forms
the value of the determinant of my script and the time taken for the execution
the value of the determinant of the inbuilt Matlab method and the time taken for the execution.
#EBH
function out = thanksEBH
A = input('matrix A =');
[rows, columns] = size(A);
N=100;
t = zeros(N,1);
for k = 1:N
end
tic;
out=1;
for i = 1:rows
out = prod(A(i,i)*A(i,end));
end
t(k,1) = toc;
for i = 1:rows
out = prod(A(i,end-1)*A(end,i));
end
t(k,2) = toc;
t(k) = toc;
I really don't understand the purpose of this second code, but as far as I can guess what you try to do, I can suggest this:
function out = thanksEBH
n = input('matrix A =');
A = rand(n);
N = 100;
t = zeros(N,1);
out = 1;
for k = 1:N
tic
% first procedure to measure run-time
for i = 1:n
out = prod(A(i,i)*A(i,end));
end
t(k,1) = toc;
tic
% second procedure to measure run-time
for i = 1:n
out = prod(A(i,end-1)*A(end,i));
end
t(k,2) = toc;
end
disp(mean(t))
end
This will measure execution time of the two inner for loops separately and will display the mean of them. Anything you put inside the inner for loops will be measured. Note that I also changed the start of the function (among some other things) so it could run.
Please, if you have any other questions regarding this, ask them separately in a different question.
Well, I'm not a guru in matlab but I will to try to answer you:
For creating random arrays:
http://es.mathworks.com/help/matlab/math/create-arrays-of-random-numbers.html
In order to calulate the determinat in matlab you ca use the function det
function det
The easy way for get execution time in matlab is using the pair functions tic and toc
tic
A = rand(12000, 4400);
B = rand(12000, 4400);
toc
C = A'.*B';
toc
tic and toc
You can ask the user with the input command:
http://es.mathworks.com/help/matlab/ref/input.html
prompt = 'What is the original value? ';
x = input(prompt)
y = x*10
simply googling for a recursive determinant algorithm:
% The derminant of a matrix can be evaluated using a recursive formula such
% that: Assuming a square matrix A of order n
% det(A)=sum(a1j*Cj), for j=1,n
% where a1j represent the elements of the 1st row of the matrix A, and
% Cj is the determinant of the reduced matrix obtained removing the 1st row
% and the column j of the matrix A
function DetA=determin(A)
% This method can be applied for any nxn square matrix;
% Check Input Argument
if isempty(A)
error('cof:EmptyMatrix','The matrix A does not exist');
end
[r, c]=size(A); % number of rows and columns of A
if r~=c
error('det:NotSquareMatrix','The matrix A is not square');
end
DetA=0;
% Calculate determinant
if r==2,
% if the matrix a 2x2 matrix, then directly calculate the determinant
% using the common formula
DetA=A(1,1)*A(2,2)-A(1,2)*A(2,1);
else
% if the matrix is not 2x2 reduce its order of 1, generating a matrix
% with r-1 rows and c-1 columns. Subsequently recall the function using
% the reduced matrix
temp_A=A;
for i=1:c
a1i=temp_A(1,i); % save the element of the 1st row and ith column of the temporary matrix; this element will be used to calculate the determinant later on
if a1i~=0
temp_A(1,:)=[]; % remove the first row to create the reduced matrix
temp_A(:,i)=[]; % remove the ith column to create the reduced matrix
Cj=temp_A;
DetCj=determin(Cj); % Calculate the determinant of the reduced matrix recalling the function
DetA=DetA+((-1)^(1+i))*a1i*DetCj;
temp_A=A; % reset elements of temporary matrix to input elemens
end
end
end

Solving with ODE45 in Matlab with a variable which is timestep dependent.

I've been solving a pretty simple set of ODEs which, for a particular case, produce results as I'd expect. The required output is a graph of the 'population' which is just a manipulation of the results output by ODE.
The issue is that I have some variable del which must be slowly incremented as time goes on (slowing being on the order of del_increment/time_increment << Omega^2, where Omega is set to 1). I can happily produce what I've been calling the 'static' case where I have a for loop incrementing del outside of ode45, however this is really just the solution for a static del value over a time period.
I'm struggling to see how I can have ODE45 increment del values [between a max and min] within the function as I have the restriction of small changes of del with time. This would show one plot, displaying small oscillations at zero population and a gradual increase to population =1, as opposed to my current multiple plots for each particular del value.
I've given what I've written below, and I hope this explanation makes sense.
del=-10:0.1:10; %start at large minus, with some small increment
Omega=1;
tmax=10; %integration time
for ret=1:length(del)
y=[0 0 1]; % starting conditions
[t,y]=ode45(#(t,y) psfour(t,y,del(ret)),[0 tmax],y);
plot(t,y(:,1)) % plot of one of the solutions
axis([0 max(t) -1 1])
p_11=0.5*(1+y(:,3)); %calculation of the population of an energy level
figure(2)
plot(t,p_11)
axis([0 max(t) 0 1])
pause(0.05) % asthetic pause so I can see results
end
and my function
function dydt = react(t,y,del)
dydt = zeros(size(y));
Omega=1;
A = y(1);
B = y(2);
C = y(3);
dydt(1) = del*B;
dydt(2) = -del*A + Omega*C;
dydt(3) = -Omega*B;
% del=del + Omega*t;
end

simulating a random walk in matlab

i have variable x that undergoes a random walk according to the following rules:
x(t+1)=x(t)-1; probability p=0.3
x(t+1)=x(t)-2; probability q=0.2
x(t+1)=x(t)+1; probability p=0.5
a) i have to create this variable initialized at zero and write a for loop for 100 steps and that runs 10000 times storing each final value in xfinal
b) i have to plot a probability distribution of xfinal (a histogram) choosing a bin size and normalization!!* i have to report the mean and variance of xfinal
c) i have to recreate the distribution by application of the central limit theorem and plot the probability distribution on the same plot!
help would be appreciated in telling me how to choose the bin size and normalize the histogram and how to attempt part c)
your help is much appreciated!!
p=0.3;
q=0.2;
s=0.5;
numberOfSteps = 100;
maxCount = 10000;
for count=1:maxCount
x=0;
for i = 1:numberOfSteps
random = rand(1, 1);
if random <=p
x=x-1;
elseif random<=(p+q)
x=x-2;
else
x=x+1;
end
end
xfinal(count) = x;
end
[f,x]=hist(xfinal,30);
figure(1)
bar(x,f/sum(f));
xlabel('xfinal')
ylabel('frequency')
mean = mean(xfinal)
variance = var(xfinal)
For the first question, check the help for hist on mathworks homepage
[nelements,centers] = hist(data,nbins);
You do not select the bin size, but the number of bins. nelements gives the elements per bin and center is all the bin centers. So to say, it would be the same to call
hist(data,nbins);
as
[nelements,centers] = hist(data,nbins);
plot(centers,nelements);
except that the representation is different (line or pile). To normalize, simply divide nelements with sum(nelements)
For c, here i.i.d. variables it actually is a difference if the variables are real or complex. However for real variables the central limit theorem in short tells you that for a large number of samples the distribution will limit the normal distribution. So if the samples are real, you simply asssumes a normal distribution, calculates the mean and variance and plots this as a normal distribution. If the variables are complex, then each of the variables will be normally distributed which means that you will have a rayleigh distribution instead.
Mathworks is deprecating hist that is being replaced with histogram.
more details in this link
You are not applying the PDF function as expected, the expression Y doesn't work
For instance Y does not have the right X-axis start stop points. And you are using x as input to Y while x already used as pivot inside the double for loop.
When I ran your code Y generates a single value, it is not a vector but just a scalar.
This
bar(x,f/sum(f));
bringing down all input values with sum(f) division? no need.
On attempting to overlap the ideal probability density function, often one has to do additional scaling, to have both real and ideal visually overlapped.
MATLAB can do the scaling for us, and no need to modify input data /sum(f).
With a dual plot using yyaxis
You also mixed variance and standard deviation.
Instead try something like this
y2=1 / sqrt(2*pi*var1)*exp(-(x2-m1).^2 / (2*var1))
ok, the following solves your question(s)
codehere
clear all;
close all;
clc
p=0.3; % thresholds
q=0.2;
s=0.5;
n_step=100;
max_cnt=10000;
n_bin=30; % histogram amount bins
xf=zeros(1,max_cnt);
for cnt=1:max_cnt % runs loop
x=0;
for i = 1:n_step % steps loop
t_rand1 = rand(1, 1);
if t_rand1 <=p
x=x-1;
elseif t_rand1<=(p+q)
x=x-2;
else
x=x+1;
end
end
xf(cnt) = x;
end
% [f,x]=hist(xf,n_bin);
hf1=figure(1)
ax1=gca
yyaxis left
hp1=histogram(xf,n_bin);
% bar(x,f/sum(f));
grid on
xlabel('xf')
ylabel('frequency')
m1 = mean(xf)
var1 = var(xf)
s1=var1^.5 % sigma
%applying central limit theorem %finding the mean
n_x2=1e3 % just enough points
min_x2=min(hp1.BinEdges)
max_x2=max(hp1.BinEdges)
% quite same as
min_x2=hp1.BinLimits(1)
max_x2=hp1.BinLimits(2)
x2=linspace(min_x2,max_x2,n_x2)
y2=1/sqrt(2*pi*var1)*exp(-(x2-m1).^2/(2*var1));
% hold(ax1,'on')
yyaxis right
plot(ax1,x2,y2,'r','LineWidth',2)
.
.
.
note I have not used these lines
% Xp=-1; Xq=-2; Xs=1; mu=Xp.*p+Xq.*q+Xs.*s;
% muN=n_step.*mu;
%
% sigma=(Xp).^2.*p+(Xq).^2.*q+(Xs).^2.s; % variance
% sigmaN=n_step.(sigma-(mu).^2);
People ususally call sigma to variance^.5
This supplied script is a good start point to now take it to wherever you need it to go.

MATLAB: Correlation with a seed region

By default, all built-in functions for computing correlation or covariance return a matrix. I am trying to write an efficient function that will compute the correlation between a seed region and various other regions, but I do not need the correlations between the other regions. I assume that computing the full correlation matrix would therefore be inefficient.
I could instead compute a the correlation matrix between each region and the seed region, choose one of the off diagonal points and store it, but I feel like looping in this situation is also inefficient.
To be more concrete, each point in my 3-dimensional space has a time dimension. I am attempting to compute the mean correlation between a given point and all points in space within a given radius. I want to repeat this procedure hundreds of thousands of times, for many different radius lengths, and so on, so I would like for this to be as efficient as possible.
So, what is the best way to compute the correlation between a single vector and several others, without computing correlations that I will just ignore?
Thank you,
Chris
EDIT: Here is my code now...
function [corrMap] = TIME_meanCorrMap(A,radius)
% Even though the variable is "radius", we work with cubes for simplicity...
% So, the radius is the distance (in voxels) from the center of the cube an edge.
denom = ((radius*2)^3)-1;
dim = size(A);
corrMap = zeros(dim(1:3));
for x = radius+1:dim(1)-radius
rx = [x-radius : x+radius];
for y = radius+1:dim(2)-radius
ry = [y-radius : y+radius];
for z = radius+1:dim(3)-radius
rz = [z-radius : z+radius];
corrCoefs = zeros(1,denom);
seed = A(x,y,z,:);
i=0;
for xx = rx
for yy = ry
for zz = rz
if ~all([x y z] == [xx yy zz])
i = i + 1;
temp = corrcoef(seed,A(xx,yy,zz,:));
corrCoeffs(i) = temp(1,2);
end
end
end
end
corrMap = mean(corrCoeffs);
end
end
end
EDIT: Here are some more times to supplement the accepted answer.
Using bsxfun() to do normalization, and matrix multiplication to compute correlations:
tic; for i=1:10000
x=rand(100);
xz = bsxfun(#rdivide,bsxfun(#minus,x,mean(x)),std(x));
cc = xz(:,2:end)' * xz(:,1) ./ 99;
end; toc
Elapsed time is 6.928251 seconds.
Using zscore() to normalize, matrix multiplication to compute correlations:
tic; for i=1:10000
x=rand(100);
xz = zscore(x);
cc = xz(:,2:end)' * xz(:,1) ./ 99;
end; toc
Elapsed time is 7.040677 seconds.
Using bsxfun() to normalize, and corr() to compute correlations.
tic; for i=1:10000
x=rand(100);
xz = bsxfun(#rdivide,bsxfun(#minus,x,mean(x)),std(x));
cc = corr(x(:,1),x(:,2:end));
end; toc
Elapsed time is 11.385707 seconds.
It is certainly possible to improve upon the for loop that you are currently employing. The correlation compuattions can be parallelized using matrix multiplications if you have sufficient RAM. However, it will require you to unwrap your 4-dimensional data matrix A into a different shape. most likely you are dealing with 3-dimensional voxelwise fMRI data, in which case you'll have to reshape from [x y z time] matrix to an [index time] matrix. I will assume you can deal with that reshaping. Once you have your seed timecourse [Time by 1] and your target timecourses [Time by NumTargets] ready, you can perform some much more efficient computations.
A quick way to efficiently compute the desired correlation is using the corr function in MATLAB. This function will accept 2 matrix arguments and it will quite efficiently compute all pairwise correlations between the columns of argument 1 and the columns of argument 2, e.g.
T = 200; %time samples
N = 20; %number of other voxels
seed = randn(T,1); %data from seed voxel
targets = randn(T,N); %data from target voxels
%here is the for loop method
tic
for n = 1:N
tmp = corrcoef(seed, targets(:,n));
tmpcc = tmp(1,2);
end
looptime = toc;
%here is the parallel method
tic
cc = corr(seed, targets);
matrixtime = toc;
On my machine, the parallel operation in corr is faster than the loop method by a factor proportional to T*N.
It is possible to go a little faster than the corr function if you are willing to perofrm the underlying matrix operations yourself, and in any case it is worth knowing what they are. The correlation between two vectors is basically a normalized dot product, so using the conventions above you can compute the correlations in the following way
zseed = zscore(seed); %normalize the seed timecourse by z-scoring
ztargets= zscore(targets); %normalize the target timecourses by z-scoring
ztargets = ztargets'; %flip columns and rows for convenience
cc2 = ztargets*zseed./(T-1); %compute many dot products with one matrix multiplication
The code above is basically what the corr function will do which is why it is much faster than the loop. Note that most of the operation time is in the zscore operations, and you can improve on the performance of the corr function if you efficiently compute the zscore using the bsxfun command. For now, I hope this gives you some direction on how to compute a correlation between a seed timecourse and many target timecourses without having to loop through and compute each one separately.