Generating iterative sequence without using for-loop - matlab

I want to generate a random series x of length N through the following rule, based on the non-central chi-squared distribution:
x_{n+1} ~ χ²_ν(λ·x_n)
where ν is a given constant representing the degrees of freedom, λ is also pre-specified, the product λ·x_n is the non-centrality parameter, and x_1 is given.
I wrote the following code to generate such a sequence and timed it with x_1 = 0.04, ν = 0.005, λ = 100 and N = 1e5:
tic;
N = 1e5;
x = zeros(1,N);
x(1) = 0.04;
nu = 0.005;
lambda = 100;
for i = 1:N-1
    x(i+1) = ncx2rnd(nu,lambda*x(i));
end
toc;
To illustrate my question, I tested another, different example: generating N = 1e5 independent samples from the distribution χ²_ν(λ) with ν = 0.005 and λ = 100, first with a loop and then without one:
tic;
N = 1e5;
x = zeros(1,N);
nu = 0.005;
lambda = 100;
for i = 1:N
    x(i) = ncx2rnd(nu,lambda);
end
toc;
tic;
N = 1e5;
nu = 0.005;
lambda = 100;
x = ncx2rnd(nu,lambda*ones(1,N));
toc;
These two approaches produce equivalent results. However, it turns out that the second approach, which does not use a for-loop, is much faster than the first. The difference between the two examples is that in the second, the rule generating a sample does not require any information about previous samples, so all samples can be generated simultaneously without a for-loop; that is not the case in the first. Based on this, I wonder whether avoiding the for-loop would accelerate execution. Is there any MATLAB built-in function to generate a random series like the one in the first example without a for-loop, when the rule of dependence on previous samples is explicit? If the rule were linear, I know the function filter would be a possible choice; what about cases like the first example?

Logically it's impossible to calculate something iterative without doing the iterations. If x(n+1) is dependent on x(n) then you must calculate x(n) first; there is no "clever trick" here.
That just leaves us to optimise the calculation within the loop, specifically ncx2rnd. As with most MATLAB in-built functions, it is already fairly concise and performant, but there are some things to consider. Note that what I'm about to suggest involves using edit ncx2rnd to look inside this in-built function, which contains code under MathWorks copyright; I'm simply noting observations about how it works.
There are some input checks to handle incorrectly sized inputs and/or inputs with negative values. If you can take the burden of validation on yourself (i.e. you know your inputs are valid) then you can reduce the function to its single mathematical operation:
% function r = ncx2rnd(v,delta)
% (sizeOut is the requested output size inside the original function)
r = 2.*randg(poissrnd(delta./2, sizeOut)) + 2.*randg(v./2, sizeOut);
Running this standalone saves around 20% of the processing time (with a nominal N=1e5); that saving was the cost of the input validation.
In the MathWorks syntax, delta is equal to your lambda*x(i). The other term, involving v, is independent of your x, so you can compute it outside of the loop, i.e. vectorise one of the calls to randg. Again using N=1e5, this brings the total time saving to around 25%.
The result would mean this change to your example:
% Common inputs
N = 1e5;
nu = 0.1;
lambda = 0.1;
% Baseline example
x = zeros(1,N);
x(1) = 0.04;
for i = 1:N-1
    x(i+1) = ncx2rnd(nu,lambda*x(i));
end
% ~25% faster alternative, with no input validation and partially vectorised
x = zeros(1,N);
x(1) = 0.04;
vTerm = 2.*randg(nu./2, [1,N]);
for i = 1:N-1
    x(i+1) = 2.*randg(poissrnd(lambda*x(i)./2, [1,1])) + vTerm(i);
end

Related

Implementation of steepest descent in Matlab

I have to implement the steepest descent method and test it on functions of two variables, using Matlab. Here's what I did so far:
x_0 = [0;1.5]; %Initial guess
alpha = 1.5; %Step size
iteration_max = 10000;
tolerance = 10e-10;
% Two anonymous functions to compute the 1st and 2nd entries of the gradient
f = @(x,y) (cos(y) * exp(-(x-pi)^2 - (y-pi)^2) * (sin(x) - 2*cos(x)*(pi-x)));
g = @(x,y) (cos(x) * exp(-(x-pi)^2 - (y-pi)^2) * (sin(y) - 2*cos(y)*(pi-y)));
% Initialization
iter = 0;
grad = [1; 1]; %Gradient
while (norm(grad,2) >= tolerance)
    grad(1,1) = f(x_0(1), x_0(2));
    grad(2,1) = g(x_0(1), x_0(2));
    x_new = x_0 - alpha * grad; %New solution
    x_0 = x_new; %Update old solution
    iter = iter + 1;
    if iter > iteration_max
        break
    end
end
The problem is that, compared to, for example, the results from WolframAlpha, I do not obtain the same values. For this particular function I should obtain either (3.14, 3.14) or (1.3, 1.3), but I obtain (0.03, 1.4).
You should know that this method is a local search and thus it can get stuck in a local minimum, depending on the initial guess and step size.
With a different initial guess, it will find a different local minimum.
The step size is important because a big step size can prevent the algorithm from converging, while a small step size makes the algorithm really slow. This is why you should adapt the size of the steps as the function value decreases, as in the sketch below.
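Here is a minimal sketch of one common adaptation scheme, backtracking (Armijo) line search, reusing f and g from your code. It assumes the objective being minimised is the Easom-type function F below, which I inferred from your gradient components; F itself does not appear in the question:
F = @(p) -cos(p(1))*cos(p(2))*exp(-(p(1)-pi)^2 - (p(2)-pi)^2); % assumed objective
p = [2; 2]; % initial guess
for iter = 1:10000
    grad = [f(p(1), p(2)); g(p(1), p(2))];
    if norm(grad) < 1e-10
        break
    end
    alpha = 1; % start with a large step...
    while F(p - alpha*grad) > F(p) - 0.5*alpha*(grad'*grad) && alpha > 1e-12
        alpha = alpha/2; % ...and halve it until the step gives a sufficient decrease
    end
    p = p - alpha*grad;
end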
It is always a good idea to understand the function you want to optimize by plotting it (if possible). [The original answer included a surface plot of the function over the range [-pi, pi].]
With the following parameter values you will get to the local minimum you are looking for:
x_0 = [2;2]; %Initial guess
alpha = 0.5; %Step size
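To reproduce such a plot yourself, here is a small sketch, again assuming the Easom-type objective inferred from the gradients above:
F = @(x,y) -cos(x).*cos(y).*exp(-(x-pi).^2 - (y-pi).^2); % assumed objective
[X, Y] = meshgrid(linspace(-pi, pi, 200));
surf(X, Y, F(X,Y), 'EdgeColor', 'none'); % surface over [-pi, pi]
xlabel('x'); ylabel('y');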

Imaginary part of coherence matlab

I want to calculate the imaginary part of coherency (IMC), i.e. the imaginary part of E[X(f)·Y*(f)] / sqrt(E[|X(f)|²]·E[|Y(f)|²]), where X and Y are the spectra of x and y.
I wrote the code below in MATLAB, but the result is not what I was expecting.
Is that code a valid implementation of the above definition?
Could someone help me with better code?
function [ imag_coherence ] = imcoh( x,y )
    xy = xcorr(fft(x), fft(y));
    xx = xcorr(fft(x));
    yy = xcorr(fft(y));
    imag_coherence = imag(xy./sqrt(xx.*yy));
end
xcorr actually computes the cross-correlation between the computed spectra (i.e. it sums contributions over all frequencies), not the expectation of the point-wise product of those spectra at each fixed frequency. The latter is what the coherency definition you provided uses.
Assuming the processes producing x and y are ergodic, the expectations can be estimated by averaging over many blocks of data. With that in mind, an implementation of the coherency as described in your definition could look like this:
function [ result ] = coherency( x,y,N )
    % divide the data into N equal-length blocks for averaging later on
    L = floor(length(x)/N);
    xt = reshape(x(1:L*N), L, N);
    yt = reshape(y(1:L*N), L, N);
    % transform to the frequency domain
    Xf = fft(xt,L,1);
    Yf = fft(yt,L,1);
    % estimate the expectations by taking the average over the N blocks
    xy = sum(Xf .* conj(Yf), 2)/N;
    xx = sum(Xf .* conj(Xf), 2)/N;
    yy = sum(Yf .* conj(Yf), 2)/N;
    % combine terms to get the final result
    result = xy./sqrt(xx.*yy);
end
If you only want the imaginary part, then it's a simple matter of computing imag(coherency(x,y,N)).
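As a quick sanity check (a hypothetical example, not from the original answer), two noisy sinusoids with a 90-degree phase lag should show a clear nonzero imaginary coherency near their shared frequency:
fs = 1000; t = (0:1/fs:10-1/fs)'; % 10 s at 1 kHz
x = sin(2*pi*10*t) + randn(size(t)); % 10 Hz component plus noise
y = sin(2*pi*10*t - pi/2) + randn(size(t)); % same component, lagged by 90 degrees
ic = imag(coherency(x, y, 50)); % average over 50 blocks
f = (0:length(ic)-1)*fs/length(ic); % frequency axis for the block length
plot(f(1:end/2), ic(1:end/2)); xlabel('Hz'); ylabel('imaginary coherency');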

Loopless Gaussian mixture model in Matlab

I have several Gaussian distributions and I want to draw different values from all of them at the same time. Since this is basically what a GMM does, I have looked into the MATLAB GMM implementation (gmrnd) and I have seen that it performs a simple loop over all the components.
I would like to implement this in a faster way, but the problem is that 3D matrices are involved. A simple version (with a loop) would be:
n = 10; % number of Gaussians
d = 2; % dimension of each Gaussian
mu = rand(d,n); % init some means
U = rand(d,d,n); % init some covariances with their Cholesky decomposition (Cov = U'*U)
I = repmat(triu(true(d,d)),1,1,n);
U(~I) = 0;
r = randn(d,n); % random values for drawing samples
samples = zeros(d,n);
for i = 1 : n
    samples(:,i) = U(:,:,i)' * r(:,i) + mu(:,i);
end
Is it possible to speed this up? I do not know how to deal with the 3D covariance matrix (without using cellfun, which is much slower).
A few improvements (hopefully they are improvements) could be suggested here.
PART #1: You can replace the following piece of code -
I = repmat(triu(true(d,d)),[1,1,n]);
U(~I) = 0;
with a bsxfun(@times,..) one-liner -
U = bsxfun(@times,triu(true(d,d)),U)
PART #2: You can kill the loopy portion of the code, again with bsxfun(@times,..), like so (note the sum runs along the first dimension, matching U(:,:,i)'*r(:,i)) -
samples = squeeze(sum(bsxfun(@times,U,permute(r,[1 3 2])),1)) + mu
I'm not fully convinced this is faster, but it gets rid of the loop. It would be interesting to see benchmarking results if you can do that; a sketch of a harness follows the code below. I also think this approach makes the code rather ugly and a bit hard to follow, but I'll let you decide between readability and performance.
Anyway, I decided to define one big n*d-dimensional Gaussian in which each block of d variates is independent of the others (as in the original). This allows defining the covariance as a block-diagonal matrix, which I build with blkdiag. From there, it is a matter of applying bsxfun to remove the need for looping.
Using the same random seed, I can recover the same samples as your code:
%// sampling with block diagonal covariance matrix
rng(1) %// set random seed
Ub = mat2cell(U, d, d, ones(n,1)); %// 1-by-1-by-10 cell of 2-by-2 matrices
C = blkdiag(Ub{:});
Ns = 1; %// number of samples
joint_samples = bsxfun(@plus, C'*randn(d*n, Ns), mu(:));
new_samples = reshape(joint_samples, [d n]); %// or [d n Ns] if Ns > 1
%//Compare to original
rng(1) %// set same seed for repeatability
r = randn(d,n); % random values for drawing samples
samples = zeros(d,n);
for i = 1 : n
    samples(:,i) = U(:,:,i)' * r(:,i) + mu(:,i);
end
isequal(samples, new_samples) %// true
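For the benchmarking mentioned above, here is a hypothetical harness in the same tic/toc style as the rest of this page (no results claimed; a single draw is too fast to time reliably, so each variant is repeated many times):
reps = 1e4;
tic; % original loop
for k = 1:reps
    s1 = zeros(d,n);
    for i = 1 : n
        s1(:,i) = U(:,:,i)' * r(:,i) + mu(:,i);
    end
end
toc;
tic; % bsxfun variant
for k = 1:reps
    s2 = squeeze(sum(bsxfun(@times, U, permute(r, [1 3 2])), 1)) + mu;
end
toc;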

1D gaussian filter over non equidistant data

I have data distributed over a non-equidistant 1D space and I need to convolve it with a Gaussian filter,
gaussFilter = sqrt(6.0/pi*delta**2)*exp(-6.0*x**2 /delta**2);
where delta is a constant and x corresponds to space.
Can anyone hint at how to perform a good (2nd-order) integration, given that the data is not equally spaced, while taking care of the finite ends? I intend to write the code in Fortran, but a MATLAB example is also welcome.
use this:
function yy = smooth1D(x,y,delta)
    n = length(y);
    yy = zeros(n,1);
    for i = 1:n
        ker = sqrt(6.0/pi*delta^2)*exp(-6.0*(x-x(i)).^2 /delta^2);
        % the Gaussian should be normalized (don't forget dx); if you don't
        % want to lose (signal) energy, uncomment the next line
        % ker = ker/sum(ker);
        yy(i) = y'*ker;
    end
end
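A hypothetical usage example on irregularly spaced samples (with the normalization line above uncommented, so the output stays on the same scale as the input):
x = sort(rand(200,1))*10; % irregular sample locations on [0, 10]
y = sin(x) + 0.2*randn(size(x)); % noisy signal
ys = smooth1D(x, y, 0.5); % smooth with width delta = 0.5
plot(x, y, '.', x, ys, '-'); legend('noisy', 'smoothed');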
Found something that works, though I'm not sure whether this is a very accurate way, as the integration (trapz) is only the trapezoidal rule.
function [fbar] = gaussf(f,x,delta)
    n = length(f);
    fbar = zeros(n,1);
    for i = 1:n
        kernel = sqrt(6/(pi*delta^2))*exp(-6*((x - x(i))/delta).^2);
        kernel = kernel/trapz(x,kernel); % normalize the kernel over the actual grid
        fbar(i) = trapz(x,f.*kernel);
    end
end
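For what it's worth, here is a hypothetical loop-free variant of the same idea, building all the kernels at once with bsxfun (assumes f and x are column vectors of equal length):
function fbar = gaussf_vec(f,x,delta)
    % K(i,j) = kernel centered at x(j), evaluated at x(i)
    K = sqrt(6/(pi*delta^2)) * exp(-6*(bsxfun(@minus, x, x')/delta).^2);
    K = bsxfun(@rdivide, K, trapz(x, K)); % normalize each column over the grid
    fbar = trapz(x, bsxfun(@times, K, f)); % integrate f against every kernel at once
    fbar = fbar(:);
end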

MATLAB: Correlation with a seed region

By default, all built-in functions for computing correlation or covariance return a matrix. I am trying to write an efficient function that will compute the correlation between a seed region and various other regions, but I do not need the correlations between the other regions. I assume that computing the full correlation matrix would therefore be inefficient.
I could instead compute a correlation matrix between each region and the seed region, pick the relevant off-diagonal entry, and store it, but I feel like looping in this situation is also inefficient.
To be more concrete, each point in my 3-dimensional space has a time dimension. I am attempting to compute the mean correlation between a given point and all points in space within a given radius. I want to repeat this procedure hundreds of thousands of times, for many different radius lengths, and so on, so I would like for this to be as efficient as possible.
So, what is the best way to compute the correlation between a single vector and several others, without computing correlations that I will just ignore?
EDIT: Here is my code now...
function [corrMap] = TIME_meanCorrMap(A,radius)
% Even though the variable is "radius", we work with cubes for simplicity...
% So, the radius is the distance (in voxels) from the center of the cube to an edge.
denom = ((radius*2+1)^3)-1; % number of neighbours in the cube, excluding the center
dim = size(A);
corrMap = zeros(dim(1:3));
for x = radius+1:dim(1)-radius
    rx = [x-radius : x+radius];
    for y = radius+1:dim(2)-radius
        ry = [y-radius : y+radius];
        for z = radius+1:dim(3)-radius
            rz = [z-radius : z+radius];
            corrCoefs = zeros(1,denom);
            seed = A(x,y,z,:);
            i = 0;
            for xx = rx
                for yy = ry
                    for zz = rz
                        if ~all([x y z] == [xx yy zz])
                            i = i + 1;
                            temp = corrcoef(seed,A(xx,yy,zz,:));
                            corrCoefs(i) = temp(1,2);
                        end
                    end
                end
            end
            corrMap(x,y,z) = mean(corrCoefs);
        end
    end
end
EDIT: Here are some more timings to supplement the accepted answer.
Using bsxfun() to do the normalization, and matrix multiplication to compute the correlations:
tic; for i=1:10000
    x = rand(100);
    xz = bsxfun(@rdivide, bsxfun(@minus, x, mean(x)), std(x));
    cc = xz(:,2:end)' * xz(:,1) ./ 99;
end; toc
Elapsed time is 6.928251 seconds.
Using zscore() to normalize, matrix multiplication to compute correlations:
tic; for i=1:10000
    x = rand(100);
    xz = zscore(x);
    cc = xz(:,2:end)' * xz(:,1) ./ 99;
end; toc
Elapsed time is 7.040677 seconds.
Using bsxfun() to normalize, and corr() to compute correlations.
tic; for i=1:10000
    x = rand(100);
    xz = bsxfun(@rdivide, bsxfun(@minus, x, mean(x)), std(x));
    cc = corr(x(:,1), x(:,2:end));
end; toc
Elapsed time is 11.385707 seconds.
It is certainly possible to improve upon the for loop that you are currently employing. The correlation computations can be parallelized using matrix multiplications if you have sufficient RAM. However, this will require you to unwrap your 4-dimensional data matrix A into a different shape. Most likely you are dealing with 3-dimensional voxelwise fMRI data, in which case you'll have to reshape from an [x y z time] matrix to an [index time] matrix. I will assume you can deal with that reshaping. Once you have your seed timecourse [time by 1] and your target timecourses [time by numTargets] ready, you can perform some much more efficient computations.
A quick way to efficiently compute the desired correlation is the corr function in MATLAB. This function accepts two matrix arguments and quite efficiently computes all pairwise correlations between the columns of argument 1 and the columns of argument 2, e.g.
T = 200; %time samples
N = 20; %number of other voxels
seed = randn(T,1); %data from seed voxel
targets = randn(T,N); %data from target voxels
%here is the for loop method
tic
for n = 1:N
    tmp = corrcoef(seed, targets(:,n));
    tmpcc = tmp(1,2);
end
looptime = toc;
%here is the parallel method
tic
cc = corr(seed, targets);
matrixtime = toc;
On my machine, the parallel operation in corr is faster than the loop method by a factor proportional to T*N.
It is possible to go a little faster than the corr function if you are willing to perform the underlying matrix operations yourself, and in any case it is worth knowing what they are. The correlation between two vectors is basically a normalized dot product, so using the conventions above you can compute the correlations in the following way:
zseed = zscore(seed); %normalize the seed timecourse by z-scoring
ztargets= zscore(targets); %normalize the target timecourses by z-scoring
ztargets = ztargets'; %flip columns and rows for convenience
cc2 = ztargets*zseed./(T-1); %compute many dot products with one matrix multiplication
The code above is basically what the corr function will do which is why it is much faster than the loop. Note that most of the operation time is in the zscore operations, and you can improve on the performance of the corr function if you efficiently compute the zscore using the bsxfun command. For now, I hope this gives you some direction on how to compute a correlation between a seed timecourse and many target timecourses without having to loop through and compute each one separately.