Limiting large array to 1D in MATLAB - matlab

I'm working with XShooter data and for galactic corrections, I'm using ccm_unred in MATLAB. The problem is
funred = flux*10.^(0.4*A_lambda);
this line of code generates a 29686 X 29686 double array. I want only one side of it, I can do it by reassigning funred as funred = funred(:,1) but this piece of code also takes 57 seconds to be executed and uses up my CPU and RAM too much for my laptop to stay stable. Is there any method by which I can limit the generation of funred to only (:,1) from the beginning?

You say that your code generates a 29686 X 29686 matrix, however you are doing element-wise operations in your equation. That means that either flux or A_lambda must be 29686 X 29686. Just slice the ones that are that size!
Assuming one of them is 29686 X 29686
funred = flux(:,1)*10.^(0.4*A_lambda(:,1));
Just remove the (:,1) from the one that is not a matrix.
If both of them are matrices, then you cannot do it, as flux*... would need the whole matrix to operate.
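Another possible reading of the symptom, offered as an assumption because the question does not state the shapes: if flux is an N-by-1 column and A_lambda is a 1-by-N row, then * is a matrix (outer) product and returns an N-by-N array even though both inputs are vectors. A minimal sketch with small stand-in data, forcing both inputs to columns with (:) and using the element-wise .* operator:

```matlab
% Stand-in data (hypothetical values); the real vectors have 29686 elements.
flux     = [1.0; 2.0; 3.0];      % column vector, N-by-1
A_lambda = [0.1, 0.2, 0.3];      % row vector, 1-by-N

% Matrix product of a column and a row: N-by-N outer product.
outer = flux * 10.^(0.4 * A_lambda);

% Element-wise product of two columns: N-by-1, no large intermediate array.
funred = flux(:) .* 10.^(0.4 * A_lambda(:));
```

In this sketch funred holds the diagonal of the outer product, i.e. each flux value corrected by its own A_lambda, which may or may not be what the first column of the original result was meant to contain.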

Related

Matlab Horzcat - Out of memory

Any trick to avoid an out of memory error in matlab?
I am assuming that the reason it shows up is because matlab is very inefficient in using horzcat and actually needs to temporarily duplicate matrices.
I have a matrix A with size 108977555 x 25. I want to merge this with three vectors d, m and y with size 108977555 x 1 each.
My machine has 32GB RAM, and the above matrix plus the vectors occupy 18GB.
Now I want to run the following command:
A = [A(:,1:3), d, m, y, A(:,5:end)];
But that yields the error:
Error using horzcat
Out of memory. Type HELP MEMORY for your options.
Any trick to do this merge?
Working with Large Data Sets. If you are working with large data sets, you need to be careful when increasing the size of an array to avoid getting errors caused by insufficient memory. If you expand the array beyond the available contiguous memory of its original location, MATLAB must make a copy of the array and set this copy to the new value. During this operation, there are two copies of the original array in memory.
Restart MATLAB; I often find it doesn't fully clean up its memory, or the memory gets fragmented, leading to lower maximal array sizes.
Change your datatype (if you can). E.g. if you're only dealing with numbers 0 - 255, use uint8; the memory size will reduce by a factor of 8 compared to an array of doubles.
Start off with A already large enough (i.e. 108977555x27 instead of 108977555x25) and insert in place:
A(:, 4) = d;
clear d
A(:, 5) = m;
clear m
A(:, 6) = y;
Merge the data into one datatype to reduce the total memory requirement, e.g. a date easily fits into one uint32.
Leave the data separated, think about why you want the data in one matrix in the first place and if that is really necessary.
Use C-code to do the data allocation yourself (only if you're really desperate)
Further reading: https://nl.mathworks.com/help/matlab/matlab_prog/memory-allocation.html
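The suggestion of merging a date into one uint32 could look roughly like the sketch below. It assumes d, m and y hold day, month and year as whole numbers (the question does not say so), and the names packed, d2, m2 and y2 are made up for illustration. Stored this way the date costs 4 bytes per row instead of 24 bytes for three doubles, but it has to live in its own uint32 column rather than inside the double matrix A.

```matlab
% Hypothetical day, month and year columns (small stand-ins for d, m, y).
d = [1; 15; 31];
m = [1; 6; 12];
y = [1999; 2010; 2024];

% Pack each date as yyyymmdd into a single uint32 value.
packed = uint32(y * 10000 + m * 100 + d);

% Unpack when the separate components are needed again.
y2 = double(idivide(packed, uint32(10000)));
m2 = double(mod(idivide(packed, uint32(100)), uint32(100)));
d2 = double(mod(packed, uint32(100)));
```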
Even if you could make it work using Gunther's suggestions, it will just occupy memory. Right now it takes more than half of the available memory. So, what are you planning to do then? Even a simple B = A+1 doesn't fit. The only thing you can do is stuff like sum, or operations on part of the array.
So, you should consider going to tall arrays and other related big data concepts, which are exactly meant to work with such large datasets.
https://www.mathworks.com/help/matlab/tall-arrays.html
You can first try the efficient memory management strategies as mentioned on the official mathworks site : https://in.mathworks.com/help/matlab/matlab_prog/strategies-for-efficient-use-of-memory.html
Use Single (4 bytes) or some other smaller data type instead of Double (8 bytes) if your code can work with that.
If possible use block processing (like rows or columns) i.e. store blocks as separate mat files and load and access only those parts of the matrix which are required.
Use matfile command for loading large variables in parts. Perhaps something like this :
save('A.mat','A','-v7.3')
oldMat = matfile('A.mat');
clear A
newMat = matfile('Anew.mat','Writable',true); % empty matfile
for i = 1:27
    if (i<4),  newMat.A(:,i) = oldMat.A(:,i);   end
    if (i==4), newMat.A(:,i) = d;               end
    if (i==5), newMat.A(:,i) = m;               end
    if (i==6), newMat.A(:,i) = y;               end
    if (i>6),  newMat.A(:,i) = oldMat.A(:,i-2); end
end

MATLAB: How to apply a vectorized function using sparsity structure?

I need to (repeatedly) build a vector of length 200 from a vector of length 2500. I can describe this operation using multiplication by a matrix which is extremely sparse: it is 200x2500 and has only one entry in each row. But I have very little control over where this entry is. My actual problem is that I need to apply this matrix not to the vector that I currently have, but rather to some componentwise function of this vector. Since I have all this sparsity, it is wasteful to apply this componentwise function to all 2500 components of my vector. Instead I would rather apply it only to the 200 components that actually contribute.
A program (with randomly chosen numbers in place of my actual numbers) which would have a similar problem would be something like this:
ind=randi(2500,200,1);
coefficients=randn(200,1);
A=sparse(1:200,ind,coefficients,200,2500);
x=randn(2500,1);
y=A*subplus(x);
What I don't like here is applying subplus to all of x; I would rather only have to apply it to x(ind), since only that contributes to the matrix product.
Right now the only way I can see to work around this is to replace my sparse matrix with a 200-component vector of coefficients and a 200-component vector of indices. Working this way, the code above would become:
ind=randi(2500,200,1);
coefficients=randn(200,1);
x=randn(2500,1);
y=coefficients.*subplus(x(ind))
Is there a better way to do this, preferably one that would work when A contains a few elements per row instead of just one?
The code in your question throws an exception, I think it should be:
n=2500;
m=200;
ind=randi(n,m,1);
coefficients=randn(m,1);
A=sparse(1:m,ind,coefficients,m,n);
x=randn(n,1);
Your idea using x(ind) was basically right, but ind would reorder x which is not intended. Instead you could use sort(unique(ind)). I opted to use the sparse logical index any(A~=0) because I expect it to be faster, but you could compare both versions.
%original code
y=A*subplus(x);

%multiplication using sparse logical indexing:
relevant=any(A~=0);
y=A(:,relevant)*subplus(x(relevant));

%fixed version of your code
relevant=sort(unique(ind));
y=A(:,relevant)*subplus(x(relevant));
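The same column-mask idea should carry over to the case the question asks about, where A has a few entries per row. The sketch below is an illustration under assumptions: k = 3 entries per row is an arbitrary choice, and max(v,0) is used as a stand-in for subplus so that the example does not depend on a toolbox.

```matlab
n = 2500; m = 200; k = 3;            % k nonzeros per row (assumed for illustration)
rows = repmat((1:m)', k, 1);
cols = randi(n, m*k, 1);
vals = randn(m*k, 1);
A = sparse(rows, cols, vals, m, n);  % duplicate (row,col) pairs are summed by sparse
x = randn(n, 1);

f = @(v) max(v, 0);                  % stand-in for subplus (positive part)

% Logical column mask of the entries of x that actually contribute.
relevant = full(any(A ~= 0, 1))';

yFull   = A * f(x);                          % applies f to all n entries
ySubset = A(:, relevant) * f(x(relevant));   % applies f to the used entries only
```

Both products should agree up to rounding, while f is evaluated on at most m*k entries of x instead of all n.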

Detect signal jumps relative to local activity

In Matlab, is it possible to measure local variation of a signal across an entire signal without using for loops? I.e., can I implement the following:
window_length = <something>
for n = 1:(length_of_signal - window_length/2)
global_variance(n) = var(my_signal(1:window_length))
end
in a vectorized format?
If you have the image processing toolbox, you can use STDFILT:
global_std = stdfilt(my_signal(:),ones(window_length,1));
% square to get the variance
global_variance = global_std.^2;
You could create a 2D array where each row is shifted by one relative to the row above, with the number of columns equal to the window length; then computing the variance is trivial. This doesn't require any toolboxes. Not sure if it's much faster than the for loop though:
longSignal = repmat(my_signal(:), [1 window_length+1]);
longSignal = reshape(longSignal(1:((length_of_signal+1)*window_length)), [length_of_signal+1, window_length]);
global_variance = var(longSignal, 0, 2);
global_variance = global_variance(1:length_of_signal-window_length+1);
Note that each column is shifted up by one relative to the column before it, which means that row n holds my_signal(n:n+window_length-1). var(..., 0, 2) then computes the variance along each row, which gives you a column vector with the results you want. However, there is a bit of wrapping of data going on at the end, so we have to limit the output to the number of "good" values.
I don't have matlab handy right now (I'm at home), so I was unable to test the above - but I think the general idea should work. It's vectorized - I can't guarantee it's fast...
Check the "moving window standard deviation" function at Matlab Central. Your code would be:
movingstd(my_signal, window_length, 'forward').^2
There's also moving variance code, but it seems to be broken.
The idea is to use the filter function to compute the running sums.
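A minimal sketch of the filter idea, not taken from the File Exchange code: the running sums of x and x.^2 over the last window_length samples give the sample variance through (sum(x.^2) - sum(x)^2/W)/(W-1). The test signal and window size are made up, and local_variance is an illustrative name.

```matlab
my_signal     = randn(1000, 1);   % hypothetical test signal
window_length = 25;               % hypothetical window size
N = numel(my_signal);

kernel = ones(window_length, 1);
s1 = filter(kernel, 1, my_signal);      % running sum of x over the last W samples
s2 = filter(kernel, 1, my_signal.^2);   % running sum of x.^2 over the last W samples

% Sample variance of each window: (sum(x.^2) - sum(x)^2 / W) / (W - 1)
v = (s2 - s1.^2 / window_length) / (window_length - 1);

% Output n of filter covers samples n-W+1..n, so the first W-1 values are
% partial windows; entry k below corresponds to my_signal(k:k+W-1).
local_variance = v(window_length:N);
```

This formula can lose precision when the signal mean is large compared to its spread, so subtracting the overall mean first may help. In MATLAB R2016a and later, the built-in movvar function covers the same use case.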

Matlab for loop vectorization and memory

X, Y and z are coordinates representing a surface. In order to calculate some quantity, let's call it flow, at point i,j of the surface, I need to calculate the contribution from all other points (i0,j0). To do so I need, for example, to know the cosines of the angles between point i0,j0 and all other points (alpha). Then all contributions from i0,j0 must be multiplied by some constants and added. zv0 at every point i,j is the final result I need.
I came up with the code below, and it seems to be extremely inappropriate. First of all, it slows down the rest of the program and seems to use all of the available memory. My system has 4 GB of physical memory and a 12 GB swap file, and it always runs out of memory, though none of the variables is bigger than 10 KB. Please help with speed-up/vectorization and the memory problems.
parfor i0 = 2:length(x00)
    for j0 = 2:length(y00)
        zv = red3dfunc(X0,Y0,f,z0,i0,j0,st,ang,nx,ny,nz);
        zv0 = zv0 + zv;
    end
end

function zv = red3dfunc(X,Y,f,z,i0,j0,st,ang,Nx,Ny,Nz)
% Contribution of point (i0,j0) to every point of the surface.
x1 = X(i0,j0);
y1 = Y(i0,j0);
z1 = z(i0,j0);
XXa = X - x1;
YYa = Y - y1;
ZZa = z - z1;
VEC = (XXa.^2 + YYa.^2 + ZZa.^2).^(1/2);
VEC(i0,j0) = VEC(i0-1,j0-1);   % avoid dividing by zero at the point itself
XXa = XXa./VEC;
YYa = YYa./VEC;
ZZa = ZZa./VEC;
alpha = -(Nx(i0,j0).*XXa + Ny(i0,j0).*YYa + Nz(i0,j0).*ZZa);
betha = Nx.*XXa + Ny.*YYa + Nz.*ZZa;
r = VEC;
zv = (1/pi)*st^2*ang.*f.*alpha.*betha./r.^2;
end
The obvious way to do this is to use the Kronecker product. The MATLAB function is kron(A,B) for matrices of dimensions nAxmA and nBxmB. This function will return a matrix of dimension (nA*nB)x(mA*mB), which will look something like
[a11*B a12*B ... a1mA*B;
.......................;
anA1*B ........ anAmA*B]
So your problem may be solved by introducing the matrix of ones I=ones(size(X)). You will then define your XXa, YYa, ZZa and VEC matrices without any loop as
XXa = kron(I,X)-kron(X,I);
YYa = kron(I,Y)-kron(Y,I);
ZZa = kron(I,z)-kron(z,I);
VEC=((XXa).^2+(YYa).^2+(ZZa).^2).^(1/2);
You will then find VEC for any i0,j0 as (if you define n and m as size components of X)
VEC((1+n*(i0-1)):(n*i0),(1+m*(j0-1)):(m*j0))
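A small check of the block indexing, using a made-up 3-by-4 grid rather than the real coordinates, may help to see what the Kronecker construction stores. It also shows the cost: the result has (n*m)^2 elements, so the approach trades the loop for memory, and on a machine that already runs out of memory it is probably only practical for small grids.

```matlab
X = magic(4);  X = X(1:3, :);     % small 3-by-4 stand-in for the coordinate grid
[n, m] = size(X);
I = ones(n, m);

XXa = kron(I, X) - kron(X, I);    % (n*n)-by-(m*m): all pairwise differences

i0 = 2; j0 = 3;                   % hypothetical reference point
block = XXa((1 + n*(i0-1)):(n*i0), (1 + m*(j0-1)):(m*j0));

% The block matches the per-point difference computed inside the loop version.
isSame = isequal(block, X - X(i0, j0));

% Memory grows as (n*m)^2 doubles: a 100-by-100 grid already needs 1e8 elements.
bytesFor100 = (100*100)^2 * 8;    % 8e8 bytes, about 0.8 GB per matrix
```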

4 dimensional matrix

I need to use a 4-dimensional matrix as an accumulator for voting on 4 parameters. Every parameter varies in the range 1 to 300. For that, I define Acc = zeros(300,300,300,300) in MATLAB, and somewhere, for example, I use:
Acc(4,10,120,78)=Acc(4,10,120,78)+1
However, MATLAB reports an error because of the memory limitation:
??? Error using ==> zeros
Out of memory. Type HELP MEMORY for your options.
Below you can see a part of my code:
I = imread('image.bmp'); %I is logical 300x300 image.
Acc = zeros(100,100,100,100);
for i = 1:300
    for j = 1:300
        if I(i,j)==1
            for x0 = 3:3:300
                for y0 = 3:3:300
                    for a = 3:3:300
                        b = abs(j-y0)/sqrt(1-((i-x0)^2) / (a^2));
                        b1=floor(b/3);
                        if b1==0
                            b1=1;
                        end
                        a1=ceil(a/3);
                        Acc(x0/3,y0/3,a1,b1) = Acc(x0/3,y0/3,a1,b1)+1;
                    end
                end
            end
        end
    end
end
As @Rasman mentioned, you probably want to use a sparse representation of the matrix Acc.
Unfortunately, the sparse function is geared toward 2D matrices, not arbitrary n-D.
But that's ok, because we can take advantage of sub2ind and linear indexing to go back and forth to 4D.
Dims = [300, 300, 300, 300]; % it will be a 300 by 300 by 300 by 300 matrix
Acc = sparse([], [], [], prod(Dims), 1, ExpectedNumElts);
Here ExpectedNumElts should be some number like 30 or 9000 or however many non-zero elements you expect for the matrix Acc to have. We notionally think of Acc as a matrix, but actually it will be a vector. But that's okay, we can use sub2ind to convert 4D coordinates into linear indices into the vector:
ind = sub2ind(Dims, 4, 10, 120, 78);
Acc(ind) = Acc(ind) + 1;
You may also find the functions find, nnz, spy, and spfun helpful.
edit: see lambdageek for the exact same answer with a bit more elegance.
The other answers are guiding you towards a sparse matrix instead of your current dense solution. This is made a little more difficult since current MATLAB doesn't support N-dimensional sparse arrays. One way to implement it is to
replace
zeros(100,100,100,100)
with
sparse(100*100*100*100,1)
This will store all your counts in a sparse array; as long as most remain zero, you will be OK for memory.
Then to access this data, instead of:
Acc(h,i,j,k)=Acc(h,i,j,k)+1
use (note the -1 offsets, since the subscripts are 1-based):
index = h+100*(i-1)+100*100*(j-1)+100*100*100*(k-1)
Acc(index,1)=Acc(index,1)+1
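As a quick consistency check, the hand-written linear index for 1-based subscripts can be compared with sub2ind, and ind2sub recovers the 4-D subscripts of the cells that received votes. The subscript values here are arbitrary examples.

```matlab
Dims = [100, 100, 100, 100];
Acc  = sparse(prod(Dims), 1);           % all-zero sparse column vector

h = 4; i = 10; j = 40; k = 26;          % hypothetical 1-based subscripts
manualIdx  = h + 100*(i-1) + 100^2*(j-1) + 100^3*(k-1);
builtinIdx = sub2ind(Dims, h, i, j, k);

Acc(builtinIdx) = Acc(builtinIdx) + 1;  % cast one vote
Acc(builtinIdx) = Acc(builtinIdx) + 1;  % and another for the same cell

% Recover the 4-D subscripts of all cells that have votes.
[hh, ii, jj, kk] = ind2sub(Dims, find(Acc));
```

Incrementing a sparse array one element at a time can be slow, so collecting the linear indices first and adding them in one step may be worth trying if the loop becomes the bottleneck.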
See Avoiding 'Out of Memory' Errors
Your statement would require far more than 4 GB of RAM (around 65 GB, to be specific: 300^4 elements at 8 bytes each).
Solutions to 'Out of Memory' problems fall into two main categories:
Maximizing the memory available to MATLAB (i.e., removing or increasing limits) on your system via operating system selection and system configuration. These usually have the greatest overall applicability but are potentially the most disruptive (e.g. using a different operating system). These techniques are covered in the first two sections of this document.
Minimizing the memory used by MATLAB by making your code more memory efficient. These are all algorithm and application specific and therefore are less broadly applicable. These techniques are covered in later sections of this document.
In your case the latter seems to be the solution: try reducing the amount of memory used / required.
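As a rough check of what the dense accumulator costs, the arithmetic can be done directly. The uint32 line is an assumption that 32-bit counters are large enough for the vote counts.

```matlab
bytesPerDouble = 8;
bytes300 = 300^4 * bytesPerDouble;   % zeros(300,300,300,300)
bytes100 = 100^4 * bytesPerDouble;   % zeros(100,100,100,100)
bytes100_uint32 = 100^4 * 4;         % zeros(100,100,100,100,'uint32')

fprintf('300^4 doubles: %.1f GB\n', bytes300 / 1e9);
fprintf('100^4 doubles: %.1f GB\n', bytes100 / 1e9);
fprintf('100^4 uint32 : %.1f GB\n', bytes100_uint32 / 1e9);
```

So the 300-per-dimension accumulator is out of reach as a dense double array on most machines, while the 100-per-dimension version used in the posted code needs about 0.8 GB, or half that with uint32 counters.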