I have an m x n matrix of integers, where m and n are both fairly large (~1000). I want to iterate through all of the cells and perform some operations, such as reading a particular cell and assigning a value to another cell.
However, at least in my implementation, this is rather inefficient, as I have two nested for loops doing Matrix(a,b) = Matrix(a,b+1) or something along these lines. Is there any other way to do this? My current implementation takes a long time to traverse about 100,000 cells and perform some operations.
Thank you
In MATLAB, it's almost always possible to avoid loops.
If you want to do Matrix(a,b)=Matrix(a,b+1) for every cell, you should just do Matrix2=Matrix(:,2:end); which copies every column except the first in a single operation.
If you are more precise about what you do inside the loop, I can help you more.
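As a minimal sketch of the idea above, assuming the loop is meant to shift every column one position to the left in place (the 4x5 test matrix and the names shiftedLoop and shiftedVec are illustrative, not from the question), the loop and the vectorized assignment can be compared like this:

```matlab
% Small hypothetical test matrix (4 rows, 5 columns).
M = reshape(1:20, 4, 5);

% Loop version: each cell takes the value of its right-hand neighbour.
shiftedLoop = M;
for a = 1:size(M, 1)
    for b = 1:size(M, 2) - 1
        shiftedLoop(a, b) = shiftedLoop(a, b + 1);
    end
end

% Vectorized version: one indexed assignment replaces both loops.
shiftedVec = M;
shiftedVec(:, 1:end-1) = M(:, 2:end);

% Both versions leave the last column unchanged.
sameResult = isequal(shiftedLoop, shiftedVec);
```

The indexed assignment keeps the matrix size unchanged, whereas Matrix2=Matrix(:,2:end) returns a matrix with one column fewer; which one fits depends on what the rest of the loop body does.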
MATLAB uses column-major ordering of matrices in memory (unlike C, which is row-major). Are you sure you are iterating over the indices in the right order? The innermost loop should run over the row index. If it doesn't, try switching the loops and see if performance improves.
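A short sketch of the loop order this suggests, assuming a hypothetical 1000x1000 matrix A and a placeholder assignment in the loop body: the outer loop runs over columns and the inner loop over rows, so consecutive iterations touch neighbouring memory locations.

```matlab
% Preallocate so the matrix is not grown inside the loop.
A = zeros(1000, 1000);

for b = 1:size(A, 2)        % outer loop: columns
    for a = 1:size(A, 1)    % inner loop: rows (contiguous in memory)
        A(a, b) = a + b;    % placeholder operation, not from the question
    end
end
```

How much this helps depends on the matrix size and on what the loop body does, so it is worth timing both orders with tic and toc.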
If you can't get rid of the for loops, one possibility would be to rewrite the expensive operations in C and create a MEX file as described here.
I wonder why it is faster to square the elements of a matrix with the A2=A.^2 command (A being an LxL matrix) than to just do a double for loop and assign the values to a zeroed matrix.
I have run the following code to check the first case
tic
psi2=psi.^2;
T1=toc;
and the following for the second
psi2=zeros(L,L);
tic
for i=1:L
for j=1:L
psi2(i,j)=psi(i,j)^2;
end
end
T2=toc;
In this figure the elapsed time for several matrix sizes (L) is shown, and the speedup is clear.
I would not be surprised to see that MATLAB has a very efficient implementation of matrix multiplication, as that is what it is made for, but I can't understand how there's a faster way to do element-wise operations than just looping over the elements.
Thanks for your time.
There are several things that make the vector operation faster than your loop.
First, the built-in operation runs as precompiled native (C/C++) code, which is faster than a script loop that MATLAB has to interpret or JIT-compile at run time.
Secondly, the C or C++ compiler can use Single Instruction, Multiple Data (SIMD) instructions to perform the operation on multiple matrix elements at once. The work can then also be split across multiple threads.
Finally, it's possible to push the operation to the GPU which can process even more elements simultaneously (hundreds of cores, compared to 4-8 on the CPU). Your scripted loops cannot do this.
.^2 takes advantage of parallel execution on the CPU. With a nested (double) loop, the entire computation is done sequentially. In addition, the loop has the overhead of incrementing the loop control variable and checking the loop condition on every iteration.
I want to use pyfftw to repeatedly compute the discrete Fourier transform of a subset of rows of a two-dimensional array. I do not know in advance which rows I need to transform; that depends on the output from the previous round. I do know that doing it for all rows is wasteful.
It is my understanding that a 'plan' in FFTW3 is associated with the type of transform (c2c, r2c, etc) and the input/output length, which is always a vector in the 1D case. In pyfftw it looks like a 'plan' is associated to the type of transform and the input/output shape, so my interpretation is that it uses the same FFTW3 plan for every row.
My question is: is it possible to use the same FFTW3 plan for some of the rows, without creating separate pyfftw.FFTW objects for all possible combinations of rows?
On a different note, I am wondering how pyfftw uses multiple cores: does it use multiple cores for each row (this appears natural in view of FFTW3 documentation) or does it farm out different rows to different cores (which was my initial assumption)?
If you can create a numpy array from a view, you can plan for it with pyFFTW - all valid numpy arrays should work just fine.
This means several things:
Your array needs to have regular strides, but those strides can be arbitrary.
ND arrays are planned as ND transforms, with the selected axes being used.
You can probably do something cunning with stride tricks and it will probably work (but might not do what you expect if you do something too nefarious like overlapping rows and then use threads).
One solution that I've used quite a bit is to copy the rows that you want to transform into an interim array and transform that. You might well find that's the fastest option (particularly when you can allow for getting the byte offset correct).
Obviously, this doesn't work if you always have a different number of rows. You might still find that if you plan for the largest number of rows that are transformed and then copy in a subset you still do faster than otherwise.
The problem you're going to come up against, even if you go down to the C level, is that the planning overhead might well dominate if you're changing your transform sizes often.
You could also try pyfftw.interfaces.numpy_fft, which is normally faster than numpy and has the ability to cache repeated transform sizes.
I want to create a multidimensional array A in Matlab of dimension NxMxG with N,M,G very large (e.g. 10^6).
Then I need to access A in a loop as
for g=1:G
Atemp=A(:,:,g);
%etc etc
end
What is more convenient in terms of speed and memory between storing the values of A in a multidimensional array or in a cell array?
If you always loop over slices in the same way and process them one at a time, as your bit of code seems to suggest, then the performance should be roughly equivalent.
If you really intend to store 1e6x1e6x1e6 doubles, Matlab is probably not your tool. However, if slices are sparse, then it's probably a bit more efficient to store them as a cell array, so Matlab does not have to search the full 3D space when "cutting" the slice, and Atemp=A{g}; simply copies a sparse matrix.
If you are working on full (nonsparse) slices then probably you should load/save your slice to disk and use instead a function/support class which loads from file: Atemp=A(g);. Mind that text loading takes up much more time than loading a binary file: so choose your file format carefully!
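A minimal sketch of the sparse-slice idea, assuming each slice has only a handful of nonzero entries (the sizes, positions and values below are made up for illustration): each cell holds one sparse N-by-M matrix, so only the nonzeros and the per-column index are stored.

```matlab
N = 1e5; M = 1e5; G = 3;     % hypothetical sizes, far smaller than 10^6
A = cell(1, G);

for g = 1:G
    % Two illustrative nonzeros per slice: (1,1) = 10 and (g+1,2) = 20.
    A{g} = sparse([1, g + 1], [1, 2], [10, 20], N, M);
end

% Accessing a slice returns that sparse matrix.
Atemp = A{2};
numNonzeros = nnz(Atemp);
```

A full double slice of this size would need N*M*8 bytes (80 GB here), so this only works while the slices stay sparse.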
If you use numbers, a multidimensional array is the right thing to use. A cell array also allows other types, so is less optimised for numbers only. Because you are using very large arrays, maybe a sparse matrix may be appropriate for you.
First, note that neither pick will let you handle 10^18 values. You don't have exabytes of storage, let alone memory.
If you will ONLY ever use it as Atemp = A(:,:,g); with N and M always the same size for all g, storing it as a multi-dimensional array or a cell array shouldn't change anything meaningful as far as performance goes. N-D will probably be a bit faster, but nothing significant.
Obviously, if you ever want to have computations with different sizes of N, M depending on g, you need to pick a cell array. And if you want to have computations like Atemp = squeeze(A(:,g,:)); an N-D array is the clear choice.
So the choice most likely depends on whether you prefer doing A(:,:,g) or A{g};, which depends on the meaning of your data. Say you have weather data and currently only care about what happens at a specific height (not what happens between the layers): A(:,:,g) is clearly more sensible, since it is possible you will require inter-layer calculations at some point. But if g instead indexes different measurement sites gathering data, A{g} should be used to pick the site, since you will likely have some sites larger or smaller eventually.
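The two access patterns discussed above can be sketched side by side on a tiny example (the sizes and the names C and crossSlice are illustrative only):

```matlab
N = 4; M = 3; G = 2;                 % small hypothetical sizes
A = reshape(1:N*M*G, N, M, G);       % N-D array version

% Cell array version holding the same slices.
C = cell(1, G);
for g = 1:G
    C{g} = A(:, :, g);
end

% Slice along the third dimension: both containers give the same matrix.
AtempND   = A(:, :, 2);
AtempCell = C{2};

% Slice across layers: straightforward with the N-D array only.
crossSlice = squeeze(A(:, 2, :));    % N-by-G matrix
```

With the cell array, the cross-layer slice would need a loop or cellfun over all G cells, which is the practical reason to prefer the N-D array when inter-layer calculations are likely.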
How can I store a matrix with 2^100 rows in MATLAB? It is my search space and I really need to do it.
In your opinion, is it possible? If yes, please tell me how I can do it.
2^100 is about 10^30, which is much too large to fit in memory, so you won't be able to store this matrix.
A couple of alternatives that you might want to think about -
Are many of the entries in the matrix zero? If so, you could consider using a sparse matrix which is much more memory efficient.
Do you need to be able to access the rows in an arbitrary order, or sequentially? If sequentially, you can generate the rows on an as-needed basis (perhaps in blocks of ten thousand at a time).
Do you need to look at all the rows at all? If not, perhaps you can define a function which generates the entries on the fly, as they are requested.
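The block-wise and on-the-fly alternatives above could look roughly like this. This is only a sketch under a strong assumption that is not stated in the question: that row k of the search space is the binary representation of k-1. The names nBits, makeBlock and rowsSeen are hypothetical, and nBits is set to 12 so the example actually runs.

```matlab
nBits     = 12;              % stands in for 100; assumption, see above
numRows   = 2^nBits;
blockSize = 1000;

% Generates rows first..last on demand as a numeric 0/1 matrix.
makeBlock = @(first, last) dec2bin((first:last) - 1, nBits) - '0';

rowsSeen = 0;
for first = 1:blockSize:numRows
    last  = min(first + blockSize - 1, numRows);
    block = makeBlock(first, last);   % (last-first+1)-by-nBits
    % ... evaluate the rows in this block here, then let it be overwritten
    rowsSeen = rowsSeen + size(block, 1);
end
```

For the real 100-bit case the row index itself no longer fits exactly in a double (integers are exact only up to 2^53), so the index would need another representation, and visiting all 2^100 rows is not feasible in any case.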
I would like to know whether there is a way to reduce the amount of memory used by the following piece of code in Matlab:
n=3;
T=100;
r=T*2;
b=80;
BS=1000;
bsuppostmp_=cell(1,BS);
bslowpostmp_=cell(1,BS);
bsuppnegtmp_=cell(1,BS);
bslownegtmp_=cell(1,BS);
for w=1:BS
bsuppostmp_{w}= randi([0,1],n*T,2^(n-1),r,b);
bslowpostmp_{w}=randi([0,3],n*T,2^(n-1),r,b);
bsuppnegtmp_{w}=randi([0,4],n*T,2^(n-1),r,b);
bslownegtmp_{w}=randi([0,2],n*T,2^(n-1),r,b);
end
I have decided to use cells of matrices because after this loop I need to call separately each single matrix in another loop.
If I run this code I get the error message "Your system has run out of application memory".
Do you know a more efficient (in terms of memory) way to store each single matrix?
Let's refer to the documentation page on Strategies for Efficient Use of Memory:
Because simple numeric arrays (comprising one mxArray) have the least overhead, you should use them wherever possible. When data is too complex to store in a simple array (or matrix), you can use other data structures.
Cell arrays are comprised of separate mxArrays for each element. As a result, cell arrays with many small elements have a large overhead.
I doubt that the overhead for cell array is really large ...
Let me give a possible explanation. What if Matlab cannot use the swap file when the 4D arrays are stored in a cell array? When storing large numeric arrays, there is no out-of-memory error because Matlab uses the swap file to cache each variable when the used memory becomes too big. Whereas if each 4D array is stored in one big cell array, Matlab sees it as a single variable and cannot split it between RAM and the swap file. I don't work at MathWorks, so I don't know whether I'm right; it's just an idea, and I would be glad to know the real reason.
So my advice is the same as in the other comments: try to free matrices as soon as you're done with them. There are not many ways to store many dense arrays: one big array (NOT recommended here; it will run out of memory sooner because Matlab requires it to be contiguous), a cell array, or a struct array (and if I understand the documentation correctly, the overhead is equivalent). In all cases, the total amount of data over all 4D arrays is really large, so the best thing to do is to keep memory usage as low as possible by discarding data once it has been used and keeping in memory only the results of the computation (in case they take less memory ...).
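To put a number on "really large", here is a rough estimate using the sizes from the question, followed by a sketch of the generate, use, discard pattern. Two things are assumptions and not part of the original code: storing the values as uint8 (they all lie between 0 and 4, so one byte per element is enough) and the placeholder reduction nnz; the small array sizes in the loop are only there so the sketch runs quickly.

```matlab
n = 3; T = 100; r = T*2; b = 80; BS = 1000;

% Elements in one 4D array: 300 x 4 x 200 x 80.
numelPerArray = (n*T) * 2^(n-1) * r * b;

% Four cell arrays with BS arrays each, at 8 bytes per double element.
totalDoubleGB = 4 * BS * numelPerArray * 8 / 1e9;

% Same data as uint8 (assumption): 1 byte per element.
totalUint8GB = 4 * BS * numelPerArray / 1e9;

% Generate one array at a time and keep only a reduced result.
results = zeros(1, 2);                        % 2 stands in for BS
for w = 1:2
    tmp = randi([0, 1], 6, 4, 5, 2, 'uint8'); % small stand-in sizes
    results(w) = nnz(tmp);                    % placeholder computation
    clear tmp                                 % release the array before the next one
end
```

With doubles the four cell arrays need about 614 GB in total, and even as uint8 about 77 GB, so a smaller data type alone is unlikely to be enough; not holding all the arrays at once matters more.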