Smart way to extend a matrix in Matlab

For example, I have a 2x2 matrix. I need to add one column on the left and one on the right, plus one row on the top and one on the bottom, so I end up with a 4x4 matrix with the old matrix located in its center. Is there any faster way to do this than creating the new matrix and copying the values over from the old one?
Thank you very much

You will always have to allocate new memory for the new array, no matter what you do.
Also, if your matrix is only 2x2, any approach will be fast enough. If you want to handle larger matrices as well, consider the following timings of two methods you can use:
A = rand(5000);
% explicitly add zero vectors on all sides of A
tic;
B = [zeros(1, size(A,2)+2);
     zeros(size(A,1), 1) A zeros(size(A,1), 1);
     zeros(1, size(A,2)+2)];
toc
Elapsed time is 0.204940 seconds.
% create the output array and assign the A array to correct sub-matrix
tic
B = zeros(size(A)+2);
B(2:end-1,2:end-1) = A;
toc
Elapsed time is 0.102501 seconds.

Another option is
B = padarray(A,[1,1],'both');
For speed (at least on my computer), this falls between the two methods suggested by angainor, and it has the advantage that you don't have to create a new variable if you prefer not to.
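For reference, padarray ships with the Image Processing Toolbox and pads with zeros by default, so for the 2x2 case in the question it produces exactly the requested 4x4 result:
A = [1 2; 3 4];
B = padarray(A, [1 1], 'both')
% B =
%      0     0     0     0
%      0     1     2     0
%      0     3     4     0
%      0     0     0     0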

Related

How to extract a submatrix without making a copy in Matlab

I have a large matrix and I need to extract a small submatrix taken from a sliding window that runs all over the large matrix. The content of the extracted matrix is never changed during the operations, so I'd like to extract the submatrix without creating a new copy, something that just acts like a C pointer into a portion of the large matrix. How can I do this? Please help me, thank you very much :)
I did some benchmarking to test if not using an explicit temporary matrix is faster, and it's probably not:
function move_mean(N)
    M = randi(100, N);
    window_size = [50 50];
    dir_time = timeit(@() direct(M, window_size))
    tmp_time = timeit(@() with_tmp(M, window_size))
end

function direct(M, window_size)
    m = zeros(size(M)./2);
    for r = 1:size(M,1) - window_size(1)
        for c = 1:size(M,2) - window_size(2)
            m(r,c) = mean(mean(M(r:r+window_size(1), c:c+window_size(2))));
        end
    end
end

function with_tmp(M, window_size)
    m = zeros(size(M)./2);
    for r = 1:size(M,1) - window_size(1)
        for c = 1:size(M,2) - window_size(2)
            tmp = M(r:r+window_size(1), c:c+window_size(2));
            m(r,c) = mean(mean(tmp));
        end
    end
end
For M of size 100-by-100:
dir_time =
0.22739
tmp_time =
0.22339
So it seems that using a temporary variable only makes your code more readable, not slower.
In this answer I describe what the 'best' solution is in general. Here I define 'best' as the most readable option without a significant performance hit (as partially shown by the existing answer).
Basically there are 2 situations that you may be in.
1. You use your submatrix several times
In this situation the best solution in general is to create a temporary variable containing the submatrix.
A = M(rmin:rmax, cmin:cmax)
There may be ways around it (defining a function/anonymous function that indexes into the matrix for you), but in general that won't make you happy.
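For illustration only, such an accessor might look like the sketch below (M, rmin, rmax, cmin and cmax are assumed to exist in the workspace); note that every call still produces a temporary copy when the matrix is indexed:
sub = @() M(rmin:rmax, cmin:cmax); % hypothetical accessor; nothing is copied until it is called
s  = sum(sum(sub()));              % each call re-indexes M and creates a fresh copy
mx = max(max(sub()));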
2. You use your submatrix only 1 time
In this case the best solution is typically exactly what you referred to in the comments:
M(rmin:rmax, cmin:cmax)
A specific case of using the submatrix only once is when it is passed to a function. Of course the contents of the submatrix may be used several times inside that function, but that is irrelevant here.
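For example (a minimal sketch, reusing the M and index bounds from above):
windowAverage = mean(mean(M(rmin:rmax, cmin:cmax))); % the submatrix is indexed inline and passed straight to the function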

Matlab vectorization of multiple embedded for loops

Suppose you have 5 vectors: v_1, v_2, v_3, v_4 and v_5. These vectors each contain a range of values from a minimum to a maximum. So for example:
v_1 = minimum_value:step:maximum_value;
Each of these vectors uses the same step size but has a different minimum and maximum value. Thus they are each of a different length.
A function F(v_1, v_2, v_3, v_4, v_5) depends on these vectors and can use any combination of the elements within them. I am trying to find the maximum value of F and record the values which produced it. My current approach has been to use multiple nested for loops, as shown, to evaluate the function for every combination of the vectors' elements:
% Set the running maximum to a small value
temp = 0;
% For every combination of the five vectors, evaluate the function. If the
% result is greater than the one calculated previously, store it along with
% the positions of the elements within the vectors
for a = 1:length(v_1)
    for b = 1:length(v_2)
        for c = 1:length(v_3)
            for d = 1:length(v_4)
                for e = 1:length(v_5)
                    % The function is a combination of trigonometric terms,
                    % summations, multiplications, etc.
                    Result = F(v_1(a), v_2(b), v_3(c), v_4(d), v_5(e));
                    % If Result is greater than the previous best value,
                    % store it and record the values of a, b, c, d and e
                    if Result > temp
                        temp = Result;
                        f = a;
                        g = b;
                        h = c;
                        i = d;
                        j = e;
                    end
                end
            end
        end
    end
end
This gets incredibly slow for small step sizes. If there are around 100 elements in each vector, the number of combinations is around 100*100*100*100*100 = 10^10. This is a problem, as I need small step sizes to get a suitably converged answer.
I was wondering if it is possible to speed this up using vectorization or some other method. I also looked at generating the combinations prior to the calculation, but this seemed even slower than my current method. I haven't used Matlab for a long time, but just looking at the number of nested for loops makes me think this can definitely be sped up. Thank you for the suggestions.
No matter how you generate your parameter combinations, you will end up calling your function F 100^5 times. The easiest improvement would be to use parfor in order to exploit multi-core computation. If you do that, you should store the calculation results and find the maximum after the loop, because your current approach is not thread-safe.
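As a rough sketch of that idea (not the asker's actual code; F and v_1 ... v_5 are assumed to exist, and rather than storing all 10^10 results, each outer iteration keeps only its own best value so the memory footprint stays small and nothing is shared between workers):
bestVal = -Inf(1, length(v_1));          % best value found for each v_1(a)
bestIdx = zeros(length(v_1), 4);         % corresponding [b c d e] indices
parfor a = 1:length(v_1)
    localBest = -Inf;
    localIdx  = [1 1 1 1];
    for b = 1:length(v_2)
        for c = 1:length(v_3)
            for d = 1:length(v_4)
                for e = 1:length(v_5)
                    Result = F(v_1(a), v_2(b), v_3(c), v_4(d), v_5(e));
                    if Result > localBest
                        localBest = Result;
                        localIdx  = [b c d e];
                    end
                end
            end
        end
    end
    bestVal(a)   = localBest;
    bestIdx(a,:) = localIdx;
end
[maxResult, f] = max(bestVal);           % f is the winning index into v_1
g = bestIdx(f,1); h = bestIdx(f,2); i = bestIdx(f,3); j = bestIdx(f,4);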
Having said that, and not knowing anything about your actual problem, I would advise you to implement a more structured approach, like first finding a coarse solution with a bigger step size and narrowing it down successively by reducing the min/max values of your parameter intervals. What you have currently is the absolute brute-force method, which will never be very effective.
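A minimal sketch of that coarse-to-fine idea, shown for a single parameter to keep it short (F1, the bounds and the number of passes are assumptions; the same refinement would be applied to each of the five intervals):
lo = 0;  hi = 10;                    % assumed initial interval
nGrid = 11;                          % coarse grid size per pass
for pass = 1:4                       % each pass shrinks the interval around the best point
    v = linspace(lo, hi, nGrid);
    vals = arrayfun(@(x) F1(x), v);  % evaluate the (assumed) 1-D objective on the grid
    [~, k] = max(vals);
    step = v(2) - v(1);
    lo = max(lo, v(k) - step);       % keep one grid step on either side of the best point
    hi = min(hi, v(k) + step);
end
bestX = (lo + hi)/2;                 % refined estimate of the maximizer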

Data fitting for time dependent matrix sets in Matlab

I have 6 data sets; each data set is a 576-by-576 matrix. Each set of data represents measurements taken at 30-second intervals, e.g. set1 at t = 0, set2 at t = 30, ..., set6 at t = 150 seconds.
You can look at these sets as frames if you will. I need to take the first data point (1,1) from each data set, i.e. (1,1) at t = 0, 30, 60, 90, 120 and 150, and based on those 6 points find a fitting formula, then assign that general solution to the first spot of my solution matrix SM(1,1). I need to do this for every data point in the 6 sets until I have a 576-by-576 solution matrix.
If everything goes right, I should be able to plot SM(0 s) = set1, SM(30 s) = set2, etc., but not only that: SM(45) should return a prediction of the measurements at t = 45, and so on. The purpose is to have one matrix that can predict the data fluctuation from t = 0 to 150 seconds.
Additional information:
1.- Each data point is independent from the rest of the data points in the same set.
2.- It is a non-linear fit.
3.- All values are real.
Does Matlab have an optimization tool for this kind of problem?
Should I treat the problem as 1D data fit and create a for loop that does the job 576^2 times?
(I don't even know where to begin)
Feel free to ask or edit anything if I wasn't clear enough. I am not sure that I've chosen the most precise title for this kind of problem. Thanks
Update:
Based on Guddu's answer I came up with this:
%% Loading data matrix A
A(:,:,1) = abs(set1);
A(:,:,2) = abs(set2);
A(:,:,3) = abs(set3);
A(:,:,4) = abs(set4);
A(:,:,5) = abs(set5);
A(:,:,6) = abs(set6);
%% Creating Solution Matrix B
t=0:30:150;
SM=zeros([576 576 150]);
for i=1:576
    for j=1:576
        y = squeeze(A(i,j,1:6));
        f = fit(t', y, 'smoothingspline');
        data = feval(f, 1:150);
        SM(i,j,:) = data;
    end
end
%% Plotting Frame at t=45
figure(1);
imshow(SM(:,:,45),[])
I am not sure if this is the most efficient way to do it, but it works. I am open to new ideas or suggestions. Thanks
I would suggest making your main data matrix of size (6,576,576) -> a. The first point (1,1) from each dataset would then be a(1,1,1), a(2,1,1), a(3,1,1), ..., a(6,1,1). As you have said, each point (i,j) in each dataset is independent of every other point (k,l), so I would suggest treating each position (i,j) across all datasets separately from the other positions. So it would be looping 576*576 times. The code could be something like this:
t = 0:30:150;
for i=1:576
    for j=1:576
        datavec = squeeze(a(1:6,i,j)); % Select (i,j) point from all 6 frames
        % do the curve fitting and save in SM(i,j)
    end
end
I am just curious what kind of non-linear function you want to fit to 6 points. This might not be the answer you want, but it was too long to put in a comment.

Recursive loop optimization

Is there a way to rewrite my code to make it faster?
for i = 2:length(ECG)
    u(i) = max([a*abs(ECG(i)) b*u(i-1)]);
end
My problem is the length of ECG.
You should pre-allocate u like this
>> u = zeros(size(ECG));
or possibly like this
>> u = NaN(size(ECG));
or maybe even like this
>> u = -Inf(size(ECG));
depending on what behaviour you want.
When you pre-allocate a vector, MATLAB knows how big the vector is going to be and reserves an appropriately sized block of memory.
If you don't pre-allocate, then MATLAB has no way of knowing how large the final vector is going to be. Initially it will allocate a short block of memory. If you run out of space in that block, it has to find a bigger block of memory somewhere and copy all the old values into the new block. This happens every time you run out of space in the allocated block (not necessarily every time you grow the array, because the MATLAB runtime is probably smart enough to ask for a bit more memory than it immediately needs). All this unnecessary reallocating and copying is what takes a long time.
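A quick way to see this effect is a sketch like the following (timings vary by machine and MATLAB version, and recent releases mitigate the cost somewhat, but the preallocated loop is consistently faster):
n = 1e6;
tic
grown = [];               % no preallocation: the vector is grown inside the loop
for k = 1:n
    grown(k) = k;         % may trigger repeated reallocation and copying
end
toc
tic
pre = zeros(1, n);        % preallocated once up front
for k = 1:n
    pre(k) = k;
end
toc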
There are several ways to optimize this for loop, but, surprisingly, memory pre-allocation is not the part that saves the most time. By far. You're using max to find the largest element of a 1-by-2 vector, and on each iteration you build this vector. However, all you're really doing is comparing two scalars. Using the two-argument form of max and passing it two scalars is MUCH faster: 75+ times faster on my machine for large ECG vectors!
% Set the parameters and create a vector with a million elements
a = 2;
b = 3;
n = 1e6;
ECG = randn(1,n);
ECG2 = a*abs(ECG); % This can be done outside the loop if you have the memory
u(1,n) = 0;        % Fast zero allocation
for i = 2:length(ECG)
    u(i) = max(ECG2(i), b*u(i-1)); % Compare two scalars
end
For the single input form of max (not including creation of random ECG data):
Elapsed time is 1.314308 seconds.
For my code above:
Elapsed time is 0.017174 seconds.
FYI, the code above assumes u(1) = 0. If that's not true, then u(1) should be set to its value after the preallocation.

Matlab Preallocation, guess a large matrix or a small one?

According to this question, I should try to use preallocation in Matlab.
Now I have a situation where I cannot calculate the exact size of the matrix to preallocate; I can only guess the size.
Suppose the actual size of the matrix is 100, but I don't know it in advance.
Which scenario is more efficient:
Should I be lavish? Guess a large size and remove the extra rows at the end.
Should I be stingy? Guess a small size and add new rows if the guess was wrong.
Thanks.
In my opinion, the answer is a bit more complex than portrayed by @natan.
I think there are two factors his answer does not take into account:
Possible necessary copies of memory: when you under-estimate a matrix size and re-allocate it, all its old values have to be copied to the newly allocated location.
Continuity of memory chunks: sometimes Matlab is able to allocate new memory contiguously at the end of the old matrix. In principle, in such a scenario the old values need not be copied to the new location, since it is the same as the old one, just bigger. However, if you add rows to a 2D matrix, the content still needs to be copied even in this scenario, since Matlab stores matrices column-by-column (column-major) in memory.
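A small sketch of that second point (exact timings differ between machines, but growing by a column is typically much cheaper than growing by a row, precisely because of the column-major layout):
A = rand(5000);
tic; A(:, end+1) = 0; toc   % grow by one column: existing data stays contiguous
B = rand(5000);
tic; B(end+1, :) = 0; toc   % grow by one row: new elements are interleaved between columns, so all data is shuffled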
So, my answer is this:
First of all, what exactly don't you know about the size of the matrix? If you know one dimension, make it the number of rows of your matrix, so you'll only need to change the number of columns. This way, if your already stored data needs to be copied, it will be copied in larger chunks.
Second, it depends on how much free RAM you have at your disposal.
If you are not short at RAM, then there's nothing wrong with over estimating.
However, if you are short at RAM, consider under estimating. BUT when you re-allocate, increase the new block size at each iteration:
BASIC_SIZE = X;  % first estimate
NEW_SIZE = Y;    % if more space is needed, add this amount
factor = 2;      % growth factor, should be > 1
arr = zeros( m, BASIC_SIZE ); % first allocation, assuming we know the number of rows
while someCondition
    % process arr ...
    if needMoreCols
        arr(:, size(arr,2) + (1:NEW_SIZE) ) = 0; % allocate another block of columns
        NEW_SIZE = round(NEW_SIZE * factor);     % estimate was off, so try a larger chunk next time
    end
end
arr = arr(:, 1:actualNumOfCols ); % resize to actual size, discarding unnecessary columns
+1 for the interesting question.
EDITED Answer:
From a little experimental study, at first it seemed better to add rows later, but it now looks more efficient to overestimate and preallocate again once you have the correct size. I started with a matrix of size 3000 and assumed a 10% error in the size estimate; see below:
clear all
clc
guess_size=3000;
m=zeros(guess_size);
%1. oops overestimated, take out rows
tic
m(end-300:end,:)=[];
toc
%1b. oops overestimated, preallocate again
tic
m=zeros(guess_size-300,guess_size);
toc
%2. oops overestimated, take out cols
m=zeros(guess_size);
tic
m(:,end-300:end)=[];
toc
%2b. oops overestimated, preallocate again
m=zeros(guess_size);
tic
m=zeros(guess_size,guess_size-300);
toc
%3. oops underestimated, add rows
m=zeros(guess_size);
tic
m=zeros(guess_size+300,guess_size);
toc
%4. oops underestimated, add cols
m=zeros(guess_size);
tic
m=zeros(guess_size,guess_size+300);
toc
Elapsed time is 0.041893 seconds.
Elapsed time is 0.026925 seconds.
Elapsed time is 0.041818 seconds.
Elapsed time is 0.023425 seconds.
Elapsed time is 0.027523 seconds.
Elapsed time is 0.029509 seconds.
Options 2b and 1b are slightly faster than underestimating, so if you can, it is better to overestimate and then preallocate again. It is never efficient to delete rows from an array. Also, adding columns seems slightly more efficient, but this is just a quick-and-dirty test. See @Shai's detailed answer for the inner workings.
In addition to the other instructive answers, the short version:
There are three cases:
The size of the array is relatively small (up to thousands of bytes) -> it doesn't really matter.
The array is big, but you are not bounded by the amount of memory your system has -> overestimate.
The array is big, and you are bounded by the amount of memory your system has -> do what Shai suggested.