copy certain rows in a new matrix within a loop - matlab

I'm trying to split a matrix into smaller matrices depending on the value in its first column (I use 'if' for this).
for jj = 1:length(FailureHoopUP_sorted)
if FailureHoopUP_sorted(jj,1)==20
FailureHoopUP_20(jj,:) = FailureHoopUP_sorted(jj,:);
elseif FailureHoopUP_sorted(jj,1)==30
FailureHoopUP_30(jj,:) = FailureHoopUP_sorted(jj,:);
else
FailureHoopUP_40(jj,:) = FailureHoopUP_sorted(jj,:);
end
end
The problem is that rows of zeros end up between the rows with data in the newly created matrices.
How can I avoid this?
Thank you for your help.

You don't need a loop, you can use logical indexing. For example:
FailureHoopUP_20=FailureHoopUP_sorted(FailureHoopUP_sorted(:,1)==20,:)
...
...
This should also solve the zeros issue. The zeros appear because the loop writes each row at its original index jj, which runs over the length of FailureHoopUP_sorted, so every row that belongs to a different group is left filled with zeros.
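As a rough sketch of the loop-free version for all three groups, here is a small made-up matrix standing in for FailureHoopUP_sorted (the sample values are an assumption, not the asker's data):

```matlab
% Hypothetical sample data: column 1 holds the group key (20, 30 or 40).
FailureHoopUP_sorted = [20 1.1; 30 2.2; 20 3.3; 40 4.4; 30 5.5];

key = FailureHoopUP_sorted(:,1);   % grouping column

% Each logical mask keeps only the matching rows, so no zero rows appear.
FailureHoopUP_20 = FailureHoopUP_sorted(key == 20, :);
FailureHoopUP_30 = FailureHoopUP_sorted(key == 30, :);
% Mirrors the "else" branch of the loop: everything that is neither 20 nor 30.
FailureHoopUP_40 = FailureHoopUP_sorted(key ~= 20 & key ~= 30, :);
```

Each result has exactly as many rows as there are matches, because the mask selects rows instead of writing them back at their original positions.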

Related

How to avoid sub2ind and decrease execution time in manipulating multidimensional arrays using original indexes

I would like your suggestions to make my code faster (and more elegant). In particular, sub2ind (and the if statement) slow it down dramatically according to the MATLAB profiler. I will try to explain what I need from my code as simply as I can.
Assuming I have the following problem, for simplicity.
Every citizen of every city has a car of a specific brand and a specific color.
What I would like to have is a 4D multidimensional array Data_4D(City,Citizen,Car_brand,Car_color) that I can manipulate (read and modify) using as indexes only these four dimensions.
Then, I want to reshape my multidimensional array into a 1D array Data_1D with
length(Data_1D)=(City*Citizen*Car_brand*Car_color)
The order of the elements must follow an indexing rule:
Example assuming City=2, Citizen=2, Car_brand=2, Car_color=2;
Data_1D(1)=Data_4D(1,1,1,1)
Data_1D(2)=Data_4D(1,1,1,2)
Data_1D(3)=Data_4D(1,1,2,1)
Data_1D(4)=Data_4D(1,1,2,2)
Data_1D(5)=Data_4D(1,2,1,1)
Data_1D(6)=Data_4D(1,2,1,2)
Data_1D(7)=Data_4D(1,2,2,1)
Data_1D(8)=Data_4D(1,2,2,2)
Data_1D(9)=Data_4D(2,1,1,1)
Data_1D(10)=Data_4D(2,1,1,2)
Data_1D(11)=Data_4D(2,1,2,1)
Data_1D(12)=Data_4D(2,1,2,2)
Data_1D(13)=Data_4D(2,2,1,1)
Data_1D(14)=Data_4D(2,2,1,2)
Data_1D(15)=Data_4D(2,2,2,1)
Data_1D(16)=Data_4D(2,2,2,2)
Once I have this 1D array, shaped as above, I need to create a matrix Matrix_Final(NRows,length(Data_1D)) in which every row is an array Data_1D. Every row holds the same number of elements, but with different values.
The number of rows also depends on some (or all) of the four dimensions City,Citizen,Car_brand,Car_color (following the same indexing rule as Data_1D), and the array built in each row must also be manipulated according to the row it belongs to (using the four indexes, which is the common rule for both NRows and Data_1D).
Example:
Assuming City=2, Citizen=2, Car_brand=2, Car_color=2;
Assuming NRows depends on all the four dimensions.
I will have Matrix_Final( length(DATA_1D), length(DATA_1D)).
I want all the elements of DATA_1D to be zero except one: the element whose index values (City,Citizen,Car_brand,Car_color) are the same as those of NRows(City,Citizen,Car_brand,Car_color).
So at the row NRows(1), only Data_1D(1) is non-zero. For this example, the result is an identity matrix.
clc
clear all
%Dimensions Definition
City=2;
Citizen=2;
Car_brand=2;
Car_color=2;
%Length of Data
Length_Data_1D=City*Citizen*Car_brand*Car_color;
%preallocation Matrix_Final
Matrix_Final=zeros(City*Citizen*Car_brand*Car_color, Length_Data_1D);
%indexes of the dimensions
indexes_array_carcolor=repmat(repelem([1:Car_color], 1), [1 City*Citizen*Car_brand]);
indexes_array_carbrand=repmat(repelem([1:Car_brand], Car_color), [1 City*Citizen]);
indexes_array_citizen=repmat(repelem([1:Citizen],Car_brand*Car_color),[1 City]);
indexes_array_city=repmat(repelem([1:City],Citizen*Car_brand*Car_color),[1 1]);
%Initializing loop variable
column_Matrix_final=1;
for CITY_selected=1:City
for CITIZEN_selected=1:Citizen
for CAR_BRAND_selected=1:Car_brand
for CAR_COLOR_selected=1:Car_color
%Data_4D Construction
Data_4D=zeros(City,Citizen,Car_brand,Car_color);
for city=1:length(indexes_array_city)
for citizen=1:length(indexes_array_citizen)
for car_brand=1:length(indexes_array_carbrand)
for car_color=1:length(indexes_array_carcolor)
if (indexes_array_city(city)==CITY_selected && indexes_array_citizen(citizen)==CITIZEN_selected ...
&& indexes_array_carbrand(car_brand)==CAR_BRAND_selected && ...
indexes_array_carcolor(car_color)==CAR_COLOR_selected)
Data_4D(sub2ind(size(Data_4D),indexes_array_city(city),indexes_array_citizen(citizen),...
indexes_array_carbrand(car_brand), indexes_array_carcolor(car_color)))=1;
end
end
end
end
end
%Data_4D transformation into array Data_1D
Data_1D=zeros(1,City*Citizen*Car_brand*Car_color);
tic=1;
for city=1:City
for citizen=1:Citizen
for car_brand=1:Car_brand
for car_color=1:Car_color
Data_1D(tic)=Data_4D(city,citizen,car_brand,car_color);
tic=tic+1;
end
end
end
end
%Adding Data_1D to the next row of Matrix_Final
Matrix_Final(column_Matrix_final,:)=Data_1D;
column_Matrix_final=column_Matrix_final+1;
%Display of the four most external loops indexes to show code
%advancement
CAR_COLOR_selected
end
CAR_BRAND_selected
end
CITIZEN_selected
end
CITY_selected
end
spy(Matrix_Final)
If you add e.g.
&& indexes_array_carcolor(car_color)==2
to the if condition, only the elements of Data_1D(City,Citizen,Car_brand,Car_color=2) in NRows(City,Citizen,Car_brand,Car_color=2) will be non-zero.
I would like to know if there are faster ways to set up the problem, while keeping the same ability to manipulate Data_1D and Matrix_Final using the four indexes (City,Citizen,Car_brand,Car_color) and the ability to correlate NRows and the elements of Data_1D using these four indexes.
Thank you for your help!
This is how you have coded it
if (indexes_array_city(city)==CITY_selected && indexes_array_citizen(citizen)==CITIZEN_selected ...
&& indexes_array_carbrand(car_brand)==CAR_BRAND_selected && ...
indexes_array_carcolor(car_color)==CAR_COLOR_selected)
Data_4D(sub2ind(size(Data_4D),indexes_array_city(city),indexes_array_citizen(citizen),...
indexes_array_carbrand(car_brand), indexes_array_carcolor(car_color)))=1;
end
Another way 1
if (indexes_array_city(city)==CITY_selected && indexes_array_citizen(citizen)==CITIZEN_selected ...
&& indexes_array_carbrand(car_brand)==CAR_BRAND_selected && ...
indexes_array_carcolor(car_color)==CAR_COLOR_selected)
Data_4D(indexes_array_city(city),indexes_array_citizen(citizen),...
indexes_array_carbrand(car_brand), indexes_array_carcolor(car_color))=1;
end
Another way 2
Data_4D(indexes_array_city(city),indexes_array_citizen(citizen),...
indexes_array_carbrand(car_brand), indexes_array_carcolor(car_color))=double((indexes_array_city(city)==CITY_selected)& ...
(indexes_array_citizen(citizen)==CITIZEN_selected)& ...
(indexes_array_carbrand(car_brand)==CAR_BRAND_selected)& ...
(indexes_array_carcolor(car_color)==CAR_COLOR_selected));
All three of them will yield the same result. Time each one and use whichever is fastest.
%% Data_4D transformation into array Data_1D
Data_4D_size=size(Data_4D);
Data_1D_size=prod(Data_4D_size);
temp = permute(Data_4D, [4 3 2 1]);
Data_1D=reshape(temp,Data_1D_size,1);
Use this for the 4D to 1D conversion. Note that it returns a column vector.
If you still need more speed, consider compiling the code to MEX. Compiled code often runs faster.
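A small check, assuming the 2x2x2x2 example from the question and an arbitrary test array, that permute plus reshape reproduces the element order of the four nested loops:

```matlab
City = 2; Citizen = 2; Car_brand = 2; Car_color = 2;
% Arbitrary test content: every element is distinct so the order is visible.
Data_4D = reshape(1:City*Citizen*Car_brand*Car_color, ...
                  [City Citizen Car_brand Car_color]);

% Reference ordering: the nested loops from the question (last index fastest).
Data_1D_loop = zeros(1, numel(Data_4D));
n = 1;
for city = 1:City
    for citizen = 1:Citizen
        for car_brand = 1:Car_brand
            for car_color = 1:Car_color
                Data_1D_loop(n) = Data_4D(city, citizen, car_brand, car_color);
                n = n + 1;
            end
        end
    end
end

% Vectorized ordering: reverse the dimensions, then flatten column-major.
Data_1D_vec = reshape(permute(Data_4D, [4 3 2 1]), 1, []);

sameOrder = isequal(Data_1D_loop, Data_1D_vec);
```

The permute is needed because MATLAB stores arrays column-major, so the first index varies fastest, while the indexing rule in the question wants the last index (Car_color) to vary fastest.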

Foreach loop problems in MATLAB

I have the following piece of code:
for query = queryFiles
queryImage = imread(strcat('Queries/', query));
queryImage = im2single(rgb2gray(queryImage));
[qf,qd] = vl_covdet(queryImage, opts{:}) ;
for databaseEntry = databaseFiles
entryImage = imread(databaseEntry.name);
entryImage = im2single(rgb2gray(entryImage));
[df,dd] = vl_covdet(entryImage, opts{:}) ;
[matches, H] = matchFeatures(qf,qf,df,dd) ;
result = [result; query, databaseEntry, length(matches)];
end
end
It is my understanding that it should work as a Java/C++ for(query:queryFiles), however the query appears to be a copy of the queryFiles. How do I iterate through this vector normally?
I managed to sort the problem out. It was mainly due to my MATLAB ignorance: I wasn't aware of cell arrays, and that's the reason I had this problem. That and the required transposition.
From your code it appears that queryFiles is a numeric vector. Maybe it's a column vector? In that case you should convert it into a row:
for query = queryFiles.'
This is because the for loop in Matlab picks a column at each iteration. If your vector is a single column, it picks the whole vector in just one iteration.
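To see the column-at-a-time behaviour, here is a small sketch with a made-up numeric column vector (the values are an assumption, the asker's queryFiles may be of a different type):

```matlab
queryFiles = [10; 20; 30];   % hypothetical 3x1 column vector

nColumnLoop = 0;
for query = queryFiles       % one iteration: query is the whole 3x1 column
    nColumnLoop = nColumnLoop + 1;
end

nRowLoop = 0;
for query = queryFiles.'     % three iterations: query is a scalar each time
    nRowLoop = nRowLoop + 1;
end
```

After running this, nColumnLoop counts a single pass and nRowLoop counts one pass per element.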
In MATLAB, the for construct iterates over the columns of its input, so it is normally given a row vector:
for ii = 1:5
will work (loops 5 times with ii = 1, 2, ...)
x = 1:5;
for ii = x
works the same way
However, when the input is not a row vector, each iteration receives a whole column of data, so a column vector is handed to the loop body as a single copy in one iteration.
To help you better, you need to tell us what the data type of queryFiles is. I am guessing it might be a cell array of strings since you are concatenating with a file path (look at fullfile function for the "right" way to do this). If so, then a "safe" approach is:
for ii = 1:numel(queryFiles)
query = queryFiles{ii}; % use queryFiles(ii) instead if it is not a cell array
% ... loop body ...
end
It is often helpful to know what loop number you are in, and in this case ii provides that count for you. This approach is robust even when you don't know ahead of time what the shape of queryFiles is.
Here is how you can loop over all elements in queryFiles, this works for scalars, row vectors, column vectors and even high dimensional matrices:
for query = queryFiles(:)'
% Do stuff
end
Is queryFiles a cell array? The safest way to do this is to use an index:
for i = 1:numel(queryFiles)
query = queryFiles{i};
...
end
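A minimal sketch of the indexed loop over a cell array of file names, combined with fullfile for building the path. The file names are made up; only the 'Queries' folder name comes from the question:

```matlab
% Hypothetical file names stored as a column cell array.
queryFiles = {'a.png'; 'b.png'; 'c.png'};

paths = cell(size(queryFiles));
for i = 1:numel(queryFiles)
    query = queryFiles{i};                   % char vector, not a 1x1 cell
    paths{i} = fullfile('Queries', query);   % platform-independent separator
end
```

Because the loop runs over 1:numel(queryFiles), it behaves the same whether queryFiles is a row or a column.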

Mark values from loop for each iteration

I want to mark each value that comes out of my loop with a value.
Say I have a variable number of values that come out of each iteration. I want those values to be labeled by which iteration they came out of.
like
1-1,
2-1,
3-1,
1-2,
2-2,
3-2,
4-2,
etc.
where the first number is the value from the loop and the second is counting which iteration it came from.
I feel like there is a way; I just can't find it.
OK, so here is some code.
for c=1:1:npoints;
for i=1:1:NN;
if ((c-1)*spacepoints)<=PL(i+1) && ((c-1)*spacepoints)>=PL(i);
local(c)=((c)*spacepoints)-PL(i);
end
if ((c-1)*spacepoints)>=PL(NN);
local(c)=((c)*spacepoints)-PL(NN);
element(i)=NN;
end
end
end
I want to mark each local value with the iteration of the inner loop (i = 1:NN) it came from. PL is a vector, and the output is a set of vectors, one for each iteration.
For this sort of quick problem I like to create a cell array:
for k = 1:12
results{k} = complicated_function(...);
end
If the output is really complicated, then I return a struct with fields relating to the outputs:
for k = 1:12
results{k}.file = get_filename(...);
results{k}.result = ...;
end
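Here is a sketch of how the cell array can carry the iteration label along with a variable number of values. The values themselves are stand-ins (iteration k simply produces k+1 numbers):

```matlab
nIter = 3;
results = cell(1, nIter);
for k = 1:nIter
    values = (1:k+1).';   % stand-in: iteration k yields k+1 values
    % Column 1 = value, column 2 = the iteration it came from.
    results{k} = [values, repmat(k, numel(values), 1)];
end

% Stack everything into one two-column list of [value, iteration] rows.
labeled = vertcat(results{:});
```

The rows of labeled then read 1-1, 2-1, 1-2, 2-2, 3-2 and so on, which is the labelling asked for in the question.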
As it stands, your inner 1:NN loop overwrites local(c) on every pass. You never use the previous value of local, so this does not look like some iterative optimization algorithm(?)...
Perhaps an easy solution is to change local from a vector to a matrix. Let's say that local is of size [npoints 1]. Make it of size [npoints NN] instead. It is now a 2D array (a matrix of npoints rows and NN columns). Use the second dimension to store each (assumed column) vector from the inner loop:
local = zeros([npoints NN]);
%# ... code in between ...
for c=1:1:npoints;
for i=1:1:NN;
if ((c-1)*spacepoints)<=PL(i+1) && ((c-1)*spacepoints)>=PL(i);
local(c, i)=((c)*spacepoints)-PL(i);
end
if ((c-1)*spacepoints)>=PL(NN);
local(c, i)=((c)*spacepoints)-PL(NN);
element(i)=NN;
end
end
end
The c'th row of your local matrix will then correspond to the NN values from the inner loop. Please note that I have assumed your vector to be a column vector; if not, just change the order of the sizes.

Matlab: how to implement a dynamic vector

I am referring to an example like this:
I have a function that analyzes the elements of a vector, 'input'. If an element has a special property, I store its value in a vector, 'output'.
The problem is that at the beginning I don't know how many elements will need to be stored in 'output', so I don't know its size.
I have a loop in which I walk through the vector 'input' using an index. When I consider an element of this vector special, I capture its value and store it in the vector 'output' with a statement like this:
for i=1:N %Where N denotes the number of elements of 'input'
...
output(j) = input(i);
...
end
The problem is that I get an error if I don't "declare" 'output' beforehand. I don't want to "declare" 'output' before the loop as output = input, because it would then hold values from 'input' that I am not interested in, and I would have to find some way to remove all the stored values that are not relevant to me.
Can anyone shed some light on this issue?
Thank you.
How complicated is the logic in the for loop?
If it's simple, something like this would work:
output = input ( logic==true )
Alternatively, if the logic is complicated and you're dealing with big vectors, I would preallocate a vector that stores whether to save an element or not. Here is some example code:
N = length(input); %Where N denotes the number of elements of 'input'
saveInput = zeros(1,N); % create a vector of 0s
for i=1:N
...
if (input meets criteria)
saveInput(i) = 1;
end
end
output = input( saveInput==1 ); %only save elements worth saving
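A runnable version of the same pattern, with a made-up data vector and a stand-in criterion (keep values greater than 5). The vector is called input_vec here, an assumed name chosen to avoid shadowing MATLAB's built-in input function:

```matlab
input_vec = [3 8 1 9 4 7];   % hypothetical data
N = numel(input_vec);
keep = false(1, N);          % preallocated logical mask

for i = 1:N
    % Stand-in criterion; replace with the real (possibly complicated) logic.
    if input_vec(i) > 5
        keep(i) = true;
    end
end

output = input_vec(keep);    % only the elements worth saving
```

Using a logical mask rather than a vector of 0s and 1s lets you index with it directly, without the ==1 comparison.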
The trivial solution is:
% if input(i) meets your conditions
output = [output; input(i)]
Though I don't know if this has good performance or not
If N is not too big so that it would cause you memory problems, you can pre-assign output to a vector of the same size as input, and remove all useless elements at the end of the loop.
output = NaN(N,1);
for i=1:N
...
output(i) = input(i);
...
end
output(isnan(output)) = [];
There are two alternatives
If output would be too big when preallocated to size N, or if you do not know the upper limit of the size of output, you can do the following:
lengthOutput = 100;
output = NaN(lengthOutput,1);
counter = 1;
for i=1:N
...
output(counter) = input(i);
counter = counter + 1;
if counter > lengthOutput
%# append output if necessary by doubling its size
output = [output;NaN(lengthOutput,1)];
lengthOutput = length(output);
end
end
%# remove unused entries
output(counter:end) = [];
Finally, if N is small, it is perfectly fine to call
output = [];
for i=1:N
...
output = [output;input(i)];
...
end
Note that performance degrades dramatically if N becomes large (say >1000), because every append allocates a new, larger array and copies the old contents into it.
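A variation on the preallocation idea, sketched under the assumption that N is an acceptable upper bound for the size of output: keep a counter of stored elements and trim once at the end. Unlike a NaN marker, this does not rely on NaN being absent from the data. The name input_vec and the odd-number criterion are made up for illustration:

```matlab
input_vec = [3 8 1 9 4 7];   % hypothetical data
N = numel(input_vec);

output = zeros(1, N);        % preallocate to the upper bound
count = 0;                   % number of elements stored so far
for i = 1:N
    if mod(input_vec(i), 2) == 1   % stand-in criterion: keep odd values
        count = count + 1;
        output(count) = input_vec(i);
    end
end
output = output(1:count);    % drop the unused tail in one step
```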

Out-of-memory algorithms for addressing large arrays

I am trying to deal with a very large dataset. I have k = ~4200 matrices (varying sizes) which must be compared combinatorially, skipping non-unique and self comparisons. Each of k(k-1)/2 comparisons produces a matrix, which must be indexed against its parents (i.e. can find out where it came from). The convenient way to do this is to (triangularly) fill a k-by-k cell array with the result of each comparison. These are ~100 X ~100 matrices, on average. Using single precision floats, it works out to 400 GB overall.
I need to 1) generate the cell array or pieces of it without trying to place the whole thing in memory and 2) access its elements (and their elements) in like fashion. My attempts have been inefficient due to reliance on MATLAB's eval() as well as save and clear occurring in loops.
for i=1:k
[~,m] = size(data{i});
cur_var = ['H' int2str(i)];
%# if i == 1; save('FileName'); end; %# If using a single MAT file and need to create it.
eval([cur_var ' = cell(1,k-i);']);
for j=i+1:k
[~,n] = size(data{j});
eval([cur_var '{i,j} = zeros(m,n,''single'');']);
eval([cur_var '{i,j} = compare(data{i},data{j});']);
end
save(cur_var,cur_var); %# Add '-append' when using a single MAT file.
clear(cur_var);
end
The other thing I have done is to perform the split when mod((i+j-1)/2,max(factor(k*(k-1)/2))) == 0. This divides the result into the largest number of same-size pieces, which seems logical. The indexing is a little more complicated, but not too bad because a linear index could be used.
Does anyone know/see a better way?
Here's a version that combines going fast with using minimal memory.
I use fwrite/fread so that you still can use parfor (and this time, I made sure it works :) )
%# assume data is loaded and k is known
%# find the index pairs for comparisons. This could be done more elegantly, I guess.
%# I'm constructing a lower triangular array, i.e. an array that has ones wherever
%# we want to compare i (row) and j (col). Then I use find to get i and j
[iIdx,jIdx] = find(tril(ones(k,k),-1));
%# create a directory to store the comparisons
mkdir('H_matrix_elements')
savePath = fullfile(pwd,'H_matrix_elements');
%# loop through all comparisons in parallel. This way there may be a bit more overhead from
%# the individual function calls. However, parfor is most efficient if there are
%# a lot of relatively similarly fast iterations.
parfor ct = 1:length(iIdx)
%# make the comparison - do double b/c there shouldn't be a memory issue
currentComparison = compare(data{iIdx(ct)},data{jIdx(ct)});
%# create save-name as H_i_j, e.g. H_104_23
saveName = fullfile(savePath,sprintf('H_%i_%i',iIdx(ct),jIdx(ct)));
%# save. Since 'save' is not allowed, use fwrite to write the data to disk
fid = fopen(saveName,'w');
%# for simplicity: save data as vector, add two elements to the beginning
%# to store the size of the array. Pass the precision explicitly, because
%# fwrite defaults to 'uint8' and would clip the values
fwrite(fid,[size(currentComparison)';currentComparison(:)],'double');
%# close file
fclose(fid);
end
%# to read e.g. comparison H_104_23
fid = fopen(fullfile(savePath,'H_104_23'),'r');
tmp = fread(fid,Inf,'double');
fclose(fid);
%# reshape into 2D array (stored as H_104_23 so that data is not overwritten)
H_104_23 = reshape(tmp(3:end),tmp(1),tmp(2));
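A self-contained round-trip sketch of the fwrite/fread idea, with the precision stated explicitly on both sides (fwrite writes 'uint8' unless told otherwise). The matrix, the temporary file name and the choice to store the payload as single are assumptions for illustration:

```matlab
A = single(magic(4)) / 3;            % stand-in for one comparison result
fileName = [tempname '.bin'];        % temporary file, removed below

% Write: a two-element size header as double, then the data as single.
fid = fopen(fileName, 'w');
fwrite(fid, size(A), 'double');
fwrite(fid, A(:), 'single');
fclose(fid);

% Read: header first, then exactly prod(sz) payload values.
fid = fopen(fileName, 'r');
sz = fread(fid, 2, 'double').';      % [rows cols] as a row vector
B = fread(fid, prod(sz), 'single=>single');
fclose(fid);
B = reshape(B, sz);
delete(fileName);

roundTripOk = isequal(A, B);
```

Storing the payload as single halves the file size compared with double, which may matter given the data volume described in the question.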
You can get rid of the eval and clear calls by assigning the filename separately.
for i=1:k
[~,m] = size(data{i});
file_name = ['H' int2str(i)];
cur_var = cell(1, k-i);
for j=i+1:k
[~,n] = size(data{j});
cur_var{i,j} = zeros(m, n, 'single');
cur_var{i,j} = compare(data{i}, data{j});
end
save(file_name, 'cur_var'); %# save expects the variable name as a string
end
If you need the saved variables to take different names, store the data in a struct field with the desired name and use the -struct option of save.
str.(file_name) = cur_var;
save(file_name, '-struct', 'str');
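A short sketch of the -struct approach, checking that the variable really lands in the MAT-file under the dynamic name. The name 'H7', the cell contents and the temporary file are made up for illustration:

```matlab
file_name = 'H7';                   % hypothetical variable name
cur_var = {magic(3), ones(2)};      % stand-in for one row of comparisons

str = struct();
str.(file_name) = cur_var;          % dynamic field name becomes the variable name

matFile = [tempname '.mat'];        % temporary file, removed below
save(matFile, '-struct', 'str');    % each field of str is saved as its own variable

loaded = load(matFile);             % load returns a struct of the saved variables
delete(matFile);
hasField = isfield(loaded, file_name);
```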