How to search through arrays to find a match - matlab

I have a 4x~50,000 array, where the first 3 rows hold the x y z coordinates, then some corresponding values of a given location.
i.e.,
row 1 | 1 2 3 4 5 1...
row 2 | 1 1 1 1 1 1...
row 3 | 0 0 0 0 0 1...
row 4 | 0.7 0.2 1.0 0.3 0.3 0.5
I want to rearrange this data into a 3D array, to plot this data in fMRI style
for example,
3Darray(1,1,0) = 0.7
3Darray(2,1,0) = 0.2
3Darray(3,1,0) = 1.0
.
.
.
I'm having trouble going through the rows simultaneously to match all three x,y,z, values.
Any suggestions are welcome!
Thank you,
Regards,
P

This is a job for sub2ind! Here's a dummy array I made for testing,
%// example array
array=[1 2 3 4 5 1 2 3 4 5
1 1 1 1 1 2 2 2 2 2
1 1 1 1 1 1 1 3 3 3
rand(1,10)]
And the code:
n=max(array(1:3,:),[],2).'; %// get size of final array
m=zeros(n); %// make an array of zeros of that size
ind=sub2ind(n,array(1,:),array(2,:),array(3,:)); %// get linear indices of elements
m(ind)=array(4,:); %// put elements into the array

Related

MATLAB generate all ways that n items can be put into m bins?

I want to find all ways that n items can be split among m bins. For example, for n=3 and m=3 the output would be (the order doesn't matter):
[3 0 0
0 3 0
0 0 3
2 1 0
1 2 0
0 1 2
0 2 1
1 0 2
2 0 1
1 1 1]
The algorithm should be as efficient as possible, preferrably vectorized/using inbuilt functions rather than for loops. Thank you!
This should be pretty efficient.
It works by generating all posible splitings of the real interval [0, n] at m−1 integer-valued, possibly coincident split points. The lengths of the resulting subintervals give the solution.
For example, for n=4 and m=3, some of the possible ways to split the interval [0, 4] at m−1 points are:
Split at 0, 0: this gives subintervals of lenghts 0, 0, 4.
Split at 0, 1: this gives subintervals of lenghts 0, 1, 3.
...
Split at 4, 4: this gives subintervals of lenghts 4, 0, 0.
Code:
n = 4; % number of items
m = 3; % number of bins
x = bsxfun(#minus, nchoosek(0:n+m-2,m-1), 0:m-2); % split points
x = [zeros(size(x,1),1) x n*ones(size(x,1),1)]; % add start and end of interval [0, n]
result = diff(x.').'; % compute subinterval lengths
The result is in lexicographical order.
As an example, for n = 4 items in m = 3 bins the output is
result =
0 0 4
0 1 3
0 2 2
0 3 1
0 4 0
1 0 3
1 1 2
1 2 1
1 3 0
2 0 2
2 1 1
2 2 0
3 0 1
3 1 0
4 0 0
I'd like to suggest a solution based on an external function and accumarray (it should work starting R2015a because of repelem):
n = uint8(4); % number of items
m = uint8(3); % number of bins
whichBin = VChooseKR(1:m,n).'; % see FEX link below. Transpose saves us a `reshape()` later.
result = accumarray([repelem(1:size(whichBin,2),n).' whichBin(:)],1);
Where VChooseKR(V,K) creates a matrix whose rows are all combinations created by choosing K elements of the vector V with repetitions.
Explanation:
The output of VChooseKR(1:m,n) for m=3 and n=4 is:
1 1 1 1
1 1 1 2
1 1 1 3
1 1 2 2
1 1 2 3
1 1 3 3
1 2 2 2
1 2 2 3
1 2 3 3
1 3 3 3
2 2 2 2
2 2 2 3
2 2 3 3
2 3 3 3
3 3 3 3
All we need to do now is "histcount" the numbers on each row using positive integer bins to get the desired result. The first output row would be [4 0 0] because all 4 elements go in the 1st bin. The second row would be [3 1 0] because 3 elements go in the 1st bin and 1 in the 2nd, etc.

Transform a matrix to a stacked vector where all zeroes after the last non-zero value per row are removed

I have a matrix with some zero values I want to erase.
a=[ 1 2 3 0 0; 1 0 1 3 2; 0 1 2 5 0]
>>a =
1 2 3 0 0
1 0 1 3 2
0 1 2 5 0
However, I want to erase only the ones after the last non-zero value of each line.
This means that I want to retain 1 2 3 from the first line, 1 0 1 3 2 from the second and 0 1 2 5 from the third.
I want to then store the remaining values in a vector. In the case of the example this would result in the vector
b=[1 2 3 1 0 1 3 2 0 1 2 5]
The only way I figured out involves a for loop that I would like to avoid:
b=[];
for ii=1:size(a,1)
l=max(find(a(ii,:)));
b=[b a(ii,1:l)];
end
Is there a way to vectorize this code?
There are many possible ways to do this, here is my approach:
arotate = a' %//rotate the matrix a by 90 degrees
b=flipud(arotate) %//flips the matrix up and down
c= flipud(cumsum(b,1)) %//cumulative sum the matrix rows -and then flip it back.
arotate(c==0)=[]
arotate =
1 2 3 1 0 1 3 2 0 1 2 5
=========================EDIT=====================
just realized cumsum can have direction parameter so this should do:
arotate = a'
b = cumsum(arotate,1,'reverse')
arotate(b==0)=[]
This direction parameter was not available on my 2010b version, but should be there for you if you are using 2013a or above.
Here's an approach using bsxfun's masking capability -
M = size(a,2); %// Save size parameter
at = a.'; %// Transpose input array, to be used for masked extraction
%// Index IDs of last non-zero for each row when looking from right side
[~,idx] = max(fliplr(a~=0),[],2);
%// Create a mask of elements that are to be picked up in a
%// transposed version of the input array using BSXFUN's broadcasting
out = at(bsxfun(#le,(1:M)',M+1-idx'))
Sample run (to showcase mask usage) -
>> a
a =
1 2 3 0 0
1 0 1 3 2
0 1 2 5 0
>> M = size(a,2);
>> at = a.';
>> [~,idx] = max(fliplr(a~=0),[],2);
>> bsxfun(#le,(1:M)',M+1-idx') %// mask to be used on transposed version
ans =
1 1 1
1 1 1
1 1 1
0 1 1
0 1 0
>> at(bsxfun(#le,(1:M)',M+1-idx')).'
ans =
1 2 3 1 0 1 3 2 0 1 2 5

How to count patterns columnwise in Matlab?

I have a matrix S in Matlab that looks like the following:
2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1
I would like to count patterns of values column-wise. I am interested into the frequency of the numbers that follow right after number 3 in any of the columns. For instance, number 3 occurs three times in the first column. The first time we observe it, it is followed by 3, the second time it is followed by 3 again and the third time it is followed by 4. Thus, the frequency for the patters observed in the first column would look like:
3-3: 66.66%
3-4: 33.33%
3-1: 0%
3-2: 0%
To generate the output, you could use the convenient tabulate
S = [
2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1];
idx = find(S(1:end-1,:)==3);
S2 = S(2:end,:);
tabulate(S2(idx))
Value Count Percent
1 0 0.00%
2 0 0.00%
3 4 66.67%
4 2 33.33%
Here's one approach, finding the 3's then looking at the following digits
[i,j]=find(S==3);
k=i+1<=size(S,1);
T=S(sub2ind(size(S),i(k)+1,j(k))) %// the elements of S that are just below a 3
R=arrayfun(#(x) sum(T==x)./sum(k),1:max(S(:))).' %// get the number of probability of each digit
I'm going to restate your problem statement in a way that I can understand and my solution will reflect this new problem statement.
For a particular column, locate the locations that contain the number 3.
Look at the row immediately below these locations and look at the values at these locations
Take these values and tally up the total number of occurrences found.
Repeat these for all of the columns and update the tally, then determine the percentage of occurrences for the values.
We can do this by the following:
A = [2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1]; %// Define your matrix
[row,col] = find(A(1:end-1,:) == 3);
vals = A(sub2ind(size(A), row+1, col));
h = 100*accumarray(vals, 1) / numel(vals)
h =
0
0
66.6667
33.3333
Let's go through the above code slowly. The first few lines define your example matrix A. Next, we take a look at all of the rows except for the last row of your matrix and see where the number 3 is located with find. We skip the last row because we want to be sure we are within the bounds of your matrix. If there is a number 3 located at the last row, we would have undefined behaviour if we tried to check the values below the last because there's nothing there!
Once we do this, we take a look at those values in the matrix that are 1 row beneath those that have the number 3. We use sub2ind to help us facilitate this. Next, we use these values and tally them up using accumarray then normalize them by the total sum of the tallying into percentages.
The result would be a 4 element array that displays the percentages encountered per number.
To double check, if we look at the matrix, we see that the value of 3 follows other values of 3 for a total of 4 times - first column, row 3, row 4, second column, row 2 and third column, row 6. The value of 4 follows the value of 3 two times: first column, row 6, second column, row 3.
In total, we have 6 numbers we counted, and so dividing by 6 gives us 4/6 or 66.67% for number 3 and 2/6 or 33.33% for number 4.
If I got the problem statement correctly, you could efficiently implement this with MATLAB's logical indexing and an approach that is essentially of two lines -
%// Input 2D matrix
S = [
2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1]
Labels = [1:4]'; %//'# Label array
counts = histc(S([false(1,size(S,2)) ; S(1:end-1,:) == 3]),Labels)
Percentages = 100*counts./sum(counts)
Verify/Present results
The styles for presenting the output results listed next use MATLAB's table for a well human-readable format of data.
Style #1
>> table(Labels,Percentages)
ans =
Labels Percentages
______ ___________
1 0
2 0
3 66.667
4 33.333
Style #2
You can do some fancy string operations to present the results in a more "representative" manner -
>> Labels_3 = strcat('3-',cellstr(num2str(Labels','%1d')'));
>> table(Labels_3,Percentages)
ans =
Labels_3 Percentages
________ ___________
'3-1' 0
'3-2' 0
'3-3' 66.667
'3-4' 33.333
Style #3
If you want to present them in descending sorted manner based on the percentages as listed in the expected output section of the question, you can do so with an additional step using sort -
>> [Percentages,idx] = sort(Percentages,'descend');
>> Labels_3 = strcat('3-',cellstr(num2str(Labels(idx)','%1d')'));
>> table(Labels_3,Percentages)
ans =
Labels_3 Percentages
________ ___________
'3-3' 66.667
'3-4' 33.333
'3-1' 0
'3-2' 0
Bonus Stuff: Finding frequency (counts) for all cases
Now, let's suppose you would like repeat this process for say 1, 2 and 4 as well, i.e. find occurrences after 1, 2 and 4 respectively. In that case, you can iterate the above steps for all cases and for the same you can use arrayfun -
%// Get counts
C = cell2mat(arrayfun(#(n) histc(S([false(1,size(S,2)) ; S(1:end-1,:) == n]),...
1:4),1:4,'Uni',0))
%// Get percentages
Percentages = 100*bsxfun(#rdivide, C, sum(C,1))
Giving us -
Percentages =
90.9091 20.0000 0 100.0000
9.0909 20.0000 0 0
0 60.0000 66.6667 0
0 0 33.3333 0
Thus, in Percentages, the first column are the counts of [1,2,3,4] that occur right after there is a 1 somewhere in the input matrix. As as an example, one can see column -3 of Percentages is what you had in the sample output when looking for elements right after 3 in the input matrix.
If you want to compute frequencies independently for each column:
S = [2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1]; %// data: matrix
N = 3; %// data: number
r = max(S(:));
[R, C] = size(S);
[ii, jj] = find(S(1:end-1,:)==N); %// step 1
count = full(sparse(S(ii+1+(jj-1)*R), jj, 1, r, C)); %// step 2
result = bsxfun(#rdivide, count, sum(S(1:end-1,:)==N)); %// step 3
This works as follows:
find is first applied to determine row and col indices of occurrences of N in S except its last row.
The values in the entries right below the indices of step 1 are accumulated for each column, in variable count. The very convenient sparse function is used for this purpose. Note that this uses linear indexing into S.
To obtain the frequencies for each column, count is divided (with bsxfun) by the number of occurrences of N in each column.
The result in this example is
result =
0 0 0 NaN
0 0 0 NaN
0.6667 0.5000 1.0000 NaN
0.3333 0.5000 0 NaN
Note that the last column correctly contains NaNs because the frequency of the sought patterns is undefined for that column.

Specific selection of rows from a matrix MATLAB

Consider the following the matrix:
1 2 1 1
1 3 1 1
2 5 2 3
2 6 2 4
2 6 2 4
2 9 0 0
3 4 5 6
3 4 1 1
3 2 0 0
3 1 1 1
.
.
.
I want to select the row(s) with the maximum value in column 2 for every unique value in column 1.
For eg.
Answer should be:
1 3 1 1
2 9 0 0
3 4 5 6
3 4 1 1
Any ideas?
Here's one way:
%// Get a unique list of column 1 without changing the order in which they appear
[C1, ~, subs] = unique(M(:,1), 'stable');
%// Get the max value from column 2 corresponding to each unique value of column 1
C2 = accumarray(subs, M(:,2), [], #max);
%// Find the desired row indices
I = ismember(M(:,1:2), [C1, C2], 'rows');
%//Extract the rows
M(I, :)
Some code to get you started:
% unique values in first column
col1 = unique(x(:,1));
% we first store results in a cell array (later converted to matrix)
xx = cell(numel(col1), 1);
for i=1:numel(col1)
% rows with the same value in column 1
rows = x(x(:,1) == col1(i),:);
% maximum value along column 2
mx = max(rows(:,2));
% store all rows with the max value (in case of ties)
xx{i} = rows(rows(:,2)==mx,:);
end
% combine all resulting rows
xx = vertcat(xx{:});
The result for the matrix you've shown:
>> xx
xx =
1 3 1 1
2 9 0 0
3 4 5 6
3 4 1 1

How do I divide the rows of a matrix by different values in MATLAB (array division)

Lets said I have the matrix M = ones(3); and I want to divide each row by a different number, e.g., C = [1;2;3];.
1 1 1 -divide_by-> 1 1 1 1
1 1 1 -divide_by-> 2 = 0.5 0.5 0.5
1 1 1 -divide_by-> 3 0.3 0.3 0.3
How can I do this without using loops?
Use right array division as documented here
result = M./C
whereas C has the following form:
C = [ 1 1 1 ; 2 2 2 ; 3 3 3 ];
EDIT:
result = bsxfun(#rdivide, M, [1 2 3]'); % untested !