Indexing rows of a table by comparing the values between two cells - matlab

I have a table like the one in the attachment above. Column A and Column B each contain elements stored as cell arrays. I want to create a third column (Level) as the result, based on the following logic:
1. A row for which the value of cell A equals the value of cell B is labeled 1. (In the 3rd row, the value of column A = value of column B = 3, hence it is labeled 1.)
2. Next, the matched value is removed from all the cells of column A, and step 1 is repeated until all the rows are labeled. (In the second step, 3 is removed from all the cells, hence both row 1 and row 2 are labeled 2; in the final step, elements {1,2} are also removed from the last row, resulting in level 3.)
I am using the cell2mat and setdiff functions to compare the values across the cells, but I am not able to frame the above 2 logical steps so that my code runs successfully. I have just started learning MATLAB; any help will be highly appreciated.

Here's the simplest answer I could come up with, using a single while loop and assuming the cells of A and B contain row vectors:
Level = zeros(size(A));
index = cellfun(@isequal, A, B);
while any(index)
Level(index) = max(Level)+1;
A = cellfun(@(c) {setdiff(c, unique([A{index}]))}, A);
index = cellfun(@isequal, A, B);
end
The above code first initializes a matrix of zeroes Level the same size as A to store the level values. Then it finds a logical index index of where there are matching cell contents between A and B using cellfun and isequal. It will continue to loop as long as there are any matches indicated by index. The corresponding indices in Level are set to the current maximum value in Level plus one. All the matching cell contents from A are concatenated and the unique values found by unique([A{index}]). A set difference operation is then used (along with cellfun) to remove the matching values from each cell in A, overwriting A with the remaining values. A new index for matches is then computed and the loop restarts.
Given the following sample data from your question:
A = {[1 2 3]; [2 3]; 3; [1 2 3 4]};
B = {[1 2]; 2; 3; 4};
The code returns the expected level vector:
Level =
2
2
1
3

Not my best work; I think it is possible to get rid of the inner loop.
% your testdata
A = {[1 2 3]
[2 3]
3
[1,2,4]};
B = {[1 2]
2
3
4};
Level = NaN(numel(B),1);
temp = A; % copy of A that we are going to remove elements from
k = 0; % loop counter
while any(isnan(Level)) % do until each element of Level is not NaN
k = k+1; % increment counter by 1
% step 1
idx = find(cellfun(@isequal,temp,B)); % determine which cells are equal
Level(idx) = k; % set level of equal cells
% step 2
for m = 1:numel(idx) % for each cell that is equal (m, not k, so the level counter is not overwritten)
%remove values in B from A for each equal cell
temp = cellfun(@setdiff,temp,repmat(B(idx(m)),numel(B),1),'UniformOutput',0);
end
end
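The NaN-based loop above assumes that every row eventually gets a level; if some row never matches, while any(isnan(Level)) would not terminate. A minimal sketch of a guarded variant, assuming row-vector cell contents as in the sample data (the variable name removed is my own, not from either answer):

```matlab
A = {[1 2 3]; [2 3]; 3; [1 2 3 4]};
B = {[1 2]; 2; 3; 4};

Level = NaN(numel(B), 1);
temp = A;   % working copy of A that values are removed from
k = 0;      % current level
while any(isnan(Level))
    % rows that match now and have not been labeled yet
    idx = find(cellfun(@isequal, temp, B) & isnan(Level));
    if isempty(idx)
        break;  % no progress possible: leave the remaining rows as NaN
    end
    k = k + 1;
    Level(idx) = k;
    % remove every value of the matched cells from all cells
    removed = unique([temp{idx}]);
    temp = cellfun(@(c) setdiff(c, removed), temp, 'UniformOutput', false);
end
disp(Level)  % 2 2 1 3 for the sample data
```

Rows that can never match simply keep NaN, which makes them easy to spot afterwards with isnan(Level).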

Related

Rearrange rows according to common column element

Give for example the matrices
A = [20 1 2 3;
3 3 3 4];
B = [3 3 3 3;
20 1 2 4];
Each column of matrix A contains a common element. Is it possible, without for loops, to rearrange the rows of A so that the common element is in the top or bottom row (see matrix B)?
I suggest you use the Set functions with multiple inputs file exchange submission by Oleg to find the common element. First convert A to a cell array of column vectors using mat2cell. Then break it up into a comma separated list using the {:} notation to feed each column to intersectm (from the FEX entry linked to above) as separate inputs
A_cell = mat2cell(A,2,ones(1,size(A,2)));
common = intersectm(A_cell{:});
Now find which row the common element lies in for each column, and use linear indexing to flip the columns in which the common element is in the second row:
[r, c] = find(A == common);
idx_r = (r+c*2-2)';
idx = idx_r;
idx(2,:) = (idx_r-1).*~mod(idx_r,2) + (idx_r+1).*mod(idx_r,2);
Finally
B = A(idx)
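If you would rather avoid the File Exchange dependency, here is a minimal sketch using only built-in functions. It assumes that A has exactly two rows and that exactly one value occurs in every column; cand and inAll are illustrative names of my own.

```matlab
A = [20 1 2 3;
      3 3 3 4];

% Candidates for the common element: the values in the first column
cand = unique(A(:, 1));
% Keep the candidate that appears in every column of A
inAll = arrayfun(@(v) all(any(A == v, 1)), cand);
common = cand(inAll);   % assumed to be a single value

% With two rows, the other element of each column is (column sum - common)
B = [repmat(common, 1, size(A, 2)); sum(A, 1) - common];
disp(B)
```

The column-sum trick only works for two rows; with more rows you would need the index-based approach above.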

Shifting repeating rows to a new column in a matrix

I am working with a n x 1 matrix, A, that has repeating values inside it:
A = [0;1;2;3;4; 0;1;2;3;4; 0;1;2;3;4; 0;1;2;3;4]
which correspond to an n x 1 matrix of B values:
B = [2;4;6;8;10; 3;5;7;9;11; 4;6;8;10;12; 5;7;9;11;13]
I am attempting to produce a generalised code to place each repetition into a separate column and store it into Aa and Bb, e.g.:
Aa = [0 0 0 0
      1 1 1 1
      2 2 2 2
      3 3 3 3
      4 4 4 4]
Bb = [2  3  4  5
      4  5  6  7
      6  7  8  9
      8  9  10 11
      10 11 12 13]
Essentially, each repetition from A and B needs to be copied into the next column and then deleted from the first column
So far I have managed to identify how many repetitions there are and copy the entire column over to the next column and then the next for the amount of repetitions there are but my method doesn't shift the matrix rows to columns as such.
clc;clf;close all
A = [0;1;2;3;4;0;1;2;3;4;0;1;2;3;4;0;1;2;3;4];
B = [2;4;6;8;10;3;5;7;9;11;4;6;8;10;12;5;7;9;11;13];
desiredCol = 1; %next column to go to
destinationCol = 0; %column to start on
n = length(A);
for i = 2:1:n-1
if A == 0;
A = [ A(:, 1:destinationCol)...
A(:, desiredCol+1:destinationCol)...
A(:, desiredCol)...
A(:, destinationCol+1:end) ];
end
end
A = [...] retrieved from Move a set of N-rows to another column in MATLAB
Any hints would be much appreciated. If you need further explanation, let me know!
Thanks!
Given our discussion in the comments, all you need is to use reshape which converts a matrix of known dimensions into an output matrix with specified dimensions provided that the number of elements match. You wish to transform a vector which has a set amount of repeating patterns into a matrix where each column has one of these repeating instances. reshape creates a matrix in column-major order where values are sampled column-wise and the matrix is populated this way. This is perfect for your situation.
Assuming that you already know how many "repeats" you're expecting, we call this An, you simply need to reshape your vector so that it has T = n / An rows where n is the length of the vector. Something like this will work.
n = numel(A); T = n / An;
Aa = reshape(A, T, []);
Bb = reshape(B, T, []);
The third parameter is a pair of empty brackets ([]), which tells MATLAB to infer how many columns there will be given that there are T rows. Technically, this would simply be An columns, but it's nice to show you how flexible MATLAB can be.
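For reference, here is the reshape approach applied to the data from the question, as a small self-contained sketch (An = 4 is taken as known, as assumed above):

```matlab
A = [0;1;2;3;4; 0;1;2;3;4; 0;1;2;3;4; 0;1;2;3;4];
B = [2;4;6;8;10; 3;5;7;9;11; 4;6;8;10;12; 5;7;9;11;13];

An = 4;                  % number of repeats, assumed known
T = numel(A) / An;       % rows per repeat (5 here)

% reshape fills column by column, so each repeat lands in its own column
Aa = reshape(A, T, []);
Bb = reshape(B, T, []);
disp(Aa)
disp(Bb)
```

If numel(A) is not divisible by An, reshape raises an error, which is a useful sanity check on the assumed repeat count.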
If you already know the repeated subvector and the number of times it repeats, then it is relatively straightforward:
First make your new A matrix with the repmat function.
Then remap your B vector to the same size as your new A matrix.
% Given that you already have the repeated subvector Asub, and the number
% of times it repeats; An:
Asub = [0;1;2;3;4];
An = 4;
lengthAsub = length(Asub);
Anew = repmat(Asub, [1,An]);
% If you can assume that the number of elements in B is equal to the number
% of elements in A:
numberColumns = size(Anew, 2);
newB = zeros(size(Anew));
for i = 1:numberColumns
indexStart = (i-1) * lengthAsub + 1;
indexEnd = indexStart + lengthAsub - 1; % each column holds lengthAsub elements
newB(:,i) = B(indexStart:indexEnd);
end
If you don't know what is in your original A vector, but you do know it is repetitive, if you assume that the pattern has no repeats you can use the find function to find when the first element is repeated:
lengthAsub = find(A(2:end) == A(1), 1);
Asub = A(1:lengthAsub);
An = length(A) / lengthAsub
Hopefully this fits in with your data: the only reason it would not is if your subvector within A is a pattern which does not have unique numbers, such as:
A = [0;1;2;3;2;1;0; 0;1;2;3;2;1;0; 0;1;2;3;2;1;0; 0;1;2;3;2;1;0;]
It is worth noting that, intuitively, you might expect to need lengthAsub = find(A(2:end) == A(1), 1) - 1;, but the subtraction is not necessary because you are already effectively taking one off by searching only within A(2:end).
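To catch the failure case described above, you could verify the detected period before using it. This is only a sketch: isRep is a helper name I made up, and it assumes A is a column vector whose first element repeats at least once.

```matlab
% Hypothetical helper: true if v consists of whole repeats of its first L elements
isRep = @(v, L) mod(numel(v), L) == 0 && ...
                isequal(v, repmat(v(1:L), numel(v) / L, 1));

% Pattern with unique values: detection works
A = repmat((0:4)', 4, 1);
lengthAsub = find(A(2:end) == A(1), 1);       % 5
okGood = isRep(A, lengthAsub);                % true

% Pattern that repeats its first value internally: detection fails
A2 = repmat([0;1;2;3;2;1;0], 4, 1);
lengthBad = find(A2(2:end) == A2(1), 1);      % 6, but the real period is 7
okBad = isRep(A2, lengthBad);                 % false
```

When the check fails you would need a different way to find the period, for example trying successive candidate lengths until isRep returns true.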

find row indices of different values in matrix

Having matrix A (n*2) as the source and B as a vector containing a subset of elements A, I'd like to find the row index of items.
A=[1 2;1 3; 4 5];
B=[1 5];
F=arrayfun(@(x)(find(B(x)==A)),1:numel(B),'UniformOutput',false)
gives the following outputs in a cell according to this help page
[2x1 double] [6]
indicating the linear (column-wise) indices of all occurrences. But I'd like to have the row indices, i.e. I'd like to know that element 1 occurs in rows 1 and 2 and element 5 occurs only in row 3. If the indices were row-wise I could use ceil(F{x}/2) to get the desired output. With a variable number of rows, what solution would you suggest? Because a complete row may not be included, the 'rows' option of the ismember function does not work. Besides, I'd like to know all indices of the specified elements.
Thanks in advance for any help.
Approach 1
To convert F from its current linear-index form into row indices, use mod:
rows = cellfun(@(x) mod(x-1,size(A,1))+1, F, 'UniformOutput', false);
You can combine this with your code into a single line. Note also that you can directly use B as an input to arrayfun, and you avoid one stage of indexing:
rows = arrayfun(@(x) mod(find(x==A)-1,size(A,1))+1, B(:), 'UniformOutput', false);
How this works:
F as given by your code is a linear index in column-major form. This means the index runs down the first column of A, then begins at the top of the second column and runs down again, etc. So the row number can be obtained with just a modulo (mod) operation.
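As a quick check of the modulo reasoning, here is a small sketch on the sample data that compares the mod-based rows with what ind2sub reports (r5 is just an illustrative variable name):

```matlab
A = [1 2; 1 3; 4 5];
B = [1 5];

% Row of each occurrence: linear index -> row via modulo on the row count
rows = arrayfun(@(x) mod(find(x == A) - 1, size(A, 1)) + 1, ...
                B(:), 'UniformOutput', false);

% Same conversion done by ind2sub for the element 5 (linear index 6)
[r5, ~] = ind2sub(size(A), find(A == 5));

celldisp(rows)   % rows{1} = [1; 2], rows{2} = 3
```

ind2sub is the more readable choice when you also need the column subscripts; mod keeps everything in one expression.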
Approach 2
Using bsxfun and accumarray:
t = any(bsxfun(@eq, B(:), reshape(A, 1, size(A,1), size(A,2))), 3); %// occurrence pattern
[ii, jj] = find(t); %// ii indicates an element of B, and jj is row of A where it occurs
rows = accumarray(ii, jj, [], @(x) {x}); %// group results according to ii
How this works:
Assuming A and B as in your example, t is the 2x3 matrix
t =
1 1 0
0 0 1
The m-th row of t contains 1 at column n if the m-th element of B occurs at the n-th row of A. These values are converted into row and column form with find:
ii =
1
1
2
jj =
1
2
3
This means the first element of B occurs at rows 1 and 2 of A; and the second occurs at row 3 of A.
Lastly, the values of jj are grouped (with accumarray) according to their corresponding value of ii to generate the desired result.
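On MATLAB R2016b or later, implicit expansion should make bsxfun unnecessary here. A sketch of the same idea under that version assumption; I also sort inside accumarray, since the order of values within a group is not guaranteed in general:

```matlab
A = [1 2; 1 3; 4 5];
B = [1 5];

% Compare each element of B (dim 1) against A laid out as 1 x rows x cols
A3 = reshape(A, 1, size(A, 1), size(A, 2));
t = any(B(:) == A3, 3);          % numel(B) x rows occurrence pattern

[ii, jj] = find(t);              % ii: element of B, jj: row of A
rows = accumarray(ii, jj, [], @(x) {sort(x)});

celldisp(rows)   % rows{1} = [1; 2], rows{2} = 3
```

On older releases the == comparison of differently sized arrays raises an error, so keep the bsxfun form there.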
One approach with bsxfun & accumarray -
%// Create a match of B's in A's with each column of matches representing the
%// rows in A where there is at least one match for each element in B
matches = squeeze(any(bsxfun(@eq,A,permute(B(:),[3 2 1])),2))
%// Get the indices values and the corresponding IDs of B
[indices,B_id] = find(matches)
%// Or directly for performance:
%// [indices,B_id] = find(any(bsxfun(@eq,A,permute(B(:),[3 2 1])),2))
%// Accumulate the indices values using B_id as subscripts
out = accumarray(B_id(:),indices(:),[],@(x) {x})
Sample run -
>> A
A =
1 2
1 3
4 5
>> B
B =
1 5
>> celldisp(out) %// To display the output, out
out{1} =
1
2
out{2} =
3
With arrayfun,ismember and find
[r,c] = arrayfun(@(x) find(ismember(A,x)) , B, 'uni',0);
Where r gives your desired results, you could also use the c variable to get the column of each number in B
Results for the sample input:
>> celldisp(r)
r{1} =
1
2
r{2} =
3
>> celldisp(c)
c{1} =
1
1
c{2} =
2

How to recover the original locations of deleted elements?

I have a data matrix A, the size of which is 4*20 (4 rows, 20 columns). The matrix A is generated by A = randn (4, 20).
In the first iteration, I delete columns [2,3] of matrix A. Then matrix A becomes matrix A1, the size of which is 4*18.
In the second iteration, I delete columns [4 8 10] of matrix A1. Then matrix A1 becomes matrix A2, the size of which is 4*15.
In the third iteration, I delete columns [1 3 6 9 10] of matrix A2. Then matrix A2 becomes matrix A3, the size of which is 4*10.
The deleted elements are put into a matrix B. My question is how to figure out the x and y coordinates in original matrix A of every deleted element in B. Anyone can give me a help? Thank you so much!
I would personally keep another vector that ranges from 1 to 20 and put your removal of columns within a loop. Let's call this vector column_choice. At each iteration, use randperm to randomly select from column_choice those columns you want to remove, then append these to your matrix B. Once you select these columns, remove these elements from column_choice and continue with your code. Also, we will add those columns from column_choice to another vector; call it final_columns. This vector will tell you which columns you ultimately removed, and you can reference these columns in the original matrix.
To make things efficient, create an array where each element contains the total number of columns you want to remove at each iteration. Therefore, do something like:
cols_to_remove = [2 3 5];
The first element means you want to remove 2 columns in the first iteration, the second element means you want to remove 3 columns in the second iteration, and 5 columns in the last iteration. Because you're looping, it's a good idea to pre-allocate your matrix. In total, you're going to have 10 columns removed and populated in B, and since your random matrix has 4 rows, you should do this:
B = zeros(4,sum(cols_to_remove));
We are summing over cols_to_remove as this tells us how many columns we will ultimately be removing all together. One thing I'd like to mention is that you should make a copy of A before we start removing columns. That way, you're able to reference back into the original matrix.
Finally, without further ado, here's the code that I would write to tackle this problem:
column_choice = 1 : 20;
cols_to_remove = [2 3 5];
B = zeros(4,sum(cols_to_remove));
final_columns = zeros(1,sum(cols_to_remove));
A = randn(4,20); %// From your post
Acopy = A; %// Make a copy of the matrix
%// Keep track of which column we need to start populating
%// B at
counter = 1;
%// For each amount of columns we want to remove...
for cols = cols_to_remove
%// Randomly choose the columns we want to remove
to_remove = randperm(numel(column_choice), cols);
%// Remove from the A matrix and store into B
removed_cols = Acopy(:,to_remove);
Acopy(:,to_remove) = [];
B(:,counter : counter + cols - 1) = removed_cols;
%// Also add columns removed to final_columns
final_columns(counter : counter + cols - 1) = column_choice(to_remove);
%// Increment counter for the next spot to place columns
counter = counter + cols;
%// Also remove from column_choice
column_choice(to_remove) = [];
%// Continue your code here to process A and/or B
%//...
%//...
end
%// Remove copy to save memory
clear Acopy;
Therefore, final_columns will give you which columns were removed from the original matrix, and you can refer back to A to locate where these are. B will contain those removed columns from A and are all concatenated together.
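To turn final_columns into explicit (row, column) coordinates for every element of B, something along these lines should work. It is a compact sketch of the loop above that grows B by concatenation for brevity; Awork, rowIdx and colIdx are names I introduced for illustration.

```matlab
A = randn(4, 20);
column_choice = 1:20;            % original column number of each remaining column
cols_to_remove = [2 3 5];
Awork = A;                       % working copy; A itself stays untouched
B = [];
final_columns = [];

for cols = cols_to_remove
    to_remove = randperm(numel(column_choice), cols);
    B = [B, Awork(:, to_remove)];                              %#ok<AGROW>
    final_columns = [final_columns, column_choice(to_remove)]; %#ok<AGROW>
    Awork(:, to_remove) = [];
    column_choice(to_remove) = [];
end

% Coordinates in the original A of every element of B:
% B(i,j) came from A(rowIdx(i,j), colIdx(i,j))
[rowIdx, colIdx] = ndgrid(1:size(A, 1), final_columns);

% Sanity check: indexing A with those coordinates reproduces B
recovered = A(sub2ind(size(A), rowIdx, colIdx));
```

Because whole columns are removed, the row coordinate of an element never changes; only the column bookkeeping is needed.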
Edit
As per your comments, you want to remove certain columns from each intermediate result. As such, you would specify which columns you want to remove in the second dimension of each matrix, then set it equal to []. Make sure you copy over each result into a new matrix before removing the columns. Also, you'll need to keep track of which indices from the original matrix you removed, so create the column_choice and final_columns vectors again and repeat the saving logic that we talked about before.
Therefore:
column_choice = 1:20;
final_columns = zeros(1,10);
A1 = A;
A1(:,[2 3]) = [];
final_columns(1:2) = column_choice([2 3]);
column_choice([2 3]) = [];
A2 = A1;
A2(:,[4 9 11]) = [];
final_columns(3:5) = column_choice([4 9 11]);
column_choice([4 9 11]) = [];
A3 = A2;
A3(:,[1 2 5 8 12]) = [];
final_columns(6:10) = column_choice([1 2 5 8 12]);
column_choice([1 2 5 8 12]) = [];

Filtering a matrix by unique column elements

M is a matrix: M = [X Y Z] where X, Y and Z are column vectors.
What is the easiest way to filter M so that:
1- No element is repeated per column
2- The order of the rows remain (if an element appears twice in a column, then I want to delete the entire row where it appears the second time)
e.g:
M = [1 2 4;
1 3 5;
2 3 9]
would become
Mf = [1 2 4;
2 3 9]
I tried to use [u,~,ind] = unique(M,'rows') to have the elements for which one element in a column is repeated, but this function deals with the entire row (if only one element of the row is repeated, then the row is unique)
Here is a quick and dirty solution; it should be fine as long as your M isn't too big. I've tested it on a few matrices and it seems to work as intended.
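count=1; % column currently being checked
for i=1:size(M,2)^2
    [~,IA]=unique(M(:,count),'first'); % first occurrence of each value
    IA=sort(IA); % keep the original row order
    if length(IA)~=size(M,1)
        M=M(IA,:); % drop rows that repeat a value in this column
        count=count-1; % re-check the same column
    end
    count=count+1;
    if count>size(M,2) % all columns processed
        break
    end
end
M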
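A row-by-row variant may be easier to reason about. The sketch below keeps a row only if none of its elements has already appeared, in the same column, in a row that was kept. Note the assumption: values in rows that were dropped do not count as "seen", which may differ from the column-by-column loop on some inputs. keep, prev and eqMat are my own names.

```matlab
M = [1 2 4;
     1 3 5;
     2 3 9];

keep = true(size(M, 1), 1);
for r = 2:size(M, 1)
    prev = M(find(keep(1:r-1)), :);      % rows kept so far
    eqMat = bsxfun(@eq, prev, M(r, :));  % column-wise comparison with row r
    keep(r) = ~any(eqMat(:));            % drop row r if any column repeats
end
Mf = M(keep, :);
disp(Mf)   % [1 2 4; 2 3 9]
```

Row order is preserved automatically here because rows are only ever deleted, never reordered.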