Merging elements of different cells - matlab

Suppose we have a cell array in which each cell holds ids (first row) and one attribute (second row), e.g.
A{1,1}=[1 2;2 4]
A{1,2}=[2 3 5;8 5 6]
Now I'd like the final output to list the unique ids across the two cells (the first-row values), with the attribute values from each cell collected next to the matching id,
i.e.
C =
[1] [ 2]
[2] [1x2 double] % 4 in first cell and 8 in second cell
[3] [ 5]
[5] [ 6]
It seems that it's not possible to use something like C=[unique(A{1,:}(1,:)')]. Any help is greatly appreciated.

Assuming that each cell has two rows and a variable number of columns, where the first row is the ID and the second row is an attribute, I'd consolidate all of the cells into a single 2D matrix and use accumarray. accumarray is very suitable here because you want to group the values that belong to the same ID and apply a function to each group. In our case, the function simply places the values in a cell array, and we sort them because accumarray does not guarantee the order in which the values for each ID are passed to the function.
Use cell2mat to convert the cells into a 2D matrix, transpose it so that it's in the column layout accumarray expects, and then call accumarray. One thing to note: should any IDs be missing, accumarray will leave that slot empty. By missing I mean that in your example the ID 4 is absent, since there is a gap between 3 and 5, and so is the ID 6 between 5 and 7 (I added the example from your comment to me). Because the largest ID in your data is 7, accumarray produces outputs for ID 1 up to ID 7 in increments of 1. The last step is therefore to eliminate any empty cells from the output of accumarray to complete the grouping.
BTW, I'm going to assume that your cell array consists of a single row of cells like your example.... so:
%// Setup
A{1,1}=[1 2;2 4];
A{1,2}=[2 3 5;8 5 6];
A{1,3}=[7;8];
%// Convert row of cell arrays to a single 2D matrix, then transpose for accumarray
B = cell2mat(A).';
%// Group IDs together and ensure they're sorted
out = accumarray(B(:,1), B(:,2), [], @(x) {sort(x)});
%// Add a column of IDs and concatenate with the previous output
IDs = num2cell((1:numel(out)).');
out = [IDs out];
%// Any cells from the grouping that are empty, eliminate
ind = cellfun(@isempty, out(:,2));
out(ind,:) = [];
We get:
out =
[1] [ 2]
[2] [2x1 double]
[3] [ 5]
[5] [ 6]
[7] [ 8]
>> celldisp(out(2,:))
ans{1} =
2
ans{2} =
4
8
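If the gaps in the IDs are a concern, one possible variation (a sketch only, not part of the answer above) is to let unique map the IDs to consecutive group numbers first, so that accumarray never creates empty slots and the clean-up step is not needed. The variable names ids, grp and vals are made up for this illustration, and the `~` placeholder assumes a reasonably recent MATLAB release.

```matlab
%// Setup, same data as above
A = {[1 2;2 4], [2 3 5;8 5 6], [7;8]};
%// Convert to a single 2D matrix: column 1 = IDs, column 2 = attributes
B = cell2mat(A).';
%// ids holds the sorted unique IDs, grp maps each row of B to its position in ids
[ids, ~, grp] = unique(B(:,1));
%// Group by the consecutive group numbers instead of the raw IDs
vals = accumarray(grp, B(:,2), [], @(x) {sort(x)});
%// Attach the real IDs as the first column
out = [num2cell(ids) vals];
```

Because grp runs from 1 to numel(ids) without gaps, out has exactly one row per ID that actually occurs in the data.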
If you'd like this done on a 2D cell array, where each row of this cell array represents a separate instance of the same problem, one suggestion I have is to perhaps loop over each row. Something like this, given your example in the comments:
%// Setup
A{1,1}=[1 2;2 4];
A{1,2}=[2 3 5;8 5 6];
A{1,3}=[7;8];
A{2,1}=[1 2;2 4];
A{2,2}=[1;7];
%// Make a cell array that will contain the output per row
out = cell(size(A,1),1);
for idx = 1 : size(A,1)
    %// Convert row of cell arrays to a single 2D matrix, then transpose for accumarray
    B = cell2mat(A(idx,:)).';
    %// Group IDs together and ensure they're sorted
    out{idx} = accumarray(B(:,1), B(:,2), [], @(x) {sort(x)});
    %// Add a column of IDs and concatenate with the previous output
    IDs = num2cell((1:numel(out{idx})).');
    out{idx} = [IDs out{idx}];
    %// Any cells from the grouping that are empty, eliminate
    ind = cellfun(@isempty, out{idx}(:,2));
    out{idx}(ind,:) = [];
end
We get:
>> out{1}
ans =
[1] [ 2]
[2] [2x1 double]
[3] [ 5]
[5] [ 6]
[7] [ 8]
>> out{2}
ans =
[1] [2x1 double]
[2] [ 4]
>> celldisp(out{1}(2,:))
ans{1} =
2
ans{2} =
4
8
>> celldisp(out{2}(1,:))
ans{1} =
1
ans{2} =
2
7

Related

Find corresponding array in a cell

Suppose I have a cell of arrays of the same size, for example
arr = {[1 NaN 2 ], ...
[NaN 4 7 ], ...
[3 4 NaN] };
and I also have a vector, for example
vec = [1 2 2];
How do I find the cell entry that matches the vector vec? Matching means that the entries in the same location are the same, ignoring NaNs.
For this particular vector vec I would like to have 1 returned, since it matches the first row.
Another vector [5 4 7] would return 2.
Vectors that don't match like [7 7 7] and vectors that match more than one entry like [3 4 7] should throw an error.
Note that the vector [3 7 4] does not match the second entry, because the order is important.
For each cell element, just check if
all(isnan(cellElement) | cellElement == vec)
is true, which means, you found a match. If you convert your cell to a matrix checkMatrix with multiple rows and each row corresponding to one cellElement, you can even do it without implementing a loop by repeating vec vertically and comparing the whole matrix in a single step. You will have to tell all() to check along dimension 2 rather than dimension 1 and have find() detect all the matches, like so:
find( all( ...
isnan(checkMatrix) | checkMatrix == repmat(vec,size(checkMatrix, 1),1) ...
, 2)); % all() along dimension 2
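As a rough end-to-end sketch of the idea above, using the example data from the question: checkMatrix is built here with vertcat, which assumes every cell holds a row vector of the same length, and matchRows is a helper name invented for this illustration.

```matlab
arr = {[1 NaN 2], [NaN 4 7], [3 4 NaN]};
checkMatrix = vertcat(arr{:});   % one row per cell element

% Hypothetical helper: indices of all rows that match v, treating NaN as a wildcard
matchRows = @(v) find(all(isnan(checkMatrix) | ...
    checkMatrix == repmat(v, size(checkMatrix,1), 1), 2));

m1 = matchRows([1 2 2]);   % 1
m2 = matchRows([5 4 7]);   % 2
m3 = matchRows([3 4 7]);   % [2; 3], ambiguous
m4 = matchRows([7 7 7]);   % empty, no match
```

The caller can then raise an error whenever numel of the result is not exactly 1, which covers both the ambiguous and the no-match case.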
So I thought about it and came up with this:
matching_ind = @(x, arr) find(...
cellfun(@(y) max(abs(not(x-y==0).*not(isnan(x-y)))),...
arr) == 0);
inds = matching_ind(vec, arr);
if length(inds) ~= 1
error('42');
end
See if this bsxfun based approach works for you -
A = vertcat(arr{:});
matching_ind = find(all(bsxfun(@eq,A,vec(:).') | isnan(A),2)) %//'
if numel(matching_ind)~=1
error('Error ID : 42.')
else
out = matching_ind(1);
end

size of inner elements of cells

I'm reading some data with their attribute (say A, in which the first row holds the ids and the second row their attribute values). I'd like to place such data in a cell where the first column holds the unique ids and the following columns their attributes. Whenever an id has more than one attribute value, I put the extra value in the next free slot of its row. For example, I'd like to construct C:
A =
1 2 3 2
2 4 5 9
C{1}=
1 2 0
2 4 9
3 5 0
When I check the size of the inner elements of the cell, e.g.
size(C{1},2)
ans = 3
size(C{1},1)
ans = 3
size(C{1}(1,:),2)
ans = 3
All return 3, because the empty slots are padded with 0. So how do I know where to put new data (e.g. (1,5))? Should I scan for the first 0 and insert there?
Thanks for any help.
Why not use a cell array for this kind of problem? How did you generate your C matrix?
Even though you have used a cell array for C, each element of C is a matrix in your case, so all of its rows must have the same length.
In my version, each element of the cell array takes its own size based on the number of duplicates. For example, you can see that C{2,2} has two values while C{1,2} and C{3,2} have only one value each. You can easily check their sizes without looking for zeros. Note that this code still works even if some of the values are zero.
The first column of the cell array holds the identifiers, while the second column holds the values, each entry sized by the number of duplicates.
Here is my implementation using accumarray and unique to generate C as a cell array.
Code:
C = [num2cell(unique(A(1,:).')), accumarray(A(1,:).',A(2,:).',[],@(x) {x.'})]
Your Sample Input:
A = [1 2 3 2;
2 4 5 9];
Output:
>> C
C =
[1] [ 2]
[2] [1x2 double]
[3] [ 5]
>> size(C{2,2},2)
ans =
2
>> size(C{1,2},2)
ans =
1
From the DOC
Note: If the subscripts in subs are not sorted with respect to their linear indices, then accumarray might not always preserve the order of the data in val when it passes them to fun. In the unusual case that fun requires that its input values be in the same order as they appear in val, sort the indices in subs with respect to the linear indices of the output.
Another Example:
Input:
A = [1 2 1 2 3 1;
2 4 5 9 4 8];
Output:
C =
[1] [1x3 double]
[2] [1x2 double]
[3] [ 4]
Hope this helps!!
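To come back to the original question of where to put a new pair such as (1,5): with the cell layout above there is no need to search for zeros. A minimal sketch, assuming C has the two-column form shown in the output; newId, newVal and row are names chosen only for this example.

```matlab
% C as produced above for the sample input
C = {1, 2; 2, [4 9]; 3, 5};

newId = 1; newVal = 5;                 % the new (id, value) pair
row = find([C{:,1}] == newId, 1);      % locate the row for this id, if any
if isempty(row)
    C(end+1,:) = {newId, newVal};      % unseen id: add a new row
else
    C{row,2}(end+1) = newVal;          % known id: append to its value vector
end
```

After this, C{1,2} is [2 5] and size(C{1,2},2) reports 2 without any padding involved.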

How to use reshape with cell arrays?

Unfortunately I have to work with a dataset of cell arrays, whose entries aren't even all of the same kind.
My dataset (the relevant columns of the cell array) looks as follows:
Data =
1 'd2'
1 'd3'
2 'u2'
2 'd2'
2 'u3'
3 'e2'
... ...
I want to reshape them so that, for all rows sharing the same number in the first column, the entries of the second column are stored in new columns. Because the number of rows per first-column value varies (but is at most 4), I wrote the following code:
% creating 4 new cell arrays for the new columns
cells = cell(length(Data(:,1)),4);
Data = [Data,cells];
% reshaping Data
Data(:,3:6) = reshape(Data(Data(:,1) == 1,2),1,[]);
Data(:,3:6) = reshape(Data(Data(:,1) == 2,2),1,[]);
This would work perfectly with matrices. Unfortunately, it doesn't work on cell arrays!
Could you please help me figure out where to place the curly brackets so that it works? I haven't got it so far, and maybe I'm just overlooking something! ;-)
Thank you a lot!
Personally I find a loop to be the simplest and most flexible solution in this case:
mydata={1 'd2'
1 'd3'
2 'u2'
2 'd2'
2 'u3'
3 'e2'}
list = unique([mydata{:,1}])
result = {};
for t=1:numel(list)
    count=0;
    for u =1:size(mydata,1)
        if mydata{u,1}==list(1,t)
            count = count+1;
            result(t,count)=mydata(u,2)
        end
    end
end
Note that a vectorized approach will likely be more efficient, but unless your data is big it should not matter much.
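For comparison, one possible vectorized variant (a sketch, not the loop above) groups the row indices with accumarray and then pulls the strings out per group. It assumes the first column contains positive integers; ids, rows and groups are illustrative names.

```matlab
mydata = {1 'd2'; 1 'd3'; 2 'u2'; 2 'd2'; 2 'u3'; 3 'e2'};

ids  = cell2mat(mydata(:,1));          % numeric IDs as a column vector
rows = (1:numel(ids)).';               % row index of every entry

% For each ID, collect the strings of its rows (sorted to keep the original order)
groups = accumarray(ids, rows, [], @(k) {mydata(sort(k),2).'});
```

Here groups{2} is the 1x3 cell {'u2','d2','u3'}. Unlike result in the loop version, the groups are not padded to a common width, and an ID that never occurs leaves an empty entry.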
This could be one approach that uses the masking capability of bsxfun -
%// Input
Data = {
1 'd2'
1 'd3'
2 'u2'
2 'd2'
2 'u3'
3 'e2'}
%// Find the IDs and the unique IDs
ids = cell2mat(Data(:,1))
id_out = num2cell([1:max(ids)]') %//'# To be used as the first col of desired o/p
%// Find the extents of each group/ID members
grp_extents = sum(bsxfun(@eq,[1:max(ids)],ids),1)
%// Or use accumarray which could be faster -
%// grp_extents = accumarray(ids,ones(1,numel(ids))).'
%// Get a cell array with the members (strings) from the second column of Data
%// put into specific columns based on their IDs
string_out = cell(max(grp_extents),numel(grp_extents))
string_out(bsxfun(@le,[1:max(grp_extents)]',grp_extents)) = Data(:,2) %//'# This is
%// where the masking is being used for logical indexing
%// Transpose the string cell array and horizontally concatenate with 1D
%// cell array containing the IDs to form the desired output
Data_out = [id_out string_out']
Output -
Data_out =
[1] 'd2' 'd3' []
[2] 'u2' 'd2' 'u3'
[3] 'e2' [] []

Matlab: 2d-array, rows different lengths

In Matlab, I want to create a two-dimensional array. However, I cannot create a matrix, because the rows are all different lengths.
I am new to Matlab, and I would normally do this in C++ by creating an array of pointers, with each pointer pointing towards its own array.
How should I do this in Matlab? Thanks.
You can use cell arrays, which can contain data of varying types and sizes.
Like this:
data = {[1]; [2,2]; [3,3,3]};
Check out here for more examples.
You could use a cell array:
C = {[1,2,3];
[1,2,3,4,5];
[1,2]};
Or pad with NaN or 0 or Inf etc
N = [1, 2, 3, NaN, NaN;
1, 2, 3, 4, 5;
1, 2, NaN, NaN, NaN]
It really depends on what you will be doing with your data next
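If you start from the cell array and later decide you want the padded matrix, the conversion can be sketched as follows. This assumes every cell holds a numeric row vector; lens and k are names used only here.

```matlab
C = {[1,2,3];
     [1,2,3,4,5];
     [1,2]};

lens = cellfun(@numel, C);            % length of each row
N = NaN(numel(C), max(lens));         % pre-fill with the padding value
for k = 1:numel(C)
    N(k, 1:lens(k)) = C{k};           % copy each row into the left part
end
```

This reproduces the NaN-padded N shown above, and lens keeps the true row lengths in case they are needed afterwards.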
Use cell arrays as mentioned by others. Listing out some code and comments to explain it -
%%// Create a cell array to store data
Arr = {[1 3 4 6 8];
[1 8 3];
[4 6 3 2];
[6 3 6 2 6 8]}
%%// Access element (3,4)
element = Arr{3}(4)
Outputs
Arr =
[1x5 double]
[1x3 double]
[1x4 double]
[1x6 double]
element =
2

Searching a cell array of vectors and returning indices

I have a 3000x1 cell array of vectors of different lengths and am looking for a way to search them all for a number and return the cell indices for the first and last occurrence of that number.
So my data looks like this:
[1]
[1 2]
[1 2]
[3]
[6 7 8 9]
etc
And I want my results to look like this when I search for the number 1:
ans = 1 3
All the indices (e.g. [1 2 3] for 1) would also work, though the above would be better. So far I'm unable to solve either problem.
I've tried
cellfun(@(x) x==1, positions, 'UniformOutput', 0)
This returns a logical array, effectively putting me back at square 1. I've tried using find(cellfun...) but this gives the error undefined function 'find' for input arguments of type 'cell'. Most of the help I can find is for searching for strings within a cell array. Do I need to convert all my vectors to strings for this to work?
C = {[1]
[1 2]
[1 2]
[3]
[6 7 8 9]}; %// example data
N = 1; %// sought number
ind = cellfun(@(v) any(v==N), C); %// gives 1 for cells which contain N
first = find(ind,1);
last = find(ind,1,'last');
result = [ first last ];
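The same logical mask also covers the other variant asked about, namely all matching cell indices, and makes it easy to detect a number that does not occur at all. A small sketch along those lines; allIdx, found and noHit are names invented for this example.

```matlab
C = {[1]; [1 2]; [1 2]; [3]; [6 7 8 9]};   %// example data
N = 1;                                     %// sought number

ind = cellfun(@(v) any(v==N), C);          %// logical mask, one entry per cell
allIdx = find(ind).';                      %// every matching cell index: [1 2 3]
found = any(ind);                          %// false if N occurs nowhere

noHit = find(cellfun(@(v) any(v==5), C));  %// 5 is absent, so this is empty
```

Checking found (or isempty on the result) before building [first last] avoids silently returning an empty result for a number that is not in the data.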