Unfortunately I have to work with a dataset of cell arrays, which don't even have uniform contents.
My dataset (the relevant columns of the cell array) looks as follows:
Data =
1 'd2'
1 'd3'
2 'u2'
2 'd2'
2 'u3'
3 'e2'
... ...
I want to reshape it so that all second-column entries of the rows sharing the same number in the first column are stored in new columns. Because the number of rows per first-column value varies (but is at most 4), I wrote the following code:
% creating 4 new cell arrays for the new columns
cells = cell(length(Data(:,1)),4);
Data = [Data,cells];
% reshaping Data
Data(:,3:6) = reshape(Data(Data(:,1) == 1,2),1,[]);
Data(:,3:6) = reshape(Data(Data(:,1) == 2,2),1,[]);
This would work perfectly with matrices. Unfortunately, it doesn't work on cell arrays!
Could you please help me figure out where I have to place the curly brackets so that it works? I haven't gotten it so far; maybe I'm just overlooking something! ;-)
Thank you a lot!
Personally, I find a loop to be the simplest and most flexible solution in this case:
mydata={1 'd2'
1 'd3'
2 'u2'
2 'd2'
2 'u3'
3 'e2'}
list = unique([mydata{:,1}])
result = {};
for t=1:numel(list)
count=0;
for u =1:size(mydata,1)
if mydata{u,1}==list(1,t)
count = count+1;
result(t,count)=mydata(u,2)
end
end
end
Note that a vectorized approach will likely be more efficient, but unless your data is big it should not matter much.
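On the original question of where the curly braces go, here is a minimal sketch (the variable names ids, row1 and padded are only illustrative). The comparison == is not defined for cells, so the numbers have to be pulled out with curly braces first, while parentheses keep the selected strings as a cell array:

```matlab
Data = {1 'd2'; 1 'd3'; 2 'u2'; 2 'd2'; 2 'u3'; 3 'e2'};

% Curly braces extract the contents, so ids is a plain numeric row vector
ids = [Data{:,1}];

% Parentheses index the cell array itself, so row1 is still a cell array
row1 = Data(ids == 1, 2).';      % 1-by-2 cell holding 'd2' and 'd3'

% Pad to the fixed width of 4 columns mentioned in the question
padded = cell(1, 4);
padded(1:numel(row1)) = row1;
```

Unused slots of padded simply stay empty ([]).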
This could be one approach that uses the masking capability of bsxfun -
%// Input
Data = {
1 'd2'
1 'd3'
2 'u2'
2 'd2'
2 'u3'
3 'e2'}
%// Find the IDs and the unique IDs
ids = cell2mat(Data(:,1))
id_out = num2cell([1:max(ids)]') %//'# To be used as the first col of desired o/p
%// Find the extents of each group/ID members
grp_extents = sum(bsxfun(@eq,[1:max(ids)],ids),1)
%// Or use accumarray which could be faster -
%// grp_extents = accumarray(ids,ones(1,numel(ids))).'
%// Get a cell array with the members (strings) from the second column of Data
%// put into specific columns based on their IDs
string_out = cell(max(grp_extents),numel(grp_extents))
string_out(bsxfun(@le,[1:max(grp_extents)]',grp_extents)) = Data(:,2) %//'# This is
%// where the masking is being used for logical indexing
%// Transpose the string cell array and horizontally concatenate with 1D
%// cell array containing the IDs to form the desired output
Data_out = [id_out string_out']
Output -
Data_out =
[1] 'd2' 'd3' []
[2] 'u2' 'd2' 'u3'
[3] 'e2' [] []
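The approach above relies on the IDs being sorted and running from 1 to max(ids) without gaps. As a hedged alternative, here is a sketch that groups via the third output of unique and accumarray, so gaps in the IDs are tolerated. The sample data with ID 5 and the names uids, grp, groups and maxLen are assumptions made for illustration:

```matlab
% Same layout as above, but the last ID is 5 to show a gap
Data = {1 'd2'; 1 'd3'; 2 'u2'; 2 'd2'; 2 'u3'; 5 'e2'};

ids = cell2mat(Data(:,1));
[uids, ~, grp] = unique(ids);     % grp maps every row to 1..numel(uids)

% For each group, collect the strings of its rows (in row order) in one cell
groups = accumarray(grp, (1:numel(grp)).', [], @(r) {Data(sort(r),2).'});

% Pad every group to the widest one and prepend the ID column
maxLen = max(cellfun(@numel, groups));
Data_out = cell(numel(uids), maxLen + 1);
Data_out(:,1) = num2cell(uids);
for k = 1:numel(groups)
    Data_out(k, 2:numel(groups{k})+1) = groups{k};
end
```

The short loop only does the padding; the grouping itself is done by accumarray.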
In Matlab, is there a way to concatenate a non-scalar structure without losing the empty fields? This is interfering with my ability to index within the structure.
I would prefer not to populate all of my "y" fields with NaN for memory management reasons, but I can do this if it is the only work around.
"code" is always fully populated and has no empty cells. "y" could be fully populated but usually is not.
I am providing a quick example: simplified structure (it is really tens of thousands of entries with 50+ fields)
% create example structure
x = struct('y',{1 [] 3 4},'code', {{'a'}, {'b'}, {'c'}, {'b'}});
% concatenate
out = [x.y];
% find indices with code 'b'
ind = find(strcmpi([x.code], 'b'));
% desired output
outSub = out(ind)
I would expect out to yield:
out = [1 NaN 3 4]
Instead I get:
out = [1 3 4]
When trying to use code to create an index to find the values in out that match the desired code value, this obviously does not work.
Error: Index exceeds the number of array elements (3).
The desired output would yield:
ind = [2 4];
outSub = [NaN 4]
I am fully open to indexing in a different way as well.
Using the comment above, here is the final solution:
% create example structure
x = struct('y',{1 [] 3 4},'code', {{'a'}, {'b'}, {'c'}, {'b'}});
% concatenate
out = {x.y};
% find indices with code 'b'
ind = find(strcmpi([x.code], 'b'));
% desired output - cell array
outSubCell = out(ind);
% substitute [] for NaN
outSubCell(cellfun('isempty',outSubCell)) = {NaN};
% convert output to double array
outSub = cell2mat(outSubCell)
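The empty field disappears in [x.y] because the expression expands to [1, [], 3, 4], and concatenating [] contributes no element. A variant of the solution above, sketched here with the illustrative names y and mask, fills the empties before concatenating so that the numeric vector stays aligned with x:

```matlab
x = struct('y',{1 [] 3 4},'code', {{'a'}, {'b'}, {'c'}, {'b'}});

% Collect the fields in a cell array so empty entries keep their slot
y = {x.y};
% Replace the empty entries by NaN, then concatenate into a numeric vector
y(cellfun('isempty', y)) = {NaN};
out = [y{:}];                       % same length as x

% Logical mask of the entries with code 'b'
mask = strcmpi([x.code], 'b');
outSub = out(mask);
```

This keeps only a temporary cell array in memory; the NaN values are never stored back in the structure itself.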
I have an 8x18 structure with each cell containing a column vector of occurrences of a single event. I want to obtain data from some of these fields concatenated in a single array, without having to loop through it. I can't seem to find a way to vertically concatenate the fields I am interested in into a single array.
As an example I create the following structure with between 1 and 5 occurrences per cell:
s(62).vector(8,18).heading.occurrences=[1;2;3];
for i=1:62
for j=1:8
for k=1:18
y=ceil(rand(1)*5);
s(i).vector(j,k).heading.occurrences=rand(y,1);
end
end
end
Now if I want to obtain all occurrences in several cells while keeping i constant, for instance at i=1, the following works:
ss=s(1).vector([1 26 45]);
h=[ss.heading];
cell2mat({h.occurrences}')
Now I want to do the same across s, for instance s([1 2 3]).vector([1 26 45]). How would that work? I have tried xx=s([1 2 3]), yy=xx.vector([1 26 45]), but this yields the error:
Expected one output from a curly brace or dot indexing expression, but there were 3 results.
Is this also possible with a vector operation?
Here's a vectorized solution that accommodates using index vectors for s and the field vector:
sIndex = [1 2 3]; % Indices for s
vIndex = [1 26 45]; % Indices for 'vector' field
v = reshape(cat(3, s(sIndex).vector), 144, []);
h = [v(vIndex, :).heading];
out = vertcat(h.occurrences);
It uses cat to concatenate all the vector fields into an 8-by-18-by-numel(sIndex) matrix, reshapes that into a 144-by-numel(sIndex) matrix, then indexes the rows specified by vIndex and collects their heading and occurrences fields, using vertcat instead of cell2mat.
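To check the ordering of the result, here is a small sketch that compares the vectorized version against explicit loops. The test data (each occurrences vector set to [i; j; k]) and the name ref are assumptions chosen so that the origin of every value is easy to see:

```matlab
% Build a small struct array with known contents
for i = 1:3
    for j = 1:8
        for k = 1:18
            s(i).vector(j,k).heading.occurrences = [i; j; k];
        end
    end
end

sIndex = [1 2 3];      % Indices for s
vIndex = [1 26 45];    % Linear indices into the 8-by-18 'vector' field

% Vectorized version
v = reshape(cat(3, s(sIndex).vector), 144, []);
h = [v(vIndex, :).heading];
out = vertcat(h.occurrences);

% Reference: loop over s first, then over the linear indices
ref = [];
for i = sIndex
    for n = vIndex
        ref = [ref; s(i).vector(n).heading.occurrences];
    end
end
assert(isequal(out, ref));
```

The results agree because v(vIndex, :) is traversed column by column, and each column corresponds to one element of s.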
It's difficult to vectorize the entire operation, but this should work.
% get vector field and store in cell array
s_new = { s(1:3).vector };
% now extract heading field, this is a cell-of-cells
s_new_heading = cellfun(@(x) { x.heading }', s_new, 'UniformOutput', false);
occurences = {};
for iCell = 1:length(s_new_heading)
% use current cell
cellHere = s_new_heading{iCell};
% retain indices of interest, these could be different for each cell
cellHere = cellHere([ 1 26 45 ]);
% extract occurrences
h = cellfun(@(x) x.occurrences, cellHere, 'UniformOutput', false);
h_mat = cell2mat(h);
% save them in cell array
occurences = cat(1, occurences, h_mat);
end
Suppose, we have a cell array consisting of ids and one attribute, e.g.
A{1,1}=[1 2;2 4]
A{1,2}=[2 3 5;8 5 6]
Now, I'd like to have a final output consisting of unique ids of two cells (first row values) and corresponding columns have attribute value of each cell separately.
i.e.
C =
[1] [ 2]
[2] [1x2 double] % 4 in first cell and 8 in second cell
[3] [ 5]
[5] [ 6]
it seems that it's not possible to use something like C=[unique(A{1,:}(1,:)')]. Any help is greatly appreciated.
Assuming that each cell has two rows and a variable number of columns, where the first row is the ID and the second row is an attribute, I'd consolidate all of the cells into a single 2D matrix and use accumarray. accumarray is very suitable here because you want to group values that belong to the same ID together and apply a function to them. In our case, our function will simply place the values in a cell array, and we'll make sure that the values are sorted because the values that accumarray groups per ID are not guaranteed to come into the function in any particular order.
Use cell2mat to convert the cells into a 2D matrix, transpose it so that it's compatible for accumarray, and use it. One thing I'll need to note is that should any IDs be missing, accumarray will make this slot empty. What I meant by missing is that in your example, the ID 4 is missing as there is a gap between 3 and 5 and also the ID 6 between 5 and 7 (I added the example in your comment to me). Because the largest ID in your data is 7, accumarray works by assigning outputs from ID 1 up to ID 7 in increments of 1. The last thing we would need to tackle is to eliminate any empty cells from the output of accumarray to complete the grouping.
BTW, I'm going to assume that your cell array consists of a single row of cells like your example.... so:
%// Setup
A{1,1}=[1 2;2 4];
A{1,2}=[2 3 5;8 5 6];
A{1,3}=[7;8];
%// Convert row of cell arrays to a single 2D matrix, then transpose for accumarray
B = cell2mat(A).';
%// Group IDs together and ensure they're sorted
out = accumarray(B(:,1), B(:,2), [], @(x) {sort(x)});
%// Add a column of IDs and concatenate with the previous output
IDs = num2cell((1:numel(out)).');
out = [IDs out];
%// Any cells from the grouping that are empty, eliminate
ind = cellfun(@isempty, out(:,2));
out(ind,:) = [];
We get:
out =
[1] [ 2]
[2] [2x1 double]
[3] [ 5]
[5] [ 6]
[7] [ 8]
>> celldisp(out(2,:))
ans{1} =
2
ans{2} =
4
8
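As a possible variation, the empty slots can be avoided from the start by grouping on the third output of unique instead of on the raw IDs. This is only a sketch; the names ids, grp and vals are illustrative:

```matlab
%// Setup
A{1,1}=[1 2;2 4];
A{1,2}=[2 3 5;8 5 6];
A{1,3}=[7;8];

%// Convert to a single 2D matrix with one (ID, attribute) pair per row
B = cell2mat(A).';

%// grp numbers the unique IDs consecutively, so no group is ever empty
[ids, ~, grp] = unique(B(:,1));
vals = accumarray(grp, B(:,2), [], @(x) {sort(x)});

%// Attach the actual IDs as the first column
out = [num2cell(ids) vals];
```

This also works when the IDs are large or sparse, since the output has one row per ID that actually occurs.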
If you'd like this done on a 2D cell array, where each row of this cell array represents a separate instance of the same problem, one suggestion I have is to perhaps loop over each row. Something like this, given your example in the comments:
%// Setup
A{1,1}=[1 2;2 4];
A{1,2}=[2 3 5;8 5 6];
A{1,3}=[7;8];
A{2,1}=[1 2;2 4];
A{2,2}=[1;7];
%// Make a cell array that will contain the output per row
out = cell(size(A,1),1);
for idx = 1 : size(A,1)
%// Convert row of cell arrays to a single 2D matrix, then transpose for accumarray
B = cell2mat(A(idx,:)).';
%// Group IDs together and ensure they're sorted
out{idx} = accumarray(B(:,1), B(:,2), [], @(x) {sort(x)});
%// Add a column of IDs and concatenate with the previous output
IDs = num2cell((1:numel(out{idx})).');
out{idx} = [IDs out{idx}];
%// Any cells from the grouping that are empty, eliminate
ind = cellfun(@isempty, out{idx}(:,2));
out{idx}(ind,:) = [];
end
We get:
>> out{1}
ans =
[1] [ 2]
[2] [2x1 double]
[3] [ 5]
[5] [ 6]
[7] [ 8]
>> out{2}
ans =
[1] [2x1 double]
[2] [ 4]
>> celldisp(out{1}(2,:))
ans{1} =
2
ans{2} =
4
8
>> celldisp(out{2}(1,:))
ans{1} =
1
ans{2} =
2
7
I wonder how to do this in MATLAB.
I have a={1;2;3} and would like to create a cell array
{{1,1};{1,2};{1,3};{2,1};{2,2};{2,3};{3,1};{3,2};{3,3}}.
How can I do this without a for loop?
You can use allcomb from MATLAB File-exchange to help you with this -
mat2cell(allcomb(a,a),ones(1,numel(a)^2),2)
Just for fun, using kron and repmat:
a = {1;2;3}
b = mat2cell([kron(cell2mat(a),ones(numel(a),1)) repmat(cell2mat(a),numel(a),1)], ones(numel(a)^2,1), 2)
Here square brackets [] are used to perform a concatenation of both column vectors, where each is defined either by kron or repmat.
This can be easily generalized, but I doubt this is the most efficient/fast solution.
Using repmat and mat2cell
A = {1;2;3};
T1 = repmat(A',[length(A) 1]);
T2 = repmat(A,[1 length(A)]);
C = mat2cell(cell2mat([T1(:),T2(:)]),ones(length(T1(:)),1),2);
You can use meshgrid to help create unique permutations of pairs of values in a by unrolling both matrix outputs of meshgrid such that they fit into a N x 2 matrix. Once you do this, you can determine the final result using mat2cell to create your 2D cell array. In other words:
a = {1;2;3};
[x,y] = meshgrid([a{:}], [a{:}]);
b = mat2cell([x(:) y(:)], ones(numel(a)*numel(a),1), 2);
b will contain your 2D cell array. To see what's going on at each step, this is what the output of the second line looks like. x and y are actually 2D matrices, but I'm going to unroll them and display what they both are in a matrix where I've concatenated both together:
>> disp([x(:) y(:)])
1 1
1 2
1 3
2 1
2 2
2 3
3 1
3 2
3 3
Concatenating both vectors together into a 2D matrix is important for the next line of code. This is a vital step in order to achieve what you want. After the second line of code, the goal will be to make each row of this concatenated matrix into an individual cell in a cell array, which is what mat2cell is doing in the end. By running this last line of code, then displaying the contents of b, this is what we get:
>> format compact;
>> celldisp(b)
b{1} =
1 1
b{2} =
1 2
b{3} =
1 3
b{4} =
2 1
b{5} =
2 2
b{6} =
2 3
b{7} =
3 1
b{8} =
3 2
b{9} =
3 3
b will be a 9 element cell array, and each cell holds a 1 x 2 numeric row vector that stores one row of the concatenated matrix.
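Applied to a numeric matrix, mat2cell puts a numeric 1 x 2 row into each cell, as the celldisp output shows. If the nested form from the question ({{1,1};{1,2};...}) is needed literally, one possible sketch is to convert each row with num2cell. The names pairs and nested are illustrative, and arrayfun is used here as an implicit loop:

```matlab
a = {1;2;3};
[x, y] = meshgrid([a{:}], [a{:}]);
pairs = [x(:) y(:)];               % 9-by-2 matrix of all pairs

% Turn every row into its own 1-by-2 cell array, e.g. [2 1] -> {2, 1}
nested = arrayfun(@(r) num2cell(pairs(r,:)), (1:size(pairs,1)).', ...
    'UniformOutput', false);
```

nested is then a 9-by-1 cell array whose elements are themselves 1-by-2 cell arrays.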
I've got 2 string cell arrays; one is the unique version of the other. I would like to count the number of occurrences of each value in the unique cell array given the other cell array. I have a large cell array, so I thought I'd try my best to find a faster approach as opposed to looping...
An example:
x = {'the'
'the'
'aaa'
'b'
'the'
'c'
'c'
'd'
'aaa'}
y=unique(x)
I am looking for an output in any form that contains something like the following:
'aaa' = 2
'b' = 1
'c' = 2
'd' = 1
'the' = 3
Any ideas?
One way is to count the indices unique finds:
[y, ~, idx] = unique(x);
counts = histc(idx, 1:length(y));
which gives
counts =
2
1
2
1
3
in the same order as y.
histc is my default fallback for counting things, but the function I always forget about is probably better in this case:
counts = accumarray(idx, 1);
should give the same result and is probably more efficient.
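Putting the pieces together, here is a short sketch that prints the counts in the format asked for in the question. Note that newer MATLAB releases document histc as not recommended in favor of histcounts, which is another reason to prefer the accumarray variant:

```matlab
x = {'the'; 'the'; 'aaa'; 'b'; 'the'; 'c'; 'c'; 'd'; 'aaa'};

% idx holds, for every element of x, its position in the sorted unique list y
[y, ~, idx] = unique(x);

% Add 1 for every occurrence of each index
counts = accumarray(idx, 1);

% Print each unique string next to its count
for k = 1:numel(y)
    fprintf('''%s'' = %d\n', y{k}, counts(k));
end
```

The loop is only used for display; the counting itself is done in the single accumarray call.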