Matlab: Get Fields of Structures Inside Cells, 2D Structure Array - matlab

I am hoping to have a multidimensional array of structures, but I can't seem to get at the fields of the contained elements. In code:
mySample = struct('a', zeros(numA),'b', zeros(numB));
Data = cell(height,width);
disp(Data(1,1).a);
The bottom line fails with an error such as
"Improper index matrix reference."
How is a 2D array of structures done in Matlab?

There are a couple of ways to create an array of structures ("structure array" or "struct array"). Note that in a struct array, each element must have the same fields. For example, if s(1) has fields "a" and "b", then s(2)..s(n) must have fields "a" and "b".
% num rows
n = 10;
% num cols
m = 50;
% method 1, which will repeat a structure
s = struct('field1', 10, 'field2', 20);
sArray = repmat(s, n, m);
% method 2, which initializes each field to empty []
sArray(n,m) = struct('field1', [], 'field2', []);
You can easily extend this beyond the second dimension:
sArray(n,m,p) = struct('field1', [], 'field2', []);
You could also preallocate the array and use a for-loop to set the value of each field. For other ways to fill or process fields, see:
help deal
help structfun
You could also create a cell array of structures, which provides more flexibility: each structure in the cell array may have different fields.
c = cell(1,2);
c{1} = struct('a', 1, 'b', 2);
c{2} = struct('z', 0, 'q', 5);
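To tie this back to the question, here is a minimal sketch of how a field is read from each kind of container. The sizes and values are made up for illustration; the key point is that a struct array is indexed with parentheses, while a cell array of structs is indexed with braces before the dot.

```matlab
% Struct array: index with parentheses, then use dot notation.
sArray = repmat(struct('field1', 10, 'field2', 20), 2, 3);
sArray(2,3).field1 = 99;     % change one element only
v1 = sArray(2,3).field1;     % read it back

% Cell array of structs: index with braces to get the struct itself.
c = cell(1,2);
c{1} = struct('a', 1, 'b', 2);
c{2} = struct('z', 0, 'q', 5);
v2 = c{2}.q;

% c(2).q would fail, because c(2) is a 1x1 cell, not a struct.
disp(v1);
disp(v2);
```

The failing line in the question has the same shape as c(2).q: it applies dot indexing to a cell rather than to the struct stored inside it.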

A 2D array of structures can be created in two ways:
Cell array of structs - a heterogeneous container. This means that each struct can have different fields.
x = {struct('a',1,'b',2), struct('c',3) ; struct(), struct('aa',[5 6])};
disp(x{1,2});
Array of structs - a homogeneous container. This means that all structs must have the same fields, which gives a form of type safety.
x = struct('a',{1 2 3 ; 1 2 3},'b', {4 5 6; 7 8 9 });
disp(x(1,2));

Related

Vectorized extracting a list from MATLAB Cell Array

I have a two-index MATLAB Cell Array (AllData{1:12,1:400}) where each element is a structure. I would like to extract a list of values from this structure.
For example, I would like to do something like this to obtain a list of 12 values from this structure
MaxList = AllData{1:12,1}.MaxVal;
This comes up with an error
Expected one output from a curly brace or dot indexing expression, but there were 12 results
I can do this as a loop, but would prefer to vectorize:
clear MaxList
for i=1:12
MaxList(i) = AllData{i,1}.MaxVal;
end
How do I vectorize this?
If all structs are scalar and have the same fields, it's better to avoid the cell array and directly use a struct array. For example,
clear AllData
AllData(1,1).MaxVal = 10;
AllData(1,2).MaxVal = 11;
AllData(2,1).MaxVal = 12;
AllData(2,2).MaxVal = 13;
[AllData(:).OtherField] = deal('abc');
defines a 2×2 struct array. Then, what you want can be done simply as
result = [AllData(:,1).MaxVal];
If you really need a cell array of scalar structs, such as
clear AllData
AllData{1,1} = struct('MaxVal', 10, 'OtherField', 'abc');
AllData{1,2} = struct('MaxVal', 11, 'OtherField', 'abc');
AllData{2,1} = struct('MaxVal', 12, 'OtherField', 'abc');
AllData{2,2} = struct('MaxVal', 13, 'OtherField', 'abc');
you can use these two steps:
tmp = [AllData{:,1}];
result = [tmp.MaxVal];
Using the answer above as a starting point, it is also possible to extract a 2d array of vectors from the Cell Array Structure. In each element of the 2d AllData cell array is a 2048 element vector called DataSet. The following commands will extract all of these vectors to a 2d array:
tmp = [AllData{:,1}];
len = length(tmp(1).DataSet); % Gets the length of one vector of DataSet
tmp2 = [tmp.DataSet]; % Extracts all vectors to a large 1-d array
AllDataSets = reshape(tmp2,len,[])'; % Reshapes into a 2d array of vectors
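As a small check of the reshape step, the sketch below runs the same commands on made-up data: a 3x1 cell array whose structs each hold a 1x4 row vector instead of 2048 elements. It assumes every DataSet is a row vector of the same length.

```matlab
% Hypothetical data: three structs, each with a 1x4 row vector DataSet.
AllData = cell(3,1);
for k = 1:3
    AllData{k,1} = struct('DataSet', (1:4) + 10*k);
end

tmp  = [AllData{:,1}];                   % 1x3 struct array
len  = length(tmp(1).DataSet);           % length of one DataSet vector
tmp2 = [tmp.DataSet];                    % all vectors joined into one row
AllDataSets = reshape(tmp2, len, [])';   % one DataSet per row

disp(size(AllDataSets));
disp(AllDataSets(2,:));
```

reshape fills column by column, so each column of the intermediate matrix is one DataSet; the transpose then puts each DataSet on its own row.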

cellfun with two arrays of indices

I have one big cell with N by 1 dimension. Each row is either a string or a double. A string is a variable name and the sequential doubles are its values until the next string (another variable name). For example:
data = {
var_name1;
val1;
val2;
val3;
val4;
val5;
var_name2;
val1;
val2;
var_name3;
val1;
val2;
val3;
val4;
val5;
val6;
val7}
and so on. I want to separate the data cell into three cells: {var_name and its 5 values}, {var_name and its 2 values}, {var_name and its 7 values}. I try to avoid loops as much as possible and have found that vectorization along with cellfun works really well. Is it possible? The data cell has close to a million rows.
I believe the following should do what you're after. The main pieces are to use cumsum to work out which name each row corresponds to, and then accumarray to build up lists per name.
% Make some data
data = {'a'; 1; 2; 3;
'b'; 4; 5;
'c'; 6; 7; 8; 9;
'd';
'e'; 10; 11; 12};
% Which elements are the names?
isName = cellfun(@ischar, data);
% Use CUMSUM to work out for each row, which name it corresponds to
whichName = cumsum(isName);
% Pick out only the values from 'data', and filter 'whichName'
% for just the values
justVals = data(~isName);
whichName = whichName(~isName);
% Use ACCUMARRAY to build up lists per name. Note that the function
% used by ACCUMARRAY must return something scalar from a column of
% values, so we return a scalar cell containing a row-vector
% of those values
listPerName = accumarray(whichName, cell2mat(justVals), [], @(x) {x.'});
% All that remains is to prepend the name to each cell. This ends
% up with each row of output being a cell like {'a', [1 2 3]}.
% It's simple to make the output be {'a', 1, 2, 3} by adding
% a call to NUM2CELL on 'v' in the anonymous function.
nameAndVals = cellfun(@(n, v) [{n}, v], data(isName), listPerName, ...
'UniformOutput', false);
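The NUM2CELL variant mentioned in the comments can be sketched as follows. This is a shortened version of the example data in which every name has at least one value; the variable name flat is introduced here only for illustration.

```matlab
% Shortened example data: every name is followed by at least one value.
data = {'a'; 1; 2; 3; 'b'; 4; 5};

isName    = cellfun(@ischar, data);
whichName = cumsum(isName);
justVals  = data(~isName);
whichName = whichName(~isName);
listPerName = accumarray(whichName, cell2mat(justVals), [], @(x) {x.'});

% NUM2CELL turns the row vector v into separate cells, so the
% name and each value become individual elements of one cell row.
flat = cellfun(@(n, v) [{n}, num2cell(v)], data(isName), listPerName, ...
    'UniformOutput', false);

disp(flat{1});
disp(flat{2});
```

With this input, flat{1} is {'a', 1, 2, 3} and flat{2} is {'b', 4, 5}.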
cellfun is for applying a function to each element of a cell.
When you pass multiple arguments to cellfun like that, it takes the ith argument of data, indx_first, and indx_last, and uses each of them in the anonymous function. Substituting those variables in, your function evaluates to x(y : z), for each element x in data. In other words, you're doing data{i}(y : z), i.e., indexing the actual elements of the cell array, rather than indexing the cell array itself. I don't think that's what you want. Really you want data{y : z}, for each (y, z) pair given by corresponding elements in indx_first and indx_last, right?
If that's indeed the case, I don't see a vectorized way to solve your problem, because each of the "variables" has different size. But you do know how many variables you have, which is the size of indx_first. So I'd pre-allocate and then loop, like so:
vars = cell(length(indx_first), 2);
for i = 1:size(vars, 1)
    vars{i, 1} = data{indx_first(i) - 1}; % store variable name in first column
    vars{i, 2} = [data{indx_first(i) : indx_last(i)}]; % store data in second column
end
At the end of this, you'll have a cell array with 2 columns. The first column in each row is the name of the variable. The second is the actual data. I.e.
{'var_name1', [val1 val2 val3 val4 val5];
'var_name2', [val1 val2];
.
.
.

Logical index of structure with various dimensioned fields

Let's say I have a structure like this:
S.index = 1:10;
S.testMatrix = zeros(3,3,10);
for x = 1:10
S.testMatrix(:,:,x) = magic(3) + x;
end
S.other = reshape(0:39, 4, 10);
It contains a 1x10 vector, a 3x3x10 multi-paged array and a 4x10 matrix. Now say I want to select only the entries whose index lies strictly between 2 and 8:
mask = S.index > 2 & S.index < 8;
I first tried structfun(@(x) x(mask), S, 'UniformOutput', 0); which worked correctly only for the vector, which makes perfect sense. So then I figured all I needed to do was expand my mask. So I did this:
test = structfun(@(x) x(repmat(mask, size(x, ndims(x) - 1), 1)), S, 'UniformOutput',0);
The expanded mask was correct for the matrix but not the multi-paged array. And the 2D matrix was flattened to a vector.
If I was going to index these elements individually I would do something like this:
S2.index = S.index(mask);
S2.other = S.other(:,mask);
S2.testMatrix = S.testMatrix(:,:,mask);
My use case is hundreds of structures, each with 20+ fields. How do I script the indexing? The exact problem is limited to structures with 1xN vectors, 3xN and 4xN matrices and 3x3xN arrays. The mask is constructed from one of the vectors, which represents time. The field names are the same for each structure, so I could brute force it by typing in the commands and running them as a function, but I'm looking for an intelligent way to index it.
Update: Here is something that looks promising.
fn = fieldnames(S);
for x = 1:length(fn)
extraDim = repmat({':'}, 1, ndims(S.(fn{x})) - 1);
S2.(fn{x}) = S.(fn{x})(extraDim{:}, mask);
end
You can exploit the fact that the string ':' can be used as an index instead of :, and build a comma-separated list of that string repeated the appropriate number of times for each field:
s = {':',':'}; % auxiliary cell array to generate the comma-separated list
S2 = structfun(@(f) f(s{1:ndims(f)-1}, mask), S, 'UniformOutput', false);
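One way to convince yourself that this works is to compare it with the field-by-field indexing from the question. The sketch below rebuilds the example structure and checks all three fields; the variable sameAsManual is introduced here only for the comparison. Note that the two-element s covers fields with up to three dimensions, which matches the cases described in the question.

```matlab
% Rebuild the example structure from the question.
S.index = 1:10;
S.testMatrix = zeros(3,3,10);
for x = 1:10
    S.testMatrix(:,:,x) = magic(3) + x;
end
S.other = reshape(0:39, 4, 10);
mask = S.index > 2 & S.index < 8;

% Apply the mask along the last dimension of every field.
s = {':',':'};
S2 = structfun(@(f) f(s{1:ndims(f)-1}, mask), S, 'UniformOutput', false);

% True when the result matches indexing each field by hand.
sameAsManual = isequal(S2.index, S.index(mask)) && ...
    isequal(S2.other, S.other(:,mask)) && ...
    isequal(S2.testMatrix, S.testMatrix(:,:,mask));
disp(sameAsManual);
```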

Sort a structure of arrays in Matlab

I have a structure of arrays StockInfo in Matlab. The fields of the structure StockInfo are as follows:
StockInfo =
Name: {10x1 cell}
Values: [10x6 double]
Return: [10x1 double]
I need to sort StockInfo based on the field Return, so that each array in the struct is sorted accordingly. Any idea how to do it?
As I mentioned in the comment above, your question is unclear. I think you are confusing structures and structure arrays. This post might be of help.
That said, here is an example to show what I think you meant to do.
First I create a structure array with some random data:
% cell array of 10 names
names = arrayfun(@(k) randsample(['A':'Z' 'a':'z' '0':'9'],k), ...
randi([5 10],[10 1]), 'UniformOutput',false);
% 10x6 matrix of values
values = rand(10,6);
% 10x1 vector of values
returns = randn(10,1);
% 10x1 structure array
StockInfo = struct('Name',names, 'Values',num2cell(values,2), ...
'Return',num2cell(returns));
The created variable is an array of structures:
>> StockInfo
StockInfo =
10x1 struct array with fields:
Name
Values
Return
where each element is a structure with the following fields:
>> StockInfo(1)
ans =
Name: 'Pr3N4LTEi'
Values: [0.7342 0.1806 0.7458 0.8044 0.6838 0.1069]
Return: -0.3818
Next, we can sort this struct array by the "Return" field (each struct has a corresponding scalar value):
[~,ord] = sort([StockInfo.Return]);
StockInfo = StockInfo(ord);
The result is that the array is now sorted by the "return" values in ascending order:
>> [StockInfo.Return]
ans =
Columns 1 through 8
-0.8999 -0.3818 -0.2991 -0.1871 0.0675 0.2917 0.3929 0.4289
Columns 9 through 10
0.6347 0.9877
You can sort structure arrays based on fields with the File Exchange function nestedSortStruct.
B = nestedSortStruct(A, 'Return');
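A solution with built-in functions only, for a scalar struct whose fields are arrays (as in the question), could be:
[~, ix] = sort(StockInfo.Return);
StockInfo = struct(...
'Name', {StockInfo.Name(ix)}, ... % wrap in {} so struct() stores the cell array as one field value
'Values', StockInfo.Values(ix,:), ... % reorder whole rows of the 10x6 matrix
'Return', StockInfo.Return(ix));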
Replace ~ with any unused identifier if your Matlab is older and does not support unused output arguments.

Sorting in Matlab

I would like to sort elements in a comma-separated list. The elements in the list are structs and I would like the list to be sorted according to one of the fields in the struct.
For example, given the following code:
L = {struct('obs', [1 2 3 4], 'n', 4), struct('obs', [6 7 5 3], 'n', 2)};
I would want to have a way to sort L by the field 'n'. Matlab's sort function only works on matrices or arrays and on lists of strings (not even lists of numbers).
Any ideas on how that may be achieved?
Thanks,
Micha
I suggest you do this in three steps: Extract 'n' into an array, sort the array and consequently reorder the elements of the cell array.
%# get the n's
nList = cellfun(@(x)x.n,L);
%# sort the n's and capture the reordering in sortIdx
[sortedN,sortIdx] = sort(nList);
%# use the sortIdx to sort L
sortedL = L(sortIdx)
This is a bit of an aside, but if all of the structures in your cell array L have the same fields (obs and n in this case), then it would make more sense to store L as a 1-by-N structure array instead of a 1-by-N cell array of 1-by-1 structures.
To convert the 1-by-N cell array of structures to a 1-by-N structure array, you can do the following:
L = [L{:}];
Or, you can create the structure array directly using one call to STRUCT instead of creating the cell array of structures as you did in your example:
L = struct('obs',{[1 2 3 4],[6 7 5 3]},'n',{4,2});
Now the solution from Jonas becomes even simpler:
[junk,sortIndex] = sort([L.n]); %# Collect field n into an array and sort it
sortedL = L(sortIndex); %# Apply the sort to L
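The same pattern extends to other orderings. As a minimal sketch, assuming a third made-up element is added to the example, sorting by n in descending order only needs the 'descend' option of sort:

```matlab
% Struct array with three elements; the third one is invented for illustration.
L = struct('obs', {[1 2 3 4], [6 7 5 3], [9 8]}, 'n', {4, 2, 7});

% Collect field n into a numeric array and sort it from largest to smallest.
[~, sortIndex] = sort([L.n], 'descend');
sortedDesc = L(sortIndex);

disp([sortedDesc.n]);
```

Here [sortedDesc.n] comes out as [7 4 2], and each obs vector stays attached to its own n.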
For what it's worth, here is the solution in Python:
L = [{'n': 4, 'obs': [1, 2, 3, 4]}, {'n': 2, 'obs': [6, 7, 5, 3]}]
L.sort(key=lambda d: d['n'])
# L is now sorted as you wanted