Accessing data in structures without loops - matlab

I have a set of strings vals, for example:
vals = {'AD', 'BC'}
I also have a struct info, inside of which are structs nested in fields corresponding to the elements in the array vals (that would be 'AD' and 'BC' in this example), each in turn storing a number in a field named lastcontract.
I can use a for loop to extract lastcontract for each of the vals like this:
for index = 1:length(vals)
info.(vals{index}).lastcontract
end
I'd like to find a way of doing this without a loop if at all possible, but I'm not having luck. I tried:
info.(vals{1:2}).lastcontract
without success. I think arrayfun may be the appropriate way, but I can't figure out the right syntax.

It is actually possible here to manage without an explicit loop (or arrayfun/cellfun):
C = struct2cell(info); %// Convert to cell array
idx = ismember(fieldnames(info), vals); %// Find fields
C = [C{idx}]; %// Flatten to structure array
result = [C.lastcontract]; %// Extract values
P.S.
cellfun would be more appropriate here than arrayfun, because you iterate over vals (a cell array). For the sake of practice, here's a solution with cellfun:
result = cellfun(@(x)info.(x).lastcontract, vals);
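As a self-contained check of the two variants above, here is a minimal sketch. The contents of info (the values 10, 20, 30 and the extra field EF) are made up for illustration. It also shows one difference that may matter: the ismember variant returns values in the field order of info, while the cellfun variant follows the order of vals.

```matlab
% Hypothetical sample data; the field values are made up for illustration.
info = struct();
info.AD.lastcontract = 10;
info.BC.lastcontract = 20;
info.EF.lastcontract = 30;   % extra field that is not requested
vals = {'BC', 'AD'};         % deliberately reversed relative to the field order

% Variant 1: struct2cell + ismember (result follows the field order of info)
C = struct2cell(info);
idx = ismember(fieldnames(info), vals);
S = [C{idx}];                          % 1x2 struct array
resultByField = [S.lastcontract];      % values for AD, then BC

% Variant 2: cellfun (result follows the order of vals)
resultByVals = cellfun(@(x) info.(x).lastcontract, vals);   % values for BC, then AD
```

If the order of vals has to be preserved, the cellfun form is probably the safer choice.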

Related

Matlab - Using two index arrays to operate on a sub-matrix

I'm trying to figure out how to access Matlab sub-arrays (part of an array) with a generic set of subscript vectors.
In general, the problem is defined as:
Given two n-dim endpoints of an array index (both size nd), one having the initial set of indices (startInd) and the other having the last set of indices (endInd), how to access the sub-matrix which is included between the pair of index-sets?
For example, I want to replace this:
Mat=rand(10,10,10,10);
Mat(2:7, 1:6, 1:6, 2:8) = 1.0;
With an operation that can accept any set of two n-dim vectors specifying the indices for the last operation, which is "abstractly" expressed as:
Mat=rand(10,10,10,10);
startInd=[2 1 1 2];
endInd =[7 6 6 8];
IndexVar=???
Mat(IndexVar) = 1.0;
Thus I want to access the sub-matrix Mat(2:7, 1:6, 1:6, 2:8) using a variable or some other generic form that allows a generic n-dim. Preferably not a loop (as it is slow).
I have tried using something of this nature:
% Generate each index list separately:
nDims=length(startInd);
ind=cell(nDims,1);
for j=1:nDims
ind{j}=startInd(j):1:endInd(j);
end
% Access the matrix:
S.type = '()';
S.subs = ind;
Mat=subsasgn(Mat,S,1.0)
This seems to get the job done, but it is very slow and memory-expensive. Still, it might give someone an idea...
If you don't mind looping over dimensions (which should be much faster than looping over array entries):
indexVar = arrayfun(@(a,b) colon(a,b), startInd, endInd, 'UniformOutput', false);
Mat(indexVar{:}) = 1;
This uses arrayfun (essentially a loop) to create a cell array with the indexing vectors, which is then expanded into a comma-separated list.
Now that I see your code: this is the same approach, except that arrayfun replaces the loop and the comma-separated list allows a more natural indexing syntax than subsasgn.
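To check that the generic indexing really matches the hard-coded assignment from the question, a short sketch like the following can be run. The variable names Ref and same are introduced here for illustration only.

```matlab
Mat = rand(10,10,10,10);
Ref = Mat;
Ref(2:7, 1:6, 1:6, 2:8) = 1.0;     % hard-coded version from the question

startInd = [2 1 1 2];
endInd   = [7 6 6 8];

% One index vector per dimension, collected in a cell array
indexVar = arrayfun(@(a,b) a:b, startInd, endInd, 'UniformOutput', false);

% indexVar{:} expands to a comma-separated list: Mat(2:7, 1:6, 1:6, 2:8)
Mat(indexVar{:}) = 1.0;

same = isequal(Mat, Ref);          % true if both assignments touched the same block
```

The loop here runs over the number of dimensions (four iterations), not over the array entries, so its cost should be negligible next to the assignment itself.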

Do I always need to use a cell array to assign multiple values to a struct array?

I've got a nested struct array like
A(1).B(1).var1 = 1;
A(1).B(2).var1 = 2;
Now I want to replace the values of var1 with the elements of the vector x = [3; 4], one element per entry of B.
The result should be
A(1).B(1).var1 = 3;
A(1).B(2).var1 = 4;
I have tried
% Error : Scalar structure required for this assignment.
A(1).B.var1 = x;
% Error : Insufficient number of outputs from right hand side of equal sign to satisfy assignment.
[A(1).B.var1] = x(:);
Curiously, if x is a cell array, the second syntax works
x = {3, 4};
[A(1).B.var1] = x{:};
Luckily, it's not too complicated to convert my numeric vector to a cell array using mat2cell, but is that the only way to do this assignment without a for loop?
What's the correct syntax for multiple assignment to a nested struct array? Can I use numeric vectors or do I have to use cell arrays?
The statement
[A(1).B.var1] = x{:};
is shorthand for
[A(1).B.var1] = deal(x{:});
(see the documentation for deal).
Thus you can also write
[A(1).B.var1] = deal(3,4);
I'm not aware of any other way to assign different values to a field in a struct array in a single command.
If your values are in a numeric array, you can easily convert it to a cell array using num2cell (which is simpler than the mat2cell you found).
data = [3,4];
tmp = num2cell(data);
[A(1).B.var1] = tmp{:};
In general, struct arrays are rather awkward to use for cases like this. If you can, I would recommend that you store your data in normal numeric arrays, which make it easier to manipulate many elements at the same time. If you insist on using a struct array (which is convenient for certain situations), simply use a for loop:
data = [3,4];
for ii = 1:length(A(1).B)
A(1).B(ii).var1 = data(ii);
end
The other alternative is to use table.
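Putting the num2cell route together with the column vector from the question, a minimal sketch might look like this. The variable updated is only there to read the result back.

```matlab
% Rebuild the nested struct array from the question
A = struct();
A(1).B(1).var1 = 1;
A(1).B(2).var1 = 2;

x = [3; 4];               % numeric column vector, as in the question

% num2cell wraps each element in its own cell, so tmp{:} yields one
% output per element of B
tmp = num2cell(x);
[A(1).B.var1] = tmp{:};

updated = [A(1).B.var1];  % read the values back as a numeric row vector
```

This assumes that x has exactly as many elements as A(1).B; otherwise the assignment should fail with an error about the number of outputs.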

Matlab find the smallest number within variables

Say I have 10 variables, each holding a different number (num1 = 0.4123, num2 = 0.6223, num3, etc.). How can I find the variable that holds the smallest number? Do I need to use a recursive loop, or is there an easier way to do it?
Having a bunch of numbers delineated by separate variable names is awful. I would recommend doing what @LuisMendo suggested and place them all into an array.
However, if you don't want to punch in all of those numbers into a single array, you can cheat and save all variables that start with num, then load it back into MATLAB inside a struct. Convert the struct into an array by converting it into a cell array, and then a numeric array.
Once you do that, use the min call that he was talking about. In other words:
save('temp.mat', 'num*'); %// Save all variables with num from workspace to file
s = load('temp.mat'); %// Reload back in as a structure
vals = cell2mat(struct2cell(s)); %// Convert from structure to numeric array
[~,idx] = min(vals); %// Find the index of the minimum value
f = fieldnames(s); %// Get all of the variable names
disp(f{idx}); %// Display the variable that has the minimum
idx is the position of the minimum within vals. If you want to display the actual name of the variable that holds the minimum, you can use fieldnames to retrieve a list of all of the variables from the structure, then index into that list with idx.
Or, if you can bear the typing, just do this:
vals = [num1 num2 num3 num4 num5 num6 num7 num8 num9 num10];
[~,idx] = min(vals);
disp(['num' num2str(idx)]);
The first way would be preferable if you have a lot of variables that you want to find the min of. If this is the case, you should consider placing all of the values in an array and reformulate your code. Having a lot of variables declared in your code makes it unmanageable.
It would be much better to have all those variables as entries of a numeric vector:
num = [0.4123 0.6223];
Then you would use
[~, result] = min(num);
to get the index of the minimum element.
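If the labels still matter once the values live in a vector, one option is to keep the names in a parallel cell array. This is only a sketch: the cell array names and the third value are made up for illustration.

```matlab
names = {'num1', 'num2', 'num3'};   % labels kept alongside the values
num   = [0.4123 0.6223 0.1500];     % the third value is made up

[minVal, idx] = min(num);           % smallest value and its position
minName = names{idx};               % label that belongs to the smallest value
```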

Substrings from a Cell Array in Matlab

I'm new to MATLAB and I'm struggling to comprehend the subtleties between array-wise and element-wise operations. I'm working with a large dataset and I've found the simplest methods aren't always the fastest. I have a very large cell array of strings, like in this simplified example:
% A vertical array of same-length strings
CellArrayOfStrings = {'aaa123'; 'bbb123'; 'ccc123'; 'ddd123'};
I'm trying to extract an array of substrings, for example:
'a1'
'b1'
'c1'
'd1'
I'm happy enough with an element-wise reference like this:
% Simple element-wise substring operation
MySubString = CellArrayOfStrings{2}(3:4); % Expected result is 'b1'
But I can't work out the notation to reference them all in one go, like this:
% Desired result is 'a1','b1','c1','d1'
MyArrayOfSubStrings = CellArrayOfStrings{:}(3:4); % Incorrect notation!
I know that Matlab is capable of performing very fast array-wise operations, such as strcat, so I was hoping for a technique that works at a similar speed:
% An array-wise operation which works quickly
tic
speedTest = strcat(CellArrayOfStrings,'hello');
toc % About 2 seconds on my machine with >500K array elements
All the for loops and functions which use behind-the-scenes iteration I have tried run too slowly with my dataset. Is there some array-wise notation that would do this? Would somebody be able to correct my understanding of element-wise and array-wise operations?! Many thanks!
I can't work out the notation to reference them all in one go, like this:
MyArrayOfSubStrings = CellArrayOfStrings{:}(3:4); % Incorrect notation!
This is because curly braces ({}) return a comma-separated list, which is equivalent to writing the contents of these cells in the following way:
c{1}, c{2}, and so on....
When the subscript index refers to only one element, MATLAB's syntax allows parentheses (()) after the curly braces to further extract a sub-array (a substring in your case). However, this syntax is prohibited when the comma-separated list contains multiple items.
So what are the alternatives?
Use a for loop:
MyArrayOfSubStrings = char(zeros(numel(CellArrayOfStrings), 2));
for k = 1:size(MyArrayOfSubStrings, 1)
MyArrayOfSubStrings(k, :) = CellArrayOfStrings{k}(3:4);
end
Use cellfun (a slight variant of Dang Khoa's answer):
MyArrayOfSubStrings = cellfun(@(x){x(3:4)}, CellArrayOfStrings);
MyArrayOfSubStrings = vertcat(MyArrayOfSubStrings{:});
If your original cell array contains strings of a fixed length, you can follow Dan's suggestion and convert the cell array into an array of strings (a matrix of characters), then extract the desired columns:
MyArrayOfSubStrings =vertcat(CellArrayOfStrings{:});
MyArrayOfSubStrings = MyArrayOfSubStrings(:, 3:4);
Employ more complicated methods, such as regular expressions:
MyArrayOfSubStrings = regexprep(CellArrayOfStrings, '^..(..).*', '$1');
MyArrayOfSubStrings = vertcat(MyArrayOfSubStrings{:});
There are plenty of solutions to pick from; just pick the one that fits you best :) I think that with MATLAB's JIT acceleration, a simple loop would be sufficient in most cases.
Also note that in all my suggestions the resulting cell array of substrings is converted into an array of strings (a matrix). This is just for the sake of the example; obviously you can keep the substrings stored in a cell array, should you decide so.
cellfun operates on every element of a cell array, so you could do something like this:
>> CellArrayOfStrings = {'aaa123'; 'bbb123'; 'ccc123'; 'ddd123'};
>> MyArrayofSubstrings = cellfun(@(str) str(3:4), CellArrayOfStrings, 'UniformOutput', false)
MyArrayofSubstrings =
'a1'
'b1'
'c1'
'd1'
If you wanted a matrix of strings instead of a cell array whose elements are the strings, use char on MyArrayofSubstrings. Here every substring has the same length, so the rows line up exactly; with strings of unequal length, char pads the shorter rows with trailing spaces.
You can do this:
C = {'aaa123'; 'bbb123'; 'ccc123'; 'ddd123'}
t = reshape([C{:}], 6, [])'
t(:, 3:4)
But only if your strings are all of equal length I'm afraid.
You can use char to convert them to a character array, do the indexing and convert it back to cell array
A = char(CellArrayOfStrings);
B = cellstr(A(:,3:4));
Note that if strings are of different lengths, char pads them with spaces at the end to create the array. Therefore if you index for a column that is beyond the length of one of the short strings you may receive some space characters.
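The padding behaviour is easy to see with strings of unequal length. The following sketch uses made-up sample strings and the illustrative variable name raw:

```matlab
C = {'aaa123'; 'bb'; 'ccc1'};   % made-up strings of unequal length

A = char(C);          % 3x6 character matrix; short rows are padded with spaces
raw = A(:, 3:4);      % second row consists of two padding spaces

B = cellstr(raw);     % cellstr strips trailing whitespace, so B{2} is empty
```

So with this approach an empty entry in B signals that the requested columns lay beyond the end of the original string.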

Is there a splat operator (or equivalent) in Matlab?

If I have an array (of unknown length until runtime), is there a way to call a function with each element of the array as a separate parameter?
Like so:
foo = @(varargin) sum(cell2mat(varargin));
bar = [3,4,5];
foo(*bar) == foo(3,4,5)
Context: I have a list of indices to an n-d array, Q. What I want is something like Q(a,b,:), but I only have [a,b]. Since I don't know n, I can't just hard-code the indexing.
There is no operator in MATLAB that will do that. However, if your indices (i.e. bar in your example) were stored in a cell array, then you could do this:
bar = {3,4,5}; %# Cell array instead of standard array
foo(bar{:}); %# Pass the contents of each cell as a separate argument
The {:} creates a comma-separated list from a cell array. That's probably the closest thing you can get to the "operator" form you have in your example, aside from overriding one of the existing operators (illustrated here and here) so that it generates a comma-separated list from a standard array, or creating your own class to store your indices and defining how the existing operators operate for it (neither option for the faint of heart!).
For your specific example of indexing an arbitrary N-D array, you could also compute a linear index from your subscripted indices using the sub2ind function (as detailed here and here), but you might end up doing more work than you would for my comma-separated list solution above. Another alternative is to compute the linear index yourself, which would sidestep converting to a cell array and use only matrix/vector operations. Here's an example:
% Precompute these somewhere:
scale = cumprod(size(Q)).'; %'
scale = [1; scale(1:end-1)];
shift = [0 ones(1, ndims(Q)-1)];
% Then compute a linear index like this:
indices = [3 4 5];
linearIndex = (indices-shift)*scale;
Q(linearIndex) % Equivalent to Q(3,4,5)
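For a quick comparison of the approaches, here is a self-contained sketch on a small made-up array. The array Q, the index values, and the names viaCell, viaSub2ind, viaScale, ab and slice are assumptions for illustration. num2cell turns the numeric index vector into a cell array, which can then be expanded with {:}.

```matlab
Q = reshape(1:60, [3 4 5]);     % made-up 3-D array; Q(k) == k in linear order
indices = [2 3 4];

% Comma-separated list: num2cell + {:} acts like a "splat"
c = num2cell(indices);
viaCell = Q(c{:});              % same as Q(2,3,4)

% sub2ind accepts the same expanded list
viaSub2ind = Q(sub2ind(size(Q), c{:}));

% Manual linear index, as in the answer above
scale = cumprod(size(Q)).';
scale = [1; scale(1:end-1)];
shift = [0 ones(1, ndims(Q)-1)];
viaScale = Q((indices - shift) * scale);

% The case from the question: only [a,b] is known, the rest is ':'
ab = num2cell([2 3]);
slice = Q(ab{:}, :);            % same as Q(2,3,:)
```

All three lookups should return the same element, and the last line covers the Q(a,b,:) case from the question without hard-coding the number of leading indices.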