Accessing multiple structure fields in MATLAB without looping through it - matlab

I have an 8x18 structure with each cell containing a column vector of occurrences of a single event. I want to obtain data from some of these fields concatenated in a single array, without having to loop through it. I can't seem to find a way to vertically concatenate the fields I am interested in into a single array.
As an example I create the following structure with between 1 and 5 occurrences per cell:
s(62).vector(8,18).heading.occurrences=[1;2;3];
for i=1:62
for j=1:8
for k=1:18
y=ceil(rand(1)*5);
s(i).vector(j,k).heading.occurrences=rand(y,1);
end
end
end
Now, if I want to obtain all occurrences in several cells while keeping i constant, for instance at i=1, the following works:
ss=s(1).vector([1 26 45]);
h=[ss.heading];
cell2mat({h.occurrences}')
Now I want to do the same across s, for instance s([1 2 3]).vector([1 26 45]). How would that work? I have tried xx=s([1 2 3]), yy=xx.vector([1 26 45]), but this yields the error:
Expected one output from a curly brace or dot indexing expression, but there were 3 results.
Is this also possible with a vector operation?

Here's a vectorized solution that accommodates using index vectors for s and the field vector:
sIndex = [1 2 3]; % Indices for s
vIndex = [1 26 45]; % Indices for 'vector' field
v = reshape(cat(3, s(sIndex).vector), 144, []);
h = [v(vIndex, :).heading];
out = vertcat(h.occurrences);
It uses cat to concatenate all the vector fields into an 8-by-18-by-numel(sIndex) matrix, reshapes that into a 144-by-numel(sIndex) matrix, then indexes the rows specified by vIndex and collects their heading and occurrences fields, using vertcat instead of cell2mat.
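As a quick sanity check of the approach above, the following sketch builds a small deterministic structure with the same 8-by-18 layout and compares the vectorized result against a plain double loop. The test data and the variable name ref are assumptions made for illustration; reshape is called with [] so the row count is not hard-coded.

```matlab
% Illustrative data: 3 elements of s, each with an 8x18 'vector' field
for i = 1:3
    for j = 1:8
        for k = 1:18
            n = mod(i + j + k, 3) + 1;                 % 1 to 3 occurrences
            s(i).vector(j,k).heading.occurrences = (1:n).' + 100*i;
        end
    end
end

sIndex = [1 2 3];        % indices into s
vIndex = [1 26 45];      % linear indices into each 8x18 'vector' field

% Vectorized extraction, as in the answer above
v = reshape(cat(3, s(sIndex).vector), [], numel(sIndex));
h = [v(vIndex, :).heading];
out = vertcat(h.occurrences);

% Reference result from an explicit loop (s index outer, vector index inner)
ref = [];
for i = sIndex
    for idx = vIndex
        ref = [ref; s(i).vector(idx).heading.occurrences];
    end
end

sameResult = isequal(out, ref);
```

The comma-separated list v(vIndex, :).heading is produced in column-major order, which is why the loop iterates over s on the outside and over vIndex on the inside.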

It's difficult to vectorize the entire operation, but this should work.
% get vector field and store in cell array
s_new = { s(1:3).vector };
% now extract heading field, this is a cell-of-cells
s_new_heading = cellfun(@(x) { x.heading }', s_new, 'UniformOutput', false);
occurences = {};
for iCell = 1:length(s_new_heading)
% use current cell
cellHere = s_new_heading{iCell};
% retain indices of interest, these could be different for each cell
cellHere = cellHere([ 1 26 45 ]);
% extract occurrences
h = cellfun(@(x) x.occurrences, cellHere, 'UniformOutput', false);
h_mat = cell2mat(h);
% save them in cell array
occurences = cat(1, occurences, h_mat);
end

Related

Matlab - Concatenating non-scalar nested structures with empty fields without losing proper indexing

In Matlab, is there a way to concatenate a non-scalar structure without losing the empty fields? This is interfering with my ability to index within the structure.
I would prefer not to populate all of my "y" fields with NaN for memory management reasons, but I can do this if it is the only work around.
"code" is always fully populated and has no empty cells. "y" could be fully populated but usually is not.
I am providing a quick example: simplified structure (it is really tens of thousands of entries with 50+ fields)
% create example structure
x = struct('y',{1 [] 3 4},'code', {{'a'}, {'b'}, {'c'}, {'b'}});
% concatenate
out = [x.y];
% find indices with code 'b'
ind = find(strcmpi([x.code], 'b'));
% desired output
outSub = out(ind)
I would expect out to yield:
out = [1 NaN 3 4]
Instead I get:
out = [1 3 4]
When trying to use code to create an index to find the values in out that match the desired code value, this obviously does not work.
Error: Index exceeds the number of array elements (3).
The desired output would yield:
ind = [2 4];
outSub = [NaN 4]
I am fully open to indexing in a different way as well.
Using the comment above, here is the final solution:
% create example structure
x = struct('y',{1 [] 3 4},'code', {{'a'}, {'b'}, {'c'}, {'b'}});
% concatenate
out = {x.y};
% find indices with code 'b'
ind = find(strcmpi([x.code], 'b'));
% desired output - cell array
outSubCell = out(ind);
% substitute [] for NaN
outSubCell(cellfun('isempty',outSubCell)) = {NaN};
% convert output to double array
outSub = cell2mat(outSubCell)
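If the NaN-padded vector itself is acceptable, a possible alternative is to preallocate it and fill only the populated positions. This is a minimal sketch that assumes every non-empty y is a scalar; the name hasY is introduced here for illustration.

```matlab
% create example structure
x = struct('y', {1 [] 3 4}, 'code', {{'a'}, {'b'}, {'c'}, {'b'}});

% logical mask of the elements whose y is populated
hasY = ~cellfun('isempty', {x.y});

% NaN-filled vector with one slot per struct element
out = nan(1, numel(x));
out(hasY) = [x.y];          % [x.y] skips the empties, so the lengths match

% indices with code 'b' now line up with out
ind = find(strcmpi([x.code], 'b'));
outSub = out(ind);
```

This keeps the indexing aligned with the original struct array at the cost of storing one double per element.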

operations with structure in matlab

I have a 1x300 structure called struct with 3 fields, but I'm using only the third field, called way. For each of the 300 lines, this field is a vector of indices.
Here is an example with 3 lines to explain my problem: I would like to check whether the last index of the first line is present in another vector (line) of the field way.
way
[491751 491750 491749 492772 493795 494819 495843 496867]
[491753 491754 491755 491756]
[492776 493800 494823 495847 496867]
I tried the intersect function:
Inter=intersect(struct(1).way(end), struct.way);
but MATLAB returns an error:
Error using intersect (line 80)
Too many input arguments.
Error in file2 (line 9)
Inter=intersect(struct(1).way(end), struct.way);
I don't understand why I get this error. Any explanations and/or other solutions?
Let the data be defined as
st(1).way = [491751 491750 491749 492772 493795 494819 495843 496867];
st(2).way = [491753 491754 491755 491756];
st(3).way = [492776 493800 494823 495847 496867]; % define the data
sought = st(1).way(end);
If you want to know which vectors contain the desired value: pack all vectors into a cell array and pass that to cellfun with an anonymous function as follows:
ind = cellfun(@(x) ismember(sought, x), {st.way});
This gives:
ind =
1×3 logical array
1 0 1
If you want to know, for each vector, the indices of the matching elements: modify the anonymous function to output a cell with the indices:
ind = cellfun(@(x) {find(x==sought)}, {st.way});
or equivalently
ind = cellfun(@(x) find(x==sought), {st.way}, 'UniformOutput', false);
The result is:
ind =
1×3 cell array
[8] [1×0 double] [5]
Or, to exclude the reference vector:
n = 1; % index of vector whose final element is sought
ind = cellfun(@(x) {find(x==st(n).way(end))}, {st([1:n-1 n+1:end]).way});
You probably want to use ismember.
Also consider what you are passing to the intersect/ismember functions: struct.way isn't a valid argument. You may need a loop to iterate over each line of your struct (in this case it would be easier to have a cell array, or a matrix with equal-length rows).
output = zeros(300);
for ii = 1:300
for jj = 1:300
if ii ~= jj && ismember(struct(ii).way(end), struct(jj).way)
output(ii,jj) = 1;
end
end
end
Now you have a matrix output where the elements which are 1 identify a match between the last element in way in the struct row ii and the vector struct(jj).way, where ii are the matrix row numbers and jj the column numbers.
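To illustrate why the original call fails and how the double loop could be written more compactly, here is a small sketch using the three example vectors. The names packed, lastVals and M are assumptions made for this illustration. The expression st.way on a struct array expands into a comma-separated list, so intersect receives one argument per element.

```matlab
st(1).way = [491751 491750 491749 492772 493795 494819 495843 496867];
st(2).way = [491753 491754 491755 491756];
st(3).way = [492776 493800 494823 495847 496867];

% st.way expands to numel(st) separate items; wrapping it in {} collects them
packed = {st.way};
nItems = numel(packed);     % intersect(x, st.way) would see 1 + nItems arguments

% last element of every vector, as a column
lastVals = arrayfun(@(e) e.way(end), st).';

% M(ii,jj) is true when the last element of st(ii).way occurs in st(jj).way
M = cell2mat(cellfun(@(w) ismember(lastVals, w), packed, 'UniformOutput', false));

% ignore each vector's match with itself
M(logical(eye(numel(st)))) = false;
```

For this data, only the first and third vectors share their last element, so M has true entries at (1,3) and (3,1).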

How to make calculations on certain cells (within a table) that meet specific criteria?

I have the following code:
L_sum = zeros(height(ABC),1);
for i = 1:height(ABC)
L_sum(i) = sum(ABC{i, ABC.L(i,4:281)});
end
Here is my table:
Problem: My sum function sums the entire row values (col. 4-281) per date whereas I only want those cells to be added whose headers are in the cell array of ABC.L, for any given date.
X = ABC.L{1, 1}; gives (excerpt):
Red arrow: what sum function is referencing (L of same date).
Green arrow: what I am trying to reference now (L of previous date).
Thanks for your help
In general, in MATLAB you don't need for loops to do simple operations like selective sums.
Example:
Data=...
[1 2 3;
4 5 6;
7 8 9;
7 7 7];
NofRows=size(Data,1);
RowsToSum=3:NofRows;
ColToSum=[1,3];
% sum second dimension 2d array
Result=sum(Data(RowsToSum,ColToSum), 2)
% table mode
DataTable=array2table(Data);
Result2=sum(DataTable{RowsToSum,ColToSum}, 2)
To do that you need to first extract the columns you want to sum, and then sum them:
% some arbitrary data:
ABC = table;
ABC.L{1,1} = {'aa','cc'};
ABC.L{2,1} = {'aa','b'};
ABC.L{3,1} = {'aa','d'};
ABC.L{4,1} = {'b','d'};
ABC{1:4,2:5} = magic(4);
ABC.Properties.VariableNames(2:5) = {'aa','b','cc','d'}
% summing the correct columns:
L_sum = zeros(height(ABC),1);
col_names = ABC.Properties.VariableNames; % just to make things shorter
for k = 1:height(ABC)
% the following 'cellfun' compares each column to the values in ABC.L{k},
% and returns a cell array of the result for each of them, then
% 'cell2mat' converts it to logical array, and 'any' combines the
% results for all elements in ABC.L{k} to one logical vector:
col_to_sum = any(cell2mat(...
cellfun(@(x) strcmp(col_names,x),ABC.L{k},...
'UniformOutput', false).'),1);
% then a logical indexing is used to define the columns for summation:
L_sum(k) = sum(ABC{k,col_to_sum});
end
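The column mask in the loop above could probably also be built with a single ismember call instead of cellfun, strcmp and cell2mat. The sketch below assumes the same toy data as the answer, constructed here with the table constructor so the block is self-contained.

```matlab
% Toy data: L lists, per row, the names of the columns to sum
L = {{'aa','cc'}; {'aa','b'}; {'aa','d'}; {'b','d'}};
M = magic(4);
ABC = table(L, M(:,1), M(:,2), M(:,3), M(:,4), ...
    'VariableNames', {'L','aa','b','cc','d'});

col_names = ABC.Properties.VariableNames;
L_sum = zeros(height(ABC), 1);
for k = 1:height(ABC)
    % true for every table variable whose name is listed in ABC.L{k}
    col_to_sum = ismember(col_names, ABC.L{k});
    % brace indexing with the mask returns the selected numeric values
    L_sum(k) = sum(ABC{k, col_to_sum});
end
```

For this data L_sum comes out as [19; 16; 21; 15]. To reference the list of the previous date instead, ABC.L{k-1} could be used inside the loop, starting at k = 2.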

matlab parse file into cell array

I have a file in the following format in matlab:
user_id_a: (item_1,rating),(item_2,rating),...(item_n,rating)
user_id_b: (item_25,rating),(item_50,rating),...(item_x,rating)
....
....
so each line has values separated by a colon where the value to the left of the colon is a number representing user_id and the values to the right are tuples of item_ids (also numbers) and rating (numbers not floats).
I would like to read this data into a matlab cell array or better yet ultimately convert it into a sparse matrix wherein the user_id represents the row index, and the item_id represents the column index and store the corresponding rating in that array index. (This would work as I know a-priori the number of users and items in my universe so ids cannot be greater than that ).
Any help would be appreciated.
I have thus far tried the textscan function as follows:
c = textscan(f,'%d %s','delimiter',':') %this creates two cells one with all the user_ids
%and another with all the remaining string values.
Now if I try to do something like str2mat(c{2}), it works but it stores the '(' and ')' characters also in the matrix. I would like to store a sparse matrix in the fashion that I described above.
I am fairly new to matlab and would appreciate any help regarding this matter.
f = fopen('data.txt','rt'); %// data file. Open as text ('t')
str = textscan(f,'%s'); %// gives a cell which contains a cell array of strings
str = str{1}; %// cell array of strings
r = str(1:2:end);
r = cellfun(@(s) str2num(s(1:end-1)), r); %// rows; numeric vector
pairs = str(2:2:end);
pairs = regexprep(pairs,'[(,)]',' ');
pairs = cellfun(@(s) str2num(s(1:end-1)), pairs, 'uni', 0);
%// pairs; cell array of numeric vectors
cols = cellfun(@(x) x(1:2:end), pairs, 'uni', 0);
%// columns; cell array of numeric vectors
vals = cellfun(@(x) x(2:2:end), pairs, 'uni', 0);
%// values; cell array of numeric vectors
rows = arrayfun(@(n) repmat(r(n),1,numel(cols{n})), 1:numel(r), 'uni', 0);
%// rows repeated to match cols; cell array of numeric vectors
matrix = sparse([rows{:}], [cols{:}], [vals{:}]);
%// concat rows, cols and vals into vectors and use as inputs to sparse
For the example file
1: (1,3),(2,4),(3,5)
10: (1,1),(2,2)
this gives the following sparse matrix:
matrix =
(1,1) 3
(10,1) 1
(1,2) 4
(10,2) 2
(1,3) 5
I think newer versions of MATLAB have a string-splitting function (strsplit) that makes this approach overkill, but the following works, if not quickly. It splits the file into user ids and "other stuff" as you show, initializes a large empty matrix, and then iterates through the other stuff, breaking it apart and placing each value in the correct place in the matrix.
(I didn't see the previous answer when I opened this for some reason. It is more sophisticated than this one, though this may be a little easier to follow at the expense of speed.) I put \s* into the regex in case the spacing is inconsistent, but otherwise don't do much data sanity checking. The output is the full array, which you can then turn into a sparse array if desired.
% matlab_test.txt:
% 101: (1,42),(2,65),(5,0)
% 102: (25,78),(50,12),(6,143),(2,123)
% 103: (23,6),(56,3)
clear all;
fclose('all');
% your path will vary, of course
file = '<path>/matlab_test.txt';
f = fopen(file);
c = textscan(f,'%d %s','delimiter',':');
celldisp(c)
uids = c{1}
tuples = c{2}
% These are stated as known
num_users = 3;
num_items = 40;
desired_array = zeros(num_users, num_items);
expression = '\((\d+)\s*,\s*(\d+)\)'
% Assuming length(tuples) == num_users for simplicity
for k = 1:num_users
uid = uids(k)
tokens = regexp(tuples{k}, expression, 'tokens');
for l = 1:length(tokens)
item_id = str2num(tokens{l}{1})
rating = str2num(tokens{l}{2})
desired_array(uid, item_id) = rating;
end
end
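The token-based parsing above can also feed sparse directly, without allocating the full array first. The following is a minimal sketch that keeps the input lines in memory instead of reading a file; the variable names txt, uidTok and S are assumptions made for illustration.

```matlab
% Lines as they would appear in the data file (held in memory here)
txt = {'1: (1,3),(2,4),(3,5)', '10: (1,1),(2,2)'};
expression = '\((\d+)\s*,\s*(\d+)\)';

rows = []; cols = []; vals = [];
for k = 1:numel(txt)
    % user id: the digits before the colon
    uidTok = regexp(txt{k}, '^\s*(\d+)\s*:', 'tokens', 'once');
    uid = str2double(uidTok{1});

    % each token is a 1x2 cell {item_id, rating}; flatten and convert
    tok = regexp(txt{k}, expression, 'tokens');
    nums = str2double([tok{:}]);

    rows = [rows, repmat(uid, 1, numel(tok))];
    cols = [cols, nums(1:2:end)];
    vals = [vals, nums(2:2:end)];
end

% one triplet per (user, item, rating)
S = sparse(rows, cols, vals);
```

When the number of users and items is known in advance, sparse(rows, cols, vals, num_users, num_items) fixes the matrix size explicitly.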

Vectorizing the Notion of Colon (:) - values between two vectors in MATLAB

I have two vectors, idx1 and idx2, and I want to obtain the values between them. If idx1 and idx2 were numbers and not vectors, I could do that the following way:
idx1=1;
idx2=5;
values=idx1:idx2
% Result
% values =
%
% 1 2 3 4 5
But in my case, idx1 and idx2 are vectors of variable length. For example, for length=2:
idx1=[5,9];
idx2=[9 11];
Can I use the colon operator to directly obtain the values in between? This is, something similar to the following:
values = [5 6 7 8 9 9 10 11]
I know I can do idx1(1):idx2(1) and idx1(2):idx2(2), this is, extract the values for each column separately, so if there is no other solution, I can do this with a for-loop, but maybe Matlab can do this more easily.
Your sample output is not legal. A matrix cannot have rows of different length. What you can do is create a cell array using arrayfun:
values = arrayfun(@colon, idx1, idx2, 'Uniform', false)
To convert the resulting cell array into a vector, you can use cell2mat:
values = cell2mat(values);
Alternatively, if all vectors in the resulting cell array have the same length, you can construct an output matrix as follows:
values = vertcat(values{:});
Try taking the union of the sets. Given the values of idx1 and idx2 you supplied, run
values = union(idx1(1):idx2(1), idx1(2):idx2(2));
This yields a vector with the values [5 6 7 8 9 10 11]. Note that union returns sorted unique values, so the repeated 9 in the desired output is dropped.
I couldn't get @Eitan's solution to work; apparently you need to specify parameters to colon. The small modification that follows got it working on my R2010b version:
step = 1;
idx1 = [5, 9];
idx2 = [9, 11];
values = arrayfun(@(x,y)colon(x, step, y), idx1, idx2, 'UniformOutput', false);
values=vertcat(cell2mat(values));
Note that step = 1 is actually the default value in colon, and Uniform can be used in place of UniformOutput, but I've included these for the sake of completeness.
There is a great blog post by Loren called Vectorizing the Notion of Colon (:). It includes an answer that is about 5 times faster (for large arrays) than using arrayfun or a for-loop and is similar to run-length-decoding:
The idea is to expand the colon sequences out. I know the lengths of
each sequence so I know the starting points in the output array. Fill
the values after the start values with 1s. Then I figure out how much
to jump from the end of one sequence to the beginning of the next one.
If there are repeated start values, the jumps might be negative. Once
this array is filled, the output is simply the cumulative sum or
cumsum of the sequence.
function x = coloncatrld(start, stop)
% COLONCAT Concatenate colon expressions
% X = COLONCAT(START,STOP) returns a vector containing the values
% [START(1):STOP(1) START(2):STOP(2) ... START(END):STOP(END)].
% Based on Peter Acklam's code for run length decoding.
len = stop - start + 1;
% keep only sequences whose length is positive
pos = len > 0;
start = start(pos);
stop = stop(pos);
len = len(pos);
if isempty(len)
x = [];
return;
end
% expand out the colon expressions
endlocs = cumsum(len);
incr = ones(1, endlocs(end));
jumps = start(2:end) - stop(1:end-1);
incr(endlocs(1:end-1)+1) = jumps;
incr(1) = start(1);
x = cumsum(incr);
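To make the cumsum idea concrete, here is a step-by-step trace on the question's example values, written as a script rather than a function. It is a sketch that assumes every sequence has positive length, so the filtering step of the function is skipped; ref is computed with arrayfun only for comparison.

```matlab
start = [5 9];
stop  = [9 11];

len = stop - start + 1;            % sequence lengths: [5 3]
endlocs = cumsum(len);             % last output position of each sequence: [5 8]

incr = ones(1, endlocs(end));      % default step of 1 everywhere
% at the first position of each later sequence, step from the previous stop
% to the next start (here 9 - 9 = 0, placed at position 6)
incr(endlocs(1:end-1) + 1) = start(2:end) - stop(1:end-1);
incr(1) = start(1);                % incr is now [5 1 1 1 1 0 1 1]

x = cumsum(incr);                  % [5 6 7 8 9 9 10 11]

% reference result using arrayfun and cell2mat
ref = cell2mat(arrayfun(@(a, b) a:b, start, stop, 'UniformOutput', false));
```

The jump of 0 at position 6 is what produces the repeated 9 where the two ranges meet.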
