Extract column data of table in a struct array - matlab

Summary / TLDR
How do I extract all rows of one table column if the table is in a struct array and I want to combine all struct array elements into one big matrix?
First Approach
I have a table (Values) with multiple columns and rows stored in a struct. Multiple of these structs are stored in an array (Data):
Data(1).Values = table([1; 2; 3; 4; 8], 'VariableNames', {'Rk'});
Data(2).Values = table([3; 6; 10; 8], 'VariableNames', {'Rk'});
Data(3).Values = table([2; 10; 11; 7], 'VariableNames', {'Rk'});
There are many more variables in the struct and also in the table, so this is just a simplified example. As you can see, the height of the table can vary. I want to plot the Rk columns in a boxplot, so I need to create a matrix like this:
matrix = [1 1; 2 1; 3 1; 4 1; 8 1; 3 2; 6 2; 10 2; 8 2; 2 3; 10 3; 11 3; 7 3];
I use the following code to create the matrix:
matrix = zeros(0, 2);
for i=1:length(Data)
l = height(Data(1, i).Values(:, 'Rk'));
e = size(matrix, 1) + 1; % next free row (length() would return 2 for a 1-by-2 matrix)
% Reshape the data into an array
matrix((end+1):(end+l), 1) = table2array(Data(1, i).Values(:, 'Rk'));
% Creating the index of each data-row
matrix(e:end, 2) = i;
end
% Plot the boxplot
boxplot(matrix(:, 1), matrix(:, 2))
I really don't like this for-loop version, especially because it becomes slow for big Data arrays and also because I don't know the size of matrix in advance, so I can't preallocate it. Theoretically I could run through the whole Data array, counting the elements, initializing the matrix variable and then filling it.
Is there a more elegant version without a for-loop?
Second approach
I already tried another solution by changing the structure of the struct. Semantically, this really makes no sense, but this way I found a more elegant solution, creating the matrix-Variable without the problems of the first solution:
% Creating Data
Data(1).Values.Rk = [1; 2; 3; 4; 8];
Data(2).Values.Rk = [3; 6; 10; 8];
Data(3).Values.Rk = [2; 10; 11; 7];
% Reshape the data into an array
a = {cell2mat({Data.Values}).Rk};
b = vertcat(a{:});
% Creating the index of each data (b)-row
c = cumsum(cellfun('length', a(1, :)));
d = meshgrid(1:c(end), 1:length(c));
e = d>c';
f = sum(e);
% Plot the boxplot
boxplot(b, f);
Questions
I would appreciate a solution combining both approaches (having a table, no for-loop, no need to calculate the matrix size) but:
I don't know how to extract the data of the table in a struct in an array.
I am asking myself if there is a more elegant solution creating the boxplot-indexes.
Whole code
%% Boxplot of Table
clear
% Creating Data
Data(1).Values = table([1; 2; 3; 4; 8], 'VariableNames', {'Rk'});
Data(2).Values = table([3; 6; 10; 8], 'VariableNames', {'Rk'});
Data(3).Values = table([2; 10; 11; 7], 'VariableNames', {'Rk'});
matrix = zeros(0, 2);
for i=1:length(Data)
l = height(Data(1, i).Values(:, 'Rk'));
e = size(matrix, 1) + 1; % next free row (length() would return 2 for a 1-by-2 matrix)
% Reshape the data into an array
matrix((end+1):(end+l), 1) = table2array(Data(1, i).Values(:, 'Rk'));
% Creating the index of each data
matrix(e:end, 2) = i;
end
boxplot(matrix(:, 1), matrix(:, 2));
%% Boxplot of Arrays
clear
% Creating Data
Data(1).Values.Rk = [1; 2; 3; 4; 8];
Data(2).Values.Rk = [3; 6; 10; 8];
Data(3).Values.Rk = [2; 10; 11; 7];
% Reshape the data into an array
a = {cell2mat({Data.Values}).Rk};
b = vertcat(a{:});
% Creating the index of each data (b)-row
c = cumsum(cellfun('length', a(1, :)));
d = meshgrid(1:c(end), 1:length(c));
e = d>c';
f = sum(e);
% Plot the boxplot
boxplot(b, f);

I have a partial answer if you are allowed to change the dimensions of your input table. If you transpose your input values, such that you have a row vector:
Data(1).Values.Rk = [1; 2; 3; 4; 8]';
...
You can use the following commands to concatenate all the elements:
tmp = [Data.Values];
allvals = [tmp.Rk]'; %output is a column vector of your aggregated values
If you eliminate the Rk field (if it is not informative), you don't need the first step and can do the operation in one step.
If you aggregate the values into a column vector in this manner you now know the dimensions of the second column and can initialize that column before executing a for loop to populate the second column with a monotonically increasing index for each element of data.
I cannot think of a way to get the total number of elements and corresponding value for your box plot in the second column (without a for loop) unless you have the flexibility to add another field to your data structure (e.g. Data(1).nVals = 5, etc.), in which case you can get the total number of elements via sum([Data.nVals])
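For the table-based layout from the question, a loop-free variant might look like the sketch below. It assumes every table in Data has the same variable names (so the tables can be stacked with vertcat) and that repelem is available (R2015a or later). The names allValues, nRows and group are only illustrative.

```matlab
% Example data from the question
Data(1).Values = table([1; 2; 3; 4; 8], 'VariableNames', {'Rk'});
Data(2).Values = table([3; 6; 10; 8], 'VariableNames', {'Rk'});
Data(3).Values = table([2; 10; 11; 7], 'VariableNames', {'Rk'});

% Stack all tables vertically (assumes identical variable names in every table)
allValues = vertcat(Data.Values);

% Height of each table, one entry per struct element
nRows = arrayfun(@(s) height(s.Values), Data);

% Group index: the value i is repeated nRows(i) times
group = repelem((1:numel(Data))', nRows(:));

% Same two-column matrix as in the question
matrix = [allValues.Rk, group];

% boxplot(matrix(:, 1), matrix(:, 2))   % needs the Statistics and Machine Learning Toolbox
```

The group sizes come directly from the table heights, so no extra nVals field and no counting pass are needed.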

Related

Combine 2 matrices

There is:
a = [1;2;3;4;5;6;7;8;9;10]; %(10x1 double)
b = [1;3;4;5;6;9]; %(6x1 double)
I hope to combine a and b. So my expected result is:
I think I might use a conditional, or first preallocate with zeros(10, 2)? Could you help me?
Method 1: Conditional Checking
Checks if the values in the arrays match before filling the combined array. If they match both columns are filled. If they do not match the "NaN" undefined term is placed in the array. The variable Index that controls the scanning for array b is only incremented upon finding a match between both arrays.
%Method 1: Conditional%
a = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10];
b = [1; 3; 4; 5; 6; 9];
%Creating a result array of the same length as array "a"%
Combined_Array = zeros(length(a),2);
%Copying the first column into the result array "Combined_Array"%
for Row = 1: +1: length(a)
Combined_Array(Row,1) = a(Row,1);
end
%Padding the array "b" with zeroes to match the length of array "a"%
b = [b; zeros(length(a) - length(b),1)];
Index = 1;
for Row = 1: +1: length(a)
if a(Row,1) == b(Index,1)
%If the values in arrays "a" and "b" match, copy the value and move on to the next element of "b"%
Combined_Array(Row,2) = b(Index,1);
Index = Index + 1;
else
%If the values in arrays "a" and "b" do not match, mark the row with the numeric NaN%
Combined_Array(Row,2) = NaN;
end
end
%Printing the result%
Combined_Array
Method 2: Concatenation and Filling Arrays
Fill in the arrays where the undefined term "NaN" is expected and concatenate the rest of the content accordingly. Using horzcat will concatenate columns together side by side. (vertcat concatenates rows on top of one another.)
%Method 2: Hard-Coded and Concatenation%
a = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10];
b = [1; 3; 4; 5; 6; 9];
b = [b(1); NaN; b(2:5); NaN; NaN; b(6); NaN]; % numeric NaN; the string "NaN" would turn b into a string array
Combined_Array = horzcat(a,b);
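Both methods can probably be replaced by a short vectorized version. The following is a minimal sketch using ismember, assuming the goal is a 10-by-2 array with a in the first column and, in the second column, the matching value of b or NaN where b has no such value. The variable name inB is only illustrative.

```matlab
a = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10];
b = [1; 3; 4; 5; 6; 9];

% Start with a in column 1 and NaN everywhere in column 2
Combined_Array = [a, NaN(size(a))];

% Logical mask: true where the value of a also occurs in b
inB = ismember(a, b);

% Copy the matching values into the second column
Combined_Array(inB, 2) = a(inB);
```

Unlike the hard-coded concatenation, this does not depend on where the gaps happen to be, and it does not require a or b to be sorted.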

How can I turn Structure to n-dimensional matrix

I have 2 matrices.
First one is the names.
Names={'a','b','c'};
Second one is Numbers.
a=[1 3]; b=[4]; c=[2 4 5];
Then I have structures whose field names are combinations of the names and numbers, and each of them holds a matrix; all of these matrices have the same number of rows and columns.
In this case I have 6 combinations (2*1*3), which look like this:
a1.b4.c2=[7 8 9; 10 11 14];
a1.b4.c4=[2 4 5; 3 4 7];
a1.b4.c5=[3 2 11; 4 7 8];
a3.b4.c2=[1 1 1; 3 5 12];
a3.b4.c4=[2 7 9 ; 10 11 12];
a3.b4.c5=[4 2 7 ; 5 6 8];
I want to turn this into an n-dimensional matrix. In this case it is 5-dimensional and has to look like this:
(:,:,1,4,2)=[7 8 9; 10 11 14]; %%%% for a=1 b=4 c=2
(:,:,1,4,4)=[2 4 5; 3 4 7]; %%%% for a=1 b=4 c=4
(:,:,1,4,5)=[3 2 11; 4 7 8]; %%%% for a=1 b=4 c=5
(:,:,3,4,2)=[1 1 1; 3 5 12]; %%%% for a=3 b=4 c=2
(:,:,3,4,4)=[2 7 9 ; 10 11 12]; %%%% for a=3 b=4 c=4
(:,:,3,4,5)=[4 2 7 ; 5 6 8]; %%%% for a=3 b=4 c=5
I want to write generalized code that does this job for different numbers of names and numbers, but I couldn't do it. Hope you can help me! Thanks.
To get every combination of the possible numbers for each field, use ndgrid.
[Numbers{1:3}] = ndgrid(a,b,c);
Having the numbered structures sitting in the workspace as you describe makes it messy to access them programmatically and you should avoid it if possible; however they can still be accessed using eval.
evalPattern = strjoin(strcat(Names,'%d'), '.'); % 'a%d.b%d.c%d'
firstNumbers = cellfun(@(n) n(1), Numbers); % [1 4 2]
firstElement = eval(sprintf(evalPattern,firstNumbers)); % returns the value of a1.b4.c2
result = nan([size(firstElement) size(Numbers{1})]);
for ii = 1:numel(Numbers{1})
iiNumbers = cellfun(@(n) n(ii), Numbers);
result(:,:,ii) = eval(sprintf(evalPattern,iiNumbers));
end
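If the numbered structs can be collected as fields of one parent struct, eval can likely be avoided altogether. The sketch below assumes such a parent struct (called S here, a hypothetical name) and uses getfield with generated field names. It also places each matrix at the index given by its numbers, e.g. result(:,:,1,4,2), as requested in the question, and leaves unused combinations as NaN.

```matlab
% Hypothetical parent struct S holding the example data from the question
S.a1.b4.c2 = [7 8 9; 10 11 14];
S.a1.b4.c4 = [2 4 5; 3 4 7];
S.a1.b4.c5 = [3 2 11; 4 7 8];
S.a3.b4.c2 = [1 1 1; 3 5 12];
S.a3.b4.c4 = [2 7 9; 10 11 12];
S.a3.b4.c5 = [4 2 7; 5 6 8];

Names = {'a', 'b', 'c'};
vals  = {[1 3], 4, [2 4 5]};

% Every combination of the numbers, one grid per name
G = cell(1, numel(vals));
[G{:}] = ndgrid(vals{:});

% Size of one leaf matrix, taken from the first combination
firstKeys = arrayfun(@(n) sprintf('%s%d', Names{n}, vals{n}(1)), ...
    1:numel(Names), 'UniformOutput', false);
leafSize = size(getfield(S, firstKeys{:}));

% Preallocate with NaN so unused index combinations stay recognizable
result = nan([leafSize, cellfun(@max, vals)]);

for k = 1:numel(G{1})
    nums = cellfun(@(g) g(k), G);                        % e.g. [1 4 2]
    keys = arrayfun(@(n) sprintf('%s%d', Names{n}, nums(n)), ...
        1:numel(Names), 'UniformOutput', false);         % e.g. {'a1','b4','c2'}
    sub = num2cell(nums);
    result(:, :, sub{:}) = getfield(S, keys{:});         % same as S.a1.b4.c2
end
```

Because the numbers are used directly as subscripts, the output here is 2-by-3-by-3-by-4-by-5, which can waste memory when the numbers are large and sparse.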
Ok, it took longer than I expected, but the following code should work for arbitrary number of names and numbers, given the following requirements:
At the moment, names are considered to be single characters - that could be tweaked by regexp or something like this.
All your ax.bx.cx.... can be stored in some superordinated structure by your application (beforehand).
Your struct always follows the presented order of ax.bx.cx..., and the matrix dimensions are equal.
So, the script is quite long and - I'm afraid - needs some explanation. Please just ask. The basic idea is to loop through the struct(s) as long as the particular "children" are still structs. That assures arbitrary "depth" of structs, i.e. number of names and numbers.
I expanded your data, so you see, that it also works for (a) additional names, (b) additional numbers, and (c) different matrix sizes. Of course, it also works on your original data.
Also, one doesn't need Names or Numbers in the beginning, as this information is extracted automatically from the superordinated structure (which has to exist).
(Attention: Written in Octave. I tried to verify, that all functionality is available in Matlab, too. Please report any issues, if that's not the case. I will then refactor the code.)
% Structs given.
a1.b4.c2.d3 = ones(4, 4);
a1.b4.c4.d3 = ones(4, 4) * 2;
a1.b4.c5.d3 = ones(4, 4) * 3;
a1.b6.c2.d3 = ones(4, 4) * 4;
a1.b6.c4.d3 = ones(4, 4) * 5;
a1.b6.c5.d3 = ones(4, 4) * 6;
a2.b4.c2.d3 = ones(4, 4) * 7;
a2.b4.c4.d3 = ones(4, 4) * 8;
a2.b4.c5.d3 = ones(4, 4) * 9;
a2.b6.c2.d3 = ones(4, 4) * 10;
a2.b6.c4.d3 = ones(4, 4) * 11;
a2.b6.c5.d3 = ones(4, 4) * 12;
% REQUIREMENT: Store your structs in some superordinated struct.
super.a1 = a1;
super.a2 = a2;
% Initialize combined struct for names and numbers.
NamesNumbers = struct();
% Initialize Names cell array.
Names = {};
% Extract names and numbers from superordinated struct.
totalNames = 0;
totalNumbers = 1;
current = super;
while (isstruct(current))
fields = fieldnames(current);
totalNames = totalNames + 1;
totalNumbers = totalNumbers * numel(fields);
for iField = 1:numel(fields)
field = fields{iField};
name = field(1);
Names{totalNames} = name;
number = field(2:end);
if (isfield(NamesNumbers, name) == false)
NamesNumbers.(name) = str2num(number);
else
NamesNumbers.(name) = [NamesNumbers.(name) str2num(number)];
end
end
current = current.(fields{1});
if (isstruct(current) == false)
[nRows, nCols] = size(current);
end
end
% Extract all values from superordinated struct.
level = struct2cell(super);
while (isstruct([level{:}]))
level = struct2cell([level{:}]);
end
values = vertcat(level{:});
% Determine indices.
maxIdx = cellfun(@(x) max(x), struct2cell(NamesNumbers));
idx = zeros([totalNumbers, totalNames]);
factorProd = 1;
for iName = 1:totalNames
numbers = NamesNumbers.(Names{iName});
n = numel(numbers);
factorProd = factorProd * n;
inner = totalNumbers / factorProd;
resh = totalNumbers * n / factorProd;
outer = factorProd / n;
column = repmat(reshape(repmat(numbers, inner, 1), resh, 1), outer, 1);
START = (iName - 1) * totalNumbers + 1;
STOP = iName * totalNumbers;
idx(START:STOP) = column;
end
% Initialize output.
output = zeros([nRows nCols maxIdx']);
% Fill output with values.
for iIdx = 1:size(idx, 1)
temp = num2cell(idx(iIdx, :));
START = (iIdx - 1) * nRows + 1;
STOP = iIdx * nRows;
output(:, :, temp{:}) = values(START:STOP, :);
end

Is there any way to obtain different shuffled in randperm?

I have a matrix [1 2 3 4] and I want to shuffle it with randperm a few times, but I want to obtain a different matrix each time. For example
for i=1:4
m(i,:)=randperm(4);
end
will give me 4 rows with 4 columns but I want every row to be different from every other one; e.g. like this:
m(1,:)=[1 3 4 2]
m(2,:)=[2 3 1 4]
m(3,:)=[2 1 4 3]
m(4,:)=[4 3 2 3]
You can just check the existing rows to see if the current permutation already exists
m = zeros(4, 4);
counter = 1;
while counter <= 4 % "<= 4" so that the fourth row is filled as well
new = randperm(4);
if ~ismember(new, m, 'rows')
m(counter, :) = new;
counter = counter + 1;
end
end
Another (memory intensive) approach would be to generate all permutations and then randomly select N of them
allperms = perms(1:4);
N = 4;
m = allperms(randsample(size(allperms,1), N), :);
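randsample belongs to the Statistics and Machine Learning Toolbox. If that toolbox is not available, the same selection can probably be done with the two-argument form of randperm, which returns N distinct indices. A minimal sketch (rowIdx is an illustrative name):

```matlab
N = 4;
allperms = perms(1:4);                      % 24-by-4, every row is a distinct permutation
rowIdx = randperm(size(allperms, 1), N);    % N distinct row indices, drawn without replacement
m = allperms(rowIdx, :);
```

Since the row indices are distinct and all rows of allperms differ, the rows of m cannot repeat.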
You can easily use the MATLAB function ismember to check if the random permutation that you just created is already contained in your matrix.
So you can just try something like that:
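m = zeros(4, 4); % preallocate so that ismember has rows to compare against
for i=1:4
temp = randperm(4);
% draw again as long as the new permutation is already a row of m
while ismember(temp, m, 'rows')
temp = randperm(4);
end
m(i,:) = temp;
end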

How to do a pairwise interchange of rows in Matlab?

I have a sequence in a matrix (computed with the sortrows function in Matlab). Say the matrix looks something like this after computing:
A = [5; 3; 4; 1; 2];
[b, c] = size(A)
In lieu of performing permutations on the sequence in A, I would like to perform a pairwise interchange of the cells, so that it runs faster, even though the results won't be exact (they will be very close to the answer, though). I want the rows to look somewhat like this in the end =>
A1 = [5; 4; 3; 2; 1];
A2 = [4; 5; 3; 1; 2];
A3 = [4; 3; 5; 2; 1];
A4 = [3; 4; 5; 1; 2];
Now, the catch is that the matrix may contain as few or as many elements as needed (it will vary). Matrix 'A' is just an example. How do I perform [b-1] pairwise interchanges on A (or any other matrix)?
A = [5; 3; 4; 1; 2];
swapIndexLeft = [1,2,3,4,5];
swapIndexRight = [2,3,4,5,1];
%// make sure the dimension of indices agree
assert(numel(swapIndexLeft) == numel(swapIndexRight))
%// ... and values do not exceed dimensions of the vector A
assert(max(swapIndexLeft)<=numel(A) )
assert(max(swapIndexRight)<=numel(A) )
%// swap iteratively
for ii = 1:numel(swapIndexLeft)
temp = A( swapIndexLeft(ii) );
A( swapIndexLeft(ii) ) = A( swapIndexRight(ii) );
A( swapIndexRight(ii) ) = temp;
%// now you have an array where the element swapIndexLeft(ii)
%// has been swapped with swapIndexRight(ii)
%// do your calculations here
end
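The A1 to A4 listed in the question do not follow an obvious rule, so the following is only one possible reading: each of the b-1 variants swaps one pair of adjacent rows of the original A. In this sketch the variants are stored as columns of a matrix called variants (an illustrative name) instead of separate variables A1, A2, and so on.

```matlab
A = [5; 3; 4; 1; 2];
b = size(A, 1);

% Column k starts as a copy of A and becomes the k-th variant
variants = repmat(A, 1, b - 1);
for k = 1:b-1
    % Swap rows k and k+1 in column k only
    variants([k, k+1], k) = variants([k+1, k], k);
end
```

Each column can then be passed to the actual calculation, e.g. as variants(:, k) inside another loop.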

Using accumarray and @min to extract min from groups but also output corresponding values from another variable/column

I have 3 columns of data:
time = [1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16];
category = [1;1;1;1;2;2;2;2;3; 3; 3; 3; 4; 4; 4; 4];
data = [1;1;0;1;2;2;1;2;3; 3; 2; 3; 4; 4; 4; 3];
I am using the following to extract the minimum data values for each category:
groupmin = accumarray(category,data,[],@min)
Which outputs:
groupmin = [0;1;2;3]
However, I would really like to have an output that also tells me which time point the minimums are from, e.g.
timeofgroupmin = [3;7;11;16]
groupmin = [0;1; 2; 3]
Alternatively, I would like to have the minimums output in a vector of their own, with NaNs for any row which was not the minimum of its group, e.g.
groupminallrows = [NaN;NaN;0;NaN;NaN;NaN;1;NaN;NaN;NaN;2;NaN;NaN;NaN;NaN;3];
Either approach would solve my problem. As a Matlab novice I'm struggling to know which terms to search for.
This works if all data of the same category are in a single run and the categories are sorted, as in your example. Several minimizers are allowed within each category.
r = accumarray(category,data,[],@(v) {(min(v)==v)});
r = vertcat(r{:});
groupminallrows = NaN(size(data));
groupminallrows(r) = data(r);
Try this solution:
% first we group the data into cell according to the group they belong to
grouped = accumarray(category, data, [], @(x){x});
% find the minimum and corresponding index of each group
[mn,idx] = cellfun(@min, grouped);
% fix index by offsetting the position to point the whole data vector
offset = cumsum([0;cellfun(@numel, grouped)]);
idx = idx + offset(1:end-1);
% result
[mn(:) idx(:)]
assert(isequal(mn, data(idx)))
% build the vector with NaNs
mnAll = nan(size(data));
mnAll(idx) = mn;
The resulting vectors:
>> mn'
ans =
0 1 2 3
>> idx'
ans =
3 7 11 16
>> mnAll'
ans =
NaN NaN 0 NaN NaN NaN 1 NaN NaN NaN 2 NaN NaN NaN NaN 3
EDIT:
Here is an alternate solution:
% find the position of min value in each category
idx = accumarray(category, data, [], @minarg);
% fix position in terms of the whole vector
offset = cumsum([0;accumarray(category,1)]);
idx = idx + offset(1:end-1);
% corresponding min values
mn = data(idx);
I'm using the following custom function to extract the second output argument from min:
minarg.m
function idx = minarg(X)
[~,idx] = min(X);
end
The results are the same as above.
Use accumarray with a custom function:
time = [1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16];
category = [1;1;1;1;2;2;2;2;3; 3; 3; 3; 4; 4; 4; 4];
data = [1;1;0;1;2;2;1;2;3; 3; 2; 3; 4; 4; 4; 3];
groupmin = accumarray( category, data, [], @min)
Is what you have, but to get the indices of the minima and their time you'd need the second output of the min function, which I don't know if it is possible to get when used with accumarray. But there is the following workaround:
groupidx = accumarray( category, data, [], @(x) find(x == min(x) )).'
occ = cumsum(hist(category,unique(category)))
idx = [0, occ(1:end-1)] + groupidx; % offset of each group's first row; also correct when the groups differ in size
timeofgroupmin = time(idx).'
groupmin = data(idx).'
groupmin =
0 1 2 3
timeofgroupmin =
3 7 11 16
The desired NaN-vector you could get like:
groupminallrows = NaN(1,numel(data));
groupminallrows(idx) = data(idx)
Regarding your comment:
I assume the reason for that is that you have multiple minima in a group, in which case find returns an array. To resolve that you can substitute find(x == min(x)) with find(x == min(x),1). But then you would just get the first occurrence of the minimum in each group.
If that is not desired I'd say accumarray is generally the wrong way to go.
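One alternative that does not use accumarray is sketched below, under the assumption that only one minimum per group is wanted. It sorts the rows by category and then by data, so the first sorted row of each category holds that category's minimum. The names sorted, order and firstPos are illustrative.

```matlab
time     = (1:16)';
category = [1;1;1;1;2;2;2;2;3;3;3;3;4;4;4;4];
data     = [1;1;0;1;2;2;1;2;3;3;2;3;4;4;4;3];

% Sort by category, then by data; order maps sorted rows back to original rows
[sorted, order] = sortrows([category, data]);

% Position of the first sorted row of each category, i.e. its minimum
[~, firstPos] = unique(sorted(:, 1), 'first');
idx = order(firstPos);

timeofgroupmin = time(idx);
groupmin       = data(idx);

% Vector with NaN in every row that is not the minimum of its group
groupminallrows = NaN(size(data));
groupminallrows(idx) = data(idx);
```

This variant does not need the categories to be contiguous or sorted in the input. With tied minima it should pick the earliest row, because sortrows keeps equal rows in their original order.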