Left Join tables in MATLAB - matlab

I tried to find something relevant but to no avail, except this post which I can't say it was helpful.
I have two tables A and B:
A has dimensions 5x5 and non unique values in the LastName
LastName = {'Smith';'Johnson';'Williams';'Smith';'Williams'};
YearNow= [2010;2010;2010;2010;2010];
Height = [71;69;64;67;64];
Weight = [176;163;131;133;119];
BloodPressure = [124; 109; 125; 117; 122];
A = table(LastName,YearNow,Height,Weight,BloodPressure);
and B has dimensions 3x2 and unique values in LastName
LastName = {'Smith';'Johnson';'Williams'};
YearBorn= [1950;1975;1965];
B = table(LastName,YearBorn);
I want to create a new column on Table A that will contain their age after I subtract for each A.YearNow the B.YearBorn, so the last column will have the form
A.Age = [60,35,45,60,45];
When I try to use [detect,pos] = ismember(A,B(:,1)); I get an error:
A and B must contain the same variables.
Any help would be appreciated.

Instead of using ismember, which can be quite error-prone as you have to put things in the right order, you could also use Matlab's outerjoin instead:
A = outerjoin(A,B,'Type','Left','MergeKeys',true);
A.Age = A.YearNow - A.YearBorn;
Note that outerjoin modifies the ordering. See the official Matlab documentation for all the input arguments.
An additional advantage of outerjoin over ismember is that in case not all LastNames in table A exist in table B, you will have to pre-allocate output with ismember, and use the first output argument as well.

You need to extract the LastName columns and pass them to ismember. Then you can use the index vector it returns to compute the Age column as follows:
[~, index] = ismember(A.LastName, B.LastName);
A.Age = A.YearNow-B.YearBorn(index);

Related

How to concatenate / assign string of different length to existing Matlab table?

Consider table in Matlab.
a = table();
a.c = 'a';
How can I add one row containing a string of different length to that table? i.e I want to get:
c
______
'a'
'aa'
For example this simple attempt gives an error:
b = table();
b.c = 'aa';
result = [a; b]
Error:
Could not concatenate the table variable 'c' using VERTCAT.
Caused by:
Error using vertcat
Dimensions of matrices being concatenated are not consistent.
Due to how MATLAB's table objects treats the contained data, it tries to be smart with the data types. Occasionally when things try to be smart behind the scenes they get tripped up in ways that aren't necessarily readily apparent to the user.
What's happening here is that since your c column is created with a character array, MATLAB attempts to keep this column homogeneous and concatenate 'a' with 'aa'. This will error out due to MATLAB's handling of character arrays as matrices of characters, which comes with a size enforcement: all rows must have the same number of columns.
You have a couple options: use a string array (introduced in R2016b), or use a cell array. While string arrays are essentially cell arrays under the hood, they come with the advantage of dedicated string methods, allowing you to natively perform various string operations without needing to explicitly index into a cell array.
To change your code, simply use double quotes ("") instead of single quotes (''):
a = table();
a.c = "a";
b = table();
b.c = "aa";
T = [a;b]
Which returns:
T =
2×1 table
c
____
"a"
"aa"
Alternatively, you can explicitly force the type of c as a cell array:
a = table();
a.c = {'a'};
b = table();
b.c = 'aa';
T = [a; b]
Which returns the same.
If you have an entire column of data, you can create a column from a cell array
tbl = table();
tbl.mycol = {'some text';
'something else';
'third item'};
If you want to append a single items (like in a loop) you could do
tbl = table();
mycell = {'some text';
'something else';
'third item'};
tbl.mycol = {};
for ii = 1:numel(mycell)
tbl.mycol(ii) = mycell(ii);
end
Similarly, you can append to the end as you would an array
tbl.mycol(end+1) = {'fourth item'};
You can merge two tables by concatenating them using vertcat
myothercell = {'append this';
'...and this'};
tbl1 = table();
tbl1.mycol = mycell;
tbl2 = table();
tbl2.mycol = myothercell;
tbl3 = vertcat(tbl1, tbl2);

Make a MatLab table out of 2 Cell Array

What is the most efficient way to generate a table from 2 cell arrays. Array A contains data and Array B contains the corresponding names. I would like to combine them into a convenient table structure.
Cell_Array_A =
{[10x10],[10x10],[];
[10x11],[],[10x12];
[9x10],[13x10],[]}
Cell_Array_B =
{['A','B',[];
'B',[],'A';
'B','A',[]}
It should generate a table with the headers 'A' and 'B'. However, in my real data set, I have a lot of variable names, which I don't know. So I need a way to read all variables from the first row and use them to create the table for the rest of the matrix.
Example of the desired output:
'A' 'B'
[10x10] [10x10]
[10x12] [10x11]
[13x10] [9x10]
I tried so far to process the arrays row wise to get rid of the empty arrays. However this is not very efficient. I run the following code for each row in Array A and B. Below an example for the first row in Array A which I later on use as the tablet for my table.
order_row = {};
order_row = Cell_Array_A(1,:);
ordered_filenames = order_row(~cellfun('isempty',order_row));
Edited answer The idea is 1) to collect the non-empty column headers, 2) for each unique column header collect the corresponding values:
% Unique variable names
nul_names = cellfun(#isempty,Cell_Array_B);
var_names = unique(Cell_Array_B(~nul_names));
% Function to find a column header index
ixhead = #(h,c) cellfun(#(x) isequaln(x,h), c(:));
% Collect the values
var_values = cellfun( ...
#(h) Cell_Array_A(ixhead(h, Cell_Array_B)), var_names, ...
'UniformOutput', false ...
);
% Create the table
tbl = table(var_values{:}, 'VariableNames', var_names);

Matlab; Structure field name with valuable ( = number)

I am trying to assign valuable, which is number and given by for loop, to the name of structure field. For example, I would like to do as following,
A.bx, where A is name of structure(= char), b is part of field name ( = char) and x is valuable given by for loop. A and b is fixed or predefined.
Any comment is appreciated !
genvarname(str,list) generates a valid variable name in str [a string] in which at each iteration value in str is different from the exclusion list
And fieldname(S) returns a list of all the names of the field already in the structure S (use it to create a exclusion list)
Here is a code for what you want:
A = struct ();
for i = 1:5
A.(genvarname ('b', fieldnames (A))) = i;
end
Read about 1. genvarname(str,list) 2. fieldnames(S)
You can name you struct fields using simple sprintf
A = struct()
for ii = 1:10
fn = sprintf('b%d', ii );
A.(fn) = ii; % use the struct
end
I tend to agree with sebastian that suggested using arrays or cells over this type of field naming. In addition to cells and arrays you might find containers.Map to be very versatile and useful.

Using a string to refer to a structure array - matlab

I am trying to take the averages of a pretty large set of data, so i have created a function to do exactly that.
The data is stored in some struct1.struct2.data(:,column)
there are 4 struct1 and each of these have between 20 and 30 sub-struct2
the data that I want to average is always stored in column 7 and I want to output the average of each struct2.data(:,column) into a 2xN array/double (column 1 of this output is a reference to each sub-struct2 column 2 is the average)
The omly problem is, I can't find a way (lots and lots of reading) to point at each structure properly. I am using a string to refer to the structures, but I get error Attempt to reference field of non-structure array. So clearly it doesn't like this. Here is what I used. (excuse the inelegence)
function [avrg] = Takemean(prefix,numslits)
% place holder arrays
avs = [];
slits = [];
% iterate over the sub-struct (struct2)
for currslit=1:numslits
dataname = sprintf('%s_slit_%02d',prefix,currslit);
% slap the average and slit ID on the end
avs(end+1) = mean(prefix.dataname.data(:,7));
slits(end+1) = currslit;
end
% transpose the arrays
avs = avs';
slits = slits';
avrg = cat(2,slits,avs); % slap them together
It falls over at this line avs(end+1) = mean(prefix.dataname.data,7); because as you can see, prefix and dataname are strings. So, after hunting around I tried making these strings variables with genvarname() still no luck!
I have spent hours on what should have been 5min of coding. :'(
Edit: Oh prefix is a string e.g. 'Hs' and the structure of the structures (lol) is e.g. Hs.Hs_slit_XX.data() where XX is e.g. 01,02,...27
Edit: If I just run mean(Hs.Hs_slit_01.data(:,7)) it works fine... but then I cant iterate over all of the _slit_XX
If you simply want to iterate over the fields with the name pattern <something>_slit_<something>, you need neither the prefix string nor numslits for this. Pass the actual structure to your function, extract the desired fields and then itereate them:
function avrg = Takemean(s)
%// Extract only the "_slit_" fields
names = fieldnames(s);
names = names(~cellfun('isempty', strfind(names, '_slit_')));
%// Iterate over fields and calculate means
avrg = zeros(numel(names), 2);
for k = 1:numel(names)
avrg(k, :) = [k, mean(s.(names{k}).data(:, 7))];
end
This method uses dynamic field referencing to access fields in structs using strings.
First of all, think twice before you use string construction to access variables.
If you really really need it, here is how it can be used:
a.b=123;
s1 = 'a';
s2 = 'b';
eval([s1 '.' s2])
In your case probably something like:
Hs.Hs_slit_01.data= rand(3,7);
avs = [];
dataname = 'Hs_slit_01';
prefix = 'Hs';
eval(['avs(end+1) = mean(' prefix '.' dataname '.data(:,7))'])

Simultaneously assign values to multiple structure fields

I have a matlab structure that follows the following pattern:
S.field1.data1
...
.field1.dataN
...
.fieldM.data1
...
.fieldM.dataN
I would like to assign values to one data field (say, data3) from all fields simultaneously. That would be semantically similar to:
S.*.data3 = value
Where the wildcard "*" represents all fields (field1,...,fieldM) in the structure. Is this something that can be done without a loop in matlab?
Since field1 .. fieldM are structure arrays with identical fields, why not make a struct array for "field"? Then you can easily set all "data" members to a specific value using deal.
field(1).data1 = 1;
field(1).data2 = 2;
field(2).data1 = 3;
field(2).data2 = 4;
[field.data1] = deal(5);
disp([field.data1]);
A loop-based solution can be flexible and easily readable:
names = strtrim(cellstr( num2str((1:5)','field%d') )); %'# field1,field2,...
values = num2cell(1:5); %# any values you want
S = struct();
for i=1:numel(names)
S.(names{i}).data3 = values{i};
end
In simple cases, you could do that by converting your struct into a cell array using struct2cell(). As you have a nested structure, I don't think that will work here.
On the other side, is there any reason why your data is structured like this. Your description gives the impression that a simple MxN array or cell array would be more suitable.