How to add a string as data to a dataset? - matlab

I use the following code to create a simple dataset and add the first two rows:
data = dataset([1; 2],[3; 4],'VarNames', {'A', 'B'})
After that, I would like to set the value 4 to 'test':
data(1,2) = 'test'
Since this throws the following exception:
Error using dataset/subsasgnParens (line 198)
Right hand side must be a dataset array.
Error in dataset/subsasgn (line 79)
a = subsasgnParens(a,s,b,creating);
I also tried:
data(1,2) = dataset('test');
But this is also not working. Therefore my question: How can I add a String to a dataset such as I have created, using the method I'm using (I have to specify the row and column)?

You can't do
data(1,2) = dataset('test');
because 'test' is char type and the rest of your data are doubles and because the string 'test' is four elements which you're trying to put into one element of an array.
You need to use cell arrays. If you want to use the dataset function capabilities, see the cell2dataset and dataset2cell functions. For example:
data = dataset([1; 2],[3; 4],'VarNames',{'A', 'B'})
data2 = dataset2cell(data);
data2{3,1} = 'test';
data3 = cell2dataset(data2,'ReadVarNames',true');

Related

MATLAB: Pass part of structure field name to function

I need to pass a part of a structure's name into a function.
Examples of a available structs:
systems.system1.stats.equityCurve.relative.exFee
systems.system1.stats.equityCurve.relative.inFee
systems.system2.stats.equityCurve.relative.exFee
systems.system2.stats.equityCurve.relative.inFee
systems.system1.returns.aggregated.exFee
systems.system1.returns.aggregated.inFee
systems.system2.returns.aggregated.exFee
systems.system2.returns.aggregated.inFee
... This goes on...
Within a function, I loop through the structure as follows:
function mat = test(fNames)
feeString = {'exFee', 'inFee'};
sysNames = {'system1', 'system2'};
for n = 1 : 2
mat{n} = systems.(sysNames{n}).stats.equityCurve.relative.(feeString{n});
end
end
What I like to handle in a flexible way within the loop is the middle part, i.e. the part after systems.(sysNames{n}) and before .(feeString{n}) (compare examples).
I am now looking for a way to pass the middle part as an input argument fNames into the function. The loop should than contain something like
mat{n} = systems.(sysNames{n}).(fName).(feeString{n});
How about using a helper function such as
function rec_stru = recSA(stru, field_names)
if numel(field_names) == 1
rec_stru = stru.(field_names{1});
else
rec_stru = recSA(stru.(field_names{1}), field_names(2:end));
end
This function takes the intermediate field names as a cell array.
This would turn this statement:
mat{n} = systems.(sysNames{n}).stats.equityCurve.relative.(feeString{n});
into
mat{n} = recSA(systems.(sysNames{n}), {'stats', 'equityCurve', 'relative', feeString{n}});
The first part of the cell array could then be passed as an argument to the function.
This is one of those cases where matlab is a bit unhelpful in the documentation. There is a way to use the fieldnames function in matlab to get the list of all the fields and iterate over that using dynamic fields.
systems.system1.stats.equityCurve.relative.exFee='T'
systems.system1.stats.equityCurve.relative.inFee='E'
systems.system2.stats.equityCurve.relative.exFee='S'
systems.system2.stats.equityCurve.relative.inFee='T'
systems.system1.returns.aggregated.exFee='D'
systems.system1.returns.aggregated.inFee='A'
systems.system2.returns.aggregated.exFee='T'
systems.system2.returns.aggregated.inFee='A'
dynamicvariable=fieldnames(systems.system1)
This will return a cell matrix of the field names which you can use to iterate over.
systems.system1.(dynamicvariable{1})
ans =
equityCurve: [1x1 struct]
Ideally you would have your data structure fixed in such a way that you know how many levels of depth are in your data structure.

How to extract columns of data from .txt files MATLAB

I have some data in a .txt file. that are separated by commas.
for example:
1.4,2,3,4,5
2,3,4.2,5,6
24,5,2,33.4,62
what if you want the average of columns, like first column (1.4,2 and 24)? or second column(2,3 and 5)?
I think putting the column in an array and using the built in mean function would work, but so far, I am only able to extract rows, not columns
instead of making another thread, I thought i'd edit this one. I am working on getting the average of each column of the well known iris data set.
I cut a small portion of the data:
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
delimiterln= ',';
data = importdata('iris.txt', delimiterln);
meanCol1 = mean(data(:,1))
meanCol2 = mean(data(:,2))
meanCol3 = mean(data(:,3))
meanCol4 = mean(data(:,4))
Undefined function 'sum' for input arguments of type 'cell'.
Error in mean (line 115)
y = sum(x, dim, flag)/size(x,dim);
Error in irisData(line 6)
meanCol1 = mean(data(:,1))
it looks like there is an error with handling data type...any thoughts on this? I tried getting rid of the last column, which are strings. and it seems to work without error. So i am thinking that it's because of the strings.
Use comma separated file reading function:
M = csvread(filename);
Now you have the matrix M:
col1Mean=mean(M(:,1));

Create structure fieldnames from array of numbers

I have a dataset that I would like to categorise and store in a structure based on the value in one column of the dataset. For example, the data can be categorised into element 'label_100', 'label_200' or 'label_300' as I attempt below:
%The labels I would like are based on the dataset
example_data = [repmat(100,1,100),repmat(200,1,100),repmat(300,1,100)];
data_names = unique(example_data);
%create a cell array of strings for the structure fieldnames
for i = 1:length(data_names)
cell_data_names{i}=sprintf('label_%d', data_names(i));
end
%create a cell array of data (just 0's for now)
others = num2cell(zeros(size(cell_data_names)));
%try and create the structure
data = struct(cell_data_names{:},others{:})
This fails and I get the following error message:
"Error using struct
Field names must be strings."
(Also, is there a more direct method to achieve what I am trying to do above?)
According to the documentation of struct,
S = struct('field1',VALUES1,'field2',VALUES2,...) creates a
structure array with the specified fields and values.
So you need to have each value right after its field name. The way you are calling struct now is
S = struct('field1','field2',VALUES1,VALUES2,...)
instead of the correct
S = struct('field1',VALUES1,'field2',VALUES2,...).
You can solve that by concatenating cell_data_names and others vertically and then using {:} to produce a comma-separated list. This will give the cells' contents in column-major order, so each field name fill be immediately followed by the corresponding value:
cell_data_names_others = [cell_data_names; others]
data = struct(cell_data_names_others{:})

Concatenating (sub)fields of structs in a cell array

I have a Matlab object, that is a cell array containting structs that have almost identical structures and I want to programmatically get a (sub)field of the structs of all cell array elements.
For example, we take test
test = {struct('a',struct('sub',1)), struct('a',struct('sub',2),'b',1)};
This will create a cell array with the following structure:
cell-element 1: a --> sub --> 1
cell-element 2: a --> sub --> 2
\-> b --> 1
It can be seen that the elements of test don't have the exact same structure, but similar. How can I get all values of the a.sub fields of the elements of the cell. I can obtain them in this specific problem with
acc=zeros(1,numel(test));
for ii=1:numel(test)
acc(ii) = test{ii}.a.sub;
end
but I can't quite get this method to work in a more general context (ie. having different fields).
You may want to use the function getfield:
%//Data to play with
test = {struct('a',struct('sub',1)), struct('a',struct('sub',2),'b',1)};
%//I'm interested in these nested fields
nested_fields = {'a', 'sub'};
%//Scan the cell array to retrieve the data array
acca = cellfun(#(x) getfield(x, nested_fields{:}), test);
In case your data cannot guarantee that all the elements are the same type and size, then you need to output a cell array instead:
%//Scan the cell array to retrieve the data cell array
accc = cellfun(#(x) getfield(x, nested_fields{:}), test, 'UniformOutput', false);
Later Edit
If one wants to use different sets of nested fields for each cell element then:
%//nested_fields list should have the same size as test
nested_fields = {{'a','sub'}, {'b'}};
accm = cellfun(#(x,y) getfield(x,y{:}), test, nested_fields, 'UniformOutput', false);
Edit: No need for recursion, as shown by #CST-link:s answer; the native getfield function can neatly unfold a cell array of fields as its second argument, e.g. getfield(foo{i}, fields{:}) instead of the call to the recursive function in my old answer below. I'll leave the recursive solution below, however, as it could have some value in the context of the question.
You can build you own recursive version of getField, taking a cell array of fields.
function value = getFieldRec(S,fields)
if numel(fields) == 1
value = getfield(S, fields{1});
else
S = getfield(S,fields{1})
fields{1} = [];
fields = fields(~cellfun('isempty',fields));
value = getFieldRec(S,fields);
end
end
Example usage:
foo = {struct('a',struct('sub',1)), ...
struct('a',struct('sub',2),'b',3), ...
struct('c',struct('bar',7),'u',5)};
accessFields = {'a.sub', 'b', 'c.bar'};
values = zeros(1,numel(foo));
for i = 1:numel(foo)
fields = strsplit(accessFields{i},'.');
values(i) = getFieldRec(foo{i},fields);
end
With the following result
values =
1 3 7
I have found a way to do this using eval:
function out = catCellStructSubField(cellStruct, fieldName)
out = zeros(1,numel(cellStruct));
for ii = 1:numel(cellStruct)
out(ii) = eval(['cellStruct{ii}.' fieldName]);
end
Where it can be used on my test example like this:
catCellStructSubField(test, 'a.sub')
Dynamic field names (cellStruct{ii}.(fieldName)) does not work because I'm accessing sub-fields.
I know eval is often a bad idea. I'm curious for different solutions.

How can I dynamically access a field of a field of a structure in MATLAB?

I'm interested in the general problem of accessing a field which may be buried an arbitrary number of levels deep in a containing structure. A concrete example using two levels is below.
Say I have a structure toplevel, which I define from the MATLAB command line with the following:
midlevel.bottomlevel = 'foo';
toplevel.midlevel = midlevel;
I can access the midlevel structure by passing the field name as a string, e.g.:
fieldnameToAccess = 'midlevel';
value = toplevel.(fieldnameToAccess);
but I can't access the bottomlevel structure the same way -- the following is not valid syntax:
fieldnameToAccess = 'midlevel.bottomlevel';
value = toplevel.(fieldnameToAccess); %# throws ??? Reference to non-existent field 'midlevel.bottomlevel'
I could write a function that looks through fieldnameToAccess for periods and then recursively iterates through to get the desired field, but I am wondering if there's some clever way to use MATLAB built-ins to just get the field value directly.
You would have to split the dynamic field accessing into two steps for your example, such as:
>> field1 = 'midlevel';
>> field2 = 'bottomlevel';
>> value = toplevel.(field1).(field2)
value =
foo
However, there is a way you can generalize this solution for a string with an arbitrary number of subfields delimited by periods. You can use the function TEXTSCAN to extract the field names from the string and the function GETFIELD to perform the recursive field accessing in one step:
>> fieldnameToAccess = 'midlevel.bottomlevel';
>> fields = textscan(fieldnameToAccess,'%s','Delimiter','.');
>> value = getfield(toplevel,fields{1}{:})
value =
foo