I am writing a code to do some very simple descriptive statistics, but I found myself being very repetitive with my syntax.
I know there's a way to shorten this code and make it more elegant and time efficient with something like a for-loop, but I am not quite keen enough in coding (yet) to know how to do this...
I have three variables, or groups (All data, condition 1, and condition 2). I also have 8 matlab functions that I need to perform on each of the three groups (e.g mean, median). I am saving all of the data in a table where each column corresponds to one of the functions (e.g. mean) and each row is that function performed on the correspnding group (e.g. (1,1) is mean of 'all data', (2,1) is mean of 'cond 1', and (3,1) is mean of 'cond 2'). It is important to preserve this structure as I am outputting to a csv file that I can open in excel. The columns, again, are labeled according the function, and the rows are ordered by 1) all data 2) cond 1, and 3) cond 2.
The data I am working with is in the second column of these matrices, by the way.
So here is the tedious way I am accomplishing this:
x = cell(3,8);
x{1,1} = mean(alldata(:,2));
x{2,1} = mean(cond1data(:,2));
x{3,1} = mean(cond2data(:,2));
x{1,2} = median(alldata(:,2));
x{2,2} = median(cond1data(:,2));
x{3,2} = median(cond2data(:,2));
x{1,3} = std(alldata(:,2));
x{2,3} = std(cond1data(:,2));
x{3,3} = std(cond2data(:,2));
x{1,4} = var(alldata(:,2)); % variance
x{2,4} = var(cond1data(:,2));
x{3,4} = var(cond2data(:,2));
x{1,5} = range(alldata(:,2));
x{2,5} = range(cond1data(:,2));
x{3,5} = range(cond2data(:,2));
x{1,6} = iqr(alldata(:,2)); % inter quartile range
x{2,6} = iqr(cond1data(:,2));
x{3,6} = iqr(cond2data(:,2));
x{1,7} = skewness(alldata(:,2));
x{2,7} = skewness(cond1data(:,2));
x{3,7} = skewness(cond2data(:,2));
x{1,8} = kurtosis(alldata(:,2));
x{2,8} = kurtosis(cond1data(:,2));
x{3,8} = kurtosis(cond2data(:,2));
% write output to .csv file using cell to table conversion
T = cell2table(x, 'VariableNames',{'mean', 'median', 'stddev', 'variance', 'range', 'IQR', 'skewness', 'kurtosis'});
writetable(T,'descriptivestats.csv')
I know there is a way to loop through this stuff and get the same output in a much shorter code. I tried to write a for-loop but I am just confusing myself and not sure how to do this. I'll include it anyway so maybe you can get an idea of what I'm trying to do.
x = cell(3,8);
data = [alldata, cond2data, cond2data];
dfunction = ['mean', 'median', 'std', 'var', 'range', 'iqr', 'skewness', 'kurtosis'];
for i = 1:8,
for y = 1:3
x{y,i} = dfucntion(i)(data(1)(:,2));
x{y+1,i} = dfunction(i)(data(2)(:,2));
x{y+2,i} = dfunction(i)(data(3)(:,2));
end
end
T = cell2table(x, 'VariableNames',{'mean', 'median', 'stddev', 'variance', 'range', 'IQR', 'skewness', 'kurtosis'});
writetable(T,'descriptivestats.csv')
Any ideas on how to make this work??
You want to use a cell array of function handles. The easiest way to do that is to use the # operator, as in
dfunctions = {#mean, #median, #std, #var, #range, #iqr, #skewness, #kurtosis};
Also, you want to combine your three data variables into one variable, to make it easier to iterate over them. There are two choices I can see. If your data variables are all M-by-2 in dimension, you could concatenate them into a M-by-2-by-3 three-dimensional array. You could do that with
data = cat(3, alldata, cond1data, cond2data);
The indexing expression into data that retrieves the values you want would be data(:, 2, y). That said, I think this approach would have to copy a lot of data around and probably isn't the best for performance. The other way to combine data together is in 1-by-3 cell array, like this:
data = {alldata, cond1data, cond2data};
The indexing expression into data that retrieves the values you want in this case would be data{y}(:, 2).
Since you are looping from y == 1 to y == 3, you only need one line in your inner loop body, not three.
for y = 1:3
x{y, i} = dfunctions{i}(data{y}(:,2));
end
Finally, to get the cell array of strings containing function names to pass to cell2table, you can use cellfun to apply func2str to each element of dfunctions:
funcnames = cellfun(#func2str, dfunctions, 'UniformOutput', false);
The final version looks like this:
dfunctions = {#mean, #median, #std, #var, #range, #iqr, #skewness, #kurtosis};
data = {alldata, cond1data, cond2data};
x = cell(length(data), length(dfunctions));
for i = 1:length(dfunctions)
for y = 1:length(data)
x{y, i} = dfunctions{i}(data{y}(:,2));
end
end
funcnames = cellfun(#func2str, dfunctions, 'UniformOutput', false);
T = cell2table(x, 'VariableNames', funcnames);
writetable(T,'descriptivestats.csv');
You can create a cell array of functions using str2func :
function_string = {'mean', 'median', 'std', 'var', 'range', 'iqr', 'skewness', 'kurtosis'};
dfunction = {};
for ii = 1:length(function_string)
fun{ii} = str2func(function_string{ii})
end
Then you can use it on your data as you'd like to :
for ii = 1:8,
for y = 1:3
x{y,i} = dfucntion{ii}(data(1)(:,2));
x{y+1,i} = dfunction{ii}(data(2)(:,2));
x{y+2,i} = dfunction{ii}(data(3)(:,2));
end
end
I want to store efficiently some numbers as strings (with different lengths) into a table. This is my code:
% Table with numbers
n = 5;
m = 5;
T_numb = array2table((rand(n,m)));
% I create a table with empty cells (to store strings)
T_string = array2table(cell(n,m));
for i = 1:height(T_numb)
for ii = 1:width(T_numb)
T_string{i,ii} = cellstr(num2str(T_numb{i,ii}, '%.2f'));
end
end
What could I do to improve it? Thank you.
I don't have access to the function cell2table right now, but using the undocumented function sprintfc might work well here (check here for details).
For instance:
%// 2D array
a = magic(5)
b = sprintfc('%0.2f',a)
generates a cell array like this:
b =
'17.00' '24.00' '1.00' '8.00' '15.00'
'23.00' '5.00' '7.00' '14.00' '16.00'
'4.00' '6.00' '13.00' '20.00' '22.00'
'10.00' '12.00' '19.00' '21.00' '3.00'
'11.00' '18.00' '25.00' '2.00' '9.00'
which you can convert to a table using cell2table.
So in 1 line:
YourTable = cell2table(sprintfc('%0.2f',a))
This seems to be quite fast -
T_string = cell2table(reshape(strtrim(cellstr(num2str(A(:),'%.2f'))),size(A)))
Or with regexprep to replace strtrim -
cell2table(reshape(regexprep(cellstr(num2str(A(:),'%.2f')),'\s*',''),size(A)))
Here, A is the 2D input numeric array.
As part of an image processing pipeline using 'regionprops' in Matlab I generate the struct:
vWFfeatures =
1631x1 struct array with fields:
Area
Centroid
MajorAxisLength
MinorAxisLength
Eccentricity
EquivDiameter
Where 'Centroid' is a Vector containing [x, y] for example [12.4, 26.2]. I would like to convert this struct to a table and save as a CSV file. The objective is to separate the 'Centroid' vector into two columns in the table labelled Centroid_X and Centroid_Y for example. I am not sure how to achieve this.
So far I have investigated using the 'struct2table' function. This ouputs the 'Centroid' as one column. In addition when I try to assign the output to a variable I get an error:
table = struct2table(vWFfeatures)
Error using struct2table
Too many output arguments.
I cannot understand this, any help please?
Since the original struct2table isn't available to you, you might want to implement specifically the behavior you're trying to achieve yourself.
In this case, this means extracting the values you want to save, (split the array,) then save the data:
data_Centroid = vertcat(vWFfeatures.Centroid); %// contains the centroid data
Centroid_X = data_Centroid(:,1); %// The first column is X
Centroid_Y = data_Centroid(:,2); %// the second column is Y
csvwrite('centroid.csv',data_Centroid); %// writes values into csv
If you want column headers in your csv, it gets complicated because csvwrite can only handle numeric arrays:
celldata = num2cell(num2str(data_Centroid)); %// create cell array
celldata(:,3) = celldata(:,4); %// copy col 4 (y data) into col 3 (spaces)
for i=1:length(celldata)
celldata{i,2} = ','; %// col 2 has commas
celldata{i,4} = '\n'; %// col 4 has newlines
end
celldata = celldata'; %'// transpose to make the entries come columnwise
strdata = ['Centroid_X,Centroid_Y\n',celldata{:}]; %// contains all as string
fid = fopen('centroid.csv','w'); % writing the string into the csv
fprintf(fid,strdata);
fclose(fid);
This is how I solved it in the end: extracted each field from struct used horzcat to join into a new array then defined headers and used csvwrite_with_headers, to ouput to csv.
wpbFeatures = regionprops(vWFlabelled, 'Area','Centroid', 'MajorAxisLength', 'MinorAxisLength', 'Eccentricity', 'EquivDiameter');
wpbArea = vertcat(wpbFeatures.Area);
wpbCentroid = vertcat(wpbFeatures.Centroid);
wpbCentroidX = wpbCentroid(:,1);
wpbCentroidY = wpbCentroid(:,2);
wpbFeret = max(imFeretDiameter(vWFlabelled, linspace(0,180,201)), [], 2);
wpbMajorAxisLength = vertcat(wpbFeatures.MajorAxisLength);
wpbMinorAxisLength = vertcat(wpbFeatures.MinorAxisLength);
wpbEccentricity = vertcat(wpbFeatures.Eccentricity);
wpbEquivDiameter = vertcat(wpbFeatures.EquivDiameter);
wpbFeatures = horzcat(wpbArea, wpbCentroidX, wpbCentroidY, wpbFeret, wpbMajorAxisLength, wpbMinorAxisLength, wpbEccentricity, wpbEquivDiameter);
headers = {'Area','CentroidX','CentroidY', 'Feret', 'MajorAxisLength', 'MinorAxisLength', 'Eccentricity', 'EquivDiameter'};
csvwrite_with_headers(strcat(PlateName, '_ResultsFeatures.csv'),wpbFeatures,headers);
I am wondering how can I write a full structure in an output file *.txt with all fieldnames and the respectives values. For example
allvariables=
names='raw.txt'
date= = '**/**/2013'
User = Mr/Mrs *
timeProcessing = 20
numberIterations = 1000;
Should be written in output.txt with all the information displayed before.
This is just an example but my structure has a length of 50 fieldnames, so I would appreciate any suggestion!
Here's something you can use:
%// Extract field data
fields = fieldnames(allvariables);
values = struct2cell(allvariables);
%// Optional: add enclosing apostrophes around string values
idx = cellfun(#ischar, values);
values(idx) = cellfun(#(x){['''', x, '''']}, values(idx));
%// Convert numerical values to strings
idx = cellfun(#isnumeric, values);
values(idx) = cellfun(#num2str, values(idx), 'UniformOutput', false);
%// Convert cell arrays of strings to comma-delimited strings
idx = cellfun(#iscellstr, values);
stringify_cellstr = #(x){['{' sprintf('''%s'', ', x{1:end - 1}) ...
sprintf('''%s''', x{end}) '}']};
values(idx) = cellfun(stringify_cellstr, values(idx));
%// Convert cell array of numbers to strings
idx = cellfun(#iscell, values);
isnumber = #(x)isnumeric(x) && isscalar(x);
idx_num = cellfun(#(x)all(arrayfun(#(k)isnumber(x{k}),1:numel(x))), values(idx));
idx(idx) = idx_num;
stringify_cellnum = #(x){['{' sprintf('%d, ' x{1:end - 1}) num2str(x{end}) '}']};
values(idx) = cellfun(stringify_cellnum, values(idx));
%// Combine field names and values in the same array
C = {fields{:}; values{:}};
%// Write fields to text file
fid = fopen('output.txt', 'wt');
fprintf(fid, repmat('%s = %s\n', 1, size(C, 2)), C{:});
fclose(fid);
This solution is essentially a variant of this one. Note that it assumes that each field contains a scalar value or a string, but it can be extended to handle other types, of course.
EDIT: added handling for fields storing cell arrays of strings and cell arrays of numbers.
I have a matrix of structs. I'm trying to extract from that matrix a matrix the same size
with only one of the fields as values.
I've been trying to use struct2cell and similar functions without success.
How can this be done?
If I understand you correctly, you have an array of struct like e.g this
s(1:2,1:3) = struct('a',1,'b',2);
Now you want a different struct that only has the field b
[newS(1:2,1:3).b] = deal(s.b);
edit
If all you need is the output (and if the field values are scalar), you can do the following:
out = zeros(size(s));
out(:) = cat(1,s.b)
I'll borrow Jonas' example. You can use the [] to gather a particular field.
% Create structure array
s(1:2,1:3) = struct('a',1,'b',2);
% Change values
for idx = 1:prod(size(s))
s(idx).a = idx;
s(idx).b = idx^2;
end
% Gather a specific field and reshape it to the size of the original matrix
A = reshape([s.a],size(s));
B = reshape([s.b],size(s));
I have a similar problem, but the contents of the field in my structure array are varying length strings that I use to tag my data, so when I extract the contents of the field, I want a cell of varying length strings.
This code using getfield and arrayfun does the job, but I think it is more complicated than it needs to be.
sa = struct('name', {'ben' 'frank', 'betty', 'cybil', 'jack'}, 'value', {1 1 2 3 5})
names = arrayfun(#(x) getfield(x, 'name'), sa, 'UniformOutput', false)
Can anyone suggest cleaner alternative? extractfield in the mapping toolbox seems to do the job, but it is not part of the base MATLAB system.
Update: I have answered my own embedded question.
names = {sa.name}