Loading multiple datasets without overwriting variable - matlab

I have a folder containing a number of files, named:
filename_1.mat
filename_2.mat
.
.
.
filename_n.mat
Each file contains a dataset named Var, with identical columns. I want to load all these datasets into the workspace and use vertcat() to vertically concatenate them, but when I load them in a for loop I only get the last dataset since the variable Var is overwritten. These datasets were created in a for loop:
% generate filenames
tss = arrayfun(#(x){sprintf('filename_%d',x)},1:(length(1:3)))';
namerr = cell((length(1:3)),1);
namerr(:,1) = {'E:\FILES\'};
file_names = strcat(namerr,tss,'.mat');
% create datasets and save them to E:\FILES
for ii = 1:3
a = rand(1,5)';
b = rand(1,5)';
Var = dataset({[a,b],'a_name','b_name'});
save(file_names{(ii)},'Var','-v6')
end
% Now read these datasets into workspace and concatenate vertically??
% Is there a way for me to name the datasets `Var_1...Var_n`
% so they are not overwritten?

Sure. You can load the data into a variable, and then access the contents of a file as fields in the variable. Starting with your example, this would look something like this:
loadedData_1 = load(file_names{1})
loadedData_2 = load(file_names{2})
loadedData_3 = load(file_names{3})
mergedData = [...
loadedData_1.Var; ...
loadedData_2.Var; ...
loadedData_3.Var ];
You can clean this up by using a loop:
for ix = 3:-1:1 %Load all data, backwards to force preallocation
loadedData(ix) = load(file_names{ix});
end
mergedData = cat(1,loadedData.Var);
Or if you really want to go over the top I think you can do it in one long line using arrayfun, but that's probably going over the top.

It would be far easier to just do it directly in the loop:
...
Varcat = [];
for ii = 1:3
a = rand(1,5)';
b = rand(1,5)';
Var = dataset({[a,b],'a_name','b_name'});
Varcat = [Varcat; Var];
save(file_names{(ii)},'Var','-v6')
end
Var = Varcat;
In case you actually want to do it much later or in another part of the program, hopefully it's clear how to adapt the same approach for a similar loop with load().

Related

Save the data in a form of three columns in text files

This function reads the data from multiple mat files and save them in multiple txt files. But the data (each value) are saved one value in one column and so on. I want to save the data in a form of three columns (coordinates) in the text files, so each row has three values separated by space. Reshape the data before i save them in a text file doesn't work. I know that dlmwrite should be modified in away to make newline after three values but how?
mat = dir('*.mat');
for q = 1:length(mat)
load(mat(q).name);
[~, testName, ~] = fileparts(mat(q).name);
testVar = eval(testName);
pos(q,:,:) = testVar.Bodies.Positions(1,:,:);
%pos=reshape(pos,2,3,2000);
filename = sprintf('data%d.txt', q);
dlmwrite(filename , pos(q,:,:), 'delimiter','\t','newline','pc')
end
My data structure:
These data should be extracted from each mat file and stored in the corresponding text files like this:
332.68 42.76 42.663 3.0737
332.69 42.746 42.655 3.0739
332.69 42.75 42.665 3.074
A TheMathWorks-trainer once told me that there is almost never a good reason nor a need to use eval. Here's a snippet of code that should solve your writing problem using writematrix since dlmwrite is considered to be deprecated.
It further puts the file-handling/loading on a more resilient base. One can access structs dynamically with the .(FILENAME) notation. This is quite convenient if you know your fields. With who one can list variables in the workspace but also in .mat-files!
Have a look:
% path to folder
pFldr = pwd;
% get a list of all mat-files (returns an array of structs)
Lst = dir( fullfile(pFldr,'*.mat') );
% loop over files
for Fl = Lst.'
% create path to file
pFl = fullfile( Fl.folder, Fl.name );
% variable to load
[~, var2load, ~] = fileparts(Fl.name);
% get names of variables inside the file
varInfo = who('-file',pFl);
% check if it contains the desired variables
if ~all( ismember(var2load,varInfo) )
% display some kind of warning/info
disp(strcat("the file ",Fl.name," does not contain all required varibales and is therefore skipped."))
% skip / continue with loop
continue
end
% load | NO NEED TO USE eval()
Dat = load(pFl, var2load);
% DO WHATEVER YOU WANT TO DO
pos = squeeze( Dat.(var2load)(1,:,1:2000) );
% create file name for text file
pFl2save = fullfile( Fl.folder, strrep(Fl.name,'.mat','.txt') );
writematrix(pos,pFl2save,'Delimiter','\t')
end
To get your 3D-matrix data into a 2D matrix that you can write nicely to a file, use the function squeeze. It gets rid of empty dimensions (in your case, the first dimension) and squeezes the data into a lower-dimensional matrix
Why don't you use writematrix() function?
mat = dir('*.mat');
for q = 1:length(mat)
load(mat(q).name);
[~, testName, ~] = fileparts(mat(q).name);
testVar = eval(testName);
pos(q,:,:) = testVar(1,:,1:2000);
filename = sprintf('data%d.txt', q);
writematrix(pos(q,:,:),filename,'Delimiter','space');
end
More insight you can find here:
https://www.mathworks.com/help/matlab/ref/writematrix.html

How do I find out in which strucutre my matlab variable is stored in?

I have raw data in a .mat format. The Data is comprised of a bunch of Structures titled 'TimeGroup_XX' where xx is just a random number. Each one of these structures contains a bunch of signals and their corresponding time steps.
It looks something like this
TimeGroup_45 =
time45: [34069x1 double]
Current_Shunt_5: [34069x1 double]
Voltage_Load_5: [34069x1 double]
This is simply unusable, as I simply do not know where the variable I am looking for is hiding in 100's of the structures contained in the raw data. I just know that I am looking for 'Current_Shut_3' for example!
There has to be a way that would allow me to do the following
for all variables in Work space
I_S3 = Find(Current_Shut_3)
end for
Basically I do not want to manually click through every structure to find my variable and just want it to be saved in a normal time series instead of it being hidden in a random structure! Any suggestion on how to do it? There has to be a way!
I have tried using the 'whos' command, but did not get far, as it only returns a list of the stored strucutres in the workspace. I cannot convert that text to a variable and tell it to search for all the fields.
Thanks guys/girls!
This is a great example of why you shouldn't iterate variable names when there are plenty of adequate storage methods that don't require code gymnastics to get data back out of. If you can change this, do that and don't even bother reading the rest of this answer.
Since everything is apparently contained in one *.mat file, specify an output to load so it's output into a unified structure and use fieldnames to iterate.
Using the following data set, for example:
a1.Current_Shunt_1 = 1;
a2.Current_Shunt_2 = 2;
a5.Current_Shunt_5 = 5;
b1.Current_Shunt_5 = 10;
save('test.mat')
We can do:
% Load data
alldata = load('test.mat');
% Get all structure names
datastructs = fieldnames(alldata);
% Filter out all but aXX structures and iterate through
adata = regexpi(datastructs, 'a\d+', 'Match');
adata = [adata{:}];
queryfield = 'Current_Shunt_5';
querydata = [];
for ii = 1:numel(adata)
tmp = fieldnames(alldata.(adata{ii}));
% See if our query field is present
% If yes, output & break out of loop
test = intersect(queryfield, tmp);
if ~isempty(test)
querydata = alldata.(adata{ii}).(queryfield);
break
end
end
which gives us:
>> querydata
querydata =
5

Matlab loop through functions using an array in a for loop

I am writing a code to do some very simple descriptive statistics, but I found myself being very repetitive with my syntax.
I know there's a way to shorten this code and make it more elegant and time efficient with something like a for-loop, but I am not quite keen enough in coding (yet) to know how to do this...
I have three variables, or groups (All data, condition 1, and condition 2). I also have 8 matlab functions that I need to perform on each of the three groups (e.g mean, median). I am saving all of the data in a table where each column corresponds to one of the functions (e.g. mean) and each row is that function performed on the correspnding group (e.g. (1,1) is mean of 'all data', (2,1) is mean of 'cond 1', and (3,1) is mean of 'cond 2'). It is important to preserve this structure as I am outputting to a csv file that I can open in excel. The columns, again, are labeled according the function, and the rows are ordered by 1) all data 2) cond 1, and 3) cond 2.
The data I am working with is in the second column of these matrices, by the way.
So here is the tedious way I am accomplishing this:
x = cell(3,8);
x{1,1} = mean(alldata(:,2));
x{2,1} = mean(cond1data(:,2));
x{3,1} = mean(cond2data(:,2));
x{1,2} = median(alldata(:,2));
x{2,2} = median(cond1data(:,2));
x{3,2} = median(cond2data(:,2));
x{1,3} = std(alldata(:,2));
x{2,3} = std(cond1data(:,2));
x{3,3} = std(cond2data(:,2));
x{1,4} = var(alldata(:,2)); % variance
x{2,4} = var(cond1data(:,2));
x{3,4} = var(cond2data(:,2));
x{1,5} = range(alldata(:,2));
x{2,5} = range(cond1data(:,2));
x{3,5} = range(cond2data(:,2));
x{1,6} = iqr(alldata(:,2)); % inter quartile range
x{2,6} = iqr(cond1data(:,2));
x{3,6} = iqr(cond2data(:,2));
x{1,7} = skewness(alldata(:,2));
x{2,7} = skewness(cond1data(:,2));
x{3,7} = skewness(cond2data(:,2));
x{1,8} = kurtosis(alldata(:,2));
x{2,8} = kurtosis(cond1data(:,2));
x{3,8} = kurtosis(cond2data(:,2));
% write output to .csv file using cell to table conversion
T = cell2table(x, 'VariableNames',{'mean', 'median', 'stddev', 'variance', 'range', 'IQR', 'skewness', 'kurtosis'});
writetable(T,'descriptivestats.csv')
I know there is a way to loop through this stuff and get the same output in a much shorter code. I tried to write a for-loop but I am just confusing myself and not sure how to do this. I'll include it anyway so maybe you can get an idea of what I'm trying to do.
x = cell(3,8);
data = [alldata, cond2data, cond2data];
dfunction = ['mean', 'median', 'std', 'var', 'range', 'iqr', 'skewness', 'kurtosis'];
for i = 1:8,
for y = 1:3
x{y,i} = dfucntion(i)(data(1)(:,2));
x{y+1,i} = dfunction(i)(data(2)(:,2));
x{y+2,i} = dfunction(i)(data(3)(:,2));
end
end
T = cell2table(x, 'VariableNames',{'mean', 'median', 'stddev', 'variance', 'range', 'IQR', 'skewness', 'kurtosis'});
writetable(T,'descriptivestats.csv')
Any ideas on how to make this work??
You want to use a cell array of function handles. The easiest way to do that is to use the # operator, as in
dfunctions = {#mean, #median, #std, #var, #range, #iqr, #skewness, #kurtosis};
Also, you want to combine your three data variables into one variable, to make it easier to iterate over them. There are two choices I can see. If your data variables are all M-by-2 in dimension, you could concatenate them into a M-by-2-by-3 three-dimensional array. You could do that with
data = cat(3, alldata, cond1data, cond2data);
The indexing expression into data that retrieves the values you want would be data(:, 2, y). That said, I think this approach would have to copy a lot of data around and probably isn't the best for performance. The other way to combine data together is in 1-by-3 cell array, like this:
data = {alldata, cond1data, cond2data};
The indexing expression into data that retrieves the values you want in this case would be data{y}(:, 2).
Since you are looping from y == 1 to y == 3, you only need one line in your inner loop body, not three.
for y = 1:3
x{y, i} = dfunctions{i}(data{y}(:,2));
end
Finally, to get the cell array of strings containing function names to pass to cell2table, you can use cellfun to apply func2str to each element of dfunctions:
funcnames = cellfun(#func2str, dfunctions, 'UniformOutput', false);
The final version looks like this:
dfunctions = {#mean, #median, #std, #var, #range, #iqr, #skewness, #kurtosis};
data = {alldata, cond1data, cond2data};
x = cell(length(data), length(dfunctions));
for i = 1:length(dfunctions)
for y = 1:length(data)
x{y, i} = dfunctions{i}(data{y}(:,2));
end
end
funcnames = cellfun(#func2str, dfunctions, 'UniformOutput', false);
T = cell2table(x, 'VariableNames', funcnames);
writetable(T,'descriptivestats.csv');
You can create a cell array of functions using str2func :
function_string = {'mean', 'median', 'std', 'var', 'range', 'iqr', 'skewness', 'kurtosis'};
dfunction = {};
for ii = 1:length(function_string)
fun{ii} = str2func(function_string{ii})
end
Then you can use it on your data as you'd like to :
for ii = 1:8,
for y = 1:3
x{y,i} = dfucntion{ii}(data(1)(:,2));
x{y+1,i} = dfunction{ii}(data(2)(:,2));
x{y+2,i} = dfunction{ii}(data(3)(:,2));
end
end

Looping a process, outputting numerically labelled variables each time

I have about 50 different arrays and I want to perform the following operation on all of them:
data1(isnan(data1)) = 0;
coldata1 = nonzeros(data1);
avgdata1 = mean(coldata1);
and so on for data2, data3 etc... the goal being to turn data1 into a vector without NaNs and then take a mean, saving the vector and the mean into coldata1 and avgdata1.
I'm looking for a way to automate this for all 50, rather than copy it 50 times and change the numbers... any ideas? I've been playing with eval but no luck so far. Also tried:
for y = 1:50
data(y)(isnan(data(y))) = 0;
coldata(y) = nonzeros(data(y));
avgdata(y) = mean(coldata(y));
end
You can do it with eval but really should not. Rather use a cell array as suggested here: Create variables with names from strings
i.e.
for y = 1:50
data{y}(isnan(data{y})) = 0;
coldata{y} = nonzeros(data{y});
avgdata{y} = mean(coldata{y});
end
Also read How can I create variables A1, A2,...,A10 in a loop? for alternative options.

How can I create/process variables in a loop in MATLAB?

I need to calculate the mean, standard deviation, and other values for a number of variables and I was wondering how to use a loop to my advantage. I have 5 electrodes of data. So to calculate the mean of each I do this:
mean_ch1 = mean(ch1);
mean_ch2 = mean(ch2);
mean_ch3 = mean(ch3);
mean_ch4 = mean(ch4);
mean_ch5 = mean(ch5);
What I want is to be able to condense that code into a line or so. The code I tried does not work:
for i = 1:5
mean_ch(i) = mean(ch(i));
end
I know this code is wrong but it conveys the idea of what I'm trying to accomplish. I want to end up with 5 separate variables that are named by the loop or a cell array with all 5 variables within it allowing for easy recall. I know there must be a way to write this code I'm just not sure how to accomplish it.
You have a few options for how you can do this:
You can put all your channel data into one large matrix first, then compute the mean of the rows or columns using the function MEAN. For example, if each chX variable is an N-by-1 array, you can do the following:
chArray = [ch1 ch2 ch3 ch4 ch5]; %# Make an N-by-5 matrix
meanArray = mean(chArray); %# Take the mean of each column
You can put all your channel data into a cell array first, then compute the mean of each cell using the function CELLFUN:
meanArray = cellfun(#mean,{ch1,ch2,ch3,ch4,ch5});
This would work even if each chX array is a different length from one another.
You can use EVAL to generate the separate variables for each channel mean:
for iChannel = 1:5
varName = ['ch' int2str(iChannel)]; %# Create the name string
eval(['mean_' varName ' = mean(' varName ');']);
end
If it's always exactly 5 channels, you can do
ch = {ch1, ch2, ch3, ch4, ch5}
for j = 1:5
mean_ch(j) = mean(ch{j});
end
A more complicated way would be
for j = 1:nchannels
mean_ch(j) = eval(['mean(ch' num2str(j) ')']);
end
Apart from gnovice's answer. You could use structures and dynamic field names to accomplish your task. First I assume that your channel data variables are all in the format ch* and are the only variables in your MATLAB workspace. The you could do something like the following
%# Move the channel data into a structure with fields ch1, ch2, ....
%# This could be done by saving and reloading the workspace
save('channelData.mat','ch*');
chanData = load('channelData.mat');
%# Next you can then loop through the structure calculating the mean for each channel
flds = fieldnames(chanData); %# get the fieldnames stored in the structure
for i=1:length(flds)
mean_ch(i) = mean(chanData.(flds{i});
end