Name each variable differently in a loop - matlab

I have created a .dat file of file names. I want to read into MATLAB each file in that list and give the data a different name. Currently, each iteration just overwrites the last one.
I found that a lot of people give this answer:
for i=1:10
A{i} = 1:i;
end
However, it isn't working for my problem. Here's what I am doing
flist = fopen('fnames.dat'); % Open the list of file names
nt = 0; % Counter will go up one for each file loaded
while ~feof(flist) % While end of file has not been reached
for i = 1:6 % Number of filenames in the .dat file
% For each file
fname = fgetl(flist); % Reads next line of list, which is the name of the next data file
disp(fname); % Stores name as string in fname
nt = nt+1; % Time index
% Save data
data{i} = read_mixed_csv(fname, '\t'); % Reads in the CSV file% Open file
data{i} = data(2:end,:); % Replace header row
end
end
The code runs with no errors, but only one data variable is saved.
My fnames.dat contains this:
IA_2007_MDA8_O3.csv
IN_2007_MDA8_O3.csv
MI_2007_MDA8_O3.csv
MN_2007_MDA8_O3.csv
OH_2007_MDA8_O3.csv
WI_2007_MDA8_O3.csv
If possible, I would really like to name data something more intuitive. Like IA for the first file, IN for the second and so on. Is there any way to do this?

The last line of the loop is the problem:
data{i} = data(2:end,:);
I don't know what exactly happens I did not run your code, but data(2:end,:) refers to the second to last dataset, not the second to last line.
Try:
thisdata = read_mixed_csv(fname, '\t');
data{i} = thisdata(2:end,:);

If you want to keep track of what data came from which file, save out a second cell array with the names:
thisdata = read_mixed_csv(fname, '\t');
data{i} = thisdata(2:end,:);
names{i} = fname(1:2); % presuming you only need first two letters.
If you need a specific part of the filename that's not always the same length look into strtok or fileparts. Then you can use things like strcmp to check the cell array names for where the data labelled IA or whichever is stored.

As mentioned by #Daniel the simple way to store data of various sizes in a cell array.
data{1} = thisdata(2:end,:)
However, if the names are really important, you could consider using a struct instead. For example:
dataStruct(1).numbers= thisdata(2:end,:);
dataStruct(1).name= theRelevantName
Of course you could also just add them to the cell array:
dataCell{1,1} = thisdata(2:end,:);
dataCell{1,2} = theRelevantName

Related

Save the data in a form of three columns in text files

This function reads the data from multiple mat files and save them in multiple txt files. But the data (each value) are saved one value in one column and so on. I want to save the data in a form of three columns (coordinates) in the text files, so each row has three values separated by space. Reshape the data before i save them in a text file doesn't work. I know that dlmwrite should be modified in away to make newline after three values but how?
mat = dir('*.mat');
for q = 1:length(mat)
load(mat(q).name);
[~, testName, ~] = fileparts(mat(q).name);
testVar = eval(testName);
pos(q,:,:) = testVar.Bodies.Positions(1,:,:);
%pos=reshape(pos,2,3,2000);
filename = sprintf('data%d.txt', q);
dlmwrite(filename , pos(q,:,:), 'delimiter','\t','newline','pc')
end
My data structure:
These data should be extracted from each mat file and stored in the corresponding text files like this:
332.68 42.76 42.663 3.0737
332.69 42.746 42.655 3.0739
332.69 42.75 42.665 3.074
A TheMathWorks-trainer once told me that there is almost never a good reason nor a need to use eval. Here's a snippet of code that should solve your writing problem using writematrix since dlmwrite is considered to be deprecated.
It further puts the file-handling/loading on a more resilient base. One can access structs dynamically with the .(FILENAME) notation. This is quite convenient if you know your fields. With who one can list variables in the workspace but also in .mat-files!
Have a look:
% path to folder
pFldr = pwd;
% get a list of all mat-files (returns an array of structs)
Lst = dir( fullfile(pFldr,'*.mat') );
% loop over files
for Fl = Lst.'
% create path to file
pFl = fullfile( Fl.folder, Fl.name );
% variable to load
[~, var2load, ~] = fileparts(Fl.name);
% get names of variables inside the file
varInfo = who('-file',pFl);
% check if it contains the desired variables
if ~all( ismember(var2load,varInfo) )
% display some kind of warning/info
disp(strcat("the file ",Fl.name," does not contain all required varibales and is therefore skipped."))
% skip / continue with loop
continue
end
% load | NO NEED TO USE eval()
Dat = load(pFl, var2load);
% DO WHATEVER YOU WANT TO DO
pos = squeeze( Dat.(var2load)(1,:,1:2000) );
% create file name for text file
pFl2save = fullfile( Fl.folder, strrep(Fl.name,'.mat','.txt') );
writematrix(pos,pFl2save,'Delimiter','\t')
end
To get your 3D-matrix data into a 2D matrix that you can write nicely to a file, use the function squeeze. It gets rid of empty dimensions (in your case, the first dimension) and squeezes the data into a lower-dimensional matrix
Why don't you use writematrix() function?
mat = dir('*.mat');
for q = 1:length(mat)
load(mat(q).name);
[~, testName, ~] = fileparts(mat(q).name);
testVar = eval(testName);
pos(q,:,:) = testVar(1,:,1:2000);
filename = sprintf('data%d.txt', q);
writematrix(pos(q,:,:),filename,'Delimiter','space');
end
More insight you can find here:
https://www.mathworks.com/help/matlab/ref/writematrix.html

MATLAB loop through excel files

My code is posted below. It does exactly what I need it to do.
It reads in a file and plots the data that I need. If I want to read in another file and have it go through the same code, without having to write the whole thing a second time with different variables, is that possible? I would like to store the matrices from each loop.
As you can see the file I get is called: Oxygen_1keV_300K.xlsx
I have another file called: Oxygen_1keV_600K.xlsx
and so on.
How can I loop through these files without having to re-code the whole thing? I then want to plot them all on the same graph. It would be nice to store the final matrix Y and Ymean for each file so they are not overwritten.
clear
clc
files = ['Oxygen_1keV_300K','Oxygen_1keV_300K','Oxygen_1keV_600K','Oxygen_1keV_900K'];
celldata = cellstr(file)
k = cell(1,24);
for k=1:24
data{k} = xlsread('C:\Users\Ben\Desktop\Oxygen_1keV_300K.xlsx',['PKA', num2str(k)]);
end
for i=1:24
xfinal{i}=data{1,i}(end,1);
xi{i}=0:0.001:xfinal{i};
xi{i}=transpose(xi{i});
x{i}=data{1,i}(:,1);
y{i}=data{1,i}(:,4);
yi{i} = interp1(x{i},y{i},xi{i});
end
Y = zeros(10001, numel(data));
for ii = 1 : numel(data)
Y(:, ii) = yi{ii}(1 : 10001);
end
Ymean = mean(Y, 2);
figure (1)
x=0:0.001:10;
semilogy(x,Ymean)
Cell arrays make it very easy to store a list of strings that you can access as part of a for loop. In this case, I would suggest putting your file paths in a cell array as a substitute for the string used in your xlsread call
For example,
%The first file is the same as in your example.
%I just made up file names for the next two.
%Use the full file path if the file is not in your current directory
filepath_list = {'C:\Users\Ben\Desktop\Oxygen_1keV_300K.xlsx', 'file2.xlsx', 'file3.xlsx'};
%To store separate results for each file, make Ymean a cell array or matrix too
YMean = zeros(length(filepath_list), 1);
%Now use a for loop to loop over the files
for ii=1:length(filepath_list)
%Here's where your existing code would go
%I only include the sections which change due to the loop
for k=1:24
%The change is that on this line you use the cell array variable to load the next file path
data{k} = xlsread(filepath_list{ii},['PKA', num2str(k)]);
end
% ... do the rest of your processing
%You'll need to index into Ymean to store your result in the corresponding location
YMean(ii) = mean(Y, 2);
end
Cell arrays are a basic matlab variable type. For an introduction, I recommend the documentation for creating and accessing data in cell arrays.
If all your files are in the same directory, you can also use functions like dir or ls to populate the cell array programatically.

Find which line of a .dat file MATLAB is working on

I have a MATLAB script that reads a line from a text file. Each line of the text file contains the filename of a CSV. I need to keep track of what line MATLAB is working on so that I can save the data for that line in a cell array. How can I do that?
To illustrate, the first few lines of my .dat file looks like this:
2006-01-003-0010.mat
2006-01-027-0001.mat
2006-01-033-1002.mat
2006-01-051-0001.mat
2006-01-055-0011.mat
2006-01-069-0004.mat
2006-01-073-0023.mat
2006-01-073-1003.mat
2006-01-073-1005.mat
2006-01-073-1009.mat
2006-01-073-1010.mat
2006-01-073-2006.mat
2006-01-073-5002.mat
2006-01-073-5003.mat
I need to save the variable site_data from each of these .mat files into a different cell of O3_data. Therefore, I need to have a counter so that O3_data{1} is the data from the first line of the text file, O3_data{2} is the data from the second line, etc.
This code works, but it's done without using the counter so I only get the data for one of the files I'm reading in:
year = 2006:2014;
for y = 1:9
flist = fopen(['MDA8_' num2str(year(y)) '_mat.dat']); % Open the list of file names - CSV files of states with data under consideration
nt = 0; % Counter will go up one for each file loaded
while ~feof(flist) % While end of file has not been reached
fname = fgetl(flist);
disp(fname); % Stores name as string in fname
fid = fopen(fname);
while ~feof(fid)
currentLine = fgetl(fid);
load (fname, 'site_data'); % Load current file. It is all the data for one site for one year
O3_data = site_data;
% Do other stuff
end
fclose(fid);
end
fclose(flist);
end
If I add the time index part, MATLAB is telling me that Subscript indices must either be real positive integers or logicals. nt is an integer so I don't know what I'm doing wrong. I need the time index so that I can have O3_data{i} in which each i is one of the files I'm reading in.
year = 2006:2014;
for y = 1:9
flist = fopen(['MDA8_O3_' num2str(year(y)) '_mat.dat']); % Open the list of file names - CSV files of states with data under consideration
nt = 0;
while ~feof(flist) % While end of file has not been reached
fname = fgetl(flist);
fid = fopen(fname);
while ~feof(fid)
currentLine = fgetl(fid);
nt = nt+1; % Time index
load (fname, 'site_data'); % Load current file. It is all the data for one site for one year
O3_data{nt} = site_data;
% Do other stuff
end
fclose(fid);
end
fclose(flist);
end
Try the following:
years = 2006:2014;
for y=1:numel(years)
% read list of filenames for this year (as a cell array of strings)
fid = fopen(sprintf('MDA8_O3_%d_mat.dat',years(y)), 'rt');
fnames = textscan(fid, '%s');
fnames = fnames{1};
fclose(fid);
% load data from each MAT-file
O3_data = cell(numel(fnames),1);
for i=1:numel(fnames)
S = load(fnames{i}, 'site_data');
O3_data{i} = S.site_data;
end
% do something with O3_data cell array ...
end
Try the following - note that since there is an outer for loop, the nt variable will have to be initialized outside of that loop so that we don't overwrite data from previous years (or previous j's). We can avoid the inner while loop since the just read file is a *.mat file and we are using the load command to load its single variable into the workspace.
year = 2006:2014;
nt = 0;
data_03 = {}; % EDIT added this line to initialize to empty cell array
% note also the renaming from 03_data to data_03
for y = 1:9
% Open the list of file names - CSV files of states with data under
% consideration
flist = fopen(['MDA8_O3_' num2str(year(y)) '_mat.dat']);
% make sure that the file identifier is valid
if flist>0
% While end of file has not been reached
while ~feof(flist)
% get the name of the *.mat file
fname = fgetl(flist);
% load the data into a temp structure
data = load(fname,'site_data');
% save the data to the cell array
nt = nt + 1;
data_03{nt} = data.site_data;
end
fclose(flist); % EDIT moved this in to the if statement
end
end
Note that the above assumes that each *.dat file contains a list of *.mat files as illustrated in your above example.
Note the EDITs in the above code from the previous posting.

Read through and save files with different filenames

I have a list of CSV files. The filenames of these files have been stored in the form [year '_MDA8_mat.dat']. I want to read in each of these files into MATLAB and save the output. How can I write the code so that each year is considered in turn and the output .mat file will be saved for each year?
Here's what I have for reading in one of the files:
flist = fopen('2006_MDA8_mat.dat'); % Open the list of file names - CSV files of states with data under consideration
nt = 0; % Counter will go up one for each file loaded
while ~feof(flist) % While end of file has not been reached
for i = 1:27299 % Number of files
fname = fgetl(flist); % Reads next line of list, which is the name of the next data file
disp(fname); % Stores name as string in fname
nt = nt+1; % Time index
load (fname, 'site_data'); % Load current file. It is all the data for one site for one year
O3_data{i} = site_data;
% Do some more stuff
end
save ('2006_MDA8_1990_2014.mat', '-v7.3')
I tried to write a for loop like this:
year = 2006:2014
for y = 1:9
flist = fopen([year(y) '_MDA8_mat.dat']);
nt = 0; % Counter will go up one for each file loaded
while ~feof(flist) % While end of file has not been reached
for i = 1:1500 % Number of files
% Same as above
end
end
save ([year '_MDA8_1990_2014.mat'], '-v7.3')
end
However, when I run this, it doesn't do the same thing as it did for the one file script. I'm not quite sure where the error occurs, but MATLAB tells me there's an error with feof, which doesn't seem to make sense.
When combining numbers and strings, you need to do a num2str on the number:
[num2str(year) '_MDA8_1990_2014.mat']

How to read numbered sequence of .dat files into MATLAB

I am trying to load a numbered sequence of ".dat" named in the form a01.dat, a02.dat... a51.dat into MATLAB. I used the eval() function with the code below.
%% To load each ".dat" file for the 51 attributes to an array.
a = dir('*.dat');
for i = 1:length(a)
eval(['load ' a(i).name ' -ascii']);
end
attributes = length(a);
I ran into problems with that as I could not easily manipulate the data loaded with the eval function. And I found out the community is strongly against using eval. I used the csvread() with the code below.
% Scan folder for number of ".dat" files
datfiles = dir('*.dat');
% Count Number of ".dat" files
numfiles = length(datfiles);
% Read files in to MATLAB
for i = 1:1:numfiles
A{i} = csvread(datfiles(i).name);
end
The csvread() works for me but it reads the files but messes up the order when it reads the files. It reads a01.dat first and then a10.dat and a11.dat and so on instead of a01.dat, a02.dat... The contents of each files are signed numbers. Some are comma-delimited and single column and this is an even split. So a01.dat's contents are comma-delimited and a02.dat's content are in a single column.
Please how do I handle this?
Your problem seems to be sorting of the files. Drawing on a question on mathworks, this should help you:
datfiles = dir('*.mat');
name = {datfiles.name};
[~, index] = sort(name);
name = name(index);
And then you can loop with just name:
% Read files in to MATLAB
for i = 1:1:numfiles
A{i} = csvread(name{i});
end