How to read a numbered sequence of .dat files into MATLAB

I am trying to load a numbered sequence of ".dat" files, named in the form a01.dat, a02.dat... a51.dat, into MATLAB. I used the eval() function with the code below.
%% To load each ".dat" file for the 51 attributes to an array.
a = dir('*.dat');
for i = 1:length(a)
eval(['load ' a(i).name ' -ascii']);
end
attributes = length(a);
I ran into problems with that, as I could not easily manipulate the data loaded with the eval function. I also found out that the community is strongly against using eval. I then used csvread() with the code below.
% Scan folder for number of ".dat" files
datfiles = dir('*.dat');
% Count Number of ".dat" files
numfiles = length(datfiles);
% Read files in to MATLAB
for i = 1:1:numfiles
A{i} = csvread(datfiles(i).name);
end
csvread() works for me, but it reads the files in the wrong order: it reads a01.dat first, then a10.dat and a11.dat and so on, instead of a01.dat, a02.dat... The contents of each file are signed numbers. Half of the files are comma-delimited and the other half are a single column; for example, a01.dat is comma-delimited and a02.dat is a single column.
How do I handle this?

Your problem seems to be the sorting of the files. Drawing on a question on MathWorks, this should help you:
datfiles = dir('*.dat');
name = {datfiles.name};
[~, index] = sort(name);
name = name(index);
And then you can loop over name:
% Read files into MATLAB
for i = 1:numel(name)
A{i} = csvread(name{i});
end
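If the numbers in the file names are not zero-padded (for example a1.dat, a2.dat, a10.dat), a plain text sort will not give numeric order. A minimal sketch of sorting by the embedded number instead is shown here; the list of names is a made-up stand-in for the output of dir, and the 'a%d.dat' pattern is an assumption about the naming scheme.

```matlab
% Hypothetical file names, deliberately out of order
names = {'a10.dat', 'a01.dat', 'a02.dat', 'a11.dat'};

% Pull the integer out of each name (assumes the pattern a<number>.dat)
nums = cellfun(@(s) sscanf(s, 'a%d.dat'), names);

% Sort by the numeric value rather than by the text
[~, idx] = sort(nums);
sortedNames = names(idx);
```

The sorted cell array can then be used in the reading loop in place of name.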

Related

Save the data in a form of three columns in text files

This function reads the data from multiple mat files and saves them in multiple txt files. But the values are saved one per column. I want to save the data as three columns (coordinates) in the text files, so that each row has three values separated by a space. Reshaping the data before I save them in a text file doesn't work. I know that the dlmwrite call should be modified in a way that starts a new line after every three values, but how?
mat = dir('*.mat');
for q = 1:length(mat)
load(mat(q).name);
[~, testName, ~] = fileparts(mat(q).name);
testVar = eval(testName);
pos(q,:,:) = testVar.Bodies.Positions(1,:,:);
%pos=reshape(pos,2,3,2000);
filename = sprintf('data%d.txt', q);
dlmwrite(filename , pos(q,:,:), 'delimiter','\t','newline','pc')
end
My data structure:
These data should be extracted from each mat file and stored in the corresponding text files like this:
332.68 42.76 42.663 3.0737
332.69 42.746 42.655 3.0739
332.69 42.75 42.665 3.074
A MathWorks trainer once told me that there is almost never a good reason or a need to use eval. Here's a snippet of code that should solve your writing problem using writematrix, since dlmwrite is no longer recommended.
It also puts the file handling and loading on a more resilient base. Struct fields can be accessed dynamically with the .(fieldName) notation, which is convenient when the field name is held in a variable. With who you can list the variables in the workspace, and with the '-file' option also those stored in a .mat file.
Have a look:
% path to folder
pFldr = pwd;
% get a list of all mat-files (returns an array of structs)
Lst = dir( fullfile(pFldr,'*.mat') );
% loop over files
for Fl = Lst.'
% create path to file
pFl = fullfile( Fl.folder, Fl.name );
% variable to load
[~, var2load, ~] = fileparts(Fl.name);
% get names of variables inside the file
varInfo = who('-file',pFl);
% check if it contains the desired variables
if ~all( ismember(var2load,varInfo) )
% display some kind of warning/info
disp(strcat("the file ",Fl.name," does not contain all required variables and is therefore skipped."))
% skip / continue with loop
continue
end
% load | NO NEED TO USE eval()
Dat = load(pFl, var2load);
% DO WHATEVER YOU WANT TO DO
pos = squeeze( Dat.(var2load)(1,:,1:2000) );
% create file name for text file
pFl2save = fullfile( Fl.folder, strrep(Fl.name,'.mat','.txt') );
writematrix(pos,pFl2save,'Delimiter','\t')
end
To get your 3D-matrix data into a 2D matrix that you can write nicely to a file, use the function squeeze. It removes singleton dimensions (in your case, the first dimension) and so squeezes the data into a lower-dimensional matrix.
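As a small illustration of what squeeze does, here is a sketch on a made-up 1x3x5 array (the sizes and values are assumptions, not the asker's data). If the coordinates end up along the rows after squeezing, a transpose gives one point per row with three columns.

```matlab
% Hypothetical data: 1 body, 3 coordinates, 5 time steps
P = zeros(1, 3, 5);
P(1, :, :) = reshape(1:15, 3, 5);

% squeeze removes the singleton first dimension: 1x3x5 becomes 3x5
M = squeeze(P);

% Transpose so that each row holds one x/y/z triple
rowsXYZ = M.';
```

Writing rowsXYZ with writematrix would then produce three values per line.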
Why don't you use the writematrix() function?
mat = dir('*.mat');
for q = 1:length(mat)
load(mat(q).name);
[~, testName, ~] = fileparts(mat(q).name);
testVar = eval(testName);
pos(q,:,:) = testVar(1,:,1:2000);
filename = sprintf('data%d.txt', q);
writematrix(pos(q,:,:),filename,'Delimiter','space');
end
You can find more details here:
https://www.mathworks.com/help/matlab/ref/writematrix.html
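The loop above still calls eval to get at the loaded variable. A minimal sketch of the eval-free alternative is to load into a struct and use a dynamic field name. The file name trial01.mat and the variable trial01 are hypothetical and are created in a temporary folder only so that the example runs on its own.

```matlab
% Create a small .mat file whose variable has the same name as the file
S.trial01 = rand(1, 3, 4);
matFile = fullfile(tempdir, 'trial01.mat');
save(matFile, '-struct', 'S');          % stores the variable trial01

% Load without eval: the result is a struct with one field per variable
[~, varName] = fileparts(matFile);      % 'trial01'
loaded = load(matFile, varName);
data = loaded.(varName);                % dynamic field access

delete(matFile);                        % clean up the temporary file
```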

Recursively read images from subdirectories

I am stuck on something that is supposed to be so simple.
I have a folder, say main_folder, with four subfolders, say sub1, sub2, sub3 and sub4, each containing over 100 images. Now I am trying to read and store them in an array. I have looked all over the internet and some MATLAB docs:
here, here and even the official doc.
My code is like this:
folder = 'main_folder/**'; %path containing all the training images
dirImage = dir('main_folder/**/*.jpg');%rdir(fullfile(folder,'*.jpg')); %reading the contents of directory
numData = size(dirImage,1); %no. of samples
arrayImage = zeros(numData, 133183); % zeros matrix for storing the extracted features from images
for i=1:numData
ifile = dirImage(i).name;
% ifolder = dirImage(i).folder;
I=imread([folder, '/', ifile]); %%%% read the image %%%%%
I=imresize(I,[128 128]);
...
If I try the code in the above snippet, the images are not read.
But if I replace the first two lines with something like:
folder = 'main_folder/'; %path containing all the training images
dirImage = dir('main_folder/sub1/*.jpg'); %rdir(fullfile(folder,'*.jpg'));
then all images in sub1 are read. How can I fix this? Any help will be highly appreciated. I want to read all the images in the four sub folders at once.
I am using MATLAB R2015a.
The recursive '**' wildcard in dir is only available from R2016b onward, so on R2015a I believe you will need to use genpath to get all sub-folders, and then loop through each of them, like:
dirs = genpath('main_folder/'); % all folders recursively
dirs = regexp(dirs, pathsep, 'split'); % split into cellstr
dirs = dirs(~cellfun('isempty', dirs)); % drop any empty entry left by a trailing pathsep
for i = 1:numel(dirs)
dirImage = dir(fullfile(dirs{i}, '*.jpg')); % jpg in one sub-folder
for j = 1:numel(dirImage)
img = imread(fullfile(dirs{i}, dirImage(j).name));
% process img using your code
end
end
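To know the number of images before preallocating a feature matrix, it can help to collect all file paths first and read the images in a second pass. The sketch below does only the collecting step. It builds a throwaway folder tree with empty placeholder files (all names are hypothetical) so that it can run without real images.

```matlab
% Build a temporary tree: <root>/sub1/a.jpg and <root>/sub2/b.jpg
root = tempname;
mkdir(fullfile(root, 'sub1'));
mkdir(fullfile(root, 'sub2'));
fclose(fopen(fullfile(root, 'sub1', 'a.jpg'), 'w'));
fclose(fopen(fullfile(root, 'sub2', 'b.jpg'), 'w'));

% List every folder below root and drop empty entries from the split
dirs = strsplit(genpath(root), pathsep);
dirs = dirs(~cellfun('isempty', dirs));

% Collect the full path of every .jpg file
files = {};
for i = 1:numel(dirs)
    listing = dir(fullfile(dirs{i}, '*.jpg'));
    for j = 1:numel(listing)
        files{end+1, 1} = fullfile(dirs{i}, listing(j).name); %#ok<SAGROW>
    end
end

rmdir(root, 's');   % remove the temporary tree
```

With the list in hand, numel(files) gives the number of rows to preallocate, and each entry can be passed straight to imread.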

MATLAB loop through excel files

My code is posted below. It does exactly what I need it to do.
It reads in a file and plots the data that I need. If I want to read in another file and have it go through the same code, without having to write the whole thing a second time with different variables, is that possible? I would like to store the matrices from each loop.
As you can see the file I get is called: Oxygen_1keV_300K.xlsx
I have another file called: Oxygen_1keV_600K.xlsx
and so on.
How can I loop through these files without having to re-code the whole thing? I then want to plot them all on the same graph. It would be nice to store the final matrix Y and Ymean for each file so they are not overwritten.
clear
clc
files = ['Oxygen_1keV_300K','Oxygen_1keV_300K','Oxygen_1keV_600K','Oxygen_1keV_900K'];
celldata = cellstr(file)
k = cell(1,24);
for k=1:24
data{k} = xlsread('C:\Users\Ben\Desktop\Oxygen_1keV_300K.xlsx',['PKA', num2str(k)]);
end
for i=1:24
xfinal{i}=data{1,i}(end,1);
xi{i}=0:0.001:xfinal{i};
xi{i}=transpose(xi{i});
x{i}=data{1,i}(:,1);
y{i}=data{1,i}(:,4);
yi{i} = interp1(x{i},y{i},xi{i});
end
Y = zeros(10001, numel(data));
for ii = 1 : numel(data)
Y(:, ii) = yi{ii}(1 : 10001);
end
Ymean = mean(Y, 2);
figure (1)
x=0:0.001:10;
semilogy(x,Ymean)
Cell arrays make it very easy to store a list of strings that you can access as part of a for loop. In this case, I would suggest putting your file paths in a cell array as a substitute for the string used in your xlsread call.
For example,
%The first file is the same as in your example.
%I just made up file names for the next two.
%Use the full file path if the file is not in your current directory
filepath_list = {'C:\Users\Ben\Desktop\Oxygen_1keV_300K.xlsx', 'file2.xlsx', 'file3.xlsx'};
%To store separate results for each file, make Ymean a cell array or matrix too
YMean = cell(length(filepath_list), 1);
%Now use a for loop to loop over the files
for ii=1:length(filepath_list)
%Here's where your existing code would go
%I only include the sections which change due to the loop
for k=1:24
%The change is that on this line you use the cell array variable to load the next file path
data{k} = xlsread(filepath_list{ii},['PKA', num2str(k)]);
end
% ... do the rest of your processing
%You'll need to index into YMean to store your result in the corresponding location
%(mean(Y, 2) is a column vector, so it is stored in a cell rather than a scalar slot)
YMean{ii} = mean(Y, 2);
end
Cell arrays are a basic MATLAB variable type. For an introduction, I recommend the documentation for creating and accessing data in cell arrays.
If all your files are in the same directory, you can also use functions like dir or ls to populate the cell array programmatically.
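Since every file yields a mean vector of the same length, one option is to keep the results as columns of a matrix, which also makes plotting all curves on one graph straightforward. This is only a sketch with synthetic numbers standing in for the Excel data; nFiles, nPoints and the fake Y are assumptions.

```matlab
nFiles  = 3;    % hypothetical number of Excel files
nPoints = 5;    % hypothetical number of interpolated points per curve

% One column per file, so nothing is overwritten between iterations
Ymean = zeros(nPoints, nFiles);

for ii = 1:nFiles
    % Stand-in for the per-file matrix Y built from the 24 sheets
    Y = ii * ones(nPoints, 24);
    Ymean(:, ii) = mean(Y, 2);
end

% All curves could then be drawn in one call, e.g. semilogy(x, Ymean),
% where x is a vector with nPoints elements.
```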

How to import and save in Matlab Multiple Text Files creating a Matrix for each files

I have a very large data set which is divided into folders: 100 folders with approximately 200 text files each. I have been trying a for loop, first importing one file and then importing the rest in another command. However, I am not interested in a single data array; I want to keep each file's data under its own name, because I then have to match the dates among all the files, and the files do not all have the same number of columns.
Each text file is like the one I have attached, where the data I need starts at row 23 and extends to column 13.
The files are saved as 010010.txt, 010030.txt, 010050.txt ... up to 014957.txt; the numbers are not sequential.
Apart from this I have created a script for importing one file but I would like to know how to repeat the same script for the rest.
filename = 'C:*\010010.txt';
startRow = 22;
formatSpec = '%4f%6f%6f%6f%6f%6f%6f%6f%6f%6f%6f%6f%6f%[^\n\r]';
fileID = fopen(filename,'r');
dataArray = textscan(fileID, formatSpec, 'Delimiter', '', 'WhiteSpace', '', 'HeaderLines' ,startRow-1, 'ReturnOnError', false);
fclose(fileID);
Untitled (010010) = [dataArray{1:end-1}];
I would like to repeat the same import process for the rest of the files. I would appreciate any suggestions.
The text files have the following format:
I only need from row 23 and column 13 and each txt file has different number of rows as some have data from 1992 - 2014 and other have only 2000 - 2014. The first column is the year and column 2 to 13 are months.
I guess you know the basepath under which all your folders are. You can then use something like this:
% First find all folders
folders = cell(0); % empty cell to save folder names
nFolders = 0;
allFolders = ls(basePath); % find all files and folders
for k=1:size(allFolders,1)
curFolder = fullfile(basePath,strtrim(allFolders(k,:)));
if isdir(curFolder) % find out if it is a folder
if ~(allFolders(k,1) == '.') % ignore '.' and '..'
folders{nFolders+1,1} = curFolder; % Save folder path
nFolders = nFolders + 1;
end
end
end
% Then find all files inside these folders
files = cell(0); % empty cell array for file names
nFiles = 0;
for k=1:nFolders % go through all folders
allFiles = ls(folders{k,1});
for l=1:size(allFiles,1) % go through all found files/subfolders
curFile = fullfile(folders{k},strtrim(allFiles(l,:)));
if ~isdir(curFile) % only select files
files{nFiles+1,1} = curFile; % and save it to the cell
nFiles = nFiles + 1;
end
end
end
Now you can iterate through the files cell and read all files according to your script. I see you are interested in the file name. You can extract the file name with
[folderPath,baseName,extension] = fileparts(files{k,1});
To import text files, you can use dlmread, which I think is more intuitive than textscan (but has more limitations, of course). With dlmread you don't have to open the file using fopen; you can supply the file name directly.
value = dlmread(files{k,1},' ',[23,13,23,13]);
The delimiter is now a single space and only the value at row=23 / col=13 is read. Note that the range is zero-based (row/col start at 0, not 1 as is usual in MATLAB), so you may have to change it to [22,12,22,12].
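File names such as 010010 start with a digit, so they cannot be used directly as variable or struct field names. One way to keep each file's matrix under its own name is a containers.Map keyed by the base name. The sketch below assumes a simplified layout (two header lines, three numeric columns) and writes its own small test file; the real files would need the asker's startRow and formatSpec.

```matlab
% Write a small stand-in file: 2 header lines, then year + two values
tmpDir = tempname;
mkdir(tmpDir);
fid = fopen(fullfile(tmpDir, '010010.txt'), 'w');
fprintf(fid, 'header line 1\nheader line 2\n');
fprintf(fid, '%d %d %d\n', [1992 1 2; 1993 3 4].');
fclose(fid);

% Read every .txt file in the folder and store it under its base name
store = containers.Map();
listing = dir(fullfile(tmpDir, '*.txt'));
for k = 1:numel(listing)
    fid = fopen(fullfile(tmpDir, listing(k).name), 'r');
    C = textscan(fid, '%f %f %f', 'HeaderLines', 2, 'CollectOutput', true);
    fclose(fid);
    [~, key] = fileparts(listing(k).name);   % e.g. '010010'
    store(key) = C{1};                       % numeric matrix for this file
end

rmdir(tmpDir, 's');   % remove the temporary folder
```

A matrix is then retrieved with store('010010'), and keys(store) lists all loaded files.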

Name each variable differently in a loop

I have created a .dat file of file names. I want to read into MATLAB each file in that list and give the data a different name. Currently, each iteration just overwrites the last one.
I found that a lot of people give this answer:
for i=1:10
A{i} = 1:i;
end
However, it isn't working for my problem. Here's what I am doing
flist = fopen('fnames.dat'); % Open the list of file names
nt = 0; % Counter will go up one for each file loaded
while ~feof(flist) % While end of file has not been reached
for i = 1:6 % Number of filenames in the .dat file
% For each file
fname = fgetl(flist); % Reads next line of list, which is the name of the next data file
disp(fname); % Stores name as string in fname
nt = nt+1; % Time index
% Save data
data{i} = read_mixed_csv(fname, '\t'); % Reads in the CSV file% Open file
data{i} = data(2:end,:); % Replace header row
end
end
The code runs with no errors, but only one data variable is saved.
My fnames.dat contains this:
IA_2007_MDA8_O3.csv
IN_2007_MDA8_O3.csv
MI_2007_MDA8_O3.csv
MN_2007_MDA8_O3.csv
OH_2007_MDA8_O3.csv
WI_2007_MDA8_O3.csv
If possible, I would really like to name data something more intuitive. Like IA for the first file, IN for the second and so on. Is there any way to do this?
The last line of the loop is the problem:
data{i} = data(2:end,:);
I did not run your code, so I don't know exactly what happens, but data(2:end,:) indexes the cell array data itself, i.e. it refers to the second through last datasets, not the second through last lines of the file just read.
Try:
thisdata = read_mixed_csv(fname, '\t');
data{i} = thisdata(2:end,:);
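The difference between indexing a cell array with parentheses and with braces can be seen in a short sketch; the two matrices are made-up stand-ins for loaded files.

```matlab
% Two hypothetical datasets of different sizes
data = {[1 2; 3 4], [5 6; 7 8; 9 10]};

% Parentheses select cells: the result is still a cell array
subCells = data(2:end);

% Braces extract the content, which can then be indexed by row
rowsOfSecond = data{2}(2:end, :);
```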
If you want to keep track of what data came from which file, save out a second cell array with the names:
thisdata = read_mixed_csv(fname, '\t');
data{i} = thisdata(2:end,:);
names{i} = fname(1:2); % presuming you only need first two letters.
If you need a specific part of the filename that's not always the same length, look into strtok or fileparts. Then you can use things like strcmp to check the cell array names to find where the data labelled IA (or whichever) is stored.
As mentioned by @Daniel, the simple way to store data of various sizes is in a cell array.
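If names like IA and IN are wanted directly, one possibility is a struct with dynamic field names taken from the part of the file name before the first underscore. In this sketch the small cell array thisdata is a placeholder for whatever read_mixed_csv returns, and results is a hypothetical variable name.

```matlab
% File names as listed in fnames.dat (first two shown)
fnames = {'IA_2007_MDA8_O3.csv', 'IN_2007_MDA8_O3.csv'};

results = struct();
for i = 1:numel(fnames)
    % Text before the first underscore, e.g. 'IA'
    key = strtok(fnames{i}, '_');

    % Placeholder for the output of read_mixed_csv: header row plus one data row
    thisdata = {'date', 'value'; i, 2 * i};

    % Drop the header row and store under the state code
    results.(key) = thisdata(2:end, :);
end
```

The data for Iowa is then available as results.IA.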
data{1} = thisdata(2:end,:)
However, if the names are really important, you could consider using a struct instead. For example:
dataStruct(1).numbers= thisdata(2:end,:);
dataStruct(1).name= theRelevantName
Of course you could also just add them to the cell array:
dataCell{1,1} = thisdata(2:end,:);
dataCell{1,2} = theRelevantName