Load multiple .mat files for processing - matlab

In MATLAB I have (after extensive code running) multiple sets of results saved to .mat files. The variable stored in each .mat file is called results, but I've used the save command to write them to different files. A small subset of the files looks like this:
results_test1_1.mat
results_test1_2.mat
results_test1_3.mat
results_test1_4.mat
results_test2_1.mat
results_test2_2.mat
results_test2_3.mat
results_test2_4.mat
Now I want to compare the results for each test, which means I have to load all four .mat files and combine them in a graph. Reading in one file and making the eventual graph is no problem. But since every file stores its data under the same variable name, results, iteratively loading them is not an option (at least, not one that I know of yet): each load overwrites the previous one, so in the end only file 4 remains.
Is there a way to load all these files and store them in different variables in a structure (regarding only one test set)? Because doing all this manually is a hell of a lot of work.
I've tried to use this method: Load Multiple .mat Files to Matlab workspace but I get an Invalid field name error on loaded.(char(file)) = load(file);

You can load into a variable (preferably a cell array)
results = cell( 2, 4 ); % allocate
for testi = 1:2
    for resi = 1:4
        filename = sprintf('results_test%d_%d.mat', testi, resi );
        results{testi,resi} = load( filename );
    end
end
Now you have all the results stored in results cell array and you may access the stored variables, e.g.,
results{1,3}.someVar % access variable someVar (assuming such a variable was saved to the corresponding .mat file)
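The Invalid field name error mentioned in the question most likely comes from using the full file name, including the dot of the .mat extension, as a struct field name; field names have to be valid identifiers. A minimal sketch, assuming the file-name pattern from the question, that strips the extension with fileparts before using the name as a dynamic field. The two demo files written to a temporary folder are made up and only there to make the example self-contained:

```matlab
% Demo setup (hypothetical data): two small files, each holding a variable named "results".
demoDir = tempname();
mkdir(demoDir);
for k = 1:2
    results = k * [1 2 3];
    save(fullfile(demoDir, sprintf('results_test1_%d.mat', k)), 'results');
end

% Load every file of test 1 into one struct, keyed by the file name without extension.
files = dir(fullfile(demoDir, 'results_test1_*.mat'));
loaded = struct();
for k = 1:numel(files)
    [~, name] = fileparts(files(k).name);                    % e.g. 'results_test1_1', no dot
    loaded.(name) = load(fullfile(demoDir, files(k).name));  % struct with a field "results"
end

disp(fieldnames(loaded));
```

Each file's data is then reachable as, for example, loaded.results_test1_2.results. The cell array shown above is usually easier to loop over; the struct form is mainly useful when the file names themselves carry meaning.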

Related

Clearing a specific global variable in octave

I am trying to load the variables from multiple .mat files using the 'who' function and save them in a variable 'A', using a for loop. When I finish loading the first file and start loading the second, 'A' also shows the variables from the first .mat file. The problem is that 'who' keeps the variables across loop iterations, and I want to clear 'who' after each iteration. How can I do this? Is there a way to clear a specific global variable?
for i=1:10; (10 mat files)
clear A;
clear who;
A=who; (all the variables in each mat file saved in A)
max(A(1,1); (finding max of variable A(1,1))
end
With the above code, if each .mat file has 5 variables, then in the second iteration 'who' returns 10 variables; 'who' is not cleared.
It's not entirely clear what you're trying to do because who (with no input arguments) returns a list of all variables in the current workspace not the variables within a file. For it to return a list of variables within a file you'd need to do something like:
vars = who('-file', filenames{i});
That being said, it looks like you actually want to load the variable A from all mat files that you've saved and find the maximum value of A across these files.
The better way to approach this is to specify an output to load which will load the data into a struct where each variable is stored as a separate field in the struct. You can also specify an additional input to load to specify that you'd only like to load variable A (in case there are other variables). You can then load each matfile into a separate struct and do your comparison
for k = 1:numel(filenames)
% Load variable A from this file into a struct
data(k) = load(filenames{k}, 'A');
end
% Now find the maximum value of A
maxA = max([data.A]);
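As a sketch of how the two ideas above can be combined: who('-file', ...) can be used to check that a file actually contains A before load is asked for it. The file names and values below are made up for illustration, and a running maximum is used instead of [data.A] so that a matrix-valued A is handled as well:

```matlab
% Demo setup (hypothetical data): three files, each with a variable A and one extra variable.
demoDir = tempname();
mkdir(demoDir);
filenames = cell(1, 3);
for k = 1:3
    A = 10 * k;
    other = 'ignored';
    filenames{k} = fullfile(demoDir, sprintf('run%d.mat', k));
    save(filenames{k}, 'A', 'other');
end

maxA = -Inf;
for k = 1:numel(filenames)
    vars = who('-file', filenames{k});      % names of the variables stored in this file only
    if any(strcmp(vars, 'A'))
        data = load(filenames{k}, 'A');     % loads into a struct, not into the workspace
        maxA = max(maxA, max(data.A(:)));   % (:) flattens A in case it is a matrix
    end
end
disp(maxA);
```

Because load returns a struct here, nothing accumulates in the workspace between iterations, so there is nothing to clear.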

How to save the matlab workspace variable including original file name?

I would like to know how I can save the MATLAB output files (i.e. MATLAB workspace variables) so that their names include the original file name.
e.g. I open a file (filename.mat) with load filename.mat. Then I run some code to do a calculation and I get some workspace variables (e.g. flow, pressure). I want to save those variables as filename_flow.mat and filename_pressure.mat.
I will use the same code on different file names, so I would like to know how I can save my variables as mentioned above (i.e. including the original file's name)?
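FileToBeLoaded = 'filename.mat';
[pathstr,filename,ext] = fileparts(FileToBeLoaded);
load(FileToBeLoaded);
%// calculate stuff
FlowVariable = []; %// replace with your calculation
%// save expects the variable NAME as a string, not the variable itself
save(fullfile(pathstr,[filename '_flow']),'FlowVariable')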
The same of course works for other names as well. You pull apart the original file name to its actual name and extension, and use the original name, add something (_flow in this case) and save that. The default of MATLAB is already to save to a .mat file, so that's taken care of automatically.
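For several outputs, the same idea can be put in a loop. A sketch under the assumption that the results are called flow and pressure, as in the question, and have been collected in a struct (the struct name out and the values are made up). save with the '-struct' option writes struct fields as separate variables, so each output file contains one variable under its original name:

```matlab
% Demo setup (hypothetical): a stand-in for the name of the loaded input file.
FileToBeLoaded = fullfile(tempdir(), 'filename.mat');
[pathstr, filename] = fileparts(FileToBeLoaded);

% Stand-ins for the calculated workspace variables.
out.flow = [1 2 3];
out.pressure = [4 5 6];

names = fieldnames(out);
for k = 1:numel(names)
    target = fullfile(pathstr, [filename '_' names{k} '.mat']);  % e.g. filename_flow.mat
    save(target, '-struct', 'out', names{k});                     % store only this field
end
```

Loading filename_flow.mat afterwards yields a variable called flow, not a struct called out.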

Convert dataset of .mat format to .csv octave/matlab

There are datasets in .mat format on this site: http://www.cs.nyu.edu/~roweis/data.html
I want to change the format to .csv.
Can someone tell me how to change the format to create the .csv file.
Thanks!
Suppose that the .mat files from the site are available already. In the command window in Matlab, you may write, for example:
load('C:\Users\YourUserName\Downloads\mnist_all.mat');
to load the .mat file; the result should be a set of matrices test0, test1, ..., train0, train1 ... created in your workspace, which you want saved as CSV files. Because they're different size, you need to save one CSV per variable, e.g. (also in the command window):
csvwrite('C:\Users\YourUserName\Downloads\mnist_test0.csv', test0);
Repeat the command for each variable, and do not forget to change also the name of the output file to avoid overwriting.
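Rather than repeating the command by hand, the loaded variables can be looped over. A minimal sketch, assuming every variable of interest in the file is a numeric matrix; the small demo file stands in for mnist_all.mat, and the output naming scheme is only an illustration:

```matlab
% Demo setup (hypothetical data standing in for mnist_all.mat).
demoDir = tempname();
mkdir(demoDir);
test0 = magic(3);
train0 = [1 2; 3 4];
matFile = fullfile(demoDir, 'demo_all.mat');
save(matFile, 'test0', 'train0');

% Load into a struct so the variable names are available as field names.
s = load(matFile);
names = fieldnames(s);
for k = 1:numel(names)
    value = s.(names{k});
    if isnumeric(value)
        % One CSV per variable, e.g. demo_test0.csv
        csvwrite(fullfile(demoDir, ['demo_' names{k} '.csv']), value);
    end
end
```

In recent MATLAB releases writematrix is the recommended replacement for csvwrite; csvwrite is used here because it matches the answer above and also exists in Octave.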
Have you tried the csvwrite function in Matlab?
Just load your .mat files with the load function and then write them with csvwrite!
I do not have a Matlab license so I installed GNU Octave 4.2.1 (2017) on Windows 10 (thank you to John W. Eaton and others). I was not fully successful using the csvwrite so I used the following workaround. (BTW, I am totally incompetent in the Octave world. csvwrite worked for simple data structures).
In the Command Window I used the following two commands
load myfile.mat
save("-text","myfile.txt","variablename")
When the "myfile.mat" is loaded, the variable names for the data vectors loaded are displayed in the workspace window. This is the name(s) to use in the save command. Some .mat files will load several data structures.
The "-text" option is the default, so you may not need to include this option in the command.
The output file lists the .mat file contents in text format as a single column (of potentially sequential variables). It should be easy to use your text editor to massage this data into the original matrix structure for use in whatever app you are comfortable with.
Had a similar issue. Needed to convert a series of .mat files that had two columns of numerical data into standard data files (ascii text). Note that I don't really ever use csv, but everything here could be adapted by using csvwrite instead of the standard save.
Using Octave 4.2.1 ....
load myfile.mat
LI = [L, I] ## L and I are column vectors representing my data
save myfile.txt LI
Note that L and I appear to be default variable names chosen by Octave for the two column vectors in my original data file. A script that iterated over all files with the .mat extension in my directory would be ideal, but this got the job done. It saves the data as two space-separated columns.
*** Update
The following script works on Octave 4.2.1 for a series of data files with the .mat extension that are in the same directory. It will iterate over them and write the data out to text files with the same name but with the extension .dat . Note that this is not efficient, so if you have a lot of files or if they are large it can take a while to run. I would suggest that you run it from the command line using octave mat2dat.m so you can actually watch it go.
I make no guarantees that this will work for you, but it did for me. I also am NOT proficient in Octave or Matlab, so I'm sure a better solution exists.
# mat2dat.m
dirlist = glob("*.mat")
for i=1:length(dirlist)
filename = dirlist{i,1}
load(filename, "L", "I")
LI = [L,I]
tmpname = filename(1:length(filename)-3)
txtname = strcat(tmpname, 'dat')
save(txtname, "LI")
end

How can I load large files (~150MB) in MATLAB?

I have a large MATLAB file (150MB) in matrix form (i.e. 4070x4070). I need to work on this file in MATLAB but I can't seem to load this file. I am getting an "out of memory" error. Is there any other way I can load this size of file? I am using a 32bit processor and have 2GB of RAM. Please help me, I am getting exhausted from dealing with this problem.
Starting with release R2011b (version 7.13) there is a new object, matlab.io.MatFile, with matfile as its constructor. It lets you load and save parts of variables in MAT-files. See the documentation for more details. Here is a simple example that reads part of a matrix:
matObj = matfile(filename);
a = matObj.a(100:500, 200:600);
If your original file is not a MAT-file but some text file, you can read it in parts and use matfile to save those parts to the same variable in a MAT-file for later access. Just remember to set the Writable property to true in the constructor.
Assuming your text file is tab-delimited and contains only numbers, here is a sample script to read the data by blocks and save them to MAT file:
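blocksize = 100;
startrow = 0;            %# dlmread row/column offsets are zero-based
ncols = 4070;            %# number of columns, assumed known in advance
txtfile = 'test.txt';    %# source text file (assumed name; must differ from the MAT-file)
matObj = matfile('test.mat','Writable',true);
while true
    try
        %# read one block of rows only; depends on your file format
        a = dlmread(txtfile,'\t',[startrow 0 startrow+blocksize-1 ncols-1]);
    catch
        break
    end
    if isempty(a), break, end
    matObj.a(startrow+(1:size(a,1)),:) = a;   %# write before advancing startrow
    startrow = startrow + blocksize;
end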
I don't have the latest release now to test, but hope it should work.
If it is an image file, and you want to work with it, try the matlab block processing. By using it, you will load small parts of the file. Your function fun will be applied to each block individually.
B = blockproc(src_filename,[M N],fun)
In case it is an XML file, try the XML DOM Node mode together with SAX (thanks to @Nzbuu for pointing that out), but that seems to be undocumented functionality.
Also, if it is a textual file of any kind (unlikely, given the amount of data), try an external tool to split it.
You can also use MATLAB's Memory-Mapping of Data Files to read in a block of the file, process it, and proceed to the next block without having to load the entire file into memory at once.
For instance, see this example, which "maps a file of 100 double-precision floating-point numbers to memory."
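A minimal sketch of that approach, assuming a raw binary file of doubles (the file written here is made up for the demonstration). memmapfile is a MATLAB function and, as far as I know, is not available in Octave:

```matlab
% Demo setup (hypothetical data): write 100 doubles to a binary file.
binFile = [tempname() '.dat'];
fid = fopen(binFile, 'w');
fwrite(fid, (1:100)', 'double');
fclose(fid);

% Map the file; the contents are only read when Data is indexed.
m = memmapfile(binFile, 'Format', 'double');

blocksize = 25;
total = 0;
for first = 1:blocksize:numel(m.Data)
    last = min(first + blocksize - 1, numel(m.Data));
    block = m.Data(first:last);     % only this block is copied into a regular array
    total = total + sum(block);
end
disp(total);
clear m   % release the mapping
```

Note that this works on raw binary data, not on a .mat file; for MAT-files the matfile object described above is the closer fit.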

vectorization - importing excel files into matlab

I've written the following function for importing Excel files into MATLAB. The function works fine: given the path name of the folder containing the files, it imports them into the workspace. The function is shown below:
function Data = xls_function(pathName);
%Script imports the relevant .xls files into matlab - ensure that the .xls
%files are stored in a folder specified by 'pathName'.
%--------------------------------------------------------------------------
TopFolder = pathName;
dirListing = dir(TopFolder);%Lists the folders in the directory specified
%by pathName.
dirListing = dirListing(3:end);%Remove the first two structures as they
%are only pointers.
for i = 1:length(dirListing);
SubFolder{i} = dirListing(i,1).name;%obtain the name of each folder in
%the specified path.
SubFolderPath{i} = fullfile(pathName, dirListing(i,1).name);%obtain
%the path name for each of the folders.
ExcelFile{i} = dir(fullfile(SubFolderPath{i},'*.xls'));%find the
%number of .xls files in each of the SubFolders.
for j = 1:length(ExcelFile{1,i});
ExcelFileName{1,i}{j,1} = ExcelFile{1,i}(j,1).name;%find the name
%of each .xls file in each of the SubFolders.
for k = 1:length(ExcelFileName);
for m = 1:length(ExcelFileName{1,k});
[status{1,k}{m,1},sheets{1,k}{m,1},format{1,k}{m,1}]...
= xlsfinfo((fullfile(pathName,SubFolder{1,k},...
ExcelFileName{1,k}{m,1})));%gather information on the
%.xls files i.e. worksheet names.
Name_worksheet{1,k}{m,1} = sheets{1,k}{m,1}{1,end};%obtain
%the name of each of the .xls worksheets within
%each spreadsheet.
end
end
end
end
for n = 1:length(ExcelFileName);
for o = 1:length(ExcelFileName{1,n});
%require two loops as the number of excel spreadsheets varies
%from the number of worksheets in each spreadsheet.
TXT{1,n}{o,1} = xlsread(fullfile(pathName,SubFolder{1,n},...
ExcelFileName{1,n}{o,1}),Name_worksheet{1,n}{o,1});%import the
%relevant data from excel by using the spreadsheet and
%worksheet names previously obtained.
Data.(SubFolder{n}){o,1} = TXT{1,n}{o,1};
end
end
The only problem with the script is that it takes too long to run if the number of .xls files is large. I've read that vectorization would improve the running time, therefore I am asking for any advice on how I could alter this code to run faster, through vectorization.
I realise that reading a code like this isn't easy (especially as my form of coding is by no means as efficient as I would like) but any advice provided would be much appreciated.
I don't think vectorization applies to your problem, but let's take things one at a time.
As an example, you could use cellfun to replace the loop over the files of one subfolder:
tmp = ExcelFileName{1,n};
result_cell = cellfun(@(x) xlsread(fullfile(pathName,SubFolder{n},x)), tmp, 'UniformOutput', false);
But the key problem is the poor implementation of xlsread and the other excel related functions in matlab. What they do is with every(!) function call they create a new excel process (which is hidden) in which they perform your command and then end it.
I remember a tool at matlab central that reused the same excel instance and thus was very quick - but unfortunately I can no longer find it. But maybe you can find an example there on which you can base your own reader which reuses it.
On a related note, Excel has the stupid limitation that it doesn't allow two files with the same name to be open at the same time, and fails with an error if you try. So if you run your reading vectorized/parallel you are in for a whole new kind of fun with strange errors :D
For myself, I found that the only proper way to deal with these documents is through Java with the Apache POI libraries. These have the nice advantage that you don't need Excel installed, but unfortunately they require some programming.