How can I prevent MATLAB from automatically modifying .dat file variable names upon import using the dataset function? - matlab

So, I currently have a MATLAB script that does stuff with data and then, using a template .dat file, creates about 20 more .dat files with only a single column being changed (I've been using the dataset and export functions to read and write the files, respectively). The program that will use the .dat files, ExperimentBuilder, requires that the headers have names that start with dollar signs (for example: $image). However, when I use the dataset function in MATLAB to import the template file, I get this warning:
Warning: Variable names were modified to make them valid MATLAB identifiers.
It then replaces all the dollar signs in the variables to x_ (for example, x_image), which would be fine if it would let me change it back to the $ format. But whenever I try to using set , it just gives me this warning again and reverts it back to x_, which is unreadable by ExperimentBuilder.
I know I could just do a quick copy and paste on each file with the original headings, but I would like to know if there's a way to fix this problem in the actual code.
Thanks!

Thing is the MATLAB database uses the header names to provide access to the columns by name, this is why the header names must be valid identifiers (isvarname() states that it must starts with a letter, and contains only valid alphanumeric characters [a-zA-Z0-9_]).
The easiest solution would be manually write the header line yourself (including names starting with $), while separately exporting the data without the headers:
export(ds, ..., 'WriteObsNames',false)
(Note that dataset.export overwrites files by default, so you'll have to export first, then prepend the header line at the beginning of the file. Or if you're comfortable modifying MATLAB own functions, then go edit dataset.export and change the fopen mode from overwrite 'wt' to append 'at' mode).

Related

Preventing MATLAB's readtable function from ignoring first row of delimited text data file

I have a very similar issue as the following question that was previously asked:
readtable on text file ignores first row which contains the column names
However, in my case, the file is consistently formatted correctly. All values are separated by a single space, including the first row, which consists of the column headers. I've tried switching the spaces to tabs, but this did not fix anything.
I am simply using the following code:
% Get list of file names from current directory and make file name variable
filelist = ls();
filename=filelist(3,1:97);
% create table object using file name
DE_genelst_raw_CntrlMvF = readtable(filename);
And where I should have a table with 6 rows and 5 columns with headers, I get a 6x5 table with the column headers missing. I used the readtable function with a more complex delimited dataset and it correctly included the headers. So I know it should be able to work. just not sure what is wrong. If need be I can provide a copy of the file. Thank you for the help.

How do i open a number of xlsx files when I only know part of the filename in Matlab?

I am using Matlab 2013b. I have a list of excel files in a directory and I want to open chosen ones within a loop to read out the data.
I only know the start of each of the file names however, not the ending. I have a vector of numbers that provide the identifying portion of the file's name (filenumber), and I want to loop through the excel files one by one, opening them and extracting the data, then closing them.
There are 500 files, each of the format: img_****ff*******.xlsx, where the first set of asterisks is my filenumber, while the second set of asterisks is unknown.
So far, I have tried listing what is in the directory using:
list=dir('E:\processed\Img*');
filenames={list.name}
This provides with with the full file names.
I tried then within a loop to create the part of the filename I know exists:
x = sprintf('Img_%d_FF_',img(1,1));
I then thought I could use 'Find' to look for my my partial filename/string in the 'filenames' structure above. I don't think I have the code correct for this datatype though:
index = find(strcmp({list.name}, x)==1)
You're pretty close, but the issue is that strcmp compares the whole string and since you only have the beginning part, it is not going to match. I would use strncmp to compare only the first n characters of the string. We can determine what n is based upon the length of your string x.
matches = strncmp({list.name}, x, numel(x));
thisfile = list(matches).name;

load unix executable file to ascii

I am simply trying to load ascii files with two columns of data (spectral data).
They were saved originally as .asc.
I need to open and edit them using text editor before I can load them into Matlab to erase the headers, but some of them somehow got converted to unix executable foramt with the .asc extension. And others are plain text docs also with the same extension. I have no idea why they got saved with the same extension and with my same manipulation as different kind formats.
When I use the load command in Matlab, the plain text docs load normally as expected but the ones saved as unix executable kinds give me this error:
Error using load Unable to read file filename.asc: No such file or
directory.
How can I either resave them (still with the same extension) or otherwise load them to be read by Matlab as standard two column data matrixes?
Thanks!
If these are truly plain text files, try renaming the file from xxx.asc to xxx.txt. Then, see if you are able to edit them as desired.

ReadTable in Matlab

When I use the readtable function I get the following error:
IVcellData = readtable('RiskModelData','Sheet',2,'Range','A1:A49')
Error using readtable (line 129) Invalid parameter name: Sheet.
Would appreciate if anyone could help me.
Have you renamed Sheet 2 to something else, e.g. Datafile? If so, you need to use this name (inside single quotes) not the sheet number instead of 2 in that call.
Also, you need to make a call to
opts = detectImportOptions(yourfilename)
before the call to readtable. I suspect this one is this cause as it is not recognising Sheet as a variable.
Took me a while to discover that lot, mostly empirical as the documentation is not clear on that point.
Keith
Looks like you need to define extension:
T = readtable(filename) creates a table by reading column oriented data from a file.
readtable determines the file format from the file name's extension:
.txt, .dat, or .csv for delimited text files
.xls, .xlsb, .xlsm, .xlsx, .xltm, .xltx, or .ods for spreadsheet files
try ReadModelData.xls or .xlsx

Convert dataset of .mat format to .csv octave/matlab

there are datasets in .mat format in the this site: http://www.cs.nyu.edu/~roweis/data.html
I want to change the format to .csv.
Can someone tell me how to change the format to create the .csv file.
Thanks!
Suppose that the .mat files from the site are available already. In the command window in Matlab, you may write, for example:
load('C:\Users\YourUserName\Downloads\mnist_all.mat');
to load the .mat file; the result should be a set of matrices test0, test1, ..., train0, train1 ... created in your workspace, which you want saved as CSV files. Because they're different size, you need to save one CSV per variable, e.g. (also in the command window):
csvwrite('C:\Users\YourUserName\Downloads\mnist_test0.csv', test0);
Repeat the command for each variable, and do not forget to change also the name of the output file to avoid overwriting.
Did you tried the csvwrite function in Matlab?
Just load your .mat files with the load function and then write them with csvwrite!
I do not have a Matlab license so I installed GNU Octave 4.2.1 (2017) on Windows 10 (thank you to John W. Eaton and others). I was not fully successful using the csvwrite so I used the following workaround. (BTW, I am totally incompetent in the Octave world. csvwrite worked for simple data structures).
In the Command Window I used the following two commands
load myfile.mat
save("-text","myfile.txt","variablename")
When the "myfile.mat" is loaded, the variable names for the data vectors loaded are displayed in the workspace window. This is the name(s) to use in the save command. Some .mat files will load several data structures.
The "-text" option is the default, so you may not need to include this option in the command.
The output file lists the .mat file contents in text format as single column (of potentially sequential variables). It should be easy to use you text editor to massage this data into the original matrix structure for use in whatever app you are comfortable with.
Had a similar issue. Needed to convert a series of .mat files that had two columns of numerical data into standard data files (ascii text). Note that I don't really ever use csv, but everything here could be adapted by using csvwrite instead of the standard save.
Using Octave 4.2.1 ....
load myfile.mat
LI = [L, I] ## L and I are column vectors representing my data
save myfile.txt LI
Note that L and I appear to be default variable names chosen by Octave for the two columns vectors in my original data file. Ideally a script that iterated over all files with the .mat extension in my directory would be ideal, but this got the job done. It saves the data as two space separated columns of data.
*** Update
The following script works on Octave 4.2.1 for a series of data files with the .mat extension that are in the same directory. It will iterate over them and write the data out to text files with the same name but with the extension .dat . Note that this is not efficient, so if you have a lot of files or if they are large it can take a while to run. I would suggest that you run it from the command line using octave mat2dat.m so you can actually watch it go.
I make no guarantees that this will work for you, but it did for me. I also am NOT proficient in Octave or Matlab, so I'm sure a better solution exists.
# mat2dat.m
dirlist = glob("*.mat")
for i=1:length(dirlist)
filename = dirlist{i,1}
load(filename, "L", "I")
LI = [L,I]
tmpname = filename(1:length(filename)-3)
txtname = strcat(tmpname, 'dat')
save(txtname, "LI")
end