Load matrix from .dat file Matlab (except remove first 8 rows) - matlab

To simplify my problem, I am trying to load a file named data.dat.
However, I do not know how to import only a specific range of rows. For example:
Blue Red Green
1 2 4
1 3 4
0.1 2.2 3
.
.
.
How do I go about importing only the rows from 0.1 onward? I do not want the first 2 rows or the headers.
I know this is a fairly simple problem but I keep running into the following error:
Error using textscan. Invalid file identifier. Use fopen to generate a valid file identifier.
fid = fopen('data.dat', 'r');
mat = textscan(fid, '%f', 'HeaderLines', 1);
fclose(fid);
I thought that this works by removing the first row but I am clearly mistaken.

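You can do this with a single dlmread call (I'm assuming that your file is located in the current working directory; otherwise use the full path):
mat = dlmread('data.dat','',3,0); % the row offset is zero-based: 3 skips the header and the first two data rows of the example
Note: You don't need to open or close the file yourself. The "Invalid file identifier" error means that fopen returned -1, usually because the file was not found in the current folder or on the MATLAB path.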
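If you prefer to stay with textscan, the following is a minimal sketch of the same read with an explicit check on the file identifier. It assumes a whitespace-delimited file with three numeric columns; the sample file name and its contents are made up so that the snippet runs on its own.

```matlab
% Write a small sample file so the sketch is self-contained
% (hypothetical data laid out like the example in the question).
fname = fullfile(tempdir, 'data_example.dat');
fid = fopen(fname, 'w');
fprintf(fid, 'Blue Red Green\n1 2 4\n1 3 4\n0.1 2.2 3\n5 6 7\n');
fclose(fid);

% fopen returns -1 when the file cannot be opened; check before reading.
[fid, msg] = fopen(fname, 'r');
if fid == -1
    error('Could not open %s: %s', fname, msg);
end

% Skip the header line plus the first two data rows (3 lines in total).
C = textscan(fid, '%f %f %f', 'HeaderLines', 3, 'CollectOutput', true);
fclose(fid);

mat = C{1};   % numeric matrix, one row per remaining line
```

With 'CollectOutput' set, the three numeric columns come back as one matrix in C{1} instead of three separate cells.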

Related

Efficiency loading a file into Matlab

I am trying to load a file in Matlab. But I am a bit confused about the best way to do it.
The file has 3 columns and looks like the screenshot below:
This file I can load very quickly by doing load('c').
However, I had to add 2 NaNs on the bottom row.
The original file actually looks like the file below:
Now if I do load('c') on the file below I get the error:
Error using load
Unable to read file 'c'. Input must be a MAT-file or an ASCII file containing numeric
data with same number of columns in each row.
Of course I can use importdata to import this file, but it is just so slow.
Any suggestions?
You should be able to use c = readtable('c'). This should automatically change the empty entries to "NaN" by default, but if not, there is a way to set that in the options.
If a file is tricky to import (readtable() has made this much easier in the last few years), I often use the Import Data tool. If the file is really big, you can make a small mock-up of it so it loads faster. Set all the import options the way you want them, then, where the green check mark says "Import Selection", use the black drop-down arrow to select "Generate Function". This gives you code that reproduces those settings, so the file is imported just the way you want it.
load() is better suited for reading in previously saved '.mat' files that were created in Matlab.
Here's a low-level approach, which might be faster than other methods:
filename = 'c'; % name of the file
N = 3; % number of columns
fid = fopen(filename, 'r'); % open file for reading
x = fscanf(fid, '%f'); % read all values as a column vector
fclose(fid); % close file
x = [x; NaN(N-mod(numel(x)-1,N)-1, 1)]; % include NaN's to make length a multiple of N
x = reshape(x, N, []).'; % reshape to N columns in row-major order
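To see what the padding expression does, here is a small worked check on made-up values (7 numbers, as if the last row of a 3-column file held a single entry). It assumes, as the approach above does, that the missing values are all at the very end of the file.

```matlab
N = 3;                                 % number of columns
x = (1:7).';                           % 7 values: the last row has only one entry
pad = N - mod(numel(x) - 1, N) - 1;    % same expression as in the answer
assert(pad == mod(-numel(x), N));      % equivalent, shorter way to write it
x = [x; NaN(pad, 1)];                  % append 2 NaNs to reach a multiple of N
M = reshape(x, N, []).';               % rows: [1 2 3], [4 5 6], [7 NaN NaN]
```

The transpose is needed because reshape fills column by column, while the file is read row by row.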

Import text file in MATLAB

I have a tab delimited text file with suffix .RAW.
How can I load the data from the file into a matrix in MATLAB?
I have found readtable, but it doesn't support files ending with suffix .RAW.
Do I really have to use fread, fscanf, etc. to simply load a text file into a matrix?
You can use the dlmread() function. It will read data from an ASCII text file into a matrix and let you define the delimiter yourself. The delimiter for tabs is '\t'.
>> M = dlmread('Data.raw', '\t')
M =
1 2 3
4 5 6
7 8 9
Just for your information there is also the tdfread() function but I do not recommend using it except in very specific cases. dlmread() is a much better option.
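As an aside on the question's remark about readtable: it can usually be pointed at a file with an unrecognized extension by stating the file type explicitly. The following is a sketch under that assumption, using a made-up sample file with no header line.

```matlab
% Create a small tab-delimited sample file (hypothetical name and data).
fname = fullfile(tempdir, 'Data_example.RAW');
fid = fopen(fname, 'w');
fprintf(fid, '1\t2\t3\n4\t5\t6\n7\t8\t9\n');
fclose(fid);

% Tell readtable to treat the file as delimited text despite the extension.
T = readtable(fname, 'FileType', 'text', 'Delimiter', '\t', ...
    'ReadVariableNames', false);
M = table2array(T);   % convert the all-numeric table to a plain matrix
```

If the file has a header row, drop 'ReadVariableNames', false so the column names are taken from it.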
.RAW is a generic file extension. You should know the format of your RAW file (especially if your file contains a combination of numbers, data structures, etc.). If it is a simple text file with a single 2D table, you can easily read it with fscanf, fread, fgetl, fgets, etc.
Here is a simple example for a 2D table (matrix):
Let's assume that each row of your table is on its own line. We can read each line with fgetl() and then extract the numbers using str2num().
fid = fopen('YourTextFile.RAW');
Data = [];
i = 0;
while true
    i = i + 1;
    tline = fgetl(fid); % returns -1 (not a char array) at end of file
    if ~ischar(tline), break, end
    Data(i,:) = str2num(tline); % parse one row of numbers
end
fclose(fid);
disp(Data)
For more complex data structures, the code has to be adapted.
For a plain 2D table (a special case), the loop above can simply be replaced by a single dlmread() call.

Reading multiple .dat files into MATLAB

I am having great difficulties reading a bunch of .dat files into MATLAB. I have tried to Google the problem, but after one hour I still can't get my code to work. In total I have 141 .dat files. Each file consists of three lines of header information (which I do not want to include), followed by a number of rows, each with three columns of numbers. I want to merge the rows from all the .dat files into one large matrix with three columns. This is the code I have attempted to use:
d = dir('C:\Users\Kristian\Documents\MATLAB\polygoner1\');
out = [];
N_files = numel(d);
for i = 3:N_files
fid = fopen(d(i).name,'r');
data = textscan(fid,'%f%f%f','HeaderLines',3);
out = [out; data];
end
However, when I try to run the code I get the error message
??? Error using ==> textscan
Invalid file identifier. Use fopen to generate a valid file identifier.
Error in ==> readpoly at 6
data = textscan(fid,'%f%f%f','HeaderLines',3);
If anyone knows how I can get this to work, then I would be extremely grateful!
The problem is that you call fopen with only the file name, not the full path, so MATLAB looks for the file in the current folder:
folder = 'C:\Users\Kristian\Documents\MATLAB\polygoner1\'; % not named "path", which would shadow a built-in function
d = dir(folder);
....
% as @craigim advised, build the full names with fullfile (strcat would also work)
my_file = fullfile(folder, {d.name});
for i = 3:N_files
    fid = fopen(my_file{i},'r');
    ....
    fclose(fid);
end
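Putting the pieces together, a complete loop might look like the sketch below. The folder name and the two sample files are made up so that the snippet runs on its own. Listing only '*.dat' files avoids relying on the '.' and '..' entries being first, and 'CollectOutput' makes textscan return one numeric matrix per file so the results can be stacked.

```matlab
% Build a throwaway folder with two sample files (hypothetical names/data).
folder = fullfile(tempdir, 'polygoner_example');
if ~exist(folder, 'dir'), mkdir(folder); end
for k = 1:2
    fid = fopen(fullfile(folder, sprintf('poly%d.dat', k)), 'w');
    fprintf(fid, 'header 1\nheader 2\nheader 3\n');
    fprintf(fid, '%d %d %d\n', k * [1 2 3; 4 5 6].');
    fclose(fid);
end

d = dir(fullfile(folder, '*.dat'));   % only the .dat files
out = [];
for i = 1:numel(d)
    fname = fullfile(folder, d(i).name);   % full path, not just the name
    fid = fopen(fname, 'r');
    if fid == -1
        error('Could not open %s', fname);
    end
    data = textscan(fid, '%f%f%f', 'HeaderLines', 3, 'CollectOutput', true);
    fclose(fid);
    out = [out; data{1}];   % data is a cell array; data{1} is the matrix
end
```

For 141 files, growing out inside the loop is usually acceptable; collecting the matrices in a cell array and calling vertcat once at the end is an alternative.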

Confused with .tsv files in MATLAB (converting to a Matrix?)

I have a .tsv file that I wish to open in MATLAB, however I am having several problems with this.
I have tried the following
fid = fopen('data.tsv');
C = textscan(fid, ['%s' repmat('%f',1,8)], 'HeaderLines', 1);
fclose(fid);
and got some weird values that had nothing to do with my file. I also tried:
data = dlmread('data.tsv', '\t');
and got this
Error using dlmread (line 139)
Mismatch between file and format string.
Trouble reading number from file (row 1u, field 1u) ==> Participant Assessment
Experiment Block Trial
Answer Reaction Timestamp Free Response\n
Is there some way I can get it to ignore the header, or am I doing it totally wrong?
With dlmread you can specify where to start reading in the file. This is one of the few times that MATLAB indexing begins at 0 - [0,0] is the first row, first column. Therefore, to ignore the first row (containing your header):
data = dlmread('data.tsv','\t', 1, 0);
This will only work if all the values (other than the header lines you skip) are numeric.
Your example with textscan also looks fine to me (provided that the format supplied is correct and there is indeed only one header line). C will be a cell array; to obtain the data from each column use C{n} where n is the column number.
Rather than skipping the header line, it's sometimes useful to just read it in to a separate value:
fid = fopen('data.tsv');
C_header = textscan(fid, '%s',9);
C = textscan(fid, ['%s' repmat('%f',1,8)]);
fclose(fid);
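If some header names contain spaces (as "Free Response" in the error message suggests), reading the header word by word can split them. A sketch of an alternative, assuming tab-delimited data: read the header line with fgetl, split it on tabs, and pass the delimiter to textscan explicitly. The sample file, its column names, and its values are made up for illustration.

```matlab
% Create a small tab-delimited sample file (hypothetical columns and data).
fname = fullfile(tempdir, 'data_example.tsv');
fid = fopen(fname, 'w');
fprintf(fid, 'Participant\tReaction Time\tFree Response\n');
fprintf(fid, 'p01\t0.52\t3\n');
fprintf(fid, 'p02\t0.61\t4\n');
fclose(fid);

fid = fopen(fname, 'r');
headerLine = fgetl(fid);                 % first line, without the newline
names = strsplit(headerLine, '\t');      % split on tabs only, keeping spaces
C = textscan(fid, '%s %f %f', 'Delimiter', '\t');   % remaining lines
fclose(fid);

participants = C{1};   % cell array of strings
reaction = C{2};       % numeric column vector
```

The format string here has one %s and two %f columns only because the sample file does; adjust it to the real number and type of columns.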

How to load data for Classification in Matlab

I have a text file containing thousands of attributes (each column is an attribute) and a column that holds the label of each row. All data is numeric except the last column, which contains the labels as strings. I want to use MATLAB classification functions such as gscatter() to classify the data. The problem is that when I use load filename in MATLAB to load my data, I get this error (in which "no" is one of the labels):
Unknown text on line number 1 of ASCII file C:\Program Files\MATLAB\R2011b\train\train.txt
"no".
In fact I do not know how to load my data in matlab to be able to use matlab functions to classify the data.
Load is only for .mat files and text files with only numeric data, which is why you get an error.
There are a number of functions which do read text files though.
Depending on the format of your data files, you could use one of the following:
textscan is pretty general but requires you to supply the format and to open and close the file (the older textread takes a file name directly).
csvread reads only numeric, comma-separated values, but you don't have to provide a format.
importdata is very general and convenient.
fscanf is similar to textscan but works at a lower level.
Given the number of attributes you're dealing with, I'd definitely go with importdata myself.
Here is an example
train.txt
1,2,3,4,5,6,no
2,3,4,5,6,7,yes
myLoadScript.m
numAttribs = 6; %# number of attributes (excluding the label)
frmt = [repmat('%f ',1,numAttribs) '%s'];
fid = fopen('train.txt', 'rt');
C = textscan(fid, frmt, 'Delimiter',',', 'CollectOutput',1);
fclose(fid);
The result:
>> C{1}
ans =
1 2 3 4 5 6
2 3 4 5 6 7
>> C{2}
ans =
'no'
'yes'
Should be easy to adapt to work on your specific file format...
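As a possible next step, the string labels can be turned into a categorical array, which is convenient as a grouping variable. The sketch below repeats the read on a made-up copy of train.txt so that it runs on its own; the variable names X and labels are only illustrative.

```matlab
% Recreate the two-line example file (hypothetical location).
fname = fullfile(tempdir, 'train_example.txt');
fid = fopen(fname, 'w');
fprintf(fid, '1,2,3,4,5,6,no\n2,3,4,5,6,7,yes\n');
fclose(fid);

numAttribs = 6;                               % attributes, excluding the label
frmt = [repmat('%f', 1, numAttribs) '%s'];
fid = fopen(fname, 'rt');
C = textscan(fid, frmt, 'Delimiter', ',', 'CollectOutput', true);
fclose(fid);

X = C{1};                      % numeric attribute matrix, one row per sample
labels = categorical(C{2});    % labels as a categorical array
```

Assuming the Statistics and Machine Learning Toolbox is installed, a call such as gscatter(X(:,1), X(:,2), labels) would then plot two attributes grouped by label.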