Detect number of columns in a columnar text file - matlab

I am trying to interpret data from an eye tracking device. The files exported from the eye tracker are in ASCII format.
Recording files that contain data from a single eye only look like this (only four rows shown):
6372825 645.3 275.4 1362.0 ...
6372826 644.6 274.0 1364.0 ...
6372827 644.2 273.2 1365.0 ...
6372828 642.5 272.7 1367.0 ...
Note that the dots at the end of each row above are a part of the output file, i.e. I haven't added them for the purposes of this question. I normally detect these dots and later throw them away.
The format of the above columns is [timestamp, X, Y, pupilSize, {...}]
A recording from both eyes looks like this (only four rows shown):
505076 416.8 755.4 1148.0 23.6 751.1 1239.0 .....
505077 417.0 758.4 1143.0 23.7 753.1 1244.0 .....
505078 416.7 761.4 1146.0 24.6 752.1 1249.0 .....
505079 416.1 764.8 1150.0 27.3 750.2 1250.0 .....
In this case, the data format is [timestamp, X(left), Y(left), pupilSize(left), X(right), Y(right), pupilSize(right), {.....}]
In both cases, I'd like to extract the numbers from the text and assign them to an array. Here's how I do this for recordings from a single eye:
eyeData = textscan(fid,'%d %f %f %f %s');
I can do the same for binocular recordings, using the following code:
eyeData = textscan(fid,'%d %f %f %f %f %f %f %s');
The trouble is, I'd like to be able to automatically detect whether the data I'm dealing with are monocular or binocular. In other words, I need a way of determining whether the ASCII file has five columns or eight. Note that the last column in both cases just consists of a series of dots. Whilst I typically just throw this away, it may be useful in determining the number of eyes in the recording (since monocular recordings end each row with ... and binocular with .....)
Any ideas as to how I might work out how many columns are in each ASCII file are welcome!

You can read the first data line, count its columns, and then restore the file position indicator. For example:
pos = ftell(fid);
cols = numel(regexp(fgetl(fid), '\s*([^\s]*)\s*'));
fseek(fid, pos, 'bof');
This can be followed by:
if (cols == 5)
eyeData = textscan(fid, '%d %f %f %f %s');
else
eyeData = textscan(fid, '%d %f %f %f %f %f %f %s');
end
By the way, note that you can tell textscan to discard the dots by using %*s instead of the last %s in the pattern string.
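Putting the pieces above together, here is a minimal sketch that builds the format string from the detected column count instead of hard-coding the two cases. It assumes the first column is an integer timestamp, the last column is the run of dots, and everything in between is floating point. The temporary file and the names fname, nCols and fmt are only there for illustration.

```matlab
% Write two sample monocular rows (taken from the question) to a temp file.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', '6372825 645.3 275.4 1362.0 ...', ...
                     '6372826 644.6 274.0 1364.0 ...');
fclose(fid);

fid = fopen(fname, 'r');
assert(fid ~= -1, 'Could not open file');

% Peek at the first line, then rewind to where we started.
pos = ftell(fid);
firstLine = fgetl(fid);
fseek(fid, pos, 'bof');

% Count whitespace-separated fields (5 = monocular, 8 = binocular).
nCols = numel(regexp(firstLine, '\S+', 'match'));

% Integer timestamp, nCols-2 float columns, and a skipped trailing dots column.
fmt = ['%d ' repmat('%f ', 1, nCols - 2) '%*s'];
eyeData = textscan(fid, fmt);
fclose(fid);
delete(fname);
```

Because the dots column is skipped with %*s, eyeData has one cell per numeric column only.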

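You can count the columns in a file with a shell command, which you can call from MATLAB using
[status, out] = system(shell_command);
The first output is only the exit status; the text printed by the command is returned in the second output. To produce a 'shell_command' that fits your needs, check out the following link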
unix - count of columns in file
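As a rough sketch of that approach, assuming a Unix-like system where awk is on the path (this will not work as-is on Windows), the number of whitespace-separated fields on the first line can be obtained like this. The temporary file and the variable names are just for illustration.

```matlab
% Write one sample binocular row (taken from the question) to a temp file.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', '505076 416.8 755.4 1148.0 23.6 751.1 1239.0 .....');
fclose(fid);

% awk prints NF (number of fields) for the first line, then exits.
cmd = sprintf('awk ''NR==1 {print NF; exit}'' "%s"', fname);
[status, out] = system(cmd);
assert(status == 0, 'awk call failed');

% The command output is text, so convert it to a number.
nCols = str2double(strtrim(out));
delete(fname);
```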

Related

I want to read this specific csv file using MATLAB. I used textscan, but it failed.

csv file:
Date,Open,High,Low,Close,Volume,Adj Close
20170217,64.470001,64.690002,64.300003,64.620003,21234600,64.620003
20170216,64.739998,65.239998,64.440002,64.519997,20524700,64.519997
I used this:
fileID = fopen('table.csv');
C = textscan(fileID,'%s %f %f %f %f %d %f','Delimiter',',');
fclose(fileID);
celldisp(C)
but it does not read anything.
You can use the csvread function to read a csv file.
m=csvread('table.csv',1,0)
The values are stored in a matrix.
Since your file has a header line, you have to tell csvread to start reading from the second row of the file.
You can do this by adding two parameters to the call:
the first defines the row from which to start reading (notice that the index is zero-based, so 1 means the second row)
the second defines the column from which to start (in this example 0, i.e. the first column)
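To make the two offsets concrete, here is a small self-contained sketch that writes the sample rows from the question to a temporary file and reads them back. The temporary file name is only for illustration; in your case you would pass 'table.csv' directly.

```matlab
% Recreate the sample CSV (header plus two data rows) in a temp file.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', 'Date,Open,High,Low,Close,Volume,Adj Close', ...
    '20170217,64.470001,64.690002,64.300003,64.620003,21234600,64.620003', ...
    '20170216,64.739998,65.239998,64.440002,64.519997,20524700,64.519997');
fclose(fid);

% Row offset 1 skips the header line; column offset 0 starts at the first column.
m = csvread(fname, 1, 0);
delete(fname);

% m is a numeric matrix with one row per data line and one column per field.
disp(size(m));
```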
If, nevertheless, you want to use textscan, you have to modify your code as follows:
fileID = fopen('table.csv');
% C = textscan(fileID,'%s %f %f %f %f %d %f','Delimiter',',');
C1 = textscan(fileID,'%s',2);
C2 = textscan(fileID,'%d%f%f%f%f%d%f','delimiter',',')
fclose(fileID);
You have to call textscan twice:
the first time to read the first row (the header)
the second time to read the data
Notice the third parameter in the first call: it specifies that the format (%s) has to be applied twice.
This is because the last entry in your header row (Adj Close) contains a space, so the whitespace-delimited %s sees the header as two strings.
Once you've read the header row, you call textscan again to read the numeric values.
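If you don't need the header text at all, a possible alternative to the two calls is to let textscan skip the first line with its 'HeaderLines' option. This is only a sketch: it reads every column, including the date, as a double, and it uses a temporary copy of the sample data so that it runs on its own.

```matlab
% Recreate the sample CSV (header plus two data rows) in a temp file.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', 'Date,Open,High,Low,Close,Volume,Adj Close', ...
    '20170217,64.470001,64.690002,64.300003,64.620003,21234600,64.620003', ...
    '20170216,64.739998,65.239998,64.440002,64.519997,20524700,64.519997');
fclose(fid);

fileID = fopen(fname, 'r');
assert(fileID ~= -1, 'Could not open file');

% Skip one header line, then read seven comma-separated numeric columns.
C = textscan(fileID, '%f %f %f %f %f %f %f', ...
             'Delimiter', ',', 'HeaderLines', 1);
fclose(fileID);
delete(fname);
```

C is then a 1-by-7 cell array with one column vector per field.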
A CSV file can also be read with xlsread('file');
If it returns NaN values, call
[num, txt, raw] = xlsread('file');
and loop over the text output (txt).

Read line delimited by comma and tab

I would like to read files containing several numbers on each line. Here is an example of the format:
0,0,0 1 0 0 0
0.02,0.1,0.98 8.77 0.985292 0.112348 0.112348
0.04,0.2,1.96 8.77 0.985292 0.112348 0.224696
As shown above, the first three numbers are separated by commas, and the remaining numbers on the line are separated by tabs. As a result, it does not seem possible to use dlmread or textscan. Is there any way to solve it? Thanks!
Yes. You should pass two parameters to textscan:
Delimiter %choose the delimiter
and
MultipleDelimsAsOne %Treat Repeated Delimiters as One
Option 1:
Small "trick": you can select more than one delimiter by passing a cell array as input: {',',' '}.
Result = textscan(fileID,'%f %f %f %f %f %f %f','Delimiter',{',',' '},'MultipleDelimsAsOne',1);
Option 2: (that should work)
This time I don't use MultipleDelimsAsOne, but I specify that the delimiter can be a comma or a tab (with \t).
Result = textscan(fileID,'%f %f %f %f %f %f %f','Delimiter',{',','\t'});
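For completeness, here is a self-contained sketch of Option 2. It assumes the separators after the third number really are tab characters, and it writes the sample rows from the question to a temporary file first. The names fname and M are only for illustration.

```matlab
% Recreate the sample data: commas between the first three numbers, tabs after.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '0,0,0\t1\t0\t0\t0\n');
fprintf(fid, '0.02,0.1,0.98\t8.77\t0.985292\t0.112348\t0.112348\n');
fprintf(fid, '0.04,0.2,1.96\t8.77\t0.985292\t0.112348\t0.224696\n');
fclose(fid);

fileID = fopen(fname, 'r');
assert(fileID ~= -1, 'Could not open file');

% Treat both the comma and the tab as field delimiters.
Result = textscan(fileID, '%f %f %f %f %f %f %f', 'Delimiter', {',', '\t'});
fclose(fileID);
delete(fname);

% All columns are double, so they can be concatenated into one matrix.
M = [Result{:}];
```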

Dimensions of matrices being concatenated are not consistent

I read a CSV file with textscan, and when I try to write to a file I receive this error: Error using horzcat. Dimensions of matrices being concatenated are not consistent.
If I change the first format specifier in textscan (I mean %s) to %f, the error vanishes.
The error occurs when MATLAB tries to build [datatest{1} probability]:
probability is a 1000*1 double
datatest{1} is a 1000*1 cell
datatest=textscan(FileID,'%s %*f %f %f %*s %*s %*s %*s %*s %*s %*s %*s %*s %f %f %f %f %f %f %f %f %f %f',1000,'headerlines',1,'delimiter',',');
csvwrite('output.csv',[datatest{1} probability]);
Your variable datatest{1} contains 1000 cells, each of which contains a string (the strings may or may not all have the same length).
In your statement [datatest{1} probability] you are trying to concatenate cells (containing strings) with a double array, and this does not work. The concatenation operator needs to operate on data of similar type.
Even if you created a cell array containing all your desired columns, myCellArray={datatest{1} probability}, this would not help you, because a cell array cannot be passed to the function csvwrite.
csvwrite, or its better sister dlmwrite, does not accept cell arrays. You would have to convert the cell values into numeric values. Unfortunately, you want to write both strings and numeric values, so your only option is to use low-level functions like fprintf.
In your case, to write the file you were expecting, you can use the following code.
col1 = datatest{1} ; %// extract the column of interest for easier indexing later on
fidw = fopen('output.csv','w') ; %// get a handle on a file to write (necessary with "fprintf")
for iline = 1:numel(probability) %// loop on each line
fprintf( fidw , '%s, %f\n' , col1{iline} , probability(iline) ) ; %// write the line
end
fclose(fidw) ; %// close the file - IMPORTANT - (necessary with "fprintf")
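If you prefer to avoid the explicit loop, a common idiom is to interleave the strings and the numbers in a 2-by-N cell array and let fprintf recycle the format over it. The sketch below uses two made-up placeholder rows in place of your real datatest{1} and probability, and a temporary output file.

```matlab
% Placeholder data standing in for datatest{1} and probability.
col1 = {'alpha'; 'beta'};
probability = [0.25; 0.75];

% Row 1 holds the strings, row 2 the numbers, one column per output line.
rows = [col1.'; num2cell(probability.')];

fname = [tempname '.csv'];
fidw = fopen(fname, 'w');
assert(fidw ~= -1, 'Could not open output file');

% rows{:} expands column by column: string, number, string, number, ...
fprintf(fidw, '%s, %f\n', rows{:});
fclose(fidw);

% Read the file back to check what was written, then clean up.
written = fileread(fname);
delete(fname);
```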

Read part of a text-based file

I have a text-based file with a .ptx suffix. It contains point cloud information; please see the following example:
100
50
0.352 -5.207 -0.823 0.238 61 61 61
0.345 -5.202 -0.824 0.234 60 60 60
...
Question:
How can I load the file starting from the third row (ignoring the first two rows) and save it as a matrix?
I would recommend using textscan.
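Something like this, where fid is the file identifier returned by fopen('sample.ptx') (textscan expects a file identifier, not a file name):
in = textscan(fid,'%f %f %f %f %f %f %f','HeaderLines',2)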
You can specify a number of header lines to skip using 'HeaderLines'. The %f refers to the format of the input data. Hope that helps.
Here is a full example of how to apply textscan and transform the result in to a matrix:
fid = fopen('textscantest.txt','r');
assert(fid ~= -1); % fopen returns -1 on failure, so verify the file is opened
C = textscan(fid,'%f %f %f %f %f %f %f','HeaderLines',2);
fclose(fid);
M = [C{:}]
Note that since you want everything in the same matrix, all columns must have the same data type, which is why %f is used for every column.
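As a variation on the example above, textscan can also do the concatenation itself through its 'CollectOutput' option, which merges adjacent columns of the same type into one array. A minimal sketch, using a temporary file filled with the sample rows from the question:

```matlab
% Recreate the sample .ptx content: two header lines, then the point data.
fname = [tempname '.ptx'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', '100', '50', ...
    '0.352 -5.207 -0.823 0.238 61 61 61', ...
    '0.345 -5.202 -0.824 0.234 60 60 60');
fclose(fid);

fid = fopen(fname, 'r');
assert(fid ~= -1, 'Could not open file');

% Skip the two header lines and collect all seven double columns together.
C = textscan(fid, '%f %f %f %f %f %f %f', ...
             'HeaderLines', 2, 'CollectOutput', true);
fclose(fid);
delete(fname);

% With CollectOutput, C holds a single N-by-7 matrix.
M = C{1};
```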

fscanf file read in matlab for mixed numeric and non-numeric data (textscan not available)

I am trying to read a data file, but I have an older version of MATLAB that does not include textscan. I am trying to use fscanf, but I am unable to figure out how to read the second element, which is in a time format. The txt data looks like this:
20120502,16:30:00,1397.5,1397.5,1397.0,1397.5,1283
20120502,16:32:00,1397.25,1397.5,1397.0,1397.0,582
I have tried this, with different attempts at figuring out the 2nd column which is the time vector, but I am not having any luck.
fid = fopen('C:\matlab\data\GLOBEX.txt','r');
[c] = fscanf(fid, '%f %s %f %f %f %f %f');
Thanks
Try the following:
[c] = fscanf(fid, '%f,%d:%d:%d,%f,%f,%f,%f,%f');
c = reshape(c, 9, length(c)/9)';
Now you have hours, minutes, and seconds in columns 2, 3, and 4.
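Here is a self-contained sketch of the same idea, using a temporary file with the two sample rows from the question. The last step, which combines the three time columns into seconds since midnight, is just one possible way to use them, and the variable name secondsOfDay is made up for illustration.

```matlab
% Recreate the two sample rows in a temp file.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', '20120502,16:30:00,1397.5,1397.5,1397.0,1397.5,1283', ...
                     '20120502,16:32:00,1397.25,1397.5,1397.0,1397.0,582');
fclose(fid);

fid = fopen(fname, 'r');
assert(fid ~= -1, 'Could not open file');

% The literal commas and colons in the format consume the separators,
% so each row yields nine numbers: date, h, m, s and five values.
c = fscanf(fid, '%f,%d:%d:%d,%f,%f,%f,%f,%f');
fclose(fid);
delete(fname);

% fscanf returns one long column; reshape it to one row per line.
c = reshape(c, 9, [])';

% Combine hours, minutes and seconds (columns 2 to 4) into one value.
secondsOfDay = c(:, 2) * 3600 + c(:, 3) * 60 + c(:, 4);
```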