Read a txt file into a matrix and a cell array in MATLAB

I have a txt file with the entries below. How can I read the numerical values from the second column through the last column into a matrix, and the first column into a cell array?
I've tried importdata and fscanf, but I don't understand what's going on.
CP6 7,2 -2,7 6,6
P5 -5,8 -5,9 5,8
P6 5,8 -5,9 5,8
AF7 -5,0 7,2 3,6
AF8 5,0 7,2 3,6
FT7 -7,6 2,8 3,6

This should give you what you want, based on the text sample you supplied.
fileID = fopen('x.txt'); %open file x.txt
m=textscan(fileID,'%s %d ,%d %d ,%d %d ,%d'); %one label plus six integers per line
fclose(fileID); %close file
col1 = m{1,1}; %get first column into cell array col1
colRest = cell2mat(m(1,2:end)); %convert all six numeric columns into matrix colRest
Look up textscan for more info on reading specially formatted data.

This function should do the trick. It reads your file and scans it according to your pattern, then puts the first column in a cell array and the others in a matrix.
function [ C1,A ] = scan_your_txt_file( filename )
fid = fopen(filename,'rt');
C = textscan(fid, '%s %d,%d %d,%d %d,%d');
fclose(fid);
C1 = C{1};
A = cell2mat(C(2:size(C,2)));
end

Have you tried xlsread? It returns a numeric array and two non-numeric arrays.
[N,T,R]=xlsread('yourfilename.txt')
However, your data is not comma delimited. It also looks like you are using a comma to represent a decimal point. Does this array have 7 columns or 4? Because I'm in the US, I'm going to assume you have paired coordinates, with the comma as one kind of delimiter and the space as a second one.
So here is something kludgy. It is a gross, ugly hack, but it works.
%housekeeping
clc
%get name of raw file
d=dir('*22202740*.txt');
%translate from comma-as-decimal to period-as-decimal
fid = fopen(d(1).name,'r'); %source
fid2= fopen('myout.txt','w+'); %sink
while 1
tline = fgetl(fid); %read
if ~ischar(tline), break, end %end loop
fprintf(fid2,'%s\r\n',strrep(tline,',','.')); %write updated line to output
end
fclose(fid);
fclose(fid2);
%open, gulp, parse/store, close
fid3 = fopen('myout.txt','r');
C=textscan(fid3,'%s %f %f %f ');
fclose(fid3);
%measure waist size and height
[n,m]=size(C);
n=length(C{1});
%put in slightly more friendly form
temp=zeros(n,m);
for i=2:m
t0=C{i};
temp(:,i)=t0;
end
%write to excel
xlswrite('myout_22202740.xlsx',temp(:,2:end),['b1:' char(96+m) num2str(n)]);
xlswrite('myout_22202740.xlsx',C{1},['a1:a' num2str(n)])
%read from excel
[N,T,R]=xlsread('myout_22202740.xlsx')
If you want those commas to be decimal points, then that is a different question.
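As a hedged aside, the comma-to-period translation in the hack above can also be done in memory, without the intermediate myout.txt. The sketch below assumes the commas really are decimal separators. It first writes two lines of the sample to a temporary file so that it runs on its own; the file name and the variables labels and values are made up for illustration.

```matlab
% Write two sample lines to a temporary file (illustrative data only).
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, 'CP6 7,2 -2,7 6,6\nP5 -5,8 -5,9 5,8\n');
fclose(fid);

% Read the whole file, turn decimal commas into periods, then parse.
txt = fileread(fname);                 % entire file as one char vector
txt = strrep(txt, ',', '.');           % 7,2 becomes 7.2
C = textscan(txt, '%s %f %f %f');      % textscan also accepts a char vector
labels = C{1};                         % first column as a cell array of strings
values = [C{2:end}];                   % remaining columns as a numeric matrix
delete(fname);                         % remove the temporary file
```

This only works if no other commas appear in the file, since strrep replaces every one of them.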

Related

Matlab - string containing a number and equal sign

I have a data file that contains parameter names and values with an equal sign in between them. It's like this:
A = 1234
B = 1353.335
C =
D = 1
There is always one space before and after the equal sign. The problem is some variables don't have values assigned to them like "C" above and I need to weed them out.
I want to read the data file (text) into a cell and just remove the lines with those invalid statements or just create a new data file without them.
Whichever is easier, but I will eventually read the file into a cell with textscan command.
The values (numbers) will be treated as double precision.
Please, help.
Thank you,
Eric
Try this:
fid = fopen('file.txt'); %// open file
x = textscan(fid, '%s', 'delimiter', '\n'); %// or '\r'. Read each line into a cell
fclose(fid); %// close file
x = x{1}; %// each cell of x contains a line of the file
ind = ~cellfun(@isempty, regexp(x, '=\s[\d\.]+$')); %// desired lines: space, numbers, end
x = x(ind); %// keep only those lines
If you just want to get the variables, and reject lines that do not have any character, this might work (the data.txt is just a txt generated by the example of data you have given):
fid = fopen('data.txt');
tline = fgets(fid);
while ischar(tline)
tmp = cell2mat(regexp(tline,'\=(.*)','match'));
b=str2double(tmp(2:end));
if ~isnan(b)
disp(b)
end
tline = fgets(fid);
end
fclose(fid);
I am reading the txt file line by line, using a regular expression to keep only the part from the equal sign onward, and then converting the text after the equal sign to a double.
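If both the parameter names and the values are needed, one possible variation on the two answers above is to capture them with regexp tokens. This is only a sketch: the lines are typed in directly instead of read from a file, the pattern assumes a single space on each side of the equal sign as stated in the question, and the variable names (lines, tok, keep, names, vals) are illustrative.

```matlab
% Sample lines, as they would come out of textscan with '%s' and a newline delimiter.
lines = {'A = 1234'; 'B = 1353.335'; 'C ='; 'D = 1'};

% Capture the name and the value; lines without a value do not match.
tok = regexp(lines, '^(\w+) = (\S+)\s*$', 'tokens', 'once');
keep = ~cellfun(@isempty, tok);        % true for lines that have a value

tok = vertcat(tok{keep});              % N-by-2 cell: {name, value text}
names = tok(:,1);                      % parameter names
vals = str2double(tok(:,2));           % values as double precision
```

A value that is not a valid number would come back as NaN from str2double, so an extra isnan check may be useful on real data.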

Matlab reading txt formatted file

Suppose there is a .txt file in the format
Name, Home, 1, 2, 3, 3, 3, 3
where the first two columns are strings and the rest are integers.
How do I read the first two columns as vectors of strings, and the remaining columns as a numeric matrix?
One way of doing this so you know exactly what's happening line by line is in the following piece of code:
fid = fopen('textfile.txt');
clear data
tline = fgetl(fid);
n = 1;
while ischar(tline)
data(n,:) = strsplit(tline(1:end),', ');
n=n+1;
tline = fgetl(fid);
end
fclose(fid);
dataStrings = data(:,1:2);
dataValues = str2double(data(:,3:end));
where data contains everything as strings, dataStrings contains only the first 2 columns as strings, and dataValues contains the rest of the columns as type double.
This way the numeric part ends up in a plain matrix. Note that data and dataStrings are still cell arrays of strings, because that is what strsplit returns.
Use textscan:
fileID = fopen('sometextfile.txt');
C = textscan(fileID,'%s %s %f %f %f %f %f %f','Delimiter',','); % assuming you want double data types, change as required
fclose(fileID);
celldisp(C) % C is a cell array
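To get the numeric columns back as a single matrix rather than six separate cells, textscan's 'CollectOutput' option can be added to the call above. The following is a sketch on an in-memory string; the second data row and the variable names strs and nums are invented for illustration.

```matlab
% Two sample rows in the format from the question (second row is made up).
txt = sprintf('Name, Home, 1, 2, 3, 3, 3, 3\nAnna, Work, 4, 5, 6, 7, 8, 9\n');

% 'CollectOutput' merges consecutive columns of the same class.
C = textscan(txt, '%s %s %f %f %f %f %f %f', ...
    'Delimiter', ',', 'CollectOutput', true);

strs = C{1};    % N-by-2 cell array holding the two string columns
nums = C{2};    % N-by-6 double matrix holding the numeric columns
```

With a file instead of a string, the same call should work with the file identifier returned by fopen as the first argument.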

Can Matlab readtable work on a text file delimited with variable numbers of spaces?

I have several text files that are formatted something like this, each file with a different number of rows (but around 1000 rows in each).
Id X Y Curve
1 0.0000000000 -0.0000286102 Domain_BCs
2 0.0010000000 -202.5294952393 Domain_BCs
3 0.2028919513 -1098.9577636719 Domain_BCs
4 1.0000000000 -2286.1757812500 Domain_BCs
I want to bring this data into Matlab, break it into separate vectors according to the string in the Curve column, and plot Y as a function of X.
The data is space-delimited with a variable number of spaces, and there are also a variable number of spaces at the start of each row (before the Id column). I know that readtable would work if there were no spaces at the beginning of the rows and only one space between columns. Is there any way to make readtable work with data in this format?
I also considered using textscan, but my understanding is that I would need to know the number of rows in order to use textscan, which makes things trickier because the number of rows is different for each file I want to process.
Textscan is meant for exactly this purpose. You can use textscan without knowing the number of lines :) By default, any amount of whitespace is interpreted as a single delimiter. So just use:
FID = fopen('test2.txt');
formatSpec = '%d %f %f %s';
C = textscan(FID,formatSpec);
fclose(FID)
In test2.txt I just pasted your example a few times (without headers).
Each column of your file is then read into a cell in C.
Source: http://www.mathworks.nl/help/matlab/ref/textscan.html
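The question also asks to keep the header line in the file and to split the data by the Curve column. A possible sketch, assuming one header line and using made-up sample rows (the curve name Other and the variables names, grp, xs and ys are illustrative), is to skip the header with 'HeaderLines' and group the rows with unique:

```matlab
% In-memory sample with a header line and irregular spacing (made-up values).
txt = sprintf(['   Id   X   Y   Curve\n' ...
               '  1  0.0  -0.5  Domain_BCs\n' ...
               ' 2   1.0  -2.5  Other\n' ...
               '3 2.0 -3.5 Domain_BCs\n']);

% Skip the header line; runs of spaces count as one delimiter by default.
C = textscan(txt, '%d %f %f %s', 'HeaderLines', 1);

% Group the rows by the text in the Curve column.
[names, ~, grp] = unique(C{4});        % grp(i) is the curve index of row i
xs = cell(numel(names), 1);
ys = cell(numel(names), 1);
for k = 1:numel(names)
    xs{k} = C{2}(grp == k);            % X values of curve k
    ys{k} = C{3}(grp == k);            % Y values of curve k
end
```

Each curve can then be drawn with plot(xs{k}, ys{k}), with hold on between calls if they should share one axes.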
fgets - reads the file one line at a time, so you do not need to know the number of lines in advance
strsplit - splits a string at the given delimiters
fid = fopen('yourfile.txt');
tline = fgets(fid);
while ischar(tline)
trow = strsplit(strtrim(tline), ' ', 'CollapseDelimiters',true); % strtrim drops the leading spaces and the trailing newline
tline = fgets(fid);
end
fclose(fid);
If you want to speed things up a little, count the lines first and preallocate:
fid = fopen('yourfile.txt');
counter = 0;
tline = fgets(fid);
trow = strsplit(strtrim(tline), ' ', 'CollapseDelimiters',true);
while ischar(tline) % first pass: count the lines
    counter = counter + 1;
    tline = fgets(fid);
end
T = cell(counter, length(trow)); % cell array, because the Curve column is text
frewind(fid);
for k = 1:counter % second pass: fill the preallocated array
    tline = fgets(fid);
    T(k,:) = strsplit(strtrim(tline), ' ', 'CollapseDelimiters',true);
end
fclose(fid);

Reading CSV with mixed type data

I need to read the following csv file in MATLAB:
2009-04-29 01:01:42.000;16271.1;16271.1
2009-04-29 02:01:42.000;2.5;16273.6
2009-04-29 03:01:42.000;2.599609;16276.2
2009-04-29 04:01:42.000;2.5;16278.7
...
I'd like to have three columns:
timestamp;value1;value2
I tried the approaches described here:
Reading date and time from CSV file in MATLAB
modified as:
filename = 'prova.csv';
fid = fopen(filename, 'rt');
a = textscan(fid, '%s %f %f', ...
'Delimiter',';', 'CollectOutput',1);
fclose(fid);
But it returns a 1x2 cell whose first element is a{1}='ÿþ2'; the others are empty.
I had also tried to adapt to my case the answers to these questions:
importing data with time in MATLAB
Read data files with specific format in matlab and convert date to matal serial time
but I didn't succeed.
How can I import that csv file?
EDIT: After @macduff's answer, I tried copy-pasting the data reported above into a new file and used:
a = textscan(fid, '%s %f %f','Delimiter',';');
and it works.
Unfortunately that didn't solve the problem because I have to process csv files generated automatically, which seems to be the cause of the strange MATLAB behavior.
What about trying:
a = textscan(fid, '%s %f %f','Delimiter',';');
For me I get:
a =
{4x1 cell} [4x1 double] [4x1 double]
So each element of a corresponds to a column in your csv file. Is this what you need?
Thanks!
Seems you're going about it the right way. The example you provide poses no problems here, I get the output you desire. What's in the 1x2 cell?
If I were you I'd try again with a smaller subset of the file, say 10 lines, and see if the output changes. If yes, then try 100 lines, etc., until you find where the 4x1 cell + 4x2 array breaks down into the 1x2 cell. It might be that there's an empty line or a single empty field or whatever, which forces textscan to collect data in an additional level of cells.
Note that 'CollectOutput',1 will collect the last two columns into a single array, so you'll end up with 1 cell array of 4x1 containing strings, and 1 array of 4x2 containing doubles. Is that indeed what you want? Otherwise, see @macduff's post.
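One more thing worth checking: the characters ÿþ at the start of a{1} are what the bytes FF FE look like when shown as single-byte text, and FF FE is the byte order mark of a UTF-16 little-endian file. If the automatically generated files are indeed UTF-16, which is an assumption based only on that output, a possible workaround is to read the raw bytes, decode them with native2unicode, and hand the decoded text to textscan. The sketch below builds its own small UTF-16LE test file; the file name and variable names are illustrative.

```matlab
% Build a small UTF-16LE test file that starts with the byte order mark FF FE.
sample = sprintf(['2009-04-29 01:01:42.000;16271.1;16271.1\r\n' ...
                  '2009-04-29 02:01:42.000;2.5;16273.6\r\n']);
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fwrite(fid, [255 254 unicode2native(sample, 'UTF-16LE')], 'uint8');
fclose(fid);

% Read the raw bytes and decode them as UTF-16LE.
fid = fopen(fname, 'r');
bytes = fread(fid, inf, 'uint8=>uint8')';
fclose(fid);
txt = native2unicode(bytes, 'UTF-16LE');

% Drop the byte order mark character (U+FEFF) if it survived decoding.
if ~isempty(txt) && txt(1) == char(65279)
    txt(1) = [];
end

% Parse the decoded text exactly as in the answers above.
a = textscan(txt, '%s %f %f', 'Delimiter', ';');
delete(fname);                         % remove the temporary file
```

If the real files turn out to use a different encoding, the encoding name passed to native2unicode would need to change accordingly.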
I've had to parse large files like this, and I found I didn't like textscan for this job. I just use a basic while loop to parse the file, and I use datevec to extract the timestamp components into a 6-element time vector.
%% Optional: initialize for speed if you have large files
n = 1000; %% <# of rows in file - if known>
timestamp = zeros(n,6);
value1 = zeros(n,1);
value2 = zeros(n,1);
fid = fopen(fname, 'rt');
if fid < 0
    error('Error opening file %s', fname); % exit point
end
cntr = 0;
while true
    tline = fgetl(fid); %% get one line
    if ~ischar(tline), break; end % break out of loop at end of file
    cntr = cntr + 1;
    splitLine = strsplit(tline, ';'); %% split the line on ; delimiters
    timestamp(cntr,:) = datevec(splitLine{1}, 'yyyy-mm-dd HH:MM:SS.FFF'); %% using datevec to parse time gives you a standard timestamp vector
    value1(cntr) = str2double(splitLine{2}); %% the fields are text, so convert them to double
    value2(cntr) = str2double(splitLine{3});
end
fclose(fid);
%% Concatenate at the end if you like (keeping only the rows actually read)
result = [timestamp(1:cntr,:) value1(1:cntr) value2(1:cntr)];

How to convert Matlab variables to .dat (text) file with headers

EDITED QUESTION:
I have 2500 rows x 100 columns data in variable named avg_data_models. I also have 2500 rows x 100 columns variable 'X' and similar size matrix variable 'Y', both containing the co-ordinates. I want to save the values of this variable in a text (.dat) file which must have 302 header lines in the following manner:
avg_data_models
300
X_1
X_2
.
.
.
X_100
Y_1
Y_2
.
.
.
Y_100
avg_data_models_1
avg_data_models_2
avg_data_models_3
.
.
.
.
.
avg_data_models_100
In the above header style, the first line is the name of the file, the 2nd line tells the number of columns (each column has 2500 rows), and the rest of the 300 lines represent the model of each variable respectively - Like 100 models of X, 100 models of Y and 100 models of avg_data_models.
Consider this code:
%# here you have your data X/Y/..
%#X = rand(2500,100);
[r c] = size(X);
prefixX = 'X';
prefixY = 'Y';
prefixData = 'avg_data_models';
%# build a cell array that contains all the header lines
num = strtrim( cellstr(num2str((1:c)','_%d')) );
headers = [ prefixData ;
num2str(3*c) ;
strcat(prefixX,num) ;
strcat(prefixY,num) ;
strcat(prefixData,num) ];
%# write to file
fid = fopen('outputFile.dat', 'wt');
fprintf(fid, '%s\n',headers{:});
fclose(fid);
EDIT
It seems I misunderstood the question.. Here's the code to write the actual data (not the header titles!):
%# here you have your data X/Y/..
avg_data_models = rand(2500,100);
X = rand(2500,100);
Y = rand(2500,100);
%# create file, and write the title and number of columns
fid = fopen('outputFile.dat', 'wt');
fprintf(fid, '%s\n%d\n', 'avg_data_models', 3*size(X,2));
fclose(fid);
%# append rest of data
dlmwrite('outputFile.dat', [X Y avg_data_models], '-append', 'delimiter',',')
Note: I used a comma , as delimiter, you can change it to be a space or a tab \t if you like..
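Putting the two snippets above together, a scaled-down sketch that writes both the header lines and the data might look like the following. It uses 4-by-3 matrices instead of 2500-by-100, a tab as delimiter, and a temporary file name; the helper mk and the variable D (standing in for avg_data_models) are made up for illustration.

```matlab
% Small stand-ins for X, Y and avg_data_models (4 rows, 3 columns each).
X = rand(4,3);  Y = rand(4,3);  D = rand(4,3);
c = size(X,2);

% Helper that builds {'prefix_1'; 'prefix_2'; ...} for a given prefix.
mk = @(p) arrayfun(@(k) sprintf('%s_%d', p, k), (1:c)', 'UniformOutput', false);

% Title, number of columns, then one header line per column.
headers = [{'avg_data_models'}; {num2str(3*c)}; ...
           mk('X'); mk('Y'); mk('avg_data_models')];

% Write the header lines, then append the data below them.
fname = [tempname '.dat'];
fid = fopen(fname, 'wt');
fprintf(fid, '%s\n', headers{:});
fclose(fid);
dlmwrite(fname, [X Y D], '-append', 'delimiter', '\t');
```

With c = 100 this gives 2 + 3*100 = 302 header lines, matching the count in the question. dlmwrite writes only a few significant digits by default, so its 'precision' option may be worth setting for real data.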
You can use fprintf to write the header, like so:
%# define the number of data
nModels = 100;
dataName = 'avg_data_models';
%# open the file
fid = fopen('output.dat','w');
%# start writing. First line: title
fprintf(fid,'%s\n',dataName); %# don't forget \n for newline. Use \r\n if you want to open this in Notepad
%# write number of models
fprintf(fid,'%i\n',nModels);
%# loop to write the rest of the header
for iModel = 1:nModels
fprintf(fid,'%s_%i\n',dataName,iModel);
end
%# use your favorite method to write the rest of the data.
%# for example, you could use fprintf again, using \t to add tabs
%# create format-string
%# check the help to fprintf to learn about formatting details
formatString = repmat('%f\t',1,100);
formatString = [formatString(1:end-1),'n']; %# swap the final 't' for 'n', so the last \t becomes \n
%# transpose the array, because fprintf reshapes the array to a vector and
%# 'fills' the format-strings sequentially until it runs out of data
fprintf(fid,formatString,avg_data_models');
%# close the file
fclose(fid);