How to read in lines of unequal length CSV doubles

I have a file where each line is a list of CSV doubles, i.e:
80,81,179,180,181,182
114,115,27,31,34
16,17,18,25
63,64,35,58,73,75,76,94,95
67,68
I need to read in each line, temporarily store it as a 1 x n double array for some calculations, then move onto the next line.
The idea I had was:
fid = fopen('fileName.txt');
tline = fgets(fid);
while ischar(tline)
    % Update with solution I came up with
    values = cellfun(@str2double, regexp(tline, ',', 'split'));
    tline = fgets(fid);
end

You can search for the commas in each line, then use either their index positions in the string or their count to loop to the end of the line.
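
A minimal sketch of the comma-index idea, assuming one line has already been read into tline (it is hard-coded here so the snippet runs without a file). The variable names commaIdx and bounds are illustrative only:

```matlab
% One line of the file, hard-coded so the sketch runs on its own.
tline = '114,115,27,31,34';

% Index of every comma, padded with sentinels for the start and end of the line.
commaIdx = strfind(tline, ',');
bounds = [0, commaIdx, length(tline) + 1];

% One field sits between each pair of neighbouring boundaries.
values = zeros(1, numel(bounds) - 1);
for k = 1:numel(values)
    values(k) = str2double(tline(bounds(k) + 1 : bounds(k + 1) - 1));
end
% values is now a 1-by-n double array for this line
```

Inside the fgets loop from the question, the same steps would run once per line, so n can differ from line to line.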

Related

I'm using numel to read and process multiple files one by one. But it barfs when there is tline statement

I'm using numel to read and process multiple files one by one, but it fails at the tline statement. Here is my code. Can someone help?
......
filename = ['20170101.BER' '20170102.BER' '20170103.BER' '20170104.BER'];
for i = 1:numel(filename)
fid = fopen(filename,'rt');
% Read file and find lines with the string 'DATA'
count=0;
got1=[];
got2=[];
while 1
tline = fgetl(fid);
if ~ischar(tline), break, end
count=count+1;
count5=0;
count3=[];
for(t=1:length(tline)-length(str)+1)
count4=0;
for(count2=1:length(str))
......
fclose(fid);
end

The call to fgetl fails because fid is not a valid file ID. The main problem is how you are storing and accessing filename. Square brackets concatenate the four names into a single 1-by-48 character array, so fopen is asked to open one file with that long name and returns -1. It's a better idea to store your file names in a cell array of character arrays, like so:
filename = {'20170101.BER' '20170102.BER' '20170103.BER' '20170104.BER'};
This is now a 1-by-4 cell array. When you loop over the 4 cells, you have to extract the character array to open the file like so:
fid = fopen(filename{i}, 'rt');
As long as these files are in the current directory, they should load just fine.
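
Putting the two changes together, a hedged sketch of the full loop might look like the following. The line counting is only a stand-in for the elided processing in the question, and the fid == -1 check and the nLines variable are additions for illustration, not part of the original code:

```matlab
filename = {'20170101.BER', '20170102.BER', '20170103.BER', '20170104.BER'};
nLines = zeros(1, numel(filename));   % illustrative result: lines per file

for i = 1:numel(filename)
    fid = fopen(filename{i}, 'rt');   % curly braces extract the char array
    if fid == -1
        % fopen returns -1 when the file cannot be opened
        warning('Could not open %s', filename{i});
        nLines(i) = NaN;
        continue
    end
    tline = fgetl(fid);
    while ischar(tline)
        nLines(i) = nLines(i) + 1;    % placeholder for the real per-line work
        tline = fgetl(fid);
    end
    fclose(fid);
end
```

Checking fid right after fopen makes a missing file show up as a clear warning instead of an error inside fgetl.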

Matlab - string containing a number and equal sign

I have a data file that contains parameter names and values with an equal sign in between them. It's like this:
A = 1234
B = 1353.335
C =
D = 1
There is always one space before and after the equal sign. The problem is some variables don't have values assigned to them like "C" above and I need to weed them out.
I want to read the data file (text) into a cell and just remove the lines with those invalid statements or just create a new data file without them.
Whichever is easier, but I will eventually read the file into a cell with textscan command.
The values (numbers) will be treated as double precision.
Please, help.
Thank you,
Eric

Try this:
fid = fopen('file.txt'); %// open file
x = textscan(fid, '%s', 'delimiter', '\n'); %// or '\r'. Read each line into a cell
fclose(fid); %// close file
x = x{1}; %// each cell of x contains a line of the file
ind = ~cellfun(@isempty, regexp(x, '=\s[\d\.]+$')); %// desired lines: space, numbers, end
x = x(ind); %// keep only those lines
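
Since the values will eventually be treated as doubles, the filtering and conversion can also be done in one pass with regexp tokens. This is a sketch under the assumption that every valid line has the exact form "name = number" with no trailing whitespace; the lines are hard-coded in x so it runs without a file, and tok, keep, names and vals are illustrative names:

```matlab
% Lines as they would come out of textscan (hard-coded for the sketch).
x = {'A = 1234'; 'B = 1353.335'; 'C ='; 'D = 1'};

% Capture the parameter name and the number; lines without a value do not match.
tok = regexp(x, '^(\w+) = ([\d\.]+)$', 'tokens', 'once');

% Drop the lines that produced no tokens.
keep = ~cellfun(@isempty, tok);
tok = tok(keep);

% Split the tokens into names (cell array of char) and values (double column).
names = cellfun(@(t) t{1}, tok, 'UniformOutput', false);
vals = cellfun(@(t) str2double(t{2}), tok);
```

Note that the pattern does not accept signs or exponents; it would need extending if the file contains values such as -1.5 or 1e3.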

If you just want to get the variables, and reject lines that do not have any character, this might work (the data.txt is just a txt generated by the example of data you have given):
fid = fopen('data.txt');
tline = fgets(fid);
while ischar(tline)
tmp = cell2mat(regexp(tline,'\=(.*)','match'));
b=str2double(tmp(2:end));
if ~isnan(b)
disp(b)
end
tline = fgets(fid);
end
fclose(fid);
I am reading the txt file line by line, using a regular expression to strip the unwanted characters, and then converting the value that remains to a double.

Can Matlab readtable work on a text file delimited with variable numbers of spaces?

I have several text files that are formatted something like this, each file with a different number of rows (but around 1000 rows in each).
Id X Y Curve
1 0.0000000000 -0.0000286102 Domain_BCs
2 0.0010000000 -202.5294952393 Domain_BCs
3 0.2028919513 -1098.9577636719 Domain_BCs
4 1.0000000000 -2286.1757812500 Domain_BCs
I want to bring this data into Matlab, break it into separate vectors according to the string in the Curve column, and plot Y as a function of X.
The data is space-delimited with a variable number of spaces, and there are also a variable number of spaces at the start of each row (before the Id column). I know that readtable would work if there were no spaces at the beginning of the rows and only one space between columns. Is there any way to make readtable work with data in this format?
I also considered using textscan, but my understanding is that I would need to know the number of rows in order to use textscan, which makes things trickier because the number of rows is different for each file I want to process.

textscan is meant for exactly this purpose. You can use it without knowing the number of lines :) By default, any amount of whitespace is interpreted as a single delimiter. So just use:
FID = fopen('test2.txt');
formatSpec = '%d %f %f %s';
C = textscan(FID,formatSpec);
fclose(FID)
In test2.txt I just pasted your example a few times (without headers).
Each column of your file is then read into a cell in C.
Source: http://www.mathworks.nl/help/matlab/ref/textscan.html
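
To then break the data into separate vectors per curve, one option is to group on the fourth column with unique. The sketch below feeds textscan a character array instead of a file so it is self-contained; the curve name Other_Curve and the variable names curves and grp are made up for illustration:

```matlab
% Sample rows with leading and variable spacing (no header line).
txt = sprintf(['   1   0.0000000000      -0.0000286102   Domain_BCs\n' ...
               '   2   0.0010000000    -202.5294952393   Domain_BCs\n' ...
               '   3   0.2028919513   -1098.9577636719   Other_Curve\n']);

% textscan also accepts a character array in place of a file ID.
C = textscan(txt, '%d %f %f %s');

% grp(k) is the index into curves of the curve that row k belongs to.
[curves, ~, grp] = unique(C{4});

for k = 1:numel(curves)
    X = C{2}(grp == k);
    Y = C{3}(grp == k);
    fprintf('%s: %d points\n', curves{k}, numel(X));
    % plot(X, Y) could be called here to draw Y as a function of X
end
```

When reading from a real file that still has its header row, textscan's 'HeaderLines' option can be used to skip it.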

fgets - Read lines without concerning number of lines
strsplit - split a string with delimiters
fid = fopen('yourfile.txt');
tline = fgets(fid);
while ischar(tline)
trow = strsplit(tline, ' ', 'CollapseDelimiters',true);
tline = fgets(fid);
end
fclose(fid);
If you want to speed up a little bit,
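fid = fopen('yourfile.txt');
counter = 0;
tline = fgets(fid);
trow = strsplit(strtrim(tline), ' ', 'CollapseDelimiters', true);
while ischar(tline)
    counter = counter + 1;          % first pass: count the lines
    tline = fgets(fid);
end
T = cell(counter, length(trow));    % preallocate; a cell array, since the last column is text
frewind(fid);
tline = fgets(fid);                 % read the first line again after rewinding
k = 0;
while ischar(tline)
    k = k + 1;
    % strtrim drops the leading spaces and the trailing newline kept by fgets
    T(k, :) = strsplit(strtrim(tline), ' ', 'CollapseDelimiters', true);
    tline = fgets(fid);
end
fclose(fid);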

MATLAB - How to save vectors with different length

I created a file that contains vectors and these could have empty space between their elements.
-77.4 1 0.17 260 88 1004.0 1006.5
-77.3 1 0.17 1009.2 1011.8
I save the file 'myfile.txt' row by row with fprintf() function.
Well, when I load the file with the command load('myfile.txt') I receive this error message "Number of columns on line ... must be the same as previous lines"
How can I fix it? Should I save the row vectors some other way, and if so, how?
Thank you

You would be better off using the save command, as @maxywb stated in his comment. But if you find yourself with a text file that does not have a consistent number of columns, you can parse the file line by line and save the results into a cell array:
fid = fopen('myfile.txt','r');
values = {};
count = 1;
tline = fgets(fid);
while ischar(tline)
values{count} = textscan(tline,'%f','delimiter',', ');
count = count+1;
tline = fgets(fid);
end
fclose(fid)
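
For the save route mentioned above, a minimal sketch is to keep the rows in a cell array and write them to a MAT-file, which has no same-length restriction. The variable name rows and the temporary file are assumptions for illustration:

```matlab
% Row vectors of different lengths, kept together in a cell array.
rows = {[-77.4 1 0.17 260 88 1004.0 1006.5], ...
        [-77.3 1 0.17 1009.2 1011.8]};

% Save to a MAT-file (a temporary file here so the sketch cleans up after itself).
matFile = [tempname '.mat'];
save(matFile, 'rows');

% load returns a struct with one field per saved variable.
loaded = load(matFile);
delete(matFile);

firstRow = loaded.rows{1};   % 1-by-7 double
secondRow = loaded.rows{2};  % 1-by-5 double
```

The trade-off is that a MAT-file is binary, so this only helps if the file does not need to stay human-readable.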

Text Scanning to read in unknown number of variables and unknown number of runs

I am trying to read in a csv file which will have the format
Var1 Val1A Val1B ... Val1Q
Var2 Val2A Val2B ... Val2Q
...
And I will not know ahead of time how many variables (rows) or how many runs (columns) will be in the file.
I have been trying to get text scan to work but no matter what I try I cannot get either all the variable names isolated or a rows by columns cell array. This is what I've been trying.
fID = fopen(strcat(pwd,'/',inputFile),'rt');
if fID == -1
disp('Could not find file')
return
end
vars = textscan(fID, '%s,%*s','delimiter','\n');
fclose(fID);
Does anyone have a suggestion?

If the file has the same number of columns in each row (you just don't know how many to begin with), try the following.
First, figure out how many columns by parsing just the first row and find the number of columns, then parse the full file:
% Open the file, get the first line
fid = fopen('myfile.txt');
line = fgetl(fid);
fclose(fid);
tmp = textscan(line, '%s');
% textscan returns a 1-by-1 cell; the number of entries inside it is the number of columns
n = length(tmp{1});
% Now scan the file
fid = fopen('myfile.txt');
tmp = textscan(fid, repmat('%s ', [1, n]));
fclose(fid);

For any given file, are all the lines equal length? If they are, you could start by reading in the first line and use that to count the number of fields and then use textscan to read in the file.
fID = fopen(strcat(pwd,'/',inputFile),'rt');
firstLine = fgetl(fID);
numFields = length(strfind(firstLine,' ')) + 1;
fclose(fID);
formatString = repmat('%s',1,numFields);
fID = fopen(strcat(pwd,'/',inputFile),'rt');
vars = textscan(fID, formatString, 'Delimiter', ' ');
fclose(fID);
Now you will have a cell array where first entry are the var names and all the other entries are the observations.
In this case I assumed the delimiter was space even though you said it was a csv file. If it is really commas, you can change the code accordingly.
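
A compact sketch of the same count-then-scan idea, assuming the first field of each row is a variable name and the rest are numeric. It works on character arrays rather than a file so it is self-contained; the sample values and the names fields, formatSpec, names and data are made up for illustration:

```matlab
% First line of the (hypothetical) file, used only to count the fields.
firstLine = 'Var1 0.5 1.5 2.5';
fields = strsplit(strtrim(firstLine));   % splits on whitespace by default
n = numel(fields);

% One %s for the variable name, then n-1 numeric fields.
formatSpec = ['%s' repmat(' %f', 1, n - 1)];

% Whole-file text, hard-coded here; with a real file, pass the file ID instead.
txt = sprintf('Var1 0.5 1.5 2.5\nVar2 3 4 5\n');
C = textscan(txt, formatSpec);

names = C{1};       % cell array of variable names, one per row
data = [C{2:end}];  % rows-by-runs numeric matrix
```

For a comma-separated file, the split and the textscan call would both need the comma as delimiter, for example strsplit(firstLine, ',') and textscan(..., 'Delimiter', ',').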
