Matlab: Put each line of a text file in a separate array

I have a file like the following
10158 18227 2055 24478 25532
12936 14953 17522 17616 20898 24993 24996
26375 27950 32700 33099 33496 3663
...
I would like to put each line in an array in order to access elements of each line separately.
I used a cell array, but it seems to create a 1-by-1 array for each cell element:
fid = fopen(filename);
nlines = fskipl(fid, Inf);   % fskipl (Octave): skip to end of file and return the number of lines
frewind(fid);
cells = cell(nlines, 1);
for ii = 1:nlines
    cells{ii} = fscanf(fid, '%s', 1);
end
fclose(fid);
When I access cells{ii} I get all the values in the same element and I can't access the individual values.

A shorter solution would be reading the file with textscan:
fid = fopen(filename, 'r');
L = textscan(fid, '%s', 'Delimiter', '\n');            % one string per line
fclose(fid);
C = cellfun(@str2num, L{1}, 'UniformOutput', false);   % each line becomes a numeric row vector
The resulting cell array C is what you're looking for.
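With the sample data above, each cell then holds one numeric row vector, so single values can be indexed directly, for example:
C{2}(3)   % returns 17522, the third value on the second line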

I think that fscanf(fid, '%s', 1); is telling MATLAB to read the line as a single string. You will still have to convert it to an array of numbers:
for ii = 1:nlines
    cells{ii} = str2num(fscanf(fid, '%s', 1));
end

Related

Using textscan to extract values from a text file

I have the following text file:
Leaf Tips:2867.5,1101.66666666667 2555,764.166666666667 2382.5,1221.66666666667 2115,759.166666666667 1845,1131.66666666667 1270,991.666666666667
Leaf Bases:1682.66666666667,800.333333333333 1886,850.333333333333 2226,920.333333333333 2362.66666666667,923.666666666667 2619.33333333333,967
Ear Tips:1029.33333333333,513.666666666667 1236,753.666666666667
Ear Bases:1419.33333333333,790.333333333333 1272.66666666667,677
These are coordinates of regions of interest for each category in an image. I need to extract these regions. I know I have to use textscan to accomplish this, but I am unsure of the formatSpec options needed, since whichever setting I use seems to give me some jumbled form of cell output.
What formatSpec should I use so that I get the coordinates of each region output in a cell?
I've tried the following:
file = '0.txt';
fileID = fopen(file);
formatSpec = '%s %f %f %f %f %f %f %f %f';
C = textscan(fileID, formatSpec, 150, 'Delimiter', ':');
Here's an example of what you can do:
fid = fopen('0.txt'); % open file
T = textscan(fid, '%s','Delimiter',':'); % read all lines, separate row names from numbers
fclose(fid); % close file
T = reshape(T{1},2,[]).'; % rearrange outputs so it makes more sense
T = [T(:,1), cellfun(@(x)textscan(x,'%f','Delimiter',','), T(:,2))]; % parse numbers
Which will result in a cell array as follows:
T =
4×2 cell array
{'Leaf Tips' } {12×1 double}
{'Leaf Bases'} {10×1 double}
{'Ear Tips' } { 4×1 double}
{'Ear Bases' } { 4×1 double}
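If you would rather have each region as an N-by-2 matrix of (x,y) pairs instead of a flat column vector, one extra step (a sketch, assuming the coordinates alternate x,y as in the question) could be:
T(:,2) = cellfun(@(v) reshape(v, 2, []).', T(:,2), 'UniformOutput', false); % pair up x and y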
This is how I would do it:
fid = fopen('file.txt');
x = textscan(fid,'%s', 'Delimiter', char(10)); % read each line
fclose(fid);
x = x{1};
x = regexp(x, '\d*\.?\d*', 'match'); % extract numbers of each line
C = cellfun(@(t) reshape(str2double(t), 2, []).', x, 'UniformOutput', false); % rearrange
Result:
>> celldisp(C)
C{1} =
1.0e+03 *
2.867500000000000 1.101666666666670
2.555000000000000 0.764166666666667
2.382500000000000 1.221666666666670
2.115000000000000 0.759166666666667
1.845000000000000 1.131666666666670
1.270000000000000 0.991666666666667
C{2} =
1.0e+03 *
1.682666666666670 0.800333333333333
1.886000000000000 0.850333333333333
2.226000000000000 0.920333333333333
2.362666666666670 0.923666666666667
2.619333333333330 0.967000000000000
C{3} =
1.0e+03 *
1.029333333333330 0.513666666666667
1.236000000000000 0.753666666666667
C{4} =
1.0e+03 *
1.419333333333330 0.790333333333333
1.272666666666670 0.677000000000000
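Individual coordinates can then be indexed directly, for example:
C{1}(3, 2)   % y-coordinate of the third 'Leaf Tips' point, about 1221.67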

importing text file data by blocks?

I am trying to import every row that starts with '//'. I have tried to extract it with the script below. Can anybody check my script, please?
formatSpec = '//NFE=%f //ElapsedTime=%f //SBX=%f //DE=%f //PCX=%f //SPX=%f //UNDX=%f //UM=%f //Improvements=%f //Restarts=%f //PopulationSize=%f //ArchiveSize=%f //MutationIndex=%f %*f';
N = 1;
k = 0;
while ~feof(fileID)
    k = k+1;
    C = textscan(fileID,formatSpec,N,'CommentStyle','#','Delimiter','\n');
end
It is not clear to me how you want the output to look, but here is one possibility:
fid = fopen(filename, 'rt');
dataset = textscan(fid, '%s', 'delimiter', '\n', 'headerlines', 0);
fclose(fid);
result = regexp(dataset{1}, '//([A-Za-z].*)=([0-9\.].*)', 'tokens');
result = result(cellfun(@(x) ~isempty(x), result));
result contains both the type, e.g. NFE or SBX, and the number (albeit in character format).
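To get numeric values out of those character tokens, one further step could look like this (a sketch, assuming each surviving cell of result holds a single {name, value} token pair):
names  = cellfun(@(x) x{1}{1}, result, 'UniformOutput', false);   % e.g. 'NFE', 'SBX'
values = cellfun(@(x) str2double(x{1}{2}), result);               % the matching numbers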

Load large matrix from text file in matlab

I have a text file like :
[ 1, 2, 3;
2, 4, 5;
2, 2, 2;
8, 3, 3 ]
What is the quickest way to load this as a matrix in Octave/Matlab? I want to see this as a matrix with 4 rows and 3 cols.
Drag the text file with your mouse over your workspace in MATLAB (the area where all your current variables are shown) and drop it there. This opens the "import" window.
Give the file a name (mine is currently "NewTextDocument2") and select IMPORT on the top right. MATLAB will take care of semicolons and brackets. If you want to have a function that does this, select "generate function" instead of IMPORT.
I'm not sure if this is the simplest way:
fid = fopen('filename.txt','r');
C = textscan(fid, '%f %f %f', ...
'Delimiter',' ','MultipleDelimsAsOne', 1);
fclose(fid);
DataMatrix = cat(2,C{:});
Quick and really dirty approach using the generally non-recommended function eval:
fid = fopen('data.txt');
s = fscanf(fid, '%s');
fclose(fid);
eval(['dataMatrix = ' s ';']);
in Octave you could do
fid = fopen ("yourfile", "r");
x = str2num (char(fread(fid))');
fclose (fid)
(I don't know if this works in Matlab)
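In MATLAB, a similar one-liner (a rough sketch along the same lines, not a verified port) could use fileread, which returns the whole file as one character row vector:
x = str2num(fileread('yourfile'));   % str2num evaluates the bracketed text as a matrix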
If your file contains only numbers, you can use the MATLAB load() function. This function is often used to load .mat files, but it is also capable of dealing with what MATLAB calls ASCII files.
Say your file is purely textual, contains only numbers and is structured as follows:
filename.txt
1 2 3
2 4 5
2 2 2
8 3 3
The load function will create a variable called filename containing your array:
>> load('filename.txt');
>> filename
filename =
     1     2     3
     2     4     5
     2     2     2
     8     3     3
This works with your current text-file format.
Use this function importfile.m:
function filename = importfile(filename, startRow, endRow)
% Read the comma-delimited text file and return its numeric contents as a matrix.
delimiter = ',';
if nargin <= 2
    startRow = 1;
    endRow = inf;
end
formatSpec = '%s%s%s%[^\n\r]';
fileID = fopen(filename,'r');
% Read the first (or only) block of rows as text.
dataArray = textscan(fileID, formatSpec, endRow(1)-startRow(1)+1, 'Delimiter', delimiter, 'HeaderLines', startRow(1)-1, 'ReturnOnError', false);
for block = 2:length(startRow)
    frewind(fileID);
    dataArrayBlock = textscan(fileID, formatSpec, endRow(block)-startRow(block)+1, 'Delimiter', delimiter, 'HeaderLines', startRow(block)-1, 'ReturnOnError', false);
    for col = 1:length(dataArray)
        dataArray{col} = [dataArray{col}; dataArrayBlock{col}];
    end
end
fclose(fileID);
raw = repmat({''}, length(dataArray{1}), length(dataArray)-1);
for col = 1:length(dataArray)-1
    raw(1:length(dataArray{col}), col) = dataArray{col};
end
numericData = NaN(size(dataArray{1},1), size(dataArray,2));
% Convert the text in each cell to a number, ignoring any non-numeric prefix or suffix (such as the brackets).
for col = [1,2,3]
    rawData = dataArray{col};
    for row = 1:size(rawData, 1)
        regexstr = '(?<prefix>.*?)(?<numbers>([-]*(\d+[\,]*)+[\.]{0,1}\d*[eEdD]{0,1}[-+]*\d*[i]{0,1})|([-]*(\d+[\,]*)*[\.]{1,1}\d+[eEdD]{0,1}[-+]*\d*[i]{0,1}))(?<suffix>.*)';
        try
            result = regexp(rawData{row}, regexstr, 'names');
            numbers = result.numbers;
            % Detect commas in non-thousand locations.
            invalidThousandsSeparator = false;
            if any(numbers == ',')
                thousandsRegExp = '^\d+?(\,\d{3})*\.{0,1}\d*$';
                if isempty(regexp(numbers, thousandsRegExp, 'once'))
                    numbers = NaN;
                    invalidThousandsSeparator = true;
                end
            end
            % Convert the cleaned string to a number.
            if ~invalidThousandsSeparator
                numbers = textscan(strrep(numbers, ',', ''), '%f');
                numericData(row, col) = numbers{1};
                raw{row, col} = numbers{1};
            end
        catch me
        end
    end
end
filename = cell2mat(raw);
How to use it:
>> importfile('file.txt',1,4)
ans =
1 2 3
2 4 5
2 2 2
8 3 3

Replace N lines in a txt file without using cell

Is there a faster way to replace the third line of a file with another one without using a cell array?
I've used this code, but it slows down my program, especially since my txt file contains more than 1000 lines:
% Read txt into cell A
fid7 = fopen([handles.filenameproba],'r');
i = 1;
tline = fgetl(fid7);
A{i} = tline;
while ischar(tline)
    i = i+1;
    tline = fgetl(fid7);
    A{i} = tline;
end
fclose(fid7);
% Change cell A
newval =...
A{3} = sprintf('StartExperiment:%s', num2str(newval));
% Write cell A into txt
fid7 = fopen([handles.filenameproba], 'w');
for i = 1:numel(A)
    if A{i+1} == -1
        fprintf(fid7,'%s', A{i});
        break
    else
        fprintf(fid7,'%s\n', A{i});
    end
end
fclose(fid7);
Thanks!
If performance is your primary concern, try this importdata approach to see if it's any faster -
f=importdata(handles.filenameproba,'')
f(3)={sprintf('StartExperiment:%s',num2str(newval))}
%%// Save the modified text file
fid1 = fopen(handles.filenameproba,'w');
for k = 1:numel(f)
    fprintf(fid1,'%s\n',f{k});
end
fclose(fid1);
Cells are not your problem. Your problem is reading and writing one line at a time. Additionally, you are re-sizing your cell array at every iteration.
For the following problem, I created a test file with 10000 lines in it.
fileId = fopen('test.txt', 'w');
for i = 1:10000
    fprintf(fileId, 'This is the %dth line.\n', i);
end
fclose(fileId);
I'm calling your method ranellMethod.
>> timeit(#ranellMethod)
ans =
0.5160
A better way to do it is to limit the number of read/write operations you have to do. Assuming your file is small enough, you can read the entire contents into memory at once. Perform your operations, and write everything at once.
function entireFileMethod()
fileName = 'test.txt';
fileId = fopen(fileName, 'r');
try
    text = fread(fileId, '*char')';
catch Me
    fclose(fileId);
    Me.rethrow();
end
fclose(fileId);
newLineInds = find(text == char(10));   % positions of the newline characters
newLine = sprintf('This is the new line. %1.*f', randi(10), rand);
newText = [text(1:newLineInds(2)), newLine, text(newLineInds(3):end)];   % splice in the new third line
fileId = fopen(fileName, 'w');
try
    fprintf(fileId, '%s', newText);
catch Me
    fclose(fileId);
    Me.rethrow();
end
fclose(fileId);
end
This function has one read operation, and one write operation:
>> timeit(#entireFileMethod)
ans =
0.0043
See "Fastest Matlab file reading?" for more detailed information about file I/O in MATLAB.

Matlab, Avoid empty lines

I have a function [Q,A] = load_test(filename) which loads a text file. I would like the function to skip empty lines, but I'm not sure how to do it.
I have tried to use
~isempty(x), ~ischar(x)
but I keep getting an error message. My code so far is:
fid = fopen(filename);
data = textscan(fid, '%s','delimiter','\n');
fclose(fid);
Q = cellfun(@(x) x(1:end-2), data{1}, 'uni',0);
A = cellfun(@(x) x(end) == 'T' || x(end) == 'F' && ~isempty(x),data{1});
What do I need to do?
Code
%%// Your code
fid = fopen(filename);
data = textscan(fid, '%s','delimiter','\n')
fclose(fid);
%%// Additional code
%%// 1. Remove empty lines
c1 = ~cellfun(@isempty,data{:})
t1 = data{:,:}(c1,:)
%%// 2. Select only the lines that have F or T as end characters
lastInLine = regexp(t1,'.$','match','lineanchors') %%// Get the end characters
%%// Get a binary array of rows that have F or T at the end
c2 = strcmp(vertcat(lastInLine{:}),'F') | strcmp(vertcat(lastInLine{:}),'T')
%%// Finally select those rows/lines
data = {t1(c2,:)}
Please note that I am not sure if you still need Q and A.
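If you do still need them, a sketch based on your original definitions (assuming Q is each kept line without its last two characters and A is true where a line ends in 'T') could be:
kept = t1(c2,:);                                          % non-empty lines ending in T or F
Q = cellfun(@(x) x(1:end-2), kept, 'UniformOutput', false);
A = cellfun(@(x) x(end) == 'T', kept);                    % true for 'T', false for 'F'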