importing text file data by blocks?

I am trying to import every row that starts with '//'. I have tried to extract them with the script below. Can anybody check my script, please?
formatSpec = '//NFE=%f //ElapsedTime=%f //SBX=%f //DE=%f //PCX=%f //SPX=%f //UNDX=%f //UM=%f //Improvements=%f //Restarts=%f //PopulationSize=%f //ArchiveSize=%f //MutationIndex=%f %*f';
N=1
k = 0;
while ~feof(fileID)
k = k+1;
C = textscan(fileID,formatSpec,N,'CommentStyle','#','Delimiter','\n');
end

It is not clear to me how you want the output to look, but here is one possibility:
fid = fopen(filename, 'rt');
dataset = textscan(fid, '%s', 'delimiter', '\n', 'headerlines', 0);
fclose(fid);
result = regexp(dataset{1}, '//([A-Za-z].*)=([0-9\.].*)', 'tokens');
result = result(cellfun(@(x) ~isempty(x), result));
result contains both the type, e.g. NFE or SBX, and the number (albeit in character format).
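If numeric values are needed, the captured tokens can be converted with str2double. The following is only a minimal sketch: the sample lines are made up for illustration and stand in for dataset{1}, and the variable names tokens, names and values are not from the original answer.

```matlab
% Hypothetical sample lines standing in for dataset{1}
lines = {'//NFE=100'; '# a comment'; '//ElapsedTime=1.25'; '0.5 0.7'};

% Same pattern as above: capture the name and the value of each '//' line
tokens = regexp(lines, '//([A-Za-z].*)=([0-9\.].*)', 'tokens');
tokens = tokens(~cellfun(@isempty, tokens));

% Each remaining entry has the form {{name, value}}; unpack it
names  = cellfun(@(t) t{1}{1}, tokens, 'UniformOutput', false);
values = cellfun(@(t) str2double(t{1}{2}), tokens);
```

Here names is a cell array of strings and values is a numeric column vector with one entry per matching line.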

Related

Importing data block with Matlab

I have a set of data in the following format, and I would like to import each block in order to analyze them with Matlab.
Emax=0.5/real
----------------------------------------------------------------------
4.9750557 14535
4.9825821 14522
4.990109 14511
4.9976354 14491
5.0051618 14481
5.0126886 14468
5.020215 14437
5.0277414 14418
5.0352678 14400
5.0427947 14372
5.0503211 14355
5.0578475 14339
5.0653744 14321
Emax=1/real
----------------------------------------------------------------------
24.965595 597544
24.973122 597543
24.980648 597543
24.988174 597542
24.995703 597542
25.003229 597542
I have modified this piece of code from MathWorks, but I think I have problems dealing with the spaces between the columns.
Each block of data consists of 3874 rows and is preceded by a text line (Emax=XX/real) and a line of dashes. Unfortunately, this is the only way the software exports the data.

Here is one way to import the data:
% read file as a cell-array of lines
fid = fopen('file.dat', 'rt');
C = textscan(fid, '%s', 'Delimiter','');
C = C{1};
fclose(fid);
% remove separator lines
C(strncmp('---',C,3)) = [];
% location of section headers
headInd = [find(strncmp('Emax=', C, length('Emax='))) ; numel(C)+1];
% extract each section
num = numel(headInd)-1;
blocks = struct('header',cell(num,1), 'data',cell(num,1));
for i=1:num
    % section header
    blocks(i).header = C{headInd(i)};
    % data
    X = regexp(C(headInd(i)+1:headInd(i+1)-1), '\s+', 'split');
    blocks(i).data = str2double(vertcat(X{:}));
end
The result is a structure array containing the data from each block:
>> blocks
blocks =
2x1 struct array with fields:
header
data
>> blocks(2)
ans =
header: 'Emax=1/real'
data: [6x2 double]
>> blocks(2).data(:,1)
ans =
24.9656
24.9731
24.9806
24.9882
24.9957
25.0032
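Once the data is in this form, per-block quantities can be pulled out with arrayfun. A small sketch, assuming a struct array shaped like blocks above; the values here are a shortened, hand-typed stand-in, and emax and nRows are names introduced only for this illustration:

```matlab
% Hypothetical struct array with the same fields as `blocks` above
blocks = struct('header', {'Emax=0.5/real'; 'Emax=1/real'}, ...
                'data',   {[4.9750557 14535; 4.9825821 14522]; ...
                           [24.965595 597544]});

% Parse the numeric Emax value out of each header line
emax = arrayfun(@(b) sscanf(b.header, 'Emax=%f'), blocks);

% Number of data rows in each block
nRows = arrayfun(@(b) size(b.data, 1), blocks);
```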

This should work. I don't think a single textscan() call on the whole file will work with a file like this because of the breaks between blocks.
Essentially, this code skips over the lines between blocks until it finds a line that matches the data format. The code is naive and assumes that the file has exactly the number of blocks and lines per block that you specify. If there were a fixed number of lines between blocks, it would be a lot easier: you could remove the first inner loop and replace it with one fgets(fid) call per separator line.
function block_data = readfile(in_file_name)
% Read fixed-length numeric blocks separated by header/separator lines
fid = fopen(in_file_name, 'r');
delimiter = ' ';
line_format = '%f %f';
n_cols = 2; % Number of numbers per line
block_length = 3874; % Number of lines per block
n_blocks = 2; % Total number of blocks in file
tline = fgets(fid);
block_n = 0;
block_data = zeros(n_blocks,block_length,n_cols);
while ischar(tline) && block_n < n_blocks
    block_n = block_n+1;
    % Skip header/separator lines until a line parses as data
    tline = fgets(fid);
    line_data = [];
    if ischar(tline)
        line_data = cell2mat(textscan(tline,line_format,'delimiter',delimiter,'MultipleDelimsAsOne',1));
    end
    while ischar(tline) && isempty(line_data)
        tline = fgets(fid);
        if ischar(tline)
            line_data = cell2mat(textscan(tline,line_format,'delimiter',delimiter,'MultipleDelimsAsOne',1));
        end
    end
    % Read the data lines of this block
    line_n = 1;
    while line_n <= block_length
        block_data(block_n,line_n,:) = cell2mat(textscan(tline,line_format,'delimiter',delimiter,'MultipleDelimsAsOne',1));
        tline = fgets(fid);
        line_n = line_n+1;
    end
end
fclose(fid);
end

Replace N lines in a txt file without using cell

Is there a faster way to replace the third line of a file with another one without using a cell array?
I've used this code, but it slows down my program, especially since my text file has more than 1000 lines.
% Read txt into cell A
fid7 = fopen([handles.filenameproba],'r');
i = 1;
tline = fgetl(fid7);
A{i} = tline;
while ischar(tline)
i = i+1;
tline = fgetl(fid7);
A{i} = tline;
end
fclose(fid7);
% Change cell A
newval =...
A{3} = sprintf('StartExperiment:%s',num2str(newval));
% Write cell A into txt
fid7 = fopen([handles.filenameproba], 'w');
for i = 1:numel(A)
if A{i+1} == -1
fprintf(fid7,'%s', A{i});
break
else
fprintf(fid7,'%s\n', A{i});
end
end
fclose(fid7);
Thanks !

If performance is your primary concern, try this importdata approach to see if it's any faster -
f=importdata(handles.filenameproba,'')
f(3)={sprintf('StartExperiment:%s',num2str(newval))}
%%// Save the modified text file
fid1 = fopen(handles.filenameproba,'w');
for k = 1:numel(f)
fprintf(fid1,'%s\n',f{k});
end
fclose(fid1);

Cells are not your problem. Your problem is reading and writing one line at a time. Additionally, you are re-sizing your cell array at every iteration.
For the following problem, I created a test file with 10000 lines in it.
fileId = fopen('test.txt', 'w');
for i = 1:10000
fprintf(fileId, 'This is the %dth line.\n', i);
end
fclose(fileId);
I'm calling your method ranellMethod.
>> timeit(@ranellMethod)
ans =
0.5160
A better way to do it is to limit the number of read/write operations you have to do. Assuming your file is small enough, you can read the entire contents into memory at once. Perform your operations, and write everything at once.
function entireFileMethod()
fileName = 'test.txt';
fileId = fopen(fileName, 'r');
try
text = fread(fileId, '*char')'; %'
catch Me
fclose(fileId);
Me.rethrow();
end
fclose(fileId);
newLineInds = find(text == char(10));
newLine = sprintf('This is the new line. %1.*f', randi(10), rand);
newText = [text(1:newLineInds(2)), newLine, text(newLineInds(3):end)];
fileId = fopen(fileName, 'w');
try
fprintf(fileId, '%s', newText);
catch Me
fclose(fileId);
Me.rethrow();
end
fclose(fileId);
end
This function has one read operation, and one write operation:
>> timeit(@entireFileMethod)
ans =
0.0043
See "Fastest Matlab file reading?" for more detailed information about file I/O in MATLAB.
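The same read-once, write-once idea can also be written with fileread and a line split, which avoids the index arithmetic on newline positions. This is only a sketch under assumptions: the temporary file, its contents and the value of newval are made up for illustration, and the file is assumed to use '\n' line endings.

```matlab
% Create a small hypothetical test file
fileName = [tempname '.txt'];
fid = fopen(fileName, 'w');
fprintf(fid, 'line 1\nline 2\nline 3\nline 4\n');
fclose(fid);

newval = 42;  % hypothetical replacement value

% One read: load the whole file and split it into lines
txt = fileread(fileName);
lines = regexp(txt, '\n', 'split');

% Replace the third line in memory
lines{3} = sprintf('StartExperiment:%s', num2str(newval));

% One write: join the lines again and write everything back
fid = fopen(fileName, 'w');
fprintf(fid, '%s', strjoin(lines, char(10)));
fclose(fid);
```

Splitting with regexp keeps empty lines in place, so the line numbering of the file is preserved.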

Matlab, Avoid empty lines

I have a function [Q,A] = load_test(filename) that loads a text file. I would like the function to skip empty lines, but I'm not sure how to do it.
I have tried to use
~isempty(x), ~ischar(x)
but I keep getting an error message. My code so far is:
fid = fopen(filename);
data = textscan(fid, '%s','delimiter','\n');
fclose(fid);
Q = cellfun(@(x) x(1:end-2), data{1}, 'uni',0);
A = cellfun(@(x) x(end) == 'T' || x(end) == 'F' && ~isempty(x),data{1});
What do I need to do?

Code
%%// Your code
fid = fopen(filename);
data = textscan(fid, '%s','delimiter','\n')
fclose(fid);
%%// Additional code
%%// 1. Remove empty lines
c1 = ~cellfun(@isempty,data{:})
t1 = data{:,:}(c1,:)
%%// 2. Select only the lines that have F or T as end characters
lastInLine = regexp(t1,'.$','match','lineanchors') %%// Get the end characters
%%// Get a binary array of rows that have F or T at the end
c2 = strcmp(vertcat(lastInLine{:}),'F') | strcmp(vertcat(lastInLine{:}),'T')
%%// Finally select those rows/lines
data = {t1(c2,:)}
Please note that I am not sure if you still need Q and A.
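If Q and A are still needed, they can be computed after the empty lines have been dropped, so that indexing with end never hits an empty string. A minimal sketch, assuming each non-empty line ends in a space followed by T or F and that A should be true for T; the sample lines are invented and stand in for data{1}:

```matlab
% Hypothetical lines standing in for data{1}
lines = {'The sky is blue T'; ''; 'Fire is cold F'; ''};

% Drop empty lines first, so x(end) is always valid
lines = lines(~cellfun(@isempty, lines));

% Question text without the trailing space and answer letter
Q = cellfun(@(x) x(1:end-2), lines, 'UniformOutput', false);

% Logical answers: true where the line ends in 'T' (assumed meaning)
A = cellfun(@(x) x(end) == 'T', lines);
```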

Get all the values from the "for loop" run

I have some *.dat files in a folder. I would like to extract a particular column (the 8th column) from all of the files and put them into an Excel file. I have run a for loop, but it only gives me the result of the final iteration (i.e. if there are 10 files, it only returns the 8th column of the 10th file).
data = cell(numel(files),1);
for i = 1:numel(files)
fid = fopen(fullfile(pathToFolder,files(i).name), 'rt');
H = textscan(fid, '%s', 4, 'Delimiter','\n');
C = textscan(fid, repmat('%f ',1,48), 'Delimiter',' ', ...
'MultipleDelimsAsOne',true, 'CollectOutput',true);
fclose(fid);
H = H{1}; C = C{1};
data{i} = C;
B = C(:,8);
end
Looking for your help on this.
It would be greatly appreciated.

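You are overwriting B on each iteration. Assign to B(:,i) instead, i.e. B(:,i) = C(:,8);, so that column 8 of each file is stored in its own column of B. This assumes every file has the same number of rows.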
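A minimal sketch of that pattern, with a generated matrix standing in for the textscan result of each file; nFiles, nRows and the placeholder values are assumptions made only for this illustration:

```matlab
nFiles = 4;   % hypothetical number of files
nRows  = 3;   % hypothetical number of data rows per file

B = zeros(nRows, nFiles);   % preallocate one column per file
for i = 1:nFiles
    % Placeholder for the 48-column matrix C read from file i
    C = repmat(1:48, nRows, 1) + 100*i;
    % Store column 8 of file i in column i of B
    B(:, i) = C(:, 8);
end
```

After the loop, B holds one column per file and can be written out in a single step.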

Matlab: Put each line of a text file in a separate array

I have a file like the following
10158 18227 2055 24478 25532
12936 14953 17522 17616 20898 24993 24996
26375 27950 32700 33099 33496 3663
...
I would like to put each line in an array in order to access elements of each line separately.
I used cell arrays but it seems to create a 1 by 1 array for each cell element:
fid=fopen(filename)
nlines = fskipl(fid, Inf)
frewind(fid);
cells = cell(nlines, 1);
for ii = 1:nlines
cells{ii} = fscanf(fid, '%s', 1);
end
fclose(fid);
When I access cells{ii}, I get all the values in the same element and I can't access the individual values.

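A shorter solution would be reading the file with textscan and converting each line with str2num:
fid = fopen(filename, 'r');
lines = textscan(fid, '%s', 'delimiter', '');
fclose(fid);
C = cellfun(@str2num, lines{1}, 'UniformOutput', false);
The resulting cell array C is what you're looking for: C{k} is a numeric row vector holding the values of line k.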

I think the problem is that fscanf(fid, '%s', 1) returns text rather than numbers, and %s stops at whitespace, so it does not read a whole line. Reading each line with fgetl and converting it to an array of numbers should work:
for ii = 1:nlines
cells{ii} = str2num(fgetl(fid));
end
