Using textscan to extract values from a text file - matlab

I have the following text file:
Leaf Tips:2867.5,1101.66666666667 2555,764.166666666667 2382.5,1221.66666666667 2115,759.166666666667 1845,1131.66666666667 1270,991.666666666667
Leaf Bases:1682.66666666667,800.333333333333 1886,850.333333333333 2226,920.333333333333 2362.66666666667,923.666666666667 2619.33333333333,967
Ear Tips:1029.33333333333,513.666666666667 1236,753.666666666667
Ear Bases:1419.33333333333,790.333333333333 1272.66666666667,677
These are coordinates to regions of interest for each category in an image. I need to extract these regions. I know I have to use textscan to accomplish this but I am unsure of the formatspec options needed to achieve this, since whichever setting I use seem to give me some jumbled form of cell output.
What formatSpec should I use so that I get the coordinates of each region outputed in a cell?
I've tried the following:
file = '0.txt';
fileID = fopen(file);
formatSpec = '%s %f %f %f %f %f %f %f %f';
C = textscan(fileID, formatSpec, 150, 'Delimiter', ':');

Here's an example of what you can do:
fid = fopen('0.txt'); % open file
T = textscan(fid, '%s','Delimiter',':'); % read all lines, separate row names from numbers
fclose(fid); % close file
T = reshape(T{1},2,[]).'; % rearrange outputs so it makes more sense
T = [T(:,1), cellfun(#(x)textscan(x,'%f','Delimiter',','), T(:,2))]; % parse numbers
Which will result in a cell array as follows:
T =
4×2 cell array
{'Leaf Tips' } {12×1 double}
{'Leaf Bases'} {10×1 double}
{'Ear Tips' } { 4×1 double}
{'Ear Bases' } { 4×1 double}

This is how I would do it:
fid = fopen('file.txt');
x = textscan(fid,'%s', 'Delimiter', char(10)); % read each line
fclose(fid);
x = x{1};
x = regexp(x, '\d*\.?\d*', 'match'); % extract numbers of each line
C = cellfun(#(t) reshape(str2double(t), 2, []).', x, 'UniformOutput', false); % rearrange
Result:
>> celldisp(C)
C{1} =
1.0e+03 *
2.867500000000000 1.101666666666670
2.555000000000000 0.764166666666667
2.382500000000000 1.221666666666670
2.115000000000000 0.759166666666667
1.845000000000000 1.131666666666670
1.270000000000000 0.991666666666667
C{2} =
1.0e+03 *
1.682666666666670 0.800333333333333
1.886000000000000 0.850333333333333
2.226000000000000 0.920333333333333
2.362666666666670 0.923666666666667
2.619333333333330 0.967000000000000
C{3} =
1.0e+03 *
1.029333333333330 0.513666666666667
1.236000000000000 0.753666666666667
C{4} =
1.0e+03 *
1.419333333333330 0.790333333333333
1.272666666666670 0.677000000000000

Related

importing text file data by blocks?

I am trying to import every rows that starts with '//', I have tried to extract it with the script below. can anybody check my script please?
formatSpec = '//NFE=%f //ElapsedTime=%f //SBX=%f //DE=%f //PCX=%f //SPX=%f //UNDX=%f //UM=%f //Improvements=%f //Restarts=%f //PopulationSize=%f //ArchiveSize=%f //MutationIndex=%f %*f';
N=1
k = 0;
while ~feof(fileID)
k = k+1;
C = textscan(fileID,formatSpec,N,'CommentStyle','#','Delimiter','\n');
end
It is not clear to me how you want the output to look, but here is one possibilitiy:
fid = fopen(filename, 'rt');
dataset = textscan(fid, '%s', 'delimiter', '\n', 'headerlines', 0);
fclose(fid);
result = regexp(dataset{1}, '//([A-Za-z].*)=([0-9\.].*)', 'tokens');
result = result(cellfun(#(x) ~isempty(x), result));
result contains both the type, e.g. NFE or SBX, and the number (albeit in character format).

Load large matrix from text file in matlab

I have a text file like :
[ 1, 2, 3;
2, 4, 5;
2, 2, 2;
8, 3, 3 ]
What is the quickest way to load this as a matrix in Octave/Matlab? I want to see this as a matrix with 4 rows and 3 cols.
Drag the text file with your mouse over your workspace in MATLAB (the area where all your current variables are shown) and drop it there. This opens the "import" window:
Give the file a name (mine is currently "NewTextDocument2") and select IMPORT on the top right. MATLAB will take care of semicolons and brackets. If you want to have a function that does this, select "generate function" instead of IMPORT.
I'm not sure if it is the simplest.
fid = fopen('filename.txt','r');
C = textscan(fid, '%f %f %f', ...
'Delimiter',' ','MultipleDelimsAsOne', 1);
fclose(fid);
DataMatrix = cat(2,C{:});
Quick and really dirty approach using the generally non-recommended function eval:
fid = fopen('data.txt');
s = fscanf(fid, '%s');
fclose(fid);
eval(['dataMatrix = ' s ';']);
in Octave you could do
fid = fopen ("yourfile", "r");
x = str2num (char(fread(fid))');
fclose (fid)
(I don't know if this works in Matlab)
If your file contains only numbers you can use the Matlab load() function. This function is often used to load .mat files. However it is capable of dealing with what matlab calls ASCII format files.
Say your file is purely textual, contains only numbers and is structured as follows:
filename.txt
1 2 3
2 4 5
2 2 2
8 3 3
The load function will create a variable called filename containing your array:
> load('filename.txt');
> filename =
[ 1, 2, 3;
2, 4, 5;
2, 2, 2;
8, 3, 3 ]
This works with your current format of textfile.
Use this function importfile.m:
function filename = importfile(filename, startRow, endRow)
delimiter = ',';
if nargin<=2
startRow = 1;
endRow = inf;
end
formatSpec = '%s%s%s%[^\n\r]';
fileID = fopen(filename,'r');
dataArray = textscan(fileID, formatSpec, endRow(1)-startRow(1)+1, 'Delimiter', delimiter, 'HeaderLines', startRow(1)-1, 'ReturnOnError', false);
for block=2:length(startRow)
frewind(fileID);
dataArrayBlock = textscan(fileID, formatSpec, endRow(block)-startRow(block)+1, 'Delimiter', delimiter, 'HeaderLines', startRow(block)-1, 'ReturnOnError', false);
for col=1:length(dataArray)
dataArray{col} = [dataArray{col};dataArrayBlock{col}];
end
end
fclose(fileID);
raw = repmat({''},length(dataArray{1}),length(dataArray)-1);
for col=1:length(dataArray)-1
raw(1:length(dataArray{col}),col) = dataArray{col};
end
numericData = NaN(size(dataArray{1},1),size(dataArray,2));
for col=[1,2,3]
rawData = dataArray{col};
for row=1:size(rawData, 1);
regexstr = '(?<prefix>.*?)(?<numbers>([-]*(\d+[\,]*)+[\.]{0,1}\d*[eEdD]{0,1}[-+]*\d*[i]{0,1})|([-]*(\d+[\,]*)*[\.]{1,1}\d+[eEdD]{0,1}[-+]*\d*[i]{0,1}))(?<suffix>.*)';
try
result = regexp(rawData{row}, regexstr, 'names');
numbers = result.numbers;
invalidThousandsSeparator = false;
if any(numbers==',');
thousandsRegExp = '^\d+?(\,\d{3})*\.{0,1}\d*$';
if isempty(regexp(thousandsRegExp, ',', 'once'));
numbers = NaN;
invalidThousandsSeparator = true;
end
end
if ~invalidThousandsSeparator;
numbers = textscan(strrep(numbers, ',', ''), '%f');
numericData(row, col) = numbers{1};
raw{row, col} = numbers{1};
end
catch me
end
end
end
filename = cell2mat(raw);
How to use it:
>> importfile('file.txt',1,4)
ans =
1 2 3
2 4 5
2 2 2
8 3 3

Get all the values from the "for loop" run

I have some *.dat files in a folder, I would like to extract a particular column (8th column) from all of the files and put into a excel file. I have run a for loop, but it only gives me the results of final run (i.e. if there are 10 number of files, it only returns me 8th column of the 10th files).
data = cell(numel(files),1);
for i = 1:numel(files)
fid = fopen(fullfile(pathToFolder,files(i).name), 'rt');
H = textscan(fid, '%s', 4, 'Delimiter','\n');
C = textscan(fid, repmat('%f ',1,48), 'Delimiter',' ', ...
'MultipleDelimsAsOne',true, 'CollectOutput',true);
fclose(fid);
H = H = H{1}; C = C{1};
data{i} = C;
B = C(:,8);
end
Looking for your help on this.
It would be greatly appreciated.
You are overwriting B each iteration. B(:,i) will put each column 8 of C in a column of B.

Matlab: Put each line of a text file in a separate array

I have a file like the following
10158 18227 2055 24478 25532
12936 14953 17522 17616 20898 24993 24996
26375 27950 32700 33099 33496 3663
...
I would like to put each line in an array in order to access elements of each line separately.
I used cell arrays but it seems to create a 1 by 1 array for each cell element:
fid=fopen(filename)
nlines = fskipl(fid, Inf)
frewind(fid);
cells = cell(nlines, 1);
for ii = 1:nlines
cells{ii} = fscanf(fid, '%s', 1);
end
fclose(fid);
when I access cells{ii} I get all values in the same element and I can't access the list values
A shorter solution would be reading the file with textscan:
fid = fopen(filename, 'r');
C = cellfun(#str2num, textscan(fid, '%s', 'delimiter', ''), 'Uniform', false);
fclose(fid);
The resulting cell array C is what you're looking for.
I think that fscanf(fid, '%s', 1); is telling matlab to read the line a single string. You will still have to convert it to an array of numbers:
for ii = 1:nlines
cells{ii} = str2num(fscanf(fid, '%s', 1));
end

manipulating columns in the pdb files in matlab

I have written the following code, and I am required to do the operations with 6,7,8th column only as shown but I want the result .pdb file containing the other columns from the input files also. that is I just want to replace the 6,7,8th column of the input file with the final result without affecting the other columns. I am not able to figure out this.
%
phi1 =188.7*pi/180;
PHI = 82.3*pi/180;
phi2 = 150.4*pi/180;
%%%%% to be change ary euler angles
%%%%%%% rotation_matrix defining
R(1,1) = cos(phi1)*cos(phi2)-sin(phi1)*sin(phi2)*cos(PHI);
R(1,2) = sin(phi1)*cos(phi2)+cos(phi1)*sin(phi2)*cos(PHI);
R(1,3) = sin(phi2)*sin(PHI);
R(2,1) = -cos(phi1)*sin(phi2)-sin(phi1)*cos(phi2)*cos(PHI);
R(2,2) = -sin(phi1)*sin(phi2)+cos(phi1)*cos(phi2)*cos(PHI);
R(2,3) = cos(phi2)*sin(PHI);
R(3,1) = sin(phi1)*sin(PHI);
R(3,2) = -cos(phi1)*sin(PHI);
R(3,3) = cos(PHI);
%%output
rotationMatrix=R;
fid2 = fopen('final_result.pdb','wt');
for k=1:100
fid = fopen('ipp31.pdb');
A = textscan(fid, '%s %s %f %s %s %f %f %f %s ') ;
%read the file
a = A{6};
b = A{7};
c = A{8};
p=[a b c];
p_t=p.';
M=rotationMatrix*p_t;
M_T=M.';
fid = fopen('ipp31.pdb', 'wt'); % Open for writing
fprintf(fid,' %f\t %f\t %f\n',M_T);
fclose(fid);
%
textfilename = [num2str(k) '.pdb'];
fid1 = fopen(textfilename,'wt');
fprintf(fid1,' %f\t %f\t %f\n',M_T);
fclose(fid1);
fprintf(fid2,' %f\t %f\t %f\n',M_T);
end
fclose(fid2);