how to read broken numbers on two lines in a text file in matlab? - matlab

how to read broken numbers on two lines in matlab?
I am generating some results in text files that are being broken into two lines. Example:
text x = 1.
2345 text
What would a code look like to read the value x = 1.2345?
Suppose the value of x = 1.2345 is in file named name.txt.
When it doesn't break the values I'm looking for:
text x = 1.2345 text
I use the following (working) code:
buffer = fileread('name.txt') ;
search = 'x = ' ;
local = strfind(buffer, search);
xvalue = sscanf(buffer(local(1,1)+numel(search):end), '%f', 1);

You can remove line breaks (and other "white space", if needed) before parsing the string:
>> str = sprintf('text x = 1.\n2345 text')
str =
'text x = 1.
2345 text'
>> str = regexprep(str, '\n', '')
str =
'text x = 1.2345 text'

Related

MATLAB - replace the string in a text file with index and delete those line without indexed

I have a text file with data structure like this
30,311.263671875,158.188034058,20.6887207031,17.4877929688,0.000297248129755,aeroplane
30,350.668334961,177.547393799,19.1939697266,18.3677368164,0.00026999923648,aeroplane
30,367.98135376,192.697219849,16.7747192383,23.0987548828,0.000186387239864,aeroplane
30,173.569274902,151.629364014,38.0069885254,37.5704650879,0.000172595537151,aeroplane
30,553.904602051,309.903320312,660.893981934,393.194030762,5.19620243722e-05,aeroplane
30,294.739196777,156.249740601,16.3522338867,19.8487548828,1.7795707663e-05,aeroplane
30,34.1946258545,63.4127349854,475.104492188,318.754821777,6.71026540999e-06,aeroplane
30,748.506652832,0.350944519043,59.9415283203,28.3256549835,3.52978979379e-06,aeroplane
30,498.747009277,14.3766479492,717.006652832,324.668731689,1.61551643174e-06,aeroplane
30,81.6389465332,498.784301758,430.23046875,210.294677734,4.16855394647e-07,aeroplane
30,251.932098389,216.641052246,19.8385009766,20.7131652832,3.52147743106,bicycle
30,237.536972046,226.656692505,24.0902862549,15.7586669922,1.8601918593,bicycle
30,529.673400879,322.511322021,25.1921386719,21.6920166016,0.751171214506,bicycle
30,255.900146484,196.583847046,17.1589355469,27.4430847168,0.268321367912,bicycle
30,177.663650513,114.458488464,18.7516174316,16.6759414673,0.233057001606,bicycle
30,436.679382324,273.383331299,17.4342041016,19.6081542969,0.128449092153,bicycle
I want to index those file with a label file.and the result will be something like this.
60,509.277435303,284.482452393,26.1684875488,31.7470092773,0.00807665128377,15
60,187.909835815,170.448471069,40.0388793945,58.8763122559,0.00763951029512,15
60,254.447280884,175.946624756,18.7212677002,21.9440612793,0.00442053096776,15
However there might be some class that is not in label class and I need to filter those line out so I can use load() to load in.(you can't have char inside that text file and execute load().
here is my implement:
function test(vName,meta)
f_dt = fopen([vName '.txt'],'r');
f_indexed = fopen([vName '_indexed.txt'], 'w');
lbls = loadlbl()
count = 1;
while(true),
if(f_dt == -1),
break;
end
dt = fgets(f_dt);
if(dt == -1),
break
else
dt_cls = strsplit(dt,','){7};
dt_cls = regexprep(dt_cls, '\s+', '');
cls_idx = find(strcmp(lbls,dt_cls));
if(~isempty(cls_idx))
dt = strrep(dt,dt_cls,int2str(cls_idx));
fprintf(f_indexed,dt);
end
end
end
fclose(f_indexed);
if(f_dt ~= -1),
fclose(f_dt);
end
end
However it work very very slow because the text file contains 100 thousand of lines. Is it anyway that I could do this task smarter and faster?
You may use textscan, and get the indices/ line numbers of the labels you want. After knowing the line numbers, you can extract what you want.
fid = fopen('data.txt') ;
S = textscan(fid,'%s','delimiter','\n') ;
S = S{1} ;
fclose(fid) ;
%% get bicycle lines
idx = strfind(S, 'bicycle');
idx = find(not(cellfun('isempty', idx)));
S_bicycle = S(idx)
%% write this to text file
fid2 = fopen('text.txt','wt') ;
fprintf(fid2,'%s\n',S_bicycle{:});
fclose(fid2) ;
From S_bicycle, you can extract your numbers.

Excel Range object throws error 0x800A03EC because of a string longer than 255 characters

Using an ActiveX server from MATLAB, I am trying to highlight many cells in an Excel sheet at once. These are not in specific columns or rows so I use Range('A1,B2,...') to access them. However the string accepted by the Range object has to be less than 255 characters or an error:
Error: Object returned error code: 0x800A03EC
is thrown. The following code reproduces this error with an empty Excel file.
hActX = actxserver('Excel.Application');
hWB = hActX.Workbooks.Open('C:\Book1.xlsx');
hSheet = hWB.Worksheets.Item('Sheet1');
col = repmat('A', 100, 1);
row = num2str((1:100)'); %'
cellInd = strcat(col, strtrim(cellstr(row)));
str1 = strjoin(cellInd(1:66), ','); %// 254 characters
str2 = strjoin(cellInd(1:67), ','); %// 258 characters
hSheet.Range(str1).Interior.Color = 255; %// Works
hSheet.Range(str2).Interior.Color = 255; %// Error 0x800A03EC
hWB.Save;
hWB.Close(false);
hActX.Quit;
How can I get around this? I found no other relevant method of calling Range, or of otherwise getting the cells I want to modify.
If you start with a String, you can test its length to determine if Range() can handle it. Here is an example of building a diagonal range:
Sub DiagonalRange()
Dim BigString As String, BigRange As Range
Dim i As Long, HowMany As Long, Ln As String
HowMany = 100
For i = 1 To HowMany
BigString = BigString & "," & Cells(i, i).Address(0, 0)
Next i
BigString = Mid(BigString, 2)
Ln = Len(BigString)
MsgBox Ln
If Ln < 250 Then
Set BigRange = Range(BigString)
Else
Set BigRange = Nothing
arr = Split(BigString, ",")
For Each a In arr
If BigRange Is Nothing Then
Set BigRange = Range(a)
Else
Set BigRange = Union(BigRange, Range(a))
End If
Next a
End If
BigRange.Select
End Sub
For i = 10, the code will the the direct method, but if the code were i=100, the array method would be used.
The solution, as Rory pointed out, is to use the Union method. To minimize the number of calls from MATLAB to the ActiveX server, this is what I did:
str = strjoin(cellInd, ',');
isep = find(str == ',');
isplit = diff(mod(isep, 250)) < 0;
isplit = [isep(isplit) (length(str) + 1)];
hRange = hSheet.Range(str(1:(isplit(1) - 1)));
for ii = 2:numel(isplit)
hRange = hActX.Union(hRange, ...
hSheet.Range(str((isplit(ii-1) + 1):(isplit(ii) - 1))));
end
I used 250 in the mod to account for the cell names being up to 6 characters long, which is sufficient for me.

Matlab; how to extract information from a header's file (text file)

I have many text files that have 35 lines of header followed by a large matrix with data of an image (that info can be ignored and do not need to read it at the moment). I want to be able to read the header lines and extract information contained on those lines. For instance the first few lines of the header are..
File Version Number: 1.0
Date: 06/05/2015
Time: 10:33:44 AM
===========================================================
Beam Voltage (-kV) = 13.000
Filament (W) = 4.052
Cond. (-kV) = 8.885
CenterX1 (V) = 10.7
CenterY1 (V) = -45.9
Objective (%) = 71.40
OctupoleX = -0.4653
OctupoleY = -0.1914
Angle (deg) = 0.00
.
I would like to be able to open this text file and read the vulue of the day and time the file was created, filament power, the condenser voltage, the angle, etc.. and save these in variables or send them to a text box on a GUI program.
I have tried several things but since the values I want to extract some times are after a '=' or after a ':' or simply after a '' then I do not know how to approach this. Perhaps reading each line and look for a match of a word?
Any help would be much appreciated.
Thanks,
Alex
This is not particularly difficult, and one of the ways to do it would be to parse line-by-line as you suggested. Something like this:
MAX_LINES_TO_READ = 35;
fid = fopen('input.txt');
lineCount = 0;
dateString = '';
beamVoltage = 0;
while ~eof(fid)
line = fgetl(fid);
lineCount = lineCount + 1;
%//check conditions for skipping loop body
if isempty(line)
continue
elseif lineCount > MAX_LINES_TO_READ
break
end
%//find headers you are interested in
if strfind(line, 'Date')
%//find the first location of the header separator
idx = find(line, ':', 1);
%//extract substring starting from 1 char after separator
%//note: the trim is to get rid of leading/trailing whitespace
dateString = strtrim(line(idx + 1 : end));
elseif strfind(line, 'Beam Voltage')
idx = find(line, '=', 1);
beamVoltage = str2double(line(idx + 1 : end));
end
end
fclose(fid);

Concatenating text from different text files into one

Suppose there are five text files. Contents of files are textfile1 = i saw an alligator, textfile2 = alligator was sitting near a tree, textfile3 = alligator was sleeping, textfile4 = parrot was flying, textfile5 = parrot was flying.
I used the code below to find paths of textfiles containing word alligator:
sdirectory = 'C:\Users\anurag\Desktop\Animals\Annotations\';
textfiles = dir([sdirectory '*.eng']);
num_of_files = length(textfiles);
C = cell(num_of_files, 1);
for w = 1:length(textfiles)
file = [sdirectory textfiles(w).name];
%// load string from a file
STR = importdata(file);
BL = cellfun(#lower, STR, 'uni',0);
%// extract string between tags
%// assuming you want to remove the angle brackets
B = regexprep(BL, '<.*?>','');
B(strcmp(B, '')) = [];
%// split each string by delimiters and add to C
tmp = regexp(B, '/| ', 'split');
C{w} = [tmp{:}];
end
where = [];
for j = 1:length(C)
file1 = [sdirectory textfiles(j).name];
if find(ismember(C{j},'alligator'))
where = [where num2str(j) '.eng, '];
disp(file1)
end
end
Finally the variable file1 will show the path of textfile containing word alligator one by one. Is there any way to concatenate the strings of textfiles containing the required word into a cell array.

Prefix match in MATLAB

Hey guys, I have a very simple problem in MATLAB:
I have some strings which are like this:
Pic001
Pic002
Pic003
004
Not every string starts with the prefix "Pic". So how can I cut off the part "pic" that only the numbers at the end shall remain to have an equal format for all my strings?
Greets, poeschlorn
If 'Pic' only ever occurs as a prefix in your strings and nowhere else within the strings then you could use STRREP to remove it like this:
>> x = {'Pic001'; 'Pic002'; 'Pic003'; '004'}
x =
'Pic001'
'Pic002'
'Pic003'
'004'
>> x = strrep(x, 'Pic', '')
x =
'001'
'002'
'003'
'004'
If 'Pic' can occur elsewhere in your strings and you only want to remove it when it occurs as a prefix then use STRNCMP to compare the first three characters of your strings:
>> x = {'Pic001'; 'Pic002'; 'Pic003'; '004'}
x =
'Pic001'
'Pic002'
'Pic003'
'004'
>> for ii = find(strncmp(x, 'Pic', 3))'
x{ii}(1:3) = [];
end
>> x
x =
'001'
'002'
'003'
'004'
strings = {'Pic001'; 'Pic002'; 'Pic003'; '004'};
numbers = regexp(strings, '(PIC)?(\d*)','match');
for cc = 1:length(numbers);
fprintf('%s\n', char(numbers{cc}));
end;