MATLAB - replace the string in a text file with index and delete those line without indexed - matlab

I have a text file with data structure like this
30,311.263671875,158.188034058,20.6887207031,17.4877929688,0.000297248129755,aeroplane
30,350.668334961,177.547393799,19.1939697266,18.3677368164,0.00026999923648,aeroplane
30,367.98135376,192.697219849,16.7747192383,23.0987548828,0.000186387239864,aeroplane
30,173.569274902,151.629364014,38.0069885254,37.5704650879,0.000172595537151,aeroplane
30,553.904602051,309.903320312,660.893981934,393.194030762,5.19620243722e-05,aeroplane
30,294.739196777,156.249740601,16.3522338867,19.8487548828,1.7795707663e-05,aeroplane
30,34.1946258545,63.4127349854,475.104492188,318.754821777,6.71026540999e-06,aeroplane
30,748.506652832,0.350944519043,59.9415283203,28.3256549835,3.52978979379e-06,aeroplane
30,498.747009277,14.3766479492,717.006652832,324.668731689,1.61551643174e-06,aeroplane
30,81.6389465332,498.784301758,430.23046875,210.294677734,4.16855394647e-07,aeroplane
30,251.932098389,216.641052246,19.8385009766,20.7131652832,3.52147743106,bicycle
30,237.536972046,226.656692505,24.0902862549,15.7586669922,1.8601918593,bicycle
30,529.673400879,322.511322021,25.1921386719,21.6920166016,0.751171214506,bicycle
30,255.900146484,196.583847046,17.1589355469,27.4430847168,0.268321367912,bicycle
30,177.663650513,114.458488464,18.7516174316,16.6759414673,0.233057001606,bicycle
30,436.679382324,273.383331299,17.4342041016,19.6081542969,0.128449092153,bicycle
I want to index those file with a label file.and the result will be something like this.
60,509.277435303,284.482452393,26.1684875488,31.7470092773,0.00807665128377,15
60,187.909835815,170.448471069,40.0388793945,58.8763122559,0.00763951029512,15
60,254.447280884,175.946624756,18.7212677002,21.9440612793,0.00442053096776,15
However there might be some class that is not in label class and I need to filter those line out so I can use load() to load in.(you can't have char inside that text file and execute load().
here is my implement:
function test(vName,meta)
f_dt = fopen([vName '.txt'],'r');
f_indexed = fopen([vName '_indexed.txt'], 'w');
lbls = loadlbl()
count = 1;
while(true),
if(f_dt == -1),
break;
end
dt = fgets(f_dt);
if(dt == -1),
break
else
dt_cls = strsplit(dt,','){7};
dt_cls = regexprep(dt_cls, '\s+', '');
cls_idx = find(strcmp(lbls,dt_cls));
if(~isempty(cls_idx))
dt = strrep(dt,dt_cls,int2str(cls_idx));
fprintf(f_indexed,dt);
end
end
end
fclose(f_indexed);
if(f_dt ~= -1),
fclose(f_dt);
end
end
However it work very very slow because the text file contains 100 thousand of lines. Is it anyway that I could do this task smarter and faster?

You may use textscan, and get the indices/ line numbers of the labels you want. After knowing the line numbers, you can extract what you want.
fid = fopen('data.txt') ;
S = textscan(fid,'%s','delimiter','\n') ;
S = S{1} ;
fclose(fid) ;
%% get bicycle lines
idx = strfind(S, 'bicycle');
idx = find(not(cellfun('isempty', idx)));
S_bicycle = S(idx)
%% write this to text file
fid2 = fopen('text.txt','wt') ;
fprintf(fid2,'%s\n',S_bicycle{:});
fclose(fid2) ;
From S_bicycle, you can extract your numbers.

Related

Matlab; how to extract information from a header's file (text file)

I have many text files that have 35 lines of header followed by a large matrix with data of an image (that info can be ignored and do not need to read it at the moment). I want to be able to read the header lines and extract information contained on those lines. For instance the first few lines of the header are..
File Version Number: 1.0
Date: 06/05/2015
Time: 10:33:44 AM
===========================================================
Beam Voltage (-kV) = 13.000
Filament (W) = 4.052
Cond. (-kV) = 8.885
CenterX1 (V) = 10.7
CenterY1 (V) = -45.9
Objective (%) = 71.40
OctupoleX = -0.4653
OctupoleY = -0.1914
Angle (deg) = 0.00
.
I would like to be able to open this text file and read the vulue of the day and time the file was created, filament power, the condenser voltage, the angle, etc.. and save these in variables or send them to a text box on a GUI program.
I have tried several things but since the values I want to extract some times are after a '=' or after a ':' or simply after a '' then I do not know how to approach this. Perhaps reading each line and look for a match of a word?
Any help would be much appreciated.
Thanks,
Alex
This is not particularly difficult, and one of the ways to do it would be to parse line-by-line as you suggested. Something like this:
MAX_LINES_TO_READ = 35;
fid = fopen('input.txt');
lineCount = 0;
dateString = '';
beamVoltage = 0;
while ~eof(fid)
line = fgetl(fid);
lineCount = lineCount + 1;
%//check conditions for skipping loop body
if isempty(line)
continue
elseif lineCount > MAX_LINES_TO_READ
break
end
%//find headers you are interested in
if strfind(line, 'Date')
%//find the first location of the header separator
idx = find(line, ':', 1);
%//extract substring starting from 1 char after separator
%//note: the trim is to get rid of leading/trailing whitespace
dateString = strtrim(line(idx + 1 : end));
elseif strfind(line, 'Beam Voltage')
idx = find(line, '=', 1);
beamVoltage = str2double(line(idx + 1 : end));
end
end
fclose(fid);

Why isnt this matlab code working?

I'm trying to write a script to remove comment lines from another script file. Here's what I tried:
fid = fopen('correct_answer.m');
aline = {};
counter = 1;
while ~feof(fid)
aline{counter} = fgetl(fid);
if aline{counter}(1) == '%'
aline{counter} = '';
end
counter = counter + 1;
end
This is the error I get:
Attempted to access aline.%cell(1); index out of bounds because numel(aline.%cell)=0.
Error in hw2 (line 8)
if aline{counter}(1) == '%'
If I run it without the while loop it works fine. What's up with this??
Also if you just happen to know a more simple/efficient approach to removing comment lines that would work too ;)
The problem is that you sometimes read empty lines, and you cannot access the first element of empty lines. Additionally you may have comments at the end of a valid line of code, this code should do the trick
fid = fopen('correct_answer.m');
aline = {};
counter = 1;
while ~feof(fid)
aline{counter} = fgetl(fid);
if ~isempty(aline{counter})
k = strfind(aline{counter}, '%');
aline{counter}(k:end) = '';
end
counter = counter + 1;
end

How to import in Matlab a cellarray from a .txt file with columns as "text"?

I have a .txt file which I want to import with code in Matlab as a cell. This .txt file has lines with different lengths and a combination of numbers and characters in them. Here is a sample of the file:
*** After labeling ***
Elements: E11, E21, E31, E51, E61, E81,
Connections: E11E61, E21E81, E31E61, E31E81, E51E81,
*** After labeling ***
Elements: E11, E21, E31, E51, E61, E81,
Connections: E11E61, E21E81, E31E51, E31E81, E61E81,
*** After labeling ***
Elements: E11, E21, E31, E51, E61, E62, E81,
Connections: E11E61, E21E81, E31E51, E31E62, E61E81, E62E81,
The results should look like this:
When I import the file using the Import Tool some columns are recognized as text and some as numbers. I have to manually change the type of the column from number to text for each of them:
.
More, I have to select cellarray instead of Column Vectors each time. The maximum length of the lines is known, 15.
I tried results = importfile('results') but it does not work. Does anybody have any suggestions for me?
Here is some simple parsing routine that returns directly elements and connections as cell vectors.
NB: Parsing assumes text file is well formatted and is always elements, followed by connections.
function [elements, connections] = ReadElementsAndConnections(filename)
%[
% For debug
if (nargin < 1), filename = 'ElementsAndConnections.txt'; end
% Read full file content
text = fileread(filename);
% Split on newline
lines = strsplit(strtrim(text), '\n');
% Parse
elements = cell(0,1);
connections = cell(0,1);
isElementLine = true;
lcount = length(lines);
startElements = length('Elements:') + 1;
startConnections = length('Connections:') + 1;
for li = 1:lcount,
% Skip empty lines or lines starting with '*'
line = strtrim(lines{li});
if (isempty(line) || (line(1) == '*')), continue; end
% NOT VERY SAFE: Assuming we always have 'elements', followed by 'connections'
if (isElementLine)
elementsLine = lines{li}(startElements:end);
elementsValues = strtrim(strsplit(elementsLine, ','));
elements{end+1} = elementsValues(1:(end-1)); % Last value is empty.
isElementLine = false;
else
connectionsLine = lines{li}(startConnections:end);
connectionsValues = strtrim(strsplit(connectionsLine, ','));
connections{end+1} = connectionsValues(1:(end-1)); % Last value is empty.
isElementLine = true;
end
end
%]
end
You can then access elements and connections like this:
>> elements{1}
ans =
'E11' 'E21' 'E31' 'E51' 'E61' 'E81'
>> connections{3}
ans =
'E11E61' 'E21E81' 'E31E51' 'E31E62' 'E61E81' 'E62E81'

Extraction of data from DWT subband

I am attempting to extract data from a DWT subband. I am able to embed data correctly (I have followed it in the debugger),cal PSNR etc. PSNR rate seem very high 76.2?? however,I am having lot of trouble extracting data back!It is sometimes extracting the number 128?? Can anyone help or have any idea why this is? I would be very thankful.I have been working on this all day & having no luck!I am very curious to know??
Data Embedding:
coverImage = imread('lena.bmp');
message = importdata('minutiaTest.txt');
%message = 'Bifurcations:';
[LL,LH,HL,HH] = dwt2(coverImage,'haar');
if size(message) > size(coverImage,1) * size(coverImage,2)
error ('message too big to embed');
end
bit_count = 0;
steg_coeffs = [4, 4.75, 5.5, 6.25, 7];
for jj=1:size(message,2)+1
if jj > size(message,2)
charbits = [0,0,0,0,0,0,0,0];
else
charbits = dec2bin(message(jj),8)';
charbits = charbits(:)'-'0';
end
for ii=1:8
bit_count = bit_count + 1;
if charbits(ii) == 1
if HH(bit_count) <= 0
HH(bit_count) = steg_coeffs(randi(numel(steg_coeffs)));
end
else
if HH(bit_count) >= 0
HH(bit_count) = -1 * steg_coeffs(randi(numel(steg_coeffs)));
end
end
end
end
stego_image = idwt2(LL,LH,HL,HH,'haar');
imwrite(uint8(stego_image),'newStego.bmp');
Data Extraction:
new_Stego = imread('newStego.bmp');
[LL,LH,HL,HH] = dwt2(new_Stego,'haar');
message = '';
msgbits = '';
for ii = 1:size(HH,1)*size(HH,2)
if HH(ii) > 0
msgbits = strcat (msgbits, '1');
elseif HH(ii) < 0
msgbits = strcat (msgbits, '0');
else
return;
end
if mod(ii,8) == 0
msgChar = bin2dec(msgbits);
if msgChar == 0
break;
end
msgChar = char (msgChar);
message = [message msgChar];
msgbits = '';
end
end
The problem arises from reading your data with importdata.
This command will load the data to an array. Since you have 39 lines and 2 columns (skipping any empty lines), its size will be 39 2. However, the program assumes that your message will be a string. For example, 'i am a string' has a size 1 13. This expectation of the program compared to the data you actually give it creates all sorts of problems.
What you want is to read your data as a single string, where the number 230 is not one element, but 3 individual characters. Tabs and newlines will also be read in as well.
To read your file:
message = fileread('minutiaTest.txt');
After you extract your message, to save it to a file:
fid = fopen('myFilename.txt','w');
fprintf(fid,message);
fclose(fid);

MATLAB: Get the last modification time of a file

I am looking for MATLAB code that does some routine (updates a file.m), if file.csv is edited more recently than file.m.
Something that should look like:
% Write time extraction
tempC = GetFileTime('file.csv', [], 'Write');
tempdateC = tempC.date
tempM = GetFileTime('file.m', [], 'Write');
tempdateM = tempM.date
% Write time comparison
if numel(dir('file.m')) == 0 || tempdateC >= tempdateM
matDef = regexprep(fileread('file.csv'), '(\r\n|\r|\n)', ';\n');
f = fopen('file.m', 'w');
fwrite(f, ['Variable = [' matDef(1:end) '];']);
fclose(f);
end
The lines for timestamp extraction seem to be incorrect MATLAB code. The rest works (Evaluate variables in external file strings).
You can extract the modification time of a file using MATLAB's dir command. Something like:
function modTime = GetFileTime(fileName)
listing = dir(fileName);
% check we got a single entry corresponding to the file
assert(numel(listing) == 1, 'No such file: %s', fileName);
modTime = listing.datenum;
end
Note that the output is in MATLAB's datenum serial date format.