I am parsing a large text file full of data and then saving it to disk as a *.mat file so that I can easily load in only parts of it (see here for more information on reading in the files, and here for the data). To do so, I read in one line at a time, parse the line, and then append it to the file. The problem is that the file itself is >3 orders of magnitude larger than the data contained therein!
Here is a stripped down version of my code:
database = which('01_hit12.par');
[directory,filename,~] = fileparts(database);
matObj = matfile(fullfile(directory,[filename '.mat']),'Writable',true);
fidr = fopen(database);
hitranTemp = fgetl(fidr);
k = 1;
while ischar(hitranTemp)
if abs(hitranTemp(1)) == 32;
hitranTemp(1) = '0';
end
hitran = textscan(hitranTemp,'%2u%1u%12f%10f%10f%5f%5f%10f%4f%8f%15c%15c%15c%15c%6u%2u%2u%2u%2u%2u%2u%1c%7f%7f','delimiter','','whitespace','');
matObj.moleculeNumber(1,k) = uint8(hitran{1});
matObj.isotopeologueNumber(1,k) = uint8(hitran{2});
matObj.vacuumWavenumber(1,k) = hitran{3};
matObj.lineIntensity(1,k) = hitran{4};
matObj.airWidth(1,k) = single(hitran{6});
matObj.selfWidth(1,k) = single(hitran{7});
matObj.lowStateE(1,k) = single(hitran{8});
matObj.tempDependWidth(1,k) = single(hitran{9});
matObj.pressureShift(1,k) = single(hitran{10});
if rem(k,1e4) == 0;
display(sprintf('line %u (%2.2f)',k,100*k/K));
end
hitranTemp = fgetl(fidr);
k = k + 1;
end
fclose(fidr);
I stopped the code after 13,813 of the 224,515 lines had been parsed because it had been taking a very long time and the file size was getting huge, but the last printout indicated that I had only just cleared 10k lines. I cleared the memory, and then ran:
S = whos('-file','01_hit12.mat');
fileBytes = sum([S.bytes]);
T = dir(which('01_hit12.mat'));
diskBytes = T.bytes;
disp([fileBytes diskBytes diskBytes/fileBytes])
and get the output:
524894 896189009 1707.37141022759
What is taking up the extra 895,664,115 bytes? I know the help page says there should be a little extra overhead, but I feel that nearly a Gb of descriptive header is a bit excessive!
New information:
I tried pre-allocating the file, thinking that perhaps MATLAB was doing the same thing it does when a matrix is embiggened in a loop and reallocating a chunk of disk space for the entire matrix on each write, and that isn't it. Filling the file with zeros of the appropriate data types results in a file that my short check script returns:
8531570 71467 0.00837677004349727
This makes more sense to me. Matlab is saving the file sparsely, so the disk file size is much smaller than the size of the full matrix in memory. Once it starts replacing values with real data, however, I get the same behavior as before and the file size starts skyrocketing beyond all reasonable bounds.
New new information:
Tried this on a subset of the data, 100 lines long. To stream to disk, the data has to be in v7.3 format, so I ran the subset through my script, loaded it into memory, and then resaved as v7.0 format. Here are the results:
v7.3: 3800 8752 2.30
v7.0: 3800 2561 0.67
No wonder the v7.3 format isn't the default. Does anyone know a way around this? Is this a bug or a feature?
This seems like a bug to me. A workaround is to write in chunks to pre-allocated arrays.
Start off by pre-allocating:
fid = fopen('01_hit12.par', 'r');
data = fread(fid, inf, 'uint8');
nlines = nnz(data == 10) + 1;
fclose(fid);
matObj.moleculeNumber = zeros(1,nlines,'uint8');
matObj.isotopeologueNumber = zeros(1,nlines,'uint8');
matObj.vacuumWavenumber = zeros(1,nlines,'double');
matObj.lineIntensity = zeros(1,nlines,'double');
matObj.airWidth = zeros(1,nlines,'single');
matObj.selfWidth = zeros(1,nlines,'single');
matObj.lowStateE = zeros(1,nlines,'single');
matObj.tempDependWidth = zeros(1,nlines,'single');
matObj.pressureShift = zeros(1,nlines,'single');
Then to write in chunks of 10000, I modified your code as follows:
... % your code plus pre-alloc first
bs = 10000;
while ischar(hitranTemp)
if abs(hitranTemp(1)) == 32;
hitranTemp(1) = '0';
end
for ii = 1:bs,
hitran{ii} = textscan(hitranTemp,'%2u%1u%12f%10f%10f%5f%5f%10f%4f%8f%15c%15c%15c%15c%6u%2u%2u%2u%2u%2u%2 u%1c%7f%7f','delimiter','','whitespace','');
hitranTemp = fgetl(fidr);
if hitranTemp==-1, bs=ii; break; end
end
% this part really ugly, sorry! trying to keep it compact...
matObj.moleculeNumber(1,k:k+bs-1) = uint8(builtin('_paren',cellfun(#(c)c{1},hitran),1:bs));
matObj.isotopeologueNumber(1,k:k+bs-1) = uint8(builtin('_paren',cellfun(#(c)c{2},hitran),1:bs));
matObj.vacuumWavenumber(1,k:k+bs-1) = builtin('_paren',cellfun(#(c)c{3},hitran),1:bs);
matObj.lineIntensity(1,k:k+bs-1) = builtin('_paren',cellfun(#(c)c{4},hitran),1:bs);
matObj.airWidth(1,k:k+bs-1) = single(builtin('_paren',cellfun(#(c)c{5},hitran),1:bs));
matObj.selfWidth(1,k:k+bs-1) = single(builtin('_paren',cellfun(#(c)c{6},hitran),1:bs));
matObj.lowStateE(1,k:k+bs-1) = single(builtin('_paren',cellfun(#(c)c{7},hitran),1:bs));
matObj.tempDependWidth(1,k:k+bs-1) = single(builtin('_paren',cellfun(#(c)c{8},hitran),1:bs));
matObj.pressureShift(1,k:k+bs-1) = single(builtin('_paren',cellfun(#(c)c{9},hitran),1:bs));
k = k + bs;
fprintf('.');
end
fclose(fidr);
The final size on disk is 21,393,408 bytes. The usage breaks down as,
>> S = whos('-file','01_hit12.mat');
>> fileBytes = sum([S.bytes]);
>> T = dir(which('01_hit12.mat'));
>> diskBytes = T.bytes; ratio = diskBytes/fileBytes;
>> fprintf('%10d whos\n%10d disk\n%10.6f\n',fileBytes,diskBytes,ratio)
8531608 whos
21389582 disk
2.507099
Still fairly inefficient, but not out of control.
Related
I'm trying to take in a complex csv file in MATLAB and edit/rearrange it. I used tableread to pull out the numerical data that I then manipulated. Afterwards I aimed to take the first 52 rows of text metadata, add three more rows then combine with the numerical data and output as a csv file.
However, I can't seem to find a way to work with the text metadata rows that doesn't ultimately add new lines between them when output.
e.g. this is how my csv outputs:
#HEADER
,,,,,,
Instrument, Data Processed Analysis
,,,,,
Version,1.3.1,1.25
,,,,
Created,Monday, August 16, 2021 3:13:38 PM
,,,
Filename,D5-116.NGXDP
,,,,,
When it should look like:
#HEADER
NGX, Data Processed Analysis
Version,1.3.1,1.25
Created,Monday, August 16, 2021 3:13:38 PM
Filename,D5-116.NGXDP
Here is my code:
`
close
clear
fileName = 'D5-116.NGXDP';
fileOut = fileName(1:end-6);
fileOut = append(fileOut, '.csv');
eEnergy = 1.602176634e-19; %electron energy in coulombs
resisTor = 1e11; %Resistor value for EM
%% Reading and Edit Mass Data - this is too hardcoded right now
T = readtable(fileName, 'FileType', 'text');
T = T{:,:}; %convert from table to matrix
dataOrg(:,1:3) = T(1087:1236, 1:3); %cycle:time:Ar
dataOrg(:,4) = T(935:1084, 3); %39
dataOrg(:,5) = T(783:932, 3); %38
dataOrg(:,6) = T(631:780, 3); %37
dataOrg(:,7) = T(479:628, 3); %36
dataOrg(:,7) = dataOrg(:,7) * eEnergy * resisTor; %36Ar - convert CPS to V
dataOrg(:,3:7) = dataOrg(:,3:7) * 1000; %Convert all to mV
%% Reading the metadata from the file
T2 = fopen(fileName);
metaData = fread(T2, '*char')'; %read content
fclose(T2);
results = strsplit(metaData, '\n'); %split the char data by entry spaces
metaNeat = results(:,1:52)'; %Takes the metadata prior to #collectors
metaNeat(end+1, 1) = {'Blocks, 1'}; %added data
metaNeat(end+1, 1) = {'Cycles, 150'}; %added data
metaNeat(end+1, 1) = {'Cycle, Time, 40, 39, 38, 37, 36'};
for n = 1:length(metaNeat) %scan each row, if there is a comma split them
if contains(metaNeat(n,1), ',') == 1
tempMeta = split(metaNeat(n,1), ',')';
metaNeat(n,1:size(tempMeta, 2)) = tempMeta;
clear tempMeta
end
end
outPut = metaNeat; %metadata section
outPut(end+1:(end+150), 1:7) = num2cell(dataOrg); %mass value section
writecell(outPut,fileOut,'Delimiter',',', 'FileType', 'text')
I'm sure there is a way to do this in three lines of course. My guess is that the strsplit function I'm using to separate the char cell into multiple rows is preserving the \n value. The added lines (e.g. Blocks, 1) do not show gaps in the csv file. Any tips or advice would be appreciated.
Thanks for your time,
I read the bytes in this way:
filename = 'random_path';
f=fopen(filename,'rb');%f=fopen(filename,'rb');
if f<3, error('Impossivel abrir %s',filename); end
samples= fread(f,202*4096,'int16')';
This file was wrote in LowEndian. Now I want to pass it to BigEndian. I try this, without success.
read= fopen('big_endian','wb');
fwrite(read,int16(swapbytes(samples)),'int16');
fclose(read)
You should try:
%read the data
fid = fopen('random_path','r','ieee-le'); %ieee-le = Low endian
data = fread(fid,inf,'int16');
%write the data
fid = fopen('random_path_le','w','ieee-be'); %ieee-be = Big endian
fwrite(fid,data,'int16');
I have many text files that have 35 lines of header followed by a large matrix with data of an image (that info can be ignored and do not need to read it at the moment). I want to be able to read the header lines and extract information contained on those lines. For instance the first few lines of the header are..
File Version Number: 1.0
Date: 06/05/2015
Time: 10:33:44 AM
===========================================================
Beam Voltage (-kV) = 13.000
Filament (W) = 4.052
Cond. (-kV) = 8.885
CenterX1 (V) = 10.7
CenterY1 (V) = -45.9
Objective (%) = 71.40
OctupoleX = -0.4653
OctupoleY = -0.1914
Angle (deg) = 0.00
.
I would like to be able to open this text file and read the vulue of the day and time the file was created, filament power, the condenser voltage, the angle, etc.. and save these in variables or send them to a text box on a GUI program.
I have tried several things but since the values I want to extract some times are after a '=' or after a ':' or simply after a '' then I do not know how to approach this. Perhaps reading each line and look for a match of a word?
Any help would be much appreciated.
Thanks,
Alex
This is not particularly difficult, and one of the ways to do it would be to parse line-by-line as you suggested. Something like this:
MAX_LINES_TO_READ = 35;
fid = fopen('input.txt');
lineCount = 0;
dateString = '';
beamVoltage = 0;
while ~eof(fid)
line = fgetl(fid);
lineCount = lineCount + 1;
%//check conditions for skipping loop body
if isempty(line)
continue
elseif lineCount > MAX_LINES_TO_READ
break
end
%//find headers you are interested in
if strfind(line, 'Date')
%//find the first location of the header separator
idx = find(line, ':', 1);
%//extract substring starting from 1 char after separator
%//note: the trim is to get rid of leading/trailing whitespace
dateString = strtrim(line(idx + 1 : end));
elseif strfind(line, 'Beam Voltage')
idx = find(line, '=', 1);
beamVoltage = str2double(line(idx + 1 : end));
end
end
fclose(fid);
I am attempting to extract data from a DWT subband. I am able to embed data correctly (I have followed it in the debugger),cal PSNR etc. PSNR rate seem very high 76.2?? however,I am having lot of trouble extracting data back!It is sometimes extracting the number 128?? Can anyone help or have any idea why this is? I would be very thankful.I have been working on this all day & having no luck!I am very curious to know??
Data Embedding:
coverImage = imread('lena.bmp');
message = importdata('minutiaTest.txt');
%message = 'Bifurcations:';
[LL,LH,HL,HH] = dwt2(coverImage,'haar');
if size(message) > size(coverImage,1) * size(coverImage,2)
error ('message too big to embed');
end
bit_count = 0;
steg_coeffs = [4, 4.75, 5.5, 6.25, 7];
for jj=1:size(message,2)+1
if jj > size(message,2)
charbits = [0,0,0,0,0,0,0,0];
else
charbits = dec2bin(message(jj),8)';
charbits = charbits(:)'-'0';
end
for ii=1:8
bit_count = bit_count + 1;
if charbits(ii) == 1
if HH(bit_count) <= 0
HH(bit_count) = steg_coeffs(randi(numel(steg_coeffs)));
end
else
if HH(bit_count) >= 0
HH(bit_count) = -1 * steg_coeffs(randi(numel(steg_coeffs)));
end
end
end
end
stego_image = idwt2(LL,LH,HL,HH,'haar');
imwrite(uint8(stego_image),'newStego.bmp');
Data Extraction:
new_Stego = imread('newStego.bmp');
[LL,LH,HL,HH] = dwt2(new_Stego,'haar');
message = '';
msgbits = '';
for ii = 1:size(HH,1)*size(HH,2)
if HH(ii) > 0
msgbits = strcat (msgbits, '1');
elseif HH(ii) < 0
msgbits = strcat (msgbits, '0');
else
return;
end
if mod(ii,8) == 0
msgChar = bin2dec(msgbits);
if msgChar == 0
break;
end
msgChar = char (msgChar);
message = [message msgChar];
msgbits = '';
end
end
The problem arises from reading your data with importdata.
This command will load the data to an array. Since you have 39 lines and 2 columns (skipping any empty lines), its size will be 39 2. However, the program assumes that your message will be a string. For example, 'i am a string' has a size 1 13. This expectation of the program compared to the data you actually give it creates all sorts of problems.
What you want is to read your data as a single string, where the number 230 is not one element, but 3 individual characters. Tabs and newlines will also be read in as well.
To read your file:
message = fileread('minutiaTest.txt');
After you extract your message, to save it to a file:
fid = fopen('myFilename.txt','w');
fprintf(fid,message);
fclose(fid);
Here is my code. The intent is I have a Wireshark capture saved to a particularly formatted text file. The MATLAB code is supposed to go through the Packets, dissect them for different protocols, and then make tables based on those protocols. I currently have this programmed for ETHERNET/IP/UDP/MODBUS. In this case, it creates a column in MBTable each time it encounters a new register value, and each time it comes across a change to that register value, it updates the value in that line of the table. The first column of MBTable is time, the registers start with the second column.
MBTable is preallocated to over 100,000 Rows (nol is very large), 10 columns before this code is executed. The actual data from a file I'm pulling into the table gets to about 10,000 rows and 4 columns and the code execution is so slow I have to stop it. The tic/toc value is calculated every 1000 rows and continues to increase exponentially with every iteration. It is a large loop, but I can't see where anything is growing in such a way that it would cause it to run slower with each iteration.
All variables get initialized up top (left out to lessen amount of code.
The variables eth, eth.ip, eth.ip.udp, and eth.ip.udp.modbus are all of type struct as is eth.header and eth.ip.header. WSID is a file ID from a .txt file opened earlier.
MBTable = zeros(nol,10);
tval = tic;
while not(feof(WSID))
packline = packline + 1;
fl = fl + 1;
%Get the next line from the file
MBLine = fgetl(WSID);
%Make sure line is not blank or short
if length(MBLine) >= 3
%Split the line into 1. Line no, 2. Data, 3. ASCII
%MBAll = strsplit(MBLine,' ');
%First line of new packet, if headers included
if strcmp(MBLine(1:3),'No.')
newpack = true;
newtime = false;
newdata = false;
stoppack = false;
packline = 1;
end
%If packet has headers, 2nd line contains timestamp
if newpack
Ordered = false;
if packline == 2;
newtime = true;
%MBstrs = strsplit(MBAll{2},' ');
packno = int32(str2double(MBLine(1:8)));
t = str2double(MBLine(9:20));
et = t - lastt;
if lastt > 0 && et > 0
L = L + 1;
MBTable(L,1) = t;
end
%newpack = false;
end
if packline > 3
dataline = int16(str2double(MBLine(1:4)));
packdata = strcat(packdata,MBLine(7:53));
end
end
else
%if t >= st
if packline > 3
stoppack = true;
newpack = false;
end
if stoppack
invalid = false;
%eth = struct;
eth.pack = packdata(~isspace(packdata));
eth.length = length(eth.pack);
%Dissect the packet data
eth.stbyte = 1;
eth.ebyte = eth.length;
eth.header.stbyte = 1;
eth.header.ebyte = 28;
%Ethernet Packet Data
eth.header.pack = eth.pack(eth.stbyte:eth.stbyte+27);
eth.header.dest = eth.header.pack(eth.header.stbyte:eth.header.stbyte + 11);
eth.header.src = eth.header.pack(eth.header.stbyte + 12:eth.header.stbyte + 23);
eth.typecode = eth.header.pack(eth.header.stbyte + 24:eth.header.ebyte);
if strcmp(eth.typecode,'0800')
eth.type = 'IP';
%eth.ip = struct;
%IP Packet Data
eth.ip.stbyte = eth.header.ebyte + 1;
eth.ip.ver = eth.pack(eth.ip.stbyte);
%IP Header length
eth.ip.header.length = 4*int8(str2double(eth.pack(eth.ip.stbyte+1)));
eth.ip.header.ebyte = eth.ip.stbyte + eth.ip.header.length - 1;
%Differentiated Services Field
eth.ip.DSF = eth.pack(eth.ip.stbyte + 2:eth.ip.stbyte + 3);
%Total IP Packet Length
eth.ip.length = hex2dec(eth.pack(eth.ip.stbyte+4:eth.ip.stbyte+7));
eth.ip.ebyte = eth.ip.stbyte + max(eth.ip.length,46) - 1;
eth.ip.pack = eth.pack(eth.ip.stbyte:eth.ip.ebyte);
eth.ip.ID = eth.pack(eth.ip.stbyte+8:eth.ip.stbyte+11);
eth.ip.flags = eth.pack(eth.ip.stbyte+12:eth.ip.stbyte+13);
eth.ip.fragoff = eth.pack(eth.ip.stbyte+14:eth.ip.stbyte+15);
%Time to Live
eth.ip.ttl = hex2dec(eth.pack(eth.ip.stbyte+16:eth.ip.stbyte+17));
eth.ip.typecode = eth.pack(eth.ip.stbyte+18:eth.ip.stbyte+19);
eth.ip.checksum = eth.pack(eth.ip.stbyte+20:eth.ip.stbyte+23);
%eth.ip.src = eth.pack(eth.ip.stbyte+24:eth.ip.stbyte+31);
eth.ip.src = ...
[num2str(hex2dec(eth.pack(eth.ip.stbyte+24:eth.ip.stbyte+25))),'.', ...
num2str(hex2dec(eth.pack(eth.ip.stbyte+26:eth.ip.stbyte+27))),'.', ...
num2str(hex2dec(eth.pack(eth.ip.stbyte+28:eth.ip.stbyte+29))),'.', ...
num2str(hex2dec(eth.pack(eth.ip.stbyte+30:eth.ip.stbyte+31)))];
eth.ip.dest = ...
[num2str(hex2dec(eth.pack(eth.ip.stbyte+32:eth.ip.stbyte+33))),'.', ...
num2str(hex2dec(eth.pack(eth.ip.stbyte+34:eth.ip.stbyte+35))),'.', ...
num2str(hex2dec(eth.pack(eth.ip.stbyte+36:eth.ip.stbyte+37))),'.', ...
num2str(hex2dec(eth.pack(eth.ip.stbyte+38:eth.ip.stbyte+39)))];
if strcmp(eth.ip.typecode,'11')
eth.ip.type = 'UDP';
eth.ip.udp.stbyte = eth.ip.stbyte + 40;
eth.ip.udp.src = hex2dec(eth.pack(eth.ip.udp.stbyte:eth.ip.udp.stbyte + 3));
eth.ip.udp.dest = hex2dec(eth.pack(eth.ip.udp.stbyte+4:eth.ip.udp.stbyte+7));
eth.ip.udp.length = hex2dec(eth.pack(eth.ip.udp.stbyte+8:eth.ip.udp.stbyte+11));
eth.ip.udp.checksum = eth.pack(eth.ip.udp.stbyte+12:eth.ip.udp.stbyte+15);
eth.ip.udp.protoID = eth.pack(eth.ip.udp.stbyte+20:eth.ip.udp.stbyte+23);
if strcmp(eth.ip.udp.protoID,'0000')
eth.ip.udp.proto = 'MODBUS';
%eth.ip.udp.modbus = struct;
eth.ip.udp.modbus.stbyte = eth.ip.udp.stbyte+16;
eth.ip.udp.modbus.transID = eth.pack(eth.ip.udp.modbus.stbyte:eth.ip.udp.modbus.stbyte+3);
eth.ip.udp.modbus.protoID = eth.ip.udp.protoID;
eth.ip.udp.modbus.length = int16(str2double(eth.pack(eth.ip.udp.modbus.stbyte + 8:eth.ip.udp.modbus.stbyte + 11)));
eth.ip.udp.modbus.UID = eth.pack(eth.ip.udp.modbus.stbyte + 12:eth.ip.udp.modbus.stbyte + 13);
eth.ip.udp.modbus.func = hex2dec(eth.pack(eth.ip.udp.modbus.stbyte + 14:eth.ip.udp.modbus.stbyte+15));
eth.ip.udp.modbus.register = eth.pack(eth.ip.udp.modbus.stbyte + 16: eth.ip.udp.modbus.stbyte+19);
%Number of words to a register, or the number of registers
eth.ip.udp.modbus.words = hex2dec(eth.pack(eth.ip.udp.modbus.stbyte+20:eth.ip.udp.modbus.stbyte+23));
eth.ip.udp.modbus.bytes = hex2dec(eth.pack(eth.ip.udp.modbus.stbyte+24:eth.ip.udp.modbus.stbyte+25));
eth.ip.udp.modbus.data = eth.pack(eth.ip.udp.modbus.stbyte + 26:eth.ip.udp.modbus.stbyte + 26 + 2*eth.ip.udp.modbus.bytes - 1);
%If func 16 or 23, loop through data/registers and add to table
if eth.ip.udp.modbus.func == 16 || eth.ip.udp.modbus.func == 23
stp = eth.ip.udp.modbus.bytes*2/eth.ip.udp.modbus.words;
for n = 1:stp:eth.ip.udp.modbus.bytes*2;
%Check for existence of register as a key?
if ~isKey(MBMap,eth.ip.udp.modbus.register)
MBCol = MBCol + 1;
MBMap(eth.ip.udp.modbus.register) = MBCol;
end
MBTable(L,MBCol) = hex2dec(eth.ip.udp.modbus.data(n:n+stp-1));
eth.ip.udp.modbus.register = dec2hex(hex2dec(eth.ip.udp.modbus.register)+1);
end
lastt = t;
end
%If func 4, make sure it is the response, then put
%data into table for register column
elseif false
%need code to handle serial to UDP conversion box
else
invalid = true;
end
else
invalid = true;
end
else
invalid = true;
end
if ~invalid
end
end
%end
end
%Display Progress
if int64(fl/1000)*1000 == fl
for x = 1:length(mess);
fprintf('\b');
end
%fprintf('Lines parsed: %i',fl);
mess = sprintf('Lines parsed: %i / %i',fl,nol);
fprintf('%s',mess);
%Check execution time - getting slower:
%%{
ext = toc(tval);
mess = sprintf('\nExecution Time: %f\n',ext);
fprintf('%s',mess);
%%}
end
end
ext = toc - exst;
Update: I updated my code above to remove the overloaded operators (disp and lt were replaced with mess and lastt)
Was asked to use the profiler, so I limited to 2000 lines in the table (added && L >=2000 to the while loop) to limit the execution time, and here are the top results from the profiler:
SGAS_Wireshark_Parser_v0p7_fulleth 1 57.110 s 9.714 s
Strcat 9187 29.271 s 13.598 s
Blanks 9187 15.673 s 15.673 s
Uigetfile 1 12.226 s 0.009 s
uitools\private\uigetputfile_helper 1 12.212 s 0.031 s
FileChooser.FileChooser>FileChooser.show 1 12.085 s 0.006s
...er>FileChooser.showPeerAndBlockMATLAB 1 12.056 s 0.001s
...nChooser>FileOpenChooser.doShowDialog 1 12.049 s 12.049 s
hex2dec 44924 2.944 s 2.702 s
num2str 16336 1.139 s 0.550 s
str2double 17356 1.025 s 1.025 s
int2str 16336 0.589 s 0.589 s
fgetl 17356 0.488 s 0.488 s
dec2hex 6126 0.304 s 0.304 s
fliplr 44924 0.242 s 0.242 s
It appears to be strcat calls that are doing it. I only explicitly call strcat on one line. Are some of the other string manipulations I'm doing calling strcat indirectly?
Each loop should be calling strcat the same number of times though, so I still don't understand why it takes longer and longer the more it runs...
also, hex2dec is called a lot, but is not really affecting the time.
But anyway, are there any other methods I can use the combine the strings?
Here is the issue:
The string (an char array in MATLAB) packdata was being resized and reallocated over and over again. That's what was slowing down this code. I did the following steps:
I eliminated the redundant variable packdata and now only use eth.pack.
I preallocated eth.pack and a couple "helper variables" of known lengths by running blanks ONCE for each before the loop ever starts
eth.pack = blanks(604);
thisline = blanks(47);
smline = blanks(32);
(Note: 604 is the maximum possible size of packdata based on headers + MODBUS protocol)
Then I created a pointer variable to point to the location of the last char written to packdata.
pptr = 1;
...
dataline = int16(str2double(MBLine(1:4)));
thisline = MBLine(7:53); %Always 47 characters
smline = [thisline(~isspace(thisline)),blanks(32-sum(~isspace(thisline)))]; %Always 32 Characters
eth.pack(pptr:pptr+31) = smline;
pptr = pptr + 32;
The above was inside the 'if packline > 3' block in place of the 'packdata =' statement, then at the end of the 'if stoppack' block was the reset statement:
pptr = 1; %Reset Pointer
FYI, not surprisingly this brought out other flaws in my code which I've mostly fixed but still need to finish. Not a big issue now as this loop executes lightning fast with these changes. Thanks to Yvon for helping point me in the right direction.
I kept thinking my huge table, MBTable was the issue... but it had nothing to do with it.