clc
URL = 'http://time.is/';
key1 = 'title="Click for calendar">';
key2 = '</h2>';
data = urlread(URL);
start_ind = strfind(data,key1);
data1 = data(start_ind:end);
off_stop_ind = strfind(data1,key2);
current_date =data(start_ind+numel(key1):start_ind + off_stop_ind(1)-2)
date_split = strsplit(current_date,',')
current_date1 = datestr(strcat(date_split(2),date_split(3)))
I get this error. How can I fix it?
Index exceeds the number of array elements. Index must not exceed 0.
Error in date (line 10)
current_date =data(start_ind+numel(key1):start_ind + off_stop_ind(1)-2)
I have a follow-up question about nathandrake's post: How do I calculate a word-word co-occurrence matrix with sklearn?
import pandas as pd
def co_occurance_matrix(input_text,top_words,window_size):
    co_occur = pd.DataFrame(index=top_words, columns=top_words)
    for row,nrow in zip(top_words,range(len(top_words))):
        for colm,ncolm in zip(top_words,range(len(top_words))):
            count = 0
            if row == colm:
                co_occur.iloc[nrow,ncolm] = count
            else:
                for single_essay in input_text:
                    essay_split = single_essay.split(" ")
                    max_len = len(essay_split)
                    top_word_index = [index for index, split in enumerate(essay_split) if row in split]
                    for index in top_word_index:
                        if index == 0:
                            count = count + essay_split[:window_size + 1].count(colm)
                        elif index == (max_len -1):
                            count = count + essay_split[-(window_size + 1):].count(colm)
                        else:
                            count = count + essay_split[index + 1 : (index + window_size + 1)].count(colm)
                            if index < window_size:
                                count = count + essay_split[: index].count(colm)
                            else:
                                count = count + essay_split[(index - window_size): index].count(colm)
                co_occur.iloc[nrow,ncolm] = count
    return co_occur
My question is: what if my terms are not single words but bigrams? For example:
corpus = ['ABC DEF IJK PQR','PQR KLM OPQ','LMN PQR XYZ DEF ABC']
words = ['ABC PQR','PQR DEF']
window_size =100
result = co_occurance_matrix(corpus,words,window_size)
result
When I change the word list into a bigram list, the co_occurance_matrix function stops working: every entry is 0.
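One likely reason for the zeros is that the function splits each essay on single spaces, so every token is one word and a two-word term such as 'ABC PQR' can never match `row in split` or `.count(colm)`. The following is only a minimal sketch of one way around that, under two assumptions that are not stated in the question: a "bigram" means two adjacent words, and two bigrams co-occur when they start within `window_size` bigram positions of each other in the same document. The names `to_ngrams` and `ngram_cooccurrence` and the example `terms` are hypothetical.

```python
def to_ngrams(text, n=2):
    """Return the list of adjacent n-word terms in `text` (assumed meaning of 'bigram')."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_cooccurrence(corpus, terms, window_size, n=2):
    """Count how often each pair of distinct terms appears within `window_size` n-gram positions."""
    counts = {a: {b: 0 for b in terms} for a in terms}
    for doc in corpus:
        grams = to_ngrams(doc, n)
        for i, gram in enumerate(grams):
            if gram not in counts:
                continue
            lo = max(0, i - window_size)
            hi = min(len(grams), i + window_size + 1)
            for j in range(lo, hi):
                other = grams[j]
                # Skip the term itself so the diagonal stays 0, as in the original function.
                if j != i and other in counts and other != gram:
                    counts[gram][other] += 1
    return counts


corpus = ['ABC DEF IJK PQR', 'PQR KLM OPQ', 'LMN PQR XYZ DEF ABC']
# Hypothetical terms chosen so that both occur as adjacent word pairs in the corpus.
terms = ['PQR XYZ', 'DEF ABC']
result = ngram_cooccurrence(corpus, terms, window_size=100)
```

Note that under this assumption the terms from the question ('ABC PQR' and 'PQR DEF') never appear as adjacent word pairs in the example corpus, so they would still count as 0. If a bigram is meant to be two words appearing anywhere in the same window, the matching rule would need to change accordingly.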
I'm using some code that calls a specific function.
At line 17 of this function I get the error:
Index exceeds matrix dimensions.
Error in generateExpReport (line 17)
checkPValuesField = subjFiles(1).name;
The function up to line 20 looks like this:
function [] = generateExpReport(copyDir,resultDir,params)
% Syntax
methodNames = fieldnames(params.methods);
numMethods = length(methodNames);
for i = 1 : numMethods
cd(resultDir)
numTargets = length(params.methods.(methodNames{i,1}).idTargets);
idDrivers = params.methods.(methodNames{i,1}).idDrivers;
nameFiles = [methodNames{i,1} '*.mat'];
subjFiles = dir(nameFiles);
numSubj = length(subjFiles);
significanceOnDrivers = zeros(numSubj,numTargets);
matrixTransferEntropy = zeros(numSubj,(numTargets)+1);
% check if the pValues matrix is present
checkPValuesField = load(subjFiles(1).name);
fields = fieldnames(checkPValuesField);
nameFields = checkPValuesField.(fields{1,1});
I can't find the problem.
Please help me: what's wrong with
checkPValuesField = load(subjFiles(1).name);
Using an ActiveX server from MATLAB, I am trying to highlight many cells in an Excel sheet at once. These are not in specific columns or rows, so I use Range('A1,B2,...') to access them. However, the string accepted by the Range object must be shorter than 255 characters, otherwise the error
Error: Object returned error code: 0x800A03EC
is thrown. The following code reproduces this error with an empty Excel file.
hActX = actxserver('Excel.Application');
hWB = hActX.Workbooks.Open('C:\Book1.xlsx');
hSheet = hWB.Worksheets.Item('Sheet1');
col = repmat('A', 100, 1);
row = num2str((1:100)'); %'
cellInd = strcat(col, strtrim(cellstr(row)));
str1 = strjoin(cellInd(1:66), ','); %// 254 characters
str2 = strjoin(cellInd(1:67), ','); %// 258 characters
hSheet.Range(str1).Interior.Color = 255; %// Works
hSheet.Range(str2).Interior.Color = 255; %// Error 0x800A03EC
hWB.Save;
hWB.Close(false);
hActX.Quit;
How can I get around this? I found no other relevant method of calling Range, or of otherwise getting the cells I want to modify.
If you start with a String, you can test its length to determine if Range() can handle it. Here is an example of building a diagonal range:
Sub DiagonalRange()
Dim BigString As String, BigRange As Range
Dim i As Long, HowMany As Long, Ln As Long
HowMany = 100
For i = 1 To HowMany
BigString = BigString & "," & Cells(i, i).Address(0, 0)
Next i
BigString = Mid(BigString, 2)
Ln = Len(BigString)
MsgBox Ln
If Ln < 250 Then
Set BigRange = Range(BigString)
Else
Set BigRange = Nothing
arr = Split(BigString, ",")
For Each a In arr
If BigRange Is Nothing Then
Set BigRange = Range(a)
Else
Set BigRange = Union(BigRange, Range(a))
End If
Next a
End If
BigRange.Select
End Sub
For HowMany = 10, the code will use the direct method; for HowMany = 100, the array method is used.
The solution, as Rory pointed out, is to use the Union method. To minimize the number of calls from MATLAB to the ActiveX server, this is what I did:
str = strjoin(cellInd, ',');
isep = find(str == ',');
isplit = diff(mod(isep, 250)) < 0;
isplit = [isep(isplit) (length(str) + 1)];
hRange = hSheet.Range(str(1:(isplit(1) - 1)));
for ii = 2:numel(isplit)
hRange = hActX.Union(hRange, ...
hSheet.Range(str((isplit(ii-1) + 1):(isplit(ii) - 1))));
end
I used 250 in the mod to account for the cell names being up to 6 characters long, which is sufficient for me.
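The splitting step can also be done by grouping whole addresses rather than cutting the joined string at fixed offsets. The sketch below is written in Python purely to illustrate the logic and is not part of the MATLAB solution above: the helper name `chunk_addresses` is hypothetical, and the only fact taken from the question is that a Range string must be shorter than 255 characters. Each returned chunk would then be passed to Range and combined with Union as shown above.

```python
def chunk_addresses(addresses, limit=255):
    """Group cell addresses into comma-joined strings strictly shorter than `limit`."""
    chunks, current, length = [], [], 0
    for addr in addresses:
        # Cost of adding this address: its length plus a comma if the chunk is non-empty.
        extra = len(addr) + (1 if current else 0)
        if current and length + extra >= limit:
            chunks.append(",".join(current))
            current, length = [], 0
            extra = len(addr)
        current.append(addr)
        length += extra
    if current:
        chunks.append(",".join(current))
    return chunks


# Same cells as in the question: A1 .. A100.
addresses = [f"A{i}" for i in range(1, 101)]
chunks = chunk_addresses(addresses)
```

Because the split always happens between addresses, no margin for the maximum address length is needed, unlike the 250 used with mod above.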
I am attempting to extract data from a DWT subband. I can embed the data correctly (I have followed it in the debugger) and calculate the PSNR, although the PSNR seems very high at 76.2. However, I am having a lot of trouble extracting the data again: sometimes it extracts the number 128. Does anyone have any idea why this happens? I have been working on this all day with no luck.
Data Embedding:
coverImage = imread('lena.bmp');
message = importdata('minutiaTest.txt');
%message = 'Bifurcations:';
[LL,LH,HL,HH] = dwt2(coverImage,'haar');
if size(message) > size(coverImage,1) * size(coverImage,2)
error ('message too big to embed');
end
bit_count = 0;
steg_coeffs = [4, 4.75, 5.5, 6.25, 7];
for jj=1:size(message,2)+1
if jj > size(message,2)
charbits = [0,0,0,0,0,0,0,0];
else
charbits = dec2bin(message(jj),8)';
charbits = charbits(:)'-'0';
end
for ii=1:8
bit_count = bit_count + 1;
if charbits(ii) == 1
if HH(bit_count) <= 0
HH(bit_count) = steg_coeffs(randi(numel(steg_coeffs)));
end
else
if HH(bit_count) >= 0
HH(bit_count) = -1 * steg_coeffs(randi(numel(steg_coeffs)));
end
end
end
end
stego_image = idwt2(LL,LH,HL,HH,'haar');
imwrite(uint8(stego_image),'newStego.bmp');
Data Extraction:
new_Stego = imread('newStego.bmp');
[LL,LH,HL,HH] = dwt2(new_Stego,'haar');
message = '';
msgbits = '';
for ii = 1:size(HH,1)*size(HH,2)
if HH(ii) > 0
msgbits = strcat (msgbits, '1');
elseif HH(ii) < 0
msgbits = strcat (msgbits, '0');
else
return;
end
if mod(ii,8) == 0
msgChar = bin2dec(msgbits);
if msgChar == 0
break;
end
msgChar = char (msgChar);
message = [message msgChar];
msgbits = '';
end
end
The problem arises from reading your data with importdata.
This command loads the data into a numeric array. Since you have 39 lines and 2 columns (skipping any empty lines), its size will be [39 2]. However, the program assumes that your message is a string. For example, 'i am a string' has size [1 13]. This mismatch between what the program expects and the data you actually give it creates all sorts of problems.
What you want is to read your data as a single string, where the number 230 is not one element but 3 individual characters. Tabs and newlines will be read in as well.
To read your file:
message = fileread('minutiaTest.txt');
After you extract your message, to save it to a file:
fid = fopen('myFilename.txt','w');
fprintf(fid,'%s',message); % '%s' keeps any % or \ in the message from being treated as format codes
fclose(fid);
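To make the string expectation concrete, here is a small Python sketch of the round trip the embedding and extraction loops perform on the message itself: each character becomes 8 bits (most significant bit first, as dec2bin(...,8) produces), eight zero bits mark the end, and extraction stops at the first all-zero byte. The function names are hypothetical, the DWT step is left out entirely, and the sketch assumes every character code is below 256.

```python
def text_to_bits(text):
    """Encode each character as 8 bits (MSB first) and append an all-zero terminator byte."""
    bits = []
    for ch in text:
        code = ord(ch)
        if code > 255:
            raise ValueError("this sketch assumes single-byte characters")
        bits.extend(int(b) for b in format(code, "08b"))
    bits.extend([0] * 8)  # terminator, like the extra all-zero pass in the embedding loop
    return bits


def bits_to_text(bits):
    """Decode 8 bits at a time and stop at the first all-zero byte."""
    chars = []
    for i in range(0, len(bits) - 7, 8):
        value = int("".join(str(b) for b in bits[i:i + 8]), 2)
        if value == 0:
            break
        chars.append(chr(value))
    return "".join(chars)


# A line of the text file as characters: '230' is three characters, not one number.
message = "230\t17\n"
bits = text_to_bits(message)
recovered = bits_to_text(bits)
```

Read as text, "230" contributes three characters and therefore 24 bits, which is what the bit loop expects; a numeric array element 230 would instead be treated as a single value.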
Here is my code. The intent is I have a Wireshark capture saved to a particularly formatted text file. The MATLAB code is supposed to go through the Packets, dissect them for different protocols, and then make tables based on those protocols. I currently have this programmed for ETHERNET/IP/UDP/MODBUS. In this case, it creates a column in MBTable each time it encounters a new register value, and each time it comes across a change to that register value, it updates the value in that line of the table. The first column of MBTable is time, the registers start with the second column.
MBTable is preallocated to over 100,000 Rows (nol is very large), 10 columns before this code is executed. The actual data from a file I'm pulling into the table gets to about 10,000 rows and 4 columns and the code execution is so slow I have to stop it. The tic/toc value is calculated every 1000 rows and continues to increase exponentially with every iteration. It is a large loop, but I can't see where anything is growing in such a way that it would cause it to run slower with each iteration.
All variables are initialized at the top (left out to reduce the amount of code).
The variables eth, eth.ip, eth.ip.udp, and eth.ip.udp.modbus are all of type struct as is eth.header and eth.ip.header. WSID is a file ID from a .txt file opened earlier.
MBTable = zeros(nol,10);
tval = tic;
while not(feof(WSID))
packline = packline + 1;
fl = fl + 1;
%Get the next line from the file
MBLine = fgetl(WSID);
%Make sure line is not blank or short
if length(MBLine) >= 3
%Split the line into 1. Line no, 2. Data, 3. ASCII
%MBAll = strsplit(MBLine,' ');
%First line of new packet, if headers included
if strcmp(MBLine(1:3),'No.')
newpack = true;
newtime = false;
newdata = false;
stoppack = false;
packline = 1;
end
%If packet has headers, 2nd line contains timestamp
if newpack
Ordered = false;
if packline == 2;
newtime = true;
%MBstrs = strsplit(MBAll{2},' ');
packno = int32(str2double(MBLine(1:8)));
t = str2double(MBLine(9:20));
et = t - lastt;
if lastt > 0 && et > 0
L = L + 1;
MBTable(L,1) = t;
end
%newpack = false;
end
if packline > 3
dataline = int16(str2double(MBLine(1:4)));
packdata = strcat(packdata,MBLine(7:53));
end
end
else
%if t >= st
if packline > 3
stoppack = true;
newpack = false;
end
if stoppack
invalid = false;
%eth = struct;
eth.pack = packdata(~isspace(packdata));
eth.length = length(eth.pack);
%Dissect the packet data
eth.stbyte = 1;
eth.ebyte = eth.length;
eth.header.stbyte = 1;
eth.header.ebyte = 28;
%Ethernet Packet Data
eth.header.pack = eth.pack(eth.stbyte:eth.stbyte+27);
eth.header.dest = eth.header.pack(eth.header.stbyte:eth.header.stbyte + 11);
eth.header.src = eth.header.pack(eth.header.stbyte + 12:eth.header.stbyte + 23);
eth.typecode = eth.header.pack(eth.header.stbyte + 24:eth.header.ebyte);
if strcmp(eth.typecode,'0800')
eth.type = 'IP';
%eth.ip = struct;
%IP Packet Data
eth.ip.stbyte = eth.header.ebyte + 1;
eth.ip.ver = eth.pack(eth.ip.stbyte);
%IP Header length
eth.ip.header.length = 4*int8(str2double(eth.pack(eth.ip.stbyte+1)));
eth.ip.header.ebyte = eth.ip.stbyte + eth.ip.header.length - 1;
%Differentiated Services Field
eth.ip.DSF = eth.pack(eth.ip.stbyte + 2:eth.ip.stbyte + 3);
%Total IP Packet Length
eth.ip.length = hex2dec(eth.pack(eth.ip.stbyte+4:eth.ip.stbyte+7));
eth.ip.ebyte = eth.ip.stbyte + max(eth.ip.length,46) - 1;
eth.ip.pack = eth.pack(eth.ip.stbyte:eth.ip.ebyte);
eth.ip.ID = eth.pack(eth.ip.stbyte+8:eth.ip.stbyte+11);
eth.ip.flags = eth.pack(eth.ip.stbyte+12:eth.ip.stbyte+13);
eth.ip.fragoff = eth.pack(eth.ip.stbyte+14:eth.ip.stbyte+15);
%Time to Live
eth.ip.ttl = hex2dec(eth.pack(eth.ip.stbyte+16:eth.ip.stbyte+17));
eth.ip.typecode = eth.pack(eth.ip.stbyte+18:eth.ip.stbyte+19);
eth.ip.checksum = eth.pack(eth.ip.stbyte+20:eth.ip.stbyte+23);
%eth.ip.src = eth.pack(eth.ip.stbyte+24:eth.ip.stbyte+31);
eth.ip.src = ...
[num2str(hex2dec(eth.pack(eth.ip.stbyte+24:eth.ip.stbyte+25))),'.', ...
num2str(hex2dec(eth.pack(eth.ip.stbyte+26:eth.ip.stbyte+27))),'.', ...
num2str(hex2dec(eth.pack(eth.ip.stbyte+28:eth.ip.stbyte+29))),'.', ...
num2str(hex2dec(eth.pack(eth.ip.stbyte+30:eth.ip.stbyte+31)))];
eth.ip.dest = ...
[num2str(hex2dec(eth.pack(eth.ip.stbyte+32:eth.ip.stbyte+33))),'.', ...
num2str(hex2dec(eth.pack(eth.ip.stbyte+34:eth.ip.stbyte+35))),'.', ...
num2str(hex2dec(eth.pack(eth.ip.stbyte+36:eth.ip.stbyte+37))),'.', ...
num2str(hex2dec(eth.pack(eth.ip.stbyte+38:eth.ip.stbyte+39)))];
if strcmp(eth.ip.typecode,'11')
eth.ip.type = 'UDP';
eth.ip.udp.stbyte = eth.ip.stbyte + 40;
eth.ip.udp.src = hex2dec(eth.pack(eth.ip.udp.stbyte:eth.ip.udp.stbyte + 3));
eth.ip.udp.dest = hex2dec(eth.pack(eth.ip.udp.stbyte+4:eth.ip.udp.stbyte+7));
eth.ip.udp.length = hex2dec(eth.pack(eth.ip.udp.stbyte+8:eth.ip.udp.stbyte+11));
eth.ip.udp.checksum = eth.pack(eth.ip.udp.stbyte+12:eth.ip.udp.stbyte+15);
eth.ip.udp.protoID = eth.pack(eth.ip.udp.stbyte+20:eth.ip.udp.stbyte+23);
if strcmp(eth.ip.udp.protoID,'0000')
eth.ip.udp.proto = 'MODBUS';
%eth.ip.udp.modbus = struct;
eth.ip.udp.modbus.stbyte = eth.ip.udp.stbyte+16;
eth.ip.udp.modbus.transID = eth.pack(eth.ip.udp.modbus.stbyte:eth.ip.udp.modbus.stbyte+3);
eth.ip.udp.modbus.protoID = eth.ip.udp.protoID;
eth.ip.udp.modbus.length = int16(str2double(eth.pack(eth.ip.udp.modbus.stbyte + 8:eth.ip.udp.modbus.stbyte + 11)));
eth.ip.udp.modbus.UID = eth.pack(eth.ip.udp.modbus.stbyte + 12:eth.ip.udp.modbus.stbyte + 13);
eth.ip.udp.modbus.func = hex2dec(eth.pack(eth.ip.udp.modbus.stbyte + 14:eth.ip.udp.modbus.stbyte+15));
eth.ip.udp.modbus.register = eth.pack(eth.ip.udp.modbus.stbyte + 16: eth.ip.udp.modbus.stbyte+19);
%Number of words to a register, or the number of registers
eth.ip.udp.modbus.words = hex2dec(eth.pack(eth.ip.udp.modbus.stbyte+20:eth.ip.udp.modbus.stbyte+23));
eth.ip.udp.modbus.bytes = hex2dec(eth.pack(eth.ip.udp.modbus.stbyte+24:eth.ip.udp.modbus.stbyte+25));
eth.ip.udp.modbus.data = eth.pack(eth.ip.udp.modbus.stbyte + 26:eth.ip.udp.modbus.stbyte + 26 + 2*eth.ip.udp.modbus.bytes - 1);
%If func 16 or 23, loop through data/registers and add to table
if eth.ip.udp.modbus.func == 16 || eth.ip.udp.modbus.func == 23
stp = eth.ip.udp.modbus.bytes*2/eth.ip.udp.modbus.words;
for n = 1:stp:eth.ip.udp.modbus.bytes*2;
%Check for existence of register as a key?
if ~isKey(MBMap,eth.ip.udp.modbus.register)
MBCol = MBCol + 1;
MBMap(eth.ip.udp.modbus.register) = MBCol;
end
MBTable(L,MBCol) = hex2dec(eth.ip.udp.modbus.data(n:n+stp-1));
eth.ip.udp.modbus.register = dec2hex(hex2dec(eth.ip.udp.modbus.register)+1);
end
lastt = t;
end
%If func 4, make sure it is the response, then put
%data into table for register column
elseif false
%need code to handle serial to UDP conversion box
else
invalid = true;
end
else
invalid = true;
end
else
invalid = true;
end
if ~invalid
end
end
%end
end
%Display Progress
if int64(fl/1000)*1000 == fl
for x = 1:length(mess);
fprintf('\b');
end
%fprintf('Lines parsed: %i',fl);
mess = sprintf('Lines parsed: %i / %i',fl,nol);
fprintf('%s',mess);
%Check execution time - getting slower:
%%{
ext = toc(tval);
mess = sprintf('\nExecution Time: %f\n',ext);
fprintf('%s',mess);
%%}
end
end
ext = toc - exst;
Update: I updated the code above so that variables no longer shadow built-in functions (disp and lt were renamed to mess and lastt).
I was asked to use the profiler, so I capped the table at 2000 lines (added && L >=2000 to the while loop) to limit the execution time. Here are the top results from the profiler:
Function Name                              Calls    Total Time   Self Time
SGAS_Wireshark_Parser_v0p7_fulleth         1        57.110 s     9.714 s
strcat                                     9187     29.271 s     13.598 s
blanks                                     9187     15.673 s     15.673 s
uigetfile                                  1        12.226 s     0.009 s
uitools\private\uigetputfile_helper        1        12.212 s     0.031 s
FileChooser.FileChooser>FileChooser.show   1        12.085 s     0.006 s
...er>FileChooser.showPeerAndBlockMATLAB   1        12.056 s     0.001 s
...nChooser>FileOpenChooser.doShowDialog   1        12.049 s     12.049 s
hex2dec                                    44924    2.944 s      2.702 s
num2str                                    16336    1.139 s      0.550 s
str2double                                 17356    1.025 s      1.025 s
int2str                                    16336    0.589 s      0.589 s
fgetl                                      17356    0.488 s      0.488 s
dec2hex                                    6126     0.304 s      0.304 s
fliplr                                     44924    0.242 s      0.242 s
It appears to be the strcat calls that are doing it. I only explicitly call strcat on one line. Are some of the other string manipulations I'm doing calling strcat indirectly?
Each loop iteration should be calling strcat the same number of times, though, so I still don't understand why it takes longer and longer the more it runs.
Also, hex2dec is called a lot, but it is not really affecting the time.
In any case, are there other methods I can use to combine the strings?
Here is the issue:
The string (a char array in MATLAB) packdata was being resized and reallocated over and over again. That is what was slowing down this code. I took the following steps:
I eliminated the redundant variable packdata and now only use eth.pack.
I preallocated eth.pack and a couple of "helper variables" of known lengths by running blanks ONCE for each before the loop ever starts:
eth.pack = blanks(604);
thisline = blanks(47);
smline = blanks(32);
(Note: 604 is the maximum possible size of packdata based on headers + MODBUS protocol)
Then I created an index variable (a "pointer") that marks the next position to write to in eth.pack.
pptr = 1;
...
dataline = int16(str2double(MBLine(1:4)));
thisline = MBLine(7:53); %Always 47 characters
smline = [thisline(~isspace(thisline)),blanks(32-sum(~isspace(thisline)))]; %Always 32 Characters
eth.pack(pptr:pptr+31) = smline;
pptr = pptr + 32;
The above was inside the 'if packline > 3' block in place of the 'packdata =' statement, then at the end of the 'if stoppack' block was the reset statement:
pptr = 1; %Reset Pointer
FYI, not surprisingly this brought out other flaws in my code which I've mostly fixed but still need to finish. Not a big issue now as this loop executes lightning fast with these changes. Thanks to Yvon for helping point me in the right direction.
I kept thinking my huge table, MBTable was the issue... but it had nothing to do with it.
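The preallocate-and-index pattern described above can be sketched in a few lines. The version below is in Python purely for illustration and is not the MATLAB code itself: the function name `assemble_packet` is hypothetical, the sizes 604 and 32 are taken from the answer, and Python's own string performance differs from MATLAB's, so this shows only the structure of the technique, not the speed-up.

```python
def assemble_packet(hex_lines, capacity=604, width=32):
    """Copy each hex-dump line into a fixed-size buffer at a moving write position."""
    buffer = bytearray(b" " * capacity)  # allocated once, like blanks(604)
    pos = 0                              # next position to write, like pptr
    for line in hex_lines:
        compact = "".join(line.split())  # drop the spaces between byte pairs
        if len(compact) > width:
            raise ValueError("line holds more than the expected number of hex characters")
        if pos + width > capacity:
            raise ValueError("packet exceeds the preallocated capacity")
        # Pad to a fixed width so every write covers exactly `width` bytes.
        buffer[pos:pos + width] = compact.ljust(width).encode("ascii")
        pos += width
    # Keep only the written part and drop the padding.
    return buffer[:pos].decode("ascii").replace(" ", "")


# Two made-up dump lines: one full line of 16 bytes (47 characters) and one short line.
full_line = " ".join(f"{b:02x}" for b in range(16))
packet = assemble_packet([full_line, "10 11 12"])
```

The buffer never changes size inside the loop, so the cost of each write does not depend on how much has already been written, and the caller starts the next packet simply by beginning again at position 0.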