autohotkey web scraping code optimization

I've added my scraping code below. In theory, I start on a page with 25 values to pull, and the script appends them the way I want.
Some pages have fewer than 25 values, which gives me errors and blank lines in my .txt file. Can smarter minds help me clean this up?
Here is the scraping code; I use the same approach in another area, where it works fine.
F3::Loop_Scrape()
Loop_Scrape() {
If ErrorLevel
return
else
prop_1=,prop_2=,prop_3=,prop_4=,prop_5=,prop_6=,prop_7=,prop_8=,prop_9=,prop_10=,prop_11=,prop_12=,prop_13=,prop_14=,prop_15=,prop_16=,prop_17=,prop_18=,prop_19=,prop_20=,prop_21=,prop_22=,prop_23=,prop_24=,prop_25=
Sleep,150
IfWinExist,ahk_class IEFrame
{
pwb:=WBGet()
WinActivate
}
Sleep,350
prop_1:=pwb.document.GetElementsByTagName("TD")[1].innerText
prop_2:=pwb.document.GetElementsByTagName("TD")[12].innerText
prop_3:=pwb.document.GetElementsByTagName("TD")[23].innerText
prop_4:=pwb.document.GetElementsByTagName("TD")[34].innerText
prop_5:=pwb.document.GetElementsByTagName("TD")[45].innerText
prop_6:=pwb.document.GetElementsByTagName("TD")[56].innerText
prop_7:=pwb.document.GetElementsByTagName("TD")[67].innerText
prop_8:=pwb.document.GetElementsByTagName("TD")[78].innerText
prop_9:=pwb.document.GetElementsByTagName("TD")[89].innerText
prop_10:=pwb.document.GetElementsByTagName("TD")[100].innerText
prop_11:=pwb.document.GetElementsByTagName("TD")[111].innerText
prop_12:=pwb.document.GetElementsByTagName("TD")[122].innerText
prop_13:=pwb.document.GetElementsByTagName("TD")[133].innerText
prop_14:=pwb.document.GetElementsByTagName("TD")[144].innerText
prop_15:=pwb.document.GetElementsByTagName("TD")[155].innerText
prop_16:=pwb.document.GetElementsByTagName("TD")[166].innerText
prop_17:=pwb.document.GetElementsByTagName("TD")[177].innerText
prop_18:=pwb.document.GetElementsByTagName("TD")[188].innerText
prop_19:=pwb.document.GetElementsByTagName("TD")[199].innerText
prop_20:=pwb.document.GetElementsByTagName("TD")[210].innerText
prop_21:=pwb.document.GetElementsByTagName("TD")[221].innerText
prop_22:=pwb.document.GetElementsByTagName("TD")[232].innerText
prop_23:=pwb.document.GetElementsByTagName("TD")[243].innerText
prop_24:=pwb.document.GetElementsByTagName("TD")[254].innerText
prop_25:=pwb.document.GetElementsByTagName("TD")[265].innerText
Sleep,350
FileAppend,%prop_1%`n%prop_2%`n%prop_3%`n%prop_4%`n%prop_5%`n%prop_6%`n%prop_7%`n%prop_8%`n%prop_9%`n%prop_10%`n%prop_11%`n%prop_12%`n%prop_13%`n%prop_14%`n%prop_15%`n%prop_16%`n%prop_17%`n%prop_18%`n%prop_19%`n%prop_20%`n%prop_21%`n%prop_22%`n%prop_23%`n%prop_24%`n%prop_25%`n,Docs/MyFile2.txt
return
}

Dude, you should try creating a dict.
Here's my Python version of what you should do; hope it helps:
props = {}  # create the dictionary
index = 1   # index of the first TD element
for i in range(1, 26):  # goes from 1 to 25
    # get the element and put it in the dictionary
    props['prop' + str(i)] = pwb.document.GetElementsByTagName("TD")[index].innerText
    # increment the index
    index += 11
# append every non-empty value to the output file
with open("Docs/MyFile2.txt", "a") as f:
    for key, value in props.items():
        if value != "":
            f.write(value + "\n")
Sorry, I don't know AutoHotkey (I just saw your question), but if you get the idea you can adapt it.
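A fixed count of 25 still assumes a full page. Here is a minimal Python sketch of bounding the loop by the number of cells actually present, so a short page produces neither an index error nor blank lines. It assumes the cell texts have already been read into a plain list; `cells` is a hypothetical stand-in for the TD collection, and `collect_values` is an illustrative name, not part of any library.

```python
def collect_values(cells, start=1, step=11, max_rows=25):
    """Pick every `step`-th cell starting at `start`, without running past the end."""
    values = []
    # Stop at whichever comes first: the end of the cells or max_rows rows.
    stop = min(len(cells), start + step * max_rows)
    for index in range(start, stop, step):
        text = cells[index].strip()
        if text:  # skip blank cells so no empty lines are written
            values.append(text)
    return values


if __name__ == "__main__":
    # A short page: only 30 cells, so only indexes 1, 12 and 23 exist.
    short_page = [f"cell {i}" for i in range(30)]
    print("\n".join(collect_values(short_page)))
```

The same idea should carry over to AutoHotkey by checking the collection's length before indexing into it.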

Related

Reject method of collection does not work

Route::get('/product', function () {
    $product = Product::all();
    $filtered_product = $product->reject(function ($product) {
        $specific_product = $product->where("price", '=', "10.00")->get();
        foreach ($specific_product as $sp) {
            return $sp->price;
        }
    });
    dd($filtered_product);
});
I want to exclude some records that match the condition above. I know I can do it in a simpler way, but I have a weird habit of doing things in a more complex way so I can be proud of myself. Sounds crazy, right? Anyway, why does the code above return an empty array? Please don't tell me to read the documentation. I am here because I have read it a thousand times and I still don't get it. Thanks.
I expect a result that does not include the records with a price of 10.00.

Randomly select an array from group of "enabled" arrays

I have a set of four arrays. I also have an option to either enable or disable 3 of the 4 arrays (one is always enabled).
Is there a way to randomly decide which array to pull a value from (of the arrays indicated as enabled)?
I originally was aiming to make a master array and just append the content of the other enabled ones into it, but it proved a little harder than expected. I figured it would be easier to simply randomly select an array to pull the single value from as long as it was "enabled".
I'm currently pulling the value with a simple statement such as
if ????? {
    return promptArrayA[desiredIndexA]
} else if ????? {
    return promptArrayB[desiredIndexB]
} else if ????? {
    return promptArrayC[desiredIndexC]
} else {
    return promptArrayD[desiredIndexD]
}
I'm thinking if I had a "randomizer" that chose one of the enabled arrays, then I can use that as a constraint in an If Statement.
I'm fairly new to Swift so any help is much appreciated. Thank you

You can pick one of the arrays at random like this:
// include only the arrays that are currently enabled in this list
let enabledArrays = [promptArrayA, promptArrayB, promptArrayC, promptArrayD]
let randomIndex = Int.random(in: 0..<enabledArrays.count)
let randomArray = enabledArrays[randomIndex]
return randomArray[desiredIndex]
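The snippet above leaves open how the list of enabled arrays is built. Below is a minimal sketch of that filtering step, written in Python for illustration rather than Swift. The parallel `enabled` flags and the name `pick_from_enabled` are assumptions made for the example and do not come from the question.

```python
import random


def pick_from_enabled(arrays, enabled, index, rng=random):
    """Return arrays[k][index] for a randomly chosen k whose flag is True."""
    # Keep only the arrays whose matching flag is switched on.
    candidates = [arr for arr, is_on in zip(arrays, enabled) if is_on]
    if not candidates:
        raise ValueError("at least one array must be enabled")
    return rng.choice(candidates)[index]


if __name__ == "__main__":
    prompts = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"], ["d0", "d1"]]
    flags = [True, False, True, True]  # the second array is disabled
    print(pick_from_enabled(prompts, flags, 1))
```

Filtering first and choosing second avoids the chain of if/else branches: a disabled array is never a candidate, so no extra condition is needed.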

Mirc script to find exact match in customer list

I am using this to find a customer name in a text file. Each name is on a separate line. I need to find the exact name: a search for Nick should match Nick only, but my code reports a match even if only Nickolson is in the list.
On*:text:*!Customer*:#: {
if ($read(system\Customer.txt,$2)) {
.msg $chan $2 Customer found in list! | halt }
else { .msg $chan 4 $2 Customer not found in list. | halt }
}

You have to loop through every matching line and check whether the line is an exact match.
Something like this:
On *:text:*!Customer*:#: {
  var %nick
  ; loop over all lines that contain nick (replace nick with the name you are searching for)
  while ($read(customer.txt, nw, *nick*, $calc($readn + 1))) {
    ; check if the line is an exact match
    if ($v1 == nick) {
      %nick = $v1
      ; stop the loop because a result is found
      break
    }
  }
  if (%nick == $null) {
    .msg $chan 4 $2 Customer not found in list.
  }
  else {
    .msg $chan $2 Customer found in list!
  }
}
You can find more here: https://en.wikichip.org/wiki/mirc/text_files#Iterating_Over_Matches

If you're looking for an exact match in a newline-separated list, you can use the 'w' switch without the wildcard '*' character.
From the mIRC documentation:
$read(filename, [ntswrp], [matchtext], [N])
Scans the file info.txt for a line beginning with the word mirc and returns the text following the match value.
//echo $read(help.txt, w, *help*)
Because we don't want wildcard matching but an exact match, we would use:
$read(customers.txt, w, Nick)
Complete Code:
ON *:TEXT:!Customer *:#: {
  var %foundInTheList = $read(system\Customer.txt, w, $2)
  if (%foundInTheList) {
    .msg # $2 Customer found in list!
  }
  else {
    .msg # 4 $2 Customer not found in list.
  }
}
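The difference between a whole-line match and a substring match can be sketched in a few lines of Python. This is an illustration of the idea only, not mIRC code. `find_exact` is a hypothetical helper, and comparing names case-insensitively is an assumption made for the example.

```python
def find_exact(names, wanted):
    """Return the stored name that equals `wanted` as a whole line, else None."""
    wanted = wanted.strip().lower()
    for name in names:
        # Compare the entire line, so "Nickolson" does not count as "Nick".
        if name.strip().lower() == wanted:
            return name.strip()
    return None


if __name__ == "__main__":
    customers = ["Nickolson", "Anna"]
    print(find_exact(customers, "Nick"))       # whole-line comparison
    print(any("Nick" in c for c in customers))  # substring comparison
```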
A few remarks on the original code
Halting
halt should only be used when you want to forcibly stop any further processing. In most cases you can avoid it by writing your code flow so that it behaves that way without an explicit halt.
Avoiding it also prevents problems later: when you add new code and wonder why it isn't executing, the cause is often a long-forgotten halt command.
It also makes debugging easier, because there is no hidden exit from the flow that you don't know about.
Readability
if (..) {
.... }
else { .. }
With many lines of code inside the first { }, it becomes hard to notice the else (or elseif), because the mIRC remote editor puts the line above it, which contains the closing }, at the same indentation as the else line. You should almost always spend a few extra lines for the sake of readability, especially since it costs nothing; as far as I remember, new lines are free of charge.
So make it a rule of thumb to put every command on its own line (that includes the closing bracket).
Matching Text
On*:text:*!Customer*:#: {
The above code has a critical problem and a bug.
Critical: It will not work, because On*:text has no space between On and *:text.
Bug: *!Customer* will match EVERYTHING-BEFORE!customerANDAFTER <NICK>, which is clearly not the desired behavior. What you want is :!Customer *:, which only matches if the first word is !customer and at least one more word follows, because I've used [SPACE]*.
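The effect of the two match patterns can be approximated in Python with the standard `fnmatch` module. This is only a rough stand-in for mIRC's wildcard matching. The helper name `matches` and the lower-casing used to ignore case are assumptions made for the example.

```python
from fnmatch import fnmatchcase


def matches(pattern, text):
    """Approximate a case-insensitive wildcard match where * means any text."""
    return fnmatchcase(text.lower(), pattern.lower())


if __name__ == "__main__":
    messages = ["!Customer Nick", "hey !customer Nick", "!Customer"]
    for message in messages:
        loose = matches("*!Customer*", message)   # text allowed before and after
        strict = matches("!Customer *", message)  # must start with the command
        print(f"{message!r}: loose={loose} strict={strict}")
```

The stricter pattern rejects both a message with text before the command and a bare command with no name after it.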

Why do I get an IndexOutOfBoundsException when my else should prevent it?

I am working on some sort of quiz and have a list of questions and answers, that get transferred to my view class via the controller. People can ask and answer questions on a page, my system then goes and "collects" those to make a quiz out of them.
If you are the first person to start the program / quiz, the question list is empty. Therefore I want to check for an empty quiz with an if / else clause. The if case seems to work fine, but the else case throws an IndexOutOfBoundsException and I don't understand why. I would think that the else part is not used when the question list is empty, so the exception should not be thrown. Should....
View class:
@(questionList: List[Question], answerList: List[Answer], answerRadioForm: Form[Answer])
@if(questionList.length == 0){
    No questions yet!
}
else {
    <!-- As only the highest ranked question gets put into the List, there is only one entry on first place -->
    <b>@questionList.get(0).questionText</b>
    @for(question <- questionList) {
        @question.questionText - @question.ownerID <br>
    }
}
Error:
[IndexOutOfBoundsException: Index: 0, Size: 0]
49 <b>"""),_display_(/*27.8*/questionList/*27.20*/.get(0).questionText),format.raw/*27.40*/("""</b>
So, what am I missing here?

I have found a solution, and although it's bad practice to answer your own question, I searched for several hours for this and maybe my answer helps somebody else:
There must not be a return / newline between the closing } of the if and the else.
Does NOT work:
@if(questionList.length == 0){
No questions yet!
}
else { ...
Works:
@if(questionList.length == 0){
No questions yet!
} else {
EDIT: As @if(questionList.length > 0){ also works, is robust against accidentally inserted newlines, and is a bit easier to read and understand, I will use this instead of the else.

First, does your code compile? There is no get method on List. You can use list.headOption instead.
Well, I could use questionList(0)
Another solution:
@questionList.headOption.map(q => <b>{q.text}</b>).getOrElse("No questions yet!")
@for(question <- questionList) {
    @question.text - @question.ownerId <br>
}
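The headOption idea, taking the first element only if there is one, can be sketched in Python as follows. This is an illustration in a different language than the template above. `head_option` and `render` are hypothetical names, and questions are reduced to plain strings for brevity.

```python
def head_option(items, default=None):
    """Return the first element of `items`, or `default` when it is empty."""
    return next(iter(items), default)


def render(questions):
    """Build the quiz markup without ever indexing into an empty list."""
    first = head_option(questions)
    if first is None:
        return "No questions yet!"
    lines = [f"<b>{first}</b>"]
    lines += [f"{question} <br>" for question in questions]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render([]))
    print(render(["What is a monad?", "Why Scala?"]))
```

Because the empty case is handled by the accessor itself, the result does not depend on how the if / else is laid out.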

What's wrong with my Meteor publication?

I have a publication, essentially what's below:
Meteor.publish('entity-filings', function publishFunction(cik, queryArray, limit) {
    if (!cik || !filingsArray)
        console.error('PUBLICATION PROBLEM');
    var limit = 40;
    var entityFilingsSelector = {};
    if (filingsArray.indexOf('all-entity-filings') > -1)
        entityFilingsSelector = {ct: 'filing', cik: cik};
    else
        entityFilingsSelector = {ct: 'filing', cik: cik, formNumber: { $in: filingsArray } };
    return SB.Content.find(entityFilingsSelector, {
        limit: limit
    });
});
I'm having trouble with the filingsArray part. filingsArray is an array of regexes for the Mongo $in query. I can hardcode filingsArray in the publication as [/8-K/], and that returns the correct results. But I can't get the query to work properly when I pass the array from the router. See the debugged contents of the array in the image below. The second and third images are the client/server debug contents indicating same content on both client and server, and also identical to when I hardcode the array in the query.
My question is: what am I missing? Why won't my query work, or what are some likely reasons it isn't working?

In that first screenshot, that's a string that looks like a regex literal, not an actual RegExp object. So {$in: ["/8-K/"]} will only match literally "/8-K/", which is not the same as {$in: [/8-K/]}.
Regexes are not EJSON-able objects, so you won't be able to send them over the wire as publish function arguments or method arguments or method return values. I'd recommend sending a string, then inside the publish function, use new RegExp(...) to construct a regex object.
If you're comfortable adding new methods on the RegExp prototype, you could try making RegExp an EJSON-able type, by putting this in your server and client code:
RegExp.prototype.toJSONValue = function () {
    return this.source;
};
RegExp.prototype.typeName = function () {
    return "regex";
};
EJSON.addType("regex", function (str) {
    return new RegExp(str);
});
After doing this, you should be able to use regexes as publish function arguments, method arguments and method return values. See this meteorpad.
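The simpler recommendation, sending plain strings and building the regex objects on the receiving side, can be sketched in Python (used here for illustration only; the publication itself is JavaScript). `build_patterns` and `matches_any` are hypothetical helper names.

```python
import re


def build_patterns(pattern_strings):
    """Turn plain pattern strings (safe to serialize) into compiled regexes."""
    return [re.compile(source) for source in pattern_strings]


def matches_any(value, patterns):
    """True if any compiled pattern is found somewhere in `value`."""
    return any(pattern.search(value) for pattern in patterns)


if __name__ == "__main__":
    received = ["8-K"]  # what arrives over the wire: strings, not regex objects
    patterns = build_patterns(received)
    print(matches_any("8-K/A", patterns))
    # A string that merely looks like a regex literal only equals itself.
    print("8-K/A" in ["/8-K/"])
```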

/8-K/.. that looks like a slightly unusual regex. You could write it as /8\-K/.
A minus (-) sign is a range indicator, but only inside square brackets (for example [0-9]). Outside brackets it is a literal character, so /8-K/ and /8\-K/ match the same text; the escape is optional and only makes the intent explicit.
/8\-K/ matches the string "8-K" anywhere in the input, which I assume is what you are trying to do.
Also, it would help to make sure your publication always returns something. Here's a spot where it could fail:
if (!cik || !filingsArray)
    console.error('PUBLICATION PROBLEM');
If those parameters aren't filled, console.error is probably not the best way to handle it. A better way:
if (!cik || !filingsArray) {
    throw new Meteor.Error("entity-filings", "Publication problem.");
} else {
    // .. the rest of your publication
}
This makes sure the client does not wait unnecessarily long for the publication's status: for any input, the publication either throws an error or returns a Cursor, and nothing in between (surprise undefineds, unfilled Cursors, or other garbage data).
