I have four sites which I have to check specific HTML for every so often and I'd like to automate this. I've tried to use Invoke-WebRequest but I couldn't figure out how to step through the pages since a lot of AJAX is being used. This is why I use two IE COM objects. With the COM objects I can easily step through and search for the correct HTML on each page.
The problem is that the checks run sequentially:
#Define functions
site1 { }
site2 { }
etc..
#run functions
$result = site1 #returns bool
$result = site2 #returns bool
etc..
I'd like to have the site checks run at the same time but I have been unable to do so with either Start-Job or a workflow.
I don't think I have figured out the syntax for either option as the Visible property from the IE COM object never triggers:
$url = "somesite.com"
$site = New-Object -COM InternetExplorer.Application
$site.Silent = $true
$site.Navigate($url)
while ($ie.ReadyState -ne 4) {Start-Sleep -m 500}
$site.Visible = $true
What is the proper way to have these site checks run at the same time and also have them be visible?
Related
I am trying to create a powershell 'AccessObject' comobject for my MS Access app. Basically, I will trying to create a powershell script that gets queries in a database and the tables and/or queries a particular query depends on. To do that i will need to have an instance of the MS Access 'AccessObject' and 'DependencyInfo' classes in my powershell script. I have attached a snippet of the function i intend to use. This is not the complete function, please note. All i want is to know how to create an instance of the DependencyInfo and AccessObjects in powershell.
function getQueries([string] $database)
{
$dbEng = New-Object -ComObject DAO.DBEngine.120
$AccessApp= New-Object -ComObject Access.Application
$Dependency = $AccessApp.DependencyInfo
$AccessObject=$AccessApp.AccessObjects
...
}
All i want is to know how to create an instance of the DependencyInfo
and AccessObjects in powershell.
The following creates a new Access process, opens a local accdb file, and retrieves the dependencies for a given form:
$db = new-object -ComObject 'Access.Application'
$db.OpenCurrentDatabase('C:\temp\deezNutz.accdb')
$dependency_info = $db.Application.CurrentProject().AllForms('frm_person').GetDependencyInfo()
foreach ($dependency in $dependency_info.Dependencies) { $dependency.FullName }
$db.CloseCurrentDatabase()
$db.Application.DoCmd.Quit()
If you're trying to pro grammatically manipulate the objects in a Microsoft Access database e.g., forms, reports, queries, etc. Your best bet is to search for solutions using VBA then convert those to Powershell. For this example, I first wrote the solution in VBA then converted it to Powershell.
Thanks #Lord Adam. This was really helpful. In my case i had to modify the logic a little bit:
$AccessApp= New-Object -ComObject 'Access.Application'
$AccessApp.OpenCurrentDatabase($database)
$AccessApp.Application.SetOption("Track Name AutoCorrect Info", $true)
$QryDependency = $AccessApp.Application.CurrentData.AllQueries.Item($query.Name).GetDependencyInfo()
ForEach($di in $QryDependency.Dependencies)
{
$QryObjects= $QryObjects + $di.Name +","
}
I'm trying to do some web-scraping via PowerShell, as I've recently discovered it is possible to do so without too much trouble.
A good starting point is to just fetch the HTML, use Get-Member, and see what I can do from there, like so:
$html = Invoke-WebRequest "https://www.google.com"
$html.ParsedHtml | Get-Member
The methods available to me for fetching specific elements appear to be the following:
getElementById()
getElementsByName()
getElementsByTagName()
For example I can get the first IMG tag in the document like so:
$html.ParsedHtml.getElementsByTagName("img")[0]
However after doing some more research in to whether I could use CSS Selectors or XPath I discovered that there are unlisted methods available, since we are just using the HTML Document object documented here:
querySelector()
querySelectorAll()
So instead of doing:
$html.ParsedHtml.getElementsByTagName("img")[0]
I can do:
$html.ParsedHtml.querySelector("img")
So I was expecting to be able to do:
$html.ParsedHtml.querySelectorAll("img")
...in order to get all of the IMG elements. All the documentation I've found and googling I've done supports this. However, in all my testing this function crashes the calling process and reports a heap corruption exception code in the Event Log (0xc0000374).
I'm using PowerShell 5 on Windows 10 x64. I've tried it in a Win10 x64 VM that is a clean build and just patched up. I've also tried it in Win7 x64 upgraded to PowerShell 5. I haven't tried it on anything prior to PowerShell 5 as all our systems here are upgraded, but I probably will once I have time to spool a new vanilla VM for testing.
Has anyone run in to this issue before? All my research so far is a dead end. Are there alternatives to querySelectorAll? I need to scrape pages that will have predictable sets of tags inside unpredictable layouts and potentially no IDs or classes assigned to the tags, so I want to be able to use selectors that allow structure/nesting/wildcards.
P.S. I've also tried using the InternetExplorer.Application COM object in PowerShell, the result is the same, except instead of PowerShell crashing Internet Explorer crashes. This was actually my original approach, here's the code:
# create browser object
$ie = New-Object -ComObject InternetExplorer.Application
# make browser visible for debugging, otherwise this isn't necessary for function
$ie.Visible = $true
# browse to page
$ie.Navigate("https://www.google.com")
# wait till browser is not busy
Do { Start-Sleep -m 100 } Until (!$ie.Busy)
# this works
$ie.document.getElementsByTagName("img")[0]
# this works as well
$ie.document.querySelector("img")
# blow it up
$ie.document.querySelectorAll("img")
# we wanna quit the process, but since we blew it up we don't really make it here
$ie.Quit()
Hope I'm not breaking any rules and this post makes sense and is relevant, thanks.
UPDATE
I tested earlier PowerShell versions. v2-v4 crash using the InternetExplorer.Application COM method. v3-4 crash using the Invoke-WebRequest method, v2 doesn't support it.
I ran into this problem, too, and posted about it on reddit. I believe the problem happens when Powershell tries to enumerate the HTML DOM NodeList object returned by querySelectorAll(). The same object is returned by childNodes() which can be enumerated by PS, so I'm guessing there's some glue code written for .ParsedHtml.childNodes but not .ParsedHtml.querySelectorAll(). The crash can be triggered by Intellisense trying to get tab-complete help for the object, too.
I found a way around it, though! Just access the native DOM methods .item() and .length directly and emit the node objects into a PowerShell array. The following code pulls the newest page of posts from /r/Powershell, gets the post list anchors via querySelectorAll() then manually enumerates them using the native DOM methods into a Powershell-native array.
$Result = Invoke-WebRequest -Uri "https://www.reddit.com/r/PowerShell/new/"
$NodeList = $Result.ParsedHtml.querySelectorAll("#siteTable div div p.title a")
$PsNodeList = #()
for ($i = 0; $i -lt $NodeList.Length; $i++) {
$PsNodeList += $NodeList.item($i)
}
$PsNodeList | ForEach-Object {
$_.InnerHtml
}
Edit .Length seems to work capitalized or lower-case. I would have expected the DOM to be case-sensitive, so either there's some things going on to help translate or I'm misunderstanding something. Also, the CSS selector is grabbing the source links (self.PowerShell mostly), but that it my CSS selector logic error, not a problem with querySelectorAll(). Note that the results of querySelectorAll() are not live, so modifying them won't modify the original DOM. And I haven't tried modifying them or using their methods yet, but clearly we can grab at the very least .InnerHtml.
Edit 2: Here is a more-generalized wrapper function:
function Get-FixedQuerySelectorAll {
param (
$HtmlWro,
$CssSelector
)
# After assignment, $NodeList will crash powershell if enumerated in any way including Intellisense-completion while coding!
$NodeList = $HtmlWro.ParsedHtml.querySelectorAll($CssSelector)
for ($i = 0; $i -lt $NodeList.length; $i++) {
Write-Output $NodeList.item($i)
}
}
$HtmlWro is an HTML Web Response Object, the output of Invoke-WebReqest. I originally tried to pass .ParsedHtml but then it would crash on assignment. Doing it this way returns the nodes in a Powershell array.
The #midnightfreddie's solution worked fine for me before, but now it throws Exception from HRESULT: 0x80020101 when calling $NodeList.item($i).
I found the following workaround:
function Invoke-QuerySelectorAll($node, [string] $selector)
{
$nodeList = $node.querySelectorAll($selector)
$nodeListType = $nodeList.GetType()
$result = #()
for ($i = 0; $i -lt $nodeList.length; $i++)
{
$result += $nodeListType.InvokeMember("item", [System.Reflection.BindingFlags]::InvokeMethod, $null, $nodeList, $i)
}
return $result
}
This one works for New-Object -ComObject InternetExplorer.Application as well.
I have a script using PowerShells IE COM Object to login to a site and navigate to a page with PDFs (that I wish to download). I am using IE COM object to do all the navigation, and Invoke-WebRequest for the actual download. Issue is when the PDF files download they are size 2kb and can't be opened. I have a strong feeling this is a session issue and IE must be forgetting that I am logged in somehow. Any way to make IE remember I am logged in when so that I can download the files successfully?
Any help appreciated. Thanks.
Edit:
$ie = New-Object -COMObject InternetExplorer.Application
$ie.visible = $true
# Navigate to home page
$ie.navigate($URL)
while ($ie.Busy -eq $true){Start-Sleep -Seconds 3}
$Username = $ie.Document.getElementsByName("userName")
$Username.item(0).value = "username"
$Password = $ie.Document.getElementsByName("password")
$Password.item(0).value = "pass"
$AcceptTerms = $ie.Document.getElementById("useAddtnlField")
$AcceptTerms.checked = $true
$Btn = $ie.Document.getElementsByTagName("span") | ? {$_.title -match 'Login'}
$Btn.click()
while ($ie.Busy -eq $true){Start-Sleep -Seconds 2}
# ---- logged in ----
# -- Opens link in new window (due to JS) --
$ie.Navigate("https://chaseloanmanager.chase.com/Chaselock/ViewOnlineGuide.aspx")
while ($ie.Busy -eq $true){Start-Sleep -Seconds 2}
After that last navigate, I lose the $ie object and can no longer do anything with it. If I create a new $ie2 object, and continue the script using $ie2, the files are unable to be opened when finished. I was also unsuccessful getting access to the window using Shell.Application and Windows() method as to try and reference the original $ie window. Anyways, I think I need to complete the script using the original $ie object so that I am recognized as logged in throughout the process and can have a valid download at the end. Hopefully that makes sense and can help someone help me. Thanks
I have a question about Outlook Rules in Powershell. I wrote some code that successfully stores any incoming e-mail from a certain sender to the deleted-items folder. I did this because when the mails enter the junk folder, the junk folder still has the counter token of mails, so in the end it will say I have 10-mails in the junk folder.
I want to avoid this by just throwing the incoming mails from that sender to the deleted-items folder and also marking the mail as "read" so that I don't see the clutter in the deleted items folder.
The question is really:
Can I add multiple actions to the same outlook rule in powershell? if so, how?
What is the syntax / code for the "run script" action?
My code so far:
$ol = New-Object -ComObject Outlook.Application
$ns = $ol.GetNamespace("MAPI")
$olFolders = "Microsoft.Office.Interop.Outlook.OlDefaultFolders"
$outlook = New-Object -ComObject outlook.application
$namespace = $Outlook.GetNameSpace("MAPI")
$inBox = $ns.GetDefaultFolder([Microsoft.Office.Interop.Outlook.OlDefaultFolders]::olFolderInbox)
$deleted = $ns.GetDefaultFolder([Microsoft.Office.Interop.Outlook.OlDefaultFolders]::olFolderDeletedItems)
$rules = $outlook.session.DefaultStore.GetRules()
$rule = $rules.create("Move mail: to DeletedItems", [Microsoft.Office.Interop.Outlook.OlRuleType]::olRuleReceive)
$rule_Address = $rule.Conditions.SenderAddress
$rule_Address.Enabled = $true
$rule_Address.Address = #("<Sender Address>")
$action = $rule.Actions.MoveToFolder
$action.Enabled = $true
[Microsoft.Office.Interop.Outlook._MoveOrCopyRuleAction].InvokeMember("Folder",[System.Reflection.BindingFlags]::SetProperty,$null, $action, $deleted)
$rules.Save()
This code works so far.
Please help.
Thanks!
Can I add multiple actions to the same outlook rule in powershell? if so, how?
Took a bit but I got a working test that uses multiple actions applied to a single rule. It is actually easy and you just need to repeat the steps you have already done and create a different action variable.
In my example, just showing the end of the code, I have added a action to display a message in the New Item Alert window.
...
$action = $rule.Actions.MoveToFolder
$action.Enabled = $true
$anotherAction = $rule.Actions.NewItemAlert
$anotherAction.Text = "I am awesome!"
$anotherAction.Enabled = $true
[Microsoft.Office.Interop.Outlook._MoveOrCopyRuleAction].InvokeMember("Folder",[System.Reflection.BindingFlags]::SetProperty,$null, $action, $deleted)
$rules.Save()
You quite possibly already tried something like this. If not there is an important reference for this that you need to be aware of.
What is the syntax / code for the "run script" action?
This is one of the actions you cannot programatically set as per this reference for Office 2007 or this one for Office 2010/2013. The tables are similar and rather large to include here but I will reference the one in your 2nd bullet.
Action : Start a script
Constant in olRuleActionType : olRuleActionRunScript
Supported when creating new rules programmatically? : No
Apply to olRuleReceive rules? : Yes
Apply to olRuleSend rules? : No
There are others as well where you are restricted. So you need to keep that in mind when you are making your rules.
I have a PowerShell script that navigates to a (presumably) classic ASP page on our intranet to stop a Windows Service running on our server as part of the deployment process for that service (and restarts it after deploying the new files). It ran fine until we recently upgraded to IE9. Here's the script.
# Open service page in IE
$ie = new-object -comobject InternetExplorer.Application
$ie.visible = $true
$ie.navigate($serviceUrl)
while($ie.busy) { start-sleep 1 }
# Stop service
$ie.Document.getElementById("dropDownActionList").value = "Stop"
$ie.Document.getElementById("buttonTakeAction").click()
while($ie.busy) { start-sleep 1 }
Now when I run the script, it successfully launches IE, but throws the following error:
You cannot call a method on a null-valued expression.
At C:\Projects\ABC\Scripts\Deploy.ps1:85 char:28
+ $ie.Document.getElementById <<<< ("dropDownActionList").value = "Stop"
+ CategoryInfo : InvalidOperation: (getElementById:String) [], RuntimeException
+ FullyQualifiedErrorId : InvokeMethodOnNull
When I investigate in PowerShell, I find that if I create the IE ComObject, it at first has valid properties, but as soon as I navigate to the the service control page, all the properties are null (almost as if the ComObject gone away?). For example, before the HWND property had a valid value, but now it's null ($ie.hwnd -eq $null returns true). No error is displayed in PowerShell when I navigate to the page.
I looked at some similar questions, but the first one doesn't match my circumstance (the Document property is null in my case) and as for the latter one, IE9 defaults to compatibility mode for intranet sites. I saved the ASP page and ran it through the w3c validator and it threw some errors (although none related to the elements I'm trying to deal with). Unfortunately I can't fix those. Other sites don't seem to have this problem. Any suspicions on what the problem may be and recommendations on work-arounds?
I just worked through this.. sort of. I was seeing the same behavior until I turned off protected mode in IE. This seems to have something to do with submitting from one security zone to the next. So.. assuming that your original page is in the internet zone, with protected mode on, you submit to a page in a trusted zone or intranet or whatever, it seems like the COM context is lost. Probably intentional. I'm going to try fixing the zones, and keeping protected mode on.
Hope this helps.
EDIT: This is also a non-issue if you run your powershell in elevated mode (run as admin)
In addition:
http://msdn.microsoft.com/en-us/library/bb625962.aspx
This problem is caused by integrity levels since Internet Explorer 8.
That is also the reason, why the application runs well as administrator.
Since IE-8 runs in "low integrity" mode, it is not possible to automate IE from within a script. This is because the script runs as an user which belongs to "medium integrity" mode. The security design is such that it can send instructions from medium to low integrity, but can not receive data from low to medium integrity.
Update: Here is a working example how to do it without changing any settings. It gets back the lost com-Object.
function ConnectIExplorer() {
param($HWND)
$objShellApp = New-Object -ComObject Shell.Application
try {
$EA = $ErrorActionPreference; $ErrorActionPreference = 'Stop'
$objNewIE = $objShellApp.Windows() | ?{$_.HWND -eq $HWND}
$objNewIE.Visible = $true
} catch {
#it may happen, that the Shell.Application does not find the window in a timely-manner, therefore quick-sleep and try again
Write-Host "Waiting for page to be loaded ..."
Start-Sleep -Milliseconds 500
try {
$objNewIE = $objShellApp.Windows() | ?{$_.HWND -eq $HWND}
$objNewIE.Visible = $true
} catch {
Write-Host "Could not retreive the -com Object InternetExplorer. Aborting." -ForegroundColor Red
$objNewIE = $null
}
} finally {
$ErrorActionPreference = $EA
$objShellApp = $null
}
return $objNewIE
}
$HWND = ($objIE = New-Object -ComObject InternetExplorer.Application).HWND
$objIE.Navigate("https://www.google.com")
$objIE = ConnectIExplorer -HWND $HWND