Powershell Web Page Automation works on Internet, not Intranet - powershell

I'm trying to do some simple automation with PowerShell: pulling link URLs from one of our company's local intranet pages and then doing some work with those URLs. Eventually I'll use the script to open each link and click a button on the page. I'm using Internet Explorer 9 on Windows 7 x64.
Here's a simple working PowerShell script that displays all the links on a page:
$ie = new-object -com "InternetExplorer.Application"
$ie.Visible = $true
$ie.Navigate( "http://www.reddit.com" )
While ($ie.Busy) {
    Sleep 1
}
$links = $ie.Document.getElementsByTagName("a")
$links | foreach {
    write-host $_.href
}
This script works fine until I replace the URL with a local intranet site. The address follows the normal URL scheme ( http://internaldomain.com/etc ), but IE recognizes it as an intranet site. As soon as I try to scrape a page in the intranet zone, $ie.Document becomes $null and the script fails.
I'm guessing it's related to some obscure setting for that zone, but I'm not sure. I found some suggestions online, such as adding the site to Trusted Sites, but that has not worked. This is my first time using PowerShell for web automation, so any help or insight would be appreciated.

The solution may be here: http://blogs.msdn.com/b/ieinternals/archive/2011/08/03/internet-explorer-automation-protected-mode-lcie-default-integrity-level-medium.aspx
It explains the different integrity levels of tabs in IE. You have to use a "medium" tab to navigate in the local intranet zone.
The best way to keep your IE settings and still use your script is to create a registry key, as explained in the link above.
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\InternetExplorer.ApplicationMedium]
[HKEY_CLASSES_ROOT\InternetExplorer.ApplicationMedium\CLSID]
@="{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}"
And in your script, use this new com object:
$ie = new-object -Com InternetExplorer.ApplicationMedium
...
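The same key can probably be created from PowerShell instead of a .reg file. The sketch below is an assumption on top of the answer above, not something from the linked article: it writes under HKCU:\Software\Classes (the per-user half of the merged HKEY_CLASSES_ROOT view), which normally does not need administrator rights, and it has not been verified against IE 9.

```powershell
# Sketch: register the ProgID for the current user only.
# Assumption: HKCU:\Software\Classes is writable on this machine.
$clsid   = '{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}'
$keyPath = 'HKCU:\Software\Classes\InternetExplorer.ApplicationMedium\CLSID'

# -Force creates the missing parent key too; -Value sets the key's default value.
New-Item -Path $keyPath -Value $clsid -Force | Out-Null

# Read the default value back to confirm what was written.
(Get-ItemProperty -Path $keyPath).'(default)'
```

Per-user COM registrations are ignored by elevated processes, so this only helps a script running in a normal (non-elevated) session.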

Due to policy restrictions on my computer, I was not able to access the registry to create the key mentioned in another answer. However, I did find a way to do it indirectly using PowerShell in case this is helpful to anyone else:
$type = [Type]::GetTypeFromCLSID('D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E')
$ie = [System.Activator]::CreateInstance($Type)
$ie.Visible = $true
$URL = "http://my.intranet.com"
$ie.Navigate($URL)
Write-Host "`$ie.Busy:" $ie.Busy
Write-Host "`$ie.ReadyState:" $ie.ReadyState
while ($ie.Busy -or ($ie.ReadyState -ne 4)) {
    Start-Sleep -s 1
}
Write-Host "IE is ready"
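The wait loop above never gives up if the page hangs. One possible variation is a small helper with a timeout; Wait-IEReady and its parameter names are made up for illustration and are not part of the answer.

```powershell
# Hypothetical helper: wait until the browser object reports that it is idle
# and fully loaded, or give up after a timeout.
function Wait-IEReady {
    param(
        [Parameter(Mandatory = $true)] $Browser,   # any object exposing Busy and ReadyState
        [int] $TimeoutSeconds = 30,
        [int] $PollMilliseconds = 250
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ($Browser.Busy -or ($Browser.ReadyState -ne 4)) {   # 4 = READYSTATE_COMPLETE
        if ((Get-Date) -ge $deadline) { return $false }       # timed out
        Start-Sleep -Milliseconds $PollMilliseconds
    }
    return $true
}
```

It would be called right after $ie.Navigate($URL), for example: if (-not (Wait-IEReady -Browser $ie)) { throw "Page did not finish loading" }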

Use
$ie.Document.documentElement.getElementsByClassName("underline")
and enjoy .....

Related

Download file from website using SendKeys in Powershell

I'm trying to download a file from a particular website by clicking on the file icon. The website login works, but I'm hoping to use the TAB key to navigate to the Excel file and then Enter to download it. When I ran the code, PowerShell just printed "False". Any advice is appreciated! Thanks.
$url = "https://abcdefg.com"
$username="test@gmail.com"
$password="TestPW"
$ie = New-Object -com internetexplorer.application;
$ie.visible = $true;
$ie.navigate($url);
while ($ie.Busy -eq $true)
{
    Start-Sleep -Milliseconds 1000;
}
$ie.Document.getElementById("txtEmail").value = $username
$ie.Document.getElementByID("txtPassword").value=$password
$ie.Document.getElementById("Login").Click();
Start-Sleep -Milliseconds 10000
$obj = new-object -com WScript.Shell
$obj.AppActivate('Internet Explorer')
$obj.SendKeys('{TAB}')
$obj.SendKeys('{TAB}')
$obj.SendKeys('{TAB}')
$obj.SendKeys('{TAB}')
$obj.SendKeys('{Enter}')
Why do that instead of scraping the page to find the link you are trying to hit, and then using the link URL directly?
Your post is really a duplicate of this Q&A.
Use PowerShell to automate website login and file download
SendKeys could work, but it is unreliable and may not behave as you'd expect on different systems. There are better tools dedicated to this: AutoIT, Selenium, WASP.
--- The WASP tool still works, but has not been updated in a long while.
Using PowerShell 2.0 With Selenium to Automate Internet Explorer, Firefox, and Chrome
Internet Explorer
Next you want to obtain the Internet Explorer driver from this site. I
recommend version 2.41 because “as of 15 April 2014, IE 6 is no longer
supported”. This must reside in your current PATH so in your script
you may want to modify your PATH to ensure the executable
(IEDriverServer.exe) can be found there. If you’re wondering whether
to get the 32-bit or the 64-bit version, start with the 32-bit even if
you’ve got a 64-bit Windows.
At this point you’ll want to quickly instantiate Internet Explorer and
navigate somewhere. Great. Let’s do it.
# Load the Selenium .Net library
Add-Type -Path "C:\selenium\WebDriver.dll" # put your DLL on a local hard drive!
# Set the PATH to ensure IEDriverServer.exe can be found
$env:PATH += ";N:\selenium"
# Instantiate Internet Explorer
$ie_object = New-Object "OpenQA.Selenium.IE.InternetExplorerDriver"
# Great! Now we have an Internet Explorer window appear. We can navigate to a new URL:
$ie_object.Navigate().GoToURL( "http://www.bbc.co.uk/languages" )
# This worked! The call won’t return until the page download is complete.
# Next let’s click on a link from the link text:
$link = $ie_object.FindElementByLinkText( "Spanish" )
$link.Click()
# display current URL
$ie_object.Url
Selenium Tutorial: All You Need To Know About Selenium WebDriver
Update for the OP
As for...
However the file does not have a redirected URL
Then you need to look deeper at the site, to find the anchor to the file that you can force click on.
Example:
# Scrape a web page with PowerShell
$w = Invoke-WebRequest -Uri 'https://www.reddit.com/r/PowerShell'
$w | Get-Member
$w.AllElements
$w.AllElements.Count
$w.Links.Count
$w.Links
$w.Forms
$w.Forms.Fields
$w.Forms[0]
$w.Forms[0].Fields
$w.RawContent
$w.ParsedHtml
Once you find the tag names or similar markers, you need to filter on them to pull the content out.
$w.AllElements | Where-Object -Property 'TagName' -EQ 'P' | Select-Object -Property 'InnerText'
For tables you have to dig more.
Extracting Tables from PowerShell’s Invoke-WebRequest
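To make "find the anchor to the file" a little more concrete, here is a minimal sketch that filters a list of link objects for hrefs that look like Excel files. Select-FileLink, its default pattern, and the sample data are all hypothetical; the real page may mark the file differently.

```powershell
# Hypothetical helper: pick the hrefs that look like Excel files out of a list
# of link objects, such as the .Links collection returned by Invoke-WebRequest.
function Select-FileLink {
    param(
        [Parameter(Mandatory = $true)] [object[]] $Links,
        [string] $Pattern = '\.xlsx?($|\?)'   # assumption: the file is .xls or .xlsx
    )
    $Links |
        Where-Object { $_.href -match $Pattern } |
        ForEach-Object { $_.href }
}

# Stand-in data so the sketch runs without a network connection.
$sampleLinks = @(
    (New-Object psobject -Property @{ href = '/about' }),
    (New-Object psobject -Property @{ href = '/files/report.xlsx' }),
    (New-Object psobject -Property @{ href = '/files/old.xls?download=1' })
)
Select-FileLink -Links $sampleLinks
```

Against a real page you would pass $w.Links instead of the sample data, resolve any relative href against the page URL, and hand the result to Invoke-WebRequest with -OutFile. If the site requires a login, the same -WebSession would have to be reused for the download.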

PowerShell script to log in to Linkedin

I'm fairly new to PowerShell and exploring its capabilities. I have created the following script to log in to LinkedIn automatically, but it opens the web page and does nothing else. Can someone please assist? I want to use the script to log in to a web status page and search for issues for alerting purposes. Thank you.
PowerShell Script
$username = "Username"
$password = "Password"
$ie = New-Object -com InternetExplorer.Application
$ie.visible=$true
$ie.navigate("https://www.linkedin.com")
while($ie.ReadyState -ne 4) {start-sleep -m 100}
$usernameElement = $ie.document.getElementById("login-email").value= "$username"
$ie.document.getElementById("login-password").value = "$password"
$ie.document.getElementById("login-submit").submit()
start-sleep 20
OK, it took me a while, but I figured out the issue. On LinkedIn specifically, the login-submit button has two notable properties: isDisabled and disabled. Both need to be changed to $false before the button can be clicked. Also, change the method called on it from .submit() to .click(). A quick tip: after the script is done, $ie is still held by your script (I'm not sure this applies here, because you seem to close the window manually afterwards). To release it, call this command after closing the browser:
[System.Runtime.InteropServices.Marshal]::ReleaseComObject($ie)
I'm not really sure why; I just know it keeps the object out of the way, and the effect is apparently similar to ending the process from Task Manager.

How do I run Invoke-WebRequest cmdlet from third party program?

I have been trying to get this to work via a game control panel TCAdmin.
$ModPg1 = Invoke-WebRequest "http://steamcommunity.com/sharedfiles/filedetails/?id=731604991"
$ModVer1 = ($ModPg1.ParsedHtml.getElementsByTagName('div') | Where{ $_.className -eq 'detailsStatRight' } ).innerText | Select -Last 1
If I run this cmdlet via a program like TCAdmin (or task scheduler), I get the following error....
Invoke-WebRequest : The response content cannot be parsed because the Internet Explorer engine is not available, or Internet Explorer's first-launch configuration is not complete. Specify the UseBasicParsing parameter and try again.
Internet Explorer is installed and set up. The script works just fine if I run it manually.
My guess is that there is a way to get TCAdmin to run the scripts the same way I would as a Windows user.
I can't find another way nearly as simple to scrape the info I'm looking for.
As for this...
get TCAdmin to run the scripts the same way I would as a windows User.
For any app to run as a user, that user's profile must exist on the host where the code is run. You cannot natively run PowerShell on a host under another user's context. This is not a PowerShell issue; it is a Windows user-principal security boundary. There are tools that let you do this, for example Sysinternals PsExec and AutoIT. Still, as stated, that error is pretty specific: the Internet Explorer profile for that user has not been created, and that only happens once IE has been used at least once.
So, as Adam points out, either use the parameter the error message names or have your code start IE at least once.
$SomeUrl = 'https://stackoverflow.com'
$ie = New-Object -com internetexplorer.application
$ie.visible = $true
$ie.navigate($SomeUrl)
while ($ie.Busy -eq $true) { Start-Sleep -Seconds 1 } # Wait for IE to settle.
Again, if trying to run this in the context of another user, the two above tools will get you there, but you still have to fire up IE to have a profile for it.
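For the other route, the -UseBasicParsing switch named in the error message, keep in mind that ParsedHtml is not populated in that mode, so the value has to be pulled out of the raw Content string. The following is a rough sketch under that assumption; Get-LastDetailsStat and the sample markup are hypothetical, and a regex is only a crude stand-in for real DOM parsing.

```powershell
# Hypothetical helper: return the text of the last <div class="detailsStatRight">
# found in an HTML string.
function Get-LastDetailsStat {
    param([Parameter(Mandatory = $true)] [string] $Html)
    $pattern = '<div[^>]*class="detailsStatRight"[^>]*>(.*?)</div>'
    $options = [System.Text.RegularExpressions.RegexOptions]'IgnoreCase, Singleline'
    $found   = [regex]::Matches($Html, $pattern, $options)
    if ($found.Count -eq 0) { return $null }
    $found[$found.Count - 1].Groups[1].Value.Trim()
}

# Stand-in markup so the sketch runs offline; the real page layout is an assumption.
$sample = '<div class="detailsStatRight">1.2 MB</div><div class="detailsStatRight">5 Oct, 2016</div>'
Get-LastDetailsStat -Html $sample
```

In the scheduled script this would become something like: $ModPg1 = Invoke-WebRequest $url -UseBasicParsing, followed by Get-LastDetailsStat -Html $ModPg1.Content. That path does not depend on the IE engine or on a user profile for it.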

Using IE automation in powershell to login to a website that has input validation

I am trying to use PowerShell to login to a website. In the example below I am trying to login to live.com.
I am able to update the username field but the webpage runs some sort of input validation that does not accept my value. If I manually go in and edit the username field, like hitting space and then backspace, the input is then valid.
I found some documentation about changing the focus or using fireevent, but neither seems to work.
While sendkeys would resolve my issue, I have had numerous problems with sendkeys before and would really like to avoid going down that path.
$Site = 'https://login.live.com'
$UserName = 'FakeUserName@outlook.com'
$ie = New-Object -ComObject 'internetExplorer.Application'
$ie.Visible= $true
$ie.Navigate($Site)
while ($IE.busy)
{
    Start-Sleep -Milliseconds 100
}
$Inputs = $IE.document.getElementsByTagName("input")
foreach ($Input in $Inputs)
{
    if ($Input.type -eq "email")
    {
        $UserIDField = $Input
    }
    if ($Input.type -eq "submit")
    {
        $LoginButton = $Input
    }
}
$UserIDField.focus()
$UserIDField.value = $UserName
$UserIDField.FireEvent('onchange')
$LoginButton.focus()
$LoginButton.click()
@Ranadip Dutta is certainly right that you should not do it this way, but if you want to automate a web browser, Selenium is a good tool; it took me five minutes to automate Chrome against your web site. You can choose an IE, Mozilla, or Opera driver instead. Have a look at Selenium for that.
# Selenium directory is the place where I expand Selenium Client & WebDriver Language Bindings for C#
$seleniumDir = 'D:\Developpements\Pgdvlp_PowerShell\selenium-dotnet-3.0.0'
# Selenium Webdriver
Add-Type -Path "$seleniumDir\net40\WebDriver.dll"
Add-Type -Path "$seleniumDir\net40\WebDriver.Support.dll"
Add-Type -Path "$seleniumDir\net40\ThoughtWorks.Selenium.Core.dll"
Add-Type -Path "$seleniumDir\net40\Selenium.WebDriverBackedSelenium.dll"
# With Chrome
# I downloaded the Chrome driver here: https://chromedriver.storage.googleapis.com/index.html?path=2.25/
# It lives in the "$seleniumDir" directory
$chrome = New-Object OpenQA.Selenium.Chrome.ChromeDriver "$seleniumDir"
#$chrome.Navigate().GoToUrl("https://fr.hightail.com/loginSpaces?redirect_url=https%3A%2F%2Fspaces.hightail.com%2Foauth%2Fhightail");
$chrome.Navigate().GoToUrl("https://login.live.com");
$Browser = $chrome
$email = $Browser.FindElements([OpenQA.Selenium.By]::Name('loginfmt'))
$email[0].SendKeys("adress@hotmail.com")
$button = $Browser.FindElements([OpenQA.Selenium.By]::Id('idSIButton9'))
$button.Click()
Start-Sleep 2
$passwd = $Browser.FindElements([OpenQA.Selenium.By]::Name('passwd'))
$passwd[0].SendKeys("toto")
$button = $Browser.FindElements([OpenQA.Selenium.By]::Id('idSIButton9'))
$button.Click()
If the website checks for automated logins, you cannot expect this kind of automation to work. SendKeys sends input the same way a user does, which is why it gets around the problem in that case.
I would suggest checking whether the web service offers an API for logging in.
Other than that, I do not see anything that can help you. This concern is not specific to PowerShell or any scripting language; it applies to your website in general.
You may also want to consider storing and passing credentials more securely instead of putting them in plain text in your script(s).
TechNet - PowerShell Tip - Storing and Using Password Credentials
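As a rough illustration of that last suggestion, a credential can be saved once with Export-Clixml and loaded back in the script. The two function names below are hypothetical. On Windows, Export-Clixml protects the password with the current user's DPAPI key, so the file is only readable by the same user on the same machine.

```powershell
# Hypothetical helpers: save a credential to disk once, then load it in scripts.
function Save-SiteCredential {
    param(
        [Parameter(Mandatory = $true)] [System.Management.Automation.PSCredential] $Credential,
        [Parameter(Mandatory = $true)] [string] $Path
    )
    $Credential | Export-Clixml -Path $Path
}

function Get-SiteCredential {
    param([Parameter(Mandatory = $true)] [string] $Path)
    Import-Clixml -Path $Path
}

# One-time, interactive setup (the path is only an example):
#   Save-SiteCredential -Credential (Get-Credential) -Path "$env:USERPROFILE\site-login.xml"
# Later, inside the automation script:
#   $cred = Get-SiteCredential -Path "$env:USERPROFILE\site-login.xml"
#   $passwd[0].SendKeys($cred.GetNetworkCredential().Password)
```

The password still ends up as plain text in memory at the moment it is typed into the page, so this only removes it from the script file itself.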

powershell print page of web client in pdf

I am trying to print the web page from a web client into a pdf file.
I first call the page url in a web client object:
$ie = New-Object -COMObject InternetExplorer.Application
$ie.visible = $true
$ie.Navigate('http://www.stackoverflow.com')
While ($ie.Busy) { Start-Sleep -Milliseconds 400 }
'Done!'
and when it's loaded, I call ExecWB with parameters 6,2, as suggested in this forum:
$ie.ExecWB(6,2)
while ( $ie.busy ) { Start-Sleep -Milliseconds 400 }
'Done 2!'
<#$ie.quit()#>
I am working with PowerShell 5 under Windows 10.
My first issue: parameter 2 does not work; I am always prompted.
I also don't know how to specify the name of the PDF file I want to print to.
I found nothing in the documentation.
ExecWB does not seem to be a feature in PowerShell 5 any more.
Second issue: if I call $ie.Quit(), nothing happens and the page is not printed.
Any ideas?
I suggest you use wkhtmltopdf to do what you want. You can install it via Chocolatey: cinst wkhtmltopdf. It works with remote or local HTML files.
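A minimal sketch of calling it from PowerShell, assuming wkhtmltopdf is installed and on PATH; the output location is only an example. The tool takes the source URL (or file) followed by the output file name, which covers the "how do I name the PDF" part of the question.

```powershell
# Sketch: render a page to a named PDF with wkhtmltopdf.
$url = 'http://www.stackoverflow.com'
$pdf = Join-Path ([System.IO.Path]::GetTempPath()) 'page.pdf'   # example output path

if (Get-Command wkhtmltopdf -ErrorAction SilentlyContinue) {
    # Basic usage: wkhtmltopdf <input URL or file> <output file>
    & wkhtmltopdf $url $pdf
    if ($LASTEXITCODE -ne 0) {
        Write-Warning "wkhtmltopdf exited with code $LASTEXITCODE"
    }
} else {
    Write-Warning 'wkhtmltopdf was not found on PATH.'
}
```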